Thursday, February 16, 2012

Regular Expressions

Regular expressions are a way of specifying text search or match criteria that allow alternative strings and repeated sequences inside the template.  They can be used
  +  to determine whether a string follows specific rules,
  +  to find a string within text that fits a specified pattern, or
  +  to split text into lexical elements (tokens).
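
As a rough illustration of those three uses, here is a minimal sketch using Python's re module.  The choice of Python, the function names, the patterns, and the sample strings are all assumptions made for this example; other programs spell their regular expressions somewhat differently.

```python
import re

# 1. Does a string follow specific rules?
#    Assumed rule: an optional minus sign followed by one or more digits.
def is_integer(text):
    return re.fullmatch(r"-?[0-9]+", text) is not None

# 2. Find a string within text that fits a specified pattern.
def first_integer(text):
    match = re.search(r"-?[0-9]+", text)
    return match.group(0) if match else None

# 3. Split text into tokens: runs of digits, runs of letters,
#    or single operator characters.  Whitespace is skipped.
def tokenize(text):
    return re.findall(r"[0-9]+|[A-Za-z_]+|[-+*/=()]", text)

print(is_integer("-42"))                   # True
print(first_integer("room 101 is free"))   # 101
print(tokenize("x = 3 + 42"))              # ['x', '=', '3', '+', '42']
```

The same pattern for an integer serves both the first and second use; only the question asked of it changes (does the whole string fit, or does some part of the text fit).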

First some definitions:

A (character) string is a finite sequence of characters; the number of characters in the string is called the length of the string.

A (formal) language is a set of strings in which all the characters in all the strings are members of a finite set (often called the alphabet of the language).  Note that although the length of each string is finite and the number of distinct characters in all the strings in the language is finite, the language may contain an infinite number of strings.
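
To make that last point concrete, here is a small sketch in Python of a language over a two-character alphabet that never runs out of strings.  The alphabet and the rule (one or more a's followed by a single b) are my own illustrative choices.

```python
from itertools import count, islice

ALPHABET = ("a", "b")  # a finite alphabet (illustrative choice)

def language():
    """Yield every string made of one or more a's followed by one b."""
    for n in count(1):          # n = 1, 2, 3, ... and never stops
        yield "a" * n + "b"

# Each string is finite, but the generator can always produce another one.
first_four = list(islice(language(), 4))
print(first_four)   # ['ab', 'aab', 'aaab', 'aaaab']
```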

Now it starts getting messy.  A regular expression describes the strings in a language (for some languages).  A regular expression contains
  +  characters from the alphabet of the language, and
  +  syntactic elements, which can include
       -  repetition, often specified by an asterisk (*) or a plus sign (+),
       -  alternative (or choice), often specified by a vertical bar (|),
       -  grouping, by surrounding a group with a pair of symbols such as parentheses, ( and ),
          or square brackets, [ and ], or by placing the same symbol before and after the group,
          such as apostrophes (') or quotation marks ("); any program that allows the full range
          of regular expressions must provide for repetition, alternatives, and grouping,
       -  the null string, often represented in textbooks by an upper case Greek lambda (which looks
          like a peaked roof or the front of a tent); a caret (^) will be used here,
       -  classes of characters, such as letter, digit, whitespace,
       -  other symbols provided for in the program that processes the regular expression.
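
The elements in the list above can be tried out in a minimal sketch like the following, which assumes Python's re module.  The helper name matches and the sample patterns are hypothetical, and Python's notation differs in places from the notation used here: in Python the caret (^) marks the start of the text rather than the null string, and square brackets enclose a set of characters rather than a group.

```python
import re

def matches(pattern, text):
    """Return True when the whole of text fits pattern."""
    return re.fullmatch(pattern, text) is not None

# Repetition: * allows zero or more copies, + requires at least one.
print(matches(r"ab*", "a"))                   # True
print(matches(r"ab+", "a"))                   # False

# Alternative: | chooses between two strings.
print(matches(r"cat|dog", "dog"))             # True

# Grouping: parentheses let repetition apply to a whole sequence.
print(matches(r"(ab)+", "ababab"))            # True

# The null string: in Python an empty pattern fits only the empty string.
print(matches(r"", ""))                       # True

# Classes of characters: [A-Za-z] is a letter, \s is whitespace, \d is a digit.
print(matches(r"[A-Za-z]+\s\d+", "room 101"))  # True
```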

Wednesday, February 15, 2012

Who's using Processing?

Processing is a computer language for producing graphics and animation; it is easy to use (although I am still trying to understand colors and text fonts), but talking about it is difficult.  A conversation may go something like this:

Bud:  Have you heard about Processing?
Lou:  Have I heard about processing what?
Bud:  No, have you heard about the Processing language?
Lou:  The processing language for what?
Bud:  I mean the language Processing.
Lou:  What language processing?
Bud:  The language called Processing. It is a graphics language with a syntax similar to C.
Lou:  A syntax similar to see what?

[ Bud and Lou are the first names of the comedians Abbott and Costello; their "Who's on First?" is a classic comedy routine. ]