Grammar, why the formal language?
In formal language theory, a grammar consists of
- A set of characters (terminal symbols), the alphabet of the language
- A set of variables (non-terminal symbols)
- A non-terminal symbol (the start symbol or variable)
- A set of productions or rules consisting of pairs of finite strings, where the strings consist of terminal and non-terminal symbols and the first string (the left-hand side) contains at least one non-terminal symbol
A production is usually written with a right-pointing arrow between the left-hand and right-hand sides and a right-hand side that is the null string is represented by a lower-case Greek epsilon.
A grammar is classified according to its productions
- If the left-hand side of each production consists of a single non-terminal symbol and the right-hand side consists of
- one terminal symbol or
- the empty string or
- one terminal symbol followed by one non-terminal symbol, and all other productions that do not fit in (1) or (2) fit in (3), or
- one non-terminal symbol followed by one terminal symbol, and all other productions that do not fit in (1) or (2) fit in (4)
- If the left-hand side of each production consists of a single non-terminal symbol and the grammar is not regular, the grammar is context-free (type 2).
- If the left-hand side of each production that does not consist of a single non-terminal symbol contains a non-terminal symbol and the right-hand side consists of
- the substring preceding that non-terminal symbol in the left-hand side, followed by
- at least two (terminal or non-terminal) symbols, followed by
- the substring following that non-terminal symbol in the left-hand side
- If the grammar is not regular, context-free, or context-sensitive, it is type 0.
- If the sentential variable does not appear in any right-hand side and the only production with an empty right-hand side has a left-hand side consisting solely of the sentential variable, the grammar is said to be epsilon-free.
- the terminal symbols are a and b
- The non-terminal symbols are S, A, and B
- The sentential variable is S
- The productions are S -> a, S -> aA, S -> b, S -> bB, A -> a, A -> aA, B -> b, B -> bB
A direct derivation is a pair of strings where the second string is the result of replacing an occurrence within the first string of the left-hand side of some production with the right-hand side of that production (for example aaaA => aaaaA). A derivation is a sequence of strings where each pair of consecutive string is a direct derivation (for example aaA => aaaA => aaaaA => aaaaa), the derivation can be summarized by its first and last strings (in this example aaA =>* aaaaa). The language generated by a grammar is the set of strings of terminals that can be derived from the sentential variable.
A language that can be generated by a regular grammar is regular, a language that can be generated by a context-free grammar is context-free, and a language that can be generated by a context-sensitive grammar is context-sensitive.
0 Comments:
Post a Comment
Please enter your comment here. Comments wil be reviewed before being published.
Subscribe to Post Comments [Atom]
<< Home