Formal Languages and Parsing
How do we get a computer to understand a programming language? After all, a program is just a huge string.
- We specify a language using one of the methods in Regular Languages and Kleene's Theorem
- Language: what words are valid in the language?
- We use a lexer (scanner) to split up our program string into individual valid words (tokens)
- We specify a grammar for a parser
- Grammar: what sequences of words are valid in the language? How should we interpret it?
- Parser: parses the tokens from the lexer and with the grammar forms a syntax tree