COMS W4115
Programming Languages and Translators
Lecture 4: Language Processing Tools
February 4, 2008
Lecture Outline
- Review
- Language design issues
- Language processing tools
- Structure of a compiler
- Reading
1. Review
- Project teams should now be formed
- Language whitepaper: due February 27, 2008
- Concurrency
2. Language Design Issues
- Syntax: specifies the structure of a well-formed program
- Semantics: specifies the meaning of a program
- Computational model
- von Neumann
- functional
- logic or constraint-based
- dataflow
- object oriented
- scripting
- Programming style
- imperative ("how")
- declarative ("what")
- Names
- A name is a character string used to represent a program element.
- In C, each name that denotes an object has a storage class, which
determines where the object is stored and how long it lives, and a type,
which determines how the stored value is interpreted.
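As a small illustration of storage class versus type, the sketch below declares two `int` objects that differ only in storage class. The function names `next_id` and `fresh` are hypothetical and chosen only for this example.

```c
/* Both variables have type int; they differ only in storage class. */

/* `calls` has static storage: one object that persists across calls. */
int next_id(void) {
    static int calls = 0;   /* storage class: static, type: int */
    calls++;
    return calls;
}

/* `local` has automatic storage: a fresh object is created on every call. */
int fresh(void) {
    int local = 0;          /* storage class: automatic, type: int */
    local++;
    return local;
}
```

Calling `next_id()` repeatedly yields increasing values because the same object is reused, while `fresh()` returns the same value every time.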
- Bindings
- A binding is an association between a name and the object it represents.
- Binding time
- Time at which a binding is created.
- Binding times range from language design time to run time.
- Bindings done at run time are dynamic; otherwise, they are static.
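One way to see the static/dynamic distinction in C is to compare a direct call with a call through a function pointer. This is only a sketch: the names `add`, `mul`, `apply_static`, and `apply_dynamic` are hypothetical.

```c
static int add(int a, int b) { return a + b; }
static int mul(int a, int b) { return a * b; }

/* Static binding: the name `add` is bound to its function before run time. */
int apply_static(int a, int b) {
    return add(a, b);
}

/* Dynamic binding: which function `op` refers to is decided at run time,
   based on the value of `which`. */
int apply_dynamic(char which, int a, int b) {
    int (*op)(int, int) = (which == '*') ? mul : add;
    return op(a, b);
}
```

In `apply_static` the callee is known to the compiler and linker; in `apply_dynamic` the association between `op` and a function is created only when the call executes.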
- Scope
- The textual region of a program in which a binding is active.
- C is statically scoped; the binding between names and objects can
be determined at compile time.
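The following sketch shows static scoping at work in C. All names (`x`, `read_x`, `shadow`, `inner_block`) are hypothetical and serve only to illustrate the rule.

```c
int x = 1;                       /* file-scope binding of the name x */

/* The x used here is fixed by the program text: it is the file-scope x. */
int read_x(void) { return x; }

int shadow(void) {
    int x = 2;                   /* block-scope binding hides the outer x */
    (void)x;                     /* silence the unused-variable warning */
    return read_x();             /* read_x still sees the file-scope x */
}

int inner_block(void) {
    int x = 10;
    int seen_inside;
    {
        int x = 20;              /* new binding, active only in this block */
        seen_inside = x;         /* refers to the inner x */
    }
    return x + seen_inside;      /* outer x (10) plus the inner value (20) */
}
```

Under static scoping `shadow()` returns 1, because the `x` in `read_x` is resolved from the textual structure of the program. A dynamically scoped language would instead find the caller's `x` and return 2.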
- Control flow
- Specifies order in which operations are executed
- Principal mechanisms
- sequencing: e.g.,
begin . . . end
- selection: e.g.,
if-then
- iteration: e.g.,
while-loops
- procedures/recursion
- concurrency
- nondeterminism
- Structured vs. unstructured control flow
- See E. W. Dijkstra, "Go To Statement Considered Harmful,"
Comm. ACM, 11(3), pp. 147-148, March 1968.
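To make the structured/unstructured contrast concrete, here is the same summation written three ways in C: with `goto`, with a `while` loop, and with recursion. The function names are hypothetical.

```c
/* Unstructured: iteration built from labels and goto. */
int sum_goto(int n) {
    int i = 1, total = 0;
loop:
    if (i > n) goto done;
    total += i;
    i++;
    goto loop;
done:
    return total;
}

/* Structured: the same iteration expressed as a while loop. */
int sum_while(int n) {
    int i = 1, total = 0;
    while (i <= n) {
        total += i;
        i++;
    }
    return total;
}

/* Recursion: the same computation using procedure calls instead of a loop. */
int sum_rec(int n) {
    return (n <= 0) ? 0 : n + sum_rec(n - 1);
}
```

All three compute 1 + 2 + ... + n, but in the `while` version the loop has a single entry and a single exit that are visible in the program text.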
- Types
- types determine the permissible values and operations within
a program
- they provide an implicit context for operations
- they let many bugs be detected at compile time, before the program runs
- a type system is a set of rules for
- defining and associating types with various parts of a program
- defining type equivalence, compatibility, and inference
- In C the fundamental types are
- char
- integer
- floating point
- enumeration
- void
- In C the derived types are
- arrays of objects of a given type
- functions returning objects of a given type
- pointers to objects of a given type
- structures containing a sequence of objects of various types
- unions containing any one of several objects of various types
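The sketch below declares one object of each derived kind listed above and also shows how types supply an implicit context for an operator. Every identifier in it is hypothetical.

```c
enum color { RED, GREEN, BLUE };          /* enumeration (fundamental) */

struct point { int x; int y; };           /* structure: a sequence of members */

union number { int i; double d; };        /* union: holds one member at a time */

int square(int v) { return v * v; }       /* function returning int */

/* The operand types select the meaning of `/`. */
int int_quotient(void) { return 7 / 2; }          /* integer division */
double real_quotient(void) { return 7.0 / 2; }    /* floating-point division */

int demo_derived(void) {
    int a[3] = {1, 2, 3};                 /* array of int */
    int *p = &a[1];                       /* pointer to int */
    struct point pt = {4, 5};
    union number n;
    enum color c = BLUE;                  /* BLUE has the value 2 */
    int (*f)(int) = square;               /* pointer to a function */

    n.i = 6;                              /* the int member is now the active one */
    /* 2 + 4 + 5 + 6 + 2 + 9 */
    return *p + pt.x + pt.y + n.i + (int)c + f(3);
}
```

The two quotient functions use the same operator symbol; the types of the operands decide whether the result is 3 or 3.5.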
- Data abstraction and object orientation
3. Language Processing Tools
- Basic compiler
- Interpreter
- Bytecode interpreter
- Just-in-time compiler
- Linker and loader
- Preprocessor
- Compiler-compiler
- Compiler component generators
- lex, flex
- yacc, bison
- antlr
- Debugger
- Profiler
- Make facility
- Version control
- Integrated software development environment
- See GCC, the GNU Compiler Collection,
for compilers from GNU
4. Structure of a Compiler
- Front end: analysis
- Back end: synthesis
- IR: Intermediate representation(s)
- Phases
- lexical analyzer (scanner)
- syntax analyzer (parser)
- semantic analyzer
- intermediate code generator
- machine-independent code optimizer
- code generator
- machine-dependent code optimizer
- Symbol table
- Error handler
- See Fig. 1.6, ALSU p. 5.
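The phases above can be sketched in miniature. The toy translator below is an illustration only, not the structure from the textbook figure: it assumes a hypothetical language of non-negative integers, `+`, `*`, and parentheses. It has a scanner, a recursive-descent parser, a code generator for a made-up stack machine, and a minimal error flag. It has no semantic analyzer, optimizer, or symbol table, and does no overflow checking. All names are hypothetical.

```c
#include <ctype.h>

enum { T_END = 0, T_NUM = 256 };   /* operators are their own token codes */
enum opcode { OP_PUSH, OP_ADD, OP_MUL };
struct instr { enum opcode op; int arg; };
#define MAX_CODE 64

struct compiler {
    const char *p;                 /* scanner position in the source text */
    int tok;                       /* current lookahead token */
    int value;                     /* value of the current T_NUM token */
    struct instr code[MAX_CODE];   /* generated stack-machine code */
    int n;                         /* number of instructions emitted */
    int error;                     /* set when any phase reports an error */
};

/* Lexical analyzer (scanner): turns characters into tokens. */
static void scan(struct compiler *c) {
    while (isspace((unsigned char)*c->p)) c->p++;
    if (isdigit((unsigned char)*c->p)) {
        c->value = 0;
        while (isdigit((unsigned char)*c->p))
            c->value = c->value * 10 + (*c->p++ - '0');
        c->tok = T_NUM;
    } else if (*c->p == '\0') {
        c->tok = T_END;            /* stay on the terminator */
    } else {
        c->tok = (unsigned char)*c->p++;   /* any other character is a token */
    }
}

/* Code generator: appends one stack-machine instruction. */
static void emit(struct compiler *c, enum opcode op, int arg) {
    if (c->n == MAX_CODE) { c->error = 1; return; }
    c->code[c->n].op = op;
    c->code[c->n++].arg = arg;
}

static void expr(struct compiler *c);

/* Syntax analyzer (parser): factor -> NUM | '(' expr ')' */
static void factor(struct compiler *c) {
    if (c->tok == T_NUM) {
        emit(c, OP_PUSH, c->value);
        scan(c);
    } else if (c->tok == '(') {
        scan(c);
        expr(c);
        if (c->tok == ')') scan(c); else c->error = 1;
    } else {
        c->error = 1;
    }
}

/* term -> factor { '*' factor } */
static void term(struct compiler *c) {
    factor(c);
    while (c->tok == '*' && !c->error) {
        scan(c);
        factor(c);
        emit(c, OP_MUL, 0);
    }
}

/* expr -> term { '+' term } */
static void expr(struct compiler *c) {
    term(c);
    while (c->tok == '+' && !c->error) {
        scan(c);
        term(c);
        emit(c, OP_ADD, 0);
    }
}

/* Drives the phases; returns 0 on success and -1 on a syntax error. */
int compile(struct compiler *c, const char *src) {
    c->p = src; c->n = 0; c->error = 0;
    scan(c);
    expr(c);
    if (c->tok != T_END) c->error = 1;
    return c->error ? -1 : 0;
}

/* A tiny stack machine that executes successfully compiled code. */
int run(const struct compiler *c) {
    int stack[MAX_CODE], sp = 0;
    if (c->error) return 0;
    for (int i = 0; i < c->n; i++) {
        const struct instr *in = &c->code[i];
        if (in->op == OP_PUSH) stack[sp++] = in->arg;
        else if (in->op == OP_ADD) { sp--; stack[sp - 1] += stack[sp]; }
        else { sp--; stack[sp - 1] *= stack[sp]; }
    }
    return sp == 1 ? stack[0] : 0;
}
```

For the input `2 + 3 * (4 + 1)`, `compile` emits seven instructions in postfix order (push 2, push 3, push 4, push 1, add, multiply, add), and `run` evaluates them to 17. The array of `struct instr` plays the role of a very simple intermediate representation between the front end and the stack machine.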
5. Reading
aho@cs.columbia.edu