C Compiler
A collaborative project with Satyam Jay
Overview
This project involves the development of a C compiler that is capable of parsing C code, building an Abstract Syntax Tree (AST), and emitting LLVM Intermediate Representation (IR). The compiler supports basic C constructs and is designed to be extended with more complex features and optimizations.
Development Phases
Part 1: Building AST
- Objective: Build an AST using the provided Bison grammar specification.
- Implementation:
- Modified the
c.yfile to support function definitions and expressions. - Implemented a
dump_ast()function to visualize the AST structure. - Ensured the AST could handle basic examples like
1 2 3 4 5 6 7 8 9
void empty() { } void simple_arith() { (10 - 10/3) << 3 | (23+8*12) & 1024; } void simple_arith_with_arg(int d) { (d > d/2) || (d >= 100) && (d < 99); }
- Modified the
Part 2: Supporting More Complex Constructs
- Objective: Enhance the compiler to support additional C constructs necessary for compiling more complex programs.
- Implementation:
- Added support for variable declarations, control flow statements (if, while, return, goto).
- Implemented a symbol table for tracking scopes and declarations.
- Ensured the compiler could parse and build AST for
1 2 3 4 5 6 7 8
int printf(char const *format, ...); int main(int argc, char **argv) { printf("hello, world\n"); return 0; }
Part 3: Emitting LLVM IR
- Objective: Generate LLVM IR from the AST.
- Implementation:
- Integrated LLVM libraries to convert the AST directly into LLVM IR.
- Ensured the generated LLVM IR could be executed using the LLVM interpreter (
lli).
Part 4: Local Optimizations
- Objective: Implement local optimizations in the LLVM IR generation phase.
- Implementation:
- Added optimizations such as constant folding, dead code removal, and trivial constant propagation.
- Example optimization:
- Algebraic Simplification.
Adding 0, Multiplying 0/1... - DeadCode Removal.
if(1) then remove else. if(0) then remove if. while(0) remove while. - Constant Propagation.
Only supported propagation in the same block. x=1; y=x+1; => y=2
- Algebraic Simplification.
Testing
Tested on:
CMake v3.16.3Boost v1.71.0llvm-19 / llvm-17
Build Instructions:
mkdir build && cd buildcmake ..make
Compile from source to LLVM IR
./cc --in input.cRun with default options no optimization, input file is “input.c”./cc --in input.c --all-optEnable all optimizations, input file is “input.c”./cc --helpto see all options
Run LLVM IR:
lli-17 input.c.ll
This post is licensed under
CC BY 4.0
by the author.
Comments powered by Disqus.