The Project

Semantikin is the codename of a compiler for a subset of the C programming language. My goal with this project is to develop a fully working compiler that performs simple optimizations, such as common subexpression elimination and constant folding, and produces assembly code for at least x86-64 and ARM.

In the next sections I describe in some detail the structure of the compiler and the supported programming language. If you have any feedback, feel free to get in touch!

The Compiler

Command-line options

Front-end

The Lexer

The Parser

The Semantic Analyzer

The Intermediate Representation

Optimization Passes

Instruction Selection

Register Allocation

The Language

The language supported by Semantikin is a subset of the C programming language. Its syntax and semantics follow those of C as closely as possible. In the future I intend to extend the language with more constructs, such as structures, but only after I have a fully retargetable optimizing compiler.

Keywords

int	float	string	void
if	else	elseif	return	while
printInt	printStr	printFloat	readInt	readStr	readFloat

Comments

The only comment style supported is the C-like double slash (//), which causes the rest of the line to be ignored by the parser.
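As a rough illustration of this rule, the sketch below removes a trailing // comment from a single source line. It is not taken from the Semantikin sources: the function name stripLineComment is hypothetical, and it assumes that string literals are delimited by double quotes and contain no escape sequences, so that a // inside a string is left alone.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns `line` with any trailing // comment removed.
// Assumption: string literals use double quotes and have no escape sequences.
std::string stripLineComment(const std::string& line) {
    bool inString = false;
    for (std::size_t i = 0; i < line.size(); ++i) {
        if (line[i] == '"') {
            inString = !inString;            // toggle on every double quote
        } else if (!inString && line[i] == '/' &&
                   i + 1 < line.size() && line[i + 1] == '/') {
            return line.substr(0, i);        // drop the comment and keep the rest
        }
    }
    return line;                             // no comment found
}
```

For a line that consists only of a comment, the helper returns an empty string.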

Identifiers

Identifiers can start with a digit or a letter and consist of an arbitrarily long sequence of such characters. The regular expression used to recognize identifiers is: [0-9a-zA-Z]*
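To make the pattern concrete, here is a minimal sketch that checks a whole string against it using std::regex. The helper name isIdentifier is hypothetical and is not part of the compiler; only the pattern itself comes from the text above.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if `text` matches the identifier pattern quoted above.
bool isIdentifier(const std::string& text) {
    static const std::regex pattern("[0-9a-zA-Z]*");
    return std::regex_match(text, pattern);   // the whole string must match
}
```

Taken literally, the pattern also accepts the empty string and all-digit sequences such as 123, while a name containing an underscore is rejected. How the actual lexer tells all-digit identifiers apart from integer literals is not described here.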

Types

The available data types are int, float and string. Arrays and matrices are supported. Currently there is no support for user-defined data types.

The language does not have direct support for boolean expressions. Instead, the language and compiler implicitly treat the literal 1 (or a non-empty string) as true and 0 (or an empty string) as false.
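The convention can be modelled with two small overloads, shown here as a sketch only. The name isTruthy is hypothetical, and treating every non-zero integer as true is an assumption borrowed from C; the text above names only the literal 1.

```cpp
#include <string>

// Hypothetical helpers modelling the truth convention described above.
// Assumption: any non-zero integer counts as true, as in C; the text only
// names the literal 1 explicitly.
bool isTruthy(int value) { return value != 0; }

// A string is true when it is non-empty and false when it is empty.
bool isTruthy(const std::string& value) { return !value.empty(); }
```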

Function Declaration, Function Call and Variable Declaration

Function and variable declarations follow the syntax of the C language. Currently the only calling convention supported is the one used by C on Linux/x86.

Scalar variables can have an initialization value; arrays cannot.

Examples:

// First function
int fun1(int n) {
	int a = n, b = n + 1;
	int c = a + b;
	return c;
}

// Second function
int fun2(int a) {
	return fun1(a);
}

Operators

These are all the available operators. Operators closer to the top have lower precedence than those closer to the bottom; the last column gives each operator's associativity.

%=			right
*=	/=		right
+=	-=		right
=			right
||			left
&&			left
|			left
^			left
&			left
==	!=		left
>	>=		left
<	<=		left
+	-		left
*	/	%	left
&			right
~			right
!			right
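One way to read the table is as a lookup from operator to precedence level, where a higher level binds more tightly. The sketch below encodes only the binary operators, numbering the rows from the top starting at 0. The names precedenceLevel and bindsTighter are hypothetical and are not taken from the compiler's parser.

```cpp
#include <map>
#include <string>

// Hypothetical lookup mirroring the binary operators in the table above.
// A larger level means the operator binds more tightly.
int precedenceLevel(const std::string& op) {
    static const std::map<std::string, int> levels = {
        {"%=", 0},
        {"*=", 1}, {"/=", 1},
        {"+=", 2}, {"-=", 2},
        {"=", 3},
        {"||", 4},
        {"&&", 5},
        {"|", 6},
        {"^", 7},
        {"&", 8},
        {"==", 9}, {"!=", 9},
        {">", 10}, {">=", 10},
        {"<", 11}, {"<=", 11},
        {"+", 12}, {"-", 12},
        {"*", 13}, {"/", 13}, {"%", 13},
    };
    auto it = levels.find(op);
    return it == levels.end() ? -1 : it->second;   // -1: not in this sketch
}

// True if `inner` binds more tightly than `outer`, e.g. "*" versus "+".
bool bindsTighter(const std::string& inner, const std::string& outer) {
    return precedenceLevel(inner) > precedenceLevel(outer);
}
```

With this lookup, a + b * c groups as a + (b * c) because "*" sits on a lower row of the table than "+". The unary operators in the last three rows are left out of the sketch.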

All operators are well defined for numerical types; however, the bit manipulation operators and the modulo operator are valid only for integer operands. Integers can also be used directly in boolean expressions, for instance:

void noeffect() {
	int a = 1 || 0;		// Evaluates to true
	int b = 1 && 0;		// Evaluates to false
}

The following operators support string operands under certain circumstances:

Operator	Semantics
+	Concatenates two strings. The programmer must ensure that the target has sufficient space to hold the resulting string.
==	Checks whether two strings are equal.
!=	Checks whether two strings are different.
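The operand rules in this section can be summarized as a small validity check of the kind a semantic analyzer might perform. This is a sketch under stated assumptions, not the code in the repository: the Type enum and isValidBinary are hypothetical names, both operands are assumed to have the same type, and op is assumed to be one of the binary operators listed earlier.

```cpp
#include <string>

// Hypothetical type tags for the three data types named in this document.
enum class Type { Int, Float, String };

// Hypothetical check: may binary operator `op` be applied to two operands
// of type `operand`? Assumption: both operands share the same type and
// `op` is one of the binary operators from the precedence table.
bool isValidBinary(const std::string& op, Type operand) {
    const bool bitOrModulo = op == "&" || op == "|" || op == "^" ||
                             op == "%" || op == "%=";
    switch (operand) {
        case Type::Int:
            return true;                 // every operator accepts integers
        case Type::Float:
            return !bitOrModulo;         // no bit manipulation or modulo
        case Type::String:
            return op == "+" || op == "==" || op == "!=";
    }
    return false;                        // not reached for valid type tags
}
```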

What is Not Supported

Nested assignments are not valid, e.g., a = b = c;
Bit operators are not valid with float or string operands.
Structures are not supported.
Global variables are not supported.
Pointers are not supported.
There is no preprocessor.

TODO

Rename the fields chd1, chd2 and chd3 of the IR::Instruction class to something more meaningful, e.g., target, left and right.
