XVR AOT Compiler
The XVR language uses an AOT (Ahead-of-Time) compiler that generates native executables via LLVM IR.
Compilation Pipeline
┌─────────────┐    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│  .xvr File  │───▶│    Lexer    │───▶│   Parser    │───▶│  AST Nodes  │───▶│    LLVM     │
│  (Source)   │    │  (Tokens)   │    │  (Errors)   │    │   (Tree)    │    │     IR      │
└─────────────┘    └─────────────┘    └─────────────┘    └─────────────┘    └─────────────┘
                                                                                   │
                                                                                   ▼
                                                                          ┌─────────────────┐
                                                                          │ LLVM Optimizer  │
                                                                          │   (Optional)    │
                                                                          └─────────────────┘
                                                                                   │
       ┌───────────────────────────────────────────────────────────────────────────┤
       │                                                                           │
       ▼                                                                           ▼
┌─────────────┐                                                           ┌─────────────────┐
│ Executable  │◀─── Link ───┐                                     ┌──────▶│   Object File   │
│   (a.out)   │             │                                     │       │    (.o/.obj)    │
└─────────────┘             │         ┌─────────────────┐         │       └─────────────────┘
                            └────────▶│   LLVM MCJIT    │◀────────┘
                                      │  or JIT (dev)   │
                                      └─────────────────┘
Overview
The XVR compiler translates .xvr source files into:
- Native executables (via LLVM IR → object file → linked binary)
- LLVM IR (for debugging/inspection)
- Object files (for custom linking)
Architecture
┌─────────────────────────────────────────────────────────────────────────────────────────┐
│                                      XVR Compiler                                       │
├─────────────────────────────────────────────────────────────────────────────────────────┤
│                                                                                         │
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────────────────────────────────────┐  │
│  │    Lexer    │───▶│   Parser    │───▶│                LLVM Backend                 │  │
│  │ xvr_lexer.c │    │xvr_parser.c │    │                                             │  │
│  └─────────────┘    └─────────────┘    │  ┌───────────────────────────────────────┐  │  │
│                                        │  │          xvr_llvm_codegen.c           │  │  │
│                                        │  │             (Coordinator)             │  │  │
│                                        │  └───────────────────┬───────────────────┘  │  │
│                                        │                      │                      │  │
│                                        │      ┌───────────────┼──────────────┐       │  │
│                                        │      │               │              │       │  │
│                                        │      ▼               ▼              ▼       │  │
│                                        │ ┌──────────┐   ┌──────────┐   ┌──────────┐  │  │
│                                        │ │Context   │   │Type      │   │Module    │  │  │
│                                        │ │Manager   │   │Mapper    │   │Manager   │  │  │
│                                        │ │.c        │   │.c        │   │.c        │  │  │
│                                        │ └──────────┘   └──────────┘   └──────────┘  │  │
│                                        │ ┌──────────┐   ┌──────────┐   ┌──────────┐  │  │
│                                        │ │IR        │   │Expression│   │Function  │  │  │
│                                        │ │Builder   │   │Emitter   │   │Emitter   │  │  │
│                                        │ │.c        │   │.c        │   │.c        │  │  │
│                                        │ └──────────┘   └──────────┘   └──────────┘  │  │
│                                        │ ┌──────────┐   ┌──────────┐   ┌──────────┐  │  │
│                                        │ │Control   │   │Optimizer │   │Target    │  │  │
│                                        │ │Flow      │   │.c        │   │.c        │  │  │
│                                        │ │.c        │   │          │   │          │  │  │
│                                        │ └──────────┘   └──────────┘   └──────────┘  │  │
│                                        │                                             │  │
│                                        └─────────────────────────────────────────────┘  │
│                                                                                         │
└─────────────────────────────────────────────────────────────────────────────────────────┘
Backend Module Flow
┌─────────────────────────────────────────────────────────┐
│                 LLVM Backend Data Flow                  │
└─────────────────────────────────────────────────────────┘

AST Node             Codegen Stage          LLVM IR Output
────────             ─────────────          ──────────────

┌──────────┐         ┌──────────────┐       ┌─────────────┐
│ VAR_DECL │────────▶│ xvr_llvm_    │──────▶│ %x =        │
│ var x=42 │         │ codegen.c    │       │ alloca i32  │
└──────────┘         └──────────────┘       └─────────────┘
                            │
                            ▼
                     ┌──────────────────┐
                     │ xvr_llvm_type_   │
                     │ mapper.c         │────────▶ i32, i8*, float, etc.
                     └──────────────────┘
                            │
                            ▼
                     ┌──────────────────┐
                     │ xvr_llvm_ir_     │
                     │ builder.c        │────────▶ LLVMBuildAlloca, LLVMBuildStore
                     └──────────────────┘

┌──────────┐         ┌──────────────┐       ┌─────────────┐
│ BINARY   │────────▶│ xvr_llvm_    │──────▶│ %add =      │
│ x + y    │         │ expression_  │       │ add i32     │
└──────────┘         │ emitter.c    │       │ %x, %y      │
                     └──────────────┘       └─────────────┘

┌──────────┐         ┌──────────────┐       ┌─────────────┐
│ IF/      │────────▶│ xvr_llvm_    │──────▶│ br i1 %cond,│
│ WHILE    │         │ control_flow │       │ label,      │
│          │         │ .c           │       │ label       │
└──────────┘         └──────────────┘       └─────────────┘
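The routing shown in the diagram can be read as a dispatch on AST node kind. The following is a minimal sketch of that idea, not the actual XVR source: the `NodeKind` enum and the `emitter_for` function are hypothetical names introduced for illustration, and only the file names are taken from the diagram.

```c
#include <stddef.h>

/* Hypothetical node kinds; the real AST produced by xvr_parser.c may differ. */
typedef enum {
    NODE_VAR_DECL,
    NODE_BINARY,
    NODE_IF,
    NODE_WHILE
} NodeKind;

/* Returns the backend file that, according to the diagram, emits IR
 * for the given node kind. Returns NULL for an unknown kind. */
const char *emitter_for(NodeKind kind) {
    switch (kind) {
    case NODE_VAR_DECL: return "xvr_llvm_codegen.c";
    case NODE_BINARY:   return "xvr_llvm_expression_emitter.c";
    case NODE_IF:
    case NODE_WHILE:    return "xvr_llvm_control_flow.c";
    }
    return NULL;
}
```

In this reading, a variable declaration additionally passes through the type mapper and the IR builder before the `alloca` and `store` instructions are produced.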
Source File Organization
src/backend/
├── xvr_llvm_codegen.h/.c             # Main coordinator, entry points
├── xvr_llvm_context.h/.c             # LLVM context management
├── xvr_llvm_type_mapper.h/.c         # Type mapping (XVR → LLVM types)
├── xvr_llvm_module_manager.h/.c      # Module creation & IR printing
├── xvr_llvm_ir_builder.h/.c          # IR building (alloca, store, load, etc.)
├── xvr_llvm_expression_emitter.h/.c  # Expressions, arrays, indexing
├── xvr_llvm_function_emitter.h/.c    # Function definitions
├── xvr_llvm_control_flow.h/.c        # If/while/for generation
├── xvr_llvm_optimizer.h/.c           # Optimization passes
├── xvr_llvm_target.h/.c              # Target machine configuration
└── xvr_format_string.h/.c            # Format string {} parser
Usage
# Compile and run
xvr script.xvr
# Compile to executable (default: a.out)
xvr script.xvr -o myprogram
# Compile to object file
xvr script.xvr -c -o program.o
# Dump LLVM IR
xvr script.xvr -l
# Show help
xvr -h
# Show version
xvr -v
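As a rough illustration of how the flags listed above could select an output mode, here is a small self-contained sketch. It is an assumption about the behavior, not the real driver: the `Mode` enum, the `Options` struct, and `parse_args` are hypothetical names, and the rule that `-o` alone means "executable" is inferred from the usage lines.

```c
#include <stddef.h>
#include <string.h>

typedef enum {
    MODE_RUN, MODE_EXECUTABLE, MODE_OBJECT, MODE_DUMP_IR,
    MODE_HELP, MODE_VERSION, MODE_ERROR
} Mode;

typedef struct {
    Mode mode;
    const char *input;   /* .xvr source path, or NULL if none was given */
    const char *output;  /* argument of -o, or NULL */
} Options;

/* Hypothetical parser for the documented flags: -o, -c, -l, -h, -v. */
Options parse_args(int argc, char **argv) {
    Options opts = { MODE_RUN, NULL, NULL };
    for (int i = 1; i < argc; i++) {
        const char *arg = argv[i];
        if (strcmp(arg, "-h") == 0) { opts.mode = MODE_HELP; return opts; }
        if (strcmp(arg, "-v") == 0) { opts.mode = MODE_VERSION; return opts; }
        if (strcmp(arg, "-l") == 0) {
            opts.mode = MODE_DUMP_IR;
        } else if (strcmp(arg, "-c") == 0) {
            opts.mode = MODE_OBJECT;
        } else if (strcmp(arg, "-o") == 0) {
            if (i + 1 >= argc) { opts.mode = MODE_ERROR; return opts; }
            opts.output = argv[++i];
            /* -o without -c or -l requests a linked executable. */
            if (opts.mode == MODE_RUN) opts.mode = MODE_EXECUTABLE;
        } else if (arg[0] != '-') {
            opts.input = arg;
        } else {
            opts.mode = MODE_ERROR;   /* unknown flag */
            return opts;
        }
    }
    if (opts.input == NULL) opts.mode = MODE_ERROR;
    return opts;
}
```

With this sketch, `xvr script.xvr -c -o program.o` yields `MODE_OBJECT` with `output` set to `program.o`, while `xvr script.xvr` alone yields `MODE_RUN`.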
Language Features
Variables
var x = 42;
var name = "hello";
var pi = 3.14;
If Statements
Statement Form
var score: int32 = 85;
if (score >= 90) {
std::print("Grade: A");
} else if (score >= 80) {
std::print("Grade: B");
} else {
std::print("Grade: C or lower");
}
An if can also be used as an expression that returns a value:
var score: int32 = 85;
var grade: string = if (score >= 90) {
"A"
} else if (score >= 80) {
"B"
} else {
"C or lower"
};
std::print(grade);
Note: An expression-based if requires an explicit type annotation on the variable.
While Loops
var i = 0;
while (i < 10) {
std::print("{}", i);
i = i + 1;
}
For Loops
The for loop is ideal when you know the number of iterations:
// Basic for loop
for (var i = 0; i < 5; i++) {
std::print("{}", i); // 0, 1, 2, 3, 4
}
// With std::print
include std;
for (var i = 0; i < 20; i++) {
std::print("{}\n", i);
}
Syntax:
for (init; condition; increment) {
// body
}
| Part | Description |
|---|---|
| init | Variable initialization (e.g., var i = 0) |
| condition | Boolean expression evaluated before each iteration |
| increment | Executed after each iteration (e.g., i++, i--) |
Increment/Decrement Operators
| Operator | Description |
|---|---|
| ++ | Increment by 1 |
| -- | Decrement by 1 |
// Count up
for (var i = 0; i < 5; i++) { } // 0, 1, 2, 3, 4
// Count down
for (var i = 5; i > 0; i--) { } // 5, 4, 3, 2, 1
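The init / condition / increment order described above is the same order C uses, so the equivalence between a for loop and a hand-written while loop can be illustrated in C. This is only an analogy under that assumption; the function names are made up for the example and nothing here is taken from the XVR code generator.

```c
/* Sums 0 .. n-1 using a for loop: init, condition, increment in the header. */
int sum_for(int n) {
    int total = 0;
    for (int i = 0; i < n; i++) {
        total += i;
    }
    return total;
}

/* The same loop spelled out: init runs once, the condition is checked
 * before each iteration, and the increment runs after the body. */
int sum_while(int n) {
    int total = 0;
    int i = 0;              /* init */
    while (i < n) {         /* condition */
        total += i;         /* body */
        i++;                /* increment */
    }
    return total;
}
```

Both functions return the same value for every `n`. The one place the two forms differ is `continue`: in a for loop it still runs the increment, whereas in the hand-written while loop it would skip it.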
Break and Continue
Control loop execution with break and continue:
var i = 0;
while (i < 100) {
i = i + 1;
if (i == 50) {
break; // Exit loop when i reaches 50
}
if (i % 2 == 0) {
continue; // Skip even numbers
}
std::print("{}", i); // Only prints odd numbers
}
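Conditions must be boolean. The compiler validates this and suggests a fix:
var x = 5;
if (x) { } // ERROR: condition must be boolean
while (x) { } // ERROR: condition must be boolean
// help: use a comparison operator (e.g., 'x > 0') or wrap the condition with 'bool()'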
Print with Format Strings
XVR uses {} placeholders:
var name = "world";
var num = 42;
std::print("Hello, {}!", name); // Hello, world!
std::print("Value: {}", num); // Value: 42
std::print("{} + {} = {}", 1, 2, 3); // 1 + 2 = 3
// Direct array printing
var arr = [1, 2, 3];
std::print(arr); // prints: 1 2 3
Static Arrays
var arr = [1, 2, 3, 4, 5];
std::print(arr[0]); // prints 1
// Array assignment
arr[1] = 20;
std::print(arr[1]); // prints 20
String Concatenation
XVR supports compile-time and runtime string concatenation:
// Compile-time concatenation (both operands are string literals)
var msg = "Hello, " + "World!"; // "Hello, World!" at compile time
// Runtime concatenation (at least one operand is runtime)
var name = "John";
var greeting = "Hello, " + name; // uses string_concat runtime proc
Optimization: When both operands are string literals, the compiler constant-folds them into a single string at compile time. When at least one operand is a runtime value (like a variable or function parameter), it falls back to the string_concat runtime procedure.
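The decision between folding and the runtime call can be sketched as follows. This is an illustrative assumption about the shape of the logic, not the compiler's code: `StrOperand` and `try_fold_concat` are hypothetical names, and a NULL result stands for "emit a call to the string_concat runtime procedure instead".

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical operand: a literal carries its text, a runtime value does not. */
typedef struct {
    bool is_literal;
    const char *text;   /* meaningful only when is_literal is true */
} StrOperand;

/* Returns a newly allocated folded string when both operands are literals.
 * Returns NULL when either operand is a runtime value (or on allocation
 * failure), meaning the caller should fall back to the runtime path.
 * The caller owns the returned buffer and must free it. */
char *try_fold_concat(StrOperand lhs, StrOperand rhs) {
    if (!lhs.is_literal || !rhs.is_literal) {
        return NULL;
    }
    size_t n1 = strlen(lhs.text);
    size_t n2 = strlen(rhs.text);
    char *out = malloc(n1 + n2 + 1);
    if (out == NULL) {
        return NULL;
    }
    memcpy(out, lhs.text, n1);
    memcpy(out + n1, rhs.text, n2 + 1);   /* also copies the terminating NUL */
    return out;
}
```

For the examples above, folding `"Hello, "` and `"World!"` produces a single constant, while `"Hello, " + name` returns NULL here because `name` is only known at run time.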
Format String Parser
The xvr_format_string.c module handles {} interpolation:
Format String Parsing Flow:
┌─────────────────┐
│  "Hello {}!"    │  (input)
└────────┬────────┘
         │
         ▼
┌─────────────────────────────────────┐
│  Parse for {} placeholders          │
│  Count arguments                    │
└────────┬────────────────────────────┘
         │
         ▼
┌─────────────────────────────────────┐
│  Infer types from LLVM values       │
│  integer → %d                       │
│  float   → %lf                      │
│  string  → %s                       │
└────────┬────────────────────────────┘
         │
         ▼
┌─────────────────┐
│  "Hello %s!"    │  (printf format)
└─────────────────┘
Type Mapping
| XVR Type | LLVM Type | printf |
|---|---|---|
| string | i8* | %s |
| integer | i32 | %d |
| float | float → double | %lf |
| boolean | i1 | %s |
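The translation from {} placeholders to printf specifiers can be sketched in a few lines of C. This is a simplified stand-in, not the contents of xvr_format_string.c: `ArgType`, `spec_for`, and `xvr_to_printf` are hypothetical names, argument types are passed in directly rather than inferred from LLVM values, and escaping a literal `%` as `%%` is an assumption the source does not mention.

```c
#include <stddef.h>
#include <string.h>

typedef enum { ARG_STRING, ARG_INTEGER, ARG_FLOAT, ARG_BOOLEAN } ArgType;

/* printf specifier for each type, following the type mapping table. */
static const char *spec_for(ArgType t) {
    switch (t) {
    case ARG_INTEGER: return "%d";
    case ARG_FLOAT:   return "%lf";
    case ARG_STRING:
    case ARG_BOOLEAN: return "%s";
    }
    return "%s";
}

/* Rewrites each "{}" in fmt to a printf specifier chosen from args[].
 * Returns 0 on success, or -1 if the placeholder count differs from nargs
 * or the result (including the NUL) does not fit in out. */
int xvr_to_printf(const char *fmt, const ArgType *args, size_t nargs,
                  char *out, size_t out_size) {
    size_t used = 0, next_arg = 0;
    const char *p = fmt;
    while (*p != '\0') {
        char literal[3] = { 0 };
        const char *piece = literal;
        if (p[0] == '{' && p[1] == '}') {
            if (next_arg >= nargs) return -1;     /* too few arguments */
            piece = spec_for(args[next_arg++]);
            p += 2;
        } else if (*p == '%') {
            literal[0] = literal[1] = '%';        /* assumption: escape for printf */
            p += 1;
        } else {
            literal[0] = *p;
            p += 1;
        }
        size_t len = strlen(piece);
        if (used + len + 1 > out_size) return -1; /* keep room for the NUL */
        memcpy(out + used, piece, len);
        used += len;
    }
    if (next_arg != nargs || out_size == 0) return -1;
    out[used] = '\0';
    return 0;
}
```

Given the input `"Hello, {}!"` and one string argument, the sketch produces `"Hello, %s!"`, which matches the format constant in the LLVM IR output example.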
Output Examples
LLVM IR
$ xvr test.xvr -l
; ModuleID = 'test'
source_filename = "test"
@fmt_str = private unnamed_addr constant [11 x i8] c"Hello, %s!\00", align 1
define i32 @main() {
entry:
%name = alloca ptr, align 8
store ptr @str_literal, ptr %name, align 8
%name1 = load ptr, ptr %name, align 8
%printf_call = call i32 (ptr, ...) @printf(ptr @fmt_str, ptr %name1)
ret i32 0
}
declare i32 @printf(ptr, ...)
Object File
$ xvr test.xvr -c -o test.o
$ file test.o
test.o: ELF 64-bit LSB relocatable, x86-64, version 1 (SYSV), not stripped
Error Handling
The XVR compiler provides clear, actionable error messages with helpful hints.
Type Mismatch
XVR validates explicit type annotations:
var x: int = 1.5; // error: type mismatch: cannot convert from 'float' to 'int'
var x: string = 123; // error: type mismatch: cannot convert from 'int' to 'string'
var x: bool = 1; // error: type mismatch: cannot convert from 'int' to 'bool'
Non-Boolean Conditions
Control flow conditions must be boolean:
var x = 5;
if (x) { } // ERROR
Error output:
error: condition of if statement must be boolean, got 'i32'
help: use a comparison operator (e.g., 'x > 0') or wrap the condition with 'bool()'
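A check that produces this two-line diagnostic might look like the sketch below. It is a guess at the structure, not the compiler's implementation: `check_condition` is a hypothetical name, the condition type is passed as an LLVM type name string for simplicity, and only the message text is taken from the output above.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 and clears msg if the condition has LLVM type i1 (boolean).
 * Otherwise writes an "error:" line and a "help:" line into msg and
 * returns 0. stmt names the construct, for example "if" or "while". */
int check_condition(const char *stmt, const char *llvm_type,
                    char *msg, size_t msg_size) {
    if (strcmp(llvm_type, "i1") == 0) {
        if (msg_size > 0) msg[0] = '\0';
        return 1;
    }
    snprintf(msg, msg_size,
             "error: condition of %s statement must be boolean, got '%s'\n"
             "help: use a comparison operator (e.g., 'x > 0') "
             "or wrap the condition with 'bool()'",
             stmt, llvm_type);
    return 0;
}
```

Calling `check_condition("if", "i32", msg, sizeof msg)` fills `msg` with the same two lines shown in the error output.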
Break/Continue Outside Loop
if (true) {
break; // ERROR
}
Error output:
error: break statement must be inside a loop
help: place the 'break' statement inside a 'while' or 'for' loop
Maximum Loop Nesting
Loops can be nested up to 64 levels deep. Deeper nesting is rejected:
error: maximum loop nesting depth (64) exceeded
help: simplify nested loop structure
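Both loop-related checks (break or continue outside a loop, and the nesting limit) can be driven by a single depth counter. The sketch below shows one way to do it under that assumption; `LoopTracker` and its functions are hypothetical names, and a real code generator would more likely keep a stack of break and continue target blocks rather than a bare counter.

```c
#include <stdbool.h>
#include <stdio.h>

#define MAX_LOOP_DEPTH 64

typedef struct {
    int depth;   /* number of loops currently open */
} LoopTracker;

/* Call when entering a while/for body. Returns false, and reports the
 * documented error, if the loop would exceed the nesting limit. */
bool loop_enter(LoopTracker *t) {
    if (t->depth >= MAX_LOOP_DEPTH) {
        fprintf(stderr,
                "error: maximum loop nesting depth (%d) exceeded\n"
                "help: simplify nested loop structure\n",
                MAX_LOOP_DEPTH);
        return false;
    }
    t->depth++;
    return true;
}

/* Call when leaving a loop body. */
void loop_exit(LoopTracker *t) {
    if (t->depth > 0) {
        t->depth--;
    }
}

/* break and continue are valid only while at least one loop is open. */
bool loop_allows_jump(const LoopTracker *t) {
    return t->depth > 0;
}
```

With a fresh tracker, `loop_allows_jump` is false, which corresponds to the "break statement must be inside a loop" error; after 64 successful `loop_enter` calls, the next one fails.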
Building
Requirements
- LLVM 21+ (with C API headers)
- C compiler with C18 support
Build
make
Output: out/xvr
Differences from Interpreter
- AOT only: No interpreter, no REPL
- Format strings: Uses {} instead of printf % syntax
- Variables: Use var keyword (not let)
- Semicolons: Optional
- Procedures: Use proc keyword with optional return type annotation
Future Enhancements
- Struct types
- Better optimization passes
- Multiple return values