AOT Compiler

The ahead-of-time (AOT) compiler that was originally used to compile SharpOS did its job, but it suffered from a few architectural weaknesses that made it difficult to extend. The SharpOS developers therefore decided to concentrate on a new compiler with a more compartmentalized design. The AOT compiler is nevertheless still a valuable learning tool for anyone interested in what it takes to generate machine code from Common Intermediate Language (CIL) instructions.

Phases

IR Generation

The IR generation phase of the AOT compiler builds a tree of IR information rooted in the SharpOS.AOT.IR.Engine instance. It creates a SharpOS.AOT.IR.Class instance for each type and a SharpOS.AOT.IR.Method instance for each method. The AOT skips methods that are marked as ADC stubs and replaces calls to them with calls to the architecture-specific implementation selected for the current compilation. It also skips methods that are marked as unmanaged (for now, as there is currently no way to compile native code into the kernel). The AOT also generates IR for itself, but only for classes that have SharpOS.AOT.Attributes.IncludeAttribute applied.
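The skipping rule described above can be modeled in a few lines. This is only an illustrative Python sketch, not the AOT's own code: the classes SourceType, SourceMethod, IREngine, IRClass and IRMethod are hypothetical stand-ins for the Mono.Cecil input and for SharpOS.AOT.IR.Engine, Class and Method, and the two boolean flags stand in for the ADC stub and unmanaged markers.

```python
from dataclasses import dataclass, field


@dataclass
class SourceMethod:
    """Hypothetical stand-in for a method definition read from the IL."""
    name: str
    is_adc_stub: bool = False   # stands in for the ADC stub marker
    is_unmanaged: bool = False  # stands in for the unmanaged marker


@dataclass
class SourceType:
    """Hypothetical stand-in for a type definition read from the IL."""
    name: str
    methods: list = field(default_factory=list)


@dataclass
class IRMethod:
    name: str


@dataclass
class IRClass:
    name: str
    methods: list = field(default_factory=list)


@dataclass
class IREngine:
    """Root of the IR tree: one IRClass per type, one IRMethod per kept method."""
    classes: list = field(default_factory=list)


def generate_ir(types):
    """Build the IR tree, skipping ADC stubs and unmanaged methods."""
    engine = IREngine()
    for source_type in types:
        ir_class = IRClass(source_type.name)
        for method in source_type.methods:
            if method.is_adc_stub or method.is_unmanaged:
                continue  # no IR is generated for these methods
            ir_class.methods.append(IRMethod(method.name))
        engine.classes.append(ir_class)
    return engine
```

The sketch leaves out the rewriting of calls to ADC stubs, which the Architecture Dependent Code section describes.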

IR Processing

Encoding

After the IR has been processed, the Run method calls asm.Encode (this, options.OutputFilename), where this is the engine containing all the IR data and the second parameter is the name of the binary file that will contain the encoded data (currently a PE file). Note that asm is only an interface (IAssembly), which every encoder has to implement in order to be used by the engine.
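The relationship between the engine and its encoder can be sketched as follows. This is a simplified Python model under stated assumptions, not the real C# types: only the names IAssembly, Encode and Run come from the text, while RecordingAssembly, process_ir and ir_processed are hypothetical names introduced for the example.

```python
from abc import ABC, abstractmethod


class IAssembly(ABC):
    """Model of the encoder interface: one implementation per architecture."""

    @abstractmethod
    def encode(self, engine, output_filename):
        """Encode the engine's IR into the named binary file."""


class RecordingAssembly(IAssembly):
    """Hypothetical encoder that only records how it was called."""

    def __init__(self):
        self.calls = []

    def encode(self, engine, output_filename):
        # Remember the file name and whether the IR was already processed.
        self.calls.append((output_filename, engine.ir_processed))


class Engine:
    """Model of the engine: it owns the IR and delegates encoding."""

    def __init__(self, asm, output_filename):
        self.asm = asm
        self.output_filename = output_filename
        self.ir_processed = False

    def process_ir(self):
        # Placeholder for the IR processing phase.
        self.ir_processed = True

    def run(self):
        self.process_ir()
        # The engine passes itself, so the encoder can reach all the IR data.
        self.asm.encode(self, self.output_filename)
```

Because the engine only depends on the interface, supporting another architecture in this model means adding another IAssembly implementation, with no change to Engine.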

The x86 encoder starts by adding the PE header (AddPEHeader), which also contains the Multiboot structure (AddMultiBootHeader) so that GRUB recognizes the binary as a valid kernel and runs it. It then calls AddEntryPoint, which adds the code that GRUB calls once the kernel has been loaded; this code sets up the x86 stack and calls the type initializers (.cctor) of all types defined in the kernel. Next, the encoder hands control to AssemblyMethod for every defined method. After all the methods have been encoded, AddHelperFunctions is called, which adds support for arithmetic operations that need long operands.
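For readers unfamiliar with what GRUB looks for, the required part of a Multiboot (version 1) header is three 32-bit little-endian fields: a magic number, a flags word, and a checksum chosen so that the three fields sum to zero modulo 2^32. The Python sketch below builds just those fields. It follows the Multiboot specification rather than the body of AddMultiBootHeader, which is not shown here, and the function name multiboot_header is hypothetical. A non-ELF image such as a PE file would also need the address fields that follow when flag bit 16 is set; they are omitted.

```python
import struct

# Magic value defined by the Multiboot (version 1) specification.
MULTIBOOT_HEADER_MAGIC = 0x1BADB002


def multiboot_header(flags):
    """Return the 12 required bytes of a Multiboot header for the given flags."""
    # magic + flags + checksum must be 0 when truncated to 32 bits.
    checksum = -(MULTIBOOT_HEADER_MAGIC + flags) & 0xFFFFFFFF
    return struct.pack("<III", MULTIBOOT_HEADER_MAGIC, flags, checksum)
```

A boot loader validates a candidate header by reading the three fields back and checking that their 32-bit sum is zero.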

Architecture Dependent Code

SharpOS.AOT provides a way to switch between implementations (layers) of a common class interface. The AOT looks for an assembly attribute of type SharpOS.AOT.Attributes.ADCInterfaceAttribute, which tells the AOT which namespace represents the ADC interface. This is the namespace that other kernel code uses to make calls into the ADC implementation. Each method that is supposed to be implemented by the underlying ADC layer should be marked with SharpOS.AOT.Attributes.ADCStubAttribute.

Each implementation must also provide an ADCLayerAttribute that expresses the namespace containing the implementation code and the name of the processor that the layer applies to. The SharpOS.AOT chooses which ADC layer to use based on what processor it is compiling for. When the AOT finds a reference to an ADC stub method, it translates the reference into an equivalent one that points directly to the ADC implementation method. It does this by first chopping off the ADC interface namespace (provided by the assembly-level SharpOS.AOT.Attributes.ADCInterfaceAttribute), and replacing it with the ADC layer namespace (provided by the assembly-level ADCLayerAttribute). Thus, the code gets compiled as if the original code referenced the processor-specific implementation. This translation is done before the IL is converted to the internal representation (while the SharpOS.AOT is still working with data provided by Mono.Cecil).
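The namespace substitution described above amounts to a prefix replacement on the fully qualified name. The following Python sketch illustrates the idea only; the function names, the dictionary of layers, and the example namespaces (Kernel.ADC, Kernel.ADC.X86) are hypothetical and are not taken from the SharpOS sources.

```python
def select_layer(layers, processor):
    """Pick the ADC layer namespace registered for the target processor.

    `layers` maps a processor name to a layer namespace, modeling the
    information carried by the ADCLayerAttribute declarations.
    """
    return layers[processor]


def translate_adc_reference(full_name, interface_ns, layer_ns):
    """Rewrite a reference into the ADC interface so it targets the layer.

    References outside the interface namespace are returned unchanged.
    """
    prefix = interface_ns + "."
    if not full_name.startswith(prefix):
        return full_name
    # Chop off the interface namespace and put the layer namespace in its place.
    return layer_ns + "." + full_name[len(prefix):]


# Usage with hypothetical names:
layers = {"X86": "Kernel.ADC.X86"}
layer_ns = select_layer(layers, "X86")
translated = translate_adc_reference("Kernel.ADC.Processor.Halt", "Kernel.ADC", layer_ns)
```

In this model the rewrite is applied once per reference, before any IR exists, which mirrors the statement that the translation happens while the AOT is still working with Mono.Cecil data.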

AOT Source Code

Step by Step

1. Loads the IL using Mono.Cecil (AOT/Core/IR/Engine.cs:448)

2. Creates intermediate representation (IR) objects to represent types/methods (AOT/Core/IR/Engine.cs:521)

3. Converts the IL bytecode into the intermediate code representation (fills the methods) (AOT/Core/IR/Engine.cs:587)

4. Performs a number of IR optimizations on each method (AOT/Core/IR/Method.cs:2712)

5. Calls an IAssembly implementation to handle encoding to native code (AOT/Core/IR/Engine.cs:622)

6. IAssembly prepares the IR for encoding as native code.

The rest of the sequence depends on the IAssembly implementation (that is, the architecture) being compiled for. For x86, the output is encoded in this order:

1. PE/COFF header: provides method names when debugging

2. Multiboot header: hints for the boot loader, including entry point address

3. Methods

4. Helper functions

5. Data

6. Symbols

Finally, the encoder saves all of this encoded data to the output file. This process happens in X86.Assembly.Encode() (AOT/Core/X86/Assembly.cs:1271).
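The x86 ordering listed above can be pictured as a simple concatenation in which each part starts where the previous one ended. The Python sketch below is a toy model of that layout, not a description of X86.Assembly.Encode(): the section keys and the layout function are hypothetical, and the section contents are placeholder bytes supplied by the caller.

```python
# Order of the encoded parts, as listed for the x86 encoder.
X86_SECTION_ORDER = [
    "pe_coff_header",
    "multiboot_header",
    "methods",
    "helper_functions",
    "data",
    "symbols",
]


def layout(sections):
    """Concatenate the sections in order and record each start offset.

    `sections` maps a section key to its already-encoded bytes; a missing
    key is treated as an empty section.
    """
    image = bytearray()
    offsets = {}
    for name in X86_SECTION_ORDER:
        offsets[name] = len(image)
        image += sections.get(name, b"")
    return bytes(image), offsets
```

Recording the offsets while concatenating is one straightforward way to learn where each part ends up in the file, for example so that a header written earlier can be patched with an address that is only known later.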