Machine Code Generation for C++
Globals, code storage, and emitter interface.
To generate code you would need to instantiate at least the following classes:
The example below shows how the mentioned classes interact to generate X86 code:
The example above used x86::Assembler as an emitter. AsmJit provides the following emitters that offer various levels of abstraction:
AsmJit's Target is an interface that provides basic target abstraction. At the moment AsmJit provides only one implementation called JitRuntime, which, as the name suggests, provides a JIT code target and an execution runtime. JitRuntime provides everything necessary to implement a simple JIT compiler with basic memory management. It exposes only the JitRuntime::add() and JitRuntime::release() functions, which add code to the runtime or release it. JitRuntime does not decide when code should be released; that decision is up to the developer.
See more at Virtual Memory group.
In the previous example the CodeInfo is retrieved from JitRuntime. This makes sense because JitRuntime always returns a CodeInfo that is compatible with the host runtime environment. For example, if your application runs in 64-bit mode, the CodeInfo returned will use the ArchInfo::kIdX64 architecture, in contrast to ArchInfo::kIdX86, which is used in 32-bit mode.
AsmJit allows you to set up CodeInfo manually and to select a different architecture when necessary. So let's do something else this time: always generate 32-bit code and print its binary representation. To do that, we can create our own CodeInfo and initialize it to ArchInfo::kIdX86. CodeInfo populates all basic fields based on the architecture we provide, so this is straightforward:
CodeInfo contains much more information than just the target architecture. It can be configured to specify a base address (or a virtual base address in linker terminology), which can be static (useful when you know where the target's machine code will be located) or dynamic. AsmJit assumes a dynamic base address by default and relocates the code held by CodeHolder to a user-provided address on demand. To be able to relocate to a user-provided address it needs to store some information about relocations, which is represented by RelocEntry. Relocation entries are only required if you call external functions from the generated code that cannot be encoded by using a 32-bit displacement (64-bit displacements are not provided by any supported architecture).
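To make the role of relocation records more concrete, here is a minimal standalone sketch of relocating a code buffer to a user-provided base address. It does not use AsmJit: `SketchReloc` and `relocateTo()` are hypothetical names invented for illustration, and the record layout (an 8-byte absolute address field) is an assumption, not the layout of AsmJit's RelocEntry.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical relocation record (NOT AsmJit's RelocEntry): it remembers where
// an 8-byte absolute address is stored in the code and what it must point to.
struct SketchReloc {
  size_t fieldOffset;     // Offset of the 8-byte address field in the buffer.
  uint64_t targetOffset;  // Target location, relative to the start of the code.
};

// Patches every recorded field with `baseAddress + targetOffset`, written in
// little-endian byte order (as used by x86). Can be called again with a
// different base address because the records keep base-relative offsets.
inline void relocateTo(std::vector<uint8_t>& code,
                       const std::vector<SketchReloc>& relocs,
                       uint64_t baseAddress) {
  for (const SketchReloc& r : relocs) {
    assert(r.fieldOffset + 8 <= code.size());
    uint64_t absolute = baseAddress + r.targetOffset;
    for (int i = 0; i < 8; i++)
      code[r.fieldOffset + i] = uint8_t(absolute >> (i * 8));
  }
}
```

Because the records store offsets rather than final addresses, the same buffer can be copied and relocated to any base address chosen later, which is the idea behind assuming a dynamic base address by default.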
There is also a concept called LabelLink. A label link is a lightweight data structure that doesn't have any identifier and is stored in LabelEntry as a singly linked list. A label link represents either a label that is used but not yet bound, or a cross-section link (only relevant to code that uses multiple sections). Since links that cross sections cannot be resolved immediately, they persist until the offsets of these sections are assigned and CodeHolder::resolveUnresolvedLinks() is called. It's an error if you end up with code that has unresolved label links after flattening. You can verify this by calling CodeHolder::hasUnresolvedLinks(), which inspects the value returned by CodeHolder::unresolvedLinkCount().
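The idea of keeping pending uses of an unbound label in a singly linked list can be sketched as follows. This is a standalone illustration, not AsmJit's implementation: `SketchLink`, `SketchLabel`, `useLabel()`, `bindLabel()` and `unresolvedLinkCount()` are hypothetical names, and the sketch assumes every use is a 32-bit displacement relative to the end of its 4-byte field.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical pending use of a label: a rel32 field waiting to be patched.
struct SketchLink {
  size_t fieldOffset;
  std::unique_ptr<SketchLink> next;
};

// Hypothetical label: either bound to an offset or carrying a list of links.
struct SketchLabel {
  bool bound = false;
  size_t offset = 0;
  std::unique_ptr<SketchLink> links;
};

// Writes `target - (fieldOffset + 4)` as a little-endian 32-bit value.
inline void writeRel32(std::vector<uint8_t>& code, size_t fieldOffset, size_t target) {
  assert(fieldOffset + 4 <= code.size());
  int64_t disp = int64_t(target) - int64_t(fieldOffset + 4);
  uint32_t bits = uint32_t(int32_t(disp));
  for (int i = 0; i < 4; i++)
    code[fieldOffset + i] = uint8_t(bits >> (i * 8));
}

// Uses the label at `fieldOffset`: patches now if bound, else records a link.
inline void useLabel(std::vector<uint8_t>& code, SketchLabel& label, size_t fieldOffset) {
  if (label.bound) {
    writeRel32(code, fieldOffset, label.offset);
    return;
  }
  std::unique_ptr<SketchLink> link(new SketchLink{fieldOffset, std::move(label.links)});
  label.links = std::move(link);
}

// Binds the label and resolves (and frees) every link recorded so far.
inline void bindLabel(std::vector<uint8_t>& code, SketchLabel& label, size_t offset) {
  label.bound = true;
  label.offset = offset;
  while (label.links) {
    writeRel32(code, label.links->fieldOffset, offset);
    label.links = std::move(label.links->next);
  }
}

// Counts the links that are still waiting for the label to be bound.
inline size_t unresolvedLinkCount(const SketchLabel& label) {
  size_t n = 0;
  for (const SketchLink* l = label.links.get(); l; l = l->next.get())
    n++;
  return n;
}
```

In this sketch a nonzero `unresolvedLinkCount()` after all code has been emitted would indicate a label that was used but never bound, which mirrors the error condition described above.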
AsmJit can flatten code that uses multiple sections by assigning each section an incrementing offset that respects its alignment. Use CodeHolder::flatten() to do that. After the sections are flattened, their offsets and virtual sizes are adjusted to respect each section's buffer size and alignment. The CodeHolder::resolveUnresolvedLinks() function must be called before relocating the code held by CodeHolder. You can also flatten your code manually by iterating over all sections and calculating their offsets (relative to base) using your own algorithm. In that case CodeHolder::flatten() should not be called, but CodeHolder::resolveUnresolvedLinks() still should be.
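A manual flattening pass of the kind described above could look roughly like the following standalone sketch. It does not call AsmJit: `SketchSection`, `alignUp()` and `flattenSections()` are hypothetical names, and the layout policy (sections placed in order, each aligned up to its own power-of-two alignment) is an assumption for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a section (NOT AsmJit's Section class).
struct SketchSection {
  uint64_t bufferSize;  // Bytes of code or data emitted into the section.
  uint64_t alignment;   // Required alignment; must be a power of two.
  uint64_t offset;      // Assigned by flattenSections(), relative to base.
};

// Rounds `value` up to the next multiple of `alignment` (a power of two).
inline uint64_t alignUp(uint64_t value, uint64_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (value + alignment - 1) & ~(alignment - 1);
}

// Assigns each section an increasing, properly aligned offset and returns
// the total size of the flattened image.
inline uint64_t flattenSections(std::vector<SketchSection>& sections) {
  uint64_t cursor = 0;
  for (SketchSection& s : sections) {
    cursor = alignUp(cursor, s.alignment);
    s.offset = cursor;
    cursor += s.bufferSize;
  }
  return cursor;
}
```

With offsets computed this way, cross-section links can be resolved because the distance between any two sections is known.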
The example below shows how to use the built-in virtual memory allocator JitAllocator instead of JitRuntime (in case you want to use your own memory management) and how to relocate the generated code into your own memory block. You can use your own virtual memory allocator if you prefer, but that is OS-specific and not covered by the documentation.
The following code is similar to the previous one, but implements a function working in both 32-bit and 64-bit environments:
If you know the base address in advance (before code generation), the CodeInfo::setBaseAddress() function can be used to set it. In that case the Assembler knows the absolute position of each instruction and can use it during instruction encoding to avoid relocations where possible. The following example shows how to configure the base address:
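To illustrate why a known base address helps the encoder, here is a minimal standalone sketch of the decision it enables: once the absolute address of the next instruction is known, the distance to an external target can be computed immediately and encoded as a 32-bit displacement when it fits. `Rel32Result` and `encodeRel32()` are hypothetical names used only for this illustration and are not part of AsmJit.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Result of trying to express a target as a 32-bit relative displacement.
struct Rel32Result {
  bool fits;             // True if the displacement fits into 32 bits.
  int32_t displacement;  // Valid only when `fits` is true.
};

// `instructionEndOffset` is the offset (relative to the code start) of the
// byte that follows the instruction, which is what x86 displacements are
// relative to. `target` is an absolute address.
inline Rel32Result encodeRel32(uint64_t baseAddress,
                               uint64_t instructionEndOffset,
                               uint64_t target) {
  uint64_t nextIp = baseAddress + instructionEndOffset;
  int64_t disp = int64_t(target - nextIp);
  bool fits = disp >= std::numeric_limits<int32_t>::min() &&
              disp <= std::numeric_limits<int32_t>::max();
  return Rel32Result{fits, fits ? int32_t(disp) : 0};
}
```

When `fits` is false, the target has to be reached in another way (for example through an absolute address stored in memory), which is the situation where a relocation record is still needed.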