
AsmJit
Low-Latency Machine Code Generation
Assembler interface and operands.
AsmJit's Assembler emits machine code directly into a CodeBuffer. In general, code generation with an assembler requires knowledge of the following:
Let's start with operands. Operand is a data structure that defines the data layout of any operand. It can be inherited, but a class inheriting from it cannot add any members; only the existing layout can be reused. AsmJit allows operands to be constructed dynamically, stored, and fully inspected at run-time. Operands are small (always 16 bytes per Operand) and can be copied and passed by value. Never allocate individual operands dynamically with the new keyword: it would work, but you would then be responsible for deleting them. In AsmJit, operands are always part of some other data structure, such as InstNode, which is part of the Builder tool.
Operands contain only identifiers, not pointers to any code-generation data. For example, a Label operand only provides a label identifier, not a pointer to a LabelEntry structure. In AsmJit such IDs are used to link objects together without having to deal with pointers.
AsmJit's operands all inherit from a base class called Operand. Operands have the following properties that are commonly accessible by getters and setters:
AsmJit allows operands to be constructed dynamically, stored, and fully inspected at run-time. Operands are small (always 16 bytes per Operand) and should always be copied by value if you intend to store them (creating operands with the new keyword is not recommended). Operands are safe to pass to memcpy() and memset(), which is handy when working with arrays of operands. If you set all members of an Operand to zero, it becomes a NONE operand, which is the same as a default-constructed Operand.
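The size and copy guarantees described above can be illustrated with a small standalone model. This is only a sketch: `OperandSketch`, `RegSketch`, and their fields are hypothetical names invented for illustration, and the layout is not AsmJit's actual Operand layout. It shows a 16-byte trivially copyable struct, a derived type that adds no data members, and bulk copy and reset through memcpy() and memset().

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical 16-byte layout; AsmJit's real Operand fields differ.
struct OperandSketch {
  std::uint32_t signature;  // Operand kind; 0 means "none" in this sketch.
  std::uint32_t baseId;     // Register or label id.
  std::uint32_t data[2];    // Payload such as an immediate or an offset.

  bool isNone() const { return signature == 0; }
};

static_assert(sizeof(OperandSketch) == 16, "the sketch mirrors the 16-byte size");
static_assert(std::is_trivially_copyable<OperandSketch>::value,
              "memcpy()/memset() are only safe on trivially copyable types");

// A derived type may add member functions, but no data members.
struct RegSketch : OperandSketch {
  static RegSketch make(std::uint32_t id) {
    RegSketch r;
    r.signature = 1;  // Arbitrary non-zero tag meaning "register" in this sketch.
    r.baseId = id;
    r.data[0] = 0;
    r.data[1] = 0;
    return r;
  }
  std::uint32_t id() const { return baseId; }
};

static_assert(sizeof(RegSketch) == sizeof(OperandSketch), "no members were added");

// Copies `count` operands from `src` to `dst` in one call.
inline void copyOperands(OperandSketch* dst, const OperandSketch* src, std::size_t count) {
  std::memcpy(dst, src, count * sizeof(OperandSketch));
}

// Turns `count` operands into "none" operands by zeroing them.
inline void resetOperands(OperandSketch* ops, std::size_t count) {
  std::memset(ops, 0, count * sizeof(OperandSketch));
}
```

Because the derived type has the same size as the base, an array of base operands can hold any operand kind by value, which is what makes heap allocation of individual operands unnecessary.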
The example below illustrates how operands can be used and modified even without using any other code generation classes. The example uses X86 architecture-specific operands.
Some operands have to be created explicitly by emitters. For example, labels must be created by BaseEmitter::newLabel(), which creates a label entry and returns a Label operand with the id that refers to it. Such a label can then be used by emitters.
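The relationship between a label operand and its entry can be sketched without AsmJit at all. The types below (`LabelSketch`, `LabelEntrySketch`, `LabelTableSketch`) are hypothetical stand-ins, not AsmJit classes; they only show the idea that the owner of the entries hands out an operand holding an id, and that the id is resolved through the owner when the entry is needed.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-ins; these are NOT AsmJit's Label or LabelEntry types.
struct LabelEntrySketch {
  std::string name;
  bool bound;
  std::uint64_t offset;
};

struct LabelSketch {
  std::uint32_t id;  // The operand stores only an identifier, never a pointer.
};

class LabelTableSketch {
public:
  // Creates an entry and returns an operand that refers to it by id.
  LabelSketch newLabel(const std::string& name) {
    entries_.push_back(LabelEntrySketch{name, false, 0});
    return LabelSketch{static_cast<std::uint32_t>(entries_.size() - 1)};
  }

  // Resolves an id to its entry; returns nullptr for an unknown id.
  LabelEntrySketch* entryOf(LabelSketch label) {
    return label.id < entries_.size() ? &entries_[label.id] : nullptr;
  }

  // Marks the label as bound to a code offset, if the id is known.
  void bind(LabelSketch label, std::uint64_t offset) {
    if (LabelEntrySketch* e = entryOf(label)) {
      e->bound = true;
      e->offset = offset;
    }
  }

private:
  std::vector<LabelEntrySketch> entries_;
};
```

A copy of the operand stays valid even when the entry storage is reallocated, because an id does not dangle the way a pointer into the vector would.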
Some architectures, like X86, provide a complex memory addressing model that allows encoding addresses that have a BASE register, an INDEX register with an optional scale (left shift), and a displacement (called offset in AsmJit). A memory address on X86 can also specify a memory segment (segment-override in X86 terminology), and some instructions (gather / scatter) require INDEX to be an x86::Vec register instead of a general-purpose register.
AsmJit can encode and work with all address forms mentioned above and implemented by X86. In addition, it can construct absolute 64-bit memory address operands, which are only allowed in one form of the 'mov' instruction.
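To make the BASE + INDEX + offset form concrete, the sketch below models such an address and evaluates it against an array of register values. `MemSketch`, `baseOnly()`, `baseIndex()`, and `effectiveAddress()` are hypothetical names used only for this illustration and are not part of AsmJit; segment overrides and vector indexes are left out.

```cpp
#include <cstdint>

// Hypothetical model of an X86-style address; not AsmJit's x86::Mem.
struct MemSketch {
  bool hasBase;
  bool hasIndex;
  std::uint32_t baseId;
  std::uint32_t indexId;
  std::uint32_t shift;   // 0..3, i.e. the index is scaled by 1, 2, 4 or 8.
  std::int32_t offset;   // Displacement.
};

// Builds [base + offset].
inline MemSketch baseOnly(std::uint32_t base, std::int32_t offset) {
  return MemSketch{true, false, base, 0, 0, offset};
}

// Builds [base + (index << shift) + offset].
inline MemSketch baseIndex(std::uint32_t base, std::uint32_t index,
                           std::uint32_t shift, std::int32_t offset) {
  return MemSketch{true, true, base, index, shift, offset};
}

// Evaluates the address using `regs`, an array indexed by register id.
inline std::uint64_t effectiveAddress(const MemSketch& m, const std::uint64_t* regs) {
  // Sign-extend the displacement before adding it.
  std::uint64_t addr = static_cast<std::uint64_t>(static_cast<std::int64_t>(m.offset));
  if (m.hasBase) addr += regs[m.baseId];
  if (m.hasIndex) addr += regs[m.indexId] << m.shift;
  return addr;
}
```

With register 0 holding 0x1000 and register 10 holding 4, `baseIndex(0, 10, 3, -8)` evaluates to 0x1000 + (4 << 3) - 8.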
Memory operands can optionally contain a memory size. This is required by instructions where the memory size cannot be deduced from other operands, like inc and dec on X86:
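The reason a size is sometimes mandatory can be shown with a simplified model. In the sketch below, which uses hypothetical names (`SizedMemSketch`, `resolveOperationSize()`) and is not how AsmJit validates instructions, a memory operand with size 0 means "not specified", and the operation size is then taken from the accompanying register operand, if there is one.

```cpp
#include <cstdint>

// Hypothetical model; a size of 0 means "not specified".
struct SizedMemSketch {
  std::uint32_t baseId;
  std::int32_t offset;
  std::uint32_t size;  // Access size in bytes.
};

// Returns the operation size in bytes, or 0 when it cannot be determined.
// `otherRegSize` is the size of the accompanying register operand,
// or 0 when the instruction has no register operand (as with inc/dec on memory).
inline std::uint32_t resolveOperationSize(const SizedMemSketch& mem,
                                          std::uint32_t otherRegSize) {
  if (mem.size != 0) return mem.size;  // An explicit size is used as given.
  return otherRegSize;                 // Otherwise fall back to the register.
}
```

A load into a 32-bit register resolves to 4 bytes without any size on the memory operand, whereas a lone memory operand resolves to 0 unless its size is set explicitly.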
Memory operands provide an API that can be used to access their properties:
Modifying a memory operand is convenient when emitting loads and stores:
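Since the operand is a small value type, adjusting it for a sequence of loads or stores amounts to copying it and changing the offset. The sketch below uses a hypothetical `AdjustableMemSketch` type; its method names are chosen for illustration and should not be read as a reference for AsmJit's memory operand API.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical memory operand model; not AsmJit's x86::Mem.
struct AdjustableMemSketch {
  std::uint32_t baseId;
  std::int32_t offset;
  std::uint32_t size;  // Access size in bytes.

  // Modifies this operand in place.
  void addOffset(std::int32_t delta) { offset += delta; }

  // Returns a modified copy and leaves the original untouched.
  AdjustableMemSketch cloneAdjusted(std::int32_t delta) const {
    AdjustableMemSketch copy = *this;
    copy.addOffset(delta);
    return copy;
  }
};

// Builds the operands for `count` consecutive accesses of `first.size` bytes each.
inline std::vector<AdjustableMemSketch> consecutiveSlots(const AdjustableMemSketch& first,
                                                         std::uint32_t count) {
  std::vector<AdjustableMemSketch> slots;
  for (std::uint32_t i = 0; i < count; i++)
    slots.push_back(first.cloneAdjusted(static_cast<std::int32_t>(i * first.size)));
  return slots;
}
```

Starting from an 8-byte access at offset 16, three consecutive slots end up at offsets 16, 24, and 32 while the original operand is unchanged.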
Adds constructors and member functions to a class that implements an abstract register.
An abstract register is a register that doesn't have a type or signature yet; it's a base class like x86::Reg or arm::Reg.
Adds constructors and member functions to a class that implements a final register.
Final registers MUST HAVE a valid signature.
Register mask is a convenience typedef that describes a mask where each bit describes a physical register id in the same RegGroup.
At the moment 32 bits are enough as AsmJit doesn't support any architecture that would provide more than 32 registers for a register group.
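A register mask of this kind is an ordinary bit set. The helpers below are a minimal sketch with hypothetical names (`RegMaskSketch`, `maskOf()`, `hasReg()`, `countRegs()`, `lowestReg()`); AsmJit's own typedef and utilities are not reproduced here.

```cpp
#include <cstdint>

// Hypothetical alias: one bit per physical register id within a single group.
using RegMaskSketch = std::uint32_t;

// Returns a mask with only the bit of `physId` set; physId must be below 32.
inline RegMaskSketch maskOf(std::uint32_t physId) {
  return RegMaskSketch(1) << physId;
}

// Tests whether `physId` is present in `mask`.
inline bool hasReg(RegMaskSketch mask, std::uint32_t physId) {
  return (mask & maskOf(physId)) != 0;
}

// Counts registers in the mask by clearing the lowest set bit each round.
inline std::uint32_t countRegs(RegMaskSketch mask) {
  std::uint32_t n = 0;
  for (; mask != 0; mask &= mask - 1) n++;
  return n;
}

// Returns the lowest register id in the mask, or -1 if the mask is empty.
inline int lowestReg(RegMaskSketch mask) {
  for (int id = 0; id < 32; id++)
    if (hasReg(mask, static_cast<std::uint32_t>(id))) return id;
  return -1;
}
```

Masks from the same group combine with `|` and `&`, so a set of clobbered or available registers fits in a single 32-bit value.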
Operand type used by Operand_.
Register type.
Provides a unique type that can be used to identify a register or its view.
Constant | Description |
---|---|
kNone | No register - unused, invalid, multiple meanings. |
kLabelTag | This is not a register type. This value is reserved for a Label that's used in BaseMem as a base. Label tag is used as a sub-type, forming a unique signature across all operand types as 0x1 is never associated with any register type. This means that a memory operand's BASE register can be constructed from virtually any operand (register vs. label) by just assigning its type (register type or label-tag) and operand id. |
kGp8Lo | 8-bit low general purpose register (X86|X86_64). |
kGp8Hi | 8-bit high general purpose register (X86|X86_64). |
kGp16 | 16-bit general purpose register (X86|X86_64). |
kGp32 | 32-bit general purpose register (X86|X86_64|AArch32|AArch64). |
kGp64 | 64-bit general purpose register (X86_64|AArch64). |
kVec8 | 8-bit view of a vector register (AArch64). |
kVec16 | 16-bit view of a vector register (AArch64). |
kVec32 | 32-bit view of a vector register (AArch32|AArch64). |
kVec64 | 64-bit view of a vector register (AArch32|AArch64). |
kVec128 | 128-bit view of a vector register (X86|X86_64|AArch32|AArch64). |
kVec256 | 256-bit view of a vector register (X86|X86_64). |
kVec512 | 512-bit view of a vector register (X86|X86_64). |
kVec1024 | 1024-bit view of a vector register (future). |
kVecNLen | View of a vector register whose width is implementation specific (AArch64). |
kMask | Mask register (X86|X86_64|AArch64). |
kTile | Tile register (X86_64). |
kSegment | Segment register (X86|X86_64: None, ES, CS, SS, DS, FS, GS). |
kControl | Control register (X86|X86_64). |
kDebug | Debug register (X86|X86_64). |
kX86_Mm | MMX register (X86|X86_64). |
kX86_St | FPU (x87) register (X86|X86_64). |
kX86_Bnd | Bound register (X86|X86_64). |
kPC | Universal type describing program counter (PC) or instruction pointer (EIP/RIP) register, if the target architecture actually exposes it as a separate register type, which most modern architectures do. X86 specific: instruction pointer (RIP), only addressable in x86::Mem in 64-bit targets. |
kMaxValue | Maximum value of this enumeration. |
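
The fixed-width entries of the table above can be summarized as a mapping from register type to width. The enum below is a hypothetical subset written for illustration: the enumerator names follow the table, but the numeric values are assumptions, except that the label tag is placed at 1 because the table states that 0x1 is never associated with a register type.

```cpp
#include <cstdint>

// Hypothetical subset of the table; values other than kLabelTag == 1 are assumptions.
enum class RegTypeSketch : std::uint8_t {
  kNone = 0,
  kLabelTag = 1,
  kGp8Lo, kGp8Hi, kGp16, kGp32, kGp64,
  kVec8, kVec16, kVec32, kVec64, kVec128, kVec256, kVec512
};

// Returns the width in bits for fixed-width types and 0 for kNone and kLabelTag.
inline std::uint32_t widthInBits(RegTypeSketch type) {
  switch (type) {
    case RegTypeSketch::kGp8Lo:
    case RegTypeSketch::kGp8Hi:
    case RegTypeSketch::kVec8:   return 8;
    case RegTypeSketch::kGp16:
    case RegTypeSketch::kVec16:  return 16;
    case RegTypeSketch::kGp32:
    case RegTypeSketch::kVec32:  return 32;
    case RegTypeSketch::kGp64:
    case RegTypeSketch::kVec64:  return 64;
    case RegTypeSketch::kVec128: return 128;
    case RegTypeSketch::kVec256: return 256;
    case RegTypeSketch::kVec512: return 512;
    default:                     return 0;
  }
}

// True when the value marks a label used as a memory base rather than a register.
inline bool isLabelTag(RegTypeSketch type) {
  return type == RegTypeSketch::kLabelTag;
}
```

Keeping the label tag in the same value space as the register types is what lets a memory operand's base be described by a single (type, id) pair, whether it refers to a register or to a label.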
Register group.
Provides a unique value that identifies groups of registers and their views.
Constant | Description |
---|---|
kGp | General purpose register group compatible with all backends. |
kVec | Vector register group compatible with all backends. |
kMask | Mask register group compatible with all backends that can use masking. |
kExtraVirt3 | Extra virtual group #3 that can be used by Compiler for register allocation. |
kTile | TMM register group (X86|X86_64). |
kSegment | Segment register group (X86|X86_64). |
kControl | Control register group (X86|X86_64). |
kDebug | Debug register group (X86|X86_64). |
kX86_MM | MMX register group (MM) - maps to RegGroup::kExtraVirt3 (X86|X86_64). |
kX86_St | FPU register group (X86|X86_64). |
kX86_Bnd | BND register group (X86|X86_64). |
kPC | Program counter group (represents also EIP/RIP on X86|X86_64 targets). |
kMaxValue | Maximum value of this enumeration. |
kMaxVirt | Last value of a virtual register that is managed by BaseCompiler. |