Instruction DB

Instruction database (introspection, read/write, validation, ...).

Overview

AsmJit provides a public instruction database that can be used to query information about a complete instruction. Querying the instruction database requires knowledge of the following:

  • BaseInst - Base instruction that contains instruction id, options, and a possible extra-register that represents either REP prefix counter or AVX-512 selector (mask).
  • Operand - Represents operands of an instruction.

Each instruction can be then queried for the following information:

  • InstRWInfo - Read/write information of the instruction and its operands (includes OpRWInfo).
  • CpuFeatures - CPU features required to execute the instruction.

In addition to query functionality, AsmJit can also validate whether an instruction and its operands are valid. This is useful for making sure that what the user tries to emit is correct, and it can also be used by other projects that parse user input, such as the AsmTK project.

Query API

The instruction query API is provided by the InstAPI namespace. The following queries are possible:

  • InstAPI::queryRWInfo() - queries read/write information of the given instruction and its operands, including the CPU flags read and written.
  • InstAPI::queryFeatures() - queries CPU features that are required to execute the given instruction. A full instruction with operands must be given, as some architectures like X86 may require different features for the same instruction based on its operands.

asmjit_test_instinfo.cpp can also be used as a reference for accessing instruction information.

Validation API

The instruction validation API is provided by the InstAPI namespace in a similar fashion to the Query API (see InstAPI::validate() and ValidationFlags). Validation can also be turned on at the BaseEmitter level.

Typedef Documentation

typedef uint32_t InstId

Describes an instruction id and modifiers used together with the id.

Each architecture has a set of valid instructions indexed from 0. The instruction with id 0 is, however, a special instruction that describes "no instruction" or "invalid instruction". Different architectures can assign a different instruction to the same id; each architecture typically has its own instructions indexed from 1.

Instruction identifiers listed by architecture:

Enumeration Type Documentation

enum class InstIdParts : uint32_t

Instruction id parts.

A mask that specifies a bit-layout of InstId.

Constants:

  • kRealId - Real id without any modifiers (always 16 least significant bits).
  • kAbstract - Instruction is abstract (or virtual, IR, etc.).
  • kA32_DT - AArch32 first data type, used by ASIMD instructions (inst.dt.dt2).
  • kA32_DT2 - AArch32 second data type, used by ASIMD instructions (inst.dt.dt2).
  • kARM_Cond - AArch32/AArch64 condition code.
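
The bit layout of InstId can be illustrated with a minimal, self-contained sketch. Only the width of the real id (16 least significant bits) and the meaning of id 0 come from the text above. The names kRealIdMask, realId(), modifierBits(), and isNone() are hypothetical helpers, not AsmJit API, and the positions of the other parts are deliberately not modeled.

```cpp
#include <cstdint>

// Local stand-in for the InstId typedef documented above.
using InstId = uint32_t;

// From the text: the real id always occupies the 16 least significant bits.
constexpr uint32_t kRealIdMask = 0x0000FFFFu;

// Hypothetical helper: strips all modifiers and returns the real id.
constexpr InstId realId(InstId id) { return id & kRealIdMask; }

// Hypothetical helper: returns only the bits above the real id (the modifiers).
constexpr uint32_t modifierBits(InstId id) { return id & ~kRealIdMask; }

// Hypothetical helper: real id 0 describes "no instruction" / "invalid instruction".
constexpr bool isNone(InstId id) { return realId(id) == 0u; }

// Compile-time check that modifiers are masked out of the real id.
static_assert(realId(0xABCD1234u) == 0x1234u, "modifiers must be masked out");
```

With this split, code that only cares which instruction it is looking at would compare realId(id), while code that handles condition codes or data types would inspect the modifier bits separately.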

enum class InstOptions : uint32_t

Instruction options.

Instruction options complement instruction identifier and attributes.

Constants:

  • kNone - No options.
  • kReserved - Used internally by emitters for handling errors and rare cases.
  • kUnfollow - Prevents following a jump during compilation (Compiler).
  • kOverwrite - Overwrite the destination operand(s) (Compiler).

    A hint that is important for register liveness analysis. It tells the compiler that the destination operand will be overwritten now or by adjacent instructions. Compiler knows when a register is completely overwritten by a single instruction, so you don't have to mark, for example, "movaps" or "pxor x, x". However, if a pair of instructions is used and the first of them doesn't completely overwrite the content of the destination, Compiler fails to mark that register as dead.

    X86 Specific

      • Instructions that always overwrite at least the size of the register the virtual register uses, for example "mov", "movq", and "movaps", don't need the overwrite option. This includes conversion, shuffle, and other miscellaneous instructions.
      • Instructions that clear the destination register (or otherwise ignore its old content) if all operands are the same, for example "xor x, x" and "pcmpeqb x, x", don't need it either.
      • Consecutive instructions that partially overwrite the variable until no old content remains require BaseCompiler::overwrite() to be used. Some examples (not always the best use cases, though):
        • movlps xmm0, ? followed by movhps xmm0, ? and vice versa
        • movlpd xmm0, ? followed by movhpd xmm0, ? and vice versa
        • mov al, ? followed by and ax, 0xFF
        • mov al, ? followed by mov ah, al
        • pinsrq xmm0, ?, 0 followed by pinsrq xmm0, ?, 1
      • A virtual register that is allocated as a full vector but used temporarily for scalar operations. For example, if you allocate a full vector with x86::Compiler::newXmm() and then use that vector for scalar operations, you should use the overwrite() directive:
        • sqrtss x, y - only the LO element of x is changed; if you don't use the HI elements, use compiler.overwrite().sqrtss(x, y).

  • kShortForm - Emit short-form of the instruction.
  • kLongForm - Emit long-form of the instruction.
  • kTaken - Conditional jump is likely to be taken.
  • kNotTaken - Conditional jump is unlikely to be taken.
  • kX86_ModMR - Use ModMR instead of ModRM if applicable.
  • kX86_ModRM - Use ModRM instead of ModMR if applicable.
  • kX86_Vex3 - Use 3-byte VEX prefix if possible (AVX) (must be 0x00000400).
  • kX86_Vex - Use VEX prefix when both VEX|EVEX prefixes are available (HINT: AVX_VNNI).
  • kX86_Evex - Use 4-byte EVEX prefix if possible (AVX-512) (must be 0x00001000).
  • kX86_Lock - LOCK prefix (lock-enabled instructions only).
  • kX86_Rep - REP prefix (string instructions only).
  • kX86_Repne - REPNE prefix (string instructions only).
  • kX86_XAcquire - XACQUIRE prefix (only allowed instructions).
  • kX86_XRelease - XRELEASE prefix (only allowed instructions).
  • kX86_ER - AVX-512: embedded-rounding {er} and implicit {sae}.
  • kX86_SAE - AVX-512: suppress-all-exceptions {sae}.
  • kX86_RN_SAE - AVX-512: round-to-nearest (even) {rn-sae} (bits 00).
  • kX86_RD_SAE - AVX-512: round-down (toward -inf) {rd-sae} (bits 01).
  • kX86_RU_SAE - AVX-512: round-up (toward +inf) {ru-sae} (bits 10).
  • kX86_RZ_SAE - AVX-512: round-toward-zero (truncate) {rz-sae} (bits 11).
  • kX86_ZMask - AVX-512: Use zeroing {k}{z} instead of merging {k}.
  • kX86_ERMask - AVX-512: Mask to get embedded rounding bits (2 bits).
  • kX86_AVX512Mask - AVX-512: Mask of all possible AVX-512 options except EVEX prefix flag.
  • kX86_OpCodeB - Force REX.B and/or VEX.B field (X64 only).
  • kX86_OpCodeX - Force REX.X and/or VEX.X field (X64 only).
  • kX86_OpCodeR - Force REX.R and/or VEX.R field (X64 only).
  • kX86_OpCodeW - Force REX.W and/or VEX.W field (X64 only).
  • kX86_Rex - Force REX prefix (X64 only).
  • kX86_InvalidRex - Invalid REX prefix (set by X86 or when AH|BH|CH|DH regs are used on X64).
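
Because instruction options are bit flags, they are combined and tested with bitwise operations, and the embedded-rounding mode is a 2-bit field inside the same word. The following is a minimal sketch that uses local stand-in types rather than the AsmJit enum. The two option values (0x00000400 and 0x00001000) and the rounding bit patterns (00, 01, 10, 11) come from the list above; the shift kERShift, and the names Options, Rounding, hasOption(), withRounding(), and roundingOf() are assumptions made for illustration only.

```cpp
#include <cstdint>

// Local stand-in for two InstOptions bits; this is not the AsmJit enum itself.
enum class Options : uint32_t {
  kNone = 0u,
  kVex3 = 0x00000400u,  // value stated in the list above
  kEvex = 0x00001000u   // value stated in the list above
};

constexpr Options operator|(Options a, Options b) {
  return Options(uint32_t(a) | uint32_t(b));
}

// Hypothetical helper: true if `flag` is set in `set`.
constexpr bool hasOption(Options set, Options flag) {
  return (uint32_t(set) & uint32_t(flag)) != 0u;
}

// ASSUMPTION: the bit position of the embedded-rounding field is made up here.
constexpr uint32_t kERShift = 21;
constexpr uint32_t kERMask = 0x3u << kERShift;  // 2 bits, as stated for kX86_ERMask

// Rounding modes with the bit patterns listed above (00, 01, 10, 11).
enum class Rounding : uint32_t { kRN = 0, kRD = 1, kRU = 2, kRZ = 3 };

// Replaces the rounding field and leaves all other option bits untouched.
constexpr uint32_t withRounding(uint32_t options, Rounding r) {
  return (options & ~kERMask) | (uint32_t(r) << kERShift);
}

// Extracts the rounding field from an option word.
constexpr Rounding roundingOf(uint32_t options) {
  return Rounding((options & kERMask) >> kERShift);
}
```

Clearing the field with the mask before setting it is what makes it safe to change the rounding mode of an option word that already carries one.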

enum class InstControlFlow : uint32_t

Instruction control flow.

Constants:

  • kRegular - Regular instruction.
  • kJump - Unconditional jump.
  • kBranch - Conditional jump (branch).
  • kCall - Function call.
  • kReturn - Function return.
  • kMaxValue - Maximum value of InstControlFlow.

enum class InstSameRegHint : uint8_t

Hint that is used when both input operands to the instruction are the same.

Provides hints to the instruction RW query regarding special cases in which two or more operands are the same register. This is required by instructions such as XOR, AND, OR, SUB, etc. These hints influence the RW operations query.

Constants:

  • kNone - No special handling.
  • kRO - Operands become read-only, the operation doesn't change the content - X & X and similar.
  • kWO - Operands become write-only, the content of the input(s) doesn't matter - X ^ X, X - X, and similar.
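
The effect of these hints can be sketched with a small, self-contained model. This is not how AsmJit implements the RW query; it only illustrates the rule stated above. The types SameRegHint and Access and the function effectiveAccess() are hypothetical names introduced for this example.

```cpp
#include <cstdint>

// Local mirror of the three hints listed above (not the AsmJit enum).
enum class SameRegHint : uint8_t { kNone, kRO, kWO };

// Hypothetical access kinds for a destination operand.
enum class Access : uint8_t { kRead = 1, kWrite = 2, kRW = 3 };

// Hypothetical helper: returns how the destination is really accessed when the
// hint applies. `normal` is the access reported for distinct registers, and
// `sameRegs` tells whether all register operands are the same register.
inline Access effectiveAccess(Access normal, bool sameRegs, SameRegHint hint) {
  if (!sameRegs)
    return normal;  // hints only matter when the operands are the same
  switch (hint) {
    case SameRegHint::kRO: return Access::kRead;   // e.g. X & X keeps the content
    case SameRegHint::kWO: return Access::kWrite;  // e.g. X ^ X ignores the old content
    default:               return normal;
  }
}
```

For example, an instruction of the "xor x, x" kind would normally read and write its destination, but with the kWO hint the old content is irrelevant, which matters for register liveness analysis.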

enum class CpuRWFlags : uint32_t

CPU read/write flags used by InstRWInfo.

These flags can be used to get a basic overview of the CPU-specific flags used by instructions.

Constants:

  • kNone - No flags.
  • kCF - Carry flag.
  • kOF - Signed overflow flag.
  • kSF - Sign flag (negative/sign, if set).
  • kZF - Zero and/or equality flag (1 if zero/equal).
  • kX86_CF - Carry flag (X86, X86_64).
  • kX86_OF - Overflow flag (X86, X86_64).
  • kX86_SF - Sign flag (X86, X86_64).
  • kX86_ZF - Zero flag (X86, X86_64).
  • kX86_AF - Adjust flag (X86, X86_64).
  • kX86_PF - Parity flag (X86, X86_64).
  • kX86_DF - Direction flag (X86, X86_64).
  • kX86_IF - Interrupt enable flag (X86, X86_64).
  • kX86_AC - Alignment check flag (X86, X86_64).
  • kX86_C0 - FPU C0 status flag (X86, X86_64).
  • kX86_C1 - FPU C1 status flag (X86, X86_64).
  • kX86_C2 - FPU C2 status flag (X86, X86_64).
  • kX86_C3 - FPU C3 status flag (X86, X86_64).
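
One possible use of read/write flag information is checking whether one instruction depends on the CPU flags produced by another. The sketch below is self-contained and does not call AsmJit. The bit values, the FlagUse struct, and dependsOnFlags() are all hypothetical; only the flag names come from the list above.

```cpp
#include <cstdint>

// Hypothetical bit assignments, chosen for illustration only.
enum class Flags : uint32_t {
  kNone = 0u,
  kCF = 1u << 0,
  kOF = 1u << 1,
  kSF = 1u << 2,
  kZF = 1u << 3
};

constexpr Flags operator|(Flags a, Flags b) {
  return Flags(uint32_t(a) | uint32_t(b));
}
constexpr Flags operator&(Flags a, Flags b) {
  return Flags(uint32_t(a) & uint32_t(b));
}

// Hypothetical record of the flags an instruction reads and writes.
struct FlagUse {
  Flags read;
  Flags written;
};

// True if `later` reads at least one flag that `earlier` writes.
constexpr bool dependsOnFlags(const FlagUse& earlier, const FlagUse& later) {
  return (earlier.written & later.read) != Flags::kNone;
}
```

A compare-like instruction that writes CF, OF, SF, and ZF followed by a conditional jump that reads ZF would report a dependency, while an instruction that touches no flags would not.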

enum class OpRWFlags

Operand read/write flags describe how the operand is accessed and some additional features.

Constants:

  • kNone - No flags.
  • kRead - Operand is read.
  • kWrite - Operand is written.
  • kRW - Operand is both read and written.
  • kRegMem - Register operand can be replaced by a memory operand.
  • kConsecutive - The register must be allocated to the index of the previous register + 1.

    This flag is used by all architectures to describe instructions that use consecutive registers, where only the first one is encoded in the instruction, and the others are just a sequence that starts with the first one. On X86/X86_64 architecture this is used by instructions such as V4FMADDPS, V4FMADDSS, V4FNMADDPS, V4FNMADDSS, VP4DPWSSD, VP4DPWSSDS, VP2INTERSECTD, and VP2INTERSECTQ. On ARM/AArch64 this is used by vector load and store instructions that can load or store multiple registers at once.

  • kZExt - The extendByteMask() represents a zero extension.
  • kRegPhysId - Register operand must use OpRWInfo::physId().
  • kMemPhysId - Base register of a memory operand must use OpRWInfo::physId().
  • kMemFake - This memory operand is only used to encode registers and doesn't access memory.

    X86 Specific

    Instructions that use this feature include BNDLDX, BNDSTX, and LEA.

  • kMemBaseRead - Base register of the memory operand will be read.
  • kMemBaseWrite - Base register of the memory operand will be written.
  • kMemBaseRW - Base register of the memory operand will be read & written.
  • kMemIndexRead - Index register of the memory operand will be read.
  • kMemIndexWrite - Index register of the memory operand will be written.
  • kMemIndexRW - Index register of the memory operand will be read & written.
  • kMemBasePreModify - Base register of the memory operand will be modified before the operation.
  • kMemBasePostModify - Base register of the memory operand will be modified after the operation.
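
The relationship between kRead, kWrite, and kRW can be shown with a minimal sketch in which kRW is simply the union of the other two. The bit values below are assumptions, and RWFlags, isRead(), isWrite(), isReadOnly(), and isWriteOnly() are local, hypothetical names defined for this example rather than calls into AsmJit.

```cpp
#include <cstdint>

// Local stand-in for the first three OpRWFlags bits; bit values are assumed.
enum class RWFlags : uint32_t {
  kNone = 0u,
  kRead = 1u << 0,
  kWrite = 1u << 1,
  kRW = kRead | kWrite  // both read and written
};

constexpr RWFlags operator&(RWFlags a, RWFlags b) {
  return RWFlags(uint32_t(a) & uint32_t(b));
}

// Hypothetical helpers that classify an operand access.
constexpr bool isRead(RWFlags f) { return (f & RWFlags::kRead) != RWFlags::kNone; }
constexpr bool isWrite(RWFlags f) { return (f & RWFlags::kWrite) != RWFlags::kNone; }
constexpr bool isReadOnly(RWFlags f) { return (f & RWFlags::kRW) == RWFlags::kRead; }
constexpr bool isWriteOnly(RWFlags f) { return (f & RWFlags::kRW) == RWFlags::kWrite; }
```

Masking with kRW before comparing lets the read-only and write-only checks ignore any additional bits (such as the memory-related flags) that a full flag word would carry.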

enum class ValidationFlags : uint32_t

Validation flags that can be used with InstAPI::validate().

Constants:

  • kNone - No flags.
  • kEnableVirtRegs - Allow virtual registers in the instruction.