From 63051c6f1cf59e468cb4175deea83c82888d18c2 Mon Sep 17 00:00:00 2001
From: LLLL Colonq
Date: Sun, 26 Apr 2026 22:55:20 -0400
Subject: Working assembler

---
 notes/notes.org | 85 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 85 insertions(+)
 create mode 100644 notes/notes.org

(limited to 'notes/notes.org')

diff --git a/notes/notes.org b/notes/notes.org
new file mode 100644
index 0000000..d1afd0a
--- /dev/null
+++ b/notes/notes.org
@@ -0,0 +1,85 @@
+* x86_64
+[[./flowchart.png]]
+(optional) legacy prefixes
+-> (optional) REX prefix
+-> primary opcode
+-> (optional) ModRM
+-> (optional) SIB
+-> (optional) displacement
+-> (optional) immediate
+** Legacy prefixes (up to 5)
+*** 0x66 - operand-size override
+Makes operand size 16-bit instead of default 32-bit.
+(REX.W has higher priority and makes operand size 64-bit.)
+*** 0x67 - address-size override
+Makes address size 32-bit instead of default 64-bit.
+** REX
+0x40 through 0x4f
+Low bits:
+3 2 1 0
+W R X B
+
+When W = 1, the operand size is 64 bits (RAX instead of EAX, etc.)
+R is used as an extra high bit for the ModRM reg field (to indicate regs R8 through R15)
+X is used as an extra high bit for the SIB index register
+B is used as an extra high bit for the SIB base register, ModRM r/m field, or an opcode-specific reg field
+** primary opcode map
+Operand notation
+*** Location
+E - general purpose register or memory specified by ModRM.r/m / SIB
+F - rFLAGS register
+G - general purpose register specified by ModRM.reg
+I - immediate value encoded in the immediate field
+J - instruction encoding includes a relative offset added to rIP
+M - memory specified by mod and r/m in ModRM. ModRM.mod /= 0b11
+O - offset of an operand is encoded in the instruction. no ModRM
+R - general purpose register specified by ModRM.r/m. ModRM.mod == 0b11
+*** Type
+b - a byte
+c - a byte or 2 bytes, depending on effective operand size
+d - 4 bytes
+i - a 16-bit integer
+j - a 32-bit integer
+m - a bit mask of size equal to source
+mn - where n = 2, 4, 8, or 16: a bit mask of size n
+q - 8 bytes
+v - 2 bytes, 4 bytes, or 8 bytes, depending on effective operand size
+w - 2 bytes
+y - 4 bytes or 8 bytes, depending on effective operand size
+z - 2 bytes if the effective operand size is 16 bits, or 4 bytes if the effective operand size is 32 or 64 bits
+** ModRM
+Used to specify either 2 register operands or 1 register and 1 memory operand.
+Three fields: MOD, REG, and R/M
+When REX prefix is present, REX.R is used as a high bit on REG and REX.B is used as a high bit on R/M
+7 6 5 4 3 2 1 0
+MOD REG-- R/M--
+
+REG always denotes a register; when MOD is 0b11, R/M denotes a register too, numbered as follows
+| 000 | rAX, XMM0, etc. |
+| 001 | rCX, XMM1, etc. |
+| 010 | rDX, XMM2, etc. |
+| 011 | rBX, XMM3, etc. |
+| 100 | AH, rSP, XMM4, etc. |
+| 101 | CH, rBP, XMM5, etc. |
+| 110 | DH, rSI, XMM6, etc. |
+| 111 | BH, rDI, XMM7, etc. |
+(When REX prefix is specified, REX.R (for REG) or REX.B (for R/M) is an extra high bit that allows access to registers 8 through 15)
+
+When MOD is not 0b11, R/M denotes a base register for memory access
+| 000 | [rAX] |
+| 001 | [rCX] |
+| 010 | [rDX] |
+| 011 | [rBX] |
+| 100 | see SIB byte |
+| 101 | [rBP], or [RIP + disp32] if MOD = 0b00 (even if REX.B is 1!!!) |
+| 110 | [rSI] |
+| 111 | [rDI] |
+The offset from the base register is determined by the displacement bytes (these follow SIB)
+If MOD is 01 there is a 1-byte displacement; if MOD is 10 there is a 4-byte displacement
+** SIB
+Occurs only after ModRM
+** Displacement
+A displacement is a signed offset from a base used to indicate a memory address.
+Either 1 or 4 bytes depending on MOD, sign-extended to 64 bits during address calculation.
+** Immediate
+Either 1, 2, 4, or 8 bytes (8 only for MOV into GPR).
--
cgit v1.2.3
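
The REX and ModRM rules in the notes above can be checked with a small sketch. This is not the assembler from this commit. The helper names (rex, modrm, mov_reg_reg, mov_reg_mem) are hypothetical, registers are passed as numbers 0 through 15 (rAX = 0, rCX = 1, rDX = 2, rBX = 3, ...), and the opcodes 0x89 (MOV Ev, Gv) and 0x8B (MOV Gv, Ev) are assumed from the standard primary opcode map rather than taken from the notes. SIB is not covered, so bases whose low bits are 100 are rejected.

```python
import struct


def rex(w, r, x, b):
    """Build a REX prefix byte: 0100 W R X B."""
    return 0x40 | (w << 3) | (r << 2) | (x << 1) | b


def modrm(mod, reg, rm):
    """Pack MOD (2 bits), REG (3 bits) and R/M (3 bits) into one byte.

    Only the low 3 bits of each register number go here; the 4th bit
    travels in REX.R (for REG) or REX.B (for R/M).
    """
    return (mod << 6) | ((reg & 7) << 3) | (rm & 7)


def mov_reg_reg(dst, src):
    """MOV dst, src with 64-bit registers: opcode 0x89, MOD = 0b11."""
    prefix = rex(1, src >> 3, 0, dst >> 3)  # REG = src, R/M = dst
    return bytes([prefix, 0x89, modrm(0b11, src, dst)])


def mov_reg_mem(dst, base, disp=0):
    """MOV dst, [base + disp] with a 64-bit destination: opcode 0x8B."""
    if base & 7 == 0b100:
        # R/M = 100 means "SIB byte follows", even when REX.B is 1.
        raise ValueError("base rSP/R12 needs a SIB byte (not covered here)")
    if disp == 0 and base & 7 != 0b101:
        mod, tail = 0b00, b""  # no displacement bytes
    elif -128 <= disp <= 127:
        # Also taken for rBP/R13 with disp 0, because MOD = 00 with
        # R/M = 101 would mean RIP-relative instead.
        mod, tail = 0b01, struct.pack("<b", disp)
    else:
        mod, tail = 0b10, struct.pack("<i", disp)
    prefix = rex(1, dst >> 3, 0, base >> 3)  # REG = dst, R/M = base
    return bytes([prefix, 0x8B, modrm(mod, dst, base)]) + tail
```

For example, mov_reg_reg(0, 3) (mov rax, rbx) gives 48 89 d8, and mov_reg_mem(0, 13) (mov rax, [r13]) gives 49 8b 45 00: the zero disp8 is there only because MOD = 00 with R/M = 101 would select RIP-relative addressing.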