Re: R Hyde's Art of Assembly - best version?
Posted: Mon Sep 03, 2012 9:35 pm
Ah, I looked through the index at the start of the above page and went right to Section 22.
That has most of the answer, but it is spread out across sections 22 to 25. Here is
what it says:
Now it states in several places that hex 66 in the byte before a 32-bit
instruction like inc eax will change the instruction to inc ax, because
Intel decided that 16-bit operands were not used as often in 32-bit
code. So instead of choosing between 8 bits and 16 bits as on the
8086, the size bit chooses between 8 bits and 32 bits, and they stuck 66 hex
in front of a 32-bit instruction to drop back to 16 bits when required. It also
makes 32-bit code on the 386 and above somewhat incompatible with the 8086 way
of doing things.
But it also states at one point that if the instruction has a default mode
of 16 bits, the 66 hex prefix will make it 32-bit instead, and vice versa. So
now I guess you have to know what the default mode is on an instruction
by instruction basis, or leave it to the assembler to get it right. That is
likely no problem with PureBasic and Fasm, but maybe not so much with the
built-in assembler in HotBasic. There you have to find an external
assembler that gets it right and use it to get the missing opcodes to plug
in yourself. Makes PureBasic somewhat more attractive, right?
Anyway, here are sections 22 through 25 if you are interested:
===================================================================
22. Encoding Eight, Sixteen, and Thirty-Two Bit Operands
When Intel designed the 8086, one bit in the opcode, s,
selected between 8- and 16-bit integer operand sizes.
Later, when Intel added 32-bit integers to the architecture
with the 80386 chip, there was a problem:
three encodings were needed to support 8-, 16-, and 32-bit sizes.
The solution was an operand size prefix byte.
[Figure: x86 ADD opcode]
Intel studied the x86 instruction set and came to the conclusion that,
in a 32-bit environment, programs use
8-bit and 32-bit operands far more often than 16-bit operands.
So Intel decided to let the size bit s in the opcode select
between 8- and 32-bit operands.
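To make the size bit concrete, here is a small Python sketch (my own illustration, not from the notes) that builds the register-to-register ADD encoding for 32-bit mode. The function name and register tables are made up for the example; the encoding assumed is the usual 000000ds opcode followed by a MOD-REG-R/M byte, with d = 0.

```python
# Register numbers as used in the REG and R/M fields.
REG8 = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def encode_add_reg_reg(dst, src):
    """Encode ADD dst, src (two registers of the same size) for 32-bit mode.

    Uses the 000000ds opcode form with d = 0, so the REG field holds the
    source register and the R/M field holds the destination register.
    """
    if dst in REG8 and src in REG8:
        s, table = 0, REG8       # s = 0: 8-bit operands
    elif dst in REG32 and src in REG32:
        s, table = 1, REG32      # s = 1: 32-bit operands in 32-bit mode
    else:
        raise ValueError("expected two 8-bit or two 32-bit registers")
    opcode = 0b00000000 | s      # ADD with d = 0, size bit in bit 0
    modrm = (0b11 << 6) | (table.index(src) << 3) | table.index(dst)
    return bytes([opcode, modrm])


print(encode_add_reg_reg("cl", "al").hex(" "))    # 00 c1
print(encode_add_reg_reg("ecx", "eax").hex(" "))  # 01 c1
```

The only difference between the two encodings is bit 0 of the opcode, which is the s bit the section talks about.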
23. Encoding Sixteen Bit Operands
32-bit programs don't use 16-bit operands that often, but they do
need them now and then.
To allow for 16-bit operands, Intel lets you prefix a 32-bit mode
instruction with the operand size prefix byte, which has the value 66h.
This prefix byte tells the CPU to operate on 16-bit data rather
than 32-bit data.
[Figure: x86 instruction format]
There is nothing the programmer has to do explicitly to put an operand
size prefix byte in front of a 16-bit instruction:
the assembler does this automatically as soon as a 16-bit operand is
found in the instruction.
However, keep in mind that whenever you use a 16-bit operand in a
32-bit program, the instruction is longer by one byte:
Opcode Instruction
-------- ------------
41h INC ECX
66h 41h INC CX
Be careful about using 16-bit operands if size (and, to a lesser
extent, speed) is important: the instructions are longer, and can be
slower because of their effect on the instruction cache.
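The table above, and the "vice versa" case for 16-bit code, can be sketched in a few lines of Python. This is only an illustration: encode_inc and its mode_bits parameter are names I made up, and it covers just the one-byte INC register form (40h + register number) in 16- and 32-bit modes.

```python
REG16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def encode_inc(reg, mode_bits=32):
    """Encode INC reg using the one-byte 40h+r form.

    mode_bits is the default operand size of the code being assembled
    (16 or 32). The 66h prefix is emitted only when the register size
    differs from that default.
    """
    if mode_bits not in (16, 32):
        raise ValueError("mode_bits must be 16 or 32")
    if reg in REG32:
        size, index = 32, REG32.index(reg)
    elif reg in REG16:
        size, index = 16, REG16.index(reg)
    else:
        raise ValueError("expected a 16-bit or 32-bit register")
    prefix = b"\x66" if size != mode_bits else b""
    return prefix + bytes([0x40 + index])


print(encode_inc("ecx", 32).hex(" "))  # 41
print(encode_inc("cx", 32).hex(" "))   # 66 41
print(encode_inc("cx", 16).hex(" "))   # 41
print(encode_inc("ecx", 16).hex(" "))  # 66 41
```

So the same bytes 66h 41h mean INC CX in 32-bit code and INC ECX in 16-bit code. In this sketch the default comes from the mode the code runs in, not from the individual instruction.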
24. x86 Instruction Prefix Bytes
An x86 instruction can have up to 4 prefixes.
Each prefix adjusts interpretation of the opcode:
Lock prefix guarantees that the instruction will have exclusive
use of all shared memory until the instruction completes execution:
F0h = LOCK
Repeat prefixes for string manipulation instructions:
F3h = REP, REPE
F2h = REPNE
where REP repeats the instruction the number of times specified by the
iteration count in ECX.
The REPE and REPNE prefixes also let the loop terminate early based on the
value of the ZF CPU flag.
Related string manipulation instructions are:
MOVS, move string
STOS, store string
SCAS, scan string
CMPS, compare string, etc.
See also string manipulation sample program: rep_movsb.asm
Segment override prefix causes the memory access to use the specified segment
instead of the default segment designated for the instruction operand.
2Eh = CS
36h = SS
3Eh = DS
26h = ES
64h = FS
65h = GS
Operand override, 66h. Changes the size of data expected by the default mode of
the instruction, e.g. 16-bit to 32-bit and vice versa.
Address override, 67h. Changes the size of address expected by the instruction.
A 32-bit address switches to 16-bit and vice versa.
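As a rough illustration of the prefix list above, here is a Python sketch that peels the prefix bytes off the front of an instruction. The function name split_prefixes is made up, and the scan is deliberately naive: it assumes 16- or 32-bit code and simply treats any leading byte from the table as a prefix.

```python
# Prefix byte values as listed in section 24.
PREFIXES = {
    0xF0: "LOCK",
    0xF3: "REP/REPE",
    0xF2: "REPNE",
    0x2E: "CS",
    0x36: "SS",
    0x3E: "DS",
    0x26: "ES",
    0x64: "FS",
    0x65: "GS",
    0x66: "operand-size",
    0x67: "address-size",
}


def split_prefixes(code):
    """Split machine code bytes into (prefix names, remaining bytes)."""
    names = []
    i = 0
    while i < len(code) and code[i] in PREFIXES:
        names.append(PREFIXES[code[i]])
        i += 1
    return names, bytes(code[i:])


print(split_prefixes(bytes.fromhex("66 41")))  # (['operand-size'], b'A')
print(split_prefixes(bytes.fromhex("f3 a4")))  # (['REP/REPE'], b'\xa4')
print(split_prefixes(bytes.fromhex("41")))     # ([], b'A')
```

Here 66h 41h is the INC CX from section 23, and F3h A4h is REP MOVSB.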
25. Alternate Encodings for Instructions
To shorten program code, Intel created alternate (shorter) encodings of
some very commonly used instructions.
For example, x86 provides single-byte opcodes for
add al, constant ; one-byte opcode and no MOD-REG-R/M byte
add eax, constant ; one-byte opcode and no MOD-REG-R/M byte
The opcodes are 04h and 05h, respectively.
These instructions are one byte shorter than their standard ADD immediate
counterparts.
Note that
add ax, constant ; operand size prefix byte + one-byte opcode,
; no MOD-REG-R/M byte
requires an operand size prefix, just as a standard ADD AX, constant
instruction does, yet is still one byte shorter than the corresponding
standard version of ADD immediate.
Any decent assembler will automatically choose the shortest possible
instruction when translating the program into machine code.
Intel provides alternate encodings only for the accumulator registers
AL, AX, EAX.
This is a good reason to use accumulator registers if you have a choice
(also a good reason to take some time and study encodings of the x86
instructions).
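To check the "one byte shorter" claim, here is a Python sketch that builds both encodings of ADD with an accumulator register and a full-size constant. The name add_acc_imm is made up. I am assuming the standard forms are 80h /0 (8-bit) and 81h /0 (16/32-bit) with a MOD-REG-R/M byte of C0h for the accumulator, and I leave out the sign-extended 8-bit immediate form that an assembler may pick for small constants.

```python
import struct


def add_acc_imm(reg, imm, mode_bits=32):
    """Return (short_form, standard_form) encodings of ADD reg, imm.

    reg must be al, ax or eax. The immediate is encoded at the full
    operand size, little-endian.
    """
    size = {"al": 8, "ax": 16, "eax": 32}[reg]
    fmt = {8: "<B", 16: "<H", 32: "<I"}[size]
    imm_bytes = struct.pack(fmt, imm & ((1 << size) - 1))
    # 66h is needed only for a 16/32-bit operand that differs from the mode.
    prefix = b"\x66" if size != 8 and size != mode_bits else b""
    if size == 8:
        short_op, std_op = 0x04, 0x80
    else:
        short_op, std_op = 0x05, 0x81
    modrm = 0xC0  # mod = 11, reg field = /0 (ADD), r/m = 000 (accumulator)
    short = prefix + bytes([short_op]) + imm_bytes
    standard = prefix + bytes([std_op, modrm]) + imm_bytes
    return short, standard


for reg, imm in [("al", 5), ("eax", 0x12345678), ("ax", 0x1234)]:
    short, standard = add_acc_imm(reg, imm)
    print(reg, "|", short.hex(" "), "|", standard.hex(" "))
# al | 04 05 | 80 c0 05
# eax | 05 78 56 34 12 | 81 c0 78 56 34 12
# ax | 66 05 34 12 | 66 81 c0 34 12
```

In each case the short form drops exactly the MOD-REG-R/M byte, and the AX version keeps its 66h prefix in both forms, which matches what the section says.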