R Hyde's Art of Assembly - best version?

Bare metal programming in PureBasic, for experienced users
oldefoxx
Enthusiast
Posts: 532
Joined: Fri Jul 25, 2003 11:24 pm

Re: R Hyde's Art of Assembly - best version?

Post by oldefoxx »

Ah, I looked through the index at the start of the above page, and went right to Section 22.
Most of the answer is there, but it is spread across sections 22 through 25. Here is
what it says:
22. Encoding Eight, Sixteen, and Thirty-Two Bit Operands

When Intel designed the 8086, one bit in the opcode, s,
selected between 8 and 16 bit integer operand sizes.


Later, when Intel added 32-bit integers to the architecture
with the 80386 chip, there was a problem:

three encodings were needed to support 8-, 16-, and 32-bit sizes.


The solution was an operand size prefix byte.

[Figure: x86 ADD opcode]
Intel studied the x86 instruction set and came to this conclusion:

in a 32-bit environment, programs were likely to use
8-bit and 32-bit operands far more often than 16-bit operands.

So Intel decided to let the size bit s in the opcode select
between 8- and 32-bit operands.


23. Encoding Sixteen Bit Operands
32-bit programs don't use 16-bit operands that often, but they do
need them now and then.

To allow for 16-bit operands, Intel lets you prefix a 32-bit mode
instruction with the operand size prefix byte, value 66h.

This prefix byte tells the CPU to operate on 16-bit data rather
than 32-bit data.

[Figure: x86 instruction format]
There is nothing the programmer has to do explicitly to put an operand
size prefix byte in front of a 16-bit instruction:

the assembler does this automatically as soon as a 16-bit operand is
found in the instruction.

However, keep in mind that whenever you use a 16-bit operand in a
32-bit program, the instruction is longer by one byte:
Opcode    Instruction
--------  ------------
41h       INC ECX
66h 41h   INC CX

Be careful about using 16-bit instructions if size (and to a lesser
extent, speed) is important, because the instructions are longer, and
can be slower because of their effect on the instruction cache.
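
The INC ECX / INC CX table above can be reproduced with a few lines of Python. This is only a sketch of the one-byte 40h+r form of INC as it is encoded in 32-bit mode; the function name encode_inc and the register tables are made up for the illustration, and in 64-bit mode the bytes 40h to 4Fh mean something else (REX prefixes), so it does not apply there.

```python
# Register numbers used in x86 encodings; 16- and 32-bit names share the numbering.
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]
REG16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]

OPERAND_SIZE_PREFIX = 0x66


def encode_inc(reg: str) -> bytes:
    """Encode INC reg for 32-bit mode using the one-byte 40h+r opcode form.

    Hypothetical helper for illustration, not part of any assembler.
    """
    if reg in REG32:
        # Default operand size in 32-bit mode: no prefix needed.
        return bytes([0x40 + REG32.index(reg)])
    if reg in REG16:
        # Same opcode byte, with 66h in front to select the 16-bit operand.
        return bytes([OPERAND_SIZE_PREFIX, 0x40 + REG16.index(reg)])
    raise ValueError(f"unsupported register: {reg}")


for name in ("ecx", "cx"):
    print(f"{encode_inc(name).hex(' ')}  inc {name}")
```

The 16-bit form is the same opcode byte with 66h in front, which is where the extra byte comes from.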

24. x86 Instruction Prefix Bytes
An x86 instruction can have up to 4 prefix bytes, one from each of four groups.
Each prefix adjusts the interpretation of the opcode:

Lock and repeat prefixes. The LOCK prefix guarantees that the instruction has
exclusive use of all shared memory until it completes execution:
F0h = LOCK

String manipulation instruction prefixes:
F3h = REP, REPE
F2h = REPNE

REP repeats the instruction the number of times specified by the iteration
count in ECX.

REPE and REPNE additionally allow the loop to terminate early, depending on the
value of the ZF CPU flag.

Related string manipulation instructions are:
MOVS, move string
STOS, store string
SCAS, scan string
CMPS, compare string, etc.

See also string manipulation sample program: rep_movsb.asm
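
To make the REP/REPNE description a little more concrete, here is a small Python model of REPNE SCASB. It is a sketch only: it assumes the direction flag is clear (EDI counts upward), treats EDI as an index into a byte buffer, and the function name repne_scasb is invented for the example.

```python
def repne_scasb(buf: bytes, al: int, ecx: int, edi: int = 0, zf: bool = False):
    """Model REPNE SCASB with the direction flag clear.

    Returns (ecx, edi, zf) as they would stand after the instruction.
    If ecx starts at 0 nothing executes and zf is returned unchanged.
    """
    while ecx != 0:
        zf = buf[edi] == al  # SCASB compares AL with the byte at [EDI]
        edi += 1             # then advances EDI
        ecx -= 1             # the repeat prefix counts ECX down
        if zf:               # REPNE stops as soon as ZF is set (a match)
            break
    return ecx, edi, zf


# Scan "hello" for the letter "l": stops after the third byte.
print(repne_scasb(b"hello", ord("l"), 5))
# Scan for a byte that is not there: ECX runs down to zero, ZF stays clear.
print(repne_scasb(b"hello", ord("z"), 5))
```

The two ways out of the loop (ECX reaching zero, or ZF changing) are what the caller has to tell apart afterwards, usually by testing ZF.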

A segment override prefix causes a memory access to use the specified segment
instead of the default segment designated for the instruction operand.
2Eh = CS
36h = SS
3Eh = DS
26h = ES
64h = FS
65h = GS

Operand-size override, 66h. Switches the operand size away from the default for
the current mode, e.g. 16-bit to 32-bit and vice versa.

Address-size override, 67h. Switches the address size expected by the instruction.
A 32-bit address could switch to 16-bit and vice versa.
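
As a rough illustration of the prefix list above, the following Python sketch peels the prefix bytes off the front of an instruction and names them. The table holds only the values listed in this section; the function name split_prefixes is hypothetical, and the sketch makes no attempt to decode the opcode that follows or to check that the combination of prefixes is sensible.

```python
# Prefix byte values as listed in section 24.
LEGACY_PREFIXES = {
    0xF0: "LOCK",
    0xF3: "REP/REPE",
    0xF2: "REPNE",
    0x2E: "CS", 0x36: "SS", 0x3E: "DS", 0x26: "ES", 0x64: "FS", 0x65: "GS",
    0x66: "operand-size",
    0x67: "address-size",
}


def split_prefixes(code: bytes):
    """Return (prefix names, remaining bytes starting at the opcode)."""
    names = []
    i = 0
    while i < len(code) and code[i] in LEGACY_PREFIXES:
        names.append(LEGACY_PREFIXES[code[i]])
        i += 1
    return names, code[i:]


# 66 41 is INC CX in 32-bit mode; 66 F3 A5 is REP MOVSW in 32-bit mode.
print(split_prefixes(bytes.fromhex("66 41")))
print(split_prefixes(bytes.fromhex("66 f3 a5")))
```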

25. Alternate Encodings for Instructions
To shorten program code, Intel created alternate (shorter) encodings of
some very commonly used instructions.

For example, x86 provides single-byte opcodes for
add al, constant ; one-byte opcode and no MOD-REG-R/M byte
add eax, constant ; one-byte opcode and no MOD-REG-R/M byte

The opcodes are 04h and 05h, respectively.

These instructions are one byte shorter than their standard ADD immediate
counterparts.

Note that
add ax, constant ; operand size prefix byte + one-byte opcode,
; no MOD-REG-R/M byte

requires an operand size prefix, just as a standard ADD AX, constant
instruction does, yet is still one byte shorter than the corresponding
standard version of ADD immediate.

Any decent assembler will automatically choose the shortest possible
instruction when translating a program into machine code.

Intel provides alternate encodings only for the accumulator registers
AL, AX, EAX.

This is a good reason to use accumulator registers if you have a choice
(also a good reason to take some time and study encodings of the x86
instructions.)
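
The one-byte saving described in section 25 can be checked by building both encodings side by side. The Python below is a sketch under the assumption of 32-bit mode; add_acc_imm is a made-up helper, and the "standard" form shown is the general ADD r/m, imm encoding (80h or 81h followed by a MOD-REG-R/M byte of C0h for the accumulator).

```python
import struct


def add_acc_imm(reg: str, imm: int, short: bool = True) -> bytes:
    """Encode ADD AL/AX/EAX, imm for 32-bit mode (illustrative helper).

    short=True  -> accumulator form: 04h / 05h, no MOD-REG-R/M byte
    short=False -> general form: 80h / 81h plus MOD-REG-R/M byte C0h
    """
    if reg == "al":
        body = struct.pack("<B", imm & 0xFF)
        return (b"\x04" if short else b"\x80\xC0") + body
    if reg == "eax":
        body = struct.pack("<I", imm & 0xFFFFFFFF)
        return (b"\x05" if short else b"\x81\xC0") + body
    if reg == "ax":
        # Same opcodes as EAX, with the 66h operand size prefix in front.
        body = struct.pack("<H", imm & 0xFFFF)
        return b"\x66" + (b"\x05" if short else b"\x81\xC0") + body
    raise ValueError(f"not an accumulator register: {reg}")


for reg, imm in (("al", 0x12), ("ax", 0x1234), ("eax", 0x12345678)):
    s = add_acc_imm(reg, imm, short=True)
    g = add_acc_imm(reg, imm, short=False)
    print(f"add {reg:3}, {imm:#x}: short {s.hex(' ')} | general {g.hex(' ')}")
```

One caveat: for small constants an assembler may pick yet another form, the sign-extended 8-bit immediate (opcode 83h), which can be shorter than the accumulator form, so the bytes you see in a listing will not always be 04h or 05h.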
Now it states in several places that a 66 hex byte in front of a 32-bit
instruction like inc eax will change the instruction to inc ax, because
Intel decided that 16-bit operands were not as prevalent when working in
32 bits. So instead of choosing between 8 bits and 16 bits as on the
8086, the size bit chooses between 8 bits and 32 bits, and a 66 hex stuck
in front drops a 32-bit instruction back to 16 bits when required. It also
makes the 386 and above somewhat incompatible with the 8086 way of doing things.

But it also states at one point that if the instruction has a default mode
of 16 bits, the 66 hex prefix will make it 32-bit instead, and vice versa. So
now I guess you have to know what the default mode is on an instruction
by instruction basis, or leave it to the assembler to get it right. Which is
likely no problem with PureBasic and FASM, but maybe not so much with the
built-in assembler in HotBasic. There you have to find an external
assembler that gets it right and gives you the missing opcodes to plug
in yourself. Makes PureBasic somewhat more attractive, right?

has-been wanna-be (You may not agree with what I say, but it will make you think).
Thorium
Addict
Posts: 1271
Joined: Sat Aug 15, 2009 6:59 pm

Re: R Hyde's Art of Assembly - best version?

Post by Thorium »

If you want to go with PureBasic, you don't need to care about instruction encoding anymore. The PureBasic inline asm has no limitations: it just uses FASM to assemble the code.

Also, there is no segmentation anymore. On Windows all addresses are linear 32-bit (or 64-bit on x64). Memory segmentation is a relic of the past.
oldefoxx
Enthusiast
Posts: 532
Joined: Fri Jul 25, 2003 11:24 pm

Re: R Hyde's Art of Assembly - best version?

Post by oldefoxx »

Whoo! That last post sort of got away from me. Somehow my paste went between
the quote and after it as well. Some of my remarks were lost as well.

To boil it down, use a prefix of 66h if you want to drop from 32-bit operands back to
16-bit operands. Unless the instruction defaults to 16 bits, in which case the 66 hex
converts it from 16 bits to 32 bits. You have to know what the default mode is for
the instruction to know what the 66 hex is going to do. Normally you leave it to the
assembler to make the right call for you. If you have to do it manually, you may have
to try it both ways and judge the effects.

That's just for switching between 32 bits and 16 bits, or back the other way. I wonder
what 66 hex and 8 bits hold in store for each other? It doesn't say.
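
The rule can be written down as a tiny Python function. This is a sketch, not a decoder: it only covers instructions that carry the size bit discussed in section 22 (called w here), the name operand_size is invented, and it assumes, as far as I understand it, that the default is a property of the mode the code runs in (16-bit or 32-bit code) rather than something that varies from one instruction to the next. It also covers the 8-bit question: when the size bit selects a byte operand, the prefix does not change the operand size.

```python
def operand_size(default_bits: int, w: int, has_66: bool) -> int:
    """Effective operand size in bits for an instruction with a size bit.

    default_bits: 16 or 32, the default operand size of the running code
    w:            the size bit in the opcode (0 = byte, 1 = full size)
    has_66:       True if a 66h operand size prefix precedes the opcode
    """
    if default_bits not in (16, 32):
        raise ValueError("default_bits must be 16 or 32")
    if w == 0:
        return 8  # byte operation: 66h makes no difference to the size
    if has_66:
        return 16 if default_bits == 32 else 32  # flip to the other size
    return default_bits


for default in (16, 32):
    for w in (0, 1):
        for prefix in (False, True):
            size = operand_size(default, w, prefix)
            print(f"default={default} w={w} 66h={prefix!s:5} -> {size}-bit")
```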
oldefoxx
Enthusiast
Posts: 532
Joined: Fri Jul 25, 2003 11:24 pm

Re: R Hyde's Art of Assembly - best version?

Post by oldefoxx »

What you said about segmentation is what I was led to understand. The argument
about whether in 16-bit segment mode or 32-bit segment mode threw me off.

It's possible to test whether 66 hex is affecting some instructions one way or another. Take
the movsw and movsd instructions, for instance. If they both assemble to the same bytes, then
somehow your assembler is making no distinction between them. One of them should
have the added 66 hex prefix in front. As to whether it's doing it the right way, that
is a bit more tricky. Put 0102h in the ax register, rol eax, 16 and it is now in the
upper 16 bits. Move 0304h into the ax register. Return the eax register to a variable.
If you have 01020304h or its decimal equivalent in the variable, your assembler
did that right. Then go the other way. Put 0102h in the ax register. Do a rol eax, 16.
Now put 0304h into eax, and return that to a variable. You should have 00000304h
or its decimal equivalent in the variable. If you instead get 01020304h or its
decimal equivalent again, that would be wrong.
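
For anyone who wants to know what numbers to expect before running the real thing, here is the register check modeled in Python. It is a sketch using the 16-bit halves 0102h and 0304h, values chosen so that each one fits the register it is loaded into; the helper names are invented, and the starting value of eax is arbitrary garbage to show it does not affect the outcome.

```python
MASK32 = 0xFFFFFFFF


def rol32(value: int, count: int) -> int:
    """Rotate a 32-bit value left by count bits, like ROL EAX, count."""
    value &= MASK32
    count %= 32
    return ((value << count) | (value >> (32 - count))) & MASK32


def mov_ax(eax: int, imm16: int) -> int:
    """MOV AX, imm16: replaces only the low 16 bits of EAX."""
    return (eax & 0xFFFF0000) | (imm16 & 0xFFFF)


def mov_eax(eax: int, imm32: int) -> int:
    """MOV EAX, imm32: replaces the whole register."""
    return imm32 & MASK32


eax = 0xDEADBEEF                # whatever happened to be in EAX
eax = mov_ax(eax, 0x0102)       # mov ax, 0102h
eax = rol32(eax, 16)            # rol eax, 16 -> 0102h now in the upper half

print(f"{mov_ax(eax, 0x0304):08X}h")   # 16-bit move keeps the upper half
print(f"{mov_eax(eax, 0x0304):08X}h")  # 32-bit move wipes it
```

If the real code gives the same value in both cases, the assembler (or a hand-placed 66 hex) is not distinguishing the 16-bit and 32-bit forms.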

For strings, try something different. Have a$="123456789" and b$="abcdef". Use ASM
code to move b$'s contents into a$, first using a single movsw, then repeat with a single
movsd. If you first get "ab3456789" and then "abcd56789" (with one byte per character), it
got that right as well. If you got "ab3456789" twice, or "abcd56789" twice, it messed up
in distinguishing between a word and a doubleword.
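
The expected strings can be worked out with a byte-level model of a single MOVSW or MOVSD. This Python sketch assumes one byte per character (an ASCII string, not Unicode) and that ESI and EDI both point at the start of their strings; movs_once is a made-up name.

```python
def movs_once(dst: bytes, src: bytes, size: int) -> bytes:
    """Model one MOVSB/MOVSW/MOVSD: copy size bytes (1, 2 or 4) from the
    start of src over the start of dst and return the resulting bytes."""
    if size not in (1, 2, 4):
        raise ValueError("size must be 1, 2 or 4")
    out = bytearray(dst)
    out[0:size] = src[0:size]
    return bytes(out)


a = b"123456789"
b = b"abcdef"

print(movs_once(a, b, 2).decode())  # movsw copies one word (2 bytes)
print(movs_once(a, b, 4).decode())  # movsd copies one doubleword (4 bytes)
```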
Thorium
Addict
Posts: 1271
Joined: Sat Aug 15, 2009 6:59 pm

Re: R Hyde's Art of Assembly - best version?

Post by Thorium »

oldefoxx wrote: What you said about segmentation is what I was led to understand. The argument
about whether in 16-bit segment mode or 32-bit segment mode threw me off.
The prefix simply selects the bit size of the instruction operand. Addressing (referencing) is always 32-bit without segmentation, or 64-bit in long mode. While the x86 does support segmentation in 32-bit mode, Windows and Linux do not use it. In long mode (64-bit) the CPU no longer applies segmentation, apart from the FS and GS base addresses.