Created On 12. Apr 2021
Updated: 2021-06-06 01:51:25.295591000 +0000
Created By: acidghost
Assembly language is one of the most ancient and insane languages that our computers have to deal with. Actually, it is the closest thing to the only language that our machines understand, since each mnemonic maps directly onto a machine instruction. Each architecture has its own instruction set and addresses memory in different ways, so assembly is not portable between architectures. High-level languages have a great perk: functions and most lines of code can be expressed in a few human-readable words and, by including libraries and so on, the same code can be made to run on other machines. With assembly things are different. To make something out of assembly, you need to grasp the fundamentals and practice a lot. With little knowledge the limit will be hit very fast and, alas for all who give up early, Stack Overflow will be of little use here.
I will mostly be covering ways to address memory in the x86-64 CPU family, in the long mode memory model. In IA-32 the memory model is known as the protected mode flat model. There are more differences between the two, including that in 64-bit long mode the registers have been extended to a width of 64 bits and renamed with an R, for example ESP became RSP. Nowadays everything is shifting towards long mode; however, it is still worth knowing some of the addressing highlights of the prehistoric real mode flat model, which the 8008 with its 8-bit registers still ran on, and of the real mode segmented model used by the 8086 and 8088 with their 16-bit registers. For reference, 32 bits can address 4 gigabytes of memory, while 64 bits raise this to 16 exabytes (2^64 bytes), where one exabyte is about a billion gigabytes, or 2^60 bytes. Long mode is like an overdose of mana for modern devs; however, it is still essentially the same protected mode flat model, with some other differences such as how Linux handles system calls. This is why knowing the protected mode flat model is also useful for advancing in long mode.
When writing code that will be executed by the CPU, the only thing the CPU is concerned about is the binary-encoded instructions. It does not matter which language they were written in, since everything ends up as the machine code that assembly mnemonics stand for. The high-level language that maps most directly onto assembly is C. This is one of the reasons why all of the major operating systems are written in C and C++.
In ancient times, when our ancestors were still sending rockets to the moon with 2K words (about 4 KB) of RAM (literally!), the 8008 didn't exist yet. When the 8008 came, the registers were known as
a, c, d, b, e, h, l
followed by the next generation of
ax, cx, dx, bx, sp, bp, si, di.
Later, with the 32-bit generation of the x86 architecture, they were extended further to
eax, ecx, edx, ebx, esp, ebp, esi, edi
and nowadays, with amd64 we became greatly overpowered with
rax, rcx, rdx, rbx, rsp, rbp, rsi, rdi, r8, r9, r10, r11, r12, r13, r14, r15
There are also other registers that were added in various multimedia extensions.
Registers are small storage locations inside the CPU that hold the internal state of the processor. In the early days, each register had a specific job. The basic registers were known as:
- eax - Accumulator
- ebx - Base
- ecx - Counter
- edx - Data
The jobs were specific because each register was designed to handle certain operations better. For example, many arithmetic instructions have shorter encodings when they use the accumulator eax, while loop, shift and repeat counts are taken from ecx. With x86_64 this doesn't matter much anymore, and they are usually all referred to as general purpose registers.
Other important registers are:
- esi - Source Index
- edi - Destination Index
- eip - Instruction Pointer
- esp - Stack Pointer
- ebp - Base Pointer
All registers can be accessed partially through other names. For example, ax accesses the lower half of eax, and eax similarly accesses the lower half of rax. One important thing to remember is that writing to a 32-bit register wipes out the upper half of the corresponding 64-bit register. For example, if a value is moved into eax, the lower half of rax will contain it, but the upper half will be zeroed out. This is not true for the 16-bit registers or the 8-bit high (8H) and low (8L) registers: writing to ax, ah or al leaves the remaining bits of rax untouched.
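The partial-register behaviour can be modelled with plain integer masking. The sketch below is only an illustration in Python, not real register access. The helper names (write_eax, write_ax, write_al, write_ah) are made up for this example.

```python
MASK64 = 0xFFFF_FFFF_FFFF_FFFF

def write_eax(rax, value):
    """Model a write to eax: the upper 32 bits of rax are zeroed."""
    return value & 0xFFFF_FFFF

def write_ax(rax, value):
    """Model a write to ax: only bits 15..0 of rax change."""
    return (rax & ~0xFFFF & MASK64) | (value & 0xFFFF)

def write_al(rax, value):
    """Model a write to al (8L): only bits 7..0 of rax change."""
    return (rax & ~0xFF & MASK64) | (value & 0xFF)

def write_ah(rax, value):
    """Model a write to ah (8H): only bits 15..8 of rax change."""
    return (rax & ~0xFF00 & MASK64) | ((value & 0xFF) << 8)

rax = 0x1122_3344_5566_7788
print(hex(write_eax(rax, 0xAABBCCDD)))  # 0xaabbccdd
print(hex(write_ax(rax, 0xBEEF)))       # 0x112233445566beef
print(hex(write_al(rax, 0x00)))         # 0x1122334455667700
print(hex(write_ah(rax, 0xFF)))         # 0x112233445566ff88
```

Only the 32-bit write discards the old upper bits; the 16-bit and 8-bit writes merge the new value into the old contents.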
Instructions tell the CPU what to do. Remember that in Intel syntax the data flows from right to left: the destination comes first and the source second. In AT&T syntax the data flows from left to right; however, I will mostly be using the Intel one, since I find it more readable. Below is an example in Intel syntax.
mov rax, rbx
In this example the value of rbx is copied into rax. In C this would be analogous to
rax = rbx
When assembled, the instruction is encoded as the following machine code bytes (in base 16):
48 89 d8
Assembly is much more easily understood by humans when written with mnemonics; mov is a mnemonic. However, the processor always works on the encoded machine code bytes.
A good reference for x86 opcode instruction can be found here
In the machine code example above, the instruction consists of 3 bytes. 48 is a byte. A byte consists of 2 nibbles; in this case 4 is a nibble. A nibble is represented by one hexadecimal digit, which is 4 bits. A byte therefore consists of 8 bits.
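To make the byte, nibble and bit breakdown concrete, here is a small Python sketch that splits the three bytes 48 89 d8 into nibbles and bits and decodes them back into the mnemonic. It handles only this one encoding form (a REX.W prefix, opcode 89, and a register-to-register ModRM byte). The function names are made up for the illustration, so treat it as a toy rather than a disassembler.

```python
# Register numbers 0..7 as used in the ModRM byte (same order as the list above).
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]

def nibbles(byte):
    """Split a byte into its high and low 4-bit halves."""
    return byte >> 4, byte & 0xF

def decode_mov_reg_reg(code):
    """Decode only the 3-byte form: REX.W (48), opcode 89, register-direct ModRM."""
    rex, opcode, modrm = code
    if rex != 0x48 or opcode != 0x89 or modrm >> 6 != 0b11:
        raise ValueError("encoding not covered by this sketch")
    src = REG64[(modrm >> 3) & 0b111]  # reg field holds the source
    dst = REG64[modrm & 0b111]         # r/m field holds the destination
    return f"mov {dst}, {src}"

code = bytes.fromhex("4889d8")
for b in code:
    hi, lo = nibbles(b)
    print(f"{b:02x}: nibbles {hi:x} and {lo:x}, bits {b:08b}")
print(decode_mov_reg_reg(code))  # mov rax, rbx
```

The last byte, d8, is 11 011 000 in binary: mode 11 (register to register), source register 3 (rbx) and destination register 0 (rax).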
In the flags register, every bit is a flag, which means that it represents True or False. The flags change depending on the state of the program. The four basic flags are:
- Zero (ZF) - set to 1 whenever the last calculation had the result zero and cleared to 0 when it was nonzero. This is the most basic flag: after an arithmetic or comparison operation it tells whether the result was zero, for example whether two compared values were equal.
- Sign (SF) - equals the most significant bit of the last result. If the result is positive in two's complement representation it is set to 0, and if it is negative it is set to 1. For example, a result of 0xffffffff in edx sets it to 1 and a result of 1 in edx sets it to 0.
- Carry (CF) - set if the addition of two numbers causes a carry out of the most significant bit, or if a subtraction needs a borrow into it. This happens when the result of an unsigned addition or subtraction is wrong, that is, it does not fit in the register. The carry flag is set to 1 when the result wrapped around and cleared to 0 when everything is fine (no wrap around occurs).
- Overflow (OF) - set if the result of a signed addition or subtraction of two numbers is wrong. The processor looks at the most significant bits of the operands and of the result; the most significant bit is the sign of the number. The overflow flag is set when:
- addition of two positive numbers has a negative result
- addition of two negative numbers has a positive result
- "positive - negative" has a negative result
- "negative - positive" has a positive result
It won't be set when the following operations are performed:
- "positive + negative"
- "positive - positive"
- "negative - negative"
Note: as mentioned above, "wrong" does not mean that something went wrong in the processor or that the bits of the result are incorrect. Since the calculations are performed in two's complement representation, it is normal that in certain cases the binary addition of two positive numbers has a result that reads as negative when interpreted as a signed number. For example, adding al = 0x7f and cl = 0x01 results in 0x80. In hexadecimal and binary representation this looks like:
0x7f + 0x01 = 0x80
01111111 + 00000001 = 10000000
Now a question for the reader. Will the overflow flag be set or cleared? What about the carry flag?
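For readers who want to check their answer, the flag rules described above can be modelled in a few lines of Python. This is a simplified sketch of an 8-bit add covering only the four flags discussed here. The function name add8 is made up for the example.

```python
def add8(a, b):
    """Model an 8-bit add (like adding cl to al); return (result, flags)."""
    a &= 0xFF
    b &= 0xFF
    full = a + b            # result before it is cut down to 8 bits
    result = full & 0xFF
    flags = {
        "ZF": int(result == 0),
        "SF": result >> 7,  # most significant bit of the result
        "CF": int(full > 0xFF),  # carry out of bit 7 (unsigned wrap around)
        # Signed overflow: both operands have the same sign bit,
        # and the sign bit of the result differs from it.
        "OF": int(((a ^ result) & (b ^ result) & 0x80) != 0),
    }
    return result, flags

print(add8(0x7F, 0x01))  # (128, {'ZF': 0, 'SF': 1, 'CF': 0, 'OF': 1})
print(add8(0xFF, 0x01))  # (0, {'ZF': 1, 'SF': 0, 'CF': 1, 'OF': 0})
```

The first call is the example from the text: the unsigned result 0x80 fits in 8 bits, so there is no carry, but two positive signed numbers produced a negative one. The second call shows the opposite case: -1 + 1 = 0 is fine as a signed sum, but 255 + 1 wraps around as an unsigned one.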
A few examples of data manipulation:
mov rax, [rbx+10] ; memory access: load 8 bytes from the memory location at address rbx+10 into rax
inc qword [rax] ; inc - increment. Increment the 8-byte value stored at the address held in rax (the size must be stated for a memory operand)
Control flow and conditionals
jmp 0x100 ; change the value of the instruction pointer, which means jump to a different location of the program during execution
cmp rax, rbx ; compare rax with rbx. The comparison is a subtraction (rax - rbx) that only updates the flags; the result is discarded
Simple control flow program:
label:
    inc ecx ; increment ecx
    jmp label ; jump back to label
In this example, label marks a specific location in the program. Labels like this can be used anywhere in the program. Next, ecx gets incremented, and then the jmp instruction makes the program jump back to label. In this case it is an infinite loop: ecx keeps getting incremented and jmp forces the program to start over from label. jmp can also be used to jump forward in the program and skip some of its parts.
Conditional jumps consult the flags register, usually right after a cmp. Some of them are:
|Instruction|Jumps when|
|jz/je (jump if zero/equal)|ZF = 1|
|jnz/jne (jump if not zero/not equal)|ZF = 0|
|jb/jnae (jump below/not above equal; unsigned)|CF = 1|
|jbe/jna (jump below equal/not above; unsigned)|CF = 1 or ZF = 1|
|ja/jnbe (jump above/not below equal; unsigned)|CF = 0 and ZF = 0|
|jae/jnb (jump above equal/not below; unsigned)|CF = 0|
|jg/jnle (jump greater/not less equal; signed)|SF = OF and ZF = 0|
|jge/jnl (jump greater equal/not less; signed)|SF = OF|
|jl/jnge (jump less/not greater equal; signed)|SF != OF|
|jle/jng (jump less equal/not greater; signed)|SF != OF or ZF = 1|
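The difference between the unsigned (above/below) and the signed (greater/less) jumps is easiest to see by trying the same bit pattern with both. The Python sketch below models a 32-bit cmp followed by a conditional jump. The names cmp32, CONDITIONS and taken are made up for this illustration, and only the four flags described earlier are modelled.

```python
SIGN = 0x8000_0000
MASK = 0xFFFF_FFFF

def cmp32(a, b):
    """Model a 32-bit cmp: compute a - b and keep only the flags."""
    a &= MASK
    b &= MASK
    result = (a - b) & MASK
    return {
        "ZF": int(result == 0),
        "SF": result >> 31,
        "CF": int(a < b),  # a borrow was needed (unsigned a below b)
        # Signed overflow on subtraction: operands have different signs
        # and the sign of the result differs from the sign of a.
        "OF": int(((a ^ b) & (a ^ result) & SIGN) != 0),
    }

CONDITIONS = {
    "je":  lambda f: f["ZF"] == 1,
    "jne": lambda f: f["ZF"] == 0,
    "jb":  lambda f: f["CF"] == 1,
    "jbe": lambda f: f["CF"] == 1 or f["ZF"] == 1,
    "ja":  lambda f: f["CF"] == 0 and f["ZF"] == 0,
    "jae": lambda f: f["CF"] == 0,
    "jg":  lambda f: f["SF"] == f["OF"] and f["ZF"] == 0,
    "jge": lambda f: f["SF"] == f["OF"],
    "jl":  lambda f: f["SF"] != f["OF"],
    "jle": lambda f: f["SF"] != f["OF"] or f["ZF"] == 1,
}

def taken(mnemonic, a, b):
    """Would the jump be taken right after comparing a with b?"""
    return CONDITIONS[mnemonic](cmp32(a, b))

# 0xffffffff is 4294967295 when read as unsigned, but -1 when read as signed.
print(taken("ja", 0xFFFFFFFF, 1))  # True
print(taken("jl", 0xFFFFFFFF, 1))  # True
```

Both jumps are taken for the same operands: the value is above 1 as an unsigned number and less than 1 as a signed number, which is why picking the right jump family matters.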
Programs need to interact with the outside world as well, and this is where system calls come into play. Each system call is identified by a number; the available calls are listed in the man pages, see man syscalls. On Linux, a system call is triggered by placing the system call number into rax (eax on 32-bit x86), storing the arguments into registers (rdi, rsi, rdx, r10, r8, r9 on x64; ebx, ecx, edx, esi, edi, ebp on x86) and then entering the kernel with the syscall instruction on x64 or the int 80h software interrupt on x86.
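As a rough illustration of the two Linux conventions, the Python sketch below only builds the register assignment for a call; it does not execute anything. The dictionary layout and the function name syscall_registers are made up for this example, and only three well-known syscall numbers per convention are included.

```python
# Which register carries the syscall number and which carry the arguments.
CONVENTIONS = {
    "x86": {"number": "eax",
            "args": ["ebx", "ecx", "edx", "esi", "edi", "ebp"]},
    "x64": {"number": "rax",
            "args": ["rdi", "rsi", "rdx", "r10", "r8", "r9"]},
}

# A few Linux syscall numbers; they differ between the two conventions.
NUMBERS = {
    "x86": {"exit": 1, "read": 3, "write": 4},
    "x64": {"read": 0, "write": 1, "exit": 60},
}

def syscall_registers(arch, name, *args):
    """Return the register contents expected just before int 80h / syscall."""
    conv = CONVENTIONS[arch]
    if len(args) > len(conv["args"]):
        raise ValueError("at most six arguments are passed in registers")
    regs = {conv["number"]: NUMBERS[arch][name]}
    regs.update(zip(conv["args"], args))
    return regs

# write(1, buffer_address, 5); the buffer address 0x1000 is a placeholder.
print(syscall_registers("x86", "write", 1, 0x1000, 5))
# {'eax': 4, 'ebx': 1, 'ecx': 4096, 'edx': 5}
print(syscall_registers("x64", "write", 1, 0x1000, 5))
# {'rax': 1, 'rdi': 1, 'rsi': 4096, 'rdx': 5}
```

The same write call ends up in different registers and with a different number depending on whether it is made through int 80h or syscall.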
Finally, let's get to the last example of the assembly references I piled up here. The program below is one of Jeff Duntemann's programs (from his book listed at the end of this post); it converts lowercase letters from a file to uppercase. It can be assembled with the following command (works on Ubuntu 20.04 as well!):
$ nasm -f elf -g -F stabs uppercase.asm
nasm invokes the assembler, -f elf specifies that the output file should be in (32-bit) ELF format, -g indicates that debug information should be included in the .o file and -F stabs selects the stabs format for that debug information.
After that, link it (this form works on a 64-bit system):
$ ld -m elf_i386 -s -o uppercase uppercase.o
Feel free to ignore the warning. Then you can run it like this:
$ ./uppercase > outputfile < inputfile
SECTION .bss                     ; Section with uninitialized data
    BUFFLEN equ 1024             ; Length of buffer
    Buff: resb BUFFLEN           ; Text buffer
SECTION .data                    ; Section with initialised data
SECTION .text                    ; Section with code
global _start                    ; Linker needs this to find the entry point
_start:
    nop                          ; This no-op keeps gdb happy
; Read a buffer full of text from stdin:
read:
    mov eax,3                    ; Specify sys_read call
    mov ebx,0                    ; Specify File Descriptor 0: Standard Input
    mov ecx,Buff                 ; Pass offset of the buffer to read to
    mov edx,BUFFLEN              ; Pass number of bytes to read at one pass
    int 80h                      ; Call sys_read to fill the buffer
    mov esi,eax                  ; Copy sys_read return value for safekeeping
    cmp eax,0                    ; If eax=0, sys_read reached EOF on stdin
    je Done                      ; Jump if Equal (to 0, from compare)
; Set up registers for the process buffer step:
    mov ecx,esi                  ; Place the number of bytes read into ecx
    mov ebp,Buff                 ; Place address of buffer into ebp
    dec ebp                      ; Adjust count to offset
; Go through the buffer and convert lowercase to uppercase characters:
Scan:
    cmp byte [ebp+ecx],61h       ; Test input char against lowercase 'a'
    jb Next                      ; If below 'a' in ASCII, not lowercase
    cmp byte [ebp+ecx],7Ah       ; Test input char against lowercase 'z'
    ja Next                      ; If above 'z' in ASCII, not lowercase
    sub byte [ebp+ecx],20h       ; Subtract 20h to give uppercase
Next:
    dec ecx                      ; Decrement counter
    jnz Scan                     ; If characters remain, loop back
; Write the buffer full of processed text to stdout:
    mov eax,4                    ; Specify sys_write call
    mov ebx,1                    ; Specify File Descriptor 1: Standard Output
    mov ecx,Buff                 ; Pass offset of the buffer
    mov edx,esi                  ; Pass the # of bytes of data in the buffer
    int 80h                      ; Make sys_write kernel call
    jmp read                     ; Loop back and load another buffer full
; All done! Let's end this party
Done:
    mov eax,1                    ; Code for Exit Syscall
    mov ebx,0                    ; Return a code of zero
    int 80h                      ; Make sys_exit kernel call
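For comparison, the same logic can be written in Python. This is not Duntemann's code, only a rough equivalent for reading alongside the listing. The function names are made up, and the streams are in-memory stand-ins for stdin and stdout.

```python
import io

BUFFLEN = 1024  # same buffer length as in the assembly listing

def uppercase_chunk(buf):
    """Convert ASCII 'a'..'z' (61h..7Ah) to uppercase by subtracting 20h."""
    out = bytearray(buf)
    # Like the Scan loop, walk from the last byte down to the first.
    for i in range(len(out) - 1, -1, -1):
        if 0x61 <= out[i] <= 0x7A:
            out[i] -= 0x20
    return bytes(out)

def uppercase_stream(src, dst):
    """Read BUFFLEN bytes at a time until EOF, writing the converted text."""
    while True:
        chunk = src.read(BUFFLEN)   # corresponds to the sys_read step
        if not chunk:               # zero bytes read means EOF
            break
        dst.write(uppercase_chunk(chunk))  # corresponds to sys_write

src = io.BytesIO(b"Hello, world! {z}a`\n")
dst = io.BytesIO()
uppercase_stream(src, dst)
print(dst.getvalue())  # b'HELLO, WORLD! {Z}A`\n'
```

The characters { (7Bh) and ` (60h) sit just outside the a..z range and are left alone, which is exactly what the two cmp instructions with the jb and ja jumps check for.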
Where to learn more about Assembly
There are many resources that help in understanding it better. As with everything else, in addition to the references below, try your favorite search engine, a good book or a course on YouTube.
- If you have never seen assembly before, this is a nice one to get you slowly started with x86 - https://compas.cs.stonybrook.edu/~nhonarmand/courses/sp17/cse506/ref/pcasm-book.pdf
- A guide to assembly and differences between the Intel and AT&T syntax (this is the only time I'm showing something related to AT&T syntax) - http://www.delorie.com/djgpp/doc/brennan/brennan_att_inline_djgpp.html
- 8086 - http://www2.imm.dtu.dk/courses/02131/asm/asm01001.htm
- Various references and cheat sheets - https://www.cs.uaf.edu/2011/fall/cs301/
- Introduction to X86_X64 - https://compas.cs.stonybrook.edu/~nhonarmand/courses/sp17/cse506/ref/assembly.html
- Good guide to assembly and programming with Raspberry PI - https://bob.cs.sonoma.edu/IntroCompOrg-RPi/intro-co-rpi.html
- Here are your favorite books, the Intel Manuals - https://software.intel.com/content/www/us/en/develop/articles/intel-sdm.html
- A very compact assembly reference guide, everything a beginner needs - http://www.bitsavers.org/pdf/borland/turbo_assembler/Turbo_Assembler_Version_5_Quick_Reference.pdf
- A nice assembly cheat sheet - https://github.com/2OURC3/Assembly/blob/assembly/CheatSHEET%20-ASM.s
Duntemann J. - Assembly Language Step by Step, 3rd Edition. Programming with Linux
PWN COLLEGE - https://pwn.college/modules/intro