Two's Complement
Created On 10. Jun 2021
Updated: 2021-06-10 19:15:08.008346000 +0000
Created By: acidghost
Every computer uses the two's complement to represent integers. It is a great representation of simplified math. To get a number in two's complement, it has to be in binary form with inverted digits and 1 added at the end. This representation is used because it can deal well differentiating between the positive and negative numbers.
How to differentiate between positive and negative numbers?
The first bit of each number is the sign bit that distinguishes negative from positive values. If it will be 1 then the binary will be negative and 0 will stay for positive. What does this mean? Let's take for example this calculation:
In binary this will be 000011111
which stands for 15
. To form it as a negative number, we invert all the digits 11110000
and add 1 to it 11110001
. In two's complement it will be understood that this is a negative since the high order bit, 1
, will stand for the negative numbers. The calculation will then be:
-1 x 2^7
will result -128
in base 10. Now try out the addition -128 + 64 + 32 + 16 + 1
. It will be -15
.
Signed and Unsigned
There is only 1 representation for 0
. 0-1
will be -1
(signed) and 255
(unsigned). Both of them use in hex 0xff
as binary representation for b11111111
.
In two's complement every operation will always mean the same. It means -1-1
will always represent 11111110
and be 0xfe
in hex.
The CPU does not care and does not follow any specific pattern for identifying if the numbers are unsigned or signed. This is for the developers who are writing the programs to decide if they are using signed numbers or not. Respectively, then they would decide for themselves which flags should be used:
- (unsigned) b11111111 + 1 == 255 + 1 == 0 == b00000000
- (signed) b11111111 + 1 == -1 + 1 == 0 == b00000000
Note: an unsigned 1 byte integer has a range of 0 to 255 and for the signed it is -128 to 127 -> b00000000 (8bit byte)
Conditionals
In most cases we won't care about how signed or unsigned numbers are represented, however, there are important moments as in conditional operations where the signedness is important.
Recall Control flow and conditionals from Assembly References and take a look at the example below:
- cmp eax, 16; jae too_big
- unsigned comparison
- eax = 0xffffffff will result in checking 0xffffffff > 16 and a jump
- cmp eax, 16; jge too_big
- signed comparison
- eax = 0xffffffff will result in checking -1 > 16, and no jump
In both examples 16 is compared against eax, where the value is 0xffffffff
. In unsigned comparison 0xffffffff
is 4,294,967,295 (unsigned 32bit numbers range from 0 to 4,294,967,295) and since this will be more than 16 the condition will be fulfilled. In the signed comparison 0xffffffff
is -1 (signed 32bit numbers range from -2,147,483,648 to +2,147,483,647) and this will result in the jump to not be fulfilled. Respectively, when writing an application, the developer has to account for the signed and unsigned variables with the corresponding conditional instruction.
References and Further Reads
https://stackoverflow.com/questions/1049722/what-is-2s-complement
https://www.cs.cornell.edu/~tomf/notes/cps104/twoscomp.html
https://pwn.college/modules/intro
Section: Binary Exploitation (PWN)
Back