Two's Complement

Created On 10. Jun 2021

Updated: 2021-06-10 19:15:08.008346000 +0000

Created By: acidghost

Every computer uses the two's complement to represent integers. It is a great representation of simplified math. To get a number in two's complement, it has to be in binary form with inverted digits and 1 added at the end. This representation is used because it can deal well differentiating between the positive and negative numbers.

How to differentiate between positive and negative numbers?

The first bit of each number is the sign bit that distinguishes negative from positive values. If it will be 1 then the binary will be negative and 0 will stay for positive. What does this mean? Let's take for example this calculation:

equation

In binary this will be 000011111 which stands for 15. To form it as a negative number, we invert all the digits 11110000 and add 1 to it 11110001. In two's complement it will be understood that this is a negative since the high order bit, 1, will stand for the negative numbers. The calculation will then be:

equation

-1 x 2^7 will result -128 in base 10. Now try out the addition -128 + 64 + 32 + 16 + 1. It will be -15.

Signed and Unsigned

There is only 1 representation for 0. 0-1 will be -1 (signed) and 255 (unsigned). Both of them use in hex 0xff as binary representation for b11111111.
In two's complement every operation will always mean the same. It means -1-1 will always represent 11111110 and be 0xfe in hex.

The CPU does not care and does not follow any specific pattern for identifying if the numbers are unsigned or signed. This is for the developers who are writing the programs to decide if they are using signed numbers or not. Respectively, then they would decide for themselves which flags should be used:

  • (unsigned) b11111111 + 1 == 255 + 1 == 0 == b00000000
  • (signed) b11111111 + 1 == -1 + 1 == 0 == b00000000

Note: an unsigned 1 byte integer has a range of 0 to 255 and for the signed it is -128 to 127 -> b00000000 (8bit byte)

Conditionals

In most cases we won't care about how signed or unsigned numbers are represented, however, there are important moments as in conditional operations where the signedness is important.
Recall Control flow and conditionals from Assembly References and take a look at the example below:

- cmp eax, 16; jae too_big
  1. unsigned comparison
  2. eax = 0xffffffff will result in checking 0xffffffff > 16 and a jump
- cmp eax, 16; jge too_big
  1. signed comparison
  2. eax = 0xffffffff will result in checking -1 > 16, and no jump

In both examples 16 is compared against eax, where the value is 0xffffffff. In unsigned comparison 0xffffffff is 4,294,967,295 (unsigned 32bit numbers range from 0 to 4,294,967,295) and since this will be more than 16 the condition will be fulfilled. In the signed comparison 0xffffffff is -1 (signed 32bit numbers range from -2,147,483,648 to +2,147,483,647) and this will result in the jump to not be fulfilled. Respectively, when writing an application, the developer has to account for the signed and unsigned variables with the corresponding conditional instruction.

References and Further Reads

https://stackoverflow.com/questions/1049722/what-is-2s-complement
https://www.cs.cornell.edu/~tomf/notes/cps104/twoscomp.html
https://pwn.college/modules/intro

Section: Binary Exploitation (PWN)

Back