# Endianness

For multi-byte variables such as `int`, it matters in which order the bytes are stored in memory. The two common byte orders are little endian and big endian.

But first, let's recap how hexadecimal values are written and interpreted and how their bits are stored. The following variable `i` is of type `int` and uses four bytes of storage on most current platforms:

```c
int i = 0x12345678;
```

Each hexadecimal digit is called a nibble (4 bits) and has a value from 0 to f (f representing 15). In this example, the 8 is the least significant digit; it is multiplied by ${16}^{0}=1$. The next digit from the right, 7, is multiplied by ${16}^{1}=16$, so it contributes $7\cdot 16=112$. The other nibbles are handled in the same way, up to the leftmost (8th) digit, 1, which has a factor of ${16}^{7}=268\,435\,456$.
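
Written out in full, the positional sum for this example looks as follows. This is only a worked instance of the rule described above; the decimal total was computed for this illustration.

$$
\begin{aligned}
\mathtt{0x12345678} &= 1\cdot 16^{7} + 2\cdot 16^{6} + 3\cdot 16^{5} + 4\cdot 16^{4} + 5\cdot 16^{3} + 6\cdot 16^{2} + 7\cdot 16^{1} + 8\cdot 16^{0} \\
&= 268\,435\,456 + 33\,554\,432 + 3\,145\,728 + 262\,144 + 20\,480 + 1\,536 + 112 + 8 \\
&= 305\,419\,896
\end{aligned}
$$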

Each nibble can easily be converted to binary because there are only 16 different values, for example 8 = `0b1000`. Converting the nibbles one by one and writing the results next to each other yields the binary representation of the hex value:

```
0x12345678
= 0x    1    2    3    4    5    6    7    8
= 0b 0001 0010 0011 0100 0101 0110 0111 1000
```

As with decimal numbers, the least significant bit is at the far right and the most significant bit is at the far left. Bit fields are usually displayed with bit 0 (the least significant) on the right and with bit positions increasing to the left.

# Little endian

Systems using little endian byte order store the least significant byte at the lowest address.

```c
int i = 0x12345678;
```

As in decimal numbers, the rightmost digit (8) is the least significant and the leftmost digit (1) is the most significant. Two hexadecimal digits (two nibbles) are stored in one byte:

| | Address | Value | |
|---|---|---|---|
| ↑ large addresses | 0x1003 | 0x12 | most significant byte |
| | 0x1002 | 0x34 | |
| | 0x1001 | 0x56 | |
| ↓ small addresses | 0x1000 | 0x78 | least significant byte |

This becomes confusing when multiple bytes are displayed in a row. If the bytes in the row are numbered with addresses increasing from right to left, the above sequence of digits can be recognized easily:

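| 0x100f | 0e | 0d | 0c | 0b | 0a | 09 | 08 | 07 | 06 | 05 | 04 | 03 | 02 | 01 | 0x1000 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 00 | 00 | 00 | 00 | 00 | 00 | 00 | 00 | 00 | 00 | 00 | 00 | 12 | 34 | 56 | 78 |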

But if the bytes are numbered (more intuitively) from left to right, the sequence of pairs is reversed:

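| 0x1000 | 01 | 02 | 03 | 04 | 05 | 06 | 07 | 08 | 09 | 0a | 0b | 0c | 0d | 0e | 0x100f |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 78 | 56 | 34 | 12 | 00 | 00 | 00 | 00 | 00 | 00 | 00 | 00 | 00 | 00 | 00 | 00 |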

Note that only the bytes (pairs of hexadecimal digits) appear in a different order; within each pair, the less significant digit remains on the right (as in a decimal number like 42). Some hex viewers group the bytes into 16-bit words (`short`). When these are stored in little endian format and displayed from left to right, they show up as `5678 1234`.

The first display (addresses increasing from right to left) appears preferable, at least for multi-byte integers. With character strings, this is different:

```c
char str[] = "Hello world";
```

This is comparable to an array with the letter `H` in the first element, i.e. at the lowest address. Placing this string directly after the variable `i` and showing addresses increasing from right to left gives:

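| 0x100f | 0e | 0d | 0c | 0b | 0a | 09 | 08 | 07 | 06 | 05 | 04 | 03 | 02 | 01 | 0x1000 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| `'\0'` | `'d'` | `'l'` | `'r'` | `'o'` | `'w'` | `' '` | `'o'` | `'l'` | `'l'` | `'e'` | `'H'` | 12 | 34 | 56 | 78 |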

For strings, addresses increasing from left to right (as one reads English text) are preferable, although this display twists the integer again:

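| 0x1000 | 01 | 02 | 03 | 04 | 05 | 06 | 07 | 08 | 09 | 0a | 0b | 0c | 0d | 0e | 0x100f |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 78 | 56 | 34 | 12 | `'H'` | `'e'` | `'l'` | `'l'` | `'o'` | `' '` | `'w'` | `'o'` | `'r'` | `'l'` | `'d'` | `'\0'` |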

# Big endian

In big endian, the most significant byte is stored at the lowest address and the least significant byte at the highest address:

| | Address | Value | |
|---|---|---|---|
| ↑ large addresses | 0x1003 | 0x78 | least significant byte |
| | 0x1002 | 0x56 | |
| | 0x1001 | 0x34 | |
| ↓ small addresses | 0x1000 | 0x12 | most significant byte |

In this byte order, addresses are usually displayed increasing from left to right, as this makes both the multi-byte integer and the string readable:

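| 0x1000 | 01 | 02 | 03 | 04 | 05 | 06 | 07 | 08 | 09 | 0a | 0b | 0c | 0d | 0e | 0x100f |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 12 | 34 | 56 | 78 | `'H'` | `'e'` | `'l'` | `'l'` | `'o'` | `' '` | `'w'` | `'o'` | `'r'` | `'l'` | `'d'` | `'\0'` |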

# Network byte order

Documents that are exchanged between systems, and network transmissions in particular, must define the byte order. The internet protocols define a network byte order, which is big endian. There are functions to convert between host byte order and network byte order:

```c
#include <arpa/inet.h>   /* POSIX header that declares htonl() and ntohl() */

uint32_t htonl(uint32_t hostlong);   // host to network, long (32 bit)
uint32_t ntohl(uint32_t netlong);    // network to host, long (32 bit)
```

# Programming

When does a program need to care about endianness?

Of course, the byte order matters when exchanging data with other instances (other programs, or the same program running on a different system), either via files or over the network. It can be ignored only if all systems use the same byte order (for example, if all are x86 systems).

The internet protocol (BSD sockets) libraries use network byte order: an IP address that is given as an integer in host byte order must be converted with `htonl()`.

In internal data structures, the byte order matters if `union`s or pointers are used to access portions of other variables. As long as only math operations and casts are used, it can be ignored:

```c
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    union {
        uint32_t u32;
        uint8_t  u8[4];   /* the same four bytes, addressable one by one */
    } demo;

    demo.u32 = 0x12345678;

    /*
     * Accessing the bytes by address: the result depends on the byte order.
     * Little endian prints 78 and 12, big endian prints 12 and 78.
     */
    printf("lowest address:  u8[0] = %x\n", (unsigned)demo.u8[0]);
    printf("highest address: u8[3] = %x\n", (unsigned)demo.u8[3]);

    /*
     * Using math operations, the least or most significant byte can be
     * masked or calculated independently of the byte order.
     */
    printf("least significant byte: %x\n", (unsigned)(demo.u32 % 256));              /* modulo */
    printf("least significant byte: %x\n", (unsigned)(demo.u32 & 0xff));             /* bitwise AND */
    printf("most significant byte: %x\n", (unsigned)(demo.u32 / (256 * 256 * 256))); /* division */
    printf("most significant byte: %x\n", (unsigned)(demo.u32 >> 24));               /* bit shift */
    return 0;
}
```