Compact Encoding of Routing Prefixes

How many bits of information are contained in a routing prefix? When printed, a routing prefix commonly looks like an IP address followed by a slash and then a prefix length: 192.0.2.0/24 or 2001:db8::/32. That should mean a minimum of 32 + ceiling(log₂(32 + 1)) = 38 bits for an IPv4 prefix and 128 + ceiling(log₂(128 + 1)) = 136 bits for IPv6. However, it is possible to encode a prefix in just one extra bit (for a total of 33 or 129 bits).

Routers build gigantic tables of routing prefixes that map to network interfaces and various other information. They are used to determine which interface a packet should be sent to depending on the destination IP address. These tables take up a large amount of memory in the router. Perhaps encoding prefixes in fewer bits can save some memory.

The key to encoding prefixes in fewer bits is the fact that the rightmost part of the IP address in the prefix consists of zeros. The number of zeroes must be at least as many as the prefix length. Here is a binary representation of the prefix 192.0.2.240/28:

192 0 2 240

1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 1 1 1 - - - -

192	0	2	240
1	1	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	1	0	1	1	1	1	-	-	-	-

The bits marked with - correspond to the bits that have to be zero in the prefix. Since the prefix length is 28 there are 32 - 28 = 4 such bits in this example. The encoding works by making room for one extra bit in this field and then encoding the inverse of the length by making a notch (simply a 1-bit) in the leftmost part of the field, like so:

192 0 2 240

1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 1 1 1 1 0 0 0 0

192	0	2	240
1	1	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	1	0	1	1	1	1	1	0	0	0	0

To get back the prefix length all we have to do is count the bit position of the notch, and after removing it we get back the address part. Here is an encoder for IPv4 in Python:

import socket, struct

def encode_ip4_prefix(prefix):
    """Encodes an IPv4 prefix (x.y.z.w/len) as a 33-bit integer."""
    # Get the address and inverse length as integers.
    address_str, length = prefix.split('/')
    address_bin = socket.inet_pton(socket.AF_INET, address_str)
    address_int, = struct.unpack("!I", address_bin)
    inv_length = 32 - int(length)

    # Encode.
    notch = 1 << inv_length       # notch to mark the length
    mask = (1 << inv_length) - 1
    address_int &= ~mask          # clear the rightmost bits

    return (address_int << 1) | notch  # make room and place notch

This is easily translated to your language of choice if it has an integer data type that can represent a larger than 32-bit unsigned integer. For IPv6 it gets trickier in low-level languages. (Except in assembly with a carry bit, where it suddenly gets easier. Go figure.)

To decode we need to find the rightmost 1-bit. The best reference for the required bit twiddling is the book Hacker’s Delight (Henry S. Warren, Jr., Addison-Wesley). I gifted my first edition copy to a mathematician friend of mine, but in the second edition the relevant section is 5–4 “Counting Trailing 0’s”. The Python code below uses int.bit_length(), which is not uncommon in modern languages. In GNU C the code could use __builtin_ffs() and in R6RS Scheme it could be using bitwise-first-bit-set. For other languages you may need to check the book.

import socket, struct

def decode_ip4_compact_prefix(x):
    """Decodes a prefix that was encoded with encode_ip4_prefix."""
    if x == 0:
        return None

    # Decode.
    first_bit_set = x & (-x)  # isolate rightmost 1-bit
    x &= x - 1                # clear rightmost 1-bit
    inv_length = first_bit_set.bit_length() - 1
    address_int = x >> 1
    length = 32 - inv_length

    # Turn back into a string.
    address_bin = struct.pack("!I", address_int)
    address_str = socket.inet_ntop(socket.AF_INET, address_bin)

    return "{}/{}".format(address_str, length)

And that’s all you need for encoding a prefix in just one bit more than the address length itself. Whether or not it’s practical in your application is a question you’ll need to answer yourself.

This was written as part of a series of articles on stuff that has been lying around on this website for a long time without any real commentary. The code it talks about was first posted in 2011 and contains some bugs. Thanks to Christian Häggström for reading an early version of this article and showing me int.bit_length().