July 17, 2023
Network Addresses: History and Characteristics of IP Addresses
From the beginning of networking, remote hosts were identified by short strings called hostnames, which spared users from tediously remembering numeric network addresses. This post is about the history and details of these addresses, from the early 1970s to IPv4 and IPv6.
When the ARPANET brought the early networks together, they were interconnected by IMPs, the fridge-sized ancestors of today’s routers, located at key positions and using the existing telephone lines. Addresses were highly dependent on the architecture, operating system, and type of network, but these IMPs allowed different networks to speak to each other, as long as they followed the correct protocol (RFC 1).
Users who wanted to connect to a remote server would use a terminal and make a connection using a hostname. This was mapped to an address that was one octet (8 bits) long, of which the high two bits designated the host number, and the low six bits the IMP number. This allowed for 64 IMPs, each with a maximum of 4 hosts, for a total of 256 distinct hosts, which was deemed large enough for the foreseeable future.
As examples of network addresses, the following table lists the hosts that were available in 1972 on the first four IMPs, out of the 35 that already existed:
| Address | IMP | Host | Site |
|---|---|---|---|
| 1 | 1 | 0 | UCLA, Network Measurement Center |
| 65 | 1 | 1 | UCLA, Campus Computing Network |
| 2 | 2 | 0 | SRI, Augmented Research Center |
| 66 | 2 | 1 | SRI, Artificial Intelligence Group |
| 4 | 4 | 0 | University of Utah |
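The bit layout described above can be checked against the table. The following is a minimal Python sketch; the function name `decode_arpanet_address` is made up for this illustration and does not come from any historical software.

```python
def decode_arpanet_address(address: int) -> tuple[int, int]:
    """Split a one-octet address into (host, imp)."""
    if not 0 <= address <= 0xFF:
        raise ValueError("address must fit in one octet")
    host = address >> 6        # high two bits: host number (0 to 3)
    imp = address & 0b111111   # low six bits: IMP number (0 to 63)
    return host, imp


# Addresses taken from the table above.
for address in (1, 65, 2, 66, 4):
    host, imp = decode_arpanet_address(address)
    print(f"address {address:3}: IMP {imp}, host {host}")
```

Address 65, for instance, is 01 000001 in binary, which decodes to host 1 on IMP 1.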
Needless to say, as interest grew and networks were added, a single octet for network addresses was no longer sufficient, and work towards a replacement began. After a few years as a work in progress, RFC 760 was published at the turn of the decade, in January 1980, proposing a better alternative known as IPv4.
IPv4 addresses were four octets (32 bits) long, the first of which designated the network number. This allowed for four times as many networks, 256, each with up to 16 million hosts. These new addresses were four times longer than the previous ones, and were deemed large enough for the foreseeable future.
xxxxxxxx yyyyyyyy yyyyyyyy yyyyyyyy
IP Address Classes
The idea was that organisations and companies, instead of requesting a network number, would request a network class that would be adequate for their needs. With this classful system, more than 2 million networks would be available.
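The figure of more than 2 million networks can be verified with a little arithmetic. The sketch below assumes the class layout from RFC 791 (class A: leading bit 0 and 7 network bits; class B: leading bits 10 and 14 network bits; class C: leading bits 110 and 21 network bits). The names `NETWORK_BITS` and `address_class` are illustrative only.

```python
# Number of network-number bits per class, assuming the RFC 791 layout.
NETWORK_BITS = {"A": 7, "B": 14, "C": 21}

networks_per_class = {name: 2 ** bits for name, bits in NETWORK_BITS.items()}
total_networks = sum(networks_per_class.values())


def address_class(first_octet: int) -> str:
    """Return the class implied by the leading bits of the first octet."""
    if first_octet < 0b10000000:   # leading bit 0
        return "A"
    if first_octet < 0b11000000:   # leading bits 10
        return "B"
    if first_octet < 0b11100000:   # leading bits 110
        return "C"
    return "other"


for name, count in networks_per_class.items():
    print(f"class {name}: {count:,} networks")
print(f"total: {total_networks:,} networks")
```

Almost all of that total comes from class C, which is also the class with the fewest hosts per network.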
The Internet became increasingly popular in the mid-1980s, when every company started using this new tool called electronic mail. It soon became evident that not even network classes would be enough, and RFC 950 was published in 1985 to standardize how a network could be split into an arbitrary number of subnets, optionally configured at each network level.
For example, a class B network could be split into 16 subnets (four bits), leaving 4,096 hosts (12 bits) in each subnet:
10xxxxxx xxxxxxxx yyyyzzzz zzzzzzzz
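As a rough illustration of that layout, the sketch below splits a class B address into its network, subnet, and host fields. The function name `split_class_b` and the sample address 172.16.52.86 are assumptions chosen for this example; the four-bit subnet width is the one used in the text.

```python
import ipaddress


def split_class_b(address: str, subnet_bits: int = 4) -> tuple[int, int, int]:
    """Return (network, subnet, host) for a subnetted class B address."""
    if not 1 <= subnet_bits <= 15:
        raise ValueError("subnet_bits must be between 1 and 15")
    value = int(ipaddress.IPv4Address(address))
    if value >> 30 != 0b10:
        raise ValueError("not a class B address")
    host_bits = 16 - subnet_bits
    network = (value >> 16) & 0x3FFF                           # 14 x bits after "10"
    subnet = (value >> host_bits) & ((1 << subnet_bits) - 1)   # y bits
    host = value & ((1 << host_bits) - 1)                      # z bits
    return network, subnet, host


network, subnet, host = split_class_b("172.16.52.86")
print(f"network {network:#06x}, subnet {subnet}, host {host}")
```

With four subnet bits, 172.16.52.86 falls in subnet 3, because the third octet (52) is 0011 0100 in binary and its high four bits are 0011.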
Even if subnets were limited to within a specific network, this was still an important change in how routing worked. Before that, organizations that wanted different networks, e.g. for separate buildings, would require multiple IP address ranges. If an organization had four buildings with 1,000 hosts each, it would need four class B networks, and would only use 4,000 IP addresses out of the ~260,000 that would be available to it. Subnets allowed for more efficient use of the network address ranges, since that same organization could now subnet a single class B network.
At the beginning of the 1990s, RFC 1380 was published, explaining some problems that required immediate attention. In particular, the Internet faced two imminent and critical issues:
- class B networks would soon be exhausted
- the routing tables were becoming too large for the current technology
The source of the first issue was that already in 1992, barely 10 years after they were introduced, 54% of class A and 43% of class B networks were assigned. Since class A networks were too large and rarely handed out, and class C networks were too small for a typical organization, class B was going to run out.
The second issue arose because computer performance took roughly 18 months to double, whereas the Internet was doubling in size every 12 months. Router hardware simply could not keep up for long.
Some help was given to the routing tables with the Border Gateway Protocol, proposed in 1989 with RFC 1105, and implemented on the Internet around 1994. With BGP, not all routers needed to keep the complete routing table. This allowed a larger network range to be forwarded to another BGP router, which could be further refined as it neared its destination.
There was also a third issue, although it did not require immediate attention like the other two: an actual shortage of IPv4 addresses. It was eventually going to happen, so the only thing that anyone could do was delay it.
With all that, the mid-1990s were right when computers became affordable to almost everyone. Having reached most other countries by then, the Internet was about to see its biggest growth yet, exacerbating the pressing need to find an adequate replacement for IPv4.
RFC 1517 was published in September 1993 describing Classless Inter-Domain Routing, or CIDR, a measure to slow down the issues mentioned earlier. CIDR removed the old classful network ranges in favor of an additional number that defined how many bits were reserved for the network prefix.
Routing throughout the Internet now needed this additional piece of information. Unallocated IP ranges, mostly class C at this point, could now be assigned with more flexibility, according to an organization’s needs. This system also allowed IP addresses to be better aggregated during routing, simplifying the routing tables.
At its core, CIDR is just a bitmask. For example, a CIDR value of 22 would result in a bitmask with the highest 22 bits set to 1, and the network prefix could be calculated by ANDing the IP address and the bitmask:
    xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx
AND 11111111 11111111 11111100 00000000
---------------------------------------
    xxxxxxxx xxxxxxxx xxxxxx00 00000000
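The same AND operation can be sketched in Python. The helper names `cidr_mask` and `network_prefix` are made up for this example, and 10.18.52.86 is simply a sample address.

```python
import ipaddress


def cidr_mask(prefix_length: int) -> int:
    """Return a 32-bit mask with the highest `prefix_length` bits set."""
    if not 0 <= prefix_length <= 32:
        raise ValueError("prefix length must be between 0 and 32")
    return (0xFFFFFFFF << (32 - prefix_length)) & 0xFFFFFFFF


def network_prefix(address: str, prefix_length: int) -> str:
    """AND the address with the mask and return the result in dot-decimal."""
    value = int(ipaddress.IPv4Address(address))
    return str(ipaddress.IPv4Address(value & cidr_mask(prefix_length)))


print(f"{cidr_mask(22):032b}")            # 22 ones followed by 10 zeroes
print(network_prefix("10.18.52.86", 22))
```

For 10.18.52.86 with a prefix of 22 bits, the low two bits of the third octet and the whole fourth octet are cleared, which gives 10.18.52.0.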
Network Address Translation, or NAT, was proposed in 1994 with RFC 1631. Before that, a company would need an IP address for each computer that was connected to the Internet. NAT helped by allowing a NAT gateway to map the entire private network to a single IP address, using port numbers.
After a couple of years, IPv6 was proposed in December 1995 as the successor to IPv4 with RFC 1883, and later standardized in 1998. Because it changed the Internet Protocol itself, IPv6 is not backward-compatible with IPv4, and a number of transition mechanisms have been implemented over the years during a slow migration that is still going on to this day.
IPv6 addresses are 16 octets (128 bits) long. The first half identifies the network and subnet, and the second half identifies the host within the network. This results in 2^64 (more than 18 quintillion) networks, each with as many possible hosts. These addresses are again four times longer than the previous ones, and were deemed large enough, again, for the foreseeable future.
IP Address Format
So far, only the internal format of IP addresses has been mentioned. An IPv4 address is 4 octets (32 bits) long, while an IPv6 address is 16 octets (128 bits). This is how they are used internally by hardware and software, just a simple integer value, and how they should be persisted to a database (not as a string!). Even better, whenever possible, the features that a database offers around IP addresses should be used. For example, PostgreSQL has types such as inet and cidr, and functions that can be used on them.
This raw value, e.g. 0x7f000001 for the localhost IPv4 address, is obviously not how they are to be represented for humans to read, and the following sections show how IP addresses are displayed and parsed.
Displaying IPv4 Addresses
An IPv4 address is displayed by writing each octet in decimal, separated by dot characters. This is called dot-decimal notation. Since leading zeroes are not displayed, each part ranges from 0 to 255. For example, the localhost IPv4 address is displayed as 127.0.0.1.
Parsing IPv4 Addresses
From the dot-decimal notation, it is easy to parse a string into its IPv4 address: split the string at every "." character, convert each part into an integer value, shift each by the appropriate number of bits, then add them all together.
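Both directions, displaying and parsing, can be sketched in a few lines of Python. This is a minimal illustration that only handles strict dot-decimal notation; the function names `format_ipv4` and `parse_ipv4_strict` are invented for this example.

```python
def format_ipv4(value: int) -> str:
    """Display a 32-bit integer in dot-decimal notation."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("value must fit in 32 bits")
    octets = [(value >> shift) & 0xFF for shift in (24, 16, 8, 0)]
    return ".".join(str(octet) for octet in octets)


def parse_ipv4_strict(text: str) -> int:
    """Parse strict dot-decimal notation into a 32-bit integer."""
    parts = text.split(".")
    if len(parts) != 4:
        raise ValueError("expected exactly four parts")
    value = 0
    for part in parts:
        # Only ASCII decimal digits, and no leading zeroes.
        if not (part.isascii() and part.isdigit()):
            raise ValueError(f"invalid part: {part!r}")
        if len(part) > 1 and part[0] == "0":
            raise ValueError(f"leading zero in part: {part!r}")
        octet = int(part)
        if octet > 255:
            raise ValueError(f"part out of range: {part!r}")
        value = (value << 8) | octet   # shift previous octets, add this one
    return value


print(format_ipv4(0x7F000001))
print(hex(parse_ipv4_strict("127.0.0.1")))
```

The raw value 0x7f000001 and the string 127.0.0.1 round-trip through these two functions.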
Most systems and tools use inet_aton (and its inverse, inet_ntoa) to convert a string into an IPv4 address’ underlying 32 bits. inet_aton is a library function that was added in the BSD operating system in the mid-1980s, and which gained popularity in other operating systems.
Unfortunately, parsing only dot-decimal notation would be too easy, and there are other ways to write an IPv4 address string that can be parsed into its internal representation.
Some parts can be intentionally left out. When this happens, the last part is expanded to fill the missing ones. 127.0.1 is valid and is expanded into 127.0.0.1. The string 10.0.4660 is expanded into 10.0.18.52.
Where do the 18 and the 52 come from? It is easier to use hexadecimal to illustrate what is happening. The decimal number 4660, in hexadecimal, is written as 0x1234. Since each octet is two hexadecimal digits, the last two parts thus become 0x12 and 0x34, which are displayed in decimal as 18 and 52.
Similarly, 10.1193046 expands to 10.18.52.86 and, taken to the extreme, even the first part can be omitted: 168965206 expands into 10.18.52.86 as well.
Sometimes, this may be seen as a handy shortcut if the application allows for it, for example 1.1 for Cloudflare’s alternate DNS IP address.
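The expansion of the last part is just a matter of spreading one integer over the remaining octets. A small Python sketch of that arithmetic follows; the helper name `spread_last_part` is hypothetical.

```python
def spread_last_part(value: int, octet_count: int) -> list[int]:
    """Spread an integer over `octet_count` octets, most significant first."""
    return list(value.to_bytes(octet_count, "big"))


print(spread_last_part(4660, 2))        # 4660 is 0x1234
print(spread_last_part(1193046, 3))     # 1193046 is 0x123456
print(spread_last_part(168965206, 4))   # 168965206 is 0x0a123456
print(spread_last_part(1, 3))           # the trailing "1" of 1.1
```

The last call shows why 1.1 works as a shortcut: the trailing 1 is spread over three octets as 0, 0, 1, giving 1.0.0.1.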
The parts of an IPv4 address are typically displayed in decimal, but this is not always required. IPv4 addresses can be written in hexadecimal (0xa.0x12.0x34.0x56) and octal (012.022.064.0126).
This is usually tied to the system implementation that converts a string to a number, and is why it is bad practice to display IP addresses in decimal with padded zeroes, e.g. 010.018.052.086. This may look nice in a column of monospaced IP addresses, but copy-pasting these IP addresses might very well have them parsed as octal.
To make matters worse, all of these can be mishmashed together to create monstrosities such as 10.022.0x3456, which results in 10.18.52.86. This is because IPv4 address representation was never standardized, and this led to differing and confusing implementations over the years.
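To make these rules concrete, here is a rough Python sketch of a lenient parser in the spirit of inet_aton. It is an approximation written for this post, not the actual library implementation, and the names `parse_part` and `parse_ipv4_lenient` are made up.

```python
import string


def parse_part(part: str) -> int:
    """Parse one part as hexadecimal (0x prefix), octal (0 prefix) or decimal."""
    lowered = part.lower()
    if lowered.startswith("0x"):
        digits, base, allowed = lowered[2:], 16, string.hexdigits
    elif len(lowered) > 1 and lowered.startswith("0"):
        digits, base, allowed = lowered[1:], 8, string.octdigits
    else:
        digits, base, allowed = lowered, 10, string.digits
    if not digits or any(ch not in allowed for ch in digits):
        raise ValueError(f"invalid part: {part!r}")
    return int(digits, base)


def parse_ipv4_lenient(text: str) -> int:
    """Parse one to four parts; the last part fills all remaining octets."""
    parts = [parse_part(part) for part in text.split(".")]
    if len(parts) > 4:
        raise ValueError("too many parts")
    *leading, last = parts
    if any(part > 0xFF for part in leading):
        raise ValueError("leading parts must fit in one octet")
    last_bits = 8 * (4 - len(leading))
    if last >= 1 << last_bits:
        raise ValueError("last part is too large")
    value = 0
    for part in leading:
        value = (value << 8) | part
    return (value << last_bits) | last


for text in ("10.18.52.86", "10.0.4660", "0xa.0x12.0x34.0x56", "10.022.0x3456"):
    print(text, "->", hex(parse_ipv4_lenient(text)))
```

In this sketch, the padded form 010.018.052.086 is rejected outright, because 018 and 086 are not valid octal numbers.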
Even if the various ways to write IPv4 addresses are technically valid, it is strongly recommended to only use the typical dot-decimal IPv4 format. Taking two modern programming languages as examples, Golang accepts any format mentioned above, while Rust only accepts the “strict” format by following the recommendations in RFC 6943.
Displaying IPv6 Addresses
An IPv6 address is displayed as eight groups of two octets in hexadecimal, separated by colon characters, for example 2001:db8::ff00:42:8329.
This example uses the following two rules to shorten an otherwise much longer representation (2001:0db8:0000:0000:0000:ff00:0042:8329):
- Leading zeroes in a group do not indicate a different base and are removed.
- Consecutive groups of 0000 are replaced with ::, although only once.
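Python's standard ipaddress module applies both shortening rules, which makes them easy to try out. The address below is an assumed example from the 2001:db8::/32 range.

```python
import ipaddress

address = ipaddress.IPv6Address("2001:0db8:0000:0000:0000:ff00:0042:8329")

# Shortened form: leading zeroes dropped, one run of zero groups replaced by "::".
print(address.compressed)

# Full form: eight groups of four hexadecimal digits.
print(address.exploded)
```

Both strings describe the same 128-bit value; only the presentation differs.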
There is only one well-defined exception to these rules, a format to help systems in a mixed environment of IPv4 and IPv6, with the last four octets written in dot-decimal as if they were an IPv4 address:
Parsing IPv6 Addresses
Parsing a typical IPv6 address string is straightforward: split the string by ":", convert each hexadecimal part into an integer value, bit-shift each by the correct amount, and add them all together.
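A minimal Python sketch of these steps follows, with one extra step to expand a "::" into the missing groups of zeroes. The function name `parse_ipv6` is hypothetical, and the sketch deliberately ignores the mixed format with a dot-decimal tail.

```python
import string


def parse_ipv6(text: str) -> int:
    """Parse a colon-separated IPv6 string into a 128-bit integer."""
    if text.count("::") > 1:
        raise ValueError('"::" may appear only once')
    if "::" in text:
        head, tail = text.split("::")
        head_groups = head.split(":") if head else []
        tail_groups = tail.split(":") if tail else []
        missing = 8 - len(head_groups) - len(tail_groups)
        if missing < 1:
            raise ValueError('"::" must replace at least one group')
        groups = head_groups + ["0"] * missing + tail_groups
    else:
        groups = text.split(":")
    if len(groups) != 8:
        raise ValueError("expected eight groups")
    value = 0
    for group in groups:
        if not 1 <= len(group) <= 4 or any(ch not in string.hexdigits for ch in group):
            raise ValueError(f"invalid group: {group!r}")
        value = (value << 16) | int(group, 16)   # each group is 16 bits wide
    return value


print(hex(parse_ipv6("2001:db8::ff00:42:8329")))
print(parse_ipv6("::1"))
```

A production parser would also need to handle zone identifiers and the dot-decimal tail, which is one reason to prefer an existing library function.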
Similarly to IPv4, there are library functions related to IPv6 addresses: inet_pton and its inverse, inet_ntop. inet_pton parses a string into an internal IPv6 value, and also works with IPv4 addresses, although contrary to inet_aton, these must be in dot-decimal notation.
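Python exposes these functions through its socket module, which makes the difference easy to observe. This is a small sketch that assumes a platform where socket.inet_pton is available; the helper name `accepted_by_inet_pton` is invented for the example.

```python
import socket

# Packed binary forms: 16 octets for IPv6, 4 octets for IPv4.
packed_v6 = socket.inet_pton(socket.AF_INET6, "2001:db8::ff00:42:8329")
packed_v4 = socket.inet_pton(socket.AF_INET, "10.18.52.86")
print(packed_v6.hex())
print(packed_v4.hex())

# inet_ntop is the inverse: packed octets back to a string.
print(socket.inet_ntop(socket.AF_INET, packed_v4))


def accepted_by_inet_pton(text: str) -> bool:
    """Report whether inet_pton accepts `text` as an IPv4 address."""
    try:
        socket.inet_pton(socket.AF_INET, text)
    except OSError:
        return False
    return True


print(accepted_by_inet_pton("10.18.52.86"))
print(accepted_by_inet_pton("10.1193046"))
```

The shortened form 10.1193046 is refused here, even though a lenient inet_aton-style parser would accept it.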
As mentioned previously, CIDR notation is used as a bitmask to separate the network and the host. When displayed together with an IPv4 or IPv6 address, it directly follows the IP address with a slash character.
For example, the IPv4 address
188.8.131.52 belongs in a network that takes 24 bits. When talking about networks, it is displayed as
184.108.40.206/24. The IPv4 address
220.127.116.11 is in the network
18.104.22.168/18, which indicates that the network starts at
22.214.171.124 and ends at
IPv6 addresses with CIDR notation are displayed exactly the same. 2001:db8::ff00:42:8329 is in the 2001:db8::/32 network, which means that the first 32 bits represent the network number. The addresses within that network then go from 2001:db8:: to 2001:db8:ffff:ffff:ffff:ffff:ffff:ffff.
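The first and last addresses of a network can be computed from any address and its prefix length. The sketch below uses Python's ipaddress module; the function name `network_range` is hypothetical, and 10.18.52.86 with an 18-bit prefix is an assumed example.

```python
import ipaddress


def network_range(address: str, prefix_length: int) -> tuple[str, str, str]:
    """Return (network in CIDR notation, first address, last address)."""
    # strict=False lets the host bits of `address` be masked off.
    network = ipaddress.ip_network(f"{address}/{prefix_length}", strict=False)
    return str(network), str(network[0]), str(network[-1])


print(network_range("10.18.52.86", 18))
print(network_range("2001:db8::ff00:42:8329", 32))
```

For the IPv4 example, an 18-bit prefix keeps only the high two bits of the third octet, so the network spans 10.18.0.0 to 10.18.63.255.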
Displaying Port Numbers
Until now, only IP addresses have been mentioned. On top of the Internet Layer lies the Transport Layer, which comes with port numbers. Where an IP address targets a host, a port number targets a process on that host.
When a port number is mentioned with an IPv4 address, they are displayed together and separated with a colon character. For example, 126.96.36.199:443 means the port 443 on the host located at 126.96.36.199.
On the other hand, IPv6 addresses already use the colon character, so to prevent any ambiguity, the address is enclosed in brackets, e.g. [2001:db8::ff00:42:8329]:22 for the port 22 on the host located at 2001:db8::ff00:42:8329.
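The bracket rule is easy to capture in a small helper. This is a sketch only: `join_host_port` is a name chosen for this example, and 10.18.52.86 is a sample address.

```python
import ipaddress


def join_host_port(host: str, port: int) -> str:
    """Combine an IP address and a port, bracketing IPv6 addresses."""
    if not 0 <= port <= 65535:
        raise ValueError("port must be between 0 and 65535")
    address = ipaddress.ip_address(host)
    if address.version == 6:
        return f"[{address}]:{port}"
    return f"{address}:{port}"


print(join_host_port("10.18.52.86", 443))
print(join_host_port("2001:db8::ff00:42:8329", 22))
```

Without the brackets, the last colon of the IPv6 form could be read either as a group separator or as the start of the port number.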
Assignment of IP Addresses
This section gives an overview of how IP addresses are distributed, and how the end-user gets one.
The IANA is the organisation in charge of assigning Internet numbers, including, most relevant for this post, IP addresses. It assigns IPv4 and IPv6 address blocks to the 5 Regional Internet Registries (RIR), each of which manages a region of the world: AFRINIC, APNIC, ARIN, LACNIC, and RIPE NCC. A RIR then assigns sub-blocks to Local Internet Registries (LIR), which are Internet Service Providers (ISP), large companies, or academic institutions.
Note that some RIRs instead assign blocks to National Internet Registries (NIR), which are essentially middlemen that represent a country. These NIRs then assign sub-blocks to LIRs.
An ISP is a company that connects end-users to the Internet. As seen previously, ISPs receive blocks of IP addresses from a RIR (or NIR), which they then assign individually to their customers. For most residential customers, that IP address is dynamic, and may change over time. Static addresses, on the other hand, do not change, but must usually be purchased. These are desired by companies that are accessible from the Internet, which would otherwise need to reconfigure their servers whenever their IP address changed.
When a user’s router is connected to the modem and turned on, it receives an IP address from the ISP’s DHCP server. All connections between the network behind the router and the Internet will use this public IP address.
Similarly to how the router received an IP address from the ISP, a network-connected device receives one from the router via DHCP. The difference is that this IP address is private to the network behind the router.
The device has no idea what the public IP address is. To find it, one must ask the router (usually via its GUI), or make a request to one of various services, like ip.me or ifconfig.me. Both of these services are accessible from a browser, and from the terminal via curl. They simply answer with the request’s public IP address.
Geolocation Based on IP Addresses
Because an ISP usually operates around a region or country, the IP addresses that an ISP assigns to its customers can give an idea about where the user is located.
This is only a guess and doesn’t work every time, but this is one way geoblocking works. For example, let’s say that a server receives a request from 188.8.131.52. A simple RDAP query will show that it belongs to the network 184.108.40.206/24, registered to the following address, and the server can decide to block the request or not:
EdgeCast Networks, Inc.
13031 W Jefferson Blvd
90094 Playa Vista CA
UNITED STATES
A Virtual Private Network (VPN) like Mullvad or ProtonVPN works by acting as a middleman between a user and the Internet. The VPN makes queries on behalf of the user, so the target website is not aware of the user, but instead sees the VPN’s location. Note that bypassing geoblocks may sometimes be illegal, and VPN connections might be flagged by some websites (e.g. banks) as being suspicious.
Reserved IP Addresses
This section contains IP addresses that are sometimes useful to know. It is not meant to be comprehensive at all, but contains links that can be followed for more information.
The localhost IP address, 127.0.0.1, is known as the loopback address, and connects back to the host itself. Less known is that the whole 127.0.0.0/8 range is reserved, so any IP address that starts with 127 will loop back to the host, if the system and its routing table are correctly configured.
On the other hand, a single IPv6 address, ::1 (::1/128), is reserved for loopback.
Some IP ranges are reserved for private networks. IPv4 addresses in the following ranges are for the current network, and will not be routed out to the Internet:
- 10.0.0.0/8
- 172.16.0.0/12
- 192.168.0.0/16
IPv6 has a huge range reserved for private networks, also called unique local addresses: fc00::/7. This means that all IPv6 addresses that start with fc or fd are only routable within the current private network.
The IP addresses 0.0.0.0 and :: mean an unspecified address. This has a few different meanings, but it is useful when starting a server, to indicate that it listens for incoming requests on all of the host’s IP addresses.
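These reserved categories can be checked programmatically. The sketch below relies on the properties offered by Python's ipaddress module; the helper name `classify` and the sample addresses are assumptions made for this example.

```python
import ipaddress


def classify(text: str) -> dict[str, bool]:
    """Report which reserved categories an address falls into."""
    address = ipaddress.ip_address(text)
    # Note: in Python's implementation the categories can overlap,
    # e.g. loopback addresses are also reported as private.
    return {
        "loopback": address.is_loopback,
        "private": address.is_private,
        "unspecified": address.is_unspecified,
    }


for text in ("127.1.2.3", "::1", "10.18.52.86", "fd00::1", "0.0.0.0", "::"):
    print(text, classify(text))
```

Any address starting with 127 is reported as loopback, not just 127.0.0.1.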
Documentation and Examples
IP addresses have been in use for more than 40 years. Their history is a ride through time explaining why and how they were modified and expanded to fit the needs of a growing planetary network.
It’s easy to dismiss or laugh at past decisions, but these must be viewed within their own context. One-octet network addresses may seem ludicrous now, but they come from a time when connections were slow and costly. Only research centers, universities, and the military were thought to ever have a use for computers, which were then huge expensive appliances that could only be used for science.
IPv4 and its many extensions came from the context of an exponentially-growing network. Nobody at that time could predict how the Internet would be used by companies, let alone by everyone at home.
Even though IPv6 has enough addresses to give one to every grain of sand on Earth many times over, the protocol was still created before the popularity of mobile and IoT devices. Who knows what kind of technology or paradigm shift might appear in this decade or the next? We might be one “secure connections using unique addresses for each packet” or “time-based rotating quantum qubit addresses” away from realizing we need to start thinking about the next network address upgrade. After all, if history taught us anything, it’s that foreseeing the future is not something we’re often good at.