Federico Mengozzi

The Internet Protocol

The IP

Datagram

The internet's network-layer packets are referred to as datagrams. The IPv4 datagram has a specific structure:

32 bits (one row per 32-bit word)
version | header length | TOS | datagram length
ID | flags | fragmentation offset
TTL | protocol | header checksum
source IP
destination IP
[options]
data
  • header length: used to determine where the header ends and where the data payload begins
  • TOS (type of service): distinguishes different kinds of traffic, for example real-time datagrams from non-real-time traffic
  • datagram length: the total length of the datagram in bytes; the 16-bit field allows up to $2^{16} - 1$ bytes, even though datagrams are rarely larger than 1500 bytes
  • ID, flags, fragmentation offset: used in IP fragmentation
  • TTL: used to prevent the datagram from circulating forever; each hop decreases the TTL by one and, when it reaches zero, the datagram is dropped
  • protocol: used only at the destination, it indicates the specific transport-layer protocol (6 indicates TCP, 17 is for UDP)
  • header checksum: used to detect bit errors. Each pair of bytes in the header is treated as a 16-bit number, and these numbers are summed using 1s complement arithmetic. The 1s complement of the sum is then stored as the checksum
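
The header checksum described in the list above can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation: the function name `internet_checksum` and the sample header bytes are assumptions made for the example.

```python
import struct


def internet_checksum(header: bytes) -> int:
    """Return the 16-bit 1s complement checksum of `header`."""
    if len(header) % 2:
        header += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for (word,) in struct.iter_unpack("!H", header):
        total += word
        # 1s complement addition: fold any carry back into the low 16 bits.
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Illustrative 20-byte IPv4 header with the checksum field (bytes 10-11) zeroed.
sample = bytes.fromhex("4500003c1c4640004006" "0000" "ac100a63ac100a0c")
checksum = internet_checksum(sample)

# A receiver sums the header including the stored checksum; 0 means no error detected.
filled = sample[:10] + struct.pack("!H", checksum) + sample[12:]
verified = internet_checksum(filled) == 0
```

Recomputing the checksum over a header that already contains it yields zero, which is how a receiver can check the header without knowing the original value.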

Fragmentation

An entire IPv4 datagram cannot always go through a link: the link's maximum transmission unit (MTU) may be smaller than the datagram. For this reason the datagram can be split into several fragments, which are reassembled at the destination host before the payload is handed to the transport layer. The ID is an integer shared by all the fragments belonging to the same datagram (the sender typically increments it for each datagram it sends), the offset specifies where the fragment's data belongs in the original datagram (in units of 8 bytes), and the flag is set to 0 when the fragment is the last one and to 1 for all the others.
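
As a rough sketch of the arithmetic, the following Python function splits a payload into fragments for a given MTU. The function name `fragment`, the 20-byte header (no options) and the sample sizes are assumptions for illustration only.

```python
def fragment(payload_len: int, mtu: int, header_len: int = 20):
    """Return (offset_in_8_byte_units, data_bytes, more_fragments_flag) tuples."""
    # Every fragment except the last must carry a multiple of 8 data bytes,
    # because the offset field counts in 8-byte units.
    max_data = (mtu - header_len) // 8 * 8
    if max_data <= 0:
        raise ValueError("MTU too small to carry any data")
    fragments = []
    sent = 0
    while sent < payload_len:
        size = min(max_data, payload_len - sent)
        more = 1 if sent + size < payload_len else 0  # 0 only on the last fragment
        fragments.append((sent // 8, size, more))
        sent += size
    return fragments


# A 4000-byte datagram (20-byte header + 3980 bytes of data) over a 1500-byte MTU.
frags = fragment(payload_len=3980, mtu=1500)
```

With these numbers the payload is carried in three fragments of 1480, 1480 and 1020 data bytes, at offsets 0, 185 and 370.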

Addressing

The boundary between a host and its physical link is called an interface; the interface is the element of the network that has an IP address. A portion of the IP address of each interface is determined by the subnet to which it is connected.

For example, the IP address 223.1.1.0/24 identifies a subnet whose addresses all start with 223.1.1: the first 24 bits are the subnet part (the network prefix) and the remaining 8 bits specify the host part of the address.
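
The split between prefix and host bits can be checked with Python's standard `ipaddress` module. This is only a small sketch; the two host addresses used for the membership tests are made up for the example.

```python
import ipaddress

subnet = ipaddress.ip_network("223.1.1.0/24")

prefix_bits = subnet.prefixlen                       # bits in the network prefix
host_bits = subnet.max_prefixlen - subnet.prefixlen  # bits left for the host part
netmask = str(subnet.netmask)

# An interface belongs to the subnet when its first 24 bits match the prefix.
inside = ipaddress.ip_address("223.1.1.4") in subnet
outside = ipaddress.ip_address("223.1.2.1") in subnet

# The same test done by hand: mask the address and compare with the prefix.
addr = int(ipaddress.ip_address("223.1.1.4"))
same_prefix = (addr & int(subnet.netmask)) == int(subnet.network_address)
```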

The internet's address assignment strategy is known as classless interdomain routing (CIDR).

Before CIDR was in place, the network portion of the IP address wasn't flexible but was instead constrained to 8, 16 or 24 bits (this was called classful addressing, since the subnets with 8-, 16- and 24-bit subnet addresses were known as class A, B and C networks respectively).

There is a special address in IP, namely 255.255.255.255. This is the broadcast address: a datagram sent to it is delivered to all the hosts on the same subnet.

CIDR still follows a hierarchical distribution of addresses: each organization is assigned a block of addresses. For example, ISPs receive their address blocks from the Internet Corporation for Assigned Names and Numbers (ICANN), which also manages the DNS root servers.

On the other hand, a host gets its address from the ISP using the Dynamic Host Configuration Protocol (DHCP), a plug-and-play (zeroconf) protocol. It's a client-server protocol that takes four steps to complete.

  • DHCP server discovery: The new host sends a DHCP discover message in a UDP packet to port 67. The packet is broadcast on the network using the destination address 255.255.255.255, and the source address is set to 0.0.0.0.
  • DHCP server offer: When a DHCP server receives a discover message, it replies with a DHCP offer message containing the transaction ID of the discover message, the proposed IP address, the network mask and the IP address lease time. This message is again broadcast on the network, since the client has no address yet; thanks to the transaction ID, only the host that initiated the DHCP exchange will accept and process the message.
  • DHCP request: The host chooses one of the (possibly) many DHCP offers it received and echoes the configuration parameters back to the server.
  • DHCP ACK: The server responds with a DHCP ACK message to confirm the requested parameters to the client.
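
The four steps above can be mimicked with a toy, in-memory exchange. Everything in this sketch is hypothetical (the `DhcpMessage` and `ToyDhcpServer` classes, the address pool, the transaction ID); it sends nothing on a real network and leaves out most DHCP options. It assumes, as is usual, that the request and the ACK are broadcast as well.

```python
from dataclasses import dataclass

BROADCAST = "255.255.255.255"
UNSPECIFIED = "0.0.0.0"


@dataclass
class DhcpMessage:
    kind: str  # "DISCOVER", "OFFER", "REQUEST" or "ACK"
    src: str
    dst: str
    transaction_id: int
    offered_ip: str = ""
    netmask: str = ""
    lease_seconds: int = 0


class ToyDhcpServer:
    """Hypothetical server that hands out addresses from a fixed pool."""

    def __init__(self, ip, pool, netmask="255.255.255.0", lease_seconds=3600):
        self.ip = ip
        self.pool = list(pool)
        self.netmask = netmask
        self.lease_seconds = lease_seconds

    def handle(self, msg):
        if msg.kind == "DISCOVER" and self.pool:
            # Step 2: offer the first free address, echoing the transaction ID.
            return DhcpMessage("OFFER", self.ip, BROADCAST, msg.transaction_id,
                               self.pool[0], self.netmask, self.lease_seconds)
        if msg.kind == "REQUEST" and msg.offered_ip in self.pool:
            # Step 4: confirm the requested parameters and reserve the address.
            self.pool.remove(msg.offered_ip)
            return DhcpMessage("ACK", self.ip, BROADCAST, msg.transaction_id,
                               msg.offered_ip, self.netmask, self.lease_seconds)
        return None


def run_exchange(server, transaction_id):
    # Step 1: the client has no address yet, so it uses 0.0.0.0 and broadcasts.
    discover = DhcpMessage("DISCOVER", UNSPECIFIED, BROADCAST, transaction_id)
    offer = server.handle(discover)
    if offer is None or offer.transaction_id != transaction_id:
        raise RuntimeError("no usable offer received")
    # Step 3: echo the offered configuration back to the server.
    request = DhcpMessage("REQUEST", UNSPECIFIED, BROADCAST, transaction_id,
                          offer.offered_ip, offer.netmask, offer.lease_seconds)
    ack = server.handle(request)
    if ack is None:
        raise RuntimeError("request was not acknowledged")
    return [discover, offer, request, ack]


server = ToyDhcpServer("223.1.2.5", ["223.1.2.4", "223.1.2.6"])
messages = run_exchange(server, transaction_id=654)
```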

A drawback of standard DHCP is that a host receives a new address every time it joins a new subnet, so, for example, a TCP connection cannot be maintained while the host moves between subnets.

NAT

Network Address Translation (NAT) is a way for the internet to tackle the scarcity of IPv4 addresses. It works by mapping a host's internal IP address and port to a combination of the NAT's own address and a port.

The NAT gets its IP address from the ISP using a DHCP server; let's say the NAT address is 138.76.29.7. When a host tries to connect to a remote host, let's say with source 10.4.4.13:12345 and destination 72.12.1.64:8080, the NAT creates a new entry in the NAT translation table, for example 50001-10.4.4.13:12345. It then replaces the source portion of the IPv4 datagram with the NAT address and the new port, so 138.76.29.7:50001. When the NAT receives the response from the remote host, the response datagram has destination 138.76.29.7:50001, and the NAT can use the port to find the original host address and forward the packet correctly.
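
The translation table can be sketched as a pair of dictionaries. The class `ToyNat` and its method names are hypothetical and only mirror the example addresses above; a real NAT also tracks the protocol, timeouts and more.

```python
class ToyNat:
    """Minimal NAT translation table keyed on the public port."""

    def __init__(self, public_ip, first_port=50001):
        self.public_ip = public_ip
        self.next_port = first_port
        self.by_port = {}  # public port -> (internal ip, internal port)
        self.by_host = {}  # (internal ip, internal port) -> public port

    def outbound(self, src, dst):
        """Rewrite the source of a datagram leaving the internal network."""
        if src not in self.by_host:
            port = self.next_port
            self.next_port += 1
            self.by_host[src] = port
            self.by_port[port] = src
        return (self.public_ip, self.by_host[src]), dst

    def inbound(self, src, dst):
        """Rewrite the destination of a response, or return None if unmapped."""
        ip, port = dst
        if ip != self.public_ip or port not in self.by_port:
            return None  # no table entry: the datagram cannot be forwarded
        return src, self.by_port[port]


nat = ToyNat("138.76.29.7")
host = ("10.4.4.13", 12345)
remote = ("72.12.1.64", 8080)

sent = nat.outbound(host, remote)         # source becomes 138.76.29.7:50001
received = nat.inbound(remote, sent[0])   # destination is mapped back to the host
unsolicited = nat.inbound(remote, ("138.76.29.7", 60000))
```

The last call shows why an internal host cannot simply act as a server: without an existing table entry the NAT has no way to know where to forward the datagram.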

In certain scenarios, for example in P2P connections, an internal host needs to act as a server. A remote host cannot connect directly to an internal host that sits behind a NAT. There are NAT traversal tools that work around this problem, for example UPnP.

Although NAT has many advantages, it is not really an elegant solution to the problem it is solving. Middleboxes are a much more common technology: while they do not perform traditional datagram forwarding, they offer NAT, load balancing and traffic firewalling.

IPv6

While the transition to IPv6 hasn't completely happened yet, there are several changes in IPv6 that address certain weaknesses of the previous version.

  • addressing capability: IPv6 has a 128-bit address size
  • header: the header now has a fixed length of 40 bytes, which makes it faster to process
  • flow labeling: IPv6 introduces the concept of a flow, to differentiate datagrams based on some criteria

The datagram now has the following new or relevant fields:

  • traffic class: the equivalent of the IPv4 TOS
  • flow label: indicates which flow the datagram belongs to
  • payload length: 16 bits that indicate the payload size
  • next header: the equivalent of the IPv4 protocol field, it determines the type of protocol that the datagram contains
  • hop limit: same as the IPv4 TTL
  • source and destination address: IPv6 addresses
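
To make the fixed 40-byte layout concrete, here is a small sketch that packs and unpacks these fields with Python's `struct` module. The helper names `build_ipv6_header` and `parse_ipv6_header`, the default hop limit and the sample addresses (taken from the 2001:db8::/32 documentation range) are assumptions for the example.

```python
import ipaddress
import struct

# version/traffic class/flow label (32 bits), payload length (16), next header (8),
# hop limit (8), source address (128), destination address (128) = 40 bytes.
IPV6_HEADER = struct.Struct("!IHBB16s16s")


def build_ipv6_header(src, dst, payload_length, next_header,
                      hop_limit=64, traffic_class=0, flow_label=0):
    if not 0 <= traffic_class < 2**8 or not 0 <= flow_label < 2**20:
        raise ValueError("traffic class is 8 bits, flow label is 20 bits")
    first_word = (6 << 28) | (traffic_class << 20) | flow_label
    return IPV6_HEADER.pack(first_word, payload_length, next_header, hop_limit,
                            ipaddress.IPv6Address(src).packed,
                            ipaddress.IPv6Address(dst).packed)


def parse_ipv6_header(header):
    first_word, payload_length, next_header, hop_limit, src, dst = \
        IPV6_HEADER.unpack(header[:40])
    return {
        "version": first_word >> 28,
        "traffic_class": (first_word >> 20) & 0xFF,
        "flow_label": first_word & 0xFFFFF,
        "payload_length": payload_length,
        "next_header": next_header,
        "hop_limit": hop_limit,
        "src": str(ipaddress.IPv6Address(src)),
        "dst": str(ipaddress.IPv6Address(dst)),
    }


# next_header=6 means the payload is a TCP segment, as with the IPv4 protocol field.
header = build_ipv6_header("2001:db8::1", "2001:db8::2", payload_length=20,
                           next_header=6, flow_label=0x12345)
fields = parse_ipv6_header(header)
```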

Since the transition from IPv4 to IPv6 is not straightforward, the most common way to adopt IPv6 alongside old IPv4-only devices is to encapsulate each IPv6 datagram in an IPv4 datagram, so that the IPv4 devices work as a tunnel through which IPv6-compatible hosts can be reached.
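
Tunneling amounts to putting the whole IPv6 datagram in the data field of an IPv4 datagram. The sketch below assumes a plain 20-byte IPv4 header with no options and uses IPv4 protocol number 41, the value assigned to encapsulated IPv6; the function names and the tunnel endpoint addresses are hypothetical.

```python
import ipaddress
import struct

IPPROTO_IPV6 = 41  # IPv4 protocol number for an encapsulated IPv6 datagram


def checksum(header: bytes) -> int:
    total = 0
    for (word,) in struct.iter_unpack("!H", header):
        total += word
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return ~total & 0xFFFF


def encapsulate(ipv6_datagram: bytes, tunnel_src: str, tunnel_dst: str, ttl: int = 64) -> bytes:
    """Wrap an IPv6 datagram in a minimal IPv4 header (no options, no fragmentation)."""
    total_length = 20 + len(ipv6_datagram)
    header = struct.pack("!BBHHHBBH4s4s",
                         0x45,            # version 4, header length 5 words
                         0,               # TOS
                         total_length,
                         0, 0,            # ID, flags and fragmentation offset
                         ttl,
                         IPPROTO_IPV6,
                         0,               # checksum placeholder
                         ipaddress.IPv4Address(tunnel_src).packed,
                         ipaddress.IPv4Address(tunnel_dst).packed)
    header = header[:10] + struct.pack("!H", checksum(header)) + header[12:]
    return header + ipv6_datagram


def decapsulate(ipv4_datagram: bytes) -> bytes:
    """Recover the IPv6 datagram at the far end of the tunnel."""
    header_len = (ipv4_datagram[0] & 0x0F) * 4
    if ipv4_datagram[9] != IPPROTO_IPV6:
        raise ValueError("payload is not an encapsulated IPv6 datagram")
    return ipv4_datagram[header_len:]


# Placeholder IPv6 datagram: a 40-byte header whose first nibble is the version, 6.
inner = bytes([0x60]) + bytes(39)
outer = encapsulate(inner, "192.0.2.1", "198.51.100.2")
recovered = decapsulate(outer)
```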
