Pit + Network of Networks

10/15/2023

Principles of Network Applications

Network Architecture

The network architecture is fixed and provides network applications with a set of services. The software developer has to design the application architecture, which follows one of two main models: client-server or peer-to-peer.

In the client-server architecture there is an always-on host called the server, which replies to the requests of many other hosts called clients. A single server may not be able to sustain the load caused by all the clients’ requests. That’s why servers are often housed in data centers, where many machines together act as one powerful virtual server. This is transparent to the client, which cannot tell which of the many servers in the data center is actually replying.

On the other side, the peer-to-peer (P2P) architecture is based on direct communication between pairs of hosts, called peers. These hosts are intermittently connected and are not owned by a service provider; very often they are personal computers belonging to private citizens. This architecture scales extremely well, which is why it is used in high-traffic applications such as BitTorrent.

Internet Architecture

The Internet is just one of the many networks (of networks) that have been physically implemented. It is certainly one of the most widely used.

ISP interconnections

A peripheral machine is called a host. A host can be classified as:

client: requests a service
server: provides a service

An access network physically connects a system to its edge router. The edge router is the first on the path from a system to any other destination system outside the starting system’s network.

An ISP (Internet Service Provider) is a company that provides Internet access to consumers and businesses. Most large telecommunication companies, such as mobile and cable companies, are ISPs.

Communicating Processes

As a developer, before writing your own network application, you must understand how programs running on different hosts can communicate with each other. In OS (operating system) jargon, you would talk about communicating processes rather than communicating programs.

A process is a program running on a system.

Processes on two different hosts communicate with each other by exchanging messages over the network.

Client and Server Processes

Network applications are built on top of pairs of communicating processes. In a first, simplified view, the client process is the one that receives the messages, while the server process is the one that sends them.

In some kinds of applications, like P2P file sharing, a process can be both client and server at the same time.

The following definition is valid only for services that rely on the pull model. Within the context of a communication session between a pair of processes, the process that starts the communication is denoted as the client, while the one that waits to be contacted is the server.

On the other hand, in the push model of communication, the client indicates its interest in receiving messages (like e-mails or the status of a soccer match). When messages are available, it is up to the server to send them to the client.

Thanks to this clarification, the previous definition can be rewritten. Within the context of a communication session between a pair of processes, the process that asks for a service or a piece of information is denoted as the client. The server is the process that delivers the service or retrieves the requested information.

The Socket Interface

A process sends and receives messages over the network through a software interface called a socket. The socket sits between the application and transport layers within a host. It is also described as the API (Application Programming Interface) between the application and the network: it is the programming interface with which network applications are built.
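To make the socket interface and the client/server roles concrete, here is a minimal sketch in Python using the standard `socket` module. The function names (`run_echo_server`, `ask_server`, `demo`) and the "echo" behaviour are assumptions made for this example, and both processes are simulated on the loopback address with a thread rather than on two real hosts.

```python
import socket
import threading


def run_echo_server(listener: socket.socket) -> None:
    """Server role: wait to be contacted, then answer a single request."""
    conn, _client_addr = listener.accept()   # blocks until a client connects
    with conn:
        request = conn.recv(1024)            # read the client's message
        conn.sendall(b"echo: " + request)    # send the reply, then close


def ask_server(host: str, port: int, message: bytes) -> bytes:
    """Client role: start the communication and collect the reply."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(message)
        chunks = []
        while chunk := sock.recv(1024):      # read until the server closes
            chunks.append(chunk)
        return b"".join(chunks)


def demo() -> bytes:
    # Port 0 asks the OS for any free port, so the sketch does not collide
    # with a service that is already running on this machine.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))
        listener.listen(1)
        host, port = listener.getsockname()
        server = threading.Thread(target=run_echo_server, args=(listener,))
        server.start()
        reply = ask_server(host, port, b"hello")
        server.join()
    return reply


if __name__ == "__main__":
    print(demo())
```

Note that the application code only talks to the socket: everything below it (transport, network, link) is handled by the operating system.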

Addressing

It is mandatory to identify processes communicating over the network. This issue is called addressing: provide a network machine with an identifiable and unique address.

The PID (process identifier), used by UNIX-based machines to identify the processes running locally, is not enough. One might wonder, for example, “What if the host is Windows based?” or “What if a new operating system is built from scratch?”.

On the Internet, a process is identified by:

IP address of the host: 4-byte integer in IPv4 (like 129.168.60.10)
port assigned to the process: 2-byte integer (like 3000)

Note that the IPv4 address range goes from 0.0.0.0 to 255.255.255.255, while the port range goes from 0 to 65535.

Some port numbers are reserved for well-known services (SMTP on port 25, web servers on port 80, and so on). The list of assigned port numbers is available at iana.org.
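The sizes mentioned above (4 bytes for an IPv4 address, 2 bytes for a port) can be checked with a short Python sketch. The helper names `pack_endpoint` and `unpack_endpoint` are hypothetical, introduced only for this illustration; the calls into `ipaddress` and `struct` are from the standard library.

```python
import ipaddress
import struct


def pack_endpoint(ip: str, port: int) -> bytes:
    """Encode an (IPv4 address, port) pair as 4 + 2 bytes, network byte order."""
    if not 0 <= port <= 65535:
        raise ValueError("a port must fit in 2 bytes (0..65535)")
    # .packed gives the 4 raw bytes of the address; "!H" is an unsigned
    # 16-bit integer in network (big-endian) byte order.
    return ipaddress.IPv4Address(ip).packed + struct.pack("!H", port)


def unpack_endpoint(data: bytes) -> tuple[str, int]:
    """Decode the 6 bytes produced by pack_endpoint."""
    ip = str(ipaddress.IPv4Address(data[:4]))
    (port,) = struct.unpack("!H", data[4:6])
    return ip, port


if __name__ == "__main__":
    raw = pack_endpoint("129.168.60.10", 3000)
    print(len(raw), unpack_endpoint(raw))
```

Together, the address and the port take 6 bytes, which is why the pair (IP address, port) is enough to name one process among all the hosts reachable over IPv4.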

Network Classification

The speed of a network is measured in bits per second: bps (bit/s). For convenience, multiples are most often used:

Megabit per second: Mbps (1e+6 bps)
Terabit per second: Tbps (1e+12 bps)

Dimension-based

Dimension here means the distance between network nodes.

From the tiniest to the largest, the networks are:

Nano network: molecular communication (standard IEEE P1906.1)
Near-field communication network (like RFID)
NoC (Network on Chip): communication in 3D chips
PC bus: speed from 100 Mbps to 10 Gbps
BAN (Body Area Network): connects wearable devices
PAN (Personal Area Network): like Bluetooth connectivity
LAN (Local Area Network): cabled with Ethernet or wireless with WLAN, up to 80 meters
MAN (Metropolitan Area Network): interconnection of LANs
WAN (Wide Area Network): interconnection of MANs, up to 1000 km

These are heterogeneous networks that interconnect with each other and with the following networks, which are not listed in increasing dimension:

PSTN (Public Switched Telephone Network): all around the globe, aggregate traffic up to Tbps
PABX (Private Automatic Branch Exchange): corporate telephone network, range from 100 m to 10 km, up to 10 Mbps
Cellular network: femtocells and picocells as WLAN access points, IoT, URLLC, 5G and beyond
Satellite network: localization through GNSS, support for PSTN and cellular networks

Residential access

When private citizens also started to use the Internet extensively, phone companies introduced DSL (Digital Subscriber Line) solutions. A DSL modem connects to a DSLAM (Digital Subscriber Line Access Multiplexer) placed in the central office of the phone company. DSL uses the same physical cables for both Internet access and phone calls.

In place of DSL, the Internet can be accessed through dedicated cable modems, over the same infrastructure that delivers cable TV.

Nowadays, we’re experiencing a wide adoption of FTTH: Fiber To The Home. The optical fiber goes all the way into private houses.

In the future, we may get to use FWA (5G Fixed Wireless Access), which could make much of the expensive wired infrastructure obsolete.

Peripheral machines are commonly connected to edge routers through LANs (Local Area Networks). There are many LAN technologies, of which Ethernet is the most used. Wireless LANs are based on WiFi technology, also known as IEEE 802.11.

Network Core

Packet Switching

Distributed applications exchange messages that are split into smaller pieces, called packets, before transmission. Between source and destination, the packets travel through data links and packet switches (routers and link-layer switches). If a switch sends a packet of L bits over a data link with a transmission rate of R bps (bits per second), the transmission lasts L/R seconds.

Let’s spell out some simplified formulas for the different transmission techniques.

Store and forward

The vast majority of packet switches use the store-and-forward technique: the switch must receive and store the whole packet before it can start sending it to the next node.

Store and forward scheme

Given the above scheme, let’s make some assumptions:

the router has only one input link and one output link
there is no propagation delay on the transmission channel

The total delay between source and destination is $2\cdot L/R$ seconds.
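The factor of two can be read off a short timeline. As a sketch under the two assumptions above, with $t = 0$ taken (as a convention of this example) as the instant the source starts transmitting:

$$
\begin{aligned}
t &= \frac{L}{R} && \text{the router has received and stored the whole packet, and starts forwarding it} \\
t &= \frac{L}{R} + \frac{L}{R} = 2\cdot\frac{L}{R} && \text{the destination has received the whole packet}
\end{aligned}
$$

For instance, with $L = 8 \cdot 10^6$ bits and $R = 2 \cdot 10^6$ bps (illustrative values), each hop takes 4 seconds and the total is 8 seconds.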

Given a path made of N links, each with transmission rate R, there are N-1 routers between source and destination. The end-to-end delay is then:

$$d_{\text{end-to-end}} = N \frac{L}{R}$$
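The formula is easy to turn into a small calculator. The following Python sketch assumes a single packet, links that all share the same rate, and no propagation or queuing delay, exactly as in the simplified model above; the function names are chosen for this example only.

```python
def transmission_delay(packet_bits: float, rate_bps: float) -> float:
    """Time in seconds to push a packet of L bits onto a link of rate R: L / R."""
    if rate_bps <= 0:
        raise ValueError("the transmission rate must be positive")
    return packet_bits / rate_bps


def store_and_forward_delay(packet_bits: float, rate_bps: float, links: int) -> float:
    """End-to-end delay for one packet over `links` links (links - 1 routers).

    Every router stores the whole packet before forwarding it, so the
    packet pays the full L / R once per link.
    """
    if links < 1:
        raise ValueError("a path needs at least one link")
    return links * transmission_delay(packet_bits, rate_bps)


if __name__ == "__main__":
    # Illustrative values: an 8 Mbit packet over 2 Mbps links.
    print(store_and_forward_delay(8_000_000, 2_000_000, links=1))
    print(store_and_forward_delay(8_000_000, 2_000_000, links=3))
```

With one link the result is simply L/R; with two links (one router) it reproduces the $2\cdot L/R$ case discussed earlier.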

With packet switching, the network does not reserve any resources for a given communication. This means that packets may encounter network congestion and have to wait in queues. The Internet, for example, gives no guarantee about packet delivery times.

Circuit Switching

In circuit-switched networks, the resources needed for communication between peripheral systems are reserved for the entire duration of the communication session. The switches on the path from source to destination keep the connection state for the whole session. This stateful connection is called a circuit. When a network reserves a circuit, it also guarantees a constant transmission rate.

In the image below, the four switches are physically connected by four links. If the host on the upper left wants to communicate with the one on the lower right, the network must first establish a dedicated end-to-end connection (in bold).

circuit switching scheme

Protocols and Models

Network models and protocols are layered. Each layer provides a different service that relies on the service of the layer below. Let’s take a brief look at two of the most important models: ISO/OSI and TCP/IP.

ISO/OSI vs TCP/IP

Layers are numbered from bottom to top. When a device (called a host) sends a piece of information, the information traverses the layers from top (Application layer) to bottom (Physical layer). When a device receives a piece of information, it traverses the layers from bottom to top. Routers act as intermediate nodes between communicating hosts, so they behave a bit differently: they go up the stack first and then back down, stopping at the 3rd layer (Network layer). That’s why a router is called a network-layer device.

Encapsulation is performed when moving down a layer: each layer adds its own header to the information being transported. Decapsulation is performed when moving up a layer: each layer reads and removes the header that was added at the same layer during encapsulation.
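Encapsulation and decapsulation can be sketched in a few lines of Python. The bracketed text headers such as `[transport]` are an assumption of this example, used only so the nesting is visible; real protocol headers are binary structures with many fields.

```python
# Layers that add a header, listed in the order a sender traverses them
# (top to bottom). The names are illustrative placeholders.
LAYERS = ["transport", "network", "link"]


def encapsulate(payload: bytes) -> bytes:
    """Moving down the stack: each layer prepends its own header."""
    data = payload
    for layer in LAYERS:
        data = f"[{layer}]".encode() + data
    return data


def decapsulate(frame: bytes) -> bytes:
    """Moving up the stack: each layer checks and strips its own header."""
    data = frame
    for layer in reversed(LAYERS):
        header = f"[{layer}]".encode()
        if not data.startswith(header):
            raise ValueError(f"missing {layer} header")
        data = data[len(header):]
    return data


if __name__ == "__main__":
    frame = encapsulate(b"hello")
    print(frame)               # b'[link][network][transport]hello'
    print(decapsulate(frame))  # b'hello'
```

The outermost header belongs to the lowest layer, which is why the receiver, going bottom to top, can peel the headers off in the reverse order in which they were added.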

ISO/OSI model

ISO stands for International Organization for Standardization, and OSI for Open Systems Interconnection. The ISO/OSI model has seven layers:

7^th Application: houses network applications and their protocols (e.g. file transfer, virtual terminal, e-mail).
6^th Presentation: takes care of the syntax and semantics of the information exchanged, and implements services such as compression and encryption.
5^th Session: coordinates the interaction between hosts on top of the 4th layer services. It handles dialog management: it establishes, maintains and synchronizes the interaction, and supports synchronization and rollback.
4^th Transport: enables data transfer between two nodes. Its tasks are:
- establish and maintain connections between communicating nodes,
- implement end-to-end reliability mechanisms,
- congestion control.
3^rd Network: delivers packets from source to destination across multiple links. It is responsible for:
- routing of packets,
- packet conversion between 2nd layer protocols,
- packet fragmentation.
2^nd Data Link: allows the transfer of packets through a communication channel (link). It deals with:
- coordination of channel access by multiple nodes,
- error control and retransmissions (mandatory for reliable communications),
- flow control (preventing fast transmissions from putting slow machines in trouble).
1^st Physical: defines how bits of information are transmitted over the physical channel (optical fiber, copper, radio, etc.). For example:
- Voltage levels
- Duration of the signal identifying a bit
- Modulation and encoding
- Half-duplex and full-duplex transmission

TCP/IP model

The TCP/IP model (Internet protocol suite) has fewer layers than ISO/OSI. Many important networks, like the Internet, are based on this model. Some of the protocols involved are covered in more depth in the Application Layer article.

TCP/IP has 5 layers:

5^th Application
4^th Transport
3^rd Network
2^nd Data Link
1^st Physical

Sometimes the Network layer is called the Internet layer. Moreover, the Data Link and Physical layers are sometimes grouped into a single Link layer; they are kept separate here because they perform different tasks.

TCP/IP’s Application layer groups ISO/OSI’s Application, Presentation and Session layers.

IP Hourglass

The IP hourglass diagram illustrates the narrow-waisted shape of the layered TCP/IP protocol stack. The Internet has many protocols at the physical, data link, transport and application layers. In contrast, there is only one protocol at the network layer: IP.

IP is the only protocol that must be implemented by all devices that want to connect to the network: it is the only universal requirement for Internet connectivity, and is easy to implement.

The narrow waist acts as a spanning layer that hides the differences between the underlying layers. Applications that rely on IP see a uniform service interface: they behave the same regardless of whether the connection is wired (via Ethernet, over either optical fiber or copper cable) or wireless (via protocols such as WiFi or Bluetooth).

IP-hourglass

With the development and gradual adoption of IPv6, it is necessary to design a new hourglass model fully equivalent to the former: the double hourglass model.

IP-double-hourglass

You can read more about the differences between IPv4 and IPv6 in the Network layer - Data Plane article.

Layers Overview

As you’ll see in more detail later in this article, information packets have two fields: header and payload. The payload is the actual information that needs to be communicated through the network. Each layer adds its own header (overhead) to the payload, so that the information can move from one layer to the next. Moreover, the packet as a whole (header plus payload) changes name depending on the layer it is in, and thus on the overhead information it carries at that point.

The following image should clarify these concepts. In it, H stands for header, the subscript of H indicates the layer, and P stands for payload. The arrows follow the path of the information through the network.

network message travel

Application layer

Information packets at the application layer are called messages. A protocol at this layer is distributed over many peripheral machines: it allows network applications hosted on different machines to exchange messages. Application-layer protocols include HTTP, SMTP, FTP, DNS and many more.

An application-layer protocol defines how an application’s processes, running on different machines, exchange messages with each other. In particular, it defines:

message type: e.g. request or response,
message syntax: what are the message’s fields and how they’re defined,
field semantics: the meaning of the information in each field
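The three elements of this list can be seen in a toy protocol. The sketch below is entirely hypothetical (it is not HTTP, SMTP or any real protocol): the message types `REQUEST` and `RESPONSE`, the one-line header followed by a body, and the field names are all assumptions made for illustration.

```python
from dataclasses import dataclass

MESSAGE_TYPES = ("REQUEST", "RESPONSE")  # message type: what kinds exist


@dataclass(frozen=True)
class Message:
    kind: str       # one of MESSAGE_TYPES
    resource: str   # semantics: the item being asked for or returned
    body: str = ""  # semantics: empty in a request, the content in a response


def encode(msg: Message) -> bytes:
    """Syntax: '<kind> <resource>' on the first line, then the body."""
    if msg.kind not in MESSAGE_TYPES:
        raise ValueError(f"unknown message type: {msg.kind!r}")
    return f"{msg.kind} {msg.resource}\n{msg.body}".encode("utf-8")


def decode(raw: bytes) -> Message:
    """Parse bytes produced by encode back into a Message."""
    first_line, _, body = raw.decode("utf-8").partition("\n")
    kind, _, resource = first_line.partition(" ")
    if kind not in MESSAGE_TYPES:
        raise ValueError(f"unknown message type: {kind!r}")
    return Message(kind, resource, body)


if __name__ == "__main__":
    wire = encode(Message("REQUEST", "score"))
    print(wire)          # b'REQUEST score\n'
    print(decode(wire))
```

Both processes must agree on all three aspects: a receiver that parses the syntax correctly but interprets `resource` differently from the sender still cannot communicate meaningfully.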

Some of these protocols are specified in RFCs (Requests for Comments) and are publicly available. Many other application-layer protocols are proprietary and intentionally not published.

For a more detailed discussion of the topic, please read the Application Layer article.

Transport layer

Information packets at the transport layer are called segments. The most used transport layer protocols are TCP (Transmission Control Protocol) and UDP (User Datagram Protocol). Both carry application layer messages between the application endpoints on the peripheral machines. Transport protocols rely on the underlying network layer.

Many networks offer several transport protocols to the network applications running on them. These protocols can be classified by the services they provide along the following dimensions:

Reliable data transfer
Throughput
Timing
Security

For a more detailed discussion of the topic, please read the Transport Layer article.

Network layer

Information packets at the network layer are called datagrams. The Internet’s network layer includes the all-important IP protocol, along with many routing protocols that determine the paths datagrams take from source to destination.

Network layer can be subdivided into two different planes that perform different tasks:

Data plane: performs the forwarding of the packets,
Control plane: performs the routing of the packets.
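The split between the two planes can be illustrated with a toy forwarding table in Python. This is a sketch under assumptions: the class and method names are invented for the example, the link names are placeholders, and the lookup rule used here is longest-prefix matching, which this section does not describe but which is the usual rule in IP forwarding.

```python
import ipaddress
from typing import Optional


class ForwardingTable:
    def __init__(self) -> None:
        self._routes = {}  # maps an IP network (prefix) to an output link

    def install_route(self, prefix: str, link: str) -> None:
        """Control-plane side: routing decides which entries exist."""
        self._routes[ipaddress.ip_network(prefix)] = link

    def forward(self, destination: str) -> Optional[str]:
        """Data-plane side: per-packet lookup of the output link."""
        addr = ipaddress.ip_address(destination)
        matches = [net for net in self._routes if addr in net]
        if not matches:
            return None  # no matching route: the packet would be dropped
        # Prefer the most specific entry (the longest prefix).
        best = max(matches, key=lambda net: net.prefixlen)
        return self._routes[best]


if __name__ == "__main__":
    table = ForwardingTable()
    table.install_route("0.0.0.0/0", "default-link")
    table.install_route("10.0.0.0/8", "link-1")
    table.install_route("10.1.0.0/16", "link-2")
    print(table.forward("10.1.2.3"))
    print(table.forward("10.9.9.9"))
    print(table.forward("8.8.8.8"))
```

The design point is the division of labour: `install_route` runs rarely and may be slow, while `forward` runs once per packet and has to be fast.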

Before discussing the duties of those planes, let’s take a closer look at some network-related topics.

Data Link layer

Information packets at the data link layer are called frames. The network layer relies on data link layer protocols like Ethernet and WiFi to deliver datagrams, wrapped in frames, from one node to the next along the path from source to destination. A single datagram may be handled by different data link layer protocols on different links.

For instance, the source could be connected to a WiFi, while the destination could use Ethernet.

Physical layer

The physical layer transports the bits of a frame from node to node. Protocols at this level depend on the link type and on the transmission medium (like coaxial cable or optical fiber).

State within ISO/OSI model

In the context of the ISO/OSI model, state is typically managed at several layers, depending on the type of state and the protocol in question. Here’s a breakdown of where state is commonly managed:

Application Layer (application state)
Protocols like FTP, HTTP, and SMTP manage application-specific state, such as user sessions, authentication states, and data exchanges.
Session Layer (session state)
This layer is responsible for establishing, maintaining, and terminating communication sessions. It keeps track of session parameters and ensures proper sequencing and synchronization. Protocols that explicitly manage sessions, like RPC (Remote Procedure Call), manage session state here.
Transport Layer (connection state)
This layer manages the state of connections between hosts. In TCP, the state includes connection establishment, data transfer, and connection termination. TCP keeps track of retransmissions, data flow control, and error checking, making it stateful.
Network Layer (routing & forwarding state)
The state at this layer is related to routing and forwarding decisions. Routers maintain state information in routing tables, which help determine the best path for packet forwarding. Protocols like OSPF (Open Shortest Path First) and BGP (Border Gateway Protocol) maintain state about network topology to make dynamic routing decisions.
Data Link Layer (frame and MAC address state)
The data link layer manages state related to framing, error detection, and MAC address tables (in switches). This includes managing the status of data frames, flow control, and error correction. Ethernet switches, for example, keep a table that maps each learned MAC address to the port it was seen on.
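As one concrete example of layer-specific state, here is a minimal Python sketch of the MAC address table kept by a switch. The `LearningSwitch` class, its method names and the short MAC strings are hypothetical, and the behaviour (learn the source, flood when the destination is unknown) is the common self-learning scheme, assumed here rather than taken from this article.

```python
class LearningSwitch:
    def __init__(self, ports: list[int]) -> None:
        self.ports = list(ports)
        self.mac_table: dict[str, int] = {}  # the state: MAC address -> port

    def receive(self, src_mac: str, dst_mac: str, in_port: int) -> list[int]:
        """Process one frame and return the ports it is sent out on."""
        # Learn (or refresh) where the sender is attached.
        self.mac_table[src_mac] = in_port
        out_port = self.mac_table.get(dst_mac)
        if out_port is None:
            # Unknown destination: flood on every port except the incoming one.
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            # Destination is on the segment the frame came from: nothing to do.
            return []
        return [out_port]


if __name__ == "__main__":
    switch = LearningSwitch([1, 2, 3])
    print(switch.receive("aa", "bb", in_port=1))  # "bb" unknown: flooded
    print(switch.receive("bb", "aa", in_port=2))  # "aa" already learned
    print(switch.receive("aa", "bb", in_port=1))  # "bb" now learned
```

The table is per-device state: it exists only inside the switch and is rebuilt from observed traffic, which is different in kind from the per-connection state that TCP keeps at the transport layer.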

Each layer handles state relevant to its function in the network communication process, with the higher layers generally dealing with more complex and application-specific state management.