Federico Mengozzi

Application Layer

Web and HTTP

The HTTP protocol (HyperText Transfer Protocol) is the heart of the Web and it uses TCP as its underlying transport protocol. A TCP connection is initiated by the HTTP client to the server, and the communication then happens through the sockets on the server and on the browser. HTTP is said to be stateless because the server doesn’t maintain any information about its clients; it only responds to the clients’ requests.
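As a concrete illustration, the sketch below opens a TCP connection to a server, writes an HTTP request into the socket and reads back the response. It is only a minimal example of the request/response exchange described above; the language (Python) and the host example.com are assumptions, not something fixed by HTTP itself.

```python
# Minimal sketch: an HTTP/1.1 GET sent over a plain TCP socket.
# The host "example.com" is only a placeholder for this illustration.
import socket

HOST = "example.com"

# The HTTP client initiates the TCP connection to port 80 on the server.
with socket.create_connection((HOST, 80)) as sock:
    # The request travels through the socket as plain text.
    request = (
        "GET / HTTP/1.1\r\n"
        f"Host: {HOST}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    sock.sendall(request.encode("ascii"))

    # Read the response until the server closes the connection.
    response = b""
    while chunk := sock.recv(4096):
        response += chunk

# HTTP is stateless: the request above carries everything the server needs.
print(response.decode("iso-8859-1").split("\r\n\r\n", 1)[0])  # status line + headers
```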

Connection Persistence

It is possible to use HTTP to make both persistent (default mode) and non-persistent connections (a short sketch of both modes follows the list):

  • Persistent - Initially the connection is created and then, after the server responds to the client’s requests, the connection is kept open for later requests.

  • Non-Persistent - For each request, the client opens a new connection and, after sending the response and acknowledging that the response has been received, the server closes the connection.
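The sketch below shows the two behaviors, assuming Python’s http.client module and the placeholder host example.com; it only illustrates the connection handling, not any particular server.

```python
# Persistent vs non-persistent connections (Python http.client sketch).
import http.client

# Persistent (HTTP/1.1 default): one TCP connection, several requests.
conn = http.client.HTTPConnection("example.com", 80)
for path in ("/", "/index.html"):
    conn.request("GET", path)   # reuses the same underlying socket
    resp = conn.getresponse()
    resp.read()                 # drain the body before reusing the connection
    print(path, resp.status)
conn.close()

# Non-persistent: ask the server to close the connection after each response.
conn = http.client.HTTPConnection("example.com", 80)
conn.request("GET", "/", headers={"Connection": "close"})
resp = conn.getresponse()
resp.read()
print(resp.status, resp.getheader("Connection"))
conn.close()
```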

Message Format

Once the connection is created it’s possible for the client to send request messages and for the server to send response messages.

Request Format

Each HTTP request message is made up of at least three fields: the method field, the URL field and the HTTP version field.

Structure of an HTTP request message

There are different request methods (GET and HEAD are compared in a short sketch after the list):

  • GET - The client requests the object identified by the URL field.
  • POST - The client requests the object identified by the URL field, but the request encapsulates data in the entity body of the request.
  • HEAD - It’s similar to GET except that the server responds with just the header (leaving the requested object out of the response) to speed up the transfer; it’s usually used for debugging purposes.
  • PUT - It’s used to upload a file to a specific directory on the server.
  • DELETE - It allows deleting a file on the server.
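The following sketch contrasts GET and HEAD, again assuming Python’s http.client and the placeholder host example.com:

```python
# GET returns headers and an entity body; HEAD returns the headers only.
import http.client

conn = http.client.HTTPConnection("example.com")

conn.request("GET", "/")
get_resp = conn.getresponse()
body = get_resp.read()                          # headers + entity body
print("GET ", get_resp.status, len(body), "bytes of body")

conn.request("HEAD", "/")
head_resp = conn.getresponse()
print("HEAD", head_resp.status, len(head_resp.read()), "bytes of body (none)")

conn.close()
```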

Response Format

The response message has three sections: an initial status line (protocol version, status code, status message), a number of header lines and an entity body (that contains the object).

Structure of an HTTP response message

The server can respond with different status codes (a few of them are probed in the sketch after the list):

  • 200 OK - The request succeeded.
  • 301 Moved Permanently - The requested object has been moved and the new location is specified in the Location: header of the response.
  • 400 Bad Request - Generic error code: the request could not be understood by the server.
  • 404 Not Found - The requested object has not been found on the server.
  • 505 HTTP Version Not Supported - The requested HTTP protocol version is not supported by the server.
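The sketch below probes two of these codes; the host and the paths are placeholders, and a real server will of course answer differently.

```python
# Observing status codes with Python's http.client (placeholder host/paths).
import http.client

conn = http.client.HTTPConnection("example.com")

conn.request("GET", "/no-such-object")
resp = conn.getresponse()
resp.read()
print(resp.status, resp.reason)                 # e.g. 404 Not Found

conn.request("GET", "/old-location")
resp = conn.getresponse()
resp.read()
if resp.status == 301:
    # The new URL is carried in the Location: header of the response.
    print("moved to", resp.getheader("Location"))

conn.close()
```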

User-Server Interactions

Although HTTP is a stateless protocol it’s possible to identify users using cookies. Cookie technology requires four components: a Set-cookie: header line in the response message, a Cookie: header line in the request message, a cookie file stored in the browser and a back-end database on the server.

The first time a user visits a website, the user is probably new to the server. The server reserves a unique identifier for the user and creates an entry in its database to store information regarding that identifier. The identifier is then included in the response message and sent to the client (using the Set-cookie: header).

Cookies are used to create an actual user session on top of the stateless HTTP.
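A minimal sketch of the round trip, assuming Python’s http.client, the placeholder host example.com and a server that actually sets a cookie:

```python
# Cookie round trip: the server assigns an identifier, the client sends it back.
import http.client

conn = http.client.HTTPConnection("example.com")

# First visit: the server may pick an identifier for this user and return it
# in a Set-Cookie: header (and store it in its back-end database).
conn.request("GET", "/")
resp = conn.getresponse()
resp.read()
cookie = resp.getheader("Set-Cookie")           # e.g. "id=1678; Path=/"
print("server assigned:", cookie)

# Later requests: the browser consults its cookie file and sends the
# identifier back in a Cookie: header, so the server can recognise the user
# even though HTTP itself is stateless.
if cookie:
    conn.request("GET", "/", headers={"Cookie": cookie.split(";")[0]})
    resp = conn.getresponse()
    resp.read()
    print("second request status:", resp.status)

conn.close()
```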

Caches

A cache server (or proxy server) is a server located in the middle (with respect to the network topology) between client and server, and it responds to the client’s requests on behalf of the queried server.

Usually a client is directly connected to the cache server. It opens a TCP connection to the cache server and begins to send requests. The cache server processes each request and, if the resources are locally available, it just uses them to respond. If, on the other hand, the resources are not present, it behaves like a client and sends the request to the destination server to retrieve them.

Using caches may raise the problem that the object stored in the cache is not up to date. To solve this problem HTTP provides the conditional GET: a request whose method must be GET and that uses the If-modified-since: header.

The request is sent from the client to the cache server, which in turn sends it to the destination server. The cache server implements the conditional GET and, once the destination server receives the request, it checks whether the file has been modified since the date value of the If-modified-since: header. If so, it returns the new object; otherwise it returns just the response header (304 Not Modified) to notify the cache that the stored version is still up to date.
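The sketch below plays the role of the cache revalidating an object, again assuming Python’s http.client and a placeholder host, path and date:

```python
# Conditional GET: revalidate a cached copy with If-Modified-Since.
import http.client

conn = http.client.HTTPConnection("example.com")

# The cache remembers when it last fetched the object and asks the origin
# server whether anything changed since that date.
conn.request("GET", "/index.html",
             headers={"If-Modified-Since": "Wed, 09 Sep 2015 09:23:24 GMT"})
resp = conn.getresponse()
body = resp.read()

if resp.status == 304:
    # 304 Not Modified: headers only, no body; the cached copy is still valid.
    print("cached copy is up to date")
else:
    # 200 OK: the object changed and the server sent the new version.
    print("received", len(body), "bytes of the updated object")

conn.close()
```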

Benefits of caches

Imagine the scenario in which an institution has a 100 Mbps LAN and is connected to the Internet by a 15 Mbps access link. The average requested object size is 1 Mbit and the average request rate is 15 req/s. The time it takes the router to send a request to an origin server and receive the response (the Internet delay) is about 2 s.

The traffic intensity in the LAN is given by $I_{traff} = \dfrac{aL}{R}$ and it’s equal to $\frac{15\ req/s \cdot 1\ Mb/req}{100\ Mbps} = 0.15$, while the traffic intensity on the access link would be $I_{traff} = \dfrac{aL}{R} = \frac{15\ req/s \cdot 1\ Mb/req}{15\ Mbps} = 1$. When the traffic intensity approaches 1, the delay on a link becomes much higher.

One solution to this problem could be to increase the access rate to 100 Mbps (a very expensive solution). The traffic intensity on the access link would drop to 0.15, which results in a negligible delay between the routers, and the total response time would roughly be the Internet delay, about $2.0\ s$.

It’s possible to solve this problem using a cache too. Let’s say a cache server is installed with a hit rate of 0.4 (4 requests out of 10 are resolved by the cache server; hit rates typically span from 0.2 to 0.7). Since the cache server is inside the institution’s LAN, each request resolved by it requires just 0.01 s. The average response time would then be $hitRate \cdot cacheDelay$ $+$ $(1 - hitRate) \cdot (internetDelay + cacheDelay)$ $=$ $0.4 \cdot 0.01\ s + 0.6 \cdot 2.01\ s \approx 1.2\ s$, which is even better than upgrading the access rate.
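The same computation, written out as a short Python sketch with the constants assumed in this example:

```python
# Worked version of the numbers above (pure arithmetic, no external data).
request_rate = 15        # requests per second
object_size = 1          # Mbit per request
lan_rate = 100           # Mbps
access_rate = 15         # Mbps
internet_delay = 2.0     # seconds, router <-> origin servers
cache_delay = 0.01       # seconds, LAN round trip to the cache
hit_rate = 0.4

lan_intensity = request_rate * object_size / lan_rate        # 0.15
access_intensity = request_rate * object_size / access_rate  # 1.0 -> very high delay

# With the cache: 40% of requests are answered locally, the remaining 60%
# still pay the Internet delay plus the small LAN delay.
avg_response = hit_rate * cache_delay + (1 - hit_rate) * (internet_delay + cache_delay)

print(lan_intensity, access_intensity, round(avg_response, 2))  # 0.15 1.0 1.21
```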
