
Sunday 14 December 2014

Data Journey - HTTP, TCP, IP Protocol basics

Background

All the data sent here and there on the Internet - how does it work? The basis is synchronization of data transfer through an agreed-upon procedure, i.e. adhering to a communication protocol. There are different types of protocols: machines/hosts/nodes communicate using lower-level protocols, while applications running on those machines communicate using higher-level protocols.

Communication happens between endpoints (sockets). An endpoint is the entry point to a connection, process or service. Protocols are agreed-upon rules describing the format of the communicated information and the procedures that need to be followed.

Communication on the Internet depends on the Internet Protocol suite, a set of communication protocols making it possible to send bytes/octets between two networked computers, even if they are miles apart on different networks. It is also known as TCP/IP, after TCP (Transmission Control Protocol) and IP (Internet Protocol), the first protocols defined in the suite.

How it works

We want to send a message, something we wrote in a form on a web page, from a browser on our local machine to a web application running on a remote server.

The remote web server is listening for HTTP requests on a socket identified by its own IP address and a particular port (80 by default for HTTP). It directs requests for a dynamic resource to the web application. The client sending the request also creates a socket and connects it to the web server's IP address and the port on which the server is listening; the client's own end of the connection gets its local IP address and a port chosen by the operating system. Once the underlying network link is available, a three-way TCP handshake follows before data can be sent or received.
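
A minimal Python sketch of the two sockets described above, run on the loopback interface so that it needs no network. The port is picked by the operating system and the payload is a made-up form value; neither comes from a real web server.

```python
import socket
import threading

# Server side: a listening socket bound to a local IP address and port.
# Port 0 asks the operating system for any free port (an assumption made
# so the sketch runs anywhere; a real web server would use 80 or 443).
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
server_ip, server_port = server.getsockname()

received = []

def serve_once():
    conn, client_addr = server.accept()   # the handshake is complete here
    with conn:
        received.append(conn.recv(1024))
        conn.sendall(b"got it")

worker = threading.Thread(target=serve_once)
worker.start()

# Client side: connecting triggers the TCP handshake with the server's
# IP address and port; the OS assigns the client's own port.
with socket.create_connection((server_ip, server_port), timeout=5) as client:
    client_ip, client_port = client.getsockname()
    client.sendall(b"name=Ada")           # hypothetical form data
    reply = client.recv(1024)

worker.join()
server.close()
print("client", (client_ip, client_port), "-> server", (server_ip, server_port))
print(reply)
```

The printed address pairs show that each end of the connection has its own IP address and port.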

Browser and web server are applications communicating using the HTTP protocol. Through HTTP headers they tell each other in what format they want to receive the data, whether they can deal with compression, whether a cached copy of a resource/page can be used or it needs to be retrieved again, etc. The browser sends the actual information we want to send, along with mandatory and optional HTTP headers.
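
To make the headers concrete, here is a rough sketch of what such a form submission could look like as HTTP/1.1 text, together with a tiny parser. The host, path and form field are invented for illustration, and a real browser would send many more headers.

```python
# Hypothetical form submission; the host, path and field are made up.
body = "message=hello+world"
request = (
    "POST /guestbook HTTP/1.1\r\n"
    "Host: www.example.com\r\n"                          # mandatory in HTTP/1.1
    "Content-Type: application/x-www-form-urlencoded\r\n"
    f"Content-Length: {len(body)}\r\n"
    "Accept: text/html\r\n"                              # preferred response format
    "Accept-Encoding: gzip\r\n"                          # client can handle compression
    "\r\n"                                               # blank line ends the headers
    + body
)

def parse_request(raw):
    """Split a raw HTTP request into request line parts, headers and body."""
    head, _, payload = raw.partition("\r\n\r\n")
    request_line, *header_lines = head.split("\r\n")
    method, target, version = request_line.split(" ")
    headers = dict(line.split(": ", 1) for line in header_lines)
    return method, target, version, headers, payload

method, target, version, headers, payload = parse_request(request)
print(method, target, headers["Accept-Encoding"], payload)
```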

How does the data travel to its destination? Thanks to TCP, the sending application, the browser in our case, does not need to worry about splitting the message into pieces; it can hand over the whole message in one go and let TCP tackle the problem.

TCP provides connection-oriented, reliable transfer with error checking: lost or corrupted data is retransmitted, and data is delivered in order. Delivery is reliable but not necessarily timely. TCP also controls the data flow to avoid overwhelming the receiver (flow control) and to avoid network congestion, a situation in which the network is so overloaded that little or no useful data gets through. When transfer reliability is not crucial, reduced latency (transfer delay) can be achieved with UDP, the connectionless User Datagram Protocol. While using TCP is important in e-commerce, for instance, UDP is often used for real-time traffic such as live video streaming and VoIP.
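
For contrast, a small Python sketch of the connectionless style: a UDP datagram is sent with no handshake and no delivery guarantee. It runs on the loopback interface, where loss is unlikely; the payload and the OS-chosen port are illustrative only.

```python
import socket

# Receiver: a UDP socket bound to a local address. No listen() or accept().
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))   # port 0: let the OS choose a free port
receiver.settimeout(5)            # do not wait forever if the datagram is lost
address = receiver.getsockname()

# Sender: no connection set-up; each sendto() call is one datagram.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"frame 1", address)

# UDP itself would not resend a lost datagram; the application must cope.
data, source = receiver.recvfrom(1024)
print(data, "from", source)

sender.close()
receiver.close()
```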

The message is divided into small pieces, each a sequence of octets/bytes, and each piece is then encapsulated with additional data (a header and, at some layers, a footer). The encapsulated unit, headers + payload, is the basic unit of transfer; it is called a segment at the TCP layer and a packet or datagram at the IP layer. The headers accumulate gradually, as the protocol in each layer of the Internet Protocol suite adds its own, and in the end contain all the information needed to get the data across from one endpoint to the other. The TCP header holds the information needed to deliver the pieces to the right connection and reassemble the message from them (source and destination ports, sequence number, etc.).
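
The splitting and encapsulation can be sketched in a few lines of Python. The header here is a deliberately simplified stand-in with only three fields; a real TCP header carries more (flags, window size, checksum and so on), and the ports and chunk size are illustrative values.

```python
import struct

# Simplified stand-in for a TCP header: source port, destination port,
# sequence number, packed in network byte order (2 + 2 + 4 bytes).
HEADER = struct.Struct("!HHI")

def segment(message, src_port, dst_port, size):
    """Split message into chunks of at most `size` bytes, each with a header."""
    segments = []
    for offset in range(0, len(message), size):
        payload = message[offset:offset + size]
        # As in TCP, the sequence number counts bytes, not segments.
        segments.append(HEADER.pack(src_port, dst_port, offset) + payload)
    return segments

segments = segment(b"Hello from the browser!", 50000, 80, size=8)
for seg in segments:
    src, dst, seq = HEADER.unpack(seg[:HEADER.size])
    print(src, dst, seq, seg[HEADER.size:])
```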


 

http://books.msspace.net/mirrorbooks/snortids/0596006616/snortids-CHP-2-SECT-2.html

TCP operations have three phases. The first establishes a reliable connection using a three-way handshake (SYN, SYN-ACK, ACK). A TCP connection is managed by the operating system and exposed to applications through the socket API (application programming interface), a form of inter-process communication. After that, the data transfer phase happens, followed by closure of the connection and release of resources.
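
The connection set-up phase can be illustrated with a toy model of the three segments exchanged. This only traces the sequence and acknowledgement numbers; it is not how an operating system implements the handshake, and the function name and dictionary layout are invented for this sketch.

```python
import random

SEQ_SPACE = 2**32   # TCP sequence numbers are 32-bit and wrap around

def three_way_handshake(client_isn, server_isn):
    """Return the three segments that open a TCP connection (toy model)."""
    syn = {"from": "client", "flags": "SYN", "seq": client_isn}
    syn_ack = {"from": "server", "flags": "SYN+ACK",
               "seq": server_isn, "ack": (client_isn + 1) % SEQ_SPACE}
    ack = {"from": "client", "flags": "ACK",
           "seq": (client_isn + 1) % SEQ_SPACE,
           "ack": (server_isn + 1) % SEQ_SPACE}
    return [syn, syn_ack, ack]

# Each side picks its own initial sequence number; random here for illustration.
steps = three_way_handshake(random.randrange(SEQ_SPACE), random.randrange(SEQ_SPACE))
for step in steps:
    print(step)
```

Each side acknowledges the other's initial sequence number plus one, which is how both ends confirm that they can send and receive.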

The IP protocol deals with the actual packet transfer across network boundaries, i.e. with routing. It prescribes the format of its header, which contains the IP addresses of the source and destination hosts and other routing data.
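
As an illustration of that header format, the following Python sketch packs a 20-byte IPv4 header without options. The addresses are placeholders from the ranges reserved for documentation, and the identification and flags fields are simply left at zero.

```python
import socket
import struct

IPV4_FORMAT = "!BBHHHBBH4s4s"   # 20 bytes, network byte order

def internet_checksum(data):
    """Ones' complement sum of 16-bit words, as used by the IPv4 header."""
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def ipv4_header(src, dst, payload_len, ttl=64, protocol=socket.IPPROTO_TCP):
    """Build a 20-byte IPv4 header without options."""
    fields = [
        (4 << 4) | 5,            # version 4, header length 5 x 32-bit words
        0,                       # DSCP/ECN
        20 + payload_len,        # total length in bytes
        0,                       # identification (unused in this sketch)
        0,                       # flags and fragment offset
        ttl,                     # time to live, decremented by each router
        protocol,                # 6 = TCP
        0,                       # checksum placeholder
        socket.inet_aton(src),   # source address
        socket.inet_aton(dst),   # destination address
    ]
    fields[7] = internet_checksum(struct.pack(IPV4_FORMAT, *fields))
    return struct.pack(IPV4_FORMAT, *fields)

# Placeholder addresses from the documentation ranges.
header = ipv4_header("192.0.2.10", "198.51.100.7", payload_len=31)
print(header.hex())
```

Routers along the path read the destination address to choose the next hop and decrement the time-to-live field.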

When the packets representing our message arrive at the destination endpoint, they are reassembled to form the original data, according to the metadata in the headers: IP delivers each packet to the right host, TCP puts the pieces back in order, and HTTP describes the resulting message. The destination application receives the whole message instead of a bundle of little payloads. Impressive!
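
Reassembly can be sketched as well. This is a simplified model that assumes each piece arrives as a (sequence number, payload) pair, where the sequence number is a byte offset into the message; the function name is made up for this example, and real TCP does this incrementally inside the operating system.

```python
def reassemble(segments):
    """Rebuild a byte stream from (sequence_number, payload) pairs.

    Pieces may arrive out of order or duplicated.
    """
    stream = bytearray()
    for seq, payload in sorted(set(segments)):
        if seq > len(stream):
            raise ValueError(f"missing data before byte {seq}")
        # Append only the bytes not already received (handles overlap).
        stream += payload[len(stream) - seq:]
    return bytes(stream)

# Out-of-order arrival with one duplicate.
arrived = [(16, b"rowser!"), (0, b"Hello fr"), (8, b"om the b"), (0, b"Hello fr")]
message = reassemble(arrived)
print(message)
```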
