HTTP (Hypertext Transfer Protocol) is an application-layer protocol for communication between a client and a server.
HTTP is a request/response protocol. It specifies what clients can send to a server, and what they can expect to receive back [1, P. 683].
Originally HTTP was intended to transfer HTML documents from servers to browsers, but it’s now used for many different kinds of media.
HTTP/1.1 requests are made up of multiple lines. The first line is the most important: it contains the request method, the URI of the requested resource, and the HTTP version number:
GET /glossary/internet HTTP/1.1
URIs (Uniform Resource Identifiers) are strings that identify a resource [2, P. 18].
URIs can be represented either in absolute form or in relative form (relative to some base URI). An absolute URI begins with a scheme name (e.g. https) [2, P. 19].
HTTP URLs (Uniform Resource Locators) include the information required to reach the resource. They have the following structure:
"http:" "//" host [ ":" port ] [ abs_path [ "?" query ]]
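As a rough illustration of that structure, the sketch below splits an HTTP URL into its parts using Python's standard `urllib.parse.urlsplit`. The helper name `split_http_url` and the example URL are made up for this example. The defaults it applies (port 80 when none is given, and "/" for an empty path) follow RFC 2616.

```python
from urllib.parse import urlsplit


def split_http_url(url):
    """Break an HTTP URL into the parts named in the grammar above."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        # Port 80 is assumed for the http scheme when no port is given.
        "port": parts.port if parts.port is not None else 80,
        # An empty abs_path is treated as "/".
        "abs_path": parts.path or "/",
        "query": parts.query,
    }


if __name__ == "__main__":
    print(split_http_url("http://example.com:8080/glossary/internet?lang=en"))
```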
HTTP usually runs over TCP. In HTTP/1.0, the TCP connection was closed after each HTTP response. This meant every request needed its own TCP handshake, even when multiple HTTP requests were made to the same server while loading a webpage (a common scenario) [1, P. 684].
To solve this, HTTP/1.1 supports persistent connections. The TCP connection can be kept alive, and additional requests can be sent over the same connection (also known as connection reuse). This improves performance [1, P. 684].
It’s also possible to pipeline requests, that is, to send the next request before the response to the previous one has arrived [1, P. 684].
Idle connections are typically closed after a short time (e.g., 60 seconds) to avoid servers holding too many connections open [1, P. 685].
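Connection reuse can be observed with a small local experiment. The sketch below is only an illustration built on Python's standard `http.server` and `http.client` modules. The names `EchoPortHandler` and `fetch_twice_on_one_connection` are hypothetical. The server replies with the client's TCP source port, so two requests that report the same port must have travelled over the same TCP connection.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class EchoPortHandler(BaseHTTPRequestHandler):
    # Speaking HTTP/1.1 lets the server keep the connection open.
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        # The body is the client's TCP source port for this connection.
        body = str(self.client_address[1]).encode("ascii")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        # Content-Length tells the client where the response ends,
        # which is what makes reusing the connection possible.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # Keep the demo quiet.


def fetch_twice_on_one_connection():
    """Send two GET requests over one HTTPConnection; return the ports seen."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), EchoPortHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
        ports = []
        for path in ("/first", "/second"):
            conn.request("GET", path)
            response = conn.getresponse()
            # Read the whole body before sending the next request.
            ports.append(response.read().decode("ascii"))
        conn.close()
        return ports
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(fetch_twice_on_one_connection())
```

Note that this sketch sends the second request only after the first response has been read, so it demonstrates a persistent connection but not pipelining.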
An HTTP request has an associated method.
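The first word on the first line of an HTTP request (the Request-Line) is its method name. The parts of the line are separated by single spaces (SP), and the line ends with a carriage return and line feed (CRLF):
Request-Line = Method SP Request-URI SP HTTP-Version CRLF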
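A minimal sketch of how a Request-Line could be taken apart, assuming the three parts are separated by single spaces as RFC 2616 specifies. `RequestLine` and `parse_request_line` are illustrative names, not part of any library.

```python
from typing import NamedTuple


class RequestLine(NamedTuple):
    method: str
    request_uri: str
    http_version: str


def parse_request_line(line: str) -> RequestLine:
    """Split a Request-Line into method, Request-URI, and HTTP version."""
    # Drop the trailing CRLF, then split on the single-space separators.
    parts = line.rstrip("\r\n").split(" ")
    if len(parts) != 3:
        raise ValueError(f"malformed Request-Line: {line!r}")
    method, request_uri, http_version = parts
    if not http_version.startswith("HTTP/"):
        raise ValueError(f"unexpected HTTP version: {http_version!r}")
    return RequestLine(method, request_uri, http_version)


if __name__ == "__main__":
    print(parse_request_line("GET /glossary/internet HTTP/1.1\r\n"))
```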
Method names indicate the intent of the request:
| Method | Description |
|---|---|
| GET | Read a Web page. |
| HEAD | Read a Web page’s header. |
| POST | Append to a Web page. |
| PUT | Store a Web page. |
| DELETE | Remove the Web page. |
| TRACE | Echo the incoming request. |
| CONNECT | Connect through a proxy. |
| OPTIONS | Query options for a page. |
The first line of an HTTP request or response can be followed by additional lines called headers: request headers in a request, and response headers in a response [1, P. 688].
Each header field consists of a name and a field value, separated by a colon (:):
message-header = field-name ":" [ field-value ]
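The grammar above maps onto a very small parser. The following sketch uses a hypothetical helper name, `parse_header_line`. It splits on the first colon only, because field values may themselves contain colons, and it lowercases the field name, because header field names are case-insensitive.

```python
from typing import Tuple


def parse_header_line(line: str) -> Tuple[str, str]:
    """Split one header line into (field-name, field-value)."""
    # partition() splits at the first colon and leaves later ones intact.
    name, separator, value = line.rstrip("\r\n").partition(":")
    if not separator or not name:
        raise ValueError(f"malformed header line: {line!r}")
    # Whitespace around the value is not part of the value itself.
    return name.lower(), value.strip()


if __name__ == "__main__":
    print(parse_header_line("Host: example.com:8080\r\n"))
```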
Headers can be used to set caching policies, provide authorization, and provide metadata about the user agent making the request (as well as many other uses) [1, P. 688].
HTTP status codes are included in HTTP responses, as part of the Status-Line:
Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF
The status code is a 3-digit integer that describes the result of the server's attempt to understand and satisfy the request [2, P. 39].
Status codes are grouped into classes by their first digit:
| Code | Meaning | Examples |
|---|---|---|
| 1xx | Information | 100 = server agrees to handle client’s request. |
| 2xx | Success | 200 = request succeeded. 204 = no content present. |
| 3xx | Redirection | 301 = page moved. 304 = cached page still valid. |
| 4xx | Client error | 403 = forbidden page. 404 = page not found. |
| 5xx | Server error | 500 = internal server error. 503 = try again later. |
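To make the classification concrete, here is a small sketch that parses a Status-Line and looks up the class of its status code from the first digit. The names `STATUS_CLASSES`, `parse_status_line`, and `status_class` are illustrative. The parser assumes a well-formed line with both separating spaces present.

```python
# Class names taken from the table above, keyed by the first digit.
STATUS_CLASSES = {
    1: "Information",
    2: "Success",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}


def parse_status_line(line):
    """Split a Status-Line into (HTTP-Version, Status-Code, Reason-Phrase)."""
    # The Reason-Phrase may contain spaces, so split at most twice.
    http_version, status_code, reason_phrase = line.rstrip("\r\n").split(" ", 2)
    return http_version, int(status_code), reason_phrase


def status_class(status_code):
    """Return the class of a status code based on its first digit."""
    return STATUS_CLASSES.get(status_code // 100, "Unknown")


if __name__ == "__main__":
    version, code, reason = parse_status_line("HTTP/1.1 404 Not Found\r\n")
    print(version, code, reason, "->", status_class(code))
```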
Caching is the process of saving HTTP responses to be used later. Caching improves performance by reducing network traffic and latency [1, P. 690].
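The idea can be sketched with a deliberately naive in-memory cache. `ResponseCache` and its `fetch` parameter are hypothetical names, and the sketch ignores how long a saved response stays valid, which any real HTTP cache has to handle. It only shows that a repeated request can be answered without going back to the network.

```python
from typing import Callable, Dict


class ResponseCache:
    """Naive cache: stores one response body per URL, with no expiry."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        # fetch is whatever function performs the real HTTP request.
        self._fetch = fetch
        self._store: Dict[str, bytes] = {}
        self.network_requests = 0

    def get(self, url: str) -> bytes:
        if url not in self._store:
            # Cache miss: go to the network and save the response.
            self.network_requests += 1
            self._store[url] = self._fetch(url)
        # Cache hit (or freshly stored): serve the saved copy.
        return self._store[url]


if __name__ == "__main__":
    # A stand-in for a real network fetch, used only for this demo.
    cache = ResponseCache(lambda url: ("body of " + url).encode("utf-8"))
    cache.get("http://example.com/")
    cache.get("http://example.com/")
    print(cache.network_requests)
```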
- [1] A. Tanenbaum and D. Wetherall, Computer Networks, 5th ed. 2011.
- [2] H. F. Nielsen et al., “Hypertext Transfer Protocol – HTTP/1.1,” RFC 2616. RFC Editor, Jun. 1999.