HTTP/1

HTTP (Hypertext Transfer Protocol) is an application-layer protocol used for communication between a client and a server.

Table of contents

  1. Introduction
  2. URIs
  3. Connections
  4. HTTP Methods
  5. HTTP Headers
  6. HTTP status codes
  7. Caching
  8. References

Introduction

HTTP is a request/response protocol. It specifies what clients can send to a server, and what they can expect to receive back [1, P. 683].

Originally HTTP was intended to transfer HTML documents from servers to browsers, but it’s now used for many different kinds of media.

An HTTP/1.1 request consists of multiple lines. The first line is the most important. It contains the request method, the path of the requested resource, and the HTTP version number:

GET /glossary/big-o-notation HTTP/1.1
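To make the multi-line structure concrete, here is a minimal sketch that assembles a complete request message. The helper name `build_request` and the host `example.com` are assumptions for illustration; the request line itself is the one shown above.

```python
def build_request(method: str, path: str, host: str) -> bytes:
    """Assemble a minimal HTTP/1.1 request message as bytes."""
    lines = [
        f"{method} {path} HTTP/1.1",  # request line: method, path, version
        f"Host: {host}",              # HTTP/1.1 requests carry a Host header
        "",                           # empty line ends the header section
        "",
    ]
    # Lines in an HTTP message are terminated by CRLF.
    return "\r\n".join(lines).encode("ascii")


# "example.com" is a placeholder host, not taken from the original text.
request = build_request("GET", "/glossary/big-o-notation", "example.com")
print(request.decode("ascii"))
```

The bytes returned here are what a client would write to the TCP connection before reading the response.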

URIs

URIs (Uniform Resource Identifiers) are strings that identify a resource [2, P. 18].

URIs can be represented in either absolute form or relative form (relative to some base URI). An absolute URI begins with a scheme name (e.g. https) followed by a colon [2, P. 19].

HTTP URLs (Uniform Resource Locators) include the information required to reach the resource. They have the following structure:

"http:" "//" host [ ":" port ] [ abs_path [ "?" query ]]

[2, P. 19]
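The components of this structure can be pulled apart with Python's standard `urllib.parse` module. This is a small sketch; the URL is a made-up example, and falling back to port 80 when none is given is handled by the hypothetical helper `http_url_parts`.

```python
from urllib.parse import urlsplit


def http_url_parts(url: str) -> tuple[str, int, str, str]:
    """Split an http URL into (host, port, abs_path, query)."""
    parts = urlsplit(url)
    if parts.scheme != "http":
        raise ValueError(f"not an http URL: {url!r}")
    port = parts.port if parts.port is not None else 80  # default HTTP port
    path = parts.path or "/"  # an empty abs_path is treated as "/"
    return parts.hostname, port, path, parts.query


# Example URL is an assumption for illustration only.
print(http_url_parts("http://example.com:8080/glossary/big-o-notation?lang=en"))
print(http_url_parts("http://example.com"))
```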

Connections

HTTP usually runs over TCP. In HTTP/1.0, the TCP connection was closed after each response. This meant every request needed a new connection and a new TCP handshake, even if multiple HTTP requests were made to the same server while loading a webpage (a common scenario) [1, P. 684].

To solve this, HTTP/1.1 supports persistent connections. The TCP connection can be kept alive, and additional requests can be sent over the same connection (also known as connection reuse). This improves performance [1, P. 684].
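The difference in defaults between the two versions can be sketched as a small decision function. This is an illustration under assumptions: the function name `is_persistent` is hypothetical, header names are assumed to be lowercased already, and the `keep-alive` token for HTTP/1.0 is a common extension that the text above does not describe.

```python
def is_persistent(version: str, headers: dict[str, str]) -> bool:
    """Return True if the connection should stay open after this message."""
    # The Connection header holds a comma-separated list of tokens.
    raw = headers.get("connection", "")
    tokens = {token.strip().lower() for token in raw.split(",")}
    if "close" in tokens:
        return False  # either side can ask for the connection to be closed
    if version == "HTTP/1.1":
        return True   # HTTP/1.1 connections are persistent by default
    # Assumption: HTTP/1.0 peers only keep the connection open on request.
    return "keep-alive" in tokens


print(is_persistent("HTTP/1.1", {}))
print(is_persistent("HTTP/1.1", {"connection": "close"}))
print(is_persistent("HTTP/1.0", {}))
```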

It’s also possible to pipeline requests, that is, to send further requests before the responses to earlier ones have arrived [1, P. 684].

Idle connections are typically closed after a short time (e.g. 60 seconds) to avoid servers holding too many connections open [1, P. 685].

HTTP Methods

An HTTP request has an associated method.

The first word on the first line of an HTTP request (the Request-Line) is its method name:

Request-Line = Method SP Request-URI SP HTTP-Version CRLF

[2, P. 35]
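A Request-Line can be split into its three parts on the single spaces that separate them. The sketch below assumes a well-formed line; `RequestLine` and `parse_request_line` are hypothetical names, not part of any library.

```python
from typing import NamedTuple


class RequestLine(NamedTuple):
    method: str
    request_uri: str
    version: str


def parse_request_line(line: str) -> RequestLine:
    """Split a Request-Line into method, Request-URI, and HTTP version."""
    parts = line.rstrip("\r\n").split(" ")
    if len(parts) != 3:
        raise ValueError(f"malformed request line: {line!r}")
    method, request_uri, version = parts
    if not version.startswith("HTTP/"):
        raise ValueError(f"unexpected HTTP version: {version!r}")
    return RequestLine(method, request_uri, version)


parsed = parse_request_line("GET /glossary/big-o-notation HTTP/1.1\r\n")
print(parsed.method, parsed.request_uri, parsed.version)
```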

Method names indicate the intent of the request:

Method    Description
GET       Read a Web page
HEAD      Read a Web page’s header
POST      Append to a Web page
PUT       Store a Web page
DELETE    Remove the Web page
TRACE     Echo the incoming request
CONNECT   Connect through a proxy
OPTIONS   Query options for a page

[1, P. 686]

HTTP Headers

The first line of an HTTP request can be followed by additional lines, called request headers. Responses can likewise carry response headers after their first line [1, P. 688].

Each header field consists of a field name followed by a colon (:) and an optional field value:

message-header = field-name ":" [ field-value ]

[2, Pp. 31-2]
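Splitting a header line at its first colon gives the field name and value. This is a minimal sketch with a hypothetical helper, `parse_header`. It lowercases the name because field names are case-insensitive, which is a design choice made here for easy lookup.

```python
def parse_header(line: str) -> tuple[str, str]:
    """Split one header line into a (lowercased name, value) pair."""
    # Only the first colon separates name from value; values such as
    # "example.com:8080" may contain further colons.
    name, sep, value = line.rstrip("\r\n").partition(":")
    if not sep or not name.strip():
        raise ValueError(f"malformed header line: {line!r}")
    return name.strip().lower(), value.strip()


print(parse_header("Content-Type: text/html\r\n"))
print(parse_header("Host: example.com:8080"))
```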

Headers can be used to set caching policies, provide authorization, and provide metadata about the user agent making the request (as well as many other uses) [1, P. 688].

HTTP status codes

HTTP status codes are included in HTTP responses, as part of the Status-Line:

Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF

[2, P. 39]

The status code is a 3-digit integer that describes the result of the server’s attempt to understand and satisfy the request [2, P. 39].

Status codes are grouped into classes by their first digit:

Code   Meaning        Examples
1xx    Information    100 = server agrees to handle client’s request
2xx    Success        200 = request succeeded; 204 = no content present
3xx    Redirection    301 = page moved; 304 = cached page still valid
4xx    Client error   403 = forbidden page; 404 = page not found
5xx    Server error   500 = internal server error; 503 = try again later

[1, P. 688]
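Because the class is determined by the first digit, it can be found with integer division. The sketch below uses a hypothetical `status_class` helper with the class names from the table, and Python's standard `http.HTTPStatus` enum to look up the usual reason phrase.

```python
from http import HTTPStatus

# Class names follow the table above, keyed by the first digit.
STATUS_CLASSES = {
    1: "Information",
    2: "Success",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}


def status_class(code: int) -> str:
    """Return the class of a 3-digit HTTP status code."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    return STATUS_CLASSES[code // 100]


for code in (200, 304, 404, 503):
    print(code, status_class(code), "-", HTTPStatus(code).phrase)
```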

Caching

Caching is the process of saving HTTP responses to be used later. Caching improves performance by reducing network traffic and latency [1, P. 690].
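The basic idea can be sketched with an in-memory store keyed by URL. This is a deliberately simplified illustration: `ResponseCache` and `fake_fetch` are hypothetical names, the fetch function stands in for a real network request, and the sketch ignores the freshness and validation rules that real HTTP caches follow.

```python
from typing import Callable


class ResponseCache:
    """Save response bodies by URL so repeat requests skip the network."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch                 # function that performs the real request
        self._store: dict[str, bytes] = {}  # saved responses, keyed by URL
        self.misses = 0                     # number of times the network was used

    def get(self, url: str) -> bytes:
        if url not in self._store:
            self.misses += 1
            self._store[url] = self._fetch(url)
        return self._store[url]


def fake_fetch(url: str) -> bytes:
    # Stand-in for a network request (assumption for illustration).
    return f"body of {url}".encode("ascii")


cache = ResponseCache(fake_fetch)
first = cache.get("http://example.com/a")
second = cache.get("http://example.com/a")  # served from the cache
print(cache.misses)
```

The second call returns the saved response without invoking the fetch function, which is where the reduction in traffic and latency comes from.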

References

  [1] A. Tanenbaum and D. Wetherall, Computer Networks, 5th ed. 2011.
  [2] R. Fielding et al., “Hypertext Transfer Protocol -- HTTP/1.1,” RFC 2616, Jun. 1999.