Web programming

Units WEB1P and WEB2P

The HTTP protocol

HTTP (the Hypertext Transfer Protocol) is what makes the World Wide Web work: it is the protocol that browsers and web servers use to talk to each other. In this section we look at it in a little more detail so that we understand how it works and what some of its capabilities are.

HTTP is a text-based protocol. This means you can read messages sent over it. Normally, of course, you don't, but it is worth understanding the format of messages passed between the client and server.

The most common request (remember a request is an HTTP message sent from the client to the server) is a GET request. A GET request asks the server to deliver some resource (usually a web page) to the client.

A GET request for the University of Portsmouth's home page would begin with the line:

GET http://www.port.ac.uk/index.htm HTTP/1.1

This is in three parts:

  1. "GET" is the method. We will look at other methods later on.
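  2. "http:..." is the URL. This specifies the resource (i.e. document) that is being requested. The full URL form shown here is what a client sends to a proxy; when talking directly to the server, a browser normally sends only the path (here /index.htm) and names the server in a separate Host header, which HTTP/1.1 requires.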
  3. "HTTP/1.1" is the version of the protocol being used. Most servers now use v1.1 of HTTP, though most will also handle v1.0 requests. (For an explanation of the differences, see http://www.apacheweek.com/features/http11.)
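
Because the request line is plain text, the three parts listed above can be pulled apart with ordinary string handling. The following is a minimal Python sketch, not how any particular server does it; the function name parse_request_line is made up for this illustration.

```python
def parse_request_line(line):
    """Split an HTTP request line into (method, url, version).

    parse_request_line is a hypothetical helper written for these
    notes; it is not part of any library.
    """
    # The three parts are separated by single spaces.
    parts = line.strip().split(" ")
    if len(parts) != 3:
        raise ValueError("expected 'METHOD URL VERSION', got: %r" % line)
    method, url, version = parts
    # The last part names the protocol version, e.g. HTTP/1.1.
    if not version.startswith("HTTP/"):
        raise ValueError("not an HTTP version: %r" % version)
    return method, url, version


if __name__ == "__main__":
    # The request line used as the example in the text.
    method, url, version = parse_request_line(
        "GET http://www.port.ac.uk/index.htm HTTP/1.1"
    )
    print("method: ", method)
    print("url:    ", url)
    print("version:", version)
```

A real server does considerably more checking than this (for example, on the characters allowed in each part), so treat the sketch only as a way of seeing the structure of the line.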

Provided the server understands v1.1 of the HTTP protocol, it will find the requested resource and send it back to the client in a response (remember a response is an HTTP message sent from the server to the client). This would look something like the following:

HTTP/1.1 200 OK
Server: Microsoft-IIS/4.0
Date: Mon, 29 Apr 2002 08:50:53 GMT
Content-Type: text/html
Accept-Ranges: bytes
Last-Modified: Wed, 10 Apr 2002 16:12:34 GMT
ETag: "085fb85aae0c11:54fb"
Content-Length: 13845

<HTML>
<HEAD>
<TITLE>University of Portsmouth - Our University</TITLE>
...

The first line is the status line, which carries the protocol version and the HTTP response code (200 is the code for OK). The lines that follow it, down to Content-Length, are what are called HTTP headers; a blank line then separates them from the document itself. Both requests and responses can have headers. Most of the headers here are self-explanatory, but note the "Content-Type" header, which defines what sort of document is being returned (in this case an HTML one).
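
To make the layout of a response concrete, here is a rough Python sketch that splits a response into its status line, headers and body. It assumes the whole response is already held in one string with lines ending in CRLF (carriage return, line feed), which is how HTTP/1.1 terminates lines; the function name parse_response and the shortened sample message are invented for this illustration.

```python
def parse_response(raw):
    """Split a raw HTTP response into its parts.

    parse_response is a hypothetical helper for these notes.
    Returns (version, status_code, reason, headers, body).
    """
    # The first blank line separates the headers from the body.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")

    # Status line: version, numeric code, then a reason phrase
    # that may itself contain spaces (e.g. "Object Not Found").
    version, code, reason = lines[0].split(" ", 2)

    # Each header is "Name: value". Header names are not
    # case-sensitive, so store them in lower case.
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()

    return version, int(code), reason, headers, body


if __name__ == "__main__":
    # A shortened version of the response shown above.
    sample = (
        "HTTP/1.1 200 OK\r\n"
        "Server: Microsoft-IIS/4.0\r\n"
        "Date: Mon, 29 Apr 2002 08:50:53 GMT\r\n"
        "Content-Type: text/html\r\n"
        "Content-Length: 13845\r\n"
        "\r\n"
        "<HTML>\r\n"
    )
    version, code, reason, headers, body = parse_response(sample)
    print(code, reason)
    print(headers["content-type"])
```

Note that the header value is split at the first colon only, so a value that itself contains colons (such as the time in the Date header) is kept whole.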

If the requested document didn't exist, the first line would contain an error code (404 is the code for "not found") as below:

HTTP/1.1 404 Object Not Found
Server: Microsoft-IIS/4.0
Date: Mon, 29 Apr 2002 08:58:12 GMT
Content-Length: 11891
Content-Type: text/html

<HTML>
<HEAD>
<TITLE>University of Portsmouth - Our University</TITLE>
...
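
Notice that the server above describes code 404 as "Object Not Found", whereas the standard phrase is "Not Found". Clients are expected to act on the number, not the wording. As a small illustration, the sketch below uses Python's standard http.HTTPStatus table to look up the standard phrase for a code and to report which broad class it falls in; the function name describe_status and the output format are made up for this example.

```python
from http import HTTPStatus


def describe_status(code):
    """Return a short description of an HTTP response code.

    describe_status is a hypothetical helper for these notes. It
    raises ValueError for codes missing from Python's table.
    """
    # The first digit of the code gives its general class.
    if 100 <= code < 200:
        kind = "informational"
    elif 200 <= code < 300:
        kind = "success"
    elif 300 <= code < 400:
        kind = "redirection"
    elif 400 <= code < 500:
        kind = "client error"
    else:
        kind = "server error"
    # HTTPStatus(code).phrase is the standard reason phrase.
    return "%d %s (%s)" % (code, HTTPStatus(code).phrase, kind)


if __name__ == "__main__":
    # The two codes that appear in the examples above.
    print(describe_status(200))
    print(describe_status(404))
```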

Another method that is frequently used in a request is POST. We will look at this in more detail in lesson 4.

For a fuller description of HTTP, see Stein, pp. 47-57, and Basham, chapter 1.

The full HTTP 1.1 specification is in Internet RFC 2616, which is available at a number of websites including http://www.w3.org/Protocols/rfc2616/rfc2616.html.

 

Last updated by Prof Jim Briggs of the School of Computing at the University of Portsmouth

 
The web programming units include some material that was formerly part of the WPRMP, WECPP, WPSSM and WEMAM units.