negotiations from breaking down. In a similar vein, when computers need to talk, they
also follow a formal protocol. The protocol defines how data is transmitted and how
to decode it once it arrives. Web applications use the Hypertext Transfer Protocol
(HTTP) to move data between the browser running on your computer and the
application running on the server.
Many server applications communicate using protocols other than HTTP. Some
of these maintain an ongoing connection between the computers. The application
server knows exactly who is connected at all times and can tell when a connection is
dropped. Because they know the state of each connection and the identity of each
person using it, these are known as stateful protocols.
By contrast, HTTP is known as a stateless protocol. An HTTP server will accept
any request from any client and will always provide some type of response, even if
the response is just to say no. Without the overhead of negotiating and retaining a
connection, stateless protocols can handle a large volume of requests. This is one
reason why the Internet has been able to scale to millions of computers.
Another reason HTTP has become the universal standard is its simplicity. An
HTTP request looks like an ordinary text document. This has made it easy for
applications to make HTTP requests. You can even send an HTTP request by hand
using a standard utility such as Telnet. When the HTTP response comes back, it is also
in plain text that developers can read.
The first line in the HTTP request contains the method, followed by the location
of the requested resource and the version of HTTP. Zero or more HTTP request
headers follow the initial line. The HTTP headers provide additional information to
the server. This can include the browser type and version, acceptable document types,
and the browser’s cookies, just to name a few. Of the request methods that HTTP
defines, GET and POST are by far the most popular.
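To make the layout concrete, here is a minimal sketch that assembles the plain-text form of a request as described above: the request line (method, resource location, HTTP version), then zero or more headers, then a blank line. The class name HttpRequestSketch, the host www.example.com, and the header values are hypothetical and chosen only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: builds the text of an HTTP request; it does not send it.
class HttpRequestSketch {

    // Request line first, then one "Name: value" line per header,
    // then an empty line that marks the end of the headers.
    static String buildRequest(String method, String path, Map<String, String> headers) {
        StringBuilder sb = new StringBuilder();
        sb.append(method).append(' ').append(path).append(" HTTP/1.1\r\n");
        for (Map.Entry<String, String> h : headers.entrySet()) {
            sb.append(h.getKey()).append(": ").append(h.getValue()).append("\r\n");
        }
        sb.append("\r\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical header values; LinkedHashMap keeps insertion order.
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Host", "www.example.com");
        headers.put("Accept", "text/html");
        headers.put("User-Agent", "ExampleBrowser/1.0");
        System.out.print(buildRequest("GET", "/index.html", headers));
    }
}
```

The printed text is essentially what you would type by hand in a Telnet session connected to the server's HTTP port.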
Once the server has received and serviced the request, it will issue an HTTP
response. The first line in the response is called the status line and carries the HTTP
protocol version, a numeric status, and a brief description of the status. Following the
status line, the server will return a set of HTTP response headers that work in a way
similar to the request headers.
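The three parts of the status line can be pulled apart with a few lines of code. The following is a rough sketch, not a full HTTP parser; the class name StatusLineSketch and the sample status line are assumptions made for the example.

```java
// Illustrative only: splits a status line into version, numeric status, and description.
class StatusLineSketch {
    final String version;
    final int code;
    final String reason;

    StatusLineSketch(String version, int code, String reason) {
        this.version = version;
        this.code = code;
        this.reason = reason;
    }

    static StatusLineSketch parse(String line) {
        // Split into at most three pieces so a description such as
        // "Not Found" keeps its internal space.
        String[] parts = line.trim().split(" ", 3);
        if (parts.length < 2) {
            throw new IllegalArgumentException("Malformed status line: " + line);
        }
        String reason = parts.length == 3 ? parts[2] : "";
        return new StatusLineSketch(parts[0], Integer.parseInt(parts[1]), reason);
    }

    public static void main(String[] args) {
        StatusLineSketch s = parse("HTTP/1.1 404 Not Found");
        System.out.println(s.version + " | " + s.code + " | " + s.reason);
        // prints: HTTP/1.1 | 404 | Not Found
    }
}
```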
As we mentioned, HTTP does not preserve state information between
requests. The server logs the request, sends the response, and goes blissfully on to the
next request. While simple and efficient, a stateless protocol is problematic for
dynamic applications that need to keep track of their users. (Ignorance is not always
bliss.)
Cookies and URL rewriting are two common ways to keep track of users
between requests. A cookie is a small packet of information stored on the user’s computer.
URL rewriting stores a special reference in the page address that a Java server can use
to track users. Neither approach is seamless, and using either means extra work when
developing a web application. On its own, a standard HTTP web server does not
traffic in dynamic content. It mainly uses the request to locate a file and then returns
that file in the response. The file is typically formatted using Hypertext Markup
Language (HTML) [W3C, HTML] that the web browser can format and display. The