Web programming

Units WEB1P and WEB2P

Configuring a web server

Introduction

The job of a web server is to receive HTTP requests from the TCP/IP stack and to respond to them in an appropriate way. If the URL in a request corresponds to a filename, the file is sent back to the user's web browser for display. If the URL corresponds to a CGI program, the program is run and its output is sent for display.

There are three aspects to configuring a web server: initial configuration, CGI configuration, and access control and user management.

Initial configuration

The three most common web servers are Apache, Microsoft IIS and Netscape Enterprise Server, which are all evaluated in Larson and Stephens (2000), chapter 4. In this lecture we'll look at how to configure Apache.

The web server is run as a daemon, a program which offers a service to other programs, often over a network. A separate daemon is run for each web server installed on a machine. Each web server has its own configuration file and file structure.

The configuration file sets the web server's functionality and configuration options, expressed as directives. When the Apache daemon is launched, it configures itself according to the directives that have been specified. To change the configuration, the webmaster edits the relevant directives in the appropriate configuration file and restarts Apache.

The main configuration file in Apache is httpd.conf. By default, the website's pages are stored in the htdocs directory, error and access logs in the logs directory, and CGI scripts in the cgi-bin directory.
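As a rough illustration of how these locations appear as directives, here is a minimal sketch of an httpd.conf fragment. The server root /usr/local/etc/httpd and the host name www.example.com are assumptions chosen for illustration. A real file contains many more directives and, depending on the Apache version, LoadModule lines as well.

```apache
# Sketch only: all paths and names below are assumptions.
ServerRoot "/usr/local/etc/httpd"
Listen 80
ServerName www.example.com

# Web pages are served from the htdocs directory
DocumentRoot "/usr/local/etc/httpd/htdocs"

# Relative log paths are resolved against ServerRoot
ErrorLog logs/error_log
LogFormat "%h %l %u %t \"%r\" %>s %b" common
CustomLog logs/access_log common
```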

Because instructions and directives are so flexible, you need to develop a plan for the web site's configuration and access before you configure the server. You should consider interaction planning, security, user access and planning for growth.

The Apache web page gives details about configuring the httpd.conf file.

Larson and Stephens, chapter 4.2 describes how to customize the web server.

CGI configuration

Several directives need to be set in httpd.conf so that Apache executes CGI programs as programs rather than delivering them as files. These directives tell Apache where to find CGI-related items, such as the directory that holds the scripts.

The Apache document Dynamic Content with CGI contains instructions for configuring a web server to execute CGI programs.
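The two usual approaches described in that document can be sketched as follows. The directory paths are assumptions; ScriptAlias, Options +ExecCGI and AddHandler are standard Apache directives.

```apache
# Map URLs beginning with /cgi-bin/ to the script directory and treat
# every file in it as a CGI program (the path is an assumption).
ScriptAlias /cgi-bin/ "/usr/local/etc/httpd/cgi-bin/"

# Alternative: permit CGI execution in another directory and say
# which file extensions identify CGI programs.
<Directory "/usr/local/etc/httpd/htdocs/scripts">
    Options +ExecCGI
    AddHandler cgi-script .cgi .pl
</Directory>
```

With ScriptAlias, everything in the named directory is executed. With the second form, only files with the listed extensions are executed and other files are delivered as usual.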

Access control and user management

A website does not have to operate access control; indeed most websites don't, because they're intended for the general public. But if you want to restrict access, perhaps to authorised employees, registered customers or users from particular Internet locations, you should set up some means of access control.

Users access web page or CGI program files via the web server. They may also wish to add web page files and manipulate the access rights to those files. At the same time, the web administrator wants to control access to the web server file structure and prevent unauthorised access. On a web server with multiple users it would be a security risk to give everyone read/write permission on the web root directory.

As with other aspects of configuring a web server, draw up a plan for allowing access to the web files. The plan will specify which directories need protection, which users are allowed access and at what level.

Access control

Access control in Apache requires that certain modules are included to provide security mechanisms. The most common modules are mod_auth for basic authentication, mod_digest for digest authentication, and mod_access for access control by host.

Larson and Stephens, lab 3.1 contains information on server users and directories, including a definition of the public_html directory.

Larson and Stephens, lab 4.3 contains information on setting permissions on files and directories, and limiting access to specific users and hosts.

Compare the Larson and Stephens references with the tutorial in Apache Today.

Access control can be carried out in two ways: site-wide control using the httpd.conf file, or per-directory control by users, using a .htaccess file in each directory.

Site-wide control

The web administrator can control access by specified hosts using the allow and deny directives within Directory blocks in the httpd.conf file. Each Directory block applies to a specific directory, so access can be controlled directory by directory. The order directive specifies the order in which the allow and deny directives are evaluated.
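A minimal sketch of host-based control, assuming a hypothetical directory htdocs/internal and a hypothetical domain and network that should be let in:

```apache
<Directory "/usr/local/etc/httpd/htdocs/internal">
    # Evaluate the Deny directives first, then the Allow directives
    Order deny,allow
    Deny from all
    # Hypothetical domain and network that are allowed in
    Allow from .example.com
    Allow from 192.168.1
</Directory>
```

Because the allow directives are evaluated last, a host that matches both a deny and an allow directive is admitted. Note that this is the syntax of the Apache versions discussed here; from Apache 2.4 onwards these directives are deprecated in favour of Require.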

Per-directory control by user

First, the web administrator creates files identifying users and user groups. The web administrator then sets directives in the web server configuration file that refer to these files.

The Apache notes on directives describe how to do this.
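One possible arrangement is sketched below, assuming a hypothetical directory htdocs/members and a user file of the kind created with htpasswd. The AllowOverride line goes in httpd.conf; the remaining directives go in the .htaccess file in the protected directory.

```apache
# In httpd.conf: let .htaccess files in this tree set authentication
<Directory "/usr/local/etc/httpd/htdocs/members">
    AllowOverride AuthConfig
</Directory>

# In /usr/local/etc/httpd/htdocs/members/.htaccess (hypothetical path):
AuthType Basic
AuthName "Members area"
AuthUserFile /usr/local/etc/httpd/users
Require user Gordon
```

Replacing the last line with Require valid-user would admit anyone listed in the user file rather than one named user.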

User management

For access control by user to work, there needs to be a database of known users and their passwords. This can be kept in a file, users, at the same level as the directory htdocs, with access restricted to the web administrator. The most common way to do this in Apache is to use the utility htpasswd. For example,

htpasswd -c /usr/local/etc/httpd/users Gordon 

creates the file users and adds Gordon to it. The -c option must be omitted if the file already exists; otherwise the existing file is overwritten. The command prompts for a password and stores it, in encrypted form, in the file with the username.

User group files can also be created so that identification can be customised for different directories. The group file is a plain text file listing the members of each group. It is usually located in the same directory as the user file, although it can be stored in a top-level directory Users for security.

Both user files and group files can then be referenced by the httpd.conf file.
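A sketch of how the two files might be referenced, assuming a hypothetical group file called groups stored beside the user file, a hypothetical group called staff, and a hypothetical directory htdocs/staff:

```apache
# Contents of the hypothetical group file /usr/local/etc/httpd/groups,
# one group per line (shown here as a comment):
#   staff: Gordon alice bob

<Directory "/usr/local/etc/httpd/htdocs/staff">
    AuthType Basic
    AuthName "Staff only"
    AuthUserFile /usr/local/etc/httpd/users
    AuthGroupFile /usr/local/etc/httpd/groups
    # Admit any authenticated user who is a member of the group staff
    Require group staff
</Directory>
```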

If there are many users, a database file can be used instead of a plain text file.

Note that the password file should not be stored in a directory accessible from the server's document root; this would make it potentially readable by anyone and would be a security risk.

As an alternative to requiring users to be named, Apache can be set up to accept anonymous access. The module mod_auth_anon is required for this; see the Apache notes on mod_auth_anon for more details.
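A sketch of anonymous access, assuming the mod_auth_anon module of Apache 1.3 or 2.0 and a hypothetical directory htdocs/public. Later Apache versions rename the module and configure it differently, so treat this as illustrative only.

```apache
<Directory "/usr/local/etc/httpd/htdocs/public">
    AuthType Basic
    AuthName "Use 'anonymous' and your email address"
    # User IDs that are accepted as anonymous
    Anonymous anonymous guest
    # Require something in the password field (by convention an
    # email address) and record it in the error log
    Anonymous_MustGiveEmail on
    Anonymous_LogEmail on
    # Do not fall through to other authentication modules
    Anonymous_Authoritative on
    Require valid-user
</Directory>
```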

References

Larson, E. and Stephens, B. (2000) Administrating Web Servers, Security and Maintenance, Prentice Hall, New Jersey.

Laurie, B. and Laurie, P. (1999) Apache, the definitive guide, O'Reilly, Sebastopol, CA.

http://httpd.Apache.org, referenced 26th February 2003.

http://www.Apacheweek.com/, referenced 26th February 2003.

http://www.serverwatch.com/tutorials/article.php/1127711, referenced 26th February 2003.

 

Last updated by Prof Jim Briggs of the School of Computing at the University of Portsmouth

 
The web programming units include some material that was formerly part of the WPRMP, WECPP, WPSSM and WEMAM units.