Webserver use, configuration and management

Unit WUCM1

Further web server configuration

Basic Features

This practical explores some other Apache directives and their effects.  

The first set were the basic global directives:

### Section 1: Global Environment   
ServerType standalone
ServerRoot "C:/Apache"
PidFile logs/httpd.pid
ScoreBoardFile logs/apache_runtime_status
Timeout 300
KeepAlive On
MaxKeepAliveRequests 100
KeepAliveTimeout 15
MaxRequestsPerChild 0
ThreadsPerChild 5

Note that the specialised log files httpd.pid and apache_runtime_status are by default held in the logs sub-directory of the directory given for ServerRoot, viz. C:\Apache in my example. (Remember that internally Apache uses Unix-style directory separators; mapping back and forth at the appropriate time can be confusing!)

For the purposes of this practical we will leave these alone, but one worthwhile experiment is to make sure that your Roger (or equivalent) WebRoot directory, together with its logs subdirectory, is used for the startup files. (See the earlier practical.) If you want to see the pid file listed in Windows Explorer, note that it is classed as a system file and may be hidden by Windows even if you have set Explorer to display hidden files; configure Explorer to display system files as well as hidden files.
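
One way to try this experiment is to give the two files fully qualified paths. The fragment below is only a sketch: it assumes that N:/WebRoot/Roger/logs already exists and is writable, and the earlier practical may have used a different approach (for instance changing ServerRoot itself).

```apache
# Sketch only: the directory N:/WebRoot/Roger/logs is an assumption,
# substitute your own WebRoot. A path that starts with a drive letter
# is used as given rather than being appended to ServerRoot.
PidFile "N:/WebRoot/Roger/logs/httpd.pid"
ScoreBoardFile "N:/WebRoot/Roger/logs/apache_runtime_status"
```

After restarting Apache, check that httpd.pid appears in the new logs directory.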

Apache modules

One group of directives describes which of Apache's optional modules are to be included. The default install includes a standard set. To see which modules have been compiled in, give the command "apache -l" (note that the qualifier is an ‘ell’, not a ‘one’), as shown:

You can load a number of others - see the original conf file or the Apache help file for examples. Note that Apache 1 and Apache 2 support a different range of modules. For today we will mostly make do with these inbuilt modules.

In order to make use of these resources you need to tell Apache to add them to its active set. Strictly speaking, the compiled-in modules are already active, but they need to be enabled in a particular order, especially if you have loaded any other modules. The ClearModuleList directive resets the active list to empty, and AddModule then lets you add modules back in the order you specify.

Order matters because any resources used by later modules must already have been declared. So add the following lines to your httpd.conf file; you can omit any of the commented ones. You should be able to copy and paste them from the ‘default’ or backup copy, or indeed from this file.

#The default conf file includes a significant number of lines describing the
#'modules' to be loaded; the following are only those precompiled in the
#standard Win32 build.  Order is important; if in doubt revert to the original
#set and then experiment.  Note that the LoadModule lines have been deleted.
# [WHENEVER YOU CHANGE THE LOADMODULE SECTION ABOVE, UPDATE THIS TOO!]
ClearModuleList
#AddModule mod_vhost_alias.c
AddModule mod_env.c
AddModule mod_log_config.c
#AddModule mod_mime_magic.c
AddModule mod_mime.c
AddModule mod_negotiation.c
#AddModule mod_status.c
#AddModule mod_info.c
AddModule mod_include.c
AddModule mod_autoindex.c
AddModule mod_dir.c
AddModule mod_isapi.c
AddModule mod_cgi.c
AddModule mod_asis.c
AddModule mod_imap.c
AddModule mod_actions.c
#AddModule mod_speling.c
AddModule mod_userdir.c
AddModule mod_alias.c
#AddModule mod_rewrite.c
AddModule mod_access.c
AddModule mod_auth.c
#AddModule mod_auth_anon.c
#AddModule mod_auth_dbm.c
#AddModule mod_auth_digest.c
#AddModule mod_digest.c
#AddModule mod_proxy.c
#AddModule mod_cern_meta.c
#AddModule mod_expires.c
#AddModule mod_headers.c
#AddModule mod_usertrack.c
#AddModule mod_unique_id.c
AddModule mod_so.c
AddModule mod_setenvif.c

Laurie (2003) discusses all of these briefly in Chapter 1 (pp24-26); see also Arnold (2000, pp43-48). One that we will be looking at briefly is “mod_dir.c”, which handles requests on directories and directory index files. Another that is often helpful in simplifying user interactions is “mod_alias.c”.

Basic security

The main server configuration part of the conf file gives scope for more customisation. First a little security is a good idea. Previously, the only security we considered was to define the DocumentRoot.

Directory protection

The section of the conf file below adds basic protection to directories (folders) and their sub-directories (sub-folders).

###  Section 2: 'Main' server configuration, set up for your PC
ServerAdmin  admin@tech.port.ac.uk
ServerName  myserver.tech.port.ac.uk
Port  80

#alter  this to where you have put your web data.
DocumentRoot  "N:/WebRoot/Roger/htdocs"

#  First, we configure the "default" to be a very restrictive set of
#  permissions.  Note the punctuation!  Check with apache -t
#  apache -h will give you a list of all of the qualifiers.
#  Should protect all of C:/ in the same way as for N:/
<Directory  "N:/">
  Options none
  AllowOverride none
  Order allow,deny
  Deny from all
</Directory>

#  This should be changed to whatever you set DocumentRoot to.
#
<Directory  "N:/WebRoot/Roger/htdocs">
  Options Indexes FollowSymLinks MultiViews
  AllowOverride none
  Order allow,deny
  Allow from all
</Directory>

Here we have the N: drive set up to deny by default. On a Unix machine just “/” would cover the whole file space, but under Windows it covers only the boot drive. As we want to protect the N:/ drive, setting it up explicitly is best, but you should then also cover C:/ explicitly. This is an example of where the integrated nature of the Unix file system is helpful. AllowOverride none establishes that the access control cannot be overridden by access control files stored in the directory (which could otherwise be used to define access control more precisely). Only directories specifically set up in the conf file are thus ‘opened’ to user scrutiny. N.B. there is no space after the comma in the Order directive: allow,deny is a single token. Try putting a space in to see what happens.
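
Covering C:/ explicitly, as suggested above, might look like the sketch below. It simply mirrors the N:/ block; it assumes that anything on C: which should be visible (such as the icons and manual directories of a default install) is opened again by a more specific Directory block of its own.

```apache
# Sketch: default-deny for the whole of C:/, mirroring the N:/ block.
# More specific <Directory> blocks (e.g. for C:/Apache/icons) are then
# needed to open up anything that should be served from this drive.
<Directory "C:/">
  Options none
  AllowOverride none
  Order allow,deny
  Deny from all
</Directory>
```

Run apache -t afterwards to confirm that the syntax is still valid.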

Exercises

Having added this section to your conf file, try a number of changes to check its implementation. Try adding a subdirectory of htdocs (say private) and set up the protection so that it has Order deny,allow followed by Deny from all and Allow from cl101.tech.port.ac.uk (or whatever your machine's name is). Can it be accessed from:

How would you test this? You would need to create the private directory, put at least one file in it (say a simple test file called index.html), and then try to reach the directory. Unless you modify one of the existing website pages to point to it, you will not be able to follow a link; just type in the full path as a URL, e.g. http://cl101.tech.port.ac.uk/private
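
As a starting point for the exercise, the block for the private directory could be sketched as follows. The path assumes the DocumentRoot used in this practical, and the host name is the example one from the text; substitute your own.

```apache
# Sketch for the exercise: with Order deny,allow the Deny rules are
# evaluated first and the Allow rules second, so a host that matches
# both is allowed. Every other host is refused.
<Directory "N:/WebRoot/Roger/htdocs/private">
  Order deny,allow
  Deny from all
  Allow from cl101.tech.port.ac.uk
</Directory>
```

Because the Allow rule names a host rather than an IP address, Apache has to look up the name of each client address, so the result also depends on your DNS being set up correctly.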

In order to really test whether the protection is working as you expect (i.e. forbidding access correctly), you need to examine the logs. See the next section if you do not have the error log configured yet.

Log files

In the lecture we looked at the form of the various log files. These are going to be vital in determining whether the server is behaving as you expect. There are several different forms of log file: see the Apache documentation for more information.  

Log file configuration

The section of the conf file below illustrates log file configuration:

ErrorLog  logs/error.log
LogLevel warn

LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
LogFormat "%h %l %u %t \"%r\" %>s %b" common
LogFormat "%{Referer}i -> %U" referer
LogFormat "%{User-agent}i" agent

CustomLog  logs/access.log common
CustomLog  logs/referer.log referer
CustomLog  logs/agent.log agent

The default conf file only offers one of the log files - here three are defined and held in access.log, referer.log and agent.log. These three files will be created (if they do not already exist) in the logs directory. NB: Remember that if you do not give a fully qualified path, then Apache will prepend the ServerRoot to generate a fully qualified path. Be specific if you want your logs in the N:\WebRoot\Roger directories. The error.log file collects server errors – do look periodically; you may be surprised!
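
If you want the logs kept alongside your own web data, fully qualified paths remove any dependence on ServerRoot. The lines below are a sketch, assuming that a logs directory already exists under N:/WebRoot/Roger.

```apache
# Sketch: fully qualified log paths (the logs directory is assumed to exist).
ErrorLog "N:/WebRoot/Roger/logs/error.log"
CustomLog "N:/WebRoot/Roger/logs/access.log" common
CustomLog "N:/WebRoot/Roger/logs/referer.log" referer
CustomLog "N:/WebRoot/Roger/logs/agent.log" agent

# Alternative: a single file holding access, referer and agent details,
# using the "combined" format defined earlier. Uncomment to try it in
# place of the three CustomLog lines above.
#CustomLog "N:/WebRoot/Roger/logs/access.log" combined
```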

Log file examples

Here is a section of the referrer log:

-  -> /
http://meshedfrog.ranvilles/  -> /index.html
http://meshedfrog.ranvilles/index.html  -> /intro.html
http://meshedfrog.ranvilles/index.html  -> /banhom.htm
http://meshedfrog.ranvilles/intro.html  -> /images/paper.jpg
http://meshedfrog.ranvilles/banhom.htm  -> /images/butadmin.GIF
http://meshedfrog.ranvilles/banhom.htm  -> /images/buthome.GIF 

Here is an extract from the error log file:

[Sun Mar 03 12:20:49  2002] [error] [client 192.168.27.58] File does not exist:  c:/apache/roger/htdocs/server_status
[Sun Mar 03 12:20:55  2002] [error] [client 192.168.27.58] File does not exist:  c:/apache/roger/htdocs/server_status/
[Sun Mar 03 12:21:20  2002] [error] [client 192.168.27.58] File does not exist:  c:/apache/roger/htdocs/server-status/
[Sun Mar 03 12:21:23  2002] [error] [client 192.168.27.58] File does not exist:  c:/apache/roger/htdocs/server-status 

And from the agent.log - not very interesting for a single PC experimental setup:

Mozilla/4.0  (compatible; MSIE 5.5; Windows NT 5.0)
Mozilla/4.0  (compatible; MSIE 5.5; Windows NT 5.0) 

To what use can you put this information? How will it help locate errors, locate client problems, tune the server performance etc?

Use of index pages

Sometimes it is possible to set Apache running and be presented with a directory listing (with or without icons) instead of the expected “It worked” message. The reason is that the conf file did not tell Apache which files to treat as index pages.

By default, if Apache does not find an index file (usually index.html or index.htm), it will display a directory listing. For security reasons, many people turn this feature off, or arrange to have an index file in each directory.  

Options directive

In any directory block directive, the Options directive can be used with +Indexes or -Indexes as a parameter, viz.:

Options -Indexes

Having done this, any browser hitting the directory gets a "forbidden" error. This is how IE renders it.

Beware: the way in which Options settings that apply to the same directory are combined is not always obvious. It is best to have a single statement; see Laurie (1999, pp68-69).
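
The sketch below illustrates the difference between relative (+ or -) and absolute Options settings. The private directory is the one suggested in the exercises; the reports directory is purely hypothetical.

```apache
<Directory "N:/WebRoot/Roger/htdocs">
  Options Indexes FollowSymLinks MultiViews
</Directory>

# Relative setting: starts from the inherited options and removes
# Indexes, leaving FollowSymLinks and MultiViews in force.
<Directory "N:/WebRoot/Roger/htdocs/private">
  Options -Indexes
</Directory>

# Absolute setting (no + or -): replaces the inherited set completely,
# so only Includes is in force here. "reports" is a hypothetical name.
<Directory "N:/WebRoot/Roger/htdocs/reports">
  Options Includes
</Directory>
```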

Which file is an index?

In order to tell Apache what files to treat as index files you need to use the DirectoryIndex directive:

#  DirectoryIndex: Name of the file or files to use as a pre-written HTML
#  directory index.  Separate multiple  entries with spaces.
#
<IfModule  mod_dir.c>
  DirectoryIndex index.html index.htm
</IfModule>

Note that this includes a test to see if the “mod_dir.c” module has been enabled; see beginning of the practical.

A space delimited list of all files that should be treated as indexes should follow the DirectoryIndex directive. Try your web server both with and without a suitable index file – using index.html and index.htm caters for both Unix and Windows based traditions. Add whatever else you think might be a good idea, home.html or welcome.html, for example.
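
For instance, an extended list might be sketched as below. Apache tries the names from left to right and serves the first one that exists in the requested directory; home.html and welcome.html are just the suggestions made above.

```apache
<IfModule mod_dir.c>
  # Tried in order: index.html first, welcome.html last.
  DirectoryIndex index.html index.htm home.html welcome.html
</IfModule>
```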

Alias directives

Alias directives can be used to hide the exact location of special directories (including, as we will see later, the CGI program directories) or to make complex paths easier for the web user. For example:

<IfModule  mod_alias.c>
# Note that if you include a trailing / on  fakename then the server will
# require it to be present in the URL.  So "/icons" isn't aliased in this
# example, only "/icons/".  If the fakename is slash-terminated, then the 
# realname must also be slash terminated,  and if the fakename omits the 
# trailing slash, the realname must also  omit it.
#
Alias /icons/ "C:/Apache/icons/"
<Directory  "C:/Apache/icons">
  Options Indexes MultiViews
  AllowOverride None
  Order allow,deny
  Allow from all
</Directory>

Alias /manual/  "C:/Apache/htdocs/manual/"
<Directory  "C:/Apache/htdocs/manual">
  Options Indexes FollowSymlinks MultiViews
  AllowOverride None
  Order allow,deny
  Allow from all
</Directory>
</IfModule>
# End of aliases

The above aliases are those set up by a default install on a Windows system. In particular, they define the location of the HTML documentation files. Having set up a new DocumentRoot (e.g. N:\WebRoot\Roger\htdocs), the alias enables the user to enter http://cl101.tech.port.ac.uk/manual/ and be greeted by the manual index page (assuming you have permitted index pages). See the example below.

Note that if you define an alias with a trailing slash, then this must be used by your user!

As an example of the alias in use, the user (me!) entered the request for the manual directory.

The response was the manual index page, even though the directory is not under the document root! Caution – this does require you to have thought about the security features of this directory, or you may still not get it!

Try the effect of commenting out the <Directory xxx> block directive.
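
The trailing-slash rule noted earlier can be tried with an alias of your own. The fragment below is a sketch only: the URL /notes/ and the directory N:/WebRoot/Roger/notes are hypothetical names chosen for illustration.

```apache
<IfModule mod_alias.c>
  # Hypothetical alias: /notes/ and anything beneath it are served from
  # N:/WebRoot/Roger/notes/. A request for /notes (no trailing slash)
  # is NOT aliased and is looked up under DocumentRoot instead.
  Alias /notes/ "N:/WebRoot/Roger/notes/"

  # The target lies outside DocumentRoot, so the default-deny block for
  # N:/ applies to it until it is opened explicitly.
  <Directory "N:/WebRoot/Roger/notes">
    Options Indexes
    AllowOverride None
    Order allow,deny
    Allow from all
  </Directory>
</IfModule>
```

Request both forms of the URL and compare the entries that appear in the access and error logs.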

Server status and server info

The last two directives to consider involve the use of two non-default modules, specifically mod_status.so and mod_info.so. The following two lines need to be included before the block of AddModule commands set out earlier.

LoadModule  status_module modules/mod_status.so
LoadModule  info_module modules/mod_info.so

The next step is to uncomment the relevant entries in the AddModule block, viz remove the # from:

#AddModule mod_status.c
#AddModule mod_info.c 

These two modules permit the web user to ask for two pseudo-directories, /server-info and /server-status, as in the screen shots below. The server status is fairly brief by default, but can be extended with ExtendedStatus On.

In order to activate these two modules you need to include:

# Allow server status reports, with the URL of http://servername/server-status
<Location /server-status>
  SetHandler server-status
  Order deny,allow
  Deny from all
  Allow from tech.port.ac.uk
</Location>

# Allow remote server configuration reports, with the URL of
# http://servername/server-info (requires that mod_info.c be loaded).
<Location /server-info>
  SetHandler server-info
  Order deny,allow
  Deny from all
  Allow from tech.port.ac.uk
</Location>

The Allow from tech.port.ac.uk limits access to these potential security risk summaries to hosts on the local domain.
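
The extended status report mentioned earlier is switched on with a single server-wide directive. A minimal sketch, wrapped in a test so that it is ignored if mod_status has not been enabled:

```apache
# ExtendedStatus applies to the whole server, so it belongs outside
# the <Location> blocks.
<IfModule mod_status.c>
  ExtendedStatus On
</IfModule>
```

Reload /server-status before and after the change to see what extra information is reported.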

Web security

Again we will not have time to cover this in detail now, but the following areas are important, and all need to be addressed to achieve a secure web service.

Browser Security – All current web browsers have some security loopholes. These are very difficult to control from the server provider's viewpoint, as the browsers are maintained by a totally autonomous set of people, the general public, over which you have no control.

Web Server Security – Configuring your web server to maximise the security features is clearly desirable. However, it usually conflicts with ‘ease of use’ in some way or other, so needs a careful balance.

NOS Security – The security features of the underlying network operating system need to be set up properly.   It is of little use to set up and enforce tight security in the protocols (e.g. the encrypted data packets) only to leave the credit card details in an insecure directory on the network server.   Separation of functions is usually a basic starting point, each machine dedicated to one function, either a web server or a database server, but never both.

Protocol Security – This relates to the use of secure protocols to send data over the public Internet. It is often invoked via SSL (Secure Sockets Layer), but involves a range of security services. Using a secure protocol will only result in a secure system if all of the above features are addressed. Encryption on its own is not a magic bullet.

References

Ben Laurie and Peter Laurie
Apache: The Definitive Guide (2e)
O’Reilly, 1999
ISBN: 1565925289

Mark Arnold, Jeff Almeida & Clint Miller
Administering Apache
McGraw-Hill, 2000
ISBN: 0072122919

Peter Wainwright
Professional Apache
Wrox, 1999
ISBN: 1861003021

Ben Laurie and Peter Laurie
Apache: The Definitive Guide (3e)
O’Reilly, 2003
ISBN: 0596002033

 

Last updated by Prof Jim Briggs of the School of Computing at the University of Portsmouth