Webserver use, configuration and management

Unit WUCM1

Identifying Requirements

Requirements engineering

Requirements engineering is a general process needed by all computing projects. Much of what is set out as the needs of any software development project is equally applicable to a webserver/website development problem. Broadly, the activities set out by Sommerville (1997) are still needed in a web development project:

In support of the above activities, some kind of management process is needed to oversee and control the timely delivery of outputs.

Kotonya (1998) gives the following illustrative example of some of the requirements for a library system.

  1. "The system shall maintain records of all library materials, including books, serials, newspapers and magazines, video and audio tapes, reports, collections of transparencies, computer disks and CDROMs.
  2. The system shall allow users to search for an item by author, title or ISBN.
  3. The system’s user interface shall be implemented using a web browser.
  4. The system shall support at least 20 transactions per second.
  5. The system facilities which are available to public users shall be demonstrable in 10 minutes or less."   Kotonya (1998)

Can you spot any problems with the 5 examples given? The above examples are reasonably typical of initial requirements and illustrate five different types of requirement, viz:

  1. "General requirements – such as 1 above, which set out in broad terms what the system should do.
  2. Functional requirements – such as 2, which define part of the system’s functionality.
  3. Implementation requirements – such as 3, which state how the system must be implemented.
  4. Performance requirements – such as 4, which specify a minimum acceptable performance for the system.
  5. Usability requirements – such as 5, which specify the usability in a measurable way." Kotonya (1998)

How much of this general requirements engineering activity is applicable to the development of a webserver and website? Texts like Lynch (1999) that focus on site design include a section on planning and problem definition that can be interpreted as requirements engineering, whereas texts that concentrate on building web systems are far more likely to include a formal requirements section, e.g. Conallen (2000), who looks at using UML to develop web systems.

Is a webserver special in any way? Look for requirements engineering (or equivalent terms) in your recommended reading for other units, and ask these same questions of the points and advice you find there. Jot any interesting findings below.

Reasons for the webserver/website

Before delving into the requirements engineering issues, it is worth pondering the reasons for providing the webserver with its mounted website(s) in the first place. This will help identify some of the stakeholders and requirements questions. Much of the material in this section is ‘borrowed with permission’ from Paterson (2001). Can you add any broad categories to the following? Can you give illustrative examples?

Most websites serve more than one purpose. Most are designed to make money in one way or another.

To inform or educate

To entertain

To market, sell or persuade

To stroke someone’s ego

Owners and stakeholders

In the case of the library system used as an example by Kotonya (1998), the owners and stakeholders are reasonably easily determined. For a website the position is often less clear-cut. There is usually a clear indication of who owns the webserver: the hardware etc. is purchased and mounted at a particular location by a clear "owner". But what of the data, both content and configuration, and the software (CGI etc.) that make it work?

Consider the University website. It is mounted on two servers located in Mercantile House and James Watson Building. List who you think are the owners and separately the stakeholders.

Audience

Audience motivation

In trying to identify your audience, it is a good idea to ask the following two questions:

Knowing your audience is important as it will colour many of your requirements. Whether the biggest impact is on the non-functional ‘artistic’ website type requirements, or on the more technical webserver requirements, depends on your basic goals for the webserver/website, perhaps as clarified by the questions above.

Audience types

Who is the audience?

Site types, each with an example site:

  • Informational site: www.port.ac.uk
  • Entertainment site: www.disney.com
  • Business site: www.ibm.com
  • Non-profit site: www.aidsquilt.org
  • Ego site: www.yaboogie.com/~julie/index.html

Typical audiences, listed in the same order as the site types above:

  • Employees
  • Students
  • Information seekers
  • The curious
  • Generally younger people
  • Sophisticated web users
  • Need to be ‘up-to-date’
  • Bells & whistles
  • Current customers
  • Potential customers
  • Investors
  • Sales force
  • Competitors
  • Activists
  • Donors
  • Information seekers
  • Creator
  • Family & friends

Can you add to these broad observations?

How should you go about establishing information about your expected audience? Clearly asking them is best, but usually not possible. What sort of questionnaire might be worthwhile? Can you identify a few stereotypical users to build a model? Audience profiling using stereotypical users is a commonly used technique - see the more extended discussion in Powell (2000).

Audience questions

Questions from Powell (2000):

Basic questions about the user

  • Where are they located?
  • How old are they?
  • What gender?
  • What language do they speak?
  • How technically proficient are they?
  • What kind of connection would they have to the Internet?
  • What kind of computer would they use?
  • What kind of browser would they probably use?

What are they doing?

  • How did they get to the site?
  • What do you want them to do (from above discussion)?
  • When will they visit the site?
  • How long will they stay during a particular visit?
  • From what page(s) will they leave the site?
  • When will they return to the site (if ever)?

Where should the answers to these questions fit into your requirements gathering? How should results be recorded? Will there be an impact on the design? the content? the testing? the evaluation?

Content

Whilst identifying and structuring the content is important, we will be considering it in more detail in another session. What are the requirements issues concerning the content? There are a number of general points that should be considered under the heading of requirements, before moving on to the design issues.

In order to form some opinion on the capacity and bandwidth requirements to be discussed next, what content issues need to be addressed?

Volume of data

The volume of data to be served by the webserver clearly has a significant impact on both the hardware and software of the platform and the design issues to be addressed. It is important to get an estimate for the number of files and the average size of files, as well as the total size of the webspace. If the website will involve dynamic generation of pages, how big is the database to be used? All of these estimates will feed the capacity questions to be asked next. Another important estimate is the rate of growth of data to be served.
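The sums behind these estimates are simple enough to sketch. The following is a minimal illustration only, not a method taken from any of the referenced texts: the function names are hypothetical, and compound monthly growth is an assumption that you should replace with whatever growth pattern you actually expect.

```python
def total_webspace_bytes(file_count, average_file_bytes):
    """Estimate the total webspace from the number of files and their average size."""
    return file_count * average_file_bytes


def projected_size_bytes(current_bytes, monthly_growth_rate, months):
    """Project the data volume forward, ASSUMING compound growth per month.

    monthly_growth_rate is a fraction, e.g. 0.05 for 5% per month.
    """
    return current_bytes * (1 + monthly_growth_rate) ** months


if __name__ == "__main__":
    # Illustrative figures only: 5,000 files averaging 20,000 bytes each.
    now = total_webspace_bytes(5_000, 20_000)
    in_a_year = projected_size_bytes(now, 0.05, 12)
    print(f"Webspace now: {now / 1e6:.0f} MB")
    print(f"Webspace in 12 months at 5% per month: {in_a_year / 1e6:.0f} MB")
```

Even a rough projection like this is useful, because it shows whether the growth estimate is likely to outrun the storage you plan to buy at launch.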

‘Churn’ of data

The churn is a measure of what proportion of the total data is changed per unit of time; e.g. hour, day or month. This aspect will have an impact on the number of repeat visits by users, and will raise issues of archival storage and version control. For what sort of site is it important to be able to back track through older versions of documents accessible via the webserver? Is this likely to be a part of your requirements documentation?
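As a rough sketch of how churn might be measured, the fragment below takes a mapping of file names to sizes and the set of files changed during the period of interest, and returns the changed proportion of the total data. The function name and the sample file names are hypothetical; measuring churn by bytes rather than by number of files is an assumption, and either could be reasonable for your site.

```python
def churn_fraction(file_sizes, changed_files):
    """Return the proportion of the total data (by bytes) that changed in the period.

    file_sizes:    dict mapping file name -> size in bytes
    changed_files: set of file names modified during the period
    """
    total = sum(file_sizes.values())
    if total == 0:
        return 0.0  # an empty webspace has no churn
    changed = sum(size for name, size in file_sizes.items() if name in changed_files)
    return changed / total


if __name__ == "__main__":
    # Illustrative data only: a three-file site where one page changed today.
    sizes = {"index.html": 2_000, "news.html": 1_000, "logo.png": 7_000}
    print(f"Daily churn: {churn_fraction(sizes, {'news.html'}):.0%}")
```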

Number of ‘hits’

Whilst this is only peripherally a content issue, it is important in considering the capacity of the webserver and its network connection. For a new site, it will clearly be based on hope and expectation, rather than on any measure, but needs to be included. Collecting this data and consciously feeding it into the ongoing management process is vital.

In terms of requirements, this area might give rise to requirements such as:

In the above examples (from Conallen, 2000) the requirements specification concentrated on the server performance requirements. Does this make any presumptions about the intervening delivery system, whether a LAN-based intranet or the more diverse Internet? In terms of auditability, numbering and tracking these requirements in subsequent documents would follow the conventional software engineering model.
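One way to picture numbering and tracking is to hold each requirement as a record with an identifier and a type. The sketch below is illustrative only: it reuses two of the Kotonya (1998) library examples quoted earlier and the five requirement types listed there, while the identifiers "R2" and "R4", the class names and the lookup function are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class RequirementType(Enum):
    """The five types of requirement illustrated by Kotonya (1998)."""
    GENERAL = "general"
    FUNCTIONAL = "functional"
    IMPLEMENTATION = "implementation"
    PERFORMANCE = "performance"
    USABILITY = "usability"


@dataclass(frozen=True)
class Requirement:
    identifier: str        # hypothetical numbering scheme, e.g. "R4"
    kind: RequirementType
    text: str


REQUIREMENTS = [
    Requirement("R2", RequirementType.FUNCTIONAL,
                "The system shall allow users to search for an item by author, title or ISBN."),
    Requirement("R4", RequirementType.PERFORMANCE,
                "The system shall support at least 20 transactions per second."),
]


def find_requirement(identifier, requirements):
    """Look a requirement up by its identifier so later documents can cite it."""
    for requirement in requirements:
        if requirement.identifier == identifier:
            return requirement
    raise KeyError(identifier)


if __name__ == "__main__":
    found = find_requirement("R4", REQUIREMENTS)
    print(found.identifier, found.kind.value, found.text)
```

A design document or test plan can then refer to "R4" rather than restating the requirement, which is what makes later auditing practical.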

Capacity

Capacity planning is necessarily done before any of the system has been assembled. Webserver tuning, in comparison, is done after the initial architecture has been implemented and released to the waiting world. At this stage you have real data on which to base any performance tuning exercises. Killelea (1998) makes the very cogent point that whilst perfect capacity planning would eliminate the need for performance tuning, it is impossible to achieve as you cannot predict the behaviour of the users, even if they are a well-defined cohort (for example, your employees). The very fact of offering a new service will alter their behaviour, much the same as opening a new motorway alters the traffic flows on which the size/route of the motorway was planned.

It is vital to undertake some initial planning – i.e. do the sums based on your estimates, so as to have a view on expected webserver performance at the launch of the service, and into the future. Unplanned growth is liable to throw up significant expense, especially if any of your estimates cross a ‘scalability threshold’, and you need to completely replace a system (whether this is hardware or software).
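As an example of "doing the sums", the sketch below turns an estimated number of hits per day and an average response size into an average network bandwidth, then scales it by a peak factor. It is a back-of-envelope illustration under stated assumptions, not Killelea's procedure: the function names are hypothetical, traffic is assumed to be spread evenly over the day for the average figure, and the peak-to-average ratio is a guess you must supply yourself.

```python
SECONDS_PER_DAY = 86_400


def average_bandwidth_bps(hits_per_day, average_response_bytes):
    """Average outbound bandwidth in bits per second, ASSUMING evenly spread traffic."""
    return hits_per_day * average_response_bytes * 8 / SECONDS_PER_DAY


def peak_bandwidth_bps(average_bps, peak_to_average_ratio):
    """Scale the average by an ASSUMED peak-to-average ratio to allow for busy periods."""
    return average_bps * peak_to_average_ratio


if __name__ == "__main__":
    # Illustrative figures only: one hit per second on average, 10,000 bytes per response.
    average = average_bandwidth_bps(86_400, 10_000)
    peak = peak_bandwidth_bps(average, 5)
    print(f"Average: {average / 1000:.0f} kbit/s, assumed peak: {peak / 1000:.0f} kbit/s")
```

With these figures the average is 80 kbit/s and the assumed peak is 400 kbit/s. Comparing the peak figure with the capacity of the planned network connection shows how close the launch estimate sits to a scalability threshold.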

Killelea (1998 and 2002) discusses in detail the capacity planning issues relevant to establishing a new web system. The following is drawn from that material, and if you can locate a copy do read through chapter 2. (The library has several copies.) Killelea sets out the initial planning as a set of questions, to which he provides considerable elaboration and example, viz:

Conclusion

Greenberg (1999) discusses the full software engineering cycle for a typical small web system, addressing issues from the initial requirements issues we have looked at, through the graphical and structural issues, to the programming and implementation issues, finally ending up with the trauma of "going live" and its implications and issues. In respect of today’s topic Greenberg (1999) suggests that the requirements must:

Larson (2000) gives a brief overview of the process of requirements determination in chapter 2, together with a few self-test questions, which are worth a few moments' study.

References

Mark Arnold, Jeff Almeida, & Clint Miller
Administering Apache,
McGraw-Hill, (2000),
ISBN: 0072122919                          (Lib)

Eric Larson & Brian Stephens
Administrating Web Servers, Security and Maintenance,
Prentice Hall, (2000),
ISBN: 0130225347                          (Lib)

Patrick Killelea
Web Performance Tuning (2e),
O'Reilly, (2002),
ISBN: 059600172X                         (Lib)

Ian Sommerville and Pete Sawyer
Requirements Engineering: A Good Practice Guide
Wiley, (1997)
ISBN: 0471974447                          (Lib)

Peter Wainwright
Professional Apache
Wrox Press, (1999)
ISBN: 1861003021                          (Lib)

Gerald Kotonya and Ian Sommerville
Requirements Engineering: Processes and Techniques
Wiley, (1998)
ISBN: 0471972088                          (Lib)

Jim Conallen
Building Web Applications with UML
Addison Wesley, (2000)
ISBN: 0201615770                          (Lib)

Patrick J. Lynch and Sarah Horton
Web Style Guide
Yale University Press, (1999)
ISBN: 0300076754                          (Lib)

Pat Paterson,
WDIEM Lecture notes.
UoP, (2001)

Jeff Greenberg and J.R. Lakeland
Building Professional Websites with the Right Tools
Prentice-Hall, (1999)
ISBN: 0130843172                          (OoP, sorry)

Patrick Killelea
Web Performance Tuning,
O'Reilly, (1998),
ISBN: 1565923790

Thomas A. Powell
Web Design: The Complete Reference
Osborne, (2000)
ISBN: 0072122978

Last updated by Prof Jim Briggs of the School of Computing at the University of Portsmouth