University of Portsmouth

Department of Information Systems and Computer Applications

Information Systems Postgraduate Programme

 

Lecturer:

Jim Briggs / Penny Hart

Unit Code:

DL.WPRMP

Unit Title:

Web Site Programming and Management

Date of Test:

 July 2003

Time of Test:

--

Duration of Test:

2 hours

Requirements:

None

Office Code

 

Student Groups

 

 

Rubric:

Number of Questions

Answer 5 out of 8 questions (20 marks per question). Answer at least 2 questions from each section. Answer each section in a separate answer book.

 

Mode and Restrictions

Open book - no restrictions

 

Pass mark as a percentage

40%

 

Weighting as percentage of Unit

50%

 

Students to see scenario approximately one week in advance of exam.

Scenario

 

A healthcare trust is involved in a pilot e-Healthcare project to provide the general public with a health advice web site, giving information on topics such as diet, exercise and minor health problems. The site should also include the capability to submit enquirer details to e-Health advisers for further advice and appointment booking.

Statistics are to be collected to monitor the usage of the site.

 

In addition to this, as part of the same project, the trust wishes to pilot an electronic patient record system in one of its larger hospitals, using an intranet.

 

Currently, although automated patient record systems exist in most departments of the hospital, there is no linkage between them. Therefore a patient with multiple problems may have separate records on several systems, and a full diagnosis taking all factors into account cannot be made easily.

 

The electronic patient record, stored in a central database, will collate all this information, and present a history of any treatment the patient may have had in the past. Access to data from individual departments is restricted to the staff working in those departments. The full electronic patient record is accessible only to medical consultants.

 

You are being tasked to set up a web site linked to an advice database, and a secure network for the transmission of the patient records, linked to a central record database. Confidentiality is vital for both enquiries and for patient records.

 

Section A  

 

1.      In order to maximise the use made of the on-line advice site, it is important to choose an easily remembered name for the site. The trust also wants to monitor how well it is being used.

 

(a)    What are the issues you need to take into account when naming a web site? What steps would you take to ensure that the site could be found?

(8 marks)

 

(b)   What techniques can be used to establish the site's readership? Describe any shortcomings in the current methods of readership measurement and indicate possible solutions.

(8 marks)

 

(c)    Assess the usefulness of server log files for determining how readers use the site.

(4 marks)

 

 

2.      This question is about access controls and user permissions.

 

(a)    Outline a strategy for allowing access to the web site and the intranet. What would you need to take into consideration? Would it be advisable to maintain both the web site and the intranet on the same server?

(8 marks)

(b)   What access controls would you adopt for two of the following? Express these as configuration directives.

 

(i)                  A member of the public wishing to visit the web site to submit a question.

 

(ii)                A healthcare trust statistician, who needs to view the access log data.

 

(iii)               A staff nurse needing to access a patient's personal record from the intranet.

(8 marks)

 

(c)    How would you go about setting up user permissions for the intranet?

(4 marks)

 

 

3.      It is essential that details entered by the enquirer on the web site and patient records on the intranet remain secure.

 

(a)    List the types of security problem you might experience, and suggest ways of countering them.

(10 marks)

 

(b)   How would you set up security when configuring the web server? Express this as Apache configuration directives or show how to set the controls in another web server.

(6 marks)

 

(c)    Explain why it is important to review security regularly. What other agencies might be involved in ensuring the security of your site?

(4 marks)

 

4.      If the advice web site is successful, the healthcare trust intends to publicise it to a wider audience. Therefore you need to perform capacity planning at the start of the project.

 

(a)    Describe the type of capacity planning you would carry out for the web site and indicate how you would monitor server traffic.

(8 marks)

(b)   Describe the circumstances in which the following server overload situations might occur for this web site and indicate how you would accommodate them:

(i)                  A sudden peak in site traffic

(ii)                A cumulative increase in site traffic

(8 marks)

(c)    In what ways would capacity planning for the intranet differ from capacity planning for the web site?

(4 marks)

Section B 

5.      One of Perl’s features is the ability to manipulate text strings and match patterns.

(a)    Describe situations where you might use pattern-matching operators.

(8 marks)

(b)   Write Perl substitution operations to change the following string: “Welcome to Java!\n”.

(i)                  To read “Welcome to Perl!\n”

(ii)                Then amend the string in (i) to read “Welcome Perl!\n”

(iii)               Then amend the string in (ii) to read “Hello. Welcome Perl!\n”

(6 marks)

(c)    What do the following instructions do?

(i)                  tr{/\\\r\n\b\f.}{_};  substitute underscore for non-alpha chars

(ii)                s/\s+$//;  discard trailing white space

(iii)               match expression

(6 marks)

6.      This question is about CGI programs.

(a)    Describe and explain the actions you would take to ensure that a CGI script could be run from the University’s Herring server.

(4 marks)

(b)   What environment variables are most commonly used by a CGI program? Describe how they are used and what they are used for.

(6 marks)

(c)    What is the purpose of each user's public_html directory on Herring? Explain how it contributes to the security of the server.

(4 marks)

(d)   There are various ways in which a Perl CGI program can output its HTML. Describe two of those ways and explain their advantages and disadvantages.

(6 marks)

 

7.      This question is about data structures.

(a)    What features of Perl make it possible to have data structures of more than one dimension?

(3 marks)

(b)   Each line of a text file contains a statistic in the form of a parameter:value pair. Write a Perl subroutine to extract the statistics and output them in table format. The text file name is to be supplied as a parameter to the subroutine.

(17 marks)

 

 

8.      The code in Appendix A analyses a web server’s access log and presents the results as HTML output. The format of the information in the log is:

Wed Feb 26 19:43:13 2003 - 200 - http://www.perl.com

Wed Feb 26 19:45:36 2003 - 500 - http://www.port.ac.uk

Wed Feb 26 19:54:06 2003 - 404 - http://www.del.com

Examine the code and answer the following questions (in the context of the problem being solved):

(a)    Explain what is happening in code line 9. What is meant by $ARGV[0] in line 7?

(3 marks)

(b)   Code lines 11 to 14 extract the month, result code and URL data. What does the regular expression in code line 14 achieve?

(2 marks)

(c)    What happens in lines 16 to 18? Explain the mechanism used.

(4 marks)

(d)   Indicate the format of the Site totals HTML output specified in code lines 25 to 40. In what order is the information presented?

(2 marks)

(e)    What is the meaning of code line 46?

(2 marks)

(f)     Why would the web administrator need to know the percentage of uptime for the web sites hosted by the server?

(2 marks)

(g)    Apart from the information specified above, what other things might log analyser tools measure?

(5 marks)

Appendix A

 

#!/usr/bin/perl

use warnings;

use strict;

 

my %site;

my $latest;

my $log_file = $ARGV[0];

 

open (LOG, $log_file) or die "Log file: $!";

 

while (my $line = <LOG>)

{

   my ($month, $code, $url) = $line =~ /^... (...) .+? - (\d\d\d) - (.+?)$/;

   ($latest) = $line =~ /^([^-]+) -/;

 

   $site{$url}->{total}++;

   $site{$url}->{result}->{$code}->{total}++;

   $site{$url}->{result}->{$code}->{date}->{$month}->{total}++;

}

close LOG;

 

print "<html>\n";

print "<h2>Log Analysis</h2>\n";

print "<h3>Site totals:</h3>\n";

foreach my $url (sort keys %site)

{

   my $total = $site{$url}->{total} || 1;

   print "<p><b>$url</b>: $total monitor request(s) \n";

   print "<ul>\n";

 

   foreach my $code (sort keys %{$site{$url}->{result}})

   {

      my $total = $site{$url}->{result}->{$code}->{total};

      print "<li><b>$code</b>: $total monitor request(s)</li>\n";

      print "<ul>\n";

 

      foreach my $month (sort keys %{$site{$url}->{result}->{$code}->{date}})

      {

          my $total = $site{$url}->{result}->{$code}->{date}->{$month}->{total};

          print "<li><b>$month</b>: $total monitor request(s)</li>\n";

      }

      print "</ul>\n";

   }

   print "</ul>\n";

 

   my $successes = $site{$url}->{result}->{200}->{total} || 0;

   my $uptime = sprintf("%2.2f", $successes / $total * 100);

   print "Percent uptime: <b>$uptime</b></p>\n\n";

}

my $summary_file = "/tmp/log_analysis_summary.txt";

open (SUMMARY, ">$summary_file") or die "Summary: $!";

 

foreach my $url (sort keys %site)

{

   foreach my $code (sort keys %{$site{$url}->{result}})

   {

      foreach my $month (sort keys %{$site{$url}->{result}->{$code}->{date}})

      {

         my $total = $site{$url}->{result}->{$code}->{date}->{$month}->{total};

         print SUMMARY "$url - $code - $month - $total\n";

      }

   }

}

 

print SUMMARY "Latest: $latest\n";

 

close SUMMARY;

 

print "</html>\n";