Lecture 3 - Important characteristics of large-scale systems

Introduction to Software Engineering

Jim Briggs, 8 October 1998

Software evolution (Sommerville, 5th edition, chap 32)

Large software systems are not static objects. Their environment changes, and the software must adapt or become progressively less useful (and eventually be discarded). Manny Lehman (Imperial College) termed this process software evolution.

Maintenance accomplished by series of releases.

If system installed in several different places, unlikely that all installations will include all changes in each release. Furthermore, each installation may make local changes. (Each installation may have different priorities.) Consequently, versions of the system drift apart. This divergence makes future maintenance more difficult.

Lehman's laws (hypotheses derived from observation) of program evolution:

1. Continuing change: a program must change or become less useful. "Inevitability". Introducing a system into an environment changes the environment. Users in the modified environment then modify their expectations of the system.
2. Increasing complexity - degraded structure: as a program changes, its structure becomes more complex (pure structure corrupted) unless this is actively avoided. Need to spend money at some point on restructuring to make system more adaptable in future.
3. Program evolution: large systems act like an inertial mass (size inhibits changes because changes introduce new faults, new requirements, etc.). They have a dynamic of their own, established early in the development process, which maintenance management cannot override. Desired major changes are inhibited because they cost too much (perhaps beyond local budget), take time to make, and are subject to the organisational decision-making process. Result of inertia in human organisations.
4. Conservation of organisational stability: the rate of development of a program (over its lifetime) is approximately constant and independent of the resources (e.g. staff) devoted to it. Most large programming projects work in a "saturated" state - change of resources has imperceptible effect on long-term evolution. Large projects are unproductive because of communication overheads.
5. Conservation of familiarity: the incremental change in each release is approximately constant. Release with large number of changes will probably be followed quickly by another release with small number of changes to fix bugs introduced. Suggests need to interleave enhancement releases with repair releases. There is a limit to the rate at which new functionality can be introduced.

"Laws" based on observation but not (yet) validated properly. They do seem sensible, and it is probably best not to try to buck them. Laws 3-5 may not apply in smaller organisations.

Implications of above:

software maintenance costs not necessarily result of errors or omissions and cannot ever be eliminated;
management should not plan to make major changes in a single increment - adding functionality should not be attempted at the same time as curing problems;
most cost-effective way to develop software is to use as few people as possible in each group - if necessary, large system should be built from independent subsystems.

Reliability v. efficiency (Sommerville, 5th edition, chap 18)

Reliability more important than efficiency because:

Equipment is becoming cheaper and faster. Less need to maximise equipment usage over human convenience. Paradox: faster equipment leads to higher expectations by user.
Users will avoid unreliable software however efficient it is. Reputation for one unreliable product may affect sales of other products.
In some applications (nuclear reactors, avionics) cost of system failure is much greater than cost of system itself.
Efficient systems can be tuned because execution time concentrated in small sections of program. Unreliability is difficult to improve since tends to be distributed throughout system.
Inefficiency is predictable. Unreliability occurs without warning and results may not be discovered until later.
Unreliable systems may result in data being lost. Data is very expensive - worth more to company than system that processes it - hence much effort on duplicating (backup).

Measures of reliability such as "mean time between failures" do not take into account subjective importance of fault. An unimportant fault occurring frequently might not give a system as poor a reputation for reliability as a rare fault that killed someone.

Therefore, important consideration is perceived reliability.
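
The gap between a measured figure and perceived reliability can be illustrated with a small calculation. The sketch below is an assumption-laden illustration, not a method from the lecture: the failure logs are invented, the 1-10 severity scale is hypothetical, and the severity-weighted rate is not a standard metric.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Failure:
    hours_since_start: float  # when the failure was observed
    severity: int             # hypothetical scale: 1 = cosmetic, 10 = life-threatening


def mtbf(total_hours: float, failures: List[Failure]) -> float:
    """Mean time between failures: operating time divided by failure count."""
    if not failures:
        return float("inf")
    return total_hours / len(failures)


def severity_per_1000_hours(total_hours: float, failures: List[Failure]) -> float:
    """Illustrative severity-weighted failure rate (an assumption, not a standard)."""
    return 1000.0 * sum(f.severity for f in failures) / total_hours


# Invented logs over 1000 operating hours.
# System A fails often but trivially; system B fails once but gravely.
system_a = [Failure(h, 1) for h in (200, 400, 600, 800, 1000)]
system_b = [Failure(750, 10)]

mtbf_a = mtbf(1000, system_a)   # 200 hours
mtbf_b = mtbf(1000, system_b)   # 1000 hours
weighted_a = severity_per_1000_hours(1000, system_a)
weighted_b = severity_per_1000_hours(1000, system_b)
```

On MTBF alone, system B looks five times better than system A. Once severity is weighted in, B scores worse, which is closer to how users would judge the two systems.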

System reliability (in general) is dependent on correctness of design, correctness of mapping design to implementation and reliability of components. Car components wear out, therefore car cannot be 100% reliable. Software does not wear out, therefore reliability depends completely on design and implementation correctness.

Hardware reliability can be improved by duplication - can take similar approach with software. However, no point in executing identical copy of same procedure. Randell suggests that software reliability may be improved by executing a different but functionally equivalent procedure (recovery blocks).
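
A minimal sketch of the recovery block idea, under stated assumptions: the function names, the deliberately faulty primary and the sorting task are all hypothetical, chosen only to show the shape of the scheme (primary, alternates, acceptance test, restore point). It is not Randell's own notation.

```python
from collections import Counter
from typing import Callable, Sequence


class RecoveryBlockError(Exception):
    """Raised when no alternate passes the acceptance test."""


def recovery_block(
    alternates: Sequence[Callable[[list], list]],
    acceptance_test: Callable[[list, list], bool],
    data: list,
) -> list:
    """Try each alternate in turn until one passes the acceptance test."""
    for alternate in alternates:
        checkpoint = list(data)  # restore point: every attempt starts from the original state
        try:
            result = alternate(checkpoint)
        except Exception:
            continue  # a crash counts as a failed attempt
        if acceptance_test(data, result):
            return result
    raise RecoveryBlockError("all alternates failed the acceptance test")


def faulty_sort(items: list) -> list:
    # Hypothetical primary with a deliberate bug: it drops the largest element.
    return sorted(items)[:-1]


def insertion_sort(items: list) -> list:
    # Independently written alternate meeting the same specification.
    out: list = []
    for item in items:
        pos = len(out)
        while pos > 0 and out[pos - 1] > item:
            pos -= 1
        out.insert(pos, item)
    return out


def is_sorted_permutation(original: list, result: list) -> bool:
    # Acceptance test: output is ordered and holds exactly the input elements.
    ordered = all(a <= b for a, b in zip(result, result[1:]))
    return ordered and Counter(original) == Counter(result)


result = recovery_block([faulty_sort, insertion_sort], is_sorted_permutation, [3, 1, 2])
```

Here the primary's output fails the acceptance test, so the alternate is run on a fresh copy of the input and its result is returned. The scheme only helps if the alternates do not share the same fault and if the acceptance test is itself correct.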

Example of space shuttle system: four processors executing identical software; fifth (different) processor executing independently developed software.

Thus key to software reliability lies in system specification. Frequent problems with reliability requirements:

expressed in an informal, qualitative way and therefore difficult to assess whether system attains them;
incomplete: do not specify what should happen when each error condition arises.

Result: lack of perceived reliability may be result of misunderstanding between developer and customer.

Even more difficult when using exploratory programming since no specification exists. Verification techniques of no use. For this reason, exploratory programming in safety-critical environments is not recommended.

Default: work to "fail-safe" criteria. Program should:

never produce incorrect output whatever the input (no output is better than incorrect output);
never allow itself to be corrupted;
take meaningful and useful actions in unexpected situations;
only fail completely when further progress impossible, and failure should not affect other components of system.
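
The first three criteria above can be sketched in code. This is a hedged illustration only: the temperature-averaging task, the valid range and all names are hypothetical, and the fourth criterion (isolating complete failure from other components) is not shown.

```python
import math
from typing import List, Optional, Sequence


def mean_temperature(readings: Sequence[object],
                     low: float = -50.0,
                     high: float = 150.0) -> Optional[float]:
    """Return the mean reading, or None if any reading is unusable.

    No output (None) is preferred to an incorrect output.
    """
    if not readings:
        return None
    total = 0.0
    for value in readings:
        # bool is a subclass of int, so it is rejected explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return None
        if math.isnan(value) or not (low <= value <= high):
            return None
        total += value
    return total / len(readings)


class TemperatureLog:
    """Keeps accepted means; bad input never touches the stored state."""

    def __init__(self) -> None:
        self.means: List[float] = []
        self.rejected = 0

    def record(self, readings: Sequence[object]) -> bool:
        mean = mean_temperature(readings)
        if mean is None:
            self.rejected += 1  # meaningful action: count the rejection for later inspection
            return False
        self.means.append(mean)  # state changes only after validation succeeds
        return True


log = TemperatureLog()
accepted = log.record([20.0, 22.0])
refused = log.record([20.0, "sensor fault"])
```

The checks are exactly the kind of extra code that costs some efficiency in exchange for not propagating a wrong answer.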

Does use of formal methods necessarily lead to more reliable systems? Formal specification less likely to contain anomalies but opaqueness of notations may make it more difficult for customer to establish whether system is what is required; therefore, less likely that reliability criteria will be met. If cause of unreliability is environmental then even a formal proof will not ensure reliability.

High reliability inevitably involves extra, otherwise redundant code to perform checking. Hence efficiency is reduced and development costs increase.

Difficult to quantify reliability, so some organisations place constraints on development process on the assumption that adherence to standards will lead to systems whose reliability is acceptable. Danger is that process becomes more important than costs or functionality, and there is no firm evidence concerning relationship between product and process reliability.

Summary

software engineering involves technical and non-technical issues such as human factors and management;

well engineered software provides services required PLUS is maintainable, reliable, efficient and has appropriate user interface;

the waterfall model of software development suffers from inadequacies but will continue to be used;

exploratory programming is not suited to development of large, long-lifetime or safety-critical systems;

software systems have to be maintained;

most important characteristic of software is reliability since cost of failure often exceeds cost of system.