CASCON '97 Workshop Report: Software Architectures (DRAFT)

Homayoun Dayani-Fard and Ivan Kalas
IBM Centre for Advanced Studies
homy@vnet.ibm.com

Abstract

During CASCON '97 we chaired a full-day workshop on software architectures. Software architectures are becoming an integral part of software engineering. In this paper we provide background on software architectures, the problems facing developers, and the solutions that different researchers have proposed. We then describe our focus and goals in putting together this workshop, the views offered by the invited speakers, and finally the highlights of the panel discussion.

1. Background

Software systems are becoming increasingly large and complex. In today's software industry, reports of millions of lines of code scattered across thousands of files and written in different languages are not uncommon. These software systems have had long lifetimes and continue to be an integral part of modern businesses. They have evolved over decades, have been maintained through changes in system requirements and technology, and continue to change. In many cases, the original designers of the systems are no longer available. The challenge facing the software industry is twofold. First, how do we understand these large and complex software systems? The documentation of a system is rarely kept up to date, and many programmers consider the code itself to be the most reliable source of information about a system. Second, given the life expectancy of software systems and the complexity of their requirements, how do we build such complex systems? Building software systems from scratch is rapidly becoming an obsolete practice. Software developers need to (re)use existing software artifacts to build new software systems. However, composing existing software artifacts is itself a challenging problem, and a universally applicable approach has so far eluded us. One answer to these problems is the use of software architectures. Software architectures provide a higher level of abstraction of software systems, focusing on the interactions among modules, their structure, and their composition.

Software architectures have long been part of software engineering folklore. Designers have used informal patterns, prose, and box and arrow diagrams to convey design intentions. These diagrams and prose, though appealing to the intuition of designers and programmers, have imprecise semantics. In recent years, many researchers and developers have tried to formalize the notion of software architectures to improve understandability, better communicate design intentions, and provide a basis for analysis of the design.

However, to do so, several questions need to be answered. First, what is a software architecture? How can it be documented and communicated to others? Second, how can we determine the architecture of an existing system? What tools and methodologies are required for detecting the architecture of an existing system? Third, does every system have an architecture? Can a system have more than one architecture? What information should an architecture capture? Fourth, what are the realistic ways to maintain architectural documents? What tools can we use to keep such documents up to date in a form that programmers will actually use?

In this workshop, we focused on some of these questions. In the next section, we present a list of questions that we put forward to our participants. Later, we provide the highlights of what our invited speakers offered and the discussions that followed their presentations.

2. Goals of Workshop

The Consortium for Software Engineering Research (CSER) is a Canadian collaboration among industry, universities, and government to advance software engineering practices in Canada. Currently, one of the main thrusts of CSER research is legacy software systems. These are large systems that have outlived their original designers. The information that can be gathered about these systems is, in most cases, of little use and requires extensive refinement. To better understand such systems we need to determine their architectures. However, this undertaking requires finding answers to some challenging questions. For this reason, we invited a number of researchers from outside CSER, a number of researchers and practitioners from CSER, and a number of developers from outside CSER for an exchange of ideas, problems, and possible solutions. To focus the workshop, we formulated a number of questions that have been the topic of discussion in our community (CSER) and put them forward to the participants of the workshop.

We invited four speakers to provide an overview of their research. Later, we invited a number of practitioners and researchers to very briefly state their positions and put forward their questions to our speakers. In the next section, we provide highlights of the presentations of our speakers. In the subsequent section, we provide highlights of the discussion that followed these presentations.

3. Presentations

The workshop was divided into two sessions. In the morning, four speakers from different backgrounds provided their views, or those of their groups, on software architecture issues. Robert Allen, from IBM, provided his views and those of his colleagues at Carnegie Mellon University. Jeromy Carriere presented the work that he and Rick Kazman have been doing on architecture extraction. Similarly, Ric Holt, from the University of Waterloo, shared his views and experiences on program understanding and software landscapes. Finally, Morven Gentleman, from the National Research Council of Canada, provided an overview of the challenges of building large scientific software systems. In the following subsections we present excerpts from each presentation.

3.1 Modeling and Analysis of Software Architectures
Robert Allen

Allen motivated his presentation by pointing out the history of software architectures and the differences between architectural modeling and programming. Architectural modeling focuses on the structure of the system, the interactions among its components, and their composition, whereas programming focuses on implementation, that is, the computations carried out by the system. Furthermore, architectural models are declarative, while programs are operational.

The benefits of architectural modeling are that it clarifies design intentions, provides a basis for analyzing system properties, and improves understandability by providing a system-level description of the software. As a result, an architectural model must describe the components of the system, their connectors, and their properties. Allen provided the following definitions for software architecture vocabulary. A component is ``the locus of computations,'' connectors are ``the interactions among these components,'' and properties are ``information for construction and analysis of the overall system.'' Allen further pointed out another aspect of architectural modeling: the notion of style. Different systems may share common design patterns such as topology and constraints on components and connectors. Such systems share component types and configuration constraints, and it may be possible to share their analysis as well. Typical examples of architectural styles are pipe and filter, abstract data type, event-based, and client-server.
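To make this vocabulary concrete, the following is a minimal sketch of how components, connectors, properties, and a style constraint might be recorded as data. The class names, the `kind` labels, and the `conforms_to_pipe_and_filter` check are illustrative assumptions, not part of any notation presented at the workshop.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Component:
    """A locus of computation (hypothetical representation)."""
    name: str
    kind: str  # e.g. "filter"


@dataclass(frozen=True)
class Connector:
    """An interaction between two named components."""
    kind: str  # e.g. "pipe"
    source: str
    target: str


@dataclass
class Architecture:
    components: List[Component]
    connectors: List[Connector]
    # Free-form information for construction and analysis.
    properties: Dict[str, str] = field(default_factory=dict)


def conforms_to_pipe_and_filter(arch: Architecture) -> bool:
    """Style check: only filters, joined only by pipes between known components."""
    names = {c.name for c in arch.components}
    only_filters = all(c.kind == "filter" for c in arch.components)
    only_pipes = all(
        k.kind == "pipe" and k.source in names and k.target in names
        for k in arch.connectors
    )
    return only_filters and only_pipes


example = Architecture(
    components=[Component("shift", "filter"), Component("sort", "filter")],
    connectors=[Connector("pipe", "shift", "sort")],
    properties={"throughput": "unanalyzed"},
)
```

A style, in this reading, is simply a predicate over configurations: any architecture that satisfies the predicate can reuse the analyses developed for that style.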

Allen presented four different styles for an example system, Keyword in Context: shared memory, abstract data types, event-based, and pipe and filter. He pointed out that each style has certain properties that, depending on the design intentions, may make it more appropriate than the others. For example, the shared memory solution is very efficient but hard to understand and not easy to modify. The abstract data type solution provides a better understanding of the system but is not as efficient as the shared memory solution. The event-based solution is neither efficient nor easy to understand, but it makes the system very easy to modify. Finally, the pipe and filter solution is easily extensible with new functionality and allows existing filters to be reused. Hence, all four solutions provide the same functionality but have different non-functional properties due to their architectural styles.
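As an illustration of the pipe and filter style applied to Keyword in Context, the sketch below chains two filters using Python generators. It is our own minimal rendering of the style, not the solution shown in the presentation; the function names are assumptions.

```python
from typing import Iterable, Iterator, List


def circular_shifts(lines: Iterable[str]) -> Iterator[str]:
    """Filter 1: emit every circular shift of the words in each input line."""
    for line in lines:
        words = line.split()
        for i in range(len(words)):
            yield " ".join(words[i:] + words[:i])


def alphabetize(shifts: Iterable[str]) -> Iterator[str]:
    """Filter 2: emit the shifts in case-insensitive alphabetical order."""
    yield from sorted(shifts, key=str.lower)


def kwic(lines: Iterable[str]) -> List[str]:
    """Compose the filters; the 'pipe' is the iterator passed between them."""
    return list(alphabetize(circular_shifts(lines)))


index = kwic(["Pipes and Filters"])
# index == ["and Filters Pipes", "Filters Pipes and", "Pipes and Filters"]
```

Each filter knows nothing about its neighbours, so a new filter (for example, one that drops shifts starting with a stop word) can be inserted between the two without changing either of them.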

Allen continued his talk by pointing out that box and arrow diagrams are not sufficient for representing architectures effectively. We need to augment these documents with descriptions of the interfaces, behavioral specifications, etc. This is the basis of the work done on architectural description languages (ADL's). Allen provided a brief survey of different ADL's and a more detailed description of his work on the Wright architectural description language. Wright is based on communicating sequential processes, a language designed for describing patterns of communication in distributed systems. Allen concluded by pointing out that ADL's, like programming languages, have different strengths and weaknesses, which make them more appropriate for some applications than for others.

3.2 Architectural Extraction and Conformance
Jeromy Carriere

Carriere provided a brief description of his group's work at the Software Engineering Institute on architectural extraction. The main theme of Carriere's talk was reasoning about existing systems based on two observations:

As a result, it is necessary to be able to analyze architectural properties of an existing system, re-document the architecture, and redefine architectural properties for reuse.

Carriere pointed out that architectural elements are not explicitly represented in the source code. Using reverse engineering tools, program understanding, and domain engineering, it is possible to extract different types of information from the existing system. Different tools provide different views; these views are not exclusive and they can improve one another. This observation indicates that no single set of tools is sufficient for architectural extraction. Carriere suggested that tools must be lightweight so that they can communicate with other tools. Carriere further provided an outline of the Dali Workbench. The main parts of Dali are view extraction, repository, view fusion, and representation. View extraction involves extracting static and dynamic elements from the system through parsing of the source code, lexical analysis, and profiling. The result of view extraction, the extracted view, can then be stored in the repository, which is an SQL database. After a collection of views has been extracted, fused views can be defined. Carriere emphasized that the process of architectural extraction is not intended to be fully automated. Rather, the process is interactive, and tools provide mechanisms for querying the extracted information, pattern matching, and direct manipulation. The representation process involves visualization, external manipulation, and analysis.
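The idea of storing extracted views in an SQL repository and then fusing them can be sketched as follows. This is not Dali's schema or query set: the table names (`static_calls`, `dynamic_calls`), the sample facts, and the `fused_view` function are hypothetical, chosen only to show a static view (from parsing) and a dynamic view (from profiling) being combined in one query.

```python
import sqlite3
from typing import List, Tuple


def build_repository() -> sqlite3.Connection:
    """Create an in-memory repository holding two extracted views."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE static_calls (caller TEXT, callee TEXT);
        CREATE TABLE dynamic_calls (caller TEXT, callee TEXT, hits INTEGER);
        """
    )
    # Static view: call relations found by parsing the source (made-up data).
    conn.executemany(
        "INSERT INTO static_calls VALUES (?, ?)",
        [("main", "parse"), ("main", "report"), ("parse", "lex")],
    )
    # Dynamic view: call relations observed while profiling (made-up data).
    conn.executemany(
        "INSERT INTO dynamic_calls VALUES (?, ?, ?)",
        [("main", "parse", 1), ("parse", "lex", 40), ("main", "plugin_hook", 3)],
    )
    return conn


def fused_view(conn: sqlite3.Connection) -> List[Tuple[str, str, int, int]]:
    """Fuse both views: one row per call edge, flagged by the view(s) it came from."""
    rows = conn.execute(
        """
        SELECT s.caller, s.callee, 1 AS in_static,
               EXISTS (SELECT 1 FROM dynamic_calls d
                       WHERE d.caller = s.caller AND d.callee = s.callee) AS in_dynamic
        FROM static_calls s
        UNION
        SELECT d.caller, d.callee, 0, 1
        FROM dynamic_calls d
        WHERE NOT EXISTS (SELECT 1 FROM static_calls s
                          WHERE s.caller = d.caller AND s.callee = d.callee)
        """
    ).fetchall()
    return sorted(rows)
```

In this toy data the fused view exposes an edge seen only at run time (`main` to `plugin_hook`) and one seen only in the source (`main` to `report`), which is the kind of mutual improvement between views described above. Interpreting such differences is left to the analyst, in keeping with the interactive process Carriere described.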

The goal of architectural extraction is to evaluate different properties of the system architecture: the deviation of the extracted architecture from the intended architecture, impact analysis for modification, security issues, and performance analysis.
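One simple way to express the deviation of an extracted architecture from the intended one is to compare the two as sets of dependencies between subsystems. The sketch below assumes both architectures have already been reduced to such edge sets; the function name and the category labels are ours, not terminology from the talk.

```python
from typing import Dict, Set, Tuple

Edge = Tuple[str, str]  # (from_subsystem, to_subsystem)


def compare_architectures(intended: Set[Edge], extracted: Set[Edge]) -> Dict[str, Set[Edge]]:
    """Classify each dependency by whether it is intended, extracted, or both."""
    return {
        "conforming": intended & extracted,  # designed and actually present
        "unexpected": extracted - intended,  # present in the code, not in the design
        "missing": intended - extracted,     # in the design, not found in the code
    }


intended = {("ui", "logic"), ("logic", "storage")}
extracted = {("ui", "logic"), ("ui", "storage")}
report = compare_architectures(intended, extracted)
# report["unexpected"] == {("ui", "storage")}: the UI bypasses the logic layer.
```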

3.3 Documenting Architecture of Legacy Software Systems
Ric Holt

Holt discussed his collaboration with IBM on the Software Bookshelf project and his views and experience with large-scale software systems. He pointed out the reality of large-scale software systems: millions of lines of code scattered across thousands of files, sometimes implemented in different languages. Holt stated the importance of software architectures for solving the problems of understanding and communication in large systems. He praised the research community's focus on software architectures, in particular the work of Shaw and Garlan. However, he cautioned that software architectures must solve development problems. In many cases, PhD students spend years working on research problems that may not have a significant impact on development. Furthermore, most PhD students do not take jobs as developers.

The work of Holt's group is somewhat similar to Carriere's. In summary, using parsers and lexical analyzers, facts are extracted from the existing system and stored in a repository. These facts are later augmented with layout information and are used for visualizing the structure of the system. In large systems, these facts are so numerous that the diagrams are generally cluttered and rendered useless. Holt briefly pointed out ways of restructuring these cluttered diagrams: by promoting data-flows from variables and procedures to files, and by interviewing the original designers. The result of this process is an architecture of the existing system that can be queried, navigated, and annotated. He provided several examples, including an IBM compiler, a Unix-like operating system, and an election system. He also reported that in some cases the creation of an architecture can take up to one man-month.
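Promoting low-level facts to the file level can be sketched as a small relational operation. The example below assumes two kinds of extracted facts, procedure-level references and a containment map from each procedure or variable to its file; the data and the `lift_to_files` name are hypothetical and are not taken from the tools Holt described.

```python
from typing import Dict, Iterable, Set, Tuple

Fact = Tuple[str, str]  # (source_entity, target_entity)


def lift_to_files(facts: Iterable[Fact], contained_in: Dict[str, str]) -> Set[Fact]:
    """Replace each entity by its containing file and drop intra-file edges."""
    lifted: Set[Fact] = set()
    for src, dst in facts:
        src_file = contained_in[src]
        dst_file = contained_in[dst]
        if src_file != dst_file:
            lifted.add((src_file, dst_file))
    return lifted


# Made-up facts: which procedure uses which procedure or variable.
uses = [("main", "parse"), ("parse", "next_token"), ("parse", "token_buf"),
        ("next_token", "token_buf")]
contained_in = {"main": "main.c", "parse": "parser.c",
                "next_token": "lexer.c", "token_buf": "lexer.c"}

file_level = lift_to_files(uses, contained_in)
# Four low-level facts collapse into two file-level edges.
```

The same operation can be repeated with a file-to-subsystem map to obtain a still coarser diagram, which is one way cluttered pictures of large systems become readable.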

Holt emphasized his position that a system must have one architecture. This way, it is easier for developers to communicate based on a common architecture. The architectures shown were reported to have been received by developers as a useful tool. In one case, Holt reported that the design architecture and the as-built architecture of the system were almost identical. However, systems evolve and their architectures change over several releases. It is common knowledge that software documentation is, at some point in its lifetime, out of date and as a result useless. Holt pointed out that the same fate may await software architectures. Long, detailed architectural descriptions do not pay off, and keeping them up to date becomes an added burden on development time. Holt concluded by suggesting that architectures must provide high payoffs. They must be small, slick, and convey just the right information. Furthermore, we need to automate the process of updating the existing architecture as much as possible.

3.4 Issues of Software Architecture for Large Scale Scientific Computation
Morven Gentleman

Gentleman sympathized with the other speakers and presented a different problem to the audience. He focused his presentation on the issues involved in the development and evolution of very large scientific software systems. Typical research in mainstream software architecture does not address the issues of scientific computation. These systems are large, have a long lifetime (over 25 years in some cases), and most importantly they change owners over their lifetime.

Scientific computation provides analysis of complex models of physical systems. The analysis and data capture result in massive amounts of data that sometimes must be retrieved again. As a result, the systems are large. Further, components of scientific software systems are, in most cases, procured from other sources; they run on different platforms and under different operating systems. The integration of these components forces the system into network computing. Lastly, some scientific software is no longer supported by its original designers, who have either abandoned their scientific task or are no longer interested in the problem area. How can we understand these systems?

Gentleman concluded by pointing out some of the challenges of scientific computation:

The current work on software architectures fails to fully address the issues of impact analysis, performance issues, and the development time.

4. Panel Discussion

In the afternoon, we invited ten researchers and practitioners, including the morning speakers, to form a panel, present their positions on software architectures and on the status of current research and practice, and discuss the concepts and ideas that were presented in the morning. The discussion focused on the questions that we had formulated beforehand: definitions of architectures and their uniqueness, extraction of architectures from existing systems, languages and formalization for describing architectures, and architectural issues in software evolution.

4.1 Definitions of Software Architectures

Lamb suggested that software systems are collections of parts. Software architectures describe what these parts are and how they are put together. The parts and their relationships can be viewed differently. As a result, depending on the issues of interest, there can be multiple architectures. Further, since there is no physical representation of software, the architectures are not objective. Mueller agreed with Lamb's views and provided some extensions. Mueller suggested that software architectures suppress some detailed information and highlight other types of information. In other words, we choose what to see in a software system and what to abstract away. Different users ask different questions about the software system.

Holt strongly rejected the idea of multiple architectures. He claimed that the architecture of a software system must be a means of communication. Hence, it is a global document that is referenced by all those interested in the software system. If a group of users has other concerns, then they can have multiple local architectures. Lung agreed with Holt's views and refined his definition. Lung stated that there is one architecture of the system with multiple views. The common agreement among the panel members was that software architectures are a means of communication among designers. Software architectures must present the essential features of software systems and abstract away the others. On the issue of multiple architectures, the differences were not essential.

4.2 Software Architecture Extraction

The issue of determining the architecture of a software system was of utmost importance to the panel. Architectural extraction is a crucial task in software evolution, which is described in detail in a subsequent section. However, architectural extraction is also an important task in its own right. Gray was one of the first people on the panel to tackle, indirectly, the issue of architectural extraction. He stated that one of the main problems in industry is the rapid change in the requirements of software systems. To effectively modify a software system to reflect these changes, we need to know the architecture of the software and what parts of the system will be affected. His views were echoed by Leathers. Over the years, software systems have grown beyond recognition due to repeated modifications and added features. As a result, we need to perform structural analysis on these systems by extracting information from them.

A more interesting issue in architectural extraction was put forward by Lague and Lung: software procurement. Lague stated that the competitive nature of the telecommunication industry has forced some companies to buy software from other vendors. The problem is how to determine the ``goodness'' of the purchased software. One way to do this, as Lague suggested, is to determine the architecture of the system and perform some analysis on it. Lung also suggested that architectural extraction can reveal, in most cases, the essential properties of a software system. He further explained that his group has developed some subjective metrics to classify software systems based on their sensitivity to changes. Lastly, he raised the question of how designers of the software system can use these measurements and classifications.

On the technical side, a question was raised about automation of the extraction process. In the morning session, both Carriere and Holt had explicitly stated that they need to extract as much as they could. Allen stated that a good deal of creativity is invested in software systems and would be lost during automation. Carriere expanded on Allen's comment, explaining that their system is semi-automated so that users can contribute their interpretation of the extracted system, along with intuition and perception that are not available in fully automated systems. Lamb suggested that architectural extraction is the creation of a model: suppressing the right types of detailed information so that we can ask interesting questions about the system. Mueller further suggested that a lot of information, namely information regarding corporate knowledge, is usually lost during implementation. In short, the question is not whether we can automate the process of extraction but whether we can use the extracted information.

4.3 Architectural Representations

The next issue in software architectures is how to describe them. The common perception of software architectures is the notion of box and arrow diagrams. These diagrams are easy to understand and manipulate, and they make it easy to reason about the topology of the system. However, not all information about the system can be described using boxes and arrows. At the other extreme, ADL's provide a detailed formal specification of the architecture of a software system. Allen, the only supporter of ADL's on the panel, came under a lot of criticism. He was asked whether it would be possible to use raw mathematics to describe software architectures. In response, Allen described his previous work on the use of set theory to describe architectures, which had proved cumbersome. ADL's try to provide the mathematical foundation needed to describe software architectures.

Gentleman was the first panel member to criticize ADL's. He provided a few examples from the work that he had done over the years whose requirements are not addressed by ADL's. One of these was printer drivers. The requirements of printer drivers have changed drastically due to the existence of the GUI. For example, how can we describe the requirement that ``the user can cancel a print job in the middle of the printing and resend the job to another printer''? Next he referred to real-time systems and their non-functional requirements, such as timing and stack sizes, which cannot be determined from analysis of the code. Then he referred to dynamic loading in distributed computing. He asked how these issues can be addressed by ADL's. Allen responded to such criticisms by noting that software architecture, as we know it today, is a new field; it requires time to mature. He further admitted that ADL's today are not an off-the-shelf product. Their use is still experimental, and it takes time to become familiar with them. Different ADL's have strengths and weaknesses that are not yet fully known to the user community.

Perhaps the strongest criticism of ADL's came from the industry participants. Leathers asked the panel how he can describe the architecture of the software system that he has to build next week. Gray's earlier comment also echoed this criticism. Gray had earlier stated that telecommunication systems change rapidly. To remain in business, companies must provide new features; they cannot abandon their existing systems or rebuild them from scratch. How is it possible to use ADL's to document architectures under such market pressures?

In summary, the panel seemed to agree that box and arrow diagrams, though informal, have an intuitive meaning and, if need be, can be described using graph theory. Users of such diagrams prefer them since our minds are well trained in spatial reasoning. Despite all their advantages, box and arrow diagrams are not sufficient. There are properties that cannot be described using box and arrow diagrams. ADL's, as they stand today, are nice research projects; however, their use in industrial development requires a higher payoff.
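The remark that box and arrow diagrams can be described using graph theory can be illustrated with a small sketch: boxes become nodes, arrows become directed edges, and a topology question becomes a graph traversal. The diagram and the `reachable` function below are our own illustrative assumptions.

```python
from collections import deque
from typing import Dict, List, Set


def reachable(diagram: Dict[str, List[str]], start: str) -> Set[str]:
    """Return every box reachable from `start` by following arrows (excluding `start`)."""
    seen = {start}
    queue = deque([start])
    while queue:
        box = queue.popleft()
        for neighbour in diagram.get(box, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    seen.discard(start)
    return seen


# A made-up diagram: each key is a box, each list holds the boxes its arrows point to.
diagram = {
    "ui": ["logic"],
    "logic": ["storage", "billing"],
    "billing": ["storage"],
    "storage": [],
}

depends_on = reachable(diagram, "ui")
# depends_on == {"logic", "storage", "billing"}
```

Questions about timing, stack sizes, or cancellation behaviour have no counterpart in such a graph, which is consistent with the panel's view that these diagrams are useful but not sufficient.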

4.4 Software Evolution

As we mentioned in Section 2, one of the main thrusts of CSER research is legacy software systems and their evolution. Software evolution is an industrial problem which, in our opinion, has not yet received the attention that it deserves from the research community. This sentiment was apparent in Gray's position. He described the problems that telecommunication companies face, not all of which are technical.

Gray started by outlining the features of telecommunication software systems. These systems are on the order of 30 million lines of code. They have stringent non-functional requirements: they cannot fail. The competitive nature of the telecommunication industry requires companies to offer new features and functionality. Development time is critical. These systems are generally more than twenty years old. They cannot be redesigned for every new function. Instead, software systems must evolve rapidly. It is hard to keep the architecture of the system invariant. New functions sometimes contradict old ones. For example, Gray provided the definition of a busy signal for telephones in the 1920's: when someone is talking on the phone, the line is busy for others trying to reach her. However, the definition of a busy signal is no longer the same: are you busy even when you are on the phone? Are you busy for some callers but not for others? Are you screening your calls? In other words, the new features change the basic way the system works. Gray concluded by stating that architectures must capture design assumptions, map functionality to components, and provide facilities for adding new functionality. Integrating existing systems with newer ones, modifying existing systems, and performing impact analysis are the crucial tasks that allow a company to remain alive in a competitive market.

Gray's concerns were echoed by Leathers, who is the designated architect for his company. Leathers stated that an architect is the master builder. He has the responsibility of knowing what the components are and what their capabilities are, and of determining how changes must be implemented under stringent timing constraints. He summarized his problems as how to describe the system that is being built next week, and how to determine the properties of the system that was built last week. To add to these two problems, new technologies must have high payoffs to convince management to undertake new investments.

5. Conclusion

There is always old software. Software systems have reached lifetimes beyond the expectations of their original designers. As these software systems age, their functionality changes. In some cases, they are becoming larger and more complex too rapidly to allow well-thought-out modification and redesign. As a result, understanding their structure and functionality and predicting their behavior, though difficult, is crucial. Abandoning these software systems or redesigning them from scratch is, in most cases, impossible. Today's businesses rely heavily on their legacy software systems, and their existence, in some cases, depends on these systems. For these reasons, it is imperative for us to understand existing systems, determine how new functionality is to be added and what the impacts of such changes are, and communicate these effectively to other people involved with these systems.

On the other side of the spectrum, businesses rely on acquiring software systems from third party vendors. These software systems are either integrated into other existing systems or are used as they are. This requires us to determine the ``goodness'' of the acquired software system. These problems are similar to those of legacy systems. One viable solution lies in the study of software architectures.

Software architecture is a new field. Despite their intuitive appeal, software architectures still cannot provide the high level of payoff that industrial practitioners expect. The timing constraints of software industries in competitive markets are, in some cases, very stringent. These constraints contribute to the lack of enthusiasm that some practitioners show for research results in software architectures.

During the CASCON '97 workshop we realized that the interest in software architectures is genuine. However, there is some scepticism about the academic approach. Software architectures are to be an engineering tool; they must contribute to the well-being of businesses. Here are some of the sentiments of the participants of our workshop.

As Allen stated during the workshop, despite the historical roots of software architectures, this is a new field. The success of this field rests on collaboration of industry and academia. This workshop provided a fruitful exchange of ideas between researchers and practitioners.

Acknowledgment

We would like to thank all our speakers and panel members, who shared their work and problems with us. Without their participation, our workshop would not have been as successful as it was. This report is based on our understanding of what was discussed at the workshop. All credit is due to the participants; omissions and errors are ours.

The participants of the workshop were: Robert Allen (IBM Corporation), Jeromy Carriere (Software Engineering Institute), Morven Gentleman (National Research Council), Tom Gray (Mitel Corporation), Ric Holt (University of Waterloo), Bruno Lague (Bell Canada), David Lamb (Queen's University), Bruno Leathers (Cognos), Chung-Horng Lung (Nortel), Hausi Mueller (University of Victoria), and Hakan Erdogmus (National Research Council of Canada).

References

Here are some of the sources that were referenced during the workshop or were used to prepare this report.

Allen, R. J. 1997. A Formal Approach to Software Architecture. Carnegie Mellon University. Technical Report CMU-CS-97-144.

Finnigan, P. J. et al. 1997. The Software Bookshelf. IBM Systems Journal: 36(4).

Kazman, R. and Carriere, J. 1997. View Extraction and View Fusion in Architectural Understanding. Fifth International Conference on Software Reuse.

Lague, B. et al. 1997. A Framework for the Analysis of Layered Software Architecture.

Parnas, D. L. 1972. On the criteria to be used in decomposing systems into modules. Communications of the ACM: 15(12).

Shaw, M. and Garlan, D. 1996. Software Architecture. Prentice Hall.
