
Application Architectures for Web-Based Data Access


David Eichmann
Repository Based Software Engineering Project
Research Institute for Computing and Information Systems
University of Houston - Clear Lake

Introduction

The Web is proving to be a powerful delivery mechanism for information resources. The leveling nature of platform-independent hypertext markup, coupled with the immediacy of client-based display, has resulted in a framework for delivering application support with unprecedented power and distribution. This position paper describes a number of architectures for structuring information systems on the Web and comments on the impact that each architecture has upon data currency, ease of development, and quality of user interaction.

Architectural Alternatives

Static Translation

This approach involves building what are effectively report generators (similar to document translators such as mif2html) that interact with an existing legacy information system and produce as output one or more HTML files. This approach is advantageous when information is relatively static, since translation need occur only once for many retrievals. However, if information changes frequently, or user input is needed to constrain or parameterize the retrieval, this approach is unsuitable.
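A static translator can be sketched in a few lines. The example below uses Python purely for illustration; the RECORDS list is a hypothetical stand-in for data exported from a legacy system, and the function names are our own assumptions rather than part of any existing tool.

```python
import html
import tempfile
from pathlib import Path

# Hypothetical stand-in for records exported from a legacy system.
RECORDS = [
    {"id": "a1", "title": "Sort & Search Library", "summary": "Reusable routines."},
    {"id": "b2", "title": "Matrix Package", "summary": "Dense linear algebra."},
]


def render_page(record):
    """Translate one record into a complete HTML document."""
    title = html.escape(record["title"])
    summary = html.escape(record["summary"])
    return (
        "<html><head><title>%s</title></head>\n"
        "<body><h1>%s</h1><p>%s</p></body></html>\n" % (title, title, summary)
    )


def render_index(records):
    """Build an index page linking to every translated record."""
    items = "".join(
        '<li><a href="%s.html">%s</a></li>\n'
        % (html.escape(r["id"]), html.escape(r["title"]))
        for r in records
    )
    return "<html><body><ul>\n%s</ul></body></html>\n" % items


def translate_all(records, out_dir):
    """Write one file per record plus an index; return the file names."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    names = []
    for record in records:
        name = record["id"] + ".html"
        (out / name).write_text(render_page(record), encoding="utf-8")
        names.append(name)
    (out / "index.html").write_text(render_index(records), encoding="utf-8")
    names.append("index.html")
    return names


def demo():
    """Run the translation once into a throwaway directory."""
    with tempfile.TemporaryDirectory() as out_dir:
        return translate_all(RECORDS, out_dir)
```

The translation runs once, ahead of time; the resulting files are then served like any other static document, which is why stale data and the lack of user parameters are the limiting factors.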

Dynamic, Transient Translation via an Existing Application Library

This approach involves building CGI executables that are linked to a legacy information system via a library that was used to build the existing application. HTML is emitted dynamically by the new executable on a transaction-by-transaction basis - each virtual document retrieved is generated by a separate invocation of an executable, which terminates upon completion of that single task.
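A transient CGI executable of this kind might look roughly as follows. This is a sketch in Python under stated assumptions: legacy_lookup is a hypothetical placeholder for a call into the existing application library, and the "id" query parameter is invented for the example.

```python
import html
import os
import sys
from urllib.parse import parse_qs


def legacy_lookup(key):
    """Hypothetical stand-in for a call into the legacy application library."""
    catalog = {"a1": "Sort & Search Library", "b2": "Matrix Package"}
    return catalog.get(key)


def handle_request(environ):
    """Build the full CGI response (headers, blank line, body) for one request."""
    params = parse_qs(environ.get("QUERY_STRING", ""))
    key = params.get("id", [""])[0]
    title = legacy_lookup(key)
    if title is None:
        status, body = "404 Not Found", "<p>No such entry.</p>"
    else:
        status, body = "200 OK", "<h1>%s</h1>" % html.escape(title)
    return (
        "Status: %s\r\n"
        "Content-Type: text/html; charset=utf-8\r\n"
        "\r\n"
        "<html><body>%s</body></html>\n" % (status, body)
    )


if __name__ == "__main__":
    # One invocation answers one request; the process then ends,
    # so nothing is retained for the next request.
    sys.stdout.write(handle_request(os.environ))
```

Because the process exits after each response, any state the application needs across requests must travel with the request itself, for example in the URL or in form fields.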

Dynamic, Persistent Translation via an Existing Application Library

This approach involves building CGI executables in a manner similar to the transient case above, but with one major distinction - each executable persists between client requests, thus allowing for retention of state and/or caching of costly, frequently accessed results.
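The benefit of persistence can be illustrated with a small sketch. The Python below is an assumption-laden illustration, not a description of any particular system: slow_search is a hypothetical stand-in for an expensive call into the legacy library, and the request loop is reduced to a plain function so that the caching behavior is easy to see.

```python
import html


class PersistentTranslator:
    """State that lives as long as the process, shared across requests."""

    def __init__(self, costly_lookup):
        self._lookup = costly_lookup
        self._cache = {}  # query string -> rendered HTML fragment
        self.lookups_performed = 0

    def handle(self, query):
        """Return HTML for a query, calling the costly lookup only on a miss."""
        if query not in self._cache:
            self.lookups_performed += 1
            rows = self._lookup(query)
            items = "".join("<li>%s</li>" % html.escape(r) for r in rows)
            self._cache[query] = "<ul>%s</ul>" % items
        return self._cache[query]


def slow_search(query):
    """Hypothetical stand-in for an expensive call into the legacy library."""
    catalog = ["sort library", "search library", "matrix package"]
    return [name for name in catalog if query in name]


def serve(translator, requests):
    """Answer a sequence of requests from one long-lived translator."""
    return [translator.handle(q) for q in requests]
```

A real deployment would also need a policy for invalidating cached pages when the underlying data changes, which this sketch omits.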

Dynamic Translation via Direct, Reengineered Support

This approach involves complete reformulation of the information system as a collection of CGI executables, designed with user interaction and system maintainability in a Web-centered environment in mind. The primary distinction here is the potential for complete integration of HTML emission with information access and modification. Note that this approach has two variants, based upon the persistence or transience of the actual executables.
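What "complete integration of HTML emission and information access" can mean in practice is sketched below: markup is written as each row is fetched, with no intermediate representation. The sketch uses Python and an in-memory SQLite database purely as stand-ins; the assets table and its columns are hypothetical.

```python
import html
import sqlite3


def open_demo_database():
    """Create a small in-memory database (hypothetical schema and rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE assets (name TEXT, language TEXT)")
    conn.executemany(
        "INSERT INTO assets VALUES (?, ?)",
        [("Sort & Search Library", "Ada"), ("Matrix Package", "C")],
    )
    return conn


def emit_asset_table(conn, write):
    """Write HTML rows as the cursor advances; no intermediate data model."""
    write("<table>\n<tr><th>Name</th><th>Language</th></tr>\n")
    cursor = conn.execute("SELECT name, language FROM assets ORDER BY name")
    for name, language in cursor:
        write(
            "<tr><td>%s</td><td>%s</td></tr>\n"
            % (html.escape(name), html.escape(language))
        )
    write("</table>\n")


def render_asset_table():
    """Collect the emitted fragments into one string for inspection."""
    chunks = []
    conn = open_demo_database()
    try:
        emit_asset_table(conn, chunks.append)
    finally:
        conn.close()
    return "".join(chunks)
```

In a CGI setting the write argument would simply be the standard output stream, so rows reach the client as soon as they are read.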

Application as Server

This approach is similar in nature to the preceding persistent reengineered variant, but the generic Web server has been removed from the loop. The application now operates on the Web as a server in its own right. This implies that the application can be in direct control of authentication, etc., and need not be concerned with limitations inherent in a generic server, particularly regarding the size and nature of HTTP requests. Of course, normal HTTP request support would not be present unless designed into the application system.
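The following Python sketch illustrates an application acting as its own HTTP server and handling authentication itself. It relies on the standard http.server module; the USERS table, the realm name, and the respond function are hypothetical and exist only for this example.

```python
import base64
import html
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical credentials, for illustration only.
USERS = {"guest": "secret"}


def authenticate(auth_header):
    """Return the user name for a valid HTTP Basic header, else None."""
    if not auth_header or not auth_header.startswith("Basic "):
        return None
    try:
        decoded = base64.b64decode(auth_header[6:], validate=True).decode("utf-8")
    except ValueError:  # covers malformed base64 and invalid UTF-8
        return None
    user, sep, password = decoded.partition(":")
    if sep and USERS.get(user) == password:
        return user
    return None


def respond(path, auth_header):
    """Application logic: map a request to (status, extra headers, body)."""
    user = authenticate(auth_header)
    if user is None:
        return 401, {"WWW-Authenticate": 'Basic realm="app"'}, "<p>Login required.</p>"
    if path == "/":
        return 200, {}, "<h1>Welcome, %s</h1>" % html.escape(user)
    return 404, {}, "<p>Not found.</p>"


class AppHandler(BaseHTTPRequestHandler):
    """The application speaks HTTP directly; no generic server in between."""

    def do_GET(self):
        status, headers, body = respond(self.path, self.headers.get("Authorization"))
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        for name, value in headers.items():
            self.send_header(name, value)
        self.end_headers()
        self.wfile.write(payload)


def run(port=8000):
    """Serve until interrupted (port number is an arbitrary choice)."""
    HTTPServer(("127.0.0.1", port), AppHandler).serve_forever()
```

Only GET is handled here; any other method a client might reasonably send would have to be designed in explicitly.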

Discussion

Our work on the MORE system [1] and related prototypes has led us to evaluate a number of these approaches and to experiment with a few. When first experimenting with the Web with our predecessor application (which was X-Windows based), we generated a proof-of-concept demonstration by static translation. We created a small set of main programs that wrote HTML to a set of files, which were then edited to provide integration and enclosing structure.

Our current approach to MORE is a mix of dynamic, transient translation with an application library and the reengineered approach. (This mixture is more an artifact of project history than of explicit design decision - we were in the midst of reengineering the X-Windows system when the Web exploded onto the Internet.) MORE is separated into two distinct layers, a database layer that encapsulates database specifics and an interface layer that encapsulates HTML specifics. This approach has eased the rehosting of the system onto new database engines, and offers the same benefits regarding communication protocols (e.g., CORBA). This flexibility comes with a price, however - the architectural separation inflates the size of the system through the need to define an internal data model to migrate information back and forth between the database layer and the interface layer. We estimate that a direct, reengineered implementation, where HTML was emitted as a database cursor iterated over the database, could reduce the size of the system by as much as 30%, but at the price of sacrificing the flexibility of the layered approach.
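The trade-off of the layered design can be made concrete with a small sketch. The Python below is not the MORE code; every class and function name is a hypothetical illustration, and SQLite stands in for whatever database engine is in use. The Asset class plays the role of the internal data model that both layers must agree on.

```python
import html
import sqlite3
from dataclasses import dataclass

# Hypothetical sample rows shared by both database layers below.
ROWS = [("Matrix Package", "C"), ("Parser Kit", "C"), ("Sort Library", "Ada")]


@dataclass
class Asset:
    """Internal data model passed between the two layers."""
    name: str
    language: str


class SqliteAssetStore:
    """Database layer: the only code that knows about SQL."""

    def __init__(self, rows):
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute("CREATE TABLE assets (name TEXT, language TEXT)")
        self._conn.executemany("INSERT INTO assets VALUES (?, ?)", rows)

    def find_by_language(self, language):
        cursor = self._conn.execute(
            "SELECT name, language FROM assets WHERE language = ? ORDER BY name",
            (language,),
        )
        return [Asset(*row) for row in cursor]


class ListAssetStore:
    """Alternative database layer, showing that the engine can be swapped."""

    def __init__(self, rows):
        self._assets = [Asset(*row) for row in rows]

    def find_by_language(self, language):
        matches = [a for a in self._assets if a.language == language]
        return sorted(matches, key=lambda a: a.name)


def render_assets(assets):
    """Interface layer: the only code that knows about HTML."""
    items = "".join(
        "<li>%s (%s)</li>" % (html.escape(a.name), html.escape(a.language))
        for a in assets
    )
    return "<ul>%s</ul>" % items
```

The interface layer produces identical markup from either store, which is the flexibility described above; the Asset class and the copying into and out of it are the extra code that a direct, cursor-driven implementation would not need.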

Finally, our recent work on Sulla, a user agent for the Web [2], has involved the application as server approach. Sulla functions as a proxy server, blending normal proxy activity with the dynamic generation of pages characterizing the state of searches, etc. This has proven to be a powerful means of architecting an application, as all user activity is under the complete control of the proxy. However, the learning curve associated with building a robust HTTP process has proven to be substantial compared to the CGI approach of MORE, even though the proxy avoids the need to encode retained state within transient HTML.

References

  1. Eichmann, D., T. McGregor and D. Danley, "Integrating Structured Databases Into the Web: The MORE System," First International Conference on the World Wide Web, Geneva, Switzerland, May 25-27, 1994. Also in Computer Networks and ISDN Systems, v. 4, n. 2, 1994, pages 281-288.
  2. http://ricis.cl.uh.edu/agents/
