Leon Shklar, David Makower, Evan Maloney, Sveta Gurevich
shklar@cs.rutgers.edu
With the accelerating advancement of Web technology, it is becoming increasingly important to support clean separation of content and presentation. Such separation serves to dramatically reduce the cost of data maintenance and storage, as well as the cost of migration to new technologies. In this paper, we describe an application development framework that employs metadata to facilitate construction of specialized models of heterogeneous information, and then uses these models to support flexible presentation of the content. The models are composed of metadata objects that contain two kinds of metadata: content-descriptive metadata (useful for searching and for flexibly presenting the content) and access-descriptive metadata that includes references to data accessible via major Internet protocols (HTTP, FTP, etc.) and other information necessary to retrieve the content. Our metadata objects have been designed to simplify future compliance with the emerging "Resource Description Framework" (RDF) standard from W3C as it nears completion. Metadata is generated with built-in redundancy, which serves the dual purpose of improving performance and providing for automatic recognition of content changes that may require partial regeneration of metadata. We have also designed a sophisticated multi-level caching mechanism to minimize the performance impact of dynamic presentation. Our application development framework is implemented as a hierarchy of Java classes, and conforms to the Java Servlet API from JavaSoft.
The World-Wide Web is becoming a platform of choice for the next generation of business applications. Many of these applications replicate information that is already available in a different form, either on the Internet or on a local corporate network. As applications mature and Web technology advances even further, the redundant data is often reformatted and replicated yet again to take advantage of the latest technology, while old copies remain to continue providing the old service. More often than not, this leads to an enormous expenditure of resources and a maintenance nightmare.
Our solution is based on creating a hierarchy of metadata objects to represent the content. Each metadata object is associated with one or more data sources from which to retrieve content, and any number of presenters responsible for different ways of presenting this content via the Web. Our notion of a data source is abstract in that a data source may represent a local file, a URL, a database query, or any other entity that would yield a stream of data. This notion is also recursive; our metadata objects utilize data sources in order to access their content, and may themselves be utilized as data sources. To support customized search and traversal, we enable the creation of virtual containers, logical groupings of metadata objects that may in addition be associated with their own data sources or content-based indices. These containers, in turn, are grouped together into other containers to form a Metaphoria repository - a virtual Web site.
An important characteristic of our approach is automated analysis of existing data in order to generate metadata objects. We define and implement high-level operations to control data analysis and metadata generation, as well as logical grouping of metadata objects according to search, traversal, and presentation requirements. We have designed an HTML-based interface to repository generation operations and are currently working on developing a visual modeling language.
Our metadata objects are designed to contain redundant information. The redundancy serves the dual purpose of improving performance and allowing for the detection of inconsistencies caused by changing data. Such inconsistencies may trigger partial regeneration of a repository: operations similar to those used when generating the original metadata, but applied only to the portion of the data known to have been affected by the change.
We have designed a sophisticated multi-level caching mechanism to minimize the performance impact of on-demand retrieval and filtering of information. It combines memory and disk caching to ensure rapid access to the most frequently accessed content. A unique feature of our mechanism is its implicit support for caching information that may not have been directly requested but is related to the requested presentation.
In addition to supporting flexible access to existing information, Metaphoria provides a reason to seriously rethink the way to design and build new Web sites. With Metaphoria supporting multiple dynamic views of the same data, it becomes beneficial to separate content from presentation. The main task of the site design is then to logically collect the content in the most transparent form (e.g., plain text files or database entries), without moving data to a single file system or redesigning data maintenance procedures. So designed, a Web site stands ready for implementing complex applications (e.g., workflow). An important additional benefit is in protecting Web sites from the assault of new presentation technologies that have long since made early HTML pages obsolete. Achieving a cutting-edge presentation would only involve upgrading presentation methods without changing physical content.
In section 2, we introduce the Metaphoria model, centered around the notion of a data source. We discuss different kinds of Metaphoria resources, ways for maintaining their consistency and integrity, and built-in support for building sophisticated applications. In section 3, we discuss sample Metaphoria applications that vary from static views of heterogeneous documents to a slide show. In section 4 we discuss Metaphoria architecture and implementation. Section 5 discusses related work, and section 6 summarizes our conclusions and plans for future work.
Fig. 1. The Data Source Abstraction
We use the notion of a data source as an object-oriented abstraction for any entity that can present some content as a data stream. This abstraction masks any underlying complexity that might be involved in generating, accessing, and/or processing content upon request (see fig. 1). To support such an abstraction, data sources typically contain access-descriptive metadata attributes [shk95-1], which control the physical retrieval of information.
Fig. 2. Examples of Simple Data Sources
We distinguish between simple and derived data sources. A simple data source contains access-descriptive metadata allowing it to present data retrieved directly from a single physical source (for example, a local or remote file, or a database). Simple data sources present their content exactly as they find it (see fig. 2). A derived data source does not retrieve its content directly from a physical source, but instead contains references to one or more other data sources, each of which may be either simple or derived (see fig. 3). Derived data sources may contain metadata attributes that facilitate intermediate processing of retrieved content.
For example, a text file stored on a remote gopher server may be represented by a simple data source. Such a data source could be implemented with a single access-descriptive metadata attribute: the gopher URL for the file. When the data source is asked to present its content, it opens a connection to the server, uses the gopher protocol to request the file, and returns the contents of the file in a stream, perhaps caching the content locally so that the next access will be faster.
Fig. 3. Examples of Derived Data Sources
Imagine, however, a very large file with many sections. We might wish to treat each section of the file as a data source in its own right. Each section could be represented by a derived data source, containing a reference to the original, simple data source, as well as whatever information is required to separate out the specific section. When the derived data source is asked to present its content, its first response is to make a similar request of the simple data source to which it holds a reference. Having received the content from the simple data source, the derived data source extracts the appropriate section and returns it in a stream.
The example above is deliberately simplistic. Naturally, there is no reason that derived data sources need only reference simple data sources; the notion of a data source is a recursive one. Furthermore, a single derived data source may contain references to arbitrarily many, potentially heterogeneous back-end data sources.
The flexibility of the model also implies that major transformations of the content may take place through the use of a derived data source. For example, a derived data source whose back end consists of one or more text streams may be capable of presenting the information it represents as a stream of binary data: possibly an image, or even a serialized object or collection of objects.
It bears emphasis that although various data sources may use different
retrieval mechanisms, all of them maintain the same simple interface for
presenting their content. For example, a file data source simply reads a
file and presents the resulting stream; an HTTP data source may obtain
content through the HTTP GET request; an SQL data source
obtains its content by sending a query to a database, either locally or
through the network. Regardless of these differences, all of these data
sources present their content as a stream after receiving a "present
content" request.
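The uniform interface can be illustrated with a minimal Java sketch. The method name presentContent() follows the description in section 4; the exact signature and the classes InMemorySource, SectionSource, and DataSourceSketch are hypothetical, introduced here only for illustration. InMemorySource stands in for any simple data source (file, HTTP, SQL), and SectionSource is a derived data source that extracts one section by byte offsets.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Map;

// Uniform interface: every data source answers a "present content" request with a stream.
interface DataSource {
    InputStream presentContent(Map<String, String> params) throws IOException;
}

// Simple data source (hypothetical stand-in for a file, HTTP, or SQL source):
// it presents its content exactly as it finds it.
class InMemorySource implements DataSource {
    private final byte[] content;

    InMemorySource(String text) {
        this.content = text.getBytes(StandardCharsets.UTF_8);
    }

    @Override
    public InputStream presentContent(Map<String, String> params) {
        return new ByteArrayInputStream(content);
    }
}

// Derived data source: holds a reference to another data source plus the
// byte range needed to separate out one section of its content.
class SectionSource implements DataSource {
    private final DataSource backEnd;
    private final int start; // inclusive byte offset
    private final int end;   // exclusive byte offset

    SectionSource(DataSource backEnd, int start, int end) {
        this.backEnd = backEnd;
        this.start = start;
        this.end = end;
    }

    @Override
    public InputStream presentContent(Map<String, String> params) throws IOException {
        // First make the same request of the back-end source, then extract the section.
        try (InputStream in = backEnd.presentContent(params)) {
            byte[] all = in.readAllBytes();
            return new ByteArrayInputStream(all, start, end - start);
        }
    }
}

class DataSourceSketch {
    // Reads a data source to a string; callers need not know how content is retrieved.
    static String read(DataSource source) {
        try (InputStream in = source.presentContent(Map.of())) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        DataSource file = new InMemorySource("1. Intro\n2. Syntax\n");
        DataSource section = new SectionSource(file, 9, 19);
        System.out.print(read(section)); // prints: 2. Syntax
    }
}
```

Because SectionSource itself implements DataSource, it can serve as the back end of another derived data source, which is the recursion described above.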
Derived data sources may be capable of presenting their content in a variety of ways. The difference may be trivial, such as the order in which a list is sorted, or it may be more substantial, such as whether the response is to be a text stream or a stream of binary data. To support this kind of flexible, dynamic presentation, data sources allow for the specification of common presentation parameters. Presentation parameters are always optional, as each data source class specifies defaults to which it is capable of responding. The defaults may be overridden by any specific instance of a data source, which, in turn, may be overridden by values explicitly specified at presentation time.
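The three-level override order (class defaults, then instance values, then values given at presentation time) might be sketched as follows. ParameterResolver and the parameter names used in the usage example are hypothetical and serve only to illustrate the precedence rule.

```java
import java.util.HashMap;
import java.util.Map;

// Resolves presentation parameters in three layers, lowest priority first:
// class defaults, per-instance overrides, values given at presentation time.
class ParameterResolver {
    private final Map<String, String> classDefaults;
    private final Map<String, String> instanceOverrides;

    ParameterResolver(Map<String, String> classDefaults,
                      Map<String, String> instanceOverrides) {
        this.classDefaults = new HashMap<>(classDefaults);
        this.instanceOverrides = new HashMap<>(instanceOverrides);
    }

    // Later putAll calls overwrite earlier entries, which yields the override order.
    Map<String, String> resolve(Map<String, String> requestParams) {
        Map<String, String> effective = new HashMap<>(classDefaults);
        effective.putAll(instanceOverrides);
        effective.putAll(requestParams);
        return effective;
    }
}
```

For example, with class defaults {sortOrder=title, summaryStyle=brief}, an instance override {sortOrder=date}, and a request carrying summaryStyle=verbose, the effective parameters are sortOrder=date and summaryStyle=verbose.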
Fig. 4. Sample Hierarchy of Metaphoria Resources for an ASCII Document
A Metaphoria resource is a data source--typically but not necessarily a derived data source--capable of packaging its content as an HTTP response. Each Metaphoria resource has a URL; when a request comes in for this URL, the resource presents its content. Although the most obvious type of Web presentation that leaps to mind is HTML, a Metaphoria resource may respond to an HTTP request using any MIME type. Furthermore, a single Metaphoria resource may be capable of presenting its content using any number of different MIME types, and may dynamically determine which one to use, based on information in the HTTP request. Presentation parameters may be passed in as explicit URL-encoded name-value pairs (e.g., summaryStyle="verbose"), or they may be implicit in the HTTP request (e.g., the host domain of the originating browser).
In a manner very much like that in which file systems contain files grouped into directories, Metaphoria resources are grouped into virtual containers. A virtual container is itself a Metaphoria resource--just as a directory is, in some sense, a file--that contains references to other resources, considered to be its children. Metaphoria resources that are not virtual containers are called leaf resources. Fig. 4 shows a hierarchy that might result from analyzing the plain text version of RFC 1738. The document itself is associated with a virtual container, as are sections 2, 3, 3.2, and 3.4, because they all contain subsections. Other sections are associated with leaf resources. This example will be discussed in more detail in section 3.1.
By default, the content of a virtual container is a listing of its children. However, it is possible for a virtual container to have a dual role: that of referencing its children, and that of presenting a content of its own. This content might be a textual description of the content of the children, or it might be a query front-end utilizing a full-text index generated based upon the content of the children.
To facilitate truly flexible presentation of Metaphoria resources without
requiring unnecessary subclassing, resources may present their content
through templates, where the name of the template may be passed
in through presentation parameters. Template syntax depends on the
type of content that is being presented. HTML templates, for example,
are valid HTML documents containing special comments beginning with the
characters <!--#MM, followed by any of several directives for
presentation primitives. Syntax would differ for VRML or XML templates.
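As a rough illustration of comment-based templates, the sketch below substitutes values for directives embedded in HTML comments. Only the "<!--#MM" prefix comes from the description above; the directive form "<!--#MM name -->", the directive names, and the class TemplateSketch are assumptions made for this example and do not reflect the actual Metaphoria directive syntax.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TemplateSketch {
    // Assumed directive form: <!--#MM name -->. The closing syntax and the
    // directive names are hypothetical.
    private static final Pattern DIRECTIVE =
            Pattern.compile("<!--#MM\\s+(\\w+)\\s*-->");

    // Replaces each known directive with its value; the template stays valid HTML
    // because unexpanded directives are ordinary comments.
    static String render(String template, Map<String, String> values) {
        Matcher m = DIRECTIVE.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Unknown directives are left in place unchanged.
            String replacement = values.getOrDefault(m.group(1), m.group());
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```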
We are currently investigating an enhanced template mechanism that will include support for more sophisticated scripting in templates, possibly involving extensions to Jacl, the new Tcl interpreter implemented in Java.
A Metaphoria repository is a logically related collection of leaf resources and virtual containers. These leaf resources and virtual containers may be added to the repository individually, but it is far more typical for them to be generated automatically and in large numbers, through the use of a script attached to the repository itself.
There are two major phases in the life-cycle of a Metaphoria repository. First, there is generation time, during which the initially empty repository reads in the content of one or more existing data sources, analyzes the content to discover its logical structure, and builds a model of that structure consisting of leaf resources and virtual containers. The repository generation process may be quite performance-intensive; as a result, it should either take place at a time when the Web server does not expect to be serving many requests, or be performed by a separate designated server.
Once the repository is generated, its contained resources can be made available via the Web. Processing of HTTP requests for Metaphoria resources is performed at presentation time. The distinction between generation time and presentation time in the life-cycle of a Metaphoria repository is in a sense analogous to the distinction between compile-time and run-time in the life-cycle of a program written in a compiled language like C or Java. Although performance is always of importance, it is most important at presentation time, particularly for servers handling frequent requests. Metaphoria improves presentation-time performance by shifting computation-intensive analysis to generation time. Additional performance enhancements are provided through a sophisticated multi-level caching scheme, described in section 4.
Fig. 5. Data Encapsulation
The process of generating a Metaphoria repository follows a fairly well-defined pattern. First, there is the encapsulation step (see fig. 5). During encapsulation, one or more data sources are read in and analyzed. The purpose of the analysis is to break the data source content up into "logical units," where the meaning of a "logical unit" is application-dependent. It might be a section or subsection of a text document, a single function in a source code file, a row of a database table--anything that makes sense for the application in question.
The end result of the encapsulation process is a set of preliminary metadata objects, one for each logical unit discovered in the analysis. Each of these metadata objects is simply a bundle of key-value pairs representing access-descriptive and content-descriptive metadata attributes of the associated unit of content. Each object may have a parent reference and zero or more child references. These references are initially unset, but may be set during the next step of repository generation.
Fig. 6. Grouping Metadata Objects
After encapsulation comes the grouping step (see fig. 6), during which metadata objects are grouped into a structure that may mirror the original, native structure of the data, or may impose an entirely new structure upon it. During grouping, various set operations may be performed on the results of the encapsulation, and some objects may be designated as parents for other sets of objects which become their children. The result is a hierarchy of metadata objects.
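The outcome of encapsulation and grouping can be pictured with a small Java sketch. MetadataObject, its attribute keys, and the adopt() method are hypothetical names chosen for illustration; the sketch shows only a bundle of key-value attributes whose parent and child references are unset until grouping.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// A preliminary metadata object: a bundle of key-value attributes plus
// parent/child references that stay unset until the grouping step.
class MetadataObject {
    final Map<String, String> attributes = new LinkedHashMap<>();
    MetadataObject parent;                                  // null until grouped
    final List<MetadataObject> children = new ArrayList<>();

    MetadataObject(String title, String url, long startOffset, long endOffset) {
        attributes.put("title", title);                     // content-descriptive
        attributes.put("url", url);                         // access-descriptive
        attributes.put("startOffset", Long.toString(startOffset));
        attributes.put("endOffset", Long.toString(endOffset));
    }

    // Grouping: designate this object as the parent of the given objects.
    void adopt(List<MetadataObject> newChildren) {
        for (MetadataObject child : newChildren) {
            child.parent = this;
            children.add(child);
        }
    }

    // After grouping, objects with children would map to virtual containers
    // and the rest to leaf resources.
    boolean isContainer() {
        return !children.isEmpty();
    }
}
```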
Fig. 7. Registering Metadata Objects
The final stage of repository generation is registration, which results in associating each metadata object with either a virtual container or a leaf resource, and registering these resources within their Metaphoria repository (see fig. 7). Once registered, each resource has a URL through which it can be accessed via HTTP.
There may also be an optional indexing step, during which the content represented by the resources in the repository is passed through an existing full-text indexing system, such as Excite, WAIS, or Cybotics. The resulting indices allow for searching the content of the repository, even though that content is not physically stored on the Web server, and may in fact be distributed across the network. Search queries conducted against these indices would return URLs of Metaphoria resources, contributing to the desired illusion that the data is actually stored at the Web server.
We have seen that Metaphoria resources are accessible via HTTP, and reside within Metaphoria repositories. Because HTTP is a stateless protocol [ber96, fie97], repositories that need to maintain user and session state must send additional information in requests and responses in order to simulate the state.
This problem is well-known, and many solutions exist. For security and performance reasons, it is not advisable to include full state information in requests and responses. Instead, it makes sense to send information back and forth that uniquely identifies a session. HTTP Cookies may be used, but they do not necessarily uniquely identify a session, only a particular browser on a particular machine. A user may have multiple browser windows open, each of which is perceived to have its own state, but a given cookie will only have one value, and will thus be inadequate to uniquely identify multiple, simultaneous sessions on the same machine.
On the other hand, using a session identifier embedded into the query string of all links included in the bodies of generated HTTP responses (sometimes referred to as the state trail) is not very reliable because of browser-side caching. To minimize this problem, Metaphoria employs a hybrid session-tracking scheme, whereby cookies are used to identify the user over multiple sessions (in order to store persistent user preferences, etc.), while the state trail is used to distinguish between the different sessions maintained by the same user. The combined session identifier is used as a key to map to a server-side object representing session state.
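A minimal sketch of the combined key, assuming a cookie value that identifies the user and a state-trail identifier that distinguishes that user's sessions. SessionRegistry, the "/" separator, and the use of a string map for session state are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class SessionRegistry {
    // Server-side session state, keyed by the combined identifier.
    private final Map<String, Map<String, String>> sessions = new ConcurrentHashMap<>();

    // userCookie identifies the user across sessions; trailId is the identifier
    // embedded in the query string of generated links (the state trail).
    // Assumes neither identifier contains the '/' separator.
    static String combinedKey(String userCookie, String trailId) {
        return userCookie + "/" + trailId;
    }

    // Returns the state object for this session, creating it on first use.
    Map<String, String> stateFor(String userCookie, String trailId) {
        return sessions.computeIfAbsent(combinedKey(userCookie, trailId),
                                        key -> new ConcurrentHashMap<>());
    }
}
```

Two browser windows that share one cookie but carry different trail identifiers map to different state objects, which is the case that cookies alone cannot handle.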
Metaphoria, while providing high flexibility in traversing and retrieving information, also introduces additional requirements for the consistency and integrity of data. Data sources encapsulating remote files can only present their content if the appropriate remote service is available. In addition, data sources associated with fragments of physical files introduce dependency on the currency of the content. The dependencies compound with the wide use of derived data sources.
To ensure consistency and integrity of Metaphoria resources, we employ a dual strategy: generate redundant metadata attributes to help trigger content change exceptions at presentation time, and build a distributed notification framework [mac97] that is not discussed in this paper.
As an illustration of the utility of attribute redundancy, consider a Metaphoria resource associated with section 2 of a plain text document stored in a single file. This Metaphoria resource would store both the title of the section and the byte offsets of the beginning and end of its text. When processing a request, the resource would use the byte offset to retrieve the text, and would then compare the beginning of the text with the title. In the case of a mismatch, the resource takes the following actions:
The objective of this section is to describe simple Metaphoria applications utilizing two extreme kinds of data sources:
Fig. 8. Metaphoria Views of ASCII Documents (panels a, b, and c)
Consider using Metaphoria to achieve flexible presentation of the plain text versions of Internet RFCs (a demonstration version of this application is available). The Metaphoria resource hierarchy for RFC1738 was shown in fig. 4. Screenshots shown in fig. 8 illustrate presentations by the document-level container resource in response to different presentation parameters. Screenshots shown in fig. 9 illustrate different presentations by a section-level container.
Fig. 9. Metaphoria Views of a Section of an ASCII Document (panels a and b)
At generation time, the encapsulation process associates Metaphoria resources with the smallest logical units of information, which, depending on the presentation requirements, may be either sections or subsections of the RFCs. Each of these resources contains metadata associated with the fragment of content that it represents. For instance, access-descriptive metadata would include the URL of the original document and byte offset of the fragment within the document. Content-descriptive metadata would include the name of the section, title of the document, author, date, etc. Once generated, the resources are grouped together into virtual containers mirroring the section structure of the original documents (fig. 4). As mentioned in section 2.3.2, such mirroring is not required by the Metaphoria model; it is a characteristic of the RFC application.
At presentation time, when an HTTP request arrives for a resource, the resource first requests its back-end data source to present the content of the full document. Depending on where the document is located, the back-end data source may negotiate an FTP connection or read a local file, hiding the details of the retrieval from the resource itself. Additionally, if the file was presented recently, its content may be in the cache (see section 4). Once the data source has presented the raw content, the resource then extracts the proper fragment and applies an appropriate presentation method to the extracted content, according to presentation parameters communicated through the HTTP request.
The screenshots in figures 8 and 9 are results of presenting the same Metaphoria resources with different parameters. A document (or its section) may be presented either as a general description with hyperlinks to individual sections (figures 8a, 8b, 8c, and 9a), or as full encapsulated content (fig. 9b). Note that if the document has not changed (a reasonable assumption for the RFC archive), presenting the content requires no parsing of the document at presentation time.
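The presentation-time extraction, together with the redundancy check on the stored title described in section 2, might look like the following sketch. FragmentCheck and ContentChangedException are hypothetical names; the sketch assumes the full document is already available as a byte array and that a mismatch is reported as an exception.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class FragmentCheck {
    // Signals that the stored title no longer matches the text at the stored offsets.
    static class ContentChangedException extends RuntimeException {
        ContentChangedException(String message) {
            super(message);
        }
    }

    // Extracts bytes [start, end) from the document and verifies that the
    // fragment still begins with the redundantly stored section title.
    static String extract(byte[] document, int start, int end, String expectedTitle) {
        if (start < 0 || end > document.length || start > end) {
            throw new ContentChangedException("offsets fall outside the document");
        }
        String fragment = new String(Arrays.copyOfRange(document, start, end),
                                     StandardCharsets.UTF_8);
        if (!fragment.startsWith(expectedTitle)) {
            throw new ContentChangedException("title mismatch at offset " + start);
        }
        return fragment;
    }
}
```

When the check passes, no parsing is needed: the fragment is located purely by its offsets.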
Fig. 10. Metaphoria Slide Show
Metaphoria's support for state management makes it easy to implement applications that customize their presentation based on the history of requests. A screenshot of such an application (Metaphoria slideshow) is shown in fig. 10. As with the RFCs, the slideshow itself is a plain text file that may be edited using a simple ASCII interface. At generation time, the encapsulator creates a hierarchy of metadata objects using indentation to infer parent-child relationships.
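Inferring parent-child relationships from indentation can be done with a stack, as in the sketch below. OutlineNode and its parse() method are hypothetical; the sketch assumes indentation uses spaces only and that a line is a child of the nearest preceding line with a smaller indent.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class OutlineNode {
    final String text;
    final int indent;
    final List<OutlineNode> children = new ArrayList<>();

    OutlineNode(String text, int indent) {
        this.text = text;
        this.indent = indent;
    }

    // Builds a tree from indented lines. The synthetic root has indent -1,
    // so it is never popped and always remains a valid parent.
    static OutlineNode parse(String source) {
        OutlineNode root = new OutlineNode("(root)", -1);
        Deque<OutlineNode> stack = new ArrayDeque<>();
        stack.push(root);
        for (String line : source.split("\n")) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            int indent = 0;
            while (indent < line.length() && line.charAt(indent) == ' ') {
                indent++;
            }
            // Pop until the top of the stack is a shallower line: the parent.
            while (stack.peek().indent >= indent) {
                stack.pop();
            }
            OutlineNode node = new OutlineNode(line.trim(), indent);
            stack.peek().children.add(node);
            stack.push(node);
        }
        return root;
    }
}
```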
The screen in fig. 10 is presented by a slide-level container, which combines its own content with that of the child containers. Using one of the presentation format links (blue buttons in the screenshot) results first in changing the state of the session and then in repeating the presentation based on the new state. When a user collapses or expands bullets, the state is updated and the slide is presented again based on the new state. As additional presentation changes are made to the slide, the state changes accumulate but they don't get carried over to the next slide. Conversely, when the user requests a font size change, the state change is global; it carries over to other slides as the user navigates through the slideshow.
Applying Metaphoria to dynamic data sources provides similar advantages of flexible presentation and data traversal - the differences are in repository generation and in maintaining consistency and integrity of metadata. An important special case is database applications. Since it is assumed that content accessed via a database may differ between presentations, content-descriptive metadata is not stored in Metaphoria resources but rather in the database itself. Sometimes, this content-descriptive metadata may be entered into the database together with the data (or a data reference). For example, rich content-descriptive metadata available from third parties is quite typical of geospatial applications [shk97].
Content-descriptive metadata that is extracted at encapsulation time is still stored in the database and not in Metaphoria resources. A database application that relies on such automatically extracted metadata would trigger metadata extraction when new data is added into the database. Metaphoria resources in an application of this kind are generated with access-descriptive metadata containing enough information to retrieve both physical content and content-descriptive metadata when processing a presentation request.
Since content-descriptive metadata is stored with data and not in Metaphoria resources, maintaining resource consistency becomes a much easier task - resource regeneration would not have to be triggered by a change of content but only by changes in the database schema.
As we have seen, Metaphoria is flexible enough to support a wide variety of applications. Even though these applications may be quite different from one another, they all share common advantages:
Although the Metaphoria model embodies a fundamental departure from the manner in which a typical Web server accesses and presents information, it has never been our intention to design and implement our own HTTP server. Instead, we set out to design and implement a hierarchy of classes that would work across a variety of HTTP servers. Metaphoria is implemented as a set of Java classes, most of which are completely independent of any protocol. The current implementation of Metaphoria communicates with a Web server via the Java Servlet API. However, the interface between the core Metaphoria classes and the Servlet API is very small, and is accomplished through a modular mechanism with abstract wrapper classes; Metaphoria could easily be adapted to work with any server-side Java implementation that would pass it requests and accept responses. Adding support for another communication protocol (e.g., IIOP) would not require changes to the core of the system.
In the interests of scalability, we have implemented a sophisticated multi-level caching scheme, utilized both for managing the content presented by data sources and for memory management with regard to Metaphoria resources themselves.
In order to make it easy to create new kinds of data sources by
extending existing Java classes, DataSource is implemented
as an interface. It requires the implementation of a few simple methods,
the most important of which is the method presentContent(),
which returns an InputStream. Presentation parameters are
passed to the presentContent() method as an associative
array of name-value pairs.
DataSources are constructed based on a location identifier,
typically a URL. An object that requires a DataSource
does not construct it directly, but instead passes the location
identifier to a DataSourceFactory.
The DataSourceFactory has the responsibility of selecting,
based upon the identifier, a class of DataSource to instantiate.
The basic implementation of DataSourceFactory uses a mapping
between URL schemes (e.g., file, http, ftp, etc.)
and specific classes of DataSource. Subclasses of
DataSourceFactory may use more sophisticated logic based on
other information in the location identifier (such as the host portion
of the URL, etc.).
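The scheme-based selection can be sketched as a lookup table from URL scheme to constructor. This is an illustrative reconstruction, not the Metaphoria source: the register() and create() methods are assumed names, and the DataSource interface is reduced to a single hypothetical describe() method to keep the example short.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Stand-in for the real interface; describe() replaces presentContent()
// only to keep this sketch short.
interface DataSource {
    String describe();
}

class DataSourceFactory {
    private final Map<String, Function<URI, DataSource>> byScheme = new HashMap<>();

    // Associates a URL scheme (file, http, ftp, ...) with a constructor.
    void register(String scheme, Function<URI, DataSource> constructor) {
        byScheme.put(scheme.toLowerCase(Locale.ROOT), constructor);
    }

    // Selects a DataSource implementation from the scheme of the location identifier.
    DataSource create(String location) {
        URI uri = URI.create(location);
        String scheme = uri.getScheme();
        Function<URI, DataSource> constructor =
                scheme == null ? null : byScheme.get(scheme.toLowerCase(Locale.ROOT));
        if (constructor == null) {
            throw new IllegalArgumentException(
                    "No data source registered for: " + location);
        }
        return constructor.apply(uri);
    }
}
```

A subclass could override create() to inspect other parts of the identifier, such as the host, before falling back to the scheme table.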
The DataSource and DataSourceFactory classes are
designed to be useful outside of Metaphoria as well.
Metaphoria itself uses a subclass of DataSourceFactory that
distinguishes between ordinary HTTP URLs and those referring to local
Metaphoria resources. For ordinary URLs, it creates a DataSource
object capable of opening up an HTTP connection; for URLs of local
Metaphoria resources, it instead returns the resource itself, since
Metaphoria resources implement the DataSource interface.
Fig. 11. Processing HTTP Requests
A servlet-enabled Web server has access to configuration information that
maps specific URL paths to servlet classes. When a request arrives at the
server, the server examines the URL to determine whether it maps to a
servlet. If so, then a request object is created and passed to that
servlet's service() method. From there, the servlet is in
complete control of the HTTP response.
HTTP URLs [ber94] have the following
structure (optional elements appear between square brackets, [ ] ):
URL := http://[user[:password]@]host[:port]/path?parameters
In the case of an HTTP URL referencing a Metaphoria resource, the
path portion has the following structure:
path := path-to-Metaphoria-servlet/handler-name/repository-name/resource-path
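Splitting a path of this shape into its parts might be sketched as follows. MetaphoriaPath is a hypothetical helper, and the servlet prefix and sample path in the usage note are made-up values chosen only to show the decomposition.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class MetaphoriaPath {
    final String handlerName;
    final String repositoryName;
    final List<String> resourcePath;

    private MetaphoriaPath(String handlerName, String repositoryName,
                           List<String> resourcePath) {
        this.handlerName = handlerName;
        this.repositoryName = repositoryName;
        this.resourcePath = resourcePath;
    }

    // servletPrefix is the path-to-Metaphoria-servlet portion of the URL path.
    static MetaphoriaPath parse(String path, String servletPrefix) {
        if (!path.startsWith(servletPrefix + "/")) {
            throw new IllegalArgumentException("Not a Metaphoria URL path: " + path);
        }
        String[] parts = path.substring(servletPrefix.length() + 1).split("/");
        // handler-name, repository-name, and at least one resource identifier
        if (parts.length < 3) {
            throw new IllegalArgumentException(
                    "Expected handler/repository/resource-path: " + path);
        }
        List<String> resource =
                new ArrayList<>(Arrays.asList(parts).subList(2, parts.length));
        return new MetaphoriaPath(parts[0], parts[1], resource);
    }
}
```

With the assumed prefix "/servlet/metaphoria", the path "/servlet/metaphoria/repository/rfc/rfc1738/3/3.2" yields handler "repository", repository "rfc", and resource path [rfc1738, 3, 3.2].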
The Metaphoria servlet includes several request handlers.
When the Metaphoria servlet receives a request, it first decides which
handler the request is for, and then passes the request to the appropriate
handler. In particular, the repository handler handles all
requests for Metaphoria resources at presentation time. Other handlers
include the administration handler, responsible for
processing requests to configure Metaphoria. From now on we will focus
on the repository handler.
The resource-path portion of a Metaphoria URL is
a slash-separated sequence of one or more identifiers of virtual
containers, ending in an identifier of either a container or a leaf resource.
The repository handler uses this list
to construct a LookupContext object, which it passes to the
first virtual container in the path. This LookupContext is
passed in turn to each container in the path, until either the components
are exhausted or a leaf resource is reached. The final resource
(whether a leaf resource or a virtual container) is returned as
the result of the lookup operation. In the end, this results in the
invocation of the target resource's presentContent() method,
passing in parameters included in the request.
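The lookup walk can be sketched with a context object that carries the remaining path components from one container to the next. Only the name LookupContext comes from the description above; its methods, the Resource and VirtualContainer classes, and the treatment of the repository root as a container are assumptions made for this illustration.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Carries the remaining path components from container to container.
class LookupContext {
    private final Iterator<String> remaining;

    LookupContext(List<String> components) {
        this.remaining = components.iterator();
    }

    boolean hasNext() { return remaining.hasNext(); }

    String next() { return remaining.next(); }
}

// A leaf resource resolves to itself: the walk stops when a leaf is reached.
class Resource {
    final String id;

    Resource(String id) { this.id = id; }

    Resource lookup(LookupContext context) { return this; }
}

class VirtualContainer extends Resource {
    private final Map<String, Resource> children = new LinkedHashMap<>();

    VirtualContainer(String id) { super(id); }

    VirtualContainer add(Resource child) {
        children.put(child.id, child);
        return this;
    }

    @Override
    Resource lookup(LookupContext context) {
        if (!context.hasNext()) {
            return this; // components exhausted: this container is the target
        }
        String childId = context.next();
        Resource child = children.get(childId);
        if (child == null) {
            throw new IllegalArgumentException("No such resource: " + childId);
        }
        return child.lookup(context); // pass the same context down the path
    }
}
```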
Metaphoria's caching mechanism was designed to support both the
memory management of Metaphoria resources and the caching of their
retrieved and presented content. The former is handled via the
RepositoryCache, and the latter is handled via the
ContentCache.
Fig. 12. Caching Metaphoria Resources
Since a repository may contain a very large number of resources, it is not practical or desirable for all of them to remain resident in main memory all the time. A persistence engine is provided to alleviate this problem. Like many components of Metaphoria, it is designed in a modular fashion with interfaces and abstract classes, so that, depending on the scalability requirements, Metaphoria may easily be adapted to utilize an existing persistence mechanism (e.g., one of several available Java persistence engines, or a full-scale object or relational database). Our default implementation has proved itself to be scalable to at least thousands of resources.
In the interest of efficiency, it is desirable to keep the most frequently accessed resources in main memory, and to swap less frequently accessed resources out to disk. The trick is to operationalize the notion of "most frequently accessed" in a manner that reflects the practical operation of the system. The simplest and most common approximation to computing the frequency of access is to maintain a time-stamp for the most recent access. To achieve a better approximation, the Metaphoria caching algorithm computes the number of accesses over a most recent finite period of time.
To implement this approximation, each Metaphoria resource object maintains three attributes relevant to caching: an access time-stamp, an evaluation time-stamp, and a priority value. Each time a resource is accessed, its access time-stamp is set to the current time, and its priority value is increased. On a periodic basis (or on demand, if available memory is getting low), an evaluation process is initiated, which causes each resource's priority value to be updated. If a resource was once very active, but has not been accessed frequently for quite some time, its priority value will be reduced to reflect the resource's diminishing use. This allows the algorithm to take into account recent usage patterns as well as total usage throughout the history of the resource.
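The three caching attributes and the evaluation step can be sketched as follows. The text does not give the actual update formula, so the decay rule here (halving the priority for each full evaluation period elapsed) is purely an assumption, as are the names CacheEntry, RepositoryCacheSketch, and evaluateAndEvict; time is passed in explicitly to keep the example deterministic.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// One entry per resident resource, holding the three attributes named in the text.
final class CacheEntry {
    final String id;
    long accessTime;      // time of the most recent access
    long evaluationTime;  // time up to which the priority has been evaluated
    double priority;

    CacheEntry(String id, long now) {
        this.id = id;
        this.accessTime = now;
        this.evaluationTime = now;
    }

    void recordAccess(long now) {
        accessTime = now;
        priority += 1.0;
    }

    // Assumed rule (not from the paper): halve the priority for each
    // full period that has elapsed since the last evaluation.
    void evaluate(long now, long periodMillis) {
        long periods = (now - evaluationTime) / periodMillis;
        if (periods > 0) {
            priority *= Math.pow(0.5, periods);
            evaluationTime += periods * periodMillis;
        }
    }
}

final class RepositoryCacheSketch {
    private final Map<String, CacheEntry> resident = new HashMap<>();
    private final long periodMillis;

    RepositoryCacheSketch(long periodMillis) { this.periodMillis = periodMillis; }

    void access(String id, long now) {
        resident.computeIfAbsent(id, k -> new CacheEntry(k, now)).recordAccess(now);
    }

    // Re-evaluates every entry, then swaps out the lowest-priority entries
    // until at most maxResident remain. Returns the ids that were swapped out.
    List<String> evaluateAndEvict(long now, int maxResident) {
        List<CacheEntry> entries = new ArrayList<>(resident.values());
        for (CacheEntry e : entries) {
            e.evaluate(now, periodMillis);
        }
        entries.sort(Comparator.comparingDouble((CacheEntry e) -> e.priority)
                .thenComparingLong(e -> e.accessTime));
        List<String> evicted = new ArrayList<>();
        for (CacheEntry e : entries) {
            if (resident.size() <= maxResident) {
                break;
            }
            resident.remove(e.id);
            evicted.add(e.id);
        }
        return evicted;
    }

    boolean isResident(String id) { return resident.containsKey(id); }

    double priorityOf(String id) { return resident.get(id).priority; }
}
```

Under this assumed rule, a resource accessed eight times long ago can end up ranked below one accessed twice recently, which is the behavior the paragraph above describes. A real implementation would write evicted resources to the persistence engine rather than simply dropping them from the map.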
Fig. 13. Caching Content
The ContentCache works similarly to the
RepositoryCache, with an important difference regarding the
duration of secondary storage. Although Metaphoria resources may be swapped
in and out of main memory, they are meant to have permanent lifetimes in
the system as a whole. In other words, their persistent representations on
disk do not expire. In contrast, the content they retrieve and present
should only persist as long as it is current; not only should it be purged
from main memory and written to disk after some period of inactivity, but
after its lifetime on disk has expired, it should be purged from secondary
storage as well. Items in the ContentCache are thus
considered to have an active lifetime (in main memory), and an
inactive lifetime (in secondary storage), after which they are
purged completely. Once purged, the content must be retrieved from its
source (or regenerated, as the case may be) in order to be presented again.
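The two-stage lifetime of cached content can be sketched in the same spirit. This is a minimal illustration under stated assumptions: both lifetimes are measured from the last access, an access to inactive content moves it back into main memory, and a second in-memory map stands in for secondary storage. The names ContentCacheSketch, ContentState, and sweep are hypothetical.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

enum ContentState { ACTIVE, INACTIVE, PURGED }

final class ContentCacheSketch {
    private final long activeLifetime;    // idle time allowed in main memory
    private final long inactiveLifetime;  // further idle time allowed on disk
    private final Map<String, String> memory = new HashMap<>();
    private final Map<String, String> disk = new HashMap<>(); // stand-in for secondary storage
    private final Map<String, Long> lastAccess = new HashMap<>();

    ContentCacheSketch(long activeLifetime, long inactiveLifetime) {
        this.activeLifetime = activeLifetime;
        this.inactiveLifetime = inactiveLifetime;
    }

    void put(String key, String content, long now) {
        disk.remove(key);
        memory.put(key, content);
        lastAccess.put(key, now);
    }

    // Returns the cached content, or null if it has been purged and must be
    // retrieved from its source (or regenerated) by the caller.
    String get(String key, long now) {
        sweep(now);
        String content = memory.get(key);
        if (content == null) {
            content = disk.remove(key);
            if (content == null) {
                return null;
            }
            memory.put(key, content); // assumed: an access re-activates the content
        }
        lastAccess.put(key, now);
        return content;
    }

    // Demotes idle content to disk and purges content whose disk lifetime has expired.
    void sweep(long now) {
        Iterator<Map.Entry<String, Long>> it = lastAccess.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            String key = entry.getKey();
            long idle = now - entry.getValue();
            if (idle >= activeLifetime + inactiveLifetime) {
                memory.remove(key);
                disk.remove(key);
                it.remove();
            } else if (idle >= activeLifetime && memory.containsKey(key)) {
                disk.put(key, memory.remove(key));
            }
        }
    }

    ContentState stateOf(String key) {
        if (memory.containsKey(key)) return ContentState.ACTIVE;
        if (disk.containsKey(key)) return ContentState.INACTIVE;
        return ContentState.PURGED;
    }
}
```

Unlike resources in the RepositoryCache, an entry here has no permanent representation: once both lifetimes elapse, nothing of it remains in the cache.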
There have been a number of recent attempts to apply a data-modeling approach to designing and building Web sites. Both "W3Objects" [ing95] and "CorbaWeb" [mer96] share some of their goals and objectives with Metaphoria but provide different solutions. W3Objects is an extensible object-oriented Web infrastructure that was designed to support a wide range of resources and services. In this system, Web resources are objects that have an internal state and a well-defined behavior. Such resources are responsible for managing their own state transitions and properties in response to method invocations. The W3Objects system does not attempt to use data analysis to automate the generation of the metadata utilized by its resources. It is implemented as a proprietary server that requires a gateway to talk to an HTTP client.
CorbaWeb was designed to provide access to CORBA objects belonging to the same Shared Information Space (SIS). The SIS is defined outside of the Web servers' realms, which makes it necessary to use the ad-hoc CGI mechanism to access the objects from HTTP browsers. This, of course, may change with the growing acceptance of IIOP. CorbaWeb proposes its own scripting language, CorbaScript, which is a shell-level language aimed at implementing interfaces. This differs from the Metaphoria scripting language, which is designed to support building executable specifications of information repositories.
Closely related to CorbaWeb is the "ANSA Workprogramme" [ree95], which focuses on addressing the communication problems of HTTP by using CORBA's more advanced IIOP communications protocol. In the simplest form, its authors propose IIOP-to-HTTP and HTTP-to-IIOP gateways. In the future, they hope to add IIOP support to Web browsers and to build native IIOP servers. The work concentrates on the details of using CORBA rather than on defining abstractions.
"InfoHarness" is another example of the use of data modeling for building Web sites [shk95-1]. This earlier work was targeted primarily at providing Web access to legacy information, and the InfoHarness object model did not provide for sophisticated containers and multi-level encapsulation. InfoHarness objects were served by a proprietary server through a CGI interface and object specifications were loaded into the main memory when the server came up. A commercial version of the system is implemented as a stand-alone HTTP server and stores object specifications in a relational database.
In contrast to papers that concentrate on constructing and accessing server-based object frameworks, the "Distributed Active Objects" work [bro96] applies an object-oriented approach to building distributed applets. In the context of this work, applets have very relaxed communication restrictions, which raises a variety of security concerns. We do not believe that it is beneficial to consider client-side objects out of the context of server-based object networks.
We believe Metaphoria to be an important step in changing the approach to building applications for the Web environment, because it provides a consistent means for separating information content from its presentation. This division maps neatly to practical reality: content often changes independently of the method for presenting it, and vice versa. As HTML 4, XML, and VRML find their way into the latest browsers, only the presentation methods of Metaphoria-enabled Web sites need to change in order to remain in step with cutting-edge trends; the content itself does not. Conversely, as content is updated, presentation methods can remain the same, minimizing maintenance while providing a consistent look-and-feel to a Web site. This double advantage of separating content from presentation is desirable for both the Internet and internal corporate networks. Metaphoria enables a very smooth transition to this new approach because one of its early motivations was to provide flexible access to legacy data.
In addition to streamlining Web application design, Metaphoria provides very high flexibility in personalizing data presentation based on client information. We foresee many possible applications for such personalization, including, but not limited to:
Our current work is mainly concentrated on designing and implementing a scripting language for building and presenting information repositories, possibly by extending an existing language (e.g., Tcl). We are also experimenting with a visual front-end for this language.