{ leon | davemak | weiyeh }@pencom.com
Position paper for the Workshop on Object-Oriented Web Servers and Data Modeling,
Sixth International World Wide Web Conference.
Monday, April 7, 1997
Santa Clara, USA.
This paper provides a brief overview of the MetaMagic™ architecture, a model that uses metadata objects, generated by and residing on a Web server, to facilitate the flexible presentation of heterogeneous data, which may itself be located elsewhere. To allow access to data in any number of locations, available via any number of protocols, we define an abstract data source interface -- an interface that the metadata objects use to obtain data and, in turn, implement in order to serve as data sources in their own right. The entire architecture is implemented as a hierarchy of Java classes, extending object-oriented functionality currently provided by W3C's Jigsaw server. Future plans include adapting the hierarchy to work with server-side Java implementations emerging for various servers.
Our model focuses on the creation of MetaMagic™ resources, logical entities that use metadata to retrieve their content by accessing one or more data sources and that have one or more methods of presenting that content via the Web. Our notion of a data source is abstract, in that a data source may represent a local file, a URL, a database query, or any other entity that may yield a stream of data. The notion is also recursive; a MetaMagic resource utilizes data sources in order to access its content, but also, through its presentation method (or methods), may itself be utilized as a data source. MetaMagic resources are grouped together into virtual containers, which are in turn grouped into repositories. The containers and repositories may then be associated with any available content-based indexing technology.
An important advantage of our solution is advanced support for automated generation of MetaMagic resources. As in [shk95-2], we define and implement high-level operations to control data analysis and metadata generation and organization.
In addition to supporting flexible access to existing information, MetaMagic provides a reason to seriously rethink the way new Web sites are designed and built. With MetaMagic supporting multiple dynamic views of the same data, it becomes beneficial to separate content from presentation. The main task of site design is then to logically collect the content in the most transparent form (e.g., plain text files or database entries), without moving data to a single file system or redesigning data maintenance procedures. An important additional benefit of such an approach is that it protects Web sites from the assault of new presentation technologies, which have long since made early HTML pages look outdated. Achieving a cutting-edge presentation would only involve upgrading presentation methods, without changing the physical content.
Instead of implementing yet another HTTP server in order to support the MetaMagic model, we have designed and implemented a hierarchy of Java classes that support data modeling capabilities. These classes extend functionality currently provided by W3C's object-oriented Jigsaw server [bai96], and are capable of being adapted to work with the coming generation of commercial object-oriented Web servers.
In MetaMagic, we create logical Web resources -- that is, resources accessed by a URL, but represented by object specifications rather than by some physical file system entity -- that are composed of metadata attributes. We call these resources MetaMagic resources. At presentation time, when a request for such a resource arrives at the MetaMagic server, these metadata attributes, in combination with the request information, determine the content of the reply.
get request, while a data
source for an SQL query may execute a shell command utilizing an ad-hoc
query tool. Regardless of these differences, both present their content as
a stream after receiving a "present content" request.
Because there may be an arbitrary number of data source classes that provide a consistent interface but differ in their operational implementation, objects that use data sources rely on a data source factory to construct specific data sources.
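The relationship between the abstract data source interface and the factory can be sketched as follows. This is a minimal illustration in present-day Java, not the MetaMagic class hierarchy itself: the interface shape, the FileDataSource and TextDataSource classes, and the "scheme:rest" specification format are all assumptions made for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Assumed shape of the abstract data source: anything that yields a stream.
interface DataSource {
    InputStream presentContent() throws IOException;
}

// Hypothetical data source backed by a local file.
class FileDataSource implements DataSource {
    private final String path;

    FileDataSource(String path) {
        this.path = path;
    }

    public InputStream presentContent() throws IOException {
        return Files.newInputStream(Paths.get(path));
    }
}

// Hypothetical data source backed by literal text, for illustration only.
class TextDataSource implements DataSource {
    private final String text;

    TextDataSource(String text) {
        this.text = text;
    }

    public InputStream presentContent() {
        return new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8));
    }
}

// Hypothetical factory: callers name a source as "scheme:rest" and never
// refer to a concrete data source class directly.
final class DataSourceFactory {
    private DataSourceFactory() {}

    static DataSource create(String spec) {
        int colon = spec.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("Missing scheme: " + spec);
        }
        String scheme = spec.substring(0, colon);
        String rest = spec.substring(colon + 1);
        switch (scheme) {
            case "file":
                return new FileDataSource(rest);
            case "text":
                return new TextDataSource(rest);
            default:
                throw new IllegalArgumentException("Unknown scheme: " + scheme);
        }
    }
}
```

A caller would write, for example, DataSourceFactory.create("text:hello").presentContent() and read the resulting stream without knowing which class produced it. Supporting a new protocol then means adding one class and one case to the factory.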
A MetaMagic resource is a data source capable of packaging its content as an HTTP response. MetaMagic resources can be collected into virtual containers. A virtual container is itself a MetaMagic resource that contains other MetaMagic resources, in much the same way that a directory on a filesystem is seen as containing files. MetaMagic resources that are not containers are referred to as leaf resources. MetaMagic resources are referred to by HTTP URLs.
Albeit with certain qualifications, the choice of Java as an implementation language allows us to take platform-independence in the traditional sense for granted. Instead, we consider portability in terms of server platforms. In this context, the challenge is to ensure that our class hierarchy is general enough to extend different HTTP servers.
We have formulated a number of requirements for HTTP servers that qualify them as acceptable MetaMagic platforms:
Having adopted such an approach, we determined that the Jigsaw reference server from W3C was the natural and obvious choice as the basis for our initial prototyping. Jigsaw is implemented in Java and provides a Java API with all of our required features. Furthermore, the Jigsaw server is publicly available, including complete source code.
Plans for adapting and generalizing the architecture to work with server-side Java implementations of various other servers are already underway.
Before MetaMagic resources can present their content via the Web, they must
be generated and installed in repositories on the Web server. The first
operation that must be performed to generate a repository is the
encapsulate operation. To support this operation, we
define a class Encapsulator with an abstract method
encapsulate(). Subclasses of Encapsulator must
implement this method in order to provide a specification for the analysis
of data streams.
While performing the analysis, the Encapsulator obtains
metadata, which it stores in MetaDataNode objects (see
fig. 1). On completion of the analysis, these
objects are returned as a set. Each MetaDataNode
is essentially a collection of name-value pairs, and may contain references
to a parent node and zero or more child nodes. Neither parent nor child
references are set by the encapsulate() method; parent-child
relationships are established later by the group
operation.
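As a rough sketch of the encapsulate operation, the following defines a MetaDataNode and an abstract Encapsulator along the lines described above. The method signature (an InputStream in, a Set of nodes out), the field names, and the LineEncapsulator subclass are assumptions made for illustration; the original classes may differ.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// A collection of name-value pairs with optional parent and child references.
class MetaDataNode {
    final Map<String, String> attributes = new LinkedHashMap<>();
    MetaDataNode parent;                                     // not set by encapsulate()
    final List<MetaDataNode> children = new ArrayList<>();   // not filled by encapsulate()
}

// Subclasses specify how a data stream is analyzed (signature is assumed).
abstract class Encapsulator {
    abstract Set<MetaDataNode> encapsulate(InputStream in) throws IOException;
}

// Hypothetical subclass: produces one node per non-blank line of text.
class LineEncapsulator extends Encapsulator {
    Set<MetaDataNode> encapsulate(InputStream in) throws IOException {
        Set<MetaDataNode> nodes = new LinkedHashSet<>();
        // The caller owns the stream, so it is not closed here.
        BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        String line;
        int lineNumber = 0;
        while ((line = reader.readLine()) != null) {
            lineNumber++;
            String title = line.trim();
            if (title.isEmpty()) {
                continue;
            }
            MetaDataNode node = new MetaDataNode();
            node.attributes.put("line", Integer.toString(lineNumber));
            node.attributes.put("title", title);
            nodes.add(node);
        }
        return nodes;
    }
}
```

Every node returned by this sketch has a null parent and no children, matching the rule that encapsulate() leaves the hierarchy to the group operation.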
After the Encapsulators have analyzed their streams, and each
has returned a set of MetaDataNodes, set operations may be
performed in order to remove elements from the sets, combine sets, or
create new sets from existing elements. Sets of MetaDataNodes
are then grouped together by creating a new MetaDataNode to
contain them. The group operation is illustrated in
fig. 2. The container node is designated as the parent,
and the contained nodes become its children.
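The group operation can be sketched as a single method that creates the container node and wires up the parent and child references. The Grouping class, the "name" attribute, and the rule that an already-grouped node is rejected are assumptions for this example; the block repeats a minimal MetaDataNode so that it stands alone.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Minimal node: name-value pairs plus parent and child references.
class MetaDataNode {
    final Map<String, String> attributes = new LinkedHashMap<>();
    MetaDataNode parent;
    final List<MetaDataNode> children = new ArrayList<>();

    MetaDataNode(String name) {
        attributes.put("name", name);   // "name" is an assumed attribute
    }
}

final class Grouping {
    private Grouping() {}

    // Creates a new container node and makes it the parent of every member.
    static MetaDataNode group(String containerName, Collection<MetaDataNode> members) {
        // Assumption: a node has at most one parent, so regrouping is an error.
        // Checking first keeps a failed call from leaving members half-grouped.
        for (MetaDataNode member : members) {
            if (member.parent != null) {
                throw new IllegalStateException(
                        "Node already grouped: " + member.attributes.get("name"));
            }
        }
        MetaDataNode container = new MetaDataNode(containerName);
        for (MetaDataNode member : members) {
            member.parent = container;
            container.children.add(member);
        }
        return container;
    }
}
```

Because the container is itself a MetaDataNode, the result of one group call can be passed as a member to another, which is how deeper hierarchies would be built up in this sketch.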
[Figure 3: MetaDataNodes to Jigsaw resources]
Finally, after the MetaDataNodes have been arranged in a
suitable hierarchy, the nodes are converted into Jigsaw resources and
installed into the server's information space, in a configuration mirroring
the hierarchy of MetaDataNodes
(fig. 3). This is accomplished via a
pre-order traversal of the hierarchy, converting each
MetaDataNode as it is visited. MetaDataNodes
with children are converted into VirtualContainerResources,
and those without children are converted into LeafResources,
which are implemented as extensions of Jigsaw's
ContainerResource and FilteredResource classes,
respectively.
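The pre-order conversion can be illustrated with a short recursive traversal. The sketch below does not use the Jigsaw API: InstalledResource is a stand-in that records only a path and whether the node became a container, and the Installer class and path scheme are assumptions for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal node for this sketch: a name and an ordered list of children.
class MetaDataNode {
    final String name;
    final List<MetaDataNode> children = new ArrayList<>();

    MetaDataNode(String name) {
        this.name = name;
    }

    MetaDataNode add(MetaDataNode child) {
        children.add(child);
        return this;
    }
}

// Stand-in for a server resource; this is not a Jigsaw class.
class InstalledResource {
    final String path;
    final boolean container;   // true for a virtual container, false for a leaf

    InstalledResource(String path, boolean container) {
        this.path = path;
        this.container = container;
    }
}

final class Installer {
    private Installer() {}

    // Returns the resources in the order in which they would be installed.
    static List<InstalledResource> install(MetaDataNode root) {
        List<InstalledResource> installed = new ArrayList<>();
        visit(root, "", installed);
        return installed;
    }

    private static void visit(MetaDataNode node, String parentPath,
                              List<InstalledResource> installed) {
        String path = parentPath + "/" + node.name;
        // Pre-order: convert the node itself before descending, so that a
        // container always exists before its children are installed in it.
        installed.add(new InstalledResource(path, !node.children.isEmpty()));
        for (MetaDataNode child : node.children) {
            visit(child, path, installed);
        }
    }
}
```

For a root "docs" with children "a" (which has a child "a1") and "b", this traversal yields /docs, /docs/a, /docs/a/a1 and /docs/b in that order, with the first two marked as containers.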
When a request arrives at a MetaMagic-enabled server, and the target of
the request is a MetaMagic resource (as opposed to any ordinary Web
resource, which is handled in the normal way), then the MetaMagic resource
responds to the request by sending a presentContent()
message to itself, initiating the following procedure:
1. It obtains its data sources from the DataSourceFactory, which instantiates the appropriate data source classes.
2. It invokes the presentContent() method for each data source, obtaining each data source's content as an InputStream.
3. It reads the InputStreams provided by its data sources, filtering, integrating, processing or interpreting these streams to generate its own content.
4. It produces an InputStream containing the generated content. This stream is returned by the getContent() method.
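The procedure above can be sketched as a resource that is itself a data source. In this illustration the "integration" step is plain concatenation, the factory understands only a hypothetical "text:" specification, and the generated stream is returned directly from presentContent(); the division of labor between presentContent() and getContent() in the actual classes is not modelled here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Assumed shape of the abstract data source interface.
interface DataSource {
    InputStream presentContent() throws IOException;
}

// Hypothetical factory; it handles only "text:" specifications in this sketch.
final class DataSourceFactory {
    private DataSourceFactory() {}

    static DataSource create(String spec) {
        if (!spec.startsWith("text:")) {
            throw new IllegalArgumentException("Unsupported source: " + spec);
        }
        byte[] bytes = spec.substring("text:".length()).getBytes(StandardCharsets.UTF_8);
        return () -> new ByteArrayInputStream(bytes);
    }
}

// A resource whose metadata is a list of data source specifications.
class ConcatenatingResource implements DataSource {
    private final List<String> sourceSpecs;

    ConcatenatingResource(List<String> sourceSpecs) {
        this.sourceSpecs = new ArrayList<>(sourceSpecs);
    }

    public InputStream presentContent() throws IOException {
        // Step 1: let the factory instantiate the data sources.
        List<DataSource> sources = new ArrayList<>();
        for (String spec : sourceSpecs) {
            sources.add(DataSourceFactory.create(spec));
        }
        // Steps 2 and 3: obtain each source's stream and integrate the
        // contents (here simply by appending them in order).
        ByteArrayOutputStream merged = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        for (DataSource source : sources) {
            try (InputStream in = source.presentContent()) {
                int n;
                while ((n = in.read(buffer)) != -1) {
                    merged.write(buffer, 0, n);
                }
            }
        }
        // Step 4: expose the generated content as a stream of its own.
        return new ByteArrayInputStream(merged.toByteArray());
    }
}
```

Since ConcatenatingResource implements the same interface as the sources it consumes, another resource could use it as a data source in turn, which is the recursion the architecture relies on.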
To complete its response to the HTTP request, the MetaMagic resource
attaches the generated content to an HTTPReply object,
fills in any applicable HTTP header fields in the reply, and then
passes the reply to the server to send back to the browser.
Note that the MetaMagic resource, rather than the server, is responsible
for attaching the reply header. This has important implications,
particularly for the Content-type field. While a typical Web
server sets the Content-type field based upon a configurable
but static mapping between file extensions and MIME types, MetaMagic
resources set their MIME types dynamically at presentation time. This
allows a single MetaMagic resource to present its content flexibly, based
on information that may be passed in through the HTTP request.
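To make the dynamic Content-type point concrete, the sketch below shows one resource presenting the same content as either plain text or HTML, depending on a request parameter. The Reply class is a stand-in rather than the Jigsaw reply object, and the "format" parameter and GreetingResource class are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-in for an HTTP reply; this is not the Jigsaw reply class.
class Reply {
    final Map<String, String> headers = new LinkedHashMap<>();
    String body;
}

// Hypothetical resource with one piece of content and two presentations.
class GreetingResource {
    private final String content;

    GreetingResource(String content) {
        this.content = content;
    }

    // The resource, not the server, chooses the MIME type at presentation
    // time, based on the (assumed) "format" request parameter.
    Reply present(Map<String, String> requestParams) {
        Reply reply = new Reply();
        String format = requestParams.getOrDefault("format", "html");
        if ("text".equals(format)) {
            reply.headers.put("Content-Type", "text/plain");
            reply.body = content;
        } else {
            reply.headers.put("Content-Type", "text/html");
            reply.body = "<html><body><p>" + escape(content) + "</p></body></html>";
        }
        return reply;
    }

    // Minimal escaping so the content cannot break the generated markup.
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

The same URL can therefore answer with different MIME types, which a static mapping from file extensions to types cannot express.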
We believe MetaMagic to be a step toward changing the approach to building Web sites, because it provides technology for separating information content from its presentation. This division maps neatly to practical reality: content often changes independently of the method for presenting it, and vice versa. As new extensions to HTML find their way into the latest browsers, only the presentation methods of a MetaMagic-enabled Web site need to change in order to remain in step with cutting-edge trends; the content does not need to change. Conversely, as content is updated, presentation methods can remain the same, providing a consistent look-and-feel to a Web site. This double advantage of separating content from presentation is desirable both on the Internet and on corporate intranets. MetaMagic enables a smooth transition to this new approach because it was created to provide flexible access to legacy data.
In addition to streamlining Web site design, MetaMagic provides very high flexibility in personalizing data presentation based on client information. We foresee many possible applications for such personalization, including, but not limited to:
Our current work is mainly concentrated on designing and implementing a scripting language for building information repositories and a visual front-end for this language. We are also investigating techniques for using software agents to maintain referential integrity of both logical and physical resources.