Workshop: Object-Oriented Web Servers and Data Modeling
The stated objective of the workshop was to bring together researchers and developers working on designing extensible object frameworks for the World-Wide Web -- as exemplified by the new generation of HTTP servers (Jigsaw, Java Web Server, etc.) -- and those interested in utilizing these frameworks to apply a data-modeling approach to Web site construction. As it happened, the workshop also generated a lot of interest among application developers.
Workshop discussions were structured around the presentations of position papers. These presentations were ordered so as to facilitate a gradual transition from discussing object frameworks to generic high-level services (concentrating on data modeling) and, finally, applications.
Leon Shklar opened the workshop with a discussion of the agenda and purpose of the workshop. He discussed synergy between the data-modeling approach and the emergence of object-oriented Web servers. Several data modeling projects (e.g., MORE, InfoHarness, W3Objects, MetaMagic) have produced interesting prototypes. The main idea of the approach is to create server-side models consisting of interrelated objects, capable of presenting themselves via the Web in response to HTTP requests. A Web site then becomes a view, or set of views, of this model, rather than simply a device for serving static files or the output of individual executable CGI programs.
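As an illustration of this idea, the following is a minimal Java sketch; it is not taken from any of the prototypes named above. The class names ModelSite and ModelObject, and the choice of an HTML list as the presentation, are assumptions made for the example. It shows a request being answered by a model object that renders itself and its relationships, rather than by a static file.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical site: resolves a request path to a model object and lets that
// object render itself, instead of reading a static file from disk.
class ModelSite {
    private final Map<String, ModelObject> byPath = new HashMap<>();

    void add(ModelObject object) {
        byPath.put(object.path(), object);
    }

    String handleGet(String path) {
        ModelObject object = byPath.get(path);
        return object == null ? "404 Not Found" : object.present();
    }

    public static void main(String[] args) {
        ModelObject report = new ModelObject("/report", "Report");
        ModelObject paper = new ModelObject("/paper", "Position Paper");
        report.relateTo(paper);

        ModelSite site = new ModelSite();
        site.add(report);
        site.add(paper);
        System.out.println(site.handleGet("/report"));
    }
}

// Hypothetical model object: knows its own data, its links to related
// objects, and how to present itself as HTML.
class ModelObject {
    private final String path;
    private final String title;
    private final List<ModelObject> related = new ArrayList<>();

    ModelObject(String path, String title) {
        this.path = path;
        this.title = title;
    }

    String path() { return path; }

    String title() { return title; }

    void relateTo(ModelObject other) { related.add(other); }

    // One possible view of this object: a heading plus links to related
    // objects. Titles are assumed to be HTML-safe; no escaping is done here.
    String present() {
        StringBuilder html = new StringBuilder();
        html.append("<h1>").append(title).append("</h1><ul>");
        for (ModelObject r : related) {
            html.append("<li><a href=\"").append(r.path()).append("\">")
                .append(r.title()).append("</a></li>");
        }
        return html.append("</ul>").toString();
    }
}
```

In a sketch like this, a second view of the same model would be another method alongside present(); the model objects themselves would not change.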
To take maximum advantage of the data modeling approach, it is advisable to create content in a form that easily lends itself to structural analysis. In other words, there is no need to include presentation-specific markup in the content. Instead, content should be marked up in a way that most readily facilitates separation into logical units of information that may be dynamically marked up for presentation. In the context of W3C's XML (Extensible Markup Language) and CSS (Cascading Style Sheets) efforts, each of which promotes a different level of separation between content and presentation, we add yet another level to such separation. We are considering the introduction of the so-called Structure Markup Language, which would be most naturally defined as a set of XML entity specifications.
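To make the separation concrete, here is a small Java sketch under stated assumptions: content is held as named logical units (the unit names "title" and "summary" are invented for the example), and the Presenter interface with its two implementations is hypothetical. It is not a rendering of the Structure Markup Language, which had not yet been defined; it only shows the same units being marked up dynamically in two different ways.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

class PresentationDemo {
    public static void main(String[] args) {
        // Content is stored once, as logical units with no presentation markup.
        Map<String, String> units = new LinkedHashMap<>();
        units.put("title", "Workshop");
        units.put("summary", "Notes");

        System.out.println(new HtmlPresenter().render(units));
        System.out.print(new TextPresenter().render(units));
    }
}

// Hypothetical presenter: turns logical units into one concrete presentation.
interface Presenter {
    String render(Map<String, String> units);
}

// Wraps each unit in a div whose class is the unit name.
// Unit text is assumed to be HTML-safe; no escaping is done here.
class HtmlPresenter implements Presenter {
    public String render(Map<String, String> units) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, String> unit : units.entrySet()) {
            out.append("<div class=\"").append(unit.getKey()).append("\">")
               .append(unit.getValue()).append("</div>");
        }
        return out.toString();
    }
}

// Emits one "NAME: text" line per unit.
class TextPresenter implements Presenter {
    public String render(Map<String, String> units) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, String> unit : units.entrySet()) {
            out.append(unit.getKey().toUpperCase(Locale.ROOT)).append(": ")
               .append(unit.getValue()).append("\n");
        }
        return out.toString();
    }
}
```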
Until now, the absence of a common server platform was the main limitation of this promising approach. Early prototypes were implemented using either the CGI mechanism, with its well-known performance and scalability limitations, or server APIs, which seriously limit portability. The emergence of object-oriented Java-based Web servers (Jigsaw, Jeeves, etc.), JavaSoft's Servlet API, and IIOP-based platforms opens up completely new opportunities. Such innovations make it possible to design layers of functionality while avoiding the tedious task of porting them to multiple server platforms. Moreover, new layers of functionality should be able to take advantage of the ones that are already available. Facilitating communication between groups working on these different layers was the main focus of the meeting.
Anselm Baird-Smith started his keynote presentation by stressing the object-oriented nature of the Web. Initial applications were based primarily on sharing file-based information resources (e.g., HTML pages) and on CGI-based interaction. The first Web servers only had to know how to serve files and execute CGI scripts. The new emerging generation of Web servers aims at maximum extensibility to make heterogeneous documents, programs, databases, and other resources available to audiences with widely varying requirements. He discussed the evolution to custom protocols extending (e.g., via W3C's PEP) and possibly replacing HTTP. With that in mind, his Jigsaw server was designed to handle multiple protocols.
There are three key parts to Jigsaw:
In the conclusion of his talk, Anselm discussed what he sees as current impediments to the progress of Web server technology:
This work originally started with the development of Jeeves (a Java-based object-oriented Web server, now renamed Java Web Server), but its current focus is on providing the Java Servlet API, an open framework for adding new services to existing servers, including not only HTTP servers but servers for other protocols as well. The framework principally supports server administration, security, thread and session management, etc. It will soon support server-side web page compilation and Java code generation. Servlets are platform- and server-independent and are network-downloadable from other servers. As a result, they exist under certain security restrictions, similar to those imposed upon applets loaded into a browser (the "sandbox" model). Also related to security, servlets support servlet signing for server-side load/execution. The java.servlet package contains the following principal interfaces:
Servlet, supporting methods to initialize and destroy servlet instances, as well as the service() method that defines the main activity of the servlet.
ServletConfig, enabling named parameters to be passed to the servlet, and providing the servlet with access to its ServletContext.
ServletContext, allowing servlets to find out information about their environment, such as attributes of the server in which they are installed, what other servlets are running, etc.
ServletRequest, an interface that encapsulates a protocol-independent request from a client.
ServletResponse, an interface encapsulating a protocol-independent response to a client.
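The division of labor among these five interfaces can be sketched as follows. The Mini* interfaces below are simplified, hypothetical stand-ins written for this report so that the example compiles on its own; their method lists are assumptions and are much smaller than the real API. GreetingServlet, MiniServletDemo, and the server name "SketchServer/0.1" are likewise invented.

```java
import java.util.HashMap;
import java.util.Map;

// Plays the role of the server: builds the config, request and response
// objects, then drives one servlet through init, service and destroy.
class MiniServletDemo {
    static String run(final Map<String, String> initParams,
                      final Map<String, String> requestParams) {
        final MiniServletContext context = () -> "SketchServer/0.1";
        MiniServletConfig config = new MiniServletConfig() {
            public String getInitParameter(String name) { return initParams.get(name); }
            public MiniServletContext getServletContext() { return context; }
        };
        final StringBuilder out = new StringBuilder();

        MiniServlet servlet = new GreetingServlet();
        servlet.init(config);
        servlet.service(name -> requestParams.get(name), text -> out.append(text));
        servlet.destroy();
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> init = new HashMap<>();
        init.put("greeting", "Welcome");
        Map<String, String> request = new HashMap<>();
        request.put("name", "Leon");
        System.out.println(run(init, request));
    }
}

// Hypothetical stand-ins for the five interfaces described above.
interface MiniServletContext {
    String getServerInfo();                // an attribute of the hosting server
}

interface MiniServletConfig {
    String getInitParameter(String name);  // named parameters for the servlet
    MiniServletContext getServletContext();
}

interface MiniServletRequest {
    String getParameter(String name);      // protocol-independent request data
}

interface MiniServletResponse {
    void write(String text);               // protocol-independent response output
}

interface MiniServlet {
    void init(MiniServletConfig config);
    void service(MiniServletRequest request, MiniServletResponse response);
    void destroy();
}

// Example servlet: greets the caller using a configured greeting word.
class GreetingServlet implements MiniServlet {
    private String greeting = "Hello";
    private String serverInfo = "unknown";

    public void init(MiniServletConfig config) {
        String configured = config.getInitParameter("greeting");
        if (configured != null) {
            greeting = configured;
        }
        serverInfo = config.getServletContext().getServerInfo();
    }

    public void service(MiniServletRequest request, MiniServletResponse response) {
        String name = request.getParameter("name");
        response.write(greeting + ", " + (name == null ? "world" : name)
                + " (served by " + serverInfo + ")");
    }

    public void destroy() {
        // Nothing to release in this sketch.
    }
}
```

Because the servlet sees only the request and response interfaces, nothing in GreetingServlet depends on HTTP; that is the protocol independence the interface descriptions refer to.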
The Java Web Server (the successor of Jeeves) is extensible via servlets, supports virtual hosting, inherits all features of Jeeves, and is HTTP/1.0 (but not HTTP/1.1) compliant. It supports so-called servlet chaining.
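The talk notes do not define servlet chaining. On our reading of the term, the output of one servlet is passed as the input of the next, so that a response is built up in stages. The Java sketch below illustrates only that reading; StageChain is a hypothetical class, and plain string functions stand in for servlets.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical chain: each stage receives the previous stage's output.
class StageChain {
    private final List<UnaryOperator<String>> stages = new ArrayList<>();

    // Appends a stage and returns the chain so calls can be strung together.
    StageChain then(UnaryOperator<String> stage) {
        stages.add(stage);
        return this;
    }

    // Runs the input through every stage in the order the stages were added.
    String process(String input) {
        String current = input;
        for (UnaryOperator<String> stage : stages) {
            current = stage.apply(current);
        }
        return current;
    }

    public static void main(String[] args) {
        StageChain chain = new StageChain()
            .then(body -> "<p>" + body + "</p>")
            .then(page -> page + "<address>webmaster</address>");
        System.out.println(chain.process("Workshop report"));
    }
}
```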
In this talk, Wei-Yeh Lee provided a brief overview of James, ExpressO, and Cascade. Unlike Jigsaw, none of these servers currently provides adequate support for data modeling applications.
James, the unlikely product of the Dutch public broadcasting organization, VPRO, is not simply an HTTP server, but a multi-protocol "server-server." The key features that have the potential for supporting metadata-based architectures include its support for the Java Servlet API, as well as Modules, services consisting of multiple objects that share some state and synchronization information. Unfortunately, only very limited documentation is currently available for James (unless you happen to understand Dutch).
Upon examination, ExpressO turned out to be a Java implementation of an NCSA-type server. It does not support CGI, but supports GET and POST through the ServerProcedure mechanism, which is reminiscent of CGI and is implemented in Java. URLs are mapped to ServerProcedures either through an administrative interface or through a configuration file. It is not clear if or how this mapping can be programmatically maintained.
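For comparison, a programmatically maintained mapping of this kind might look like the Java sketch below. It does not reproduce ExpressO's ServerProcedure interface, whose signature we do not know; the Procedure interface, the ProcedureRegistry class, and the status strings are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical registry: maps URL paths to procedures and dispatches
// GET and POST requests to them.
class ProcedureRegistry {
    private final Map<String, Procedure> byUrl = new HashMap<>();

    // Adds or replaces a mapping at run time.
    void map(String url, Procedure procedure) {
        byUrl.put(url, procedure);
    }

    String dispatch(String method, String url, Map<String, String> params) {
        if (!method.equals("GET") && !method.equals("POST")) {
            return "405 Method Not Allowed";
        }
        Procedure procedure = byUrl.get(url);
        if (procedure == null) {
            return "404 Not Found";
        }
        return procedure.handle(method, params);
    }

    public static void main(String[] args) {
        ProcedureRegistry registry = new ProcedureRegistry();
        registry.map("/echo", (method, params) -> method + ":" + params.get("msg"));

        Map<String, String> params = new HashMap<>();
        params.put("msg", "hello");
        System.out.println(registry.dispatch("GET", "/echo", params));
    }
}

// Hypothetical procedure: receives the request method and its parameters
// and returns the response body.
interface Procedure {
    String handle(String method, Map<String, String> params);
}
```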
Complete documentation on Cascade's architecture was not available at the time of the survey of Java Web Servers, so its features were not summarized.
Mike Spreitzer and his colleague Bill Janssen are working on HTTP-NG, which is intended to serve as a path to convergence between HTTP and IIOP. It remains unclear whether HTTP-NG will in fact be defined as an IIOP-NG application or whether these future protocols will have a different kind of relationship. Mike and Bill are pushing for this work to become a formal W3C activity. Current problems include HTTP "mission creep," whereby separate groups are altering or abusing the HTTP standard for their own needs. This problem should be somewhat remedied by the emerging Protocol Extension Protocol (PEP) standard. Other problems include so-called anarchic evolution, and a requirement to query servers for versions and refinements/extensions, which costs too many roundtrips. A partial solution Mike cited was the use of structs with lists of extensions, e.g. IDL-like descriptions. He also described three layers that HTTP was (or should be) providing as well as possible solutions: transport (e.g., W3Mux, TCP state sharing, Tx TCP), RMI, and WWW interfaces.
David Ingham discussed the W3Objects system, which is based on an ORB-like foundation. W3Objects provides low-level support for referential integrity and migration transparency (something that is not directly provided by CORBA), allowing resources to be moved around freely without causing broken links.
Web access to W3Objects is achieved via a gateway, implemented as a plug-in module for an extensible Web server such as Apache. The gateway translates HTTP requests into RPCs on distributed objects. This design supports scalable sites, since services (objects) can be distributed across a number of hosts transparently to the Web user. W3Objects persist across client requests, thereby simplifying the construction of session-based services. This architecture has advantages over both CGI and server-specific APIs, in particular: transparent distribution of services; fault isolation (server APIs load all application code into the server, so it is easy to introduce bugs that can bring down the whole server); support for sessions; and good performance (cf. CGI).
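The gateway idea can be sketched in Java as follows. This is not the W3Objects implementation: the RPC is simulated by a local method call, and the URL layout "/name/operation?argument", the RemoteObject interface, and the Counter object are assumptions made for the example. The sketch shows a request path being translated into an invocation on a named object that keeps its state between requests.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical gateway: translates "/name/operation?argument" into a call
// on the object bound to that name. A real gateway would issue an RPC to
// an object that may live on another host.
class Gateway {
    private final Map<String, RemoteObject> objects = new HashMap<>();

    void bind(String name, RemoteObject object) {
        objects.put(name, object);
    }

    String handle(String requestPath) {
        String[] pathAndQuery = requestPath.split("\\?", 2);
        // "/counter/increment" splits into ["", "counter", "increment"].
        String[] parts = pathAndQuery[0].split("/");
        if (parts.length != 3) {
            return "400 Bad Request";
        }
        RemoteObject target = objects.get(parts[1]);
        if (target == null) {
            return "404 Not Found";
        }
        String argument = pathAndQuery.length == 2 ? pathAndQuery[1] : "";
        return target.invoke(parts[2], argument);
    }

    public static void main(String[] args) {
        Gateway gateway = new Gateway();
        gateway.bind("counter", new Counter());
        gateway.handle("/counter/increment");
        gateway.handle("/counter/increment");
        System.out.println(gateway.handle("/counter/read"));
    }
}

// Hypothetical stand-in for a distributed object reachable through the gateway.
interface RemoteObject {
    String invoke(String operation, String argument);
}

// The object outlives individual requests, so its state acts as a session.
class Counter implements RemoteObject {
    private int count;

    public String invoke(String operation, String argument) {
        if (operation.equals("increment")) {
            count++;
            return String.valueOf(count);
        }
        if (operation.equals("read")) {
            return String.valueOf(count);
        }
        return "unknown operation";
    }
}
```

Since callers address the object only by name, the object behind "counter" could be rebound to one on a different host without any change to the URLs that clients use.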
W3Objects provides higher-level mechanisms to facilitate the construction of manageable services. This is based on a concept called views, in which common components can be isolated to simplify changes. Dynamic views are also supported through the W3OScript language (based on Tcl/Tk). W3OScript is used to encode the presentation logic of services to support customised presentation of services. W3OScript also provides the glue between the Web presentation interface and the functional interface of a service. The W3Objects system provides gateways to legacy systems via object wrapping and W3OScript scripting. It was designed to pay particular attention to caching and change management (only reload what has changed).
Dave Makower presented Pencom Web Works' MetaMagic system, which is currently implemented as an extension of Jigsaw. MetaMagic focuses upon separating content from presentation, location transparency, and presentation negotiation. It is based on automatically generating repositories of metadata objects via a multi-step information fusion process:
Joule was designed to provide asynchronous support for group collaboration. It intends to support notification via an event model similar to that supported in JDK 1.1. Joule provides its own interface to its servers that forces users to supply required metadata (e.g. keywords) before the documents are accepted. Elizabeth Frank, who was giving the talk, also mentioned Merge, a heterogeneous querying protocol that is now under development. The underlying persistent object store of Joule is the current version of the old NCSA repository project.
Sankar Virdhagriswaran talked about PowWow and RAMP (DARPA-funded). PowWow supports four key features:
PowWow's version control supports federation across multiple control systems using a central site. The groupspace management supports filtered views using content-descriptive metadata (e.g. "show me all beta files"). URLs can be mapped to multiple other servers using an HTTP proxy. The multi-site build functionality supports three configurations: client-side builds (the system automatically checks out all required files for a build), designated-site builds (includes the generation of make files, etc.), and cluster builds. Semantic process management supports read and write operations in the context of long transactions. PowWow has a metadata object base (implemented using a variant of Scheme), and event notification is supported using mobile agents.
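A filtered view of the kind quoted above ("show me all beta files") can be sketched in Java as a query over content-descriptive metadata. The GroupSpace class, the "status" key, and the file names are hypothetical and are not taken from PowWow, whose metadata object base is implemented in a variant of Scheme.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical group space: files carry key/value metadata, and a view is
// the list of file names whose metadata matches a requested value.
class GroupSpace {
    private final List<String> names = new ArrayList<>();
    private final Map<String, Map<String, String>> metadata = new HashMap<>();

    // Records one metadata entry for a file, registering the file if needed.
    void describe(String name, String key, String value) {
        if (!metadata.containsKey(name)) {
            names.add(name);
            metadata.put(name, new HashMap<String, String>());
        }
        metadata.get(name).put(key, value);
    }

    // Returns matching file names in the order the files were registered.
    List<String> view(String key, String value) {
        List<String> result = new ArrayList<>();
        for (String name : names) {
            if (value.equals(metadata.get(name).get(key))) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        GroupSpace space = new GroupSpace();
        space.describe("parser.c", "status", "beta");
        space.describe("lexer.c", "status", "release");
        space.describe("ui.c", "status", "beta");
        System.out.println(space.view("status", "beta"));
    }
}
```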
RAMP (Replication, Annotations, Migration and Partitioning) is a system that is being designed to support coordination using long-term transactions, and cooperation via notifications and annotations. It has a decentralized, distributed peer-to-peer architecture and supports replication, migration, and partitioning.
Yan Zhao presented a talk on WebEntrance, which offers a single point of access to Web resources via a single login. Users can register their services, enabling WebEntrance to hide the login process. They can also select other resources they want to have in their customized interface (e.g., newsgroups, search engines, etc.). Underlying the system is a centralized web server with a metadata repository storing configuration information.
"Two years from now..." there will be agents and information instead of just data. Bob Marcus, who gave this talk, said the basic idea is that Domain Knowledge + a Searching Capability will lead to the necessary partitioning of information on the Internet. That is, there will be domain-specific search systems with agent interfaces constructed by groups of domain, web and software experts. Bob was critical of CORBA, one comment being that KIF and KQML are more appropriate than CORBA, which he felt was static (not all present were of the opinion that these were competing technologies...).
While there were questions about the last presentation that carried over into the final discussion, due to lack of time we only briefly discussed the potential of a future workshop in Manhattan in the fall. This would depend upon interest and finding a common focus for the group. There are a number of possibilities that the group may be interested in. These include standards efforts for object servers, structure markup languages, metadata, and a repository definition language; HTTP NG '97 and related efforts (any proposed custom object protocols) to better integrate client and server object processing environments; and, perhaps, query interface or language support within object servers.
We would like to thank all workshop participants for their contribution to the exciting discussions. We would also like to separately thank all those who sent us their workshop notes, which were of great help in compiling this report.