efrank@ncsa.uiuc.edu
Joule is a collection of applets and applications designed to provide collaborative workspaces in which users can navigate, publish documents, and administer security. Objects in the workspace are pulled from the server using a simple, easily extended wire protocol. One design goal was that the server and the client should share the same representation of the object. On the server side, objects write and read themselves from database tables when needed. Originally, this was intended to run on a garden-variety web server (NCSA HTTPd 1.5.x), an RDBMS supporting standard SQL (Postgres95), and a special Java server just for translating between the wire protocol and the database queries. Publishing was accomplished with a Java client application which uploaded files to a CGI script.
A Java-based web server is a better solution to some of the problems presented by this model. In particular, objects on such a server, in constructing responses to requests, have access to the state of the entire server and all of the objects on it through a single, consistent API.
We've chosen to use Jigsaw, and have begun migrating the Joule
code base to use its features. For instance, the CGI script for
publishing has been eliminated; the publish client now uses HTTP
PUT to upload files. Also, the special Java server
which translated between the wire protocol and database queries
has been replaced with a Jigsaw servlet. Where Joule once
created static HTML files to display the results of database queries,
Jigsaw resources now make the database queries
and construct responses on the fly. This eliminates the need to
reconstruct HTML files whenever something they depend on changes.
Similarly, the security administration applets used to create ".htaccess"
files which NCSA HTTPd used to restrict access to directories;
now the server simply creates and modifies HTAccessFilters,
written to Jigsaw's filter API, to accomplish the same thing.
The Postgres95 interface has been completely replaced with JDBC
calls, so any JDBC-compliant database can be used to store Joule
objects; currently we're using NCSA's Decibel, a small DBMS written
entirely in Java which supports only a subset of JDBC.
We have only just begun to evaluate the overlap between Joule's
workspace model and Jigsaw's Container/Resource model. Certainly
Jigsaw's Resources and its SimpleResourceStore represent
a kind of object-oriented database, which is well-suited for URL-based
retrieval of statically-defined Java objects with simple attributes.
In this case, retrieving an object means instantiating it at retrieval
time, complete with all the attribute values it had when it was
written to the resource store. Jigsaw typically then retains
the resource in memory so that it can be looked up without re-reading
it from the store. This makes Jigsaw fast without sacrificing
the generality of the resource model; once an object is restored,
its attributes behave more or less like instance variables. This
is significantly more powerful than simpler, more static caching
mechanisms of the kinds that caching proxies use.
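The restore-once, then-serve-from-memory behaviour described above can be sketched roughly as follows. This is not Jigsaw's code: the class name CachedStoreSketch, the lookup method, and the use of a plain function as the "store" are assumptions made purely for illustration, and a map of attribute values stands in for a restored resource.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch, not Jigsaw's API: restore a resource once, then keep it in memory. */
final class CachedStoreSketch {
    // Resources already restored, keyed by name.
    private final Map<String, Map<String, Object>> inMemory = new HashMap<>();
    // Stand-in for reading and deserializing a resource from the store.
    private final Function<String, Map<String, Object>> restorer;
    private int restoreCount = 0;

    CachedStoreSketch(Function<String, Map<String, Object>> restorer) {
        this.restorer = restorer;
    }

    /** Returns the named resource, restoring it only on the first request. */
    Map<String, Object> lookup(String name) {
        Map<String, Object> resource = inMemory.get(name);
        if (resource == null) {
            resource = restorer.apply(name); // the expensive step
            restoreCount++;
            inMemory.put(name, resource);
        }
        return resource;
    }

    /** How many times the underlying store was actually read. */
    int restoreCount() {
        return restoreCount;
    }
}
```

Two successive calls to lookup with the same name return the same in-memory object and read the store only once, which is the sense in which a restored resource's attributes behave like instance variables.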
Not that this optimization comes without a price. Queries across
multiple resources cannot be handled gracefully under Jigsaw's
current ResourceStore implementation; to find out
the value of attribute "foo" for some collection
of resources, the resources must all be restored, placing a potentially
large load on the server. In addition, saving changes to a resource
is relatively expensive. Changed resources must be serialized
in their entirety, even if they have only partially changed; what's
more, if a single resource has been modified in a given store,
all the resources in the store must be serialized in order to
save the change. Fortunately, the ResourceStore API
is general-purpose enough that it does not force implementations
to use this strategy. Other strategies optimized for the needs
of particular resource behavior can be used alongside the SimpleResourceStore.
We've experimented with JDBC-based resource stores and will continue
to do so as we move into the next design phase of Joule.
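To make the difference in save cost concrete, here is a minimal sketch that contrasts a whole-store strategy with a per-resource one. The StoreSketch interface and both implementations are hypothetical and far simpler than Jigsaw's ResourceStore API; save() only counts how many resources would be written, and nothing is actually serialized.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical store interface; the real ResourceStore API is different and richer. */
interface StoreSketch {
    void put(String name, String attributes);

    /** Persists pending changes and returns how many resources were written. */
    int save();
}

/** Whole-store strategy: one modified resource forces every resource to be rewritten. */
final class WholeStoreSketch implements StoreSketch {
    private final Map<String, String> resources = new LinkedHashMap<>();
    private boolean dirty = false;

    @Override
    public void put(String name, String attributes) {
        resources.put(name, attributes);
        dirty = true;
    }

    @Override
    public int save() {
        if (!dirty) {
            return 0;
        }
        dirty = false;
        return resources.size(); // everything in the store is written again
    }
}

/** Per-resource strategy (for example, one database row each): only changes are written. */
final class PerResourceStoreSketch implements StoreSketch {
    private final Map<String, String> resources = new LinkedHashMap<>();
    private final Set<String> changed = new LinkedHashSet<>();

    @Override
    public void put(String name, String attributes) {
        resources.put(name, attributes);
        changed.add(name);
    }

    @Override
    public int save() {
        int written = changed.size();
        changed.clear();
        return written;
    }
}
```

With three resources in each store, changing one and saving writes three resources under the first strategy and one under the second. Because both sit behind the same interface, a server could in principle mix them according to how each group of resources behaves.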
One problem we've encountered in Joule is that of references between
objects. Currently, Joule models such relationships as foreign
keys in RDBMS tables; each table defines a flat namespace for
keys. This strategy is poorly suited for a distributed object
system, since such references can only be resolved within a single
DBMS. Jigsaw's resource model presents the same problem: ResourceStores
are flat namespaces, and references to objects in other servers,
or even in other ResourceStores, are relatively opaque.
At best they can be stored as URLs, which only an HTTP server can
resolve, and even then only an HTTP reply is available,
not the object itself. Resolving a URL within a Jigsaw server
is another matter; URL lookup is handled one branch at a time
by ContainerResources, which resolve their part of
the pathname within their own stores' flat namespaces and either
locate a leaf resource or pass the rest of the pathname to the
appropriate one of their children. This is how resources representing
directories are implemented: one store per directory.
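The branch-at-a-time lookup described above might look something like the following sketch. NodeSketch and ContainerSketch are invented names, not Jigsaw classes, and a HashMap stands in for each container's store; the point is only that each container resolves one path segment in its own flat namespace and hands the remainder to a child.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a resource; not Jigsaw's Resource class. */
class NodeSketch {
    final String name;

    NodeSketch(String name) {
        this.name = name;
    }
}

/** Hypothetical container: owns a flat namespace of children, like one store per directory. */
final class ContainerSketch extends NodeSketch {
    private final Map<String, NodeSketch> store = new HashMap<>();

    ContainerSketch(String name) {
        super(name);
    }

    ContainerSketch add(NodeSketch child) {
        store.put(child.name, child);
        return this;
    }

    /** Resolves segments[index..]; returns null if a segment is missing or a leaf is overrun. */
    NodeSketch lookup(String[] segments, int index) {
        if (index == segments.length) {
            return this; // the path ended at this container
        }
        NodeSketch child = store.get(segments[index]);
        if (child instanceof ContainerSketch) {
            // Pass the rest of the pathname to the appropriate child.
            return ((ContainerSketch) child).lookup(segments, index + 1);
        }
        // A leaf (or nothing) was found: valid only if this was the last segment.
        return index == segments.length - 1 ? child : null;
    }

    /** Convenience entry point taking a slash-separated path. */
    static NodeSketch resolve(ContainerSketch root, String path) {
        String trimmed = path.startsWith("/") ? path.substring(1) : path;
        if (trimmed.isEmpty()) {
            return root;
        }
        return root.lookup(trimmed.split("/"), 0);
    }
}
```

For a root container holding a "docs" container that in turn holds "intro.html", resolving "/docs/intro.html" touches two stores, each of which only ever sees its own segment.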
When Joule sends an object to the client, it sends only the keys of the other objects it refers to. So Joule's mobile-object paradigm, as it is currently implemented, is really just one abstraction away from shipping messages over a protocol. We construct objects on the server side, serialize them, and reconstruct them on the client side, but in order to make the objects behave properly we need to tell them which side they're on, using message codes embedded in the object. Furthermore, since only the server can resolve references between objects, the server is effectively the location of the objects, and what we ship to and from the client are really messages. One result of this situation is that the mobile objects have become relatively heavyweight: in order to load the classes representing them, applets now need to download class files from within the server code, namely those which implement the objects' server-side behavior.
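A rough sketch of what "sending only the keys" implies for the client follows. None of these classes come from Joule; ShippedObjectSketch, KeyServerSketch, and KeyClientSketch are hypothetical names, and an in-memory map plays the part of the server's database tables.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical wire form of an object: its own fields plus only the keys of related objects. */
final class ShippedObjectSketch {
    final String key;
    final String title;
    final List<String> referencedKeys;

    ShippedObjectSketch(String key, String title, List<String> referencedKeys) {
        this.key = key;
        this.title = title;
        this.referencedKeys = referencedKeys;
    }
}

/** Hypothetical server side: the only place where a key can be turned back into an object. */
final class KeyServerSketch {
    private final Map<String, ShippedObjectSketch> table = new HashMap<>();
    private int requests = 0;

    void store(ShippedObjectSketch object) {
        table.put(object.key, object);
    }

    ShippedObjectSketch fetch(String key) {
        requests++; // each resolution is a round trip in a real deployment
        return table.get(key);
    }

    int requests() {
        return requests;
    }
}

/** Hypothetical client side: following references costs one server request per key. */
final class KeyClientSketch {
    static List<ShippedObjectSketch> follow(ShippedObjectSketch from, KeyServerSketch server) {
        List<ShippedObjectSketch> resolved = new ArrayList<>();
        for (String key : from.referencedKeys) {
            resolved.add(server.fetch(key));
        }
        return resolved;
    }
}
```

The client never holds a usable reference, only keys, so every traversal goes back through the server. That is why what travels over the wire behaves more like a message than like a mobile object.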
To improve this situation, we need not only a lightweight way of moving webs of objects piecemeal, but also protocol-level support for coping with the concurrency issues that this raises. Serialization alone is inadequate, since it lacks a protocol for requesting and receiving parts of an object graph, and notifying other hosts about changes. Java RMI is also not enough because it merely allows methods to be called remotely, with no protocol for sorting out concurrent access by multiple callers. HTTP provides a much richer protocol, since URLs provide a consistent name resolution paradigm and existing servers already provide support for proxies, redirection, and authentication. But HTTP has a limited set of widely supported methods, so that directives for more complex semantics have to be encoded in URLs, put in non-standard MIME headers or accessed with custom browsers or applications. This may not be such a high price to pay for interoperability; in general, web servers ignore directives with no meaning for them without behaving inappropriately.
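As an illustration of carrying a directive in a non-standard header, the sketch below prepares, but does not send, an HTTP PUT using the standard java.net.HttpURLConnection class. The header name X-Joule-Directive and the class DirectiveRequestSketch are invented for this example and are not part of Joule, Jigsaw, or any HTTP specification.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

/** Hypothetical sketch: a PUT request that carries extra semantics in a made-up header. */
final class DirectiveRequestSketch {
    /** Assumed header name, chosen only for illustration. */
    static final String DIRECTIVE_HEADER = "X-Joule-Directive";

    /** Builds the request object without opening a network connection. */
    static HttpURLConnection prepare(String url, String directive) {
        try {
            HttpURLConnection connection =
                    (HttpURLConnection) URI.create(url).toURL().openConnection();
            connection.setRequestMethod("PUT");
            connection.setDoOutput(true); // a body (the uploaded file) would follow
            connection.setRequestProperty(DIRECTIVE_HEADER, directive);
            return connection;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A caller would then write the document body to connection.getOutputStream() and read the response code. A server that does not know the header would be expected simply to treat the request as an ordinary PUT.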
Regardless of where we start with the protocol, eventually we wish to: