The computing environment of a user consists of the set of resources and
information that are employed to accomplish various computing
tasks. The traditional view of these environments, which consists of
single devices operating in isolation, has been altered by several
ongoing trends. Decreasing costs of computing devices, increased
connectivity, and improved performance are allowing users to
distribute and access information across many devices. Additionally,
groups of users are increasingly able to work collaboratively and
develop complex sharing patterns with one another. Unfortunately,
while these trends are significantly enriching the user's computing
experience, they are also increasing the data management overhead as
users must explicitly reason about data placement and replication
across multiple devices and logical sharing groups.
In this thesis, we present the Wayfinder file system, which was
designed to simplify the management of information in federated
systems. To address this goal, we focus on three critical management
deficiencies present in most current federated environments: 1) the
lack of a consistent view of stored information across devices (the
same information is potentially named differently on each device); 2) the
required manual management of replicated information (users are
forced to continuously reason about information and its location to
ensure its availability); and 3) limited search and ranking capability
(most existing search and ranking tools rely on content-centric queries
and do not exploit the large amount of structure and metadata stored
within file systems to improve the ranking of relevant results).
The Wayfinder file system addresses these deficiencies by providing
three synergistic abstractions: 1) a global namespace that is
uniformly accessible across connected and disconnected operation, 2)
user-centric automatic availability management to ensure continuous
access to information based on data-centric availability policies, and
3) a multi-dimensional fuzzy search framework that significantly
improves relevance ranking. We will show that these abstractions
reduce the management burden by requiring users to reason only about
the data and its properties while ignoring the underlying physical
complexities of the system.
Underlying all three abstractions is a common implementation layer
that adheres to the following three principles. First, any subset of
nodes in a Wayfinder community can interact normally when they are
interconnected, regardless of the membership of the subset; that is,
the absence of specific nodes may limit the amount of accessible data
but will never hinder the interaction (data sharing) of a set of
connected nodes. Second, all protocols and interactions tolerate
a weakly consistent model, allowing them to withstand unexpected
device departures. Third, devices are assumed to be owned by specific
users and so should prioritize the needs of their owners in the
presence of resource constraints, while using excess resources to
benefit the community as a whole (e.g., increase data availability).