Current Projects
|
Integrated Learning
|
|
|
The Integrated
Learning program will create a new kind of learning
system in which learning is an integrated problem-solving
process where the learner opportunistically
assembles knowledge from many different sources,
including generating it by reasoning, in order to learn.
The challenge problem for the learner is to learn a
complex task model or generalized plan by being shown
how to perform some task only once. To accomplish this,
the learner must combine the limited observational data
with domain knowledge, world knowledge, reasoning, and
simulation (asking what-if questions) in order to
assemble the body of knowledge necessary to generate the
models. Learners in this program will not be exposed to
large numbers of training instances as a primary
learning input mechanism. More information can be found
here:
http://www.darpa.mil/ipto/solicitations/open/05-43_PIP.htm
http://www.bbn.com/News_and_Events/Press_Releases/06_07_10.html
|
|
|
Autonomic Systems Management (IBM)
|
|
This project builds on
our earlier work on autonomic computing, in which a
system monitored a single computer's connection to the
wide-area network. In the event of a loss of
connectivity, the system learned which adaptive
sequences of informational and remedial actions would
restore connectivity in minimum time. The new project
extends the earlier work to a broader network-management
problem. Instead of simply reacting when fatal problems
occur, it monitors performance and constantly tunes and
reconfigures the system to maximize performance in an
ongoing task. As a result, the system has the
opportunity to notice which of its actions result in
improved performance. However, this goal is challenging
because the system is told neither (1) how to achieve
maximum throughput nor (2) what level of throughput
is attainable from moment to moment. Our system
constructs an egocentric model of how its behavior
affects measurable quantities such as response time and
throughput at different points in the network. Since the
relation between configuration parameters and
performance changes over time (as a result of
differences in load, failed links, etc.), the modeling
problem is especially challenging.
To strike a more effective balance between
exploration and exploitation during learning, our system
assumes that external network conditions
(modes) recur; that is, after some amount of
experience, the system assumes it is in a familiar
(though perhaps unidentified) mode. We believe that
this assumption is realistic (an expert can often make a
repair quickly and reliably by recognizing a problem as an
instance of one he or she has previously encountered)
and perhaps necessary to make progress on the
challenging self-management problem.
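The recurring-mode assumption can be illustrated with a small sketch. This is not the project's system: it assumes, purely for illustration, that a scalar "signature" of current network conditions can be measured, whereas the description above allows the mode to remain unidentified. The class name `ModeAwareTuner`, its parameters, and the epsilon-greedy rule are all hypothetical choices.

```python
import random


class ModeAwareTuner:
    """Toy tuner that keeps separate performance statistics per recurring mode.

    Assumption (not from the project description): a scalar signature of the
    current network conditions is observable and identifies the mode.
    """

    def __init__(self, configs, match_threshold=0.2, epsilon=0.1, seed=0):
        self.configs = list(configs)
        self.match_threshold = match_threshold  # max distance to count as "familiar"
        self.epsilon = epsilon                  # exploration rate inside a known mode
        self.rng = random.Random(seed)
        self.modes = []                         # one record per mode seen so far

    def _find_mode(self, signature):
        # Reuse the closest stored mode if it is near enough; otherwise start a new one.
        close = [m for m in self.modes
                 if abs(m["signature"] - signature) <= self.match_threshold]
        if close:
            return min(close, key=lambda m: abs(m["signature"] - signature))
        mode = {"signature": signature,
                "stats": {c: [0.0, 0] for c in self.configs}}  # [sum, count]
        self.modes.append(mode)
        return mode

    def choose(self, signature):
        mode = self._find_mode(signature)
        untried = [c for c in self.configs if mode["stats"][c][1] == 0]
        if untried:                              # explore: try everything once per mode
            return mode, untried[0]
        if self.rng.random() < self.epsilon:     # occasional exploration
            return mode, self.rng.choice(self.configs)
        # Exploit: pick the configuration with the best mean throughput in this mode.
        best = max(self.configs,
                   key=lambda c: mode["stats"][c][0] / mode["stats"][c][1])
        return mode, best

    def update(self, mode, config, throughput):
        stats = mode["stats"][config]
        stats[0] += throughput
        stats[1] += 1
```

Because statistics are stored per mode, experience gathered the last time a mode occurred is reused as soon as similar conditions return, so the tuner does not have to explore from scratch.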
|
|
Representation and Learning in Computational Game Theory
(NSF)
|
 |
|
Game theory has
emerged as the key tool for understanding and designing
complex multiagent environments such as the Internet,
systems of autonomous agents, and electronic communities
or economies.
This project is building a computational theory
of games that will provide a rich and flexible
collection of models and representations for complex
game-theoretic problems; powerful and efficient
algorithms for manipulating and learning these models;
and a deep understanding of the algorithmic and resource
issues arising in all aspects of game theory. Two of the
most important topics that have materialized to date ---
and the primary emphases of the current project --- are
the representation and efficient manipulation of
large and complex games, and new approaches to
learning in game-theoretic settings. On the topic of
representation, the project includes the development of
methods to model structured interaction in
large-population games; the intersection of social
network theory and game theory; new representations in
repeated games; and representational issues for a
variety of equilibrium types. On the topic of learning,
it includes the development of online multiplicative
update methods for large and structured games; modeling
cooperation in learning; applications of game theory to
the analysis of machine-learning methods; learning for
games that change over time; and the relationship
between game theory and reinforcement learning.
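As a point of reference for the online multiplicative update methods mentioned above, here is a minimal sketch of the textbook multiplicative-weights (Hedge) rule for a single player choosing among a few actions. It is a generic illustration, not the project's algorithms for large structured games; the function name, the learning rate `eta`, and the example losses are assumptions.

```python
import math


def multiplicative_weights(loss_rounds, eta=0.5):
    """Run the Hedge update over a sequence of per-action loss vectors.

    loss_rounds[t][i] is the loss (assumed to lie in [0, 1]) of action i in round t.
    Returns the final mixed strategy as a list of probabilities.
    """
    n_actions = len(loss_rounds[0])
    weights = [1.0 / n_actions] * n_actions
    for losses in loss_rounds:
        # Shrink each action's weight exponentially in the loss it just incurred.
        weights = [w * math.exp(-eta * loss) for w, loss in zip(weights, losses)]
        total = sum(weights)
        weights = [w / total for w in weights]  # renormalize to a distribution
    return weights


# Hypothetical example: action 0 always loses, action 1 never does.
strategy = multiplicative_weights([[1.0, 0.0]] * 20)
```

After a few rounds nearly all probability mass sits on the action with the lower cumulative loss, which is the behavior that makes such updates attractive in repeated games.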
|
|
Evaluating Next Generation Probabilistic Planners (NSF)
|
|
|
The driving goal of
this project is to advance the state of the art of
probabilistic planners toward increased efficiency,
improved robustness to problem variations, and broadened
applicability to real-world problems. To accomplish this
goal, we focus on two interrelated tasks. First, we
propose and develop a methodology for evaluating
probabilistic planners. This requires studying a set of
alternatives and running experiments to correlate
evaluation metrics with desirable outcomes in
increasingly realistic domains. Our efforts have been
coordinated closely with the larger research community
through the biennial International Planning Competition
(IPC). Second, we pursue the development of our own
planning algorithms, with a particular emphasis on
approaches that exploit the relationship between
probabilistic planning and reinforcement learning.
|
|
Past Projects
|
|
|
Decision Theoretic Planning for Planetary Exploration
(NASA)
|
This project aims to understand how to
create efficient planning algorithms for reactive task
scheduling under resource constraints for
Mars-rover-type robots. We are using task descriptions
developed at NASA to create robot plans that decide
which scientific goals to pursue as a function of the
set of goals previously achieved and the remaining
continuous resources, such as battery life and daylight.
|
Learning to Create Knowledge:
Bridging the Representation Gap (DARPA)
|
 |
|
To develop more
intelligent, more adaptable, more powerful systems, we
are studying a novel approach to the autonomous creation
of knowledge by a learning system. Our key idea is the
development and exploitation of
intrinsically-motivated learning, in which the
learner strives to achieve not only task-oriented
goals but also the satisfaction of drives such as
curiosity, novelty-seeking, a desire to create and
experience "interesting" stimuli, and achieving mastery
over its environment. Our learning system will combine
the development of knowledge and understanding via
intrinsic rewards with two other important elements. The
first is extrinsic, or task-oriented, reward. While the
design of algorithms to maximize extrinsic reward is
well studied, we will need to show that our learner can
successfully balance its desires for externally and
internally generated sources of reward. By modulating
the strength or "urgency" of extrinsic reward from
slightly below to slightly above the
intrinsically-generated utilities, we believe the
standard algorithms for reward maximization will
seamlessly integrate these disparate motivations. The
problem is additionally simplified by the fact that
intrinsic rewards are transient---once curiosity is
satisfied, for example, it no longer serves as a
motivating force.
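The interaction between a scaled extrinsic reward and a transient intrinsic reward can be sketched as follows. This is a hedged illustration only: the count-based novelty bonus, the class name `CombinedReward`, and the parameters `urgency` and `novelty_scale` are assumptions introduced here, not the project's formulation.

```python
from collections import defaultdict


class CombinedReward:
    """Mix task reward with a novelty bonus that fades as a state becomes familiar."""

    def __init__(self, urgency=1.0, novelty_scale=1.0):
        self.urgency = urgency              # multiplier on the extrinsic (task) reward
        self.novelty_scale = novelty_scale  # size of the bonus for a never-seen state
        self.visits = defaultdict(int)      # how often each state has been rewarded

    def intrinsic(self, state):
        # Transient curiosity: the bonus shrinks toward zero with repeated visits.
        return self.novelty_scale / (1 + self.visits[state])

    def __call__(self, state, extrinsic):
        bonus = self.intrinsic(state)
        self.visits[state] += 1
        # A standard reward-maximizing learner would be trained on this single scalar.
        return self.urgency * extrinsic + bonus


reward = CombinedReward(urgency=2.0)
first = reward("s", 1.0)   # novel state: full bonus
second = reward("s", 1.0)  # same state again: smaller bonus
```

Raising `urgency` makes the task reward dominate the bonus, and lowering it lets curiosity drive behavior until the bonus for familiar states has decayed.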
|
Intelligent Distributed Intrusion Detection via
Collaboration
(HSARPA: Homeland Security)
|
|
|
With PnP Networks, we
are focusing on the challenge of building an automated,
autonomic DIDS (distributed intrusion detection system)
that functions across multiple administrative domains,
thereby enabling broad and complete correlation of
locally observed patterns of network traffic and the
resulting machine responses.