Statistical Natural Language Processing
We're on TV! Well, actually the seminar is now being held over a
video link between UNC and Duke. The rooms are 08 Peabody at UNC and
North Building 130a at Duke. Time: 2pm-3:20pm.
Researchers creating practical systems that manipulate human languages
are turning more frequently to statistical or corpus-based approaches.
The goal of this seminar is to familiarize participants with some of the
applications and techniques that define this emerging and exciting field.
Sample natural-language applications include:
- Syntax: Part-of-speech tagging, parsing, language modeling,
prepositional-phrase attachment, spelling and grammar correction,
word segmentation, term and name identification, morphological
analysis
- Semantics: Word-sense disambiguation, word clustering, lexicon
acquisition, semantic analysis, database-query mapping
- Discourse: Information extraction, anaphora resolution, discourse
segmentation, event categorization
- Machine Translation: Bilingual text alignment, bilingual dictionary
construction, lexical, syntactic, and semantic transfer
- Information Retrieval: Information extraction, topic routing, text
filtering, cross-language methods
Participants will read and discuss research papers from recent
journals and conference proceedings.
Sample computational techniques include:
- Statistical: n-gram models, hidden Markov models, probabilistic
context-free grammars, Bayesian networks
- Symbolic: Decision trees, rule-based, case-based, inductive logic
programming, automata and grammar induction
- Neural Network & Evolutionary: recurrent networks, self-organizing
maps, genetic algorithms
Papers to Discuss
Eugene Charniak, Curtis Hendrickson, Neil Jacobson, and Mike
Perkowitz. Equations for part-of-speech tagging. In Proceedings of the Eleventh
National Conference on Artificial Intelligence, Menlo Park: AAAI
Press/MIT Press (1993) 784-789.
L.R. Rabiner and B.H. Juang, "An Introduction to Hidden Markov
Models", IEEE ASSP Magazine, Jan., 1986, pp. 4-16.
Glenn Carroll and Eugene Charniak. Two
experiments on learning probabilistic dependency grammars from
corpora. Technical Report CS-92-16, Brown University, Department
of Computer Science, Providence, RI, 1992.
Eugene Charniak. Statistical
parsing with a context-free grammar and word statistics. In
Proceedings of the Fourteenth National Conference on Artificial
Intelligence, AAAI Press/MIT Press, Menlo Park, 1997.
Eugene Charniak, Glenn Carroll, John Adcock, Anthony Cassandra,
Yoshihiko Gotoh, Jeremy Katz, Michael Littman and John McCann. Taggers
for parsers. Artificial Intelligence, 85 (1--2): 45--57,
Eric Brill. Transformation-based
error-driven learning and natural language processing: A case study in
part of speech tagging. Computational Linguistics,
Eric Brill. Learning to
parse with transformations. In Recent Advances in Parsing
Technology, Kluwer Academic Publishers, 1996.
Eric Brill. Unsupervised learning
of disambiguation rules for part of speech tagging. To appear in
Natural Language Processing Using Very Large Corpora. Kluwer Academic
Adwait Ratnaparkhi. A maximum
entropy part-of-speech tagger. In Proceedings of the Empirical
Methods in Natural Language Processing Conference, May 17-18,
Steven Abney. Statistical
methods and linguistics. In: Judith Klavans and Philip Resnik
(eds.), The Balancing Act. The MIT Press, Cambridge, MA, pages 2-26
(Chapter 1), 1996.
Michael W. Berry, Susan T. Dumais, and G. W. O'Brien. Using
linear algebra for intelligent information retrieval. SIAM
Review, 37(4), 1995, 573-595.
Susan T. Dumais, Thomas K. Landauer, and Michael L. Littman. Automatic
cross-linguistic information retrieval using Latent Semantic
Indexing. In SIGIR'96 - Workshop on Cross-Linguistic
Information Retrieval, pp. 16-23, August 1996.
Peter F. Brown, John Cocke, Stephen A. Della Pietra, Vincent J. Della
Pietra, Frederick Jelinek, John D. Lafferty, Robert L. Mercer, and
Paul S. Roossin. A statistical approach to machine translation.
Computational Linguistics, Volume 16, Number 2, June
Peter F. Brown, Jennifer C. Lai, and Robert L. Mercer. Aligning
sentences in parallel corpora. In Proceedings of the Conference of
the Association for Computational Linguistics, Berkeley,
pp. 169-176, 1991.
Stanley F. Chen. Aligning
sentences in bilingual corpora using lexical information. In
Proceedings of the 31st Annual Meeting of the Association for
Computational Linguistics, pages 9-16, 1993.
Pascale Fung and Kathleen McKeown. A technical
word and term translation aid using noisy parallel corpora across
language groups, The Machine Translation Journal, Special
Issue on New Tools for Human Translators, 53--87, 1996.
David Yarowsky. Unsupervised word
sense disambiguation rivaling supervised methods. In
Proceedings of the 33rd Annual Meeting of the Association for
Computational Linguistics, Cambridge, MA, pp. 189-196,
Hinrich Schütze and Jan O. Pedersen. Information
retrieval based on word senses. In Fourth Annual Symposium
on Document Analysis and Information Retrieval, pages 161-175,
Las Vegas NV, 1995.
Thomas K. Landauer and Michael L. Littman. Fully automatic
cross-language document retrieval using latent semantic
indexing. In Proceedings of the Sixth Annual Conference of the
UW Centre for the New Oxford English Dictionary and Text
Research, pp. 31-38. UW Centre for the New OED and Text Research,
Waterloo Ontario, October 1990.
D. Beeferman, A. Berger, and J. Lafferty. Text
segmentation using exponential models. In Proceedings of the
Second Conference On Empirical Methods in NLP, Providence, RI,
Eric Brill. Some Advances
in Transformation-Based Part of Speech Tagging. In Proceedings
of the 12th National Conference on Artificial
Intelligence. Volume 1 (AAAI94-1). Seattle, WA, USA, 1994.
J. Lafferty, D. Sleator, and D. Temperley. Grammatical
trigrams: A probabilistic model of link grammar. In
Proceedings of the AAAI Fall Symposium on Probabilistic Approaches
to Natural Language, Cambridge, MA, October 1992.
Philip Resnik and I. Dan Melamed. Semi-automatic
acquisition of domain-specific translation lexicons. In
Proceedings of the 5th ANLP Conference, 1997.
Philip Resnik. Disambiguating noun
groupings with respect to WordNet senses. In Proceedings of
the 3rd Workshop on Very Large Corpora, MIT, 30 June 1995.
I. Dan Melamed. Automatic
discovery of non-compositional compounds in parallel data. In
Proceedings of the 2nd Conference on Empirical Methods in Natural
Language Processing (EMNLP'97), Providence, RI, 1997.
Eric Brill. Automatic grammar
induction and parsing free text: A transformation-based approach.
In ACL 1993.
Eugene Charniak. Statistical
techniques for natural language parsing. AI
Adam L. Berger, Stephen A. Della Pietra, and Vincent J. Della
Pietra. A maximum entropy approach to natural language processing.
Computational Linguistics, 22 (1): 39--68, 1996.
Adwait Ratnaparkhi. A simple
introduction to maximum entropy models for natural language
processing. Technical Report 97-08, Institute for Research in
Cognitive Science, University of Pennsylvania, 1997.
Hinrich Schütze. Dimensions
of meaning. In Proceedings of Supercomputing, pages 787-796,
Minneapolis MN, 1992.
Pascale Fung and Kathleen McKeown. Aligning
noisy parallel corpora across language groups: Word pair feature
matching by dynamic time warping. In AMTA 94: Partnerships in
Translation Technology, Columbia, Maryland: Oct. 1994,
Thomas K. Landauer and Susan T. Dumais. A
solution to Plato's problem: The latent semantic analysis theory of
acquisition, induction and representation of
knowledge. Psychological Review, 1997, 104 (2),
Eric Sven Ristad and Robert G. Thomas. New techniques for
context modeling. In Proceedings of the 33rd Annual Meeting
of the ACL, Cambridge, MA, June 27-30, 1995.
I. Dagan and A. Itai. Word sense
disambiguation using a second language monolingual corpus.
Computational Linguistics, 20, 563-596, 1994.
D. Yarowsky. Decision lists for
lexical ambiguity resolution: Application to accent restoration in
Spanish and French. In Proceedings of the 32nd Annual Meeting
of the Association for Computational Linguistics, Las Cruces, NM,
pp. 88-95, 1994.
Papers Without Links
Eugene Charniak: Natural language learning. ACM Computing Surveys,
27(3), Sept 1995. 317-319.
Kenneth W. Church. One Term Or Two? Proceedings of the 18th Annual
International Conference on Research and Development in Information
Retrieval (SIGIR'95). Seattle, WA, USA, 1995. 310-318.
Gale, W., K. Church, and D. Yarowsky. ``A Method for Disambiguating
Word Senses in a Large Corpus.'' Computers and the Humanities. 26,
pp. 415-439, 1992.
M. E. Lesk, ``Automatic Sense Disambiguation Using Machine Readable
Dictionaries: How to Tell a Pine Cone from an Ice Cream Cone,''
Proc. 1986 SIGDOC Conference, Toronto, Ontario, June, 1986.
Philip Resnik. Corpus Linguistics
Hinrich Schütze. Don Hindle. Jelinek.
Eugene Charniak, 1997 AAAI best paper. Charniak and Goldman, Bayes
Nets in story understanding.
Church's tagging stuff. And Good Turing.
Peter Brown et al. Word sense disambiguation using statistical
methods. Also Class-based n-gram models of natural language. And the
classic automatic translation work.
Ido Dagan, Alon Itai. Word sense disambiguation using a second
language monolingual corpus.
Stuff in The Computation and
Language E-Print Archive: ``Cue Phrase Classification Using
Machine Learning'' (Litman), ``Stochastic Attribute-Value Grammars''
(Abney), ``Learning string edit distance'' (Ristad and Yianilos),
``Unsupervised Language Acquisition'' (de Marcken), ``Nonuniform
Markov models'' (Ristad and Thomas), ``Comparative Experiments on
Disambiguating Word Senses: An Illustration of the Role of Bias in
Machine Learning'' (Mooney), ``Hybrid language processing in the
Spoken Language Translator'' (Rayner and Carter), ``Automatic
Extraction of Subcategorization from Corpora'' (Briscoe and Carroll),
``Fast Statistical Parsing of Noun Phrases for Document Indexing''
(Zhai), ``A Maximum Entropy Approach to Identifying Sentence
Boundaries'' (Reynar and Ratnaparkhi), ``Machine Transliteration''
(Knight and Graehl), ``Sense Tagging: Semantic Tagging with a
Lexicon'' (Wilks and Stevenson), ``A Corpus-Based Approach for
Building Semantic Lexicons'' (Riloff and Shepherd), ``Mistake-Driven
Learning in Text Categorization'' (Dagan, Karov, and Roth),
``Distinguishing Word Senses in Untagged Text'' (Pedersen and Bruce),
``A Linear Observed Time Statistical Parser Based on Maximum Entropy
Models'' (Ratnaparkhi).
Title: Combining Multiple Methods for the Automatic Construction of
Multilingual WordNets
Authors: Jordi Atserias (Universitat Politecnica de Catalunya),
Salvador Climent (Universitat de Barcelona), Xavier Farreres,
German Rigau (Universitat Politecnica de Catalunya),
Horacio Rodriguez (Universitat Politecnica de Catalunya)
Comments: 7 pages, 4 postscript figures
Journal-ref: EACL/ACL 97 Madrid pages 48-55
Paper: cmp-lg/9709003
Brown TRs: CS-94-07 Eugene Charniak and Glenn Carroll,
``Context-Sensitive Statistics for Improved Grammatical Language
Models'', CS-94-08 Glenn Carroll and Eugene Charniak, ``Combining
Grammars For Improved Learning''
Michael L. Littman
Last modified: Fri Oct 3 13:52:47 EDT 1997
by Michael Littman, email@example.com