Shalom Lappin

Abstracts and Files of Recent Papers

 

              Machine Learning as a Source of Insight into Universal Grammar

Shalom Lappin
Department of Philosophy
King's College, London
shalom.lappin@kcl.ac.uk

Stuart Shieber
Division of Engineering and Applied Science
Harvard University
shieber@deas.harvard.edu

and

Michael Collins
Computer Science and Artificial Intelligence Laboratory
MIT
mcollins@csail.mit.edu

                                                           May 30, 2006

 

It is widely believed that the scientific enterprise of theoretical linguistics and the engineering of language applications are
separate endeavors with little for their techniques and results to contribute to each other at the moment. In this paper, we explore
the possibility that machine learning approaches to natural-language processing (NLP) being developed in engineering-oriented
computational linguistics (CL) may be able to provide specific scientific insights into the nature of human language. We argue
that, in principle, machine learning (ML) results could inform basic debates about language in one area at least, language acquisition,
and that, in practice, existing results may offer initial tentative support for this prospect.

 

Download the pdf file

                             

 

In H. Bunt and R. Muskens (eds.), Computing Meaning Vol. 3, Springer, 2006.

SHARDS: Fragment Resolution in Dialogue

Raquel Fernandez1, Jonathan Ginzburg2, Howard Gregory3, and Shalom Lappin4

1Institut für Linguistik
Universität Potsdam
Karl-Liebknecht-Str. 24-25
D-14476 Golm
raquel@ling.uni-potsdam.de

2Dept. of Computer Science
King's College, London
The Strand,
London WC2R 2LS
UK
jonathan.ginzburg@kcl.ac.uk

3Seminar fuer Englische Sprachwissenschaft
Georg-August-Universitaet Goettingen
howard.gregory@phil.uni-goettingen.de

4Dept. of Philosophy
King's College, London
The Strand,
London WC2R 2LS
UK
shalom.lappin@kcl.ac.uk

 

October, 2006

In this paper we present the main features of SHARDS - a Semantically-based HPSG Approach to the Resolution of Dialogue fragments. This implemented system interprets short questions (sluices) and short answers. It provides a procedure for computing the content values of clausal fragments from contextual information contained in a discourse record of previously
processed sentences.

Download the pdf file
 

 

In Proceedings of The Fifteenth Amsterdam Colloquium, Amsterdam, 2005, pp. 77-82.

          Achieving Expressive Completeness and Computational Efficiency for Underspecified Semantic Representations

 

                                                    Chris Fox                                              Shalom Lappin
                                      Department of Computer Science                  Department of Philosophy
                        
                              University of Essex                                    King's College, London
                                                              
foxcj@essex.ac.uk                              shalom.lappin@kcl.ac.uk

                                                                    November, 2005

 

The tension between expressive power and computational tractability poses an acute problem for theories of underspecified
semantic representation. In previous work we have presented an account of underspecified scope representations within Property
Theory with Curry Typing (PTCT), an intensional first-order theory for natural language semantics. Here we show how filters
applied to the underspecified scope terms of PTCT permit both expressive completeness and the reduction of computational
complexity in a significant class of non-worst case scenarios.

 

Download the pdf file

 

 

To appear in Proceedings of CLIN 2004, Leiden, 2005.             

               Machine Learning and the Cognitive Basis of Natural Language

                                                                       Shalom Lappin
                                                                     
Department of Computer Science
                                                                              
King's College, London
                                                                                 
lappin@dcs.kcl.ac.uk

                                                                                           July, 2005

Machine learning and statistical methods have yielded impressive results in a wide variety of natural language processing
tasks. These advances have generally been regarded as engineering achievements. In fact it is possible to argue that the success
of machine learning methods is significant for our understanding of the cognitive basis of language acquisition and processing.
Recent work in unsupervised grammar induction is particularly relevant to this issue. It suggests that knowledge of language
can be achieved through general learning procedures, and that a richly articulated language faculty is not required to explain its
acquisition.

 

Download the pdf file

 

 

In Proceedings of SIGdial 6, Lisbon, 2005, pp. 77-86.

            Using Machine Learning for Non-Sentential Utterance Classification

                                    Raquel Fernandez, Jonathan Ginzburg, and Shalom Lappin
                                                                Dept. of Computer Science

                                                                                  
King's College, London
                                                                    
{raquel,ginzburg,lappin}@dcs.kcl.ac.uk

                                                                                               July, 2005

In this paper we investigate the use of machine learning techniques to classify a wide range of non-sentential utterance types in dialogue,
a necessary first step in the interpretation of such fragments. We train different learners on a set of contextual features that can be extracted
from PoS information. Our results achieve an 87% weighted f-score, a 25% improvement over a simple rule-based algorithm baseline.

 

Download the pdf file

 

In the Journal of Logic and Computation 15, 2005, pp. 129-141.

           Underspecified Interpretations in a Curry-Typed Representation Language

 

                                                    Chris Fox                                            Shalom Lappin
                                      Department of Computer Science             Department of Computer Science
                        
                              University of Essex                                    King's College, London
                                                              
foxcj@essex.ac.uk                                lappin@dcs.kcl.ac.uk

                                                                    April, 2005

 

In previous work we have developed Property Theory with Curry Typing (PTCT), an intensional first-order logic for natural language
semantics. PTCT permits fine-grained specifications of meaning. It also supports polymorphic types and separation types. Here we extend
the type system to include product types, and use these to define a permutation function that generates underspecified scope representations
within PTCT. We indicate how filters can be added to encode constraints on possible scope readings. Our account offers several important
advantages over other current theories of underspecification.

 

Download the pdf file

 

In Proceedings of the IWCS-6, Tilburg, 2005, pp. 115-127.

                            Automatic Bare Sluice Disambiguation in Dialogue

                          Raquel Fernandez, Jonathan Ginzburg, and Shalom Lappin
                                                      Dept. of Computer Science

                                                                      
King's College, London
                                                    
{raquel,ginzburg,lappin}@dcs.kcl.ac.uk

                                                       January, 2005

The capacity to recognize and interpret sluices (bare wh-phrases that exhibit a sentential meaning) is essential to maintaining cohesive interaction  
between human users and a machine interlocutor in a dialogue system. In this paper we present a machine learning approach to sluice disambiguation
in dialogue. Our experiments, based on solid theoretical considerations, show that applying machine learning techniques using a compact set of       
features  that can be automatically identified from PoS labelling in a corpus can be an efficient tool for disambiguating among possible sluice
interpretations.

Download the pdf file

 

In Proceedings of COLING 2004, Geneva, 2004, pp. 240-246.

                            Classifying Ellipsis in Dialogue: A Machine Learning Approach

                                        Raquel Fernandez, Jonathan Ginzburg, and Shalom Lappin
                                                                      Dept. of Computer Science

                                                                                          
King's College, London
                                                                              
{raquel,ginzburg,lappin}@dcs.kcl.ac.uk

                                                                     June, 2004


Download the pdf file

 

In the Logic Journal of the International Group for Pure and Applied Logic 12, 2004, pp. 135-168.

  An  Expressive First-Order Logic with Flexible Typing for Natural Language Semantics

                                                          Chris Fox                                            Shalom Lappin
                                          Department of Computer Science             Department of Computer Science
                             
                              University of Essex                                    King's College, London
                                                                   
foxcj@essex.ac.uk                                lappin@dcs.kcl.ac.uk

                                                                                                              April, 2004

We present Property Theory with Curry Typing (PTCT), an intensional first-order logic for natural language semantics. PTCT permits fine-grained
specifications of meaning. It also supports polymorphic types and separation types. We develop an intensional number theory within PTCT in order
to represent proportional generalized quantifiers like most. We use the type system and our treatment of generalized quantifiers in natural language
to construct a type-theoretic approach to pronominal anaphora that avoids some of the difficulties that undermine previous type-theoretic analyses of
this phenomenon.

Download the pdf file

 

In N. Nicolov, R. Mitkov, G. Angelova, and K. Botcheva (eds.), Recent Advances in Natural Language Processing III: Selected Papers from RANLP 2003, John Benjamins, Amsterdam, 2004, pp. 1-16.

                                                   A Type-Theoretic Approach to Anaphora and Ellipsis

                                                          Chris Fox                                            Shalom Lappin
                                         Department of Computer Science             Department of Computer Science
                             
                             University of Essex                                 King's College, London
                                                              
foxcj@essex.ac.uk                                lappin@dcs.kcl.ac.uk

                                                                                                          May, 2004

We present an approach to anaphora and ellipsis resolution in which pronouns and elided structures are interpreted by the dynamic identification in
discourse of type constraints on their semantic representations. The content of these conditions is recovered in context from an antecedent expression.
The constraints define separation types (sub-types) in Property Theory with Curry Typing (PTCT), an expressive first-order logic with Curry typing
that we have proposed as a formal framework for natural language semantics.

Download the pdf file

   

In G. Jaeger, P. Monachesi, G. Penn, and S. Wintner (eds.), Proceedings of Formal Grammar 2003, Vienna, pp. 89-102.

          Doing Natural Language Semantics in an Expressive First-Order Logic with Flexible Typing

                                                        Chris Fox                                            Shalom Lappin
                                        Department of Computer Science             Department of Computer Science
                          
                             University of Essex                                  King's College, London
                                                         
foxcj@essex.ac.uk                                  lappin@dcs.kcl.ac.uk

                                                                                  May, 2003

We present Property Theory with Curry Typing (PTCT), an intensional  first-order logic for natural language semantics. PTCT permits
fine-grained specifications of meaning. It also supports polymorphic  types and separation types (separation types are also known as sub-types).
We develop an intensional number theory  within PTCT in order to represent proportional generalized quantifiers like most. We use the
type system and our treatment of generalized quantifiers in natural language to construct a type-theoretic approach to pronominal anaphora that
avoids some of the difficulties that undermine previous type-theoretic analyses of this phenomenon.

Download the ps file
Download the pdf file

 
   
In A. Branco, A. McEnery, and R. Mitkov (eds.), Anaphora Processing: Linguistic, Cognitive, and Computational Modelling, John Benjamins, Amsterdam, 2005, pp. 3-16.

                 A Sequenced Model of Anaphora and Ellipsis Resolution

                                                               Shalom Lappin
                                                                    
Dept. of Computer Science
                                                                       
King's College, London
                                                                         
lappin@dcs.kcl.ac.uk

                                                                                  June, 2003

I compare several types of knowledge-based and knowledge-poor approaches to anaphora and ellipsis resolution. The former are able to
capture fine-grained distinctions that depend on lexical meaning and real world knowledge, but they are generally not robust. The latter
show considerable promise for yielding wide coverage systems. However, they consistently miss a small but significant subset of cases that
are not accessible to rough-grained techniques of interpretation. I propose a sequenced model which first applies the most computationally
efficient and inexpensive methods to resolution and then progresses successively to more costly techniques to deal with cases not handled
by previous modules. Confidence measures evaluate the judgements of each component in order to determine which instances of anaphora
or ellipsis are to be passed on to the next, more fine-grained subsystem.

Download the ps file
Download the pdf file

   

In G. Alberti, K. Balogh, and P. Dekker (eds.), Proceedings of the Seventh Symposium for Logic and Language, Pecs, Hungary, 2002, pp. 37-46.

                     A Higher-Order Fine-Grained Logic for Intensional Semantics

                                                              Chris Fox                                             Shalom Lappin
                                      Department of Computer Science            Department of Computer Science
                          
                        University of Essex                                   King's College, London
                                             foxcj@essex.ac.uk                                  lappin@dcs.kcl.ac.uk
                                                                                                   
and
                                                                             Carl Pollard
                                                                    Department of Linguistics
                                                                  Ohio State University
                                                                   pollard@ling.ohio-state.edu

                                                                              June, 2002

This paper describes a higher-order logic with fine-grained intensionality (FIL). Unlike traditional Montagovian type theory,
intensionality is treated as basic, rather than derived through possible worlds. This allows for fine-grained intensionality without
impossible worlds. Possible worlds and modalities are defined algebraically. The proof theory for FIL is given as a set of
tableau rules, and an algebraic model theory is specified. The proof theory is shown to be sound relative to this model theory.
FIL avoids many of the problems created by classical coarse-grained intensional logics that have been used in formal and
computational semantics.

Download the ps file
Download the pdf file

  
   
In G. Alberti, K. Balogh, and P. Dekker (eds.), Proceedings of the Seventh Symposium for Logic and Language, Pecs, Hungary, 2002, pp. 47-56.

                                             Intensional First-Order Logic with Types

                                                             Chris Fox                                     Shalom Lappin
                                               Department of Computer Science       Department of Computer Science
                                                                 
University of Essex                        King's College, London
                                                        foxcj@essex.ac.uk                           lappin@dcs.kcl.ac.uk
                                                                                            and
                                                                                     Carl Pollard
                                                                          Department of Linguistics
                                                                             Ohio State University
                                                                         pollard@ling.ohio-state.edu

                                                                                                           June, 2002

The paper presents Property Theory with Curry Typing (PTCT), in which the language of terms and well-formed formulae is
joined by a language of types. In addition to supporting fine-grained intensionality, the basic theory is essentially first-order,
so that implementations using the theory can apply standard first-order theorem proving techniques. Some extensions to the
type theory are discussed, including type polymorphism and enriching the system with sufficient number theory to account for
quantifiers of proportion, such as most.

Download the ps file
Download the pdf file

 
   
In S. Wintner (ed.), Proceedings of the Workshop on Natural Language Understanding and Logic Programming, Copenhagen, 2002, pp. 175-192.
                                  
                                 First-Order Curry-Typed Logic for Natural Language Semantics

                                                           Chris Fox                                     Shalom Lappin
                                           Department of Computer Science       Department of Computer Science
                                                          
University of Essex                         King's College, London
                                                   foxcj@essex.ac.uk                           lappin@dcs.kcl.ac.uk
                                                                                       and
                                                                                Carl Pollard
                                                                      Department of Linguistics
                                                                         Ohio State University
                                                                     pollard@ling.ohio-state.edu
                                                                                                
                                                                                                     May, 2002

The paper presents Property Theory with Curry Typing (PTCT), in which the language of terms and well-formed formulae is joined by a language of types. In addition to supporting fine-grained intensionality, the basic theory is essentially first-order, so that implementations using the theory can apply standard first-order theorem proving techniques. The paper sketches a system of tableau rules that implements the theory. Some extensions to the type theory are discussed, including the possibility of adding type polymorphism, which provides a useful analysis of conjunctive terms. Such terms can be given a single polymorphic type that expresses the fact that they can conjoin phrases of any one type, yielding an expression of the same type.

Download the ps file
Download the pdf file

 
 
An updated version of a paper in P. de Groote, G. Morrill, and C. Retore (eds.) (2001), Logical Aspects of Computational Linguistics, Springer Lecture Notes in Artificial Intelligence, Berlin and New York.

A Framework for the Hyperintensional Semantics of Natural Language with Two Implementations

                                                                                         Chris Fox and Shalom Lappin
                                                                                                                
Dept. of Computer Science
                                                                                                                    
King's College, London
                                                                                                         
The Strand, London WC2R 2LS
                                                                                                                         
United Kingdom
                                                                                                            
{foxcj,lappin}@dcs.kcl.ac.uk

                                                                                                                               April, 2001

In this paper we present a framework for constructing hyperintensional semantics for natural language. On this approach, the axiom of extensionality is discarded from the axiom base of a logic. Weaker conditions are specified for the connection between equivalence and identity which prevent the reduction of the former relation to the latter. In addition, by axiomatising
an intensional number theory we can provide an internal account of proportional cardinality quantifiers, like most. We use a (pre-)lattice defined in terms of a (pre-)order that models the entailment relation. Possible worlds/situations/indices are then prime filters of propositions in the (pre-)lattice. Truth in a world/situation is then reducible to membership of a prime filter. We show how this approach can be implemented within (i) an intensional  higher-order type theory, and (ii) first-order property theory.

Download the ps file (Copyright Springer-Verlag)


 
 

In J. van Kuppevelt and R. Smith (eds.) (2003), Current and New Directions in Discourse and Dialogue, Kluwer, pp. 161-181.

Full Paraphrase Generation for Fragments in Dialogue

Christian Ebert, Shalom Lappin,
Department of Computer Science
King's College, London
{ebert,lappin}@dcs.kcl.ac.uk, howard.gregory@kcl.ac.uk

Howard Gregory
Seminar fuer Englische Sprachwissenschaft
Georg-August-Universitaet Goettingen
howard.gregory@phil.uni-goettingen.de

and

Nicolas Nicolov
IBM T. J. Watson Research Center

nicolas@us.ibm.com

July, 2002

Using SHARDS, a Semantically-based HPSG Approach to the Resolution of Dialogue fragments, we show how to generate full paraphrases for fragments in dialogue. We adopt a template-filler approach that does not require deep generation from an underlying semantic representation. Instead it reuses the results of the parse and interpretation process to dynamically compute templates and to update fillers as the dialogue proceeds. This recycling of already available syntactic and phonological information makes generation efficient, as it reduces the operations of the generator to mere string manipulations.

Download the ps file
Download the pdf file