
An Architecturally-based Theory of Human

Sentence Comprehension

Richard Lawrence Lewis

December 18, 1993

CMU-CS-93-226

Computer Science Department

Carnegie Mellon University

Pittsburgh, PA

Submitted in partial fulfillment of the requirements

for the degree of Doctor of Philosophy.

Thesis Committee:

Jill Fain Lehman, Chair

Jaime Carbonell, Computer Science Department

Marcel Just, Department of Psychology

Bradley Pritchett, Department of Philosophy

Allen Newell, Former Chair

© 1993 Richard L. Lewis

This research was sponsored in part by the Wright Laboratory, Aeronautical Systems Center, Air Force Materiel Command, USAF, and the Advanced Research Projects Agency (ARPA) under grant number F33615-93-1-1330, and in part by a National Science Foundation fellowship.

The views and conclusions contained in this document are those of the author and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of Wright Laboratory, NSF, or the U.S. government.

Keywords: artificial intelligence, cognitive simulation, natural language processing, parsing and understanding, Soar

Abstract

This thesis presents NL-Soar, a detailed computational model of human sentence comprehension that accounts for a broad range of psycholinguistic phenomena. NL-Soar provides in-depth accounts of structural ambiguity resolution, garden path effects, unproblematic ambiguities, parsing breakdown on difficult embeddings, acceptable embeddings, immediacy of interpretation, and the time course of comprehension. The model explains a variety of both modular and interactive effects, and shows how learning can affect ambiguity resolution behavior. In addition to accounting for the qualitative phenomena surrounding parsing breakdown and garden path effects, NL-Soar explains a wide range of contrasts between garden paths and unproblematic ambiguities, and difficult and acceptable embeddings: the theory has been applied in detail to over 100 types of structures representing these contrasts, with a success rate of about 90%. The account of real-time immediacy includes predictions about the time course of comprehension and a zero-parameter prediction about the average rate of skilled comprehension. Finally, the theory has been successfully applied to a suggestive range of cross-linguistic examples, including constructions from head-final languages such as Japanese.

NL-Soar is based on the Soar theory of cognitive architecture, which provides the underlying control structure, memory structures, and learning mechanism. The basic principles of NL-Soar are a result of applying these architectural mechanisms to the task of efficiently comprehending language in real time. Soar is more than an implementation language for the system: it plays a central theoretical role and accounts for many of the model’s novel empirical predictions.


To my parents

and

the memory of Allen Newell


Contents

Acknowledgments

1 Introduction

1.1 The core sentence-level phenomena

1.2 Architectural basis

1.3 An implemented system

1.4 Brief preview of theory and major results

1.5 Reader’s guide to the thesis

2 Human Sentence Comprehension: Phenomena and Previous Work

2.1 The products of comprehension

2.1.1 Referential representation

2.1.2 Semantic representation

2.1.3 Syntactic representation

2.1.4 Summary

2.2 Immediacy and the time course of comprehension

2.2.1 Immediacy of syntactic parsing

2.2.2 Immediacy of semantic interpretation

2.2.3 Immediacy of referential processing

2.2.4 Theories of immediacy

2.3 Structural ambiguity resolution

2.3.1 Structural preferences

2.3.2 Lexical preferences

2.3.3 Semantic and contextual effects

2.3.4 Limited parallelism

2.3.5 Theories of ambiguity resolution

2.4 Garden path effects and unproblematic ambiguities

2.4.1 Garden path phenomena defined

2.4.2 Unproblematic ambiguities

2.4.3 Some general garden path phenomena

2.4.4 Garden path theories

2.5 Parsing breakdown and acceptable embeddings

2.5.1 Parsing breakdown defined

2.5.2 Acceptable embeddings

2.5.3 Some general parsing breakdown phenomena

2.5.4 Theories of parsing breakdown

2.6 Summary of the phenomena

3 The NL-Soar Theory

3.1 Preliminaries: architectures and Soar

3.1.1 What is a cognitive architecture?

3.1.2 The Soar architecture

3.1.3 Language and architecture

3.2 The basic structure of NL-Soar

3.2.1 Comprehension operators and the real time constraint

3.2.2 The structure of comprehension operators

3.2.3 From deliberation to recognition: comprehension as a skill

3.3 The utterance model

3.3.1 What the utterance model represents

3.3.2 How the utterance model represents (or, why call it a model?)

3.3.3 Constructing the utterance model

3.3.4 Chunking new u-constructors

3.4 The situation model

3.4.1 What the situation model represents

3.4.2 How the situation model represents

3.4.3 Constructing the situation model and chunking s-constructors

3.5 Reference resolution

3.5.1 Reference resolution as recognition

3.5.2 Building recognition chunks

3.5.3 Long term memory of content: reconstructive

3.5.4 General features of reference resolution

3.6 The control structure of comprehension

3.6.1 Syntactic ambiguity

3.6.2 Semantic ambiguity

3.6.3 Functional parallelism

3.7 Evaluating a space of alternative models

3.7.1 The space

3.7.2 The evaluation

3.8 Summary of the theory

4 Structural Ambiguity Resolution

4.1 Modular effects

4.1.1 The limit of two attachment sites

4.1.2 An emergent recency preference

4.1.3 Object/subject and other “act vs. do-nothing” ambiguities

4.1.4 Time pressure: failure to bring knowledge to bear

4.1.5 Masking effects: linguistic Einstellung

4.2 Interactive effects

4.3 Summary: The NL-Soar theory of ambiguity resolution

4.4 Is NL-Soar modular?

4.5 General discussion

5 Parsing Breakdown and Acceptable Embeddings

5.1 The NL-Soar theory of parsing breakdown

5.2 Predictions on the PB and AE collection

5.2.1 Right and left-branching

5.2.2 Center-embedded relatives

5.2.3 Subject sentences

5.2.4 Complements of nominals

5.2.5 Clefts

5.2.6 Though-preposing

5.2.7 Pied-piping

5.3 Accounting for the major qualitative phenomena

5.4 Summary and general discussion

6 Garden Path Effects and Unproblematic Ambiguities

6.1 The NL-Soar theory of garden path effects

6.2 Predictions on the GP/UPA collection

6.2.1 Object/subject and object/specifier ambiguities

6.2.2 Complement/adjunct ambiguities

6.2.3 Main verb/reduced relative ambiguities

6.2.4 Lexical ambiguities

6.2.5 Filler-gap ambiguities

6.2.6 Small clauses, coordination and other miscellany

6.3 Accounting for the major qualitative phenomena

6.4 Summary and general discussion

7 Immediacy of Interpretation and the Time Course of Comprehension

7.1 NL-Soar as an immediate interpreter

7.2 Satisfying the real-time constraint

7.3 Predictions about the time course of comprehension

7.4 Summary and discussion

8 Cross-linguistic Phenomena

8.1 Parsing breakdown on verb-final structures

8.2 Garden paths and unproblematic ambiguities

8.2.1 Japanese

8.2.2 Korean

8.2.3 Mandarin Chinese

8.2.4 Hebrew

8.2.5 German

8.3 Summary

9 General Discussion and Conclusion

9.1 Summary: the model and its predictions

9.2 The role of the architecture

9.3 Some theoretical issues

9.3.1 The A/R set and the magic number two

9.3.2 Individual differences

9.3.3 Grammar and parser

9.3.4 Learning

9.4 Related theoretical work

9.4.1 READER and CC READER

9.4.2 Pritchett’s model

9.4.3 Gibson’s model

9.5 Challenging areas for future work

9.6 Contributions and conclusion

Bibliography

List of Tables

2.1 The products of comprehension.

2.2 Some structural parsing preferences.

2.3 Some studies demonstrating modularity effects.

2.4 Some studies demonstrating interactive effects.

2.5 A collection of garden path constructions (part 1 of 3).

2.6 A collection of garden path constructions (part 2 of 3).

2.7 A collection of garden path constructions (part 3 of 3).

2.8 A collection of unproblematic ambiguities (part 1 of 3).

2.9 A collection of unproblematic ambiguities (part 2 of 3).

2.10 A collection of unproblematic ambiguities (part 3 of 3).

2.11 A collection of constructions causing parsing breakdown (part 1 of 2).

2.12 A collection of constructions causing parsing breakdown (part 2 of 2).

2.13 A collection of acceptable embedded structures (part 1 of 3).

2.14 A collection of acceptable embedded structures (part 2 of 3).

2.15 A collection of acceptable embedded structures (part 3 of 3).

2.16 Structural metrics for parsing breakdown.

2.17 Architectural theories of parsing breakdown.

3.1 The time scale of Soar processes.

3.2 Mapping comprehension functions onto problem space functions.

3.3 Dependencies in the parameter space.

3.4 Independent evaluation of parameter/value pairs.

3.5 Evaluation of parameter/value pairs, taking into account the dependencies.

3.6 Evaluation of parameter/value pairs in restricted subspace.

5.1 Summary of predictions on AE/PB collection.

6.1 Summary of predictions on the UPA/GP collection.

8.1 Summary of NL-Soar’s cross-linguistic coverage.

9.1 The basic characteristics of NL-Soar.

9.2 Summary of NL-Soar’s predictions.

9.3 Summary of NL-Soar’s predictions on the GP/UPA/PB/AE collections

9.4 Varieties of learning in NL-Soar.

List of Figures

2.1 How to minimally attach a PP

2.2 Dominance relations in minimal commitment models

2.3 How the On-line Locality Constraint predicts a garden path

2.4 Computing maximal local nonterminal counts

2.5 How the phenomena mutually constrain the theory

3.1 The Soar architecture

3.2 Recognitional comprehension

3.3 Comprehension as deliberation and recognition

3.4 X-bar schema

3.5 X-bar phrase structure for a complex NP

3.6 Attribute-value structures for the utterance model

3.7 The A/R set during John hit the ball.

3.8 Results of lexical access

3.9 Building the utterance model with link operators

3.10 Checking constraints for proposed links

3.11 An adjoined structure

3.12 Traces in syntactic structure

3.13 Phrase structure for a long distance dependency

3.14 How structural parallelism arises in the utterance model

3.15 Repairing the inconsistency with the snip operator

3.16 Repairing a complement inconsistency with snip

3.17 The problem spaces for building u-constructors

3.18 An example of a situation model

3.19 Constructing the situation model

3.20 The problem spaces for building s-constructors

3.21 Resolve operators recognize the situation model

3.22 Learning new reference resolution recognition chunks

3.23 Recall by reconstruction

3.24 Syntactic ambiguity manifested as multiple u-constructor

3.25 Resolving syntactic ambiguity recognitionally

3.26 Parallelism across functions in NL-Soar

4.1 First pass through The car examined.

4.2 Carefully re-comprehending The car examined.

4.3 Comprehending The car examined after learning.

5.1 Growth of right branching structures

5.2 Left branching

5.3 Singly-embedded object relative

5.4 Structure for Wh-question

5.5 The wrong way to analyse subject sentences

5.6 Topicalization analysis of sentence subjects

5.7 Nominalized subject sentences

5.8 Acceptable embedded subject sentences

5.9 Embedded complements of nominals

5.10 A pseudo-cleft

5.11 Though-preposing

5.12 Pied-piping

5.13 Qualitative phenomena of parsing breakdown (from Chapter 2).

6.1 Repairing an unproblematic subject/object ambiguity

6.2 Failure to repair a subject/object ambiguity

6.3 A reverse subject/object garden path

6.4 Repairing an unproblematic object/specifier ambiguity

6.5 A garden path involving a double complement

6.6 A double-object/relative garden path

6.7 Repairing a double object ambiguity

6.8 An object/object garden path

6.9 Repairing an unproblematic complement/adjunct ambiguity

6.10 Repairing an unproblematic predicate complement/modifier ambiguity

6.11 The main verb reading of The horse raced past the barn.

6.12 The reduced relative reading of The horse raced past the barn.

6.13 The main verb/reduced relative garden path

6.14 An unproblematic reduced relative ambiguity

6.15 Repairing an unproblematic noun/verb ambiguity

6.16 Repairing a noun/verb ambiguity preceded by an adjective/noun

6.17 A garden path involving a noun/adjective ambiguity

6.18 Repairing a pronoun/determiner ambiguity

6.19 A main verb/auxiliary garden path

6.20 Repairing an inflection marker/preposition ambiguity

6.21 Triggering the repair of a filler-gap ambiguity

6.22 Repairing a VP small clause ambiguity

6.23 A repair involving coordination

6.24 Multiple compounding

6.25 General garden path phenomena

7.1 Efficiency increase due to chunk transfer.

7.2 Data from Instructo-Soar

7.3 Elaboration phase of u-constructor

7.4 Immediacy and time-course summary

8.1 Head-final phrase structure.

8.2 Repairing a Japanese unproblematic ambiguity

9.1 Qualitative comparison of NL-Soar theory to existing theories

Acknowledgments

I must first acknowledge my mentor and friend for five years, Allen Newell. Though Allen did not live to see the completion of this thesis, if there is one hope I have for this work, it is that his influence would still be seen clearly in it. This thesis is not only a product of his scientific ideas, but his endless patience with me as well. Working with Allen was exciting from day one, and the excitement never wore off. I am deeply thankful for the time I was given to spend with him, and dedicate this thesis to his memory.

Jill Lehman’s role as colleague and collaborator on the project has been indispensable. Jill is a student of language in a way that I (or Allen, for that matter) have never been. Jill also had the unenviable task of assuming the role of my faculty advisor after Allen died. Had I been in her shoes, I wouldn’t have wanted to do it, but I’m glad she was there.

I was delighted to be able to interest Jaime Carbonell, Marcel Just, and Brad Pritchett in this work, enough to serve on my thesis committee. Through their feedback, and their own research, they have improved the quality of this thesis. Brad in particular has had a major impact on this work. His gracious lending of linguistic expertise over the past two years helped push the model in important new directions.

Over the years a number of people have been associated with the NL-Soar group, and have made working on the project great fun. Besides Allen and Jill, Scott Huffman, Greg Nelson, Robert Rubinoff, and Shirley Tessler have all contributed in various ways. NL-Soar got its start before I showed up as a grad student: Gregg Yost worked with Allen on NL-Soar in preparation for Allen’s William James lectures. Gregg was a great help early on as I was trying to figure out how Soar worked.

The local Soar community at CMU provided a terrific environment in which to do cognitive science. The larger Soar community, from Los Angeles to Groningen, has also been a continuous source of ideas and encouragement (not to mention travel opportunities!). Although we’ve never been able to deliver that elusive NL module that the community wants, it’s been great to have a number of people around the world actually care about how the work is proceeding.

I knew almost immediately that the Computer Science Department at Carnegie Mellon was the place I wanted to spend my years as a graduate student, and never once regretted the decision to come here. Although the department continues to experience growth pains, it has been hard for me to imagine a better environment. Of course, having world-class psychology and computational linguistics programs across the lawn doesn’t hurt either.

(Stepping back now and looking at all of these communities, from the local group to the global group, it’s amazing to consider that in each case Allen played a significant role in creating and sustaining them.)

I’ve been richly blessed for a long time with teachers and professors who not only put in the time to give me a quality education, but took a personal interest in me (perhaps on many occasions I gave them little choice...), challenging and encouraging every step of the way. It is with great pleasure that I thank Dan Buckman, Fernando Gomez, Charlie Hughes, Mike Moshell, Odell Payne, Chuck Phelps, and Sarah Jane Turner. It is by no means overstatement to say that without them I would not be writing these acknowledgments.

What I value most about the last six and a half years in Pittsburgh are the friends I’ve made. Coming into work each day is not only fun, but educational, if you have the officemates I’ve had: Jim Aspnes, Stewart Clamen, Bob Doorenbos, Jade Goldstein, and Aarti Gupta. I’m going to miss them a great deal. Anurag Acharya, Tony Simon, Milind Tambe, and Erik Altmann, among others, made life inside and outside the university more enjoyable (if not for Erik I might still be taking the theory qualifier). It’s just not the same watching Seinfeld or hockey without Alonso and Risi Vera, and lots of pizza. It’s been a delight to see Alonso and Risi grow together and I’m thankful for the time I got to spend with them. I also want to thank Noel for always making me feel welcome (where will I wash and wax my car now?).

The Squirrel Hill Christian Church took me in with open arms when I first came to Pittsburgh, and gave me a home-away-from-home for the next few years. I’ll always be thankful for their prayers and challenges during that time. I’m also going to miss the Allegheny Center Alliance Church, especially the Snider’s cell group. That group has played a greater role in my spiritual growth than they may ever know. Their prayers helped get me through the final months working on the thesis. And I know Mrs. Farmer’s unfailing prayers were always heard from McKeesport.

One of my regrets in leaving Pittsburgh is that I won’t get to see Sam and Sherry and the whole “Brunsvold” clan. I knew I could always count on Sam to be there through difficult times and joyful times. He made a significant impact on my life not only personally, but through his dedication to the campus ministry.

One of the joys of leaving Pittsburgh is that I’ll be closer to Thad and Norma Polk and my new goddaughter Sarah! They have so enriched my life I can’t believe how fortunate I’ve been that I got a chance to meet them in grad school. I have asked more of Thad than anyone should have to put up with. But I’ll never forget all the fun we had while we were at CMU together.

My debt to friends (which is by now reaching embarrassing proportions) hardly stops in Pittsburgh. Seth Vanderdrift and Susan Smith have always been there over the past 15 years, even though we were separated by a thousand miles during the last six. I can’t imagine what getting through school would have been like without them.

Finally, I want to thank my family, and especially my parents. I continually thank God for their unfailing encouragement, support, example, and love. This thesis is dedicated to them as well.

Chapter 1

Introduction

Now, the overwhelmingly puzzling problem about sentence comprehension is how people manage to do it so fast.

-- Janet Fodor, Jerry Fodor, and Merrill Garrett (1975)

FODOR, FODOR, AND GARRETT certainly had it right. The ability to comprehend language in real time is one of the most complex and impressive of human cognitive skills. Equally impressive is the staggering amount of scientific effort that has been devoted to exploring the processes of comprehension. Few topics engage so many disciplines within cognitive science.

Over the past three decades, psychologists have uncovered regularities about aspects of comprehension ranging from lexical access to memory for text. Although many theories have been proposed to explain these regularities, most address a small set of phenomena, and only a few take the form of complete computational models. In artificial intelligence, there has been more concern for building processing models with increasing functional coverage, but most complete NLP systems still do not model any appreciable set of psychological phenomena.

A notable exception is the READER model of Thibadeau, Just, and Carpenter (1982), which is one of the earliest examples of a complete, functional comprehension system that attains some measure of psychological plausibility. The continued development of this theory (Just & Carpenter, 1992), along with some recent theories emerging from linguistics and computational linguistics (Gibson, 1991; Kempen & Vosse, 1989; Pritchett, 1988; Weinberg, 1993), indicates that unified computational accounts of certain aspects of sentence comprehension are within reach. Each of these theories addresses a significant range of phenomena with a single set of mechanisms or principles (a discussion of these and other theories appears in Chapters 2 and 9).

This thesis takes another significant step toward a unified theory of sentence comprehension by presenting a computational model, NL-Soar, that satisfies the following goals:

1. Breadth. The theory models a wider range of psychological phenomena than has previously been given a cohesive account.

2. Depth. The theory models the phenomena with a depth matching or exceeding the current best theories for those phenomena.

3. Architectural basis. The theory is embedded in an independently motivated theory of the cognitive architecture.

4. Functionality. The theory functions as a working comprehension system.

The remainder of this chapter elaborates these goals by providing an overview of the target phenomena, an explanation of what it means for the theory to be architecturally-based, and a preview of the theory and major results. The chapter concludes with a reader’s guide to the remainder of the thesis.

1.1 The core sentence-level phenomena

NL-Soar addresses six kinds of phenomena that form a cluster of regularities at the sentence level. The phenomena are primarily about the on-line processes involved in piecing together the words in a sentence to form a meaning. Though NL-Soar necessarily embodies some plausible assumptions about lower-level processes such as lexical access, and higher-level processes such as creating a long-term memory of the comprehended content, the theory does not yet model the phenomena at these levels in significant detail. However, the sentence-level processes and the phenomena surrounding them form an important core that must ultimately be addressed by any comprehension model. The phenomena are:

1. Immediacy of interpretation and the time course of comprehension. Our subjective experience is that we comprehend language incrementally, understanding each word as it is heard or read. As a hypothesis about the comprehension process, this has been advanced as the principle of immediacy of interpretation (Just & Carpenter, 1987), and much experimental evidence has accumulated in support of it. In general, immediacy holds for all levels of comprehension: syntactic parsing, semantic interpretation, and reference resolution. Furthermore, this immediate comprehension happens rapidly, at an average rate of 240 words per minute in skilled reading.

Although the average time per word is 250 ms, eye fixation studies also reveal that fixations range from as little as 50 ms to 1000 ms or more.

2. Ambiguity resolution. When readers or listeners encounter an ambiguity, how do they decide which interpretation to give it? A theory of comprehension must specify what knowledge is brought to bear in resolving ambiguities, and how and when that knowledge is brought to bear. There are several kinds of ambiguities that arise in comprehension, ranging from lexical-semantic to referential, but here we primarily focus on structural ambiguity: alternative interpretations that arise because the partial utterance is consistent with multiple syntactic parses. (1) below gives a simple example:

(1) The cop saw the dog with the binoculars.


Sentence (1) exhibits a structural ambiguity: the prepositional phrase “with the binoculars” can attach to “saw” or “dog”. General knowledge may prefer to interpret binoculars as the instrument of the seeing, but in certain specific contexts the binoculars may be associated with the dog.

The empirical evidence concerning the knowledge sources used to resolve ambiguities is mixed. Some studies have demonstrated that the semantic content of the sentence or the established discourse context can have an effect on the on-line resolution of local ambiguities. Other studies have shown the lack of such effects, demonstrating instead an apparent preference for one syntactic structure over another, independent of content or context.

3. Garden path effects. A garden path effect arises when a reader or listener attempts to comprehend a grammatical sentence with a local ambiguity, misinterprets the ambiguity, and is unable to recover the correct interpretation. The result is an impression that the sentence is ungrammatical or nonsensical. (2a) below, due to Bever (1970), is the classic example. “Raced” may be taken as the main verb of the sentence, or a relative clause modifying “horse”. The relative interpretation is globally correct. ((2b) has a parallel structure, but “driven” is unambiguous, so the garden path is avoided.)

(2) (a) The horse raced past the barn fell.

(b) The car driven past the station stopped.

The subjective experience provides compelling linguistic evidence for the unacceptability of these sentences, but additional experimental evidence comes from reading times and grammaticality judgments. The reduced relative construction in (2a) is but one kind of garden path; a collection of 26 different types is presented in Chapter 2.

Though the garden path effect has been well known since Bever’s (1970) article, Pritchett (1988) was the first to deal in depth with the variety of constructions.

4. Unproblematic ambiguities. Some local ambiguities do not cause difficulty no matter which interpretation proves to be the globally correct one. Consider the pair of sentences in (3):

(3) (a) I know John very well.

(b) I know John went to the store.

There is a local ambiguity at “John”, since it could either be the direct object of “know” or the subject of an incoming clause. Yet, regardless of the final outcome, the sentence causes no perceptible processing difficulty. There are a wide variety of constructions with unproblematic local ambiguities; Chapter 2 presents a collection of 31 different kinds. These constructions provide additional constraint for a theory intended to model garden path effects: the posited mechanism must be weak enough to predict difficulty on garden paths, but not so weak that it cannot process the unproblematic ambiguities.
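Two points from the list above can be made concrete with a short Python sketch. It is an illustration added for the reader, not part of NL-Soar: it converts the 240 words-per-minute reading rate of item 1 into the corresponding mean time per word, and it writes out the two attachment sites for the prepositional phrase in sentence (1). The function names and the bracket notation are assumptions chosen for this example and do not reflect the thesis's own phrase-structure representation.

```python
def ms_per_word(words_per_minute: float) -> float:
    """Convert a reading rate in words per minute to mean milliseconds per word."""
    return 60_000 / words_per_minute


def pp_attachments(subject: str, verb: str, obj: str, pp: str) -> dict:
    """Return the two bracketings of 'subject verb obj pp'.

    The labels and bracket format are hypothetical and only illustrate
    where the prepositional phrase attaches.
    """
    # PP attached to the verb phrase: the PP modifies the action (instrument reading).
    verb_attach = f"[S {subject} [VP {verb} [NP {obj}] [PP {pp}]]]"
    # PP attached inside the object noun phrase: the PP modifies the object.
    noun_attach = f"[S {subject} [VP {verb} [NP {obj} [PP {pp}]]]]"
    return {"verb": verb_attach, "noun": noun_attach}


if __name__ == "__main__":
    print(ms_per_word(240))  # mean time per word at 240 words per minute
    parses = pp_attachments("The cop", "saw", "the dog", "with the binoculars")
    for site, tree in parses.items():
        print(site, tree)
```

The two bracketings differ only in whether the PP closes inside or outside the object NP, which is the sense in which the word string alone leaves the attachment undetermined.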
