

Loquens 1(1), January 2014, e004

eISSN 2386-2637

doi: http://dx.doi.org/10.3989/loquens.2014.004

What is computational phonology?

Robert Daland

University of California, Los Angeles. e-mail: rdaland@humnet.ucla.edu

Citation / Cómo citar este artículo: Daland, R. (2014). What is computational phonology? Loquens, 1(1), e004. doi: http://dx.doi.org/10.3989/loquens.2014.004

ABSTRACT: Computational phonology is not one thing. Rather, it is an umbrella term which may refer to work on formal language theory, computer-implemented models of cognitive processes, and corpus methods derived from the literature on natural language processing (NLP). This article gives an overview of these distinct areas, identifying commonalities and differences in the goals of each area, as well as highlighting recent results of interest. The overview is necessarily brief and subjective. Broadly speaking, it is argued that learning is a pervasive theme in these areas, but the core questions and concerns vary too much to define a coherent field. Computational phonologists are more united by a shared body of formal knowledge than they are by a shared sense of what the important questions are.

KEYWORDS: computational phonology

RESUMEN: ¿Qué es la fonología computacional?.- La fonología computacional no representa un campo unitario, sino que es un término genérico que puede hacer referencia a obras sobre teorías de lenguajes formales; a modelos de procesos cognitivos implementados por ordenador; y a métodos de trabajo con corpus, derivados de la bibliografía sobre procesamiento del lenguaje natural (PLN). Este artículo ofrece una visión de conjunto de estas distintas áreas, identifica los puntos comunes y las diferencias en los objetivos de cada una, y pone de relieve algunos de los últimos resultados más relevantes. Esta visión de conjunto es necesariamente breve y subjetiva. En términos generales, se argumenta que el aprendizaje es un tema recurrente en estos ámbitos, pero las preguntas y los problemas centrales varían demasiado como para definir un área de estudio unitaria y coherente. Los fonólogos computacionales están unidos por un cúmulo común de conocimientos formales más que por un parecer compartido acerca de cuáles son las preguntas importantes.

PALABRAS CLAVE: fonología computacional

1. INTRODUCTION

What does it mean to be a scientific field of inquiry? Proceeding inductively, we might observe that well-established fields tend to exhibit the following properties:

(i) a core set of observable phenomena, which the field seeks to explain

(ii) a core set of research questions the field asks about those phenomena

(iii) a shared set of background knowledge that is in part specific to the field

(iv) a shared 'toolbox' of research methods used for gaining new knowledge

These properties exhibit a granularity of scale; within one field there may be sub-fields which ask more specific questions, assume greater amounts of shared knowledge than the field as a whole, and utilize a restricted set of methodologies. For example, linguistics is a rather wide field of inquiry; within this field there is a sub-field devoted to the study of syntax specifically. Because science is a dynamic and evolving enterprise, scientific fields exhibit the same kind of taxonomic structure as other evolutionary systems, such as species and languages: subfields may have sub-subfields of their own, and particular sub-fields may have more in common with a different field than the 'parent' field. For example, psycholinguistics can be considered a sub-field

Copyright: © 2014 CSIC. This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial (by-nc) Spain 3.0 License.


of linguistics, but the research methods and the specialized knowledge specific to that field are arguably closer to the field of psychology. But, then, which of properties (i)-(iv) are essential to a field? The answer to this question will inform our answer to the question, "What is computational phonology?"

Some perspective on this question can be gained by considering the historical development of a field. Fields can occasionally form, or shift dramatically in character, with the emergence of a charismatic and persuasive thinker or a seminal publication. This was arguably the case in linguistics with Chomsky's review of B. F. Skinner's Verbal Behavior (1959) and other related publications (Chomsky, 1956). Fields may also stratify to the extent that it is worth considering them as two different fields. For instance, most of the scientific fields we know today have their roots in philosophy. Fields may coalesce by the identification of similar strands of thought in fields that were formerly separate; such is arguably the case with the field of cognitive science, or more specifically with psycholinguistics. In the case of newer, less-established fields, especially those which coalesced from multiple other fields, there is a much smaller core of shared, field-specific knowledge. Arguably, the codification of a shared body of field-specific knowledge is the consequence of establishing academic programs/departments for a given field, rather than a cause or necessary property of fieldhood. As for the research methods of a field, they are ever-changing. Methodology might be used to characterize a field at a particular historical moment, but most fields persist through several methodological turnovers. For example, the increase in computer resources over the last 50 years has revolutionized linguistic methodology, but the questions we ask now are arguably the same ones that Chomsky laid out in the 1950s: How do children learn language? Out of the space of logically imaginable linguistic patterns, why do many systematically not occur? To what extent can the occurrence/non-occurrence of linguistic patterns be explained by functional aspects of communication, and to what extent is it determined by properties of the cognitive system(s) that process and represent language?

There is room for legitimate disagreement on this point, but for many researchers, a field revolves around a set of empirical phenomena, and a key set of research questions the field seeks to answer about those phenomena. By this standard, I will suggest that computational phonology is not really a single field. Rather, the phrase 'computational phonology' is used as an umbrella term for research which generally presupposes a shared, specific set of background knowledge and uses a common set of research methodologies, but often with radically diverging questions. I will make this case by surveying recent progress in four different subfields, all of which I or colleagues have identified as 'computational phonology'. We shall see there is a general emphasis on learning, and that all or most practitioners have a common background in corpus and finite-state methods, but the sub-fields themselves differ quite radically in what the research questions are.

Prior to the survey, it is necessary to voice a caveat. The view of the field that I present is my own. I make no claim that the survey below is comprehensive, or unbiased; in fact, I avow that this review is strongly biased toward my own research interests, the readings I have done, and by informal conversations I have had with colleagues. I have surely omitted mention of a great deal of important and interesting work, either from time/space constraints or because I have not yet had the honor of being exposed to it. Still, as a multidisciplinary researcher I hope that all readers will find something new within these pages, and I have aimed for the fairest, most scrupulous and scholarly tone for the works I was able to review here. Prior to the body of the paper I briefly review background material.

2. BACKGROUND

2.1. What is phonology?

I assume that the reader of this article has some background in formal linguistics, perhaps equivalent to a one-year undergraduate sequence covering phonetics, phonology, and other core areas. For example, I assume the reader is familiar with the concept of underlying representation (UR; also called lexical representation, or input) versus surface representation (SR; also called output), and the convention that URs are indicated with slashes / / while SRs are indicated with brackets [ ]; I assume knowledge of the terms 'segment', 'syllable', 'onset', 'coda', et cetera, and the International Phonetic Alphabet. Still, as I anticipate some readers will come from a computational background where the study of speech sounds is not emphasized, I will briefly describe here core concepts which figure prominently in the paper.

2.1.1. Markedness

Cross-linguistically, some structural configurations appear to be dispreferred. For example, French has a complex process known as schwa deletion, in which the weak schwa vowel tends to delete, except if the deletion would create a triconsonantal cluster (Riggle & Wilson, 2005). Moreover, triconsonantal clusters do not appear in many languages, and tend to have a restricted distribution in languages that allow them at all. It appears as if French and many other phonologies are specifically avoiding this 'marked' configuration. The proper treatment of markedness is a core concern in phonological theory. What structural configurations are marked? How is markedness represented in the minds of speakers? How is markedness acquired: is it learned from phonetics, projected from the lexicon, or something else?



2.1.2. Alternations

Alternation is the name given to cases in which the same phonological entity appears with two or more forms. For example, compare my casual pronunciations of the English words pentagon and pentagonal:

(1)
pentagon      [ˈpʰɛ̃ɾ̃əˌɡɑn]
pentagonal    [ˌpʰɛnˈtʰæɡən-əl]

In (1), segment-to-segment identity is indicated by vertical alignment. Non-identical correspondents are vertically aligned, but indicated with a vertical bar or slash. Every corresponding vowel is different between these two forms, owing to the different position of stress. In addition, the medial coronal stop /t/ is aspirated in pentagonal because it precedes the stressed vowel, while it lenites to a flap in pentagon because it precedes an unstressed vowel (and additionally coalesces with the nasal to yield a nasalized flap). The proper treatment of alternations, wherein the 'same' phonological unit varies according to its context, is also a core concern of phonological theory.

2.1.3. Opacity

Opacity arises when the surface evidence for a phonological process is inconsistent. For example, Baković (2007) gives the following, well-known example from Yokuts Yawelmani:

(2)
UR                            /ʔiliː+l/
Long High Vowel Lowering      ʔileːl
Closed Syllable Shortening    ʔilel
SR                            [ʔilel]

Evidently, the Long High Vowel Lowering process serves to avoid long high vowels, a marked outcome which never appears on the surface in this language (even though many URs contain underlying long high vowels). The Closed Syllable Shortening process is similarly motivated by the observation that long vowels never co-occur with coda consonants. The 'problem' in (2) is that there is no reason for both processes to apply. Closed Syllable Shortening alone would avoid both marked structures, but Long High Vowel Lowering appears to apply anyways, gratuitously 'hiding' the underlying height of the vowel. Opacity, or at least certain types of opaque patterns, are believed to present a significant learning problem.

2.1.4. The Sound Pattern of English (SPE/Rules)

Chomsky and Halle (1968) proposed a phonological analysis of English using string rewrite rules of the form AXB → AYB, typically abbreviated X → Y / A__B and read out loud as 'X goes to Y when it occurs after A and before B'. The formal mechanisms they introduced, including the treatment of segments as 'feature bundles', to which rules could refer, and language-specific rule orderings, became the dominant paradigm within the field of phonology for many years afterwards. Even as constraint-based formalisms have replaced SPE-style rules as the preferred vehicle for phonological analysis, many linguists still use rules as a convenient shorthand for describing phonological processes, e.g. in (2) above.

2.1.5. Optimality Theory

Optimality Theory, like SPE, defines the phonological grammar as a cognitive mechanism which implements the mapping from an input/UR to an output/SR, and may make reference to 'hidden' phonological structure such as metrical feet, syllables, etc. Unlike SPE, OT posits that there are multiple possible candidates for a given input, and there is a parallel computation to identify the optimal ('most harmonic') output candidate, rather than the serial/derivational process of ordered rules in SPE. Seminal works on OT (McCarthy & Prince, 1994; Prince & Smolensky, 1993, 2002, 2004; Smolensky & Legendre, 2006) define the core components of a broad class of constraint-based theories: there must be a component which proposes output candidates (GEN), a set of constraints (CON), and an evaluation/selection mechanism (EVAL) which chooses the winning candidate based on some language-specific prioritization of the constraints. Some authors use "OT" to refer broadly to any such constraint-based theory of phonology. I will use "OT" to refer to the subclass of constraint-based theories with the "total ordering" evaluation method described in Prince and Smolensky (1993, 2002, 2004) and McCarthy and Prince (1994). That is, for the purposes of this article, "OT" means that constraint conflicts are resolved in favor of the highest-ranked constraint, regardless of whether the winning candidate incurs more violations of lower-ranked constraints than alternate candidates. (Constraint conflict arises for particular inputs when it is impossible for an output to satisfy one constraint without violating another. An example is shown below in (3).)

2.1.6. Harmonic Grammar

Later in the article, I will make frequent reference to MaxEnt HG (Goldwater & Johnson, 2003; Hayes & Wilson, 2008), a probabilistic extension of Harmonic Grammar (Legendre, Miyata, & Smolensky, 1990; Smolensky & Legendre, 2006) in the log-linear framework. As the reader may not be familiar with Harmonic Grammar, I describe it very briefly here. Harmonic Grammar is a close variant of OT which differs in the evaluation procedure: constraints are weighted, rather than totally ordered, and the harmony of a form is determined by the weighted sum of its constraint violations. As with OT, this is straightforwardly illustrated with a tableau; an example of word-final devoicing is shown in (3):

(3)

/ɡad/        IdentVce[-son]    *[-son,+vcd]]PrWd    Harmony
             wt = -1           wt = -5
  [ɡad]                        *                    -1·0 + -5·1 = -5
☞ [ɡat]      *                                      -1·1 + -5·0 = -1

The UR /ɡad/ is given in the top left cell, while candidate SRs are listed below. Constraint names are given in the top row after the input; IDENTVCE[-SON] penalizes obstruents for which the output voicing value does not match the underlying voicing specification, while *[-SON,+VCD]]PRWD penalizes voiced obstruents at the end of a word. Constraint violations are marked in the cells with an '*'. For inputs with underlyingly voiced final obstruents, it is impossible to satisfy both constraints at once; thus this is an example of constraint conflict. The constraint weights are listed directly underneath the constraints themselves, and are required to be nonpositive.[1] The final column indicates the harmony value of the output candidate, defined as the weighted sum of the constraint violations. As with OT, the most harmonic output candidate (or, equivalently, the least disharmonic) is selected as the winner; this is conventionally indicated with the "OT hand" ☞. In cases where only two constraint violations trade off against one another, Harmonic Grammar is equivalent to OT; however, the two theories make different predictions when a single constraint violation conflicts with multiple violations of a different constraint (counting cumulativity) or violations of multiple constraints (ganging cumulativity).

2.2. Probability

I assume the reader is familiar with elementary statistics and probability theory. For example, I assume the reader is familiar with the concepts of p-value and t-test, and the use of the binomial formula to calculate the probability of a series of coin tosses. I also assume the reader is familiar with exponentiation and the inverse operation of taking the logarithm. Below I describe the concept of odds, and briefly outline log-linear models.

2.2.1. Odds and log-odds

The odds of two events, sometimes written a:b, indicate the relative probability of the two events. For example, if the odds are 3:2 that Lucky Horse will win the race, it means that Lucky Horse is expected to win 3 times for every 2 times that Lucky Horse does not win. Odds can always be converted to probabilities and vice versa; for example, 3:2 means that Lucky Horse will win 3 times out of 5 trials, for a probability of 3/(3+2) = 0.6. Odds can be represented as single numbers by simple division, e.g. 3:2 = 3/2 = 1.5. Thus, when there are only two possibilities, an odds of 1.5 corresponds to a probability of 0.6. The log-odds of two events A and B is simply the logarithm of their odds, i.e. log(a:b). (In general, I will assume the natural logarithm unless otherwise specified.) The log-odds has several intuitively attractive properties. It is zero when A and B are equiprobable, positive when A is more probable than B, and negative when A is less probable than B. Moreover, the greater the asymmetry in probability between A and B, the greater the magnitude of the log-odds. Finally, in many of the systems where log-odds are used, probability differences can be many orders of magnitude. The log operation makes the relative likelihood of these outcomes easier for readers to grasp.

2.2.2. Log-linear models

Log-linear models express the probability of input-outcome pairs in terms of some feature functions and associated weights. The score H_M(w) of an input-outcome pair is the weighted sum of its feature values. The output of the model M(w) is then determined by stipulating that the probability of an input-outcome pair is proportional to the exponential of its score. Formally, a log-linear model M(w) consists of a vector of feature functions f = {f_k} and a relation GEN which gives the set of possible outcomes y_ij for each input x_i. In addition, the vector w is a parameter of M, and represents the weights that are associated with the feature functions:

(4)
Pr_M(w)(y_ij | x_i) = exp(H_M(w)(x_i, y_ij)) / Z(x_i)

H_M(w)(x_i, y_ij) = Σ_k w_k · f_k(x_i, y_ij)

Z(x_i) = Σ_{y_ij′ ∈ GEN(x_i)} exp(H_M(w)(x_i, y_ij′))

Log-linear models have several attractive computational properties. One of them is that it is easy to interpret the relative probability of two different outputs: for a given input x_i, the log-odds of outcome y_ia versus y_ib is simply the difference in their scores H_M(w)(x_i, y_ia) - H_M(w)(x_i, y_ib). Another, especially important property is that for fixed f and GEN, only mild assumptions are

[1] Some authors instead require that weights be nonnegative. The formalism itself requires that the weights all be the same sign. (Otherwise, the theory would not exhibit harmonic bounding, and would lose the desirable typological restrictiveness that comes from an explicit theory of markedness.) I prefer negative weights, since this aligns intuitively with the definition of harmony: candidates with more constraint violations are less harmonic. As we shall see later, negative weights also align with the natural extension of Harmonic Grammar to a log-linear model.



needed to ensure that the probability of a dataset X = {(x_i, y_ij)}_{i=1..N} is convex in the space of all possible weight vectors (Berger, Della Pietra, & Della Pietra, 1996). In more everyday language, this means two things. First, there is a unique 'best' weight vector (w_max) which maximizes the likelihood of the observed data. Second, it is possible to find this unique best solution in a computationally efficient manner, using well-established numerical techniques like (conjugate) gradient ascent. As we will see later, log-linear models offer a natural probabilistic extension for Harmonic Grammar, which offers the exciting potential for a theory of phonological learning that is machine-implementable and testable on natural language data.
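To make these ideas concrete, here is a minimal Python sketch of a MaxEnt HG-style log-linear model, reusing the two constraints and candidates of tableau (3). The candidate set, feature encoding, learning rate, and iteration count are illustrative assumptions, not part of the original; the probability formula follows (4), and the weights are fit by plain gradient ascent on the log-likelihood of a single observed winner.

```python
import math

# A log-linear (MaxEnt HG) model over the candidates of tableau (3).
# Features are constraint violation counts; harmony = weighted sum of
# violations; Pr(y|x) = exp(harmony) / Z, as in (4).
GEN = {"[gad]": [0.0, 1.0],   # violates *[-son,+vcd]]PrWd
       "[gat]": [1.0, 0.0]}   # violates IdentVce[-son]

def harmony(f, w):
    return sum(wk * fk for wk, fk in zip(w, f))

def probs(w):
    """Return {candidate: probability} under weight vector w."""
    z = sum(math.exp(harmony(f, w)) for f in GEN.values())
    return {y: math.exp(harmony(f, w)) / z for y, f in GEN.items()}

# With the weights of tableau (3), the log-odds of the two candidates
# equals the difference of their harmonies: -1 - (-5) = 4.
p = probs([-1.0, -5.0])
print(round(math.log(p["[gat]"] / p["[gad]"]), 6))   # 4.0

# Gradient ascent on the log-likelihood of one observed winner, [gat].
# Because the objective is concave, this converges toward the unique optimum.
w = [0.0, 0.0]
for _ in range(500):
    p = probs(w)
    for k in range(len(w)):
        expected = sum(p[y] * GEN[y][k] for y in GEN)
        w[k] += 0.1 * (GEN["[gat]"][k] - expected)

print(probs(w)["[gat]"] > 0.95)   # True: the attested winner dominates
```

The gradient used here, observed feature value minus expected feature value, is the standard maximum-likelihood gradient for log-linear models; real implementations typically add a prior/regularizer and use conjugate gradient or L-BFGS rather than fixed-step ascent.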

This completes the survey of background material. The next section begins the body of the paper. In that section, I briefly survey the field known as 'formal language theory', whence modern linguistics began.

3. FORMAL LANGUAGE THEORY

Formal language theory is an axiomatic, logical/mathematical approach to language. A 'language' is defined as a set of strings, often according to some process that generates the set. Researchers who work in this area are concerned with the classification of languages according to the 'complexity' of the process required to generate the language, as well as the assumptions needed to learn languages in the various classes identified. Two of the best-known concepts to have emerged from this line of research are the Chomsky-Schützenberger hierarchy (Chomsky, 1956) and the concept of identification in the limit (Gold, 1967), both of which will be briefly covered later. Two strands of work in this line of special relevance to phonology include comparisons of the expressive power of different phonological frameworks (e.g. Buccola & Sonderegger, 2013; Graf, 2010a, 2010b; Jardine, in press) and the elaboration of finite-state techniques which 'count' constraint violations for entire classes of strings, enabling efficient machine optimization (e.g. Eisner, 2002; Hayes & Wilson, 2008; Riggle, 2009).

As this material is somewhat technical and unlikely to be known to the average linguist, I begin with an overview of basic concepts. Furthermore, because the article aims to cover other topics besides just formal language theory, the overview is necessarily somewhat superficial; it is meant to describe the intuitions, the most common notation, and the most widely cited results. Readers who are already acquainted with this material may wish to skip directly to the Framework comparison subsection. Conversely, readers who wish to learn more are advised to peruse a source devoted to formal language theory: Heinz (2011a, 2011b) for phonology specifically, Stabler (2009) for a survey of formal language theory as it relates to natural language universals, or an introductory computer science textbook for the basics.

3.1. General overview

In formal language theory, 'language' does not refer to a shared linguistic code like English or Amharic or Tashlhiyt Berber. Rather, it is a formal object with precisely specified properties, which can be studied in a mathematical, axiomatic, logical fashion. Conventionally, formal language theory assumes an alphabet Σ and defines a string as an ordered sequence of elements from Σ. For example, if Σ = {a, b} then σ = ab is a (rather short) string over Σ. The set of all possible strings over Σ is denoted Σ* (where * is called the Kleene star and has the conventionalized meaning of "0 or more repetitions"). Normally in formal language theory, a language L is defined as a subset of Σ*. Note that the elements of the alphabet do not have any intrinsic meaning, or any internal structure; they are simply algebraic elements that are distinct from one another.

For example, we could define Σ = {C, V} and L = (CV)+ (where + means "1 or more repetitions"); the resulting set of strings would look to a phonologist like a strict CV language: {CV, CVCV, CVCVCV, ...}. But the formalism does not know that C means consonant and V means vowel in the same way that human speakers do. Humans know that vowels are characterized by partially complementary articulatory and acoustic properties, as well as sequencing facts (e.g. words must begin with a C, every C must be followed by a V, V can end a word or be followed by a C). The formalism merely knows the sequencing facts, and that C is a different symbol than V. Indeed, the language L = (ab)+ over Σ = {a, b} has the same abstract structure as L = (CV)+ over Σ = {C, V}; from a formal language perspective, these are notational variants, meaning that they express the same pattern after a transparent, structure-preserving change in notation.
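As a quick sketch (the helper name is mine), membership in L = (CV)+ can be checked with a regular expression, which is itself just another notation for a regular language:

```python
import re

# Membership in the strict-CV language L = (CV)+ over the alphabet {C, V}.
CV_LANGUAGE = re.compile(r"^(?:CV)+$")

def in_L(s):
    return bool(CV_LANGUAGE.match(s))

print(in_L("CV"))      # True
print(in_L("CVCVCV"))  # True
print(in_L("VC"))      # False: words must begin with a C
print(in_L("CVC"))     # False: every C must be followed by a V
```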

Interest in formal language theory is motivated by the assumption that natural languages can be mapped onto some particular class of formal languages (or vice versa), and that the properties of the formal language class will yield clear insights into how language is learned, represented and computed in the minds of speakers. For example, it is widely believed that syntax is mildly context-sensitive, while phonology is (sub)regular (e.g. Heinz, 2011a, 2011b; Stabler, 2009). We will unpack this assertion later. In the meantime, it must be acknowledged that this idea of identifying languages with string sets, and dividing them up into classes based upon certain properties, is a large assumption, whose full implications we do not have space to assess here. I will point out one implication here, however: the literature on learning formal languages ('learnability') assumes that knowledge that is 'outside' the grammar is not brought to bear on grammar learning. For example, phonetic knowledge does not figure in the formal language learnability literature on phonology, just as semantic/pragmatic knowledge does not figure in the learnability literature on syntax. With this kind of caveat in mind, let us consider what formal language theorists mean by a language class.



3.2. The Chomsky Hierarchy

It may be helpful to begin with an example. Chomsky (1956) describes a way of generating strings that is now known as a phrase-structure grammar. Phrase-structure grammars are predicated on a system of "rewrite rules", in which one string is rewritten as another. Here is an example of an especially simple phrase-structure grammar:

(5)
rewrite to nonterminals:
S → NP VP
VP → V NP

rewrite to terminals:
V → likes
NP → the boy
NP → the dog
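As a minimal sketch (the rule encoding and function name are my own, and multi-word terminals like "the boy" are treated as single symbols), the finite language generated by grammar (5) can be enumerated mechanically:

```python
# Enumerate the language generated by grammar (5).
# Nonterminals map to lists of right-hand sides (tuples of symbols).
RULES = {
    "S": [("NP", "VP")],
    "VP": [("V", "NP")],
    "V": [("likes",)],
    "NP": [("the boy",), ("the dog",)],
}

def expand(symbols):
    """Yield every terminal string derivable from a sequence of symbols."""
    if not symbols:
        yield ()
        return
    first, rest = symbols[0], symbols[1:]
    if first not in RULES:                 # terminal: keep it, expand the rest
        for tail in expand(rest):
            yield (first,) + tail
    else:                                  # nonterminal: try each rewrite rule
        for rhs in RULES[first]:
            yield from expand(rhs + rest)

language = sorted(" ".join(s) for s in expand(("S",)))
print(language)
# ['the boy likes the boy', 'the boy likes the dog',
#  'the dog likes the boy', 'the dog likes the dog']
```

This brute-force expansion terminates only because the grammar has no recursion; for recursive grammars the language is infinite and can only be sampled, not enumerated.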

The S symbol is underlined to indicate that it is the unique start symbol. This grammar generates strings by beginning with the start symbol, and generating all possible outputs by applying any rule that can apply, at any time. For example, this grammar generates the string the boy likes the boy by rewriting 'S → NP VP', 'NP → the boy', 'VP → V NP', and 'NP → the boy' again. The sequence of rewrite operations, together with the final output of the derivation, has an elegant visual representation as a tree:

(6) [tree diagram of the derivation of the boy likes the boy; not reproduced here]

The grammar in (5) is simple enough to enumerate the entire language it generates: the boy likes the boy, the boy likes the dog, the dog likes the boy, the dog likes the dog.

More formally, a phrase-structure grammar G consists of a start symbol S, a set of terminal symbols Σ, a set of nonterminal symbols V (which must not share any symbols with Σ), and a collection of rewrite rules R, where each rewrite rule maps a sequence containing a nonterminal to a sequence of terminals and nonterminals. The language generated by such a grammar is defined as the set of strings generated by all derivations that terminate (i.e. strings containing only terminal symbols).

3.2.1. Context-sensitive languages

The set of languages that can be generated when the rewrite rules are restricted only in that they may not decrease the number of symbols is called the context-sensitive class. It is possible to define context-sensitive languages which are completely unlike natural languages, for example languages in which, if the grammar generates a sentence X = x1 x2 ... xn, it also generates the mirror-image X′ = xn xn-1 ... x1. Natural languages exhibit certain kinds of regularities, such as constituency structure, which are not expected if rewrite rules are completely unrestricted. Therefore, the class of context-sensitive languages is 'too rich'; it does not explain the structural constraints that natural languages have.

3.2.2. Context-free languages

Chomsky (1959) defined the context-free class as the set of languages which can be generated by a grammar in which the left-hand side of every rewrite rule is a single nonterminal. In other words, the rewrite rules substitute a unique nonterminal for something else, crucially, without regard to what surrounds the nonterminal. Grammar (5) is an example: every rewrite rule contains a single nonterminal on the left-hand side. Intuitively, this means that the eventual output that corresponds to a nonterminal cannot 'look outside' the nonterminal itself. In other words, context-freeness imposes a type of locality restriction on how substrings may share dependencies. This is one means of enforcing constituent structure in context-free languages.

3.2.3. Regular languages

The regular languages are those that can be generated by rewrite rules in which the left-hand side consists of a single nonterminal, and the right-hand side may contain at most one nonterminal. Moreover, the nonterminals on the right-hand side of the rewrite rules must always be final in the rewrite string (in which case the language is called right regular), or must always be initial in the rewrite string (in which case the language is called left regular). Here is an example of a right regular grammar and a string that it generates:

(7) [right regular grammar and derivation; not reproduced here]

The symbols in (7) were chosen suggestively, to illustrate to readers how formal languages might encode structures and relations used in mainstream phonology. Now we are in a position to understand the contrasting claims that "phonology is (sub)regular" while



"syntax is mildly context-sensitive". The former phrase expresses the belief that for every natural language L, there is a grammar G_L which can generate all and only the licit phonological strings of L, and G_L can be written as a regular grammar (possibly even as some proper subset of the regular languages). The latter phrase expresses the belief that this is not true for syntax, since there exist syntactic patterns which (it has been claimed) cannot be captured by regular rewrite rules, or even context-free rewrite rules. For example, Shieber (1985) gives the following Swiss German clause as an example of a cross-serial dependency:

(8) [Swiss German clause with cross-serial dependencies; not reproduced here]

Languages which admit of an arbitrary number of such dependencies are probably non-context-free, and Shieber argues that Swiss German is just such a case.

3.3. Equivalency of Finite State Automata and Regular Languages

An overview of formal language theory would not be complete without mention of finite state machines (FSMs, also called FSAs, for finite state automata). Practically speaking, an FSM is an alternative representation of a regular language. Historically, the two were conceived of separately, but the formal equivalence was noted and proved in early work. An FSM consists of a set of states, conventionally indicated with circles and an optional state label. In addition to the states, an FSM contains transitions between states, which must be labeled in most formulations. At least one state is designated as a start state, and at least one state is designated as an end state; these may be the same state. Conventionally, the start state is indicated with a thick circle, while other states are indicated with a single circle. Here are two examples:

(9) [FSM accepting exactly Hello father and Hello world; diagram not reproduced here]

(10) [FSM accepting This is the cat (that chased the rat) (that ate the cheese) ...; diagram not reproduced here]

Finite state machines can be considered as generators or parsers, but either way, they describe the same set of strings. Example (9) describes exactly two strings: Hello father, and Hello world. Example (10) describes an infinite number of strings, including This is the cat, This is the cat that chased the rat, This is the cat that chased the rat that ate the cheese, This is the cat that ate the cheese that chased the rat, etc. In generation mode, the FSM works by beginning at the start state. If it is at an end state, it may stop, having generated a complete string. If there are one or more transitions out of the current state, the machine selects one randomly and follows it, emitting a symbol along the way (the label on the transition). However, when there is only one transition out of a non-end-state, the machine must follow that unique transition. In parsing mode, the machine is said to consume symbols from an input string. It begins at the start state. When it receives the next symbol from the input string, it looks for a transition with a matching label. If there is a matching transition, it follows it and advances to the next input symbol. If there is not a matching label, the machine is said to reject the string. If the machine is in an end state when the input string is entirely consumed, the machine is said to accept the string; otherwise the machine rejects the string. In other words, the FSM accepts the string if and only if it can match every symbol in the input string with a transition and end up in an end state when the input is consumed.

An example of a string that (10) does not accept is This was the cat that chased the rat. Initially, the machine is in the start state (S1). The first input symbol, This, is presented and matches the label for the transition leading out of S1 and into S2, so This is consumed and the machine enters state S2. Now the input symbol was is considered, but the only available transition label is



is, so the machine rejects the string. In formal language theory, accepting or rejecting a string is akin to offering a grammaticality judgment. Languages are defined as sets of strings, so in principle it is possible to make a binary judgment for every string, whether it is in the language. In the case of regular languages, there is guaranteed to be a finite state machine which can not only do this in principle, but can do so straightforwardly and efficiently in a computer implementation. For this reason, finite state methods have been applied throughout computer science, for both natural language processing and various other applications (such as programming language parsing and compilers).
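As an illustrative sketch, the parsing-mode behavior described above can be implemented with a transition table. The machine below encodes example (9), with state names assumed, since the original diagrams are not reproduced here:

```python
# A finite state machine used as a parser/acceptor.
# This table encodes example (9), which accepts exactly
# "Hello father" and "Hello world"; state names are assumed.
TRANSITIONS = {
    ("S1", "Hello"): "S2",
    ("S2", "father"): "END",
    ("S2", "world"): "END",
}
START, END_STATES = "S1", {"END"}

def accepts(symbols):
    """Consume symbols one at a time; reject as soon as no transition matches."""
    state = START
    for sym in symbols:
        if (state, sym) not in TRANSITIONS:
            return False                  # no matching label: reject
        state = TRANSITIONS[(state, sym)]
    return state in END_STATES            # accept only if in an end state

print(accepts(["Hello", "father"]))   # True
print(accepts(["Hello"]))             # False: input ends outside an end state
print(accepts(["Goodbye", "world"]))  # False: no matching transition
```

A machine like (10) differs only in its table: a loop back to an earlier state (for that chased the rat, etc.) is what makes the accepted language infinite.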

It is worth noting here that there are alternative formulations of finite state machines. For example, it is possible to make the state labels correspond to symbols being generated/consumed, while the transitions are unlabelled. It is also possible to augment the transitions and/or states with extra information, beyond the symbol being consumed/generated. Indeed, there is a great deal of work on this topic, which is omitted for space reasons. A final type of finite state automaton is known as the finite state transducer (FST). An FST is just like an FSM, except that it parses an input string and generates a corresponding output string. That is, the FST behaves just like an FSM in terms of parsing, but its transition labels have been augmented; the label consists of both the input symbol to match, and an output symbol to generate upon a successful match. Here is an example which implements an intervocalic lenition rule (d → e / a__a):

(11) [finite state transducer diagram for d → e / a__a; not reproduced here]

The symbol ε is a special symbol, conventionally used to indicate an empty output. The finite state transducer in (11) will first match an /a/ and output an [a]; then it will match a /d/ but output nothing, waiting to see if it gets another /a/. If it gets another /a/, it will then output the 'delayed' [e] along with the [a]; otherwise, the FST will reject the string, indicating that the lenition rule does not apply to this input.
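Here is a minimal Python sketch of that transducer. The state names and control flow are my own assumptions; the delayed-output behavior follows the description above, and, as in the text, the machine rejects inputs where the lenition context is not met:

```python
# Sketch of the finite state transducer in (11): d -> e / a__a.
# The /d/ arc emits nothing (the empty output, epsilon); the output [e]
# is 'delayed' until the following /a/ confirms the lenition context.

def transduce(inp):
    """Return the output string, or None if the FST rejects the input."""
    out = []
    state = "start"
    for ch in inp:
        if state == "start":
            if ch != "a":
                return None
            out.append("a")
            state = "after_a"
        elif state == "after_a":
            if ch == "d":
                state = "after_ad"       # emit nothing yet (epsilon output)
            elif ch == "a":
                out.append("a")
            else:
                return None
        elif state == "after_ad":
            if ch == "a":
                out.extend(["e", "a"])   # delayed [e], then the matched [a]
                state = "after_a"
            else:
                return None              # lenition context not met: reject
    return "".join(out) if state == "after_a" else None

print(transduce("ada"))   # 'aea': the rule applies
print(transduce("adda"))  # None: the string is rejected
```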

3.4. Identification in the limit, and other notions of learnability

Gold (1967) provided the first formalization of learnability for a formal language. In Gold's conception, the input to a learner is defined as a text T: an infinite sequence (t1, t2, ...) of grammatical items from a language L, which is guaranteed to contain every item in L at least once, but not in any particular order. A grammar G(L) is defined as a finite representation that can generate all and only the strings of L. A learner A is defined as a function which accepts a finite subsequence Tn = (t1, t2, ..., tn) from a text T and returns a hypothesized grammar. For example, A(T5) is the grammar that learner A would posit after hearing the first 5 sentences of L in text T. By feeding a learner A successively longer subsequences from an input text T, we obtain a sequence of posited grammars A(T1), A(T2), ... A learner is said to identify L in the limit if for every text T, there is a finite amount of input N such that A(TN) = G(L), and A(Tm) = A(TN) for all m > N. In other words, the learner A is said to identify L in the limit if they are guaranteed to converge on a grammar that generates L in a finite amount of time.
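As a toy sketch of these definitions (the language and text below are invented), consider the class of finite languages, for which the learner that conjectures exactly the set of strings observed so far identifies any member in the limit:

```python
# A Gold-style learner for the class of finite languages: A(T_n) conjectures
# exactly the set of items seen in t_1..t_n. On any text for a finite
# language L, this learner converges after finitely many items.

def learner(text_prefix):
    """A(T_n): the grammar hypothesized after observing t_1, ..., t_n."""
    return frozenset(text_prefix)

L = {"ba", "du", "ma"}                          # target (finite) language
text = ["ba", "du", "ba", "ma", "du", "ba"]     # a text: every item of L appears

hypotheses = [learner(text[:n]) for n in range(1, len(text) + 1)]
print(hypotheses[0] == {"ba"})          # True: after one item, only 'ba' is posited
print(hypotheses[-1] == L)              # True: the learner has reached G(L)
print(hypotheses[3] == hypotheses[-1])  # True: the hypothesis is stable from t_4 on
```

The interesting cases in the literature are of course the infinite languages, where this "memorize the text" strategy fails and the structure of the hypothesis space becomes decisive.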

Prior to presenting Gold's main result, it is worth considering how this framework compares with the child's learning situation. In the framework described above, a learner has access only to positive evidence, that is, only to sentences which are actually in the language. This is now referred to as unsupervised learning, since the learner does not have access to an external metric or 'objective function' which unambiguously indicates the nature of the solution to be learned. (Gold also considered supervised learning, in the form of an informant who presents both sentences from the language and sentences not from the language, while indicating which is which.) It is generally believed that children acquire the syntax of their language from positive evidence only, and tend to ignore the negative evidence they do get (R. Brown, 1973). On the other hand, Gold's framework does not allow for 'meaning', either the semantic meaning of the words that children hear, or the 'phonetic' meaning of the phonemes that make up those words. Moreover, Gold's notion of identification in the limit does not impose any constraints upon the input text, such as some kind of 'representativeness' criterion. That is, it is a safe bet that the words 'momma' or 'mother' appear in the first million words that every English-acquiring infant hears, but there is nothing in Gold's formulation which requires input texts to exhibit this kind of real-world distributional property. Thus, Gold's assumptions line up with the child's learning situation in one way, but differ from it in other ways.

There were two key results in Gold (1967). The first is that the class of regular languages is not identifiable in the limit. The second was that regular languages (and even higher classes in the Chomsky hierarchy) are learnable in the limit from an informant, i.e., supervised learning with positive and negative examples. Since phonology is believed to be (sub)regular, syntax is believed to be at least context-free, and it is widely believed that children do eventually learn the correct grammar for their language, this result is interpreted by many theorists as proving that children possess innate constraints on the space of hypotheses that they consider as grammars for their language. This conclusion does not actually follow from Gold's theorem. In general, one can only reason 'backwards' from a model to reality when one is confident that the model is an accurate portrayal of the reality it is modeling. That is, modeling results depend on a host of assumptions; they are akin to a logical proposition of the form, 'If A and B and C and D, then X'. We cannot conclude from the truth of X that A and B and C and D are true. Moreover, we cannot conclude from the falsity of X that a particular assumption (e.g., C) is false; we can only conclude that some assumption is false. So from Gold's theorem, what we can conclude is quite limited. It could be that children are born with innate constraints on the grammars that they consider. But it also could be that the input is far more constrained than Gold assumes. It could be that children leverage multiple types of information (such as semantics and phonetics) in language acquisition, and that this provides extra constraints on the space of possible grammars. It could be that human grammars are not strictly comparable with the classes of string-generators that Gold considers. It is possible that humans do not actually converge on a single final grammar state, and actually do update their grammars on the basis of new input throughout the lifespan. These possibilities are all compatible with Gold's theorem.

Close inspection of the proof of Gold's theorem reveals that it depends crucially on the order in which examples are presented. Gold shows that it is possible to construct a text which continually forces the learner to update its hypothesis, because the class of regular languages is rich enough that one can 'maliciously' deny crucial evidence to the learner ad infinitum. Valiant (1984) introduced a probabilistic framework for studying (machine) learning known as probably approximately correct (PAC) learning. Abstracting away from the technical details, the key difference is that texts are required to be 'representative', in the sense that training examples must be drawn from a probability distribution, and the learner is counted as 'approximately correct' if its generalization error on this distribution falls below an arbitrary threshold δ (which can be made as small as desired, as long as it is still greater than 0). A language class is said to be PAC-learnable if a learner can identify an 'approximately correct' language in the hypothesis space from a finite sample of the target language. It is efficiently PAC-learnable if there is an algorithm which is guaranteed to do this while requiring a number of examples that is polynomial in the size of the language. Kearns and Valiant (1994) show that regular languages are not efficiently PAC-learnable, while Li and Vitányi (1991) show that regular languages are efficiently PAC-learnable under the additional assumption that 'simple' examples are more likely to be drawn than complex ones (as assessed by a measure called Kolmogorov complexity).
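For concreteness, the flavor of a PAC guarantee can be illustrated with a textbook sample-complexity bound for finite hypothesis classes (a standard result, not one of the specific theorems cited above): with m ≥ (1/ε)(ln|H| + ln(1/δ)) examples, a learner that outputs any hypothesis consistent with the sample is, with probability at least 1 − δ, within generalization error ε.

```python
# Standard PAC sample-complexity bound for a finite hypothesis class H.
# epsilon bounds the generalization error; delta bounds the probability
# of failure. This is a generic illustration, not a result about the
# regular languages specifically.
import math

def pac_sample_bound(h_size, epsilon, delta):
    """Number of examples sufficient for a consistent learner to be
    'probably approximately correct' over a finite class of size h_size."""
    return math.ceil((math.log(h_size) + math.log(1 / delta)) / epsilon)

# e.g., 1000 hypotheses, 5% error tolerance, 99% confidence
m = pac_sample_bound(1000, 0.05, 0.01)
```

The bound grows only logarithmically in the size of the hypothesis class, which is one way of seeing why restricting the hypothesis space (as in the subregular approaches discussed below) pays off for learnability.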

Researchers have interpreted these learnability results in many ways. Some researchers believe that the way forward is to develop increasingly fine-grained specifications of the assumptions and increasingly fine-grained classifications of the classes of languages. Some researchers believe that this kind of work simply has no bearing on the learning problem that children actually face. One generalization that many parties can agree to is that the learnability results offered so far are fragile, in the sense that seemingly small changes in the assumptions can result in large changes in the nature of the conclusion (while intuitively similar changes may also yield no meaningful difference). Thus, one way to view the careful work done by Gold, Valiant and others is as an ongoing attempt to characterize which assumptions actually matter for learnability.

There is a vast amount of work on formal language theory and learnability that cannot be surveyed here; I trust the presentation above was detailed enough to give the lay reader a sense of what formal language theory aims to accomplish. In the remainder of this section, I turn to two new lines of work in formal language theory with the potential to inform basic questions in phonology. The first concerns what might be called framework comparison – a methodology for comparing two distinct linguistic formalisms via the formal languages they generate. The second concerns the use of finite-state techniques for efficient implementations of constraint-based phonology, which I will refer to as finite-state OT.

3.5. Framework comparison

Modern linguistics has taken seriously the task of formalizing theoretical intuitions. From seminal works to the modern day, theorists are apt to propose new frameworks like SPE (Chomsky & Halle, 1968) and OT (Prince & Smolensky, 1993, 2002, 2004), or non-trivial departures from existing frameworks, such as autosegmental phonology (Goldsmith, 1976, 1990; McCarthy, 1981), Harmonic Serialism (McCarthy, 2008, 2011), and others. The formalist bent has paid off in theoretical precision: so long as the linguistic atoms and operations are specified, the reader of a paper can make new predictions from a theory with which the original writer would agree. This kind of precision enables rapid progress, and surely reduces the frequency and severity of fruitless debates in the field over misinterpretations of a theory. Still, as pointed out in Stabler (2009), the proliferation of theories does come with a cost. In many cases there are competing formalisms, but since the surface character of the explanation is so different, it is difficult to tell whether the theories actually make different predictions. Formal language theory offers a way to directly compare the expressive and restrictive powers of two different frameworks. This kind of work is already well established in the syntactic domain, as evident from the following quotation from a review by Stabler (2009):

In the work of Joshi, Vijay-Shanker, and Weir (1991), Seki et al. (1991), and Vijay-Shanker and Weir (1994) four independently proposed grammar formalisms are shown to define exactly the same languages: a kind of head-based phrase structure grammars (HGs), combinatory categorial grammars (CCGs), tree adjoining grammar (TAGs), and linear indexed grammars (LIGs). Furthermore, this class of languages is included in an infinite hierarchy of languages that are defined by multiple context free grammars (MCFG), multiple component tree adjoining grammars (MCTAGs), linear context free rewrite systems (LCFRSs), and other systems. Later, it was shown that a certain kind of "minimalist grammar" (MG), a formulation of the core mechanisms of Chomskian syntax – using the operations merge, move, and a certain strict 'shortest move condition' – define exactly the same class of languages (Michaelis, 2001; Harkema, 2001; Michaelis, 1998). These classes of languages are positioned between the languages defined by context free grammars (CFGs) and the languages defined by context sensitive grammars (CSGs) like this: [diagram (1) not reproduced]

The works cited by Stabler indicate that despite the surface differences between formal frameworks, they are sometimes "notational variants", in the deep sense that they describe the same set of languages. The details of these proofs are beyond the scope of this article, but the general nature of the argument is clear: provide a schema for translating one formalism into a particular kind of logic, which can be expressed as a formal language. Then do the same for the other formalism, and show that the two resulting formal languages are the same (or different) according to known properties of the formal languages. In the view of the researchers who do this work, formal language theory has a certain potential to tell us what our formal mechanisms are actually buying for us.

Kaplan and Kay (1994) arguably supplied the first example of this line of work in phonology. They proved that the rule-based rewrite system presented in SPE belongs to the class of regular languages, by embedding it in a class of logics known as Monadic Second Order (MSO) logics, known to be equivalent to the regular languages. More precisely, Kaplan and Kay claimed that SPE was regular even with 'cyclical rules' that are allowed to feed their own environments, as long as they are forbidden from feeding their own targets (for discussion and clarification see Kaplan & Kay, 1994). Potts and Pullum (2002) did essentially the same thing with OT, embedding a class of OT constraints into MSO. In addition, Potts and Pullum demonstrated that particular classes of OT constraints that had been proposed (e.g., ALIGN constraints) exceeded the power of regular languages, and in some cases they proposed regular alternatives.

Graf (2010a, b) compared a formalism known as Government Phonology with SPE. For readers not already familiar with Government Phonology, Graf (2010a) very readably points out the vast surface differences between it and SPE:

GP as defined in Kaye et al. (1985, 1990) and Kaye (2000) differs from SPE in that it uses privative features (features without values) rather than binary ones, assembles these features in operator-head pairs instead of feature matrices, builds its structures according to an elaborate syllable template, employs empty categories and allows all features to spread (just like tone features in autosegmental phonology). (p. 83)

Graf begins by translating each of these formalisms into a kind of propositional logic. Like Kaplan and Kay (1994), Graf embeds SPE in MSO. Graf goes on to show that if Government Phonology allows unbounded feature spreading, it can be embedded in MSO; if it allows only bounded spreading, it can be embedded in a strictly less expressive logic. In other words, Graf argues that despite the many differences between these formalisms, the property that really matters is bounded vs. unbounded spreading, since with unbounded spreading the two theories can express the same languages.

Graf goes on to address the 'empirical bite' of this theory by asking whether any natural phonological phenomena do require unbounded feature spreading. He proposes two candidates – Sanskrit nati and Cairene stress assignment. According to Graf, the nati rule causes an underlying /n/ (the TARGET) to become retroflexed if it is the first postvocalic /n/ after a continuant retroflex consonant (/ʂ/ or /r/; the TRIGGER), provided that no coronal intervenes between the trigger and target, that the nasal target is immediately followed by a nonliquid sonorant, and that there is no retroflex continuant in the string after the target. As for Cairene stress assignment, the rule is to stress the final syllable if it is superheavy, or the penult if it is heavy. If both the final syllable and the penult are light, the rule is to stress the penult or the antepenult, whichever is an even number of syllables from the closest preceding heavy syllable. This suggests the presence of an 'invisible' trochaic footing system, in which secondary stresses propagate in an iterative/bounded manner from the rightmost heavy syllable to the penult or the antepenult. Of course, as Graf points out, the ability to analyze bounded/iterative spreading of an 'invisible' feature is empirically impossible to distinguish from unbounded spreading of a visible feature. Therefore, he proposes to ban bounded/iterative spreading of invisible features for the purposes of theory comparison. This suggests that unbounded feature spreading is required – an important theoretical claim.

Two other recent studies in this line of research are Jardine (in press) and Buccola and Sonderegger (2013). Jardine (in press), following directly from Graf (2010a, b), asks whether autosegmental phonology belongs to the same class as SPE. Jardine does not give a complete answer to this question, owing to phenomena such as floating tones and dissociation rules. However, Jardine does show that MSO is expressive enough to cover the 'simple' phenomena that initially motivated autosegmental phonology, such as rightward feature-spreading. Buccola and Sonderegger address Canadian raising, an opaque phonological pattern in which allophonic variation is triggered by an underlying contrast that is erased at the surface:

(12) a. Raising before voiceless consonants
        ride /ɹaɪd/ → [ɹaɪd]    write /ɹaɪt/ → [ɹʌɪt]
     b. Foot-medial tapping
        batter /bætɚ/ → [bæɾɚ]    badder /bædɚ/ → [bæɾɚ]
     c. Interaction
        rider /ɹaɪdɚ/ → [ɹaɪɾɚ]    writer /ɹaɪtɚ/ → [ɹʌɪɾɚ]

Patterns like (12) are easily captured by a rules-based analysis in which the tapping rule applies after the Canadian raising rule. It is generally believed that such patterns cannot be accommodated by 'normal' OT, and a considerable body of work has been devoted to accommodating the theory to this type of pattern. What Buccola and Sonderegger show is that for any version of OT in which there is a single stratum (that is, one input representation and one selected output representation, with no intervening representational levels subject to competition), and in which faithfulness constraints assess the relationship only between an input segment and its output correspondent (i.e., without reference to its neighbors), the OT theory is strictly unable to account for the opacity pattern. They show this by translating OT constraints into finite-state transducers, in the manner proposed by Riggle (2004) and described in the next subsection. However, Buccola and Sonderegger also acknowledge that the highly related formalism of Harmonic Grammar (in which constraint competitions are resolved through linear combination rather than strict domination) actually can accommodate cases like Canadian raising, without allowing for so-called positional faithfulness constraints. (Finally, there is an analysis in which the [ʌɪ]~[aɪ] contrast is treated as phonemic, although many linguists disprefer this, since it requires stipulating that [ʌɪ] is only licensed before coronal obstruents and [ɾ], and crucially fails to link this fact to the nearly complementary distribution of [aɪ].)
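The rule-ordering analysis can be made concrete with a small sketch. The ASCII stand-in transcriptions ('A' for the raised nucleus, 'a' for the plain one, 'D' for the tap) and the simplified rules are my own, not Buccola and Sonderegger's actual formalization:

```python
# Ordered rewrite rules for the opaque interaction in (12), in
# simplified ASCII transcription. The counterbleeding order
# (raising before tapping) yields the opaque surface form.
import re

def raise_diphthong(s):
    # /ai/ -> [Ai] before a voiceless consonant (Canadian raising)
    return re.sub(r"ai(?=[ptk])", "Ai", s)

def tap(s):
    # /t, d/ -> [D] between vocalic segments (tapping, simplified)
    return re.sub(r"(?<=[aiuAe])[td](?=[aiuAe])", "D", s)

def derive(ur, rules):
    """Apply rewrite rules to an underlying form, in order."""
    sr = ur
    for rule in rules:
        sr = rule(sr)
    return sr

# raising applies before its conditioning /t/ is destroyed by tapping
opaque = derive("raiter", [raise_diphthong, tap])       # 'writer'
# the reverse order bleeds raising, yielding a transparent output
transparent = derive("raiter", [tap, raise_diphthong])
```

Under the counterbleeding order, 'writer' surfaces as "rAiDer" while 'rider' surfaces as "raiDer": the allophonic [A]~[a] contrast survives even though the conditioning /t/~/d/ contrast has been erased by the tap.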

In summary, formal language theory has begun to deliver on the promise of framework comparison in phonology. If one is willing to accept the premise that a language is a set of strings, this kind of technically exacting work has the capacity to reveal surprising equivalences between formalisms, and to zoom in on key properties which distinguish expressivity. Still, it must be acknowledged that existing work seems to depend sensitively on details of the analysis which are not themselves rock-solid. For example, the claim that syntax is not context-free rests on phenomena like cross-serial dependencies, and more specifically on the claim that Swiss German allows an unbounded number of them. In practice, it is likely quite rare for natural usage to yield more than one crossing dependency. While Graf's (2010a) work does not strictly depend on whether unbounded feature spreading actually occurs in phonology, it does suggest that this is a critical distinction phonologists should attend to. However, as he acknowledges, the two putative cases he gives have been contentious in the literature. Buccola and Sonderegger (2013) discuss Canadian raising and more generally counterfeeding on environment (Baković, 2011) and seem to endorse a rules-based approach, but there are alternative analyses that do not require ad hoc modifications to existing theories.

In conclusion, formal language theory offers a rigorous, string-based and axiomatic approach to phonology as a formal system. Many researchers believe that this kind of logic- or model-based approach to phonology is the key to discovering what formal properties of our frameworks make for meaningful contrasts in empirical coverage and restrictiveness. Other researchers are uneasy with this approach, a feeling which Stabler (2009) aptly summarized thusly:

But many linguists feel that even the strong claim that human languages are universally in the classes boxed in (1) is actually rather weak. They think this because, in terms of the sorts of things linguists describe in human languages, these computational claims tell us little about what human languages are like. (p. 203)

Looking back over the works reviewed above, it is clear that the formal language approach has relatively little to say about markedness, alternations, opacity, or many other core concerns of mainstream theoretical phonology.

For example, one of the most appealing aspects of constraint-based grammars is that they formally encode a substantive bias against marked structures, by directly including markedness constraints in the theory. Indeed, the success of OT in predicting the typology of syllable structures arises from the combination of an ONSET constraint (which punishes words that begin with a vowel) with a formal property called harmonic bounding (if candidate B is equal to or worse than candidate A on every dimension, B can never win). OT thereby predicts the existence of languages which require words to begin with a consonant, and of languages which allow words to begin with a vowel, while correctly predicting the absence of languages which require words to begin with a vowel. But there is nothing about "regularity" which forces this property. Rather, it is part of the substantive content of the theory. Formal language theory simply has nothing to say about it.
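Harmonic bounding is easy to verify mechanically, since a bounded candidate loses under every ranking. The toy tableau below is invented for illustration (just two constraints, ONSET and DEP, with glottal-stop epenthesis as the repair); it checks all rankings by brute force:

```python
# Brute-force check of harmonic bounding in a toy OT tableau.
# Violation counts per candidate, constraints in the order [ONSET, DEP].
from itertools import permutations

candidates = {
    "apa":    (1, 0),   # vowel-initial: violates ONSET
    "ʔapa":   (0, 1),   # one epenthetic onset: violates DEP once
    "ʔapaʔə": (0, 3),   # gratuitous epenthesis: bounded by 'ʔapa'
}

def winners_under_all_rankings(cands):
    """Count how many total rankings each candidate wins under."""
    wins = {c: 0 for c in cands}
    n = len(next(iter(cands.values())))
    for ranking in permutations(range(n)):
        # lexicographic comparison of violations in ranked order
        best = min(cands, key=lambda c: tuple(cands[c][i] for i in ranking))
        wins[best] += 1
    return wins

wins = winners_under_all_rankings(candidates)
```

The harmonically bounded candidate wins under zero rankings, while each of the other two wins under one – mirroring the typological prediction that onsetful and onsetless systems both exist but gratuitous epenthesis never surfaces.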

In any case, it is clear that the formal language theory approach to framework comparison has just begun to affect phonology. There will be more of this work in the near future, not less. The eventual theoretical impact of this line of work cannot be determined yet, and is likely to depend on the extent to which theorists engage with well-established natural language data.

3.6. Finite-state OT

Mainstream phonological theory has undergone a paradigm shift with the innovation of constraint-based theories such as Optimality Theory (McCarthy & Prince, 1994; Prince & Smolensky, 1993, 2002, 2004). It was Ellison (1994) who first proposed a finite-state implementation of OT. The essence of the proposal was to construct an individual FST for each constraint. For example, with particular representational assumptions, the constraint *CODA can be encoded with (13):

(13) [FST encoding *CODA; diagram not reproduced]

In (13), the input is coded as pairs of syllable slots and segmental material, with O indicating an onset position, N the nucleus, C a coda, ε the empty string (a syllable position that is not filled), and ≠ε any nonempty string (a syllable position that is filled). Thus, for example, when the syllabified form [al.qal.mu] is run through the FST in (13), it is represented as in (14a), and the output is as in (14b):

(14)  slot:       O   N   C   O   N   C   O   N   C
  a.  segment:    ε   a   l   q   a   l   m   u   ε
  b.  violation:  0   0   −1  0   0   −1  0   0   0
In other words, the input string is transduced to a string of constraint violations, whose sum indicates the number of constraint violations for the candidate as a negative integer. Moreover, by constructing a regular expression which generates all possible syllabifications of /alqalmu/ and performing an operation known as intersection (also called the product), one can obtain the constraint violations for every possible syllabification. The advantage of doing this with finite-state methods is that they are amenable to memory- and operation-efficient computer implementation; in fact, standard finite-state libraries have been developed for most major computer programming languages.
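The effect of intersecting a candidate generator with the *CODA transducer can be mimicked by a brute-force sketch (no real FST library is used, and the syllable template – onsets and codas of at most two consonants – is a simplifying assumption of mine):

```python
# Enumerate syllabifications of /alqalmu/ and score each with *CODA,
# mimicking intersection of a candidate generator with the constraint
# transducer in (13)-(14). Syllable = onset (0-2 C) + nucleus (V) +
# coda (0-2 C); this template is a simplification for illustration.

VOWELS = set("au")

def syllabifications(segs):
    """Yield parses of `segs` into (onset, nucleus, coda) syllables."""
    if not segs:
        yield []
        return
    for onset_len in (0, 1, 2):
        for coda_len in (0, 1, 2):
            n = onset_len + 1 + coda_len
            if len(segs) < n:
                continue
            onset, nuc, coda = segs[:onset_len], segs[onset_len], segs[onset_len + 1:n]
            if nuc not in VOWELS or any(c in VOWELS for c in onset + coda):
                continue
            for rest in syllabifications(segs[n:]):
                yield [(onset, nuc, coda)] + rest

def star_coda(parse):
    """One violation (-1) per filled coda, as in tableau (14b)."""
    return -sum(1 for (_, _, coda) in parse if coda)

for parse in syllabifications("alqalmu"):
    form = ".".join(o + n + c for (o, n, c) in parse)
    print(form, star_coda(parse))
```

Real finite-state implementations compute the same violation profiles without enumeration, which is what makes them efficient for unboundedly large candidate sets.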

Subsequent work has elaborated on this conception in various ways, although the core idea of writing constraints as FSTs has remained. For example, Karttunen (1998) proposed to compose constraints according to their ranking in a particular language with lenient composition, which efficiently removes candidates from the computation as soon as they become suboptimal, while allowing candidates to violate high-ranked constraints when there is no better competitor. Frank and Satta (1998) study the generative power of OT, and conclude that it is regular only if individual constraints can assign at most an n-ary distinction in well-formedness for some finite n. For example, the ALIGN family of constraints, which might penalize an element according to its (potentially unbounded) distance from the edge of a word, is suspect by these criteria.

Finite-state OT is particularly exciting to me because of its potential for the study of learning. The key ideas can be traced to a variety of papers. Goldwater and Johnson (2003) first noticed that Harmonic Grammars could be extended to log-linear (maximum entropy) models, simply by treating constraints as the feature functions. Berger et al. (1996) proved that under mild assumptions the likelihood function of log-linear models is convex in the weight space, which means that there is a unique maximum and it can be found efficiently using the gradient (the vector of derivatives with respect to each weight). Berger et al. further observed that the gradient can be calculated as O – E, where Oi is the observed violation count for constraint fi in the training data, and Ei is the expected violation count. Eisner (2002, et seq.) and Riggle (2004, 2009) extended the finite-state conception of constraints with a special product operation that tracks the violation vector for an entire grammar, along with the vector's (log-)probability, using an algebraic structure referred to as a 'violation semiring'. The violation semiring construction offers efficient computation of the weighted violation vectors for any regular class of strings. Therefore, it can be used to calculate the expected violation count E when that value is well-defined. Together, these results imply that a machine-implemented log-linear grammatical model can feasibly be trained. Hayes and Wilson (2008) actually implemented such a model in Java, and have been producing interesting work with it in subsequent papers. I will return to this model in the Cognitive Modeling section.
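A minimal sketch of such a log-linear model follows, assuming the sign convention of (14) (each violation contributes −1 to a constraint's feature value). The candidate set reuses syllabifications of /alqalmu/, but the violation profiles and observed counts are invented; the weight update implements the O – E gradient:

```python
# Log-linear (maxent) constraint weighting by gradient ascent.
# Features follow tableau (14): -1 per violation. Candidates, profiles,
# and observed counts are invented for illustration.
import math

# three syllabifications of /alqalmu/, features = [*CODA, ONSET]
candidates = {
    "al.qal.mu": (-2, -1),
    "a.lqa.lmu": (0, -1),
    "alq.a.lmu": (-1, -2),
}
observed = {"al.qal.mu": 1, "a.lqa.lmu": 9, "alq.a.lmu": 2}  # fake counts

def distribution(w):
    """P(candidate) proportional to exp(w . f)."""
    scores = {c: math.exp(sum(wi * fi for wi, fi in zip(w, f)))
              for c, f in candidates.items()}
    z = sum(scores.values())
    return {c: s / z for c, s in scores.items()}

def train(lr=0.2, steps=5000):
    n = sum(observed.values())
    # observed feature totals O_i
    O = [sum(observed[c] * candidates[c][i] for c in observed) for i in range(2)]
    w = [0.0, 0.0]
    for _ in range(steps):
        p = distribution(w)
        # expected feature totals E_i under the current model
        E = [n * sum(p[c] * candidates[c][i] for c in candidates) for i in range(2)]
        for i in range(2):
            w[i] += lr * (O[i] - E[i]) / n   # gradient ascent: O - E
    return w

weights = train()
```

Because the likelihood is convex, this plain gradient ascent finds the unique optimum; the fitted distribution matches the observed candidate frequencies, and both constraint weights come out positive.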

Heinz and colleagues have applied finite-state techniques to the acquisition of phonology. For example, Heinz (2007) treats the acquisition of long-distance phonological patterns (such as sibilant harmony, vowel harmony, and stress assignment) using finite-state learning. Heinz observes that all such long-distance phenomena exhibit a property he calls neighborhood distinctness, a property which enforces certain kinds of generalization, and which falls out naturally from applying a 'state merging' operation during construction of the FSM. Later work by Heinz considers learning various classes of subregular languages, often directly motivated by particular phenomena such as vowel harmony, and sometimes with proofs of identifiability in the limit (Heinz, 2010; Heinz & Koirala, 2010; Heinz & Lai, 2013).
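The flavor of state-merging generalization can be conveyed with a toy sketch. The 'neighborhood' here is simplified to (incoming symbols, outgoing symbols, finality), which differs in detail from Heinz's actual definition, and the data are invented:

```python
# Prefix-tree construction plus neighborhood-based state merging,
# loosely inspired by Heinz's (2007) state-merging learners. The
# neighborhood definition here is a simplification for illustration.

def prefix_tree(words):
    """Deterministic prefix-tree acceptor: (state, symbol) -> state."""
    trans, finals, n = {}, set(), 1
    for w in words:
        q = 0
        for a in w:
            if (q, a) not in trans:
                trans[(q, a)] = n
                n += 1
            q = trans[(q, a)]
        finals.add(q)
    return trans, finals

def merge_by_neighborhood(trans, finals):
    """Merge states sharing (incoming symbols, outgoing symbols, finality)."""
    states = {0} | set(trans.values())
    inc = {q: frozenset(a for (_, a), r in trans.items() if r == q) for q in states}
    out = {q: frozenset(a for (p, a) in trans if p == q) for q in states}
    hood = {q: (inc[q], out[q], q in finals) for q in states}
    canon, rep = {}, {}
    for q in sorted(states):
        rep[q] = canon.setdefault(hood[q], q)
    merged = {}   # merging may introduce nondeterminism: map to state sets
    for (p, a), q in trans.items():
        merged.setdefault((rep[p], a), set()).add(rep[q])
    return merged, {rep[q] for q in finals}

def accepts(ntrans, finals, w):
    """Simulate the (possibly nondeterministic) merged machine."""
    states = {0}
    for a in w:
        states = {r for q in states for r in ntrans.get((q, a), ())}
    return bool(states & finals)

t, f = prefix_tree(["ma", "mama", "mamama"])
mt, mf = merge_by_neighborhood(t, f)
```

From the three training words, the merged machine generalizes to the infinite pattern (ma)+ – exactly the kind of loop-inducing generalization that state merging enforces.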

Formal language theory has developed a large body of axiomatic results on classes of 'languages', defined as string sets generated intensionally by some finite, compact generative mechanism. Work on this topic is generally concerned with 'learnability', which is typically formulated at an abstract, algebraic level. For example, a class of languages is identifiable in the limit if an optimal learning algorithm can be guaranteed to converge upon the correct language in the class given an arbitrary sample of sufficient size. Recent work on this topic has yielded surprising insights about the expressive equivalence of formal frameworks with very different surface characteristics, and has provided powerful tools for implementing constraint-based phonology in computationally efficient finite-state machines.


4. NATURAL LANGUAGE PROCESSING AND AUTOMATIC SPEECH RECOGNITION (NLP/ASR)

Every time I fire a linguist, the performance of the speech recognizer goes up. – Fred Jelinek (in Hirschberg, 1998)

There are three kinds of lies: lies, damn lies, and statistics. – Benjamin Disraeli (Twain, 2006, p. 471)

Computational phonology is generally used to refer to basic research. However, there is extensive overlap with the fields of Natural Language Processing (NLP) and Automatic Speech Recognition (ASR), since all three deal with computations involving (representations of) speech sounds. Despite the overlap, there is a certain tension between the goals of the scientist and the goals of engineers who wish to apply the science to solve real-world problems, as revealed in Jelinek's oft-repeated quip, above. This review will not address cutting-edge work in NLP or ASR, since 'computational phonology' is not generally used to describe this kind of work. Still, current computational work owes a huge debt to NLP and ASR for the application of statistical methods to natural language. I will briefly describe two concepts which originated in NLP/ASR but which have spread to computational linguistics in general.

4.1. Zipfian distributions

It seems trivial, almost to the point of banality, to observe that some things happen more than others; for example, some words are repeated more frequently than others. However, the nature of the distribution can have powerful consequences for language acquisition and processing. It turns out that the variation in word frequencies is not completely random; it follows what has come to be known as a Zipfian distribution (Zipf, 1935, 1949). This means that a small number of items have a large frequency, and a large number of items have a small frequency. It is also sometimes informally described as 'most events are rare'.

Zipfian distributions are found at every level of linguistic structure. Baayen (2001) considers the implications of this fact for morphology. An essential point is that for any natural language text, the probability of encountering a new item never drops to zero. Therefore, a functioning model of language use must always allow for unseen items. The reader might be surprised to learn how much research does not provide for this. For example, the best-known and most successful model of word recognition, TRACE (Elman & McClelland, 1985), does not have any explicit mechanism for handling so-called Out-of-Vocabulary (OoV) items. Daland and Pierrehumbert (2011) found that even segmental diphones (a consonant or vowel, followed by another consonant or vowel) exhibit a Zipfian distribution in English. Daland and Pierrehumbert go on to show that an English listener gets enough input in one day to approximate the frequency distribution over (frequent) diphones, yet might still encounter new pairs of speech sounds throughout their life. The enormous range of variation in frequencies has important but sometimes underappreciated implications for how learners might acquire phonology.

4.2. Statistical models

As noted above, the goal of many NLP and ASR researchers is to build language technologies that work, rather than focusing on the cognitive principles that underlie language use. Of course, those two goals are not mutually exclusive, but they are not identical either. In fact, the general experience of the NLP and ASR community has been that 'dumb' models with lots of training data perform better than 'smart' models with less training data:

I don't know how many of you work in IT have had this experience, but it's really awfully depressing to spend a year working on an interesting research idea and then discover you can get a bigger BLEU score increase by, say, doubling the size of your language model training data. I see a couple of nodding heads. – Phillip Resnick (in P. Brown & Mercer, 2013)

An example of a 'dumb' model in syntax is the Markov/n-gram models that Chomsky (1956) attacked as insufficient to explain various long-distance phenomena. From the perspective of NLP/ASR researchers, linguistic theory is good to the extent that it is useful and necessary for building systems that work:

It's not that we were against the use of linguistics theory, linguistic rules, or linguistic intuition. We just didn't know any linguistics. We knew how to build statistical models from very large quantities of data, and that was pretty much the only arrow in our quiver. We took an engineering approach and were perfectly happy to do whatever it took to make progress. In fact, soon after we began to translate some sentences with our crude word-based model, we realized the need to introduce some linguistics into those models... We replaced the words with morphs, and included some naïve syntactic transformation to handle things like questions, modifier position, complex verb tenses and the like... Now this is not the type of syntactic or morphological analysis that sets the linguist's heart aflutter, but it dramatically reduces vocabulary sizes and in turn improves the quality of the EM parameter estimates... From our point of view, it was not linguistics versus statistics; we saw linguistics and statistics fitting together synergistically. – Peter Brown (in P. Brown & Mercer, 2013)

A crucial contribution of NLP/ASR has been the insight that a probabilistic approach to language modeling is necessary for developing real-world applications. Arguably, it is also inspiring a revolution in how we conceptualize language acquisition, or at least phonological acquisition.
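A minimal example of this probabilistic stance is a bigram model with add-one (Laplace) smoothing, which reserves probability mass for out-of-vocabulary items – exactly the provision that Zipfian distributions make necessary. The corpus below is invented, and real systems use far more sophisticated smoothing:

```python
# Bigram language model with add-one smoothing and an <unk> token,
# so that unseen words still receive nonzero probability.
from collections import Counter
import math

corpus = ["the cat sat", "the dog sat", "the cat ran"]
UNK, BOS, EOS = "<unk>", "<s>", "</s>"

sents = [[BOS] + s.split() + [EOS] for s in corpus]
vocab = {w for sent in sents for w in sent} | {UNK}
bigrams = Counter(p for sent in sents for p in zip(sent, sent[1:]))
contexts = Counter(w for sent in sents for w in sent[:-1])

def prob(w2, w1):
    """P(w2 | w1) with add-one smoothing; unknown words map to <unk>."""
    w1 = w1 if w1 in vocab else UNK
    w2 = w2 if w2 in vocab else UNK
    return (bigrams[(w1, w2)] + 1) / (contexts[w1] + len(vocab))

def logprob(sentence):
    """Log-probability of a whole sentence under the bigram model."""
    words = [BOS] + sentence.split() + [EOS]
    return sum(math.log(prob(b, a)) for a, b in zip(words, words[1:]))
```

A sentence containing an unseen word is assigned a lower, but crucially nonzero, probability – unlike a model with no OoV mechanism, which would assign it probability zero.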

This community has also developed machine-learning techniques that enable efficient estimation of model parameters. For example, commercial ASR technologies like Nuance Dragon rely on an acoustic model based on a 'dumb' Hidden Markov Model (HMM). An HMM is a close relative of a probabilistic FSM, with two key differences. First, the states themselves are latent variables (in the sense that the model builder posits that they exist, and they condition the model's output, but their parameters/relationships to other model components are learned during training). Second, emission of a string is not directly associated with state transitions; rather, each state is associated with a probability distribution over observations. The acoustic observations are a time series {ot}, t = 1..M, where each ot is some kind of vector, typically generated by some kind of spectral decomposition of overlapping time frames from the waveform. For example, a simple HMM is shown in (15):

(15) [toy HMM for labeling frames as 'C', 'V', or silence; diagram not reproduced]

In this case, the task is to parse an acoustic sequence by labeling each discrete time frame as belonging to one of the categories 'C', 'V', or '' (silence). The acoustic observations have four dimensions, representing the absolute magnitude of the signal in the band 2000-5000 Hz, the entropy per Hertz in this band (a measure of aperiodicity), the absolute magnitude of the signal in the band 5000-10000 Hz, and the entropy per Hertz in this band. (Note that this type of acoustic representation is quite different from what is used in commercial applications. A concrete example is given to help the reader conceptualize an HMM. The parameters in (15) were not generated from actual speech data; they are included only for concreteness.) The solid lines represent state transitions, and the numbers represent the associated probabilities. Self-transitions take up the bulk of the probability in each case, since normally the same vowel/consonant is spread over many observation frames. The 'emission probability' boxes characterize the likelihood of emitting the current observation ot given the posited state st using a multi-dimensional normal distribution. For example, the 'C' label is associated with relatively lower amplitude and less periodicity than vowels in the 2-5 kHz band, and relatively higher amplitude but still less periodicity than vowels in the 5-10 kHz band.

Much of the early, seminal work in these fields focused on developing dynamic programming techniques to train these models efficiently from limited or very large amounts of training data. Especially well-known are the Viterbi algorithm for finding the most likely sequence of states given an observation sequence (Viterbi, 1967), and the Baum-Welch (or forward-backward) algorithm for finding the unknown parameters of an HMM (Baum & Petrie, 1966; Jelinek, Bahl, & Mercer, 1975). These algorithms, or modest adaptations/generalizations of them, are still used in many NLP papers published today, as well as in the finite-state OT methods described earlier and elaborated in more detail in later sections.

The discussion of NLP/ASR is necessarily brief. As emphasized throughout this discussion, NLP has made significant contributions to what now might be called computational phonology, although in practice NLP is interested in engineering applications (such as ASR) and is normally considered a separate field. The use of statistical models has transformed cognitive modeling in phonology, to which I turn next.

5. COGNITIVE MODELING

The advent of statistical models in NLP offered up new avenues for more cognitively minded researchers. Early examples of this include the work of the Parallel Distributed Processing (PDP) group, who formulated the TRACE model of speech perception (Elman & McClelland, 1985) as well as a hotly contested single-route model of past tense formation (Rumelhart & McClelland, 1986). The 'connectionist' approach they employed, emphasizing so-called Artificial Neural Networks (ANNs), has largely been abandoned in contemporary cognitive science, for reasons too complex to discuss here. Nonetheless, the PDP group deserves credit for ushering in a new era in cognitive science by attempting to explicitly link (psycho-)linguistic theories with human behavioral data.

5.1. Phonotactic and phonological learning

The bulk of cognitive computational modeling of phonology that this author is aware of is concentrated in the areas of phonotactic and phonological learning. There are two key messages that this literature suggests to me. The first is that a constraint-based approach to
phonological learning makes sense from a range of standpoints. The second is that a stochastic approach to phonological variation makes sense from a range of standpoints.

5.1.1. Factoring the learning problem

As nicely set forth in Hayes (2004), a constraint-based approach makes sense of the empirical data we see on phonological development. More specifically, Hayes (2004) reviews a range of studies suggesting that infants acquire significant aspects of the phonotactics of their language by 9-11 months of age, while there is little or no evidence of unambiguously phonological alternations until 15-24 months of age. In a constraint-based framework, this pattern can be captured by a theory in which markedness constraints are learned early. While Hayes (2004) does not claim that infants have no command of faithfulness constraints, it seems intuitively plausible that it is easier to learn about which surface structures do and do not occur (phonotactics) than it is to also learn about non-transparent relationships between UR and SR.

5.1.2. Learnability proofs for constraints

Although it is in principle possible to reason about acquisition within SPE-style rules, the nature of the OT formalism has evidently been more amenable to formal analysis. The advent of OT was followed in short order by learning algorithms, and formal proofs of their efficacy. For example, Tesar and Smolensky (2000) summarize a large body of earlier work treating the phonological acquisition problem from the perspective of OT. One aspect of the learning problem is learning the production grammar – the component which maps underlying representations to fully specified surface representations. They give a formal proof of the 'correctness' of an algorithm they refer to as Error-Driven Constraint Demotion (EDCD), which solves this problem. That is, if the learner is given correct underlying forms and correct surface forms from an OT grammar with constraints C = {Ck}, EDCD provably converges to the correct total ordering over C which generated the learning data. Of course, the learning problem for infants is more difficult – they must infer not only the grammar, but the underlying forms and the correct surface forms (including hidden structure). Tesar and Smolensky describe the process of assigning a fully specified surface representation to an observable form as Robust Interpretive Parsing (RIP; although Boersma, 2003, points out this could simply be called perception). Tesar and Smolensky further propose Lexicon Optimization, the assumption that when multiple input forms map to the same hypothesized surface representation, the most faithful UR is selected. They show in a series of simulations that this combination (EDCD + RIP + Lexicon Optimization) correctly learns a significant majority of stress patterns in a factorial typology, although there were cases in which the learner got 'stuck', failing to converge on any correct grammar.
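To make the demotion logic concrete, here is a minimal sketch of a single EDCD update in Python. This is an illustration of the idea only, not Tesar and Smolensky's implementation; the constraint names and violation profiles are invented.

```python
def edcd_update(strata, winner_viols, loser_viols):
    """One Error-Driven Constraint Demotion step (illustrative sketch).

    strata: dict mapping constraint name -> stratum index (0 = highest).
    winner_viols / loser_viols: violation counts for the observed form
    (winner) and the grammar's wrongly predicted form (loser).
    Each constraint preferring the loser is demoted just below the
    highest-ranked constraint preferring the winner.
    """
    winner_preferring = [c for c in strata
                         if loser_viols.get(c, 0) > winner_viols.get(c, 0)]
    loser_preferring = [c for c in strata
                        if winner_viols.get(c, 0) > loser_viols.get(c, 0)]
    if not winner_preferring:
        return strata  # uninformative error; nothing to demote
    pivot = min(strata[c] for c in winner_preferring)
    for c in loser_preferring:
        if strata[c] <= pivot:
            strata[c] = pivot + 1
    return strata

# Toy example: the observed winner is faithful 'pat' (violates *CODA),
# while the grammar wrongly produced 'pa' (violates MAX).
strata = {'*CODA': 0, 'MAX': 0}
strata = edcd_update(strata,
                     winner_viols={'*CODA': 1},
                     loser_viols={'MAX': 1})
print(strata)   # {'*CODA': 1, 'MAX': 0}: MAX now outranks *CODA
```

A single informative error thus reranks the hierarchy; iterating over errors drives the strata toward a ranking consistent with the data.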

The adoption of scalar-valued weights has opened up additional analytic possibilities in constraint-based learning. For example, Potts, Pater, Jesney, Bhatt, and Becker (2010) showed that the simplex algorithm could be used to identify weights for a Harmonic Grammar. This provides a learnability proof for Harmonic Grammar that is entirely analogous to the correctness proof of Tesar and Smolensky's EDCD for OT, except that Potts et al. employ a pre-existing mathematical approach with a well-established pedigree. In a series of papers, Magri (2012, in press) analyses the phonotactic learning problem using a scalar-valued variant of OT in which the winning input-output candidate is determined by a total ordering of constraints, which is projected from underlying scalar-valued constraint weights. Magri gives bounds under which the use of scalar weights and error-driven re-weighting is sufficient to render learning algorithms tolerant to noise (i.e., occasional data points which violate the grammar). However, Magri's work generally deals with the grammar as a function, meaning that an input must be mapped to the same output on every occasion. Boersma and colleagues have shown that a stochastic approach provides graceful handling not only of noise, but of free variation. For example, Boersma applied stochastic gradient ascent to a probabilistic variant of OT (some readers may know this as the Gradual Learning Algorithm). Boersma and Hayes (2001) tested this algorithm on a number of empirical phenomena, finding that it was able not only to handle exceptional data points, but also to accurately model genuine free variation. A more comprehensive review of this topic is given in Section 4 of Coetzee and Pater (2011).

5.1.3. Stochastic phonology

The work of Pierrehumbert and colleagues reflects some of the advantages of adopting a statistical perspective in the study of phonology and phonological acquisition. For example, Pierrehumbert (1994) conducted a study of the triconsonantal clusters observed word-medially in English. As a crude first pass, she proposed that the expected occurrences of a medial cluster in monomorphemes could be determined compositionally from the probabilities of generating the cluster from a syllable coda and a following syllable onset, e.g. E[lfr] = |L| · Pr(l]σ) · Pr([σfr), where |L| is the size of the monomorphemic lexicon. Pierrehumbert found that of the 8708 potentially grammatical medial clusters that could be generated in this way, only 50 were actually attested monomorphemically. Naively, one might imagine this means there is a lot of work for linguistic theory to do, explaining why so many possible events don't occur. However, Pierrehumbert pointed out, over 8500 of these 8708 clusters had an expected frequency below 1. In other words, a proper linguistic explanation was only needed for the 150 or so triconsonantal clusters which had expected frequencies well above 1, but observed frequencies of 0. 'Chance' alone was enough to explain the absence of most unattested clusters, alleviating the burden on linguists.
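The expected-count calculation itself is simple to reproduce. The sketch below uses Pierrehumbert's lexicon size but invented coda and onset probabilities, purely for illustration:

```python
# Illustrative sketch of Pierrehumbert's (1994) expected-count logic.
# The probabilities are invented; only the lexicon size is from the paper.
lexicon_size = 8708                       # |L|, monomorphemic lexicon
p_coda = {'l': 0.01, 'n': 0.03}           # Pr(C occurs as syllable coda)
p_onset = {'fr': 0.005, 'tr': 0.01}       # Pr(CC occurs as syllable onset)

def expected_count(coda, onset):
    """E[cluster] = |L| * Pr(coda) * Pr(onset)."""
    return lexicon_size * p_coda[coda] * p_onset[onset]

# A cluster with expected frequency below 1 can be absent by chance alone:
print(expected_count('l', 'fr'))   # 8708 * 0.01 * 0.005 = 0.4354
```

On this toy estimate, observing zero tokens of [lfr] word-medially would be entirely unsurprising, which is exactly the point of the baseline.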

Coleman and Pierrehumbert (1997) further elaborated this idea by formalizing a syllable parser as a probabilistic context-free grammar (a PCFG is a CFG like in example (6), but with probabilities attached to the rewrite rules). They added prosodic features to distinguish stressed from unstressed syllables, as well as initial versus noninitial and final versus nonfinal syllables. Coleman and Pierrehumbert validated their model against human judgments from a nonce-word acceptability task. They found that the aggregate acceptability of their nonwords was almost perfectly correlated with the log-probability their model assigned, a finding that has since been replicated with numerous other probabilistic models (Daland & Pierrehumbert, 2011). In addition to comparing the model output to behavioral data, Pierrehumbert and colleagues' work represents an early instance of a critical aspect of computational cognitive modeling – specifying a meaningful baseline, against which the utility of a particular formal device can be measured.
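To illustrate how such a model assigns log-probabilities to forms, here is a toy PCFG sketch; the grammar, probabilities, and nonce forms are invented, and Coleman and Pierrehumbert's actual grammar is considerably richer:

```python
import math

# Toy PCFG in the spirit of Coleman & Pierrehumbert (1997); the grammar,
# probabilities, and nonce forms are invented for illustration.
rules = {
    'Word':  [(('Onset', 'Rhyme'), 1.0)],
    'Onset': [(('b',), 0.6), (('bl',), 0.4)],
    'Rhyme': [(('ik',), 0.7), (('ig',), 0.3)],
}

def log_prob(symbol, target):
    """Log-probability of the best derivation of `target` from `symbol`
    (symbols without rewrite rules are treated as terminals)."""
    if symbol not in rules:
        return 0.0 if symbol == target else -math.inf
    best = -math.inf
    for rhs, p in rules[symbol]:
        if len(rhs) == 1:
            lp = math.log(p) + log_prob(rhs[0], target)
        else:  # binary rule: try every split point of the target string
            lp = max((math.log(p)
                      + log_prob(rhs[0], target[:i])
                      + log_prob(rhs[1], target[i:])
                      for i in range(1, len(target))),
                     default=-math.inf)
        best = max(best, lp)
    return best

# 'blik' parses as Onset 'bl' (0.4) + Rhyme 'ik' (0.7): log(0.28)
print(log_prob('Word', 'blik'))
```

Comparing such log-probabilities across nonce words is exactly the quantity Coleman and Pierrehumbert correlated with aggregate acceptability judgments.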

Another domain in which a stochastic approach has had some success is in non-deterministic morphophonology. As mentioned above, the PDP group proposed an influential connectionist model of past tense production in English (Rumelhart & McClelland, 1986). This paper was very polarizing, since it suggested that both regular and 'irregular' morphophonology could be explained by a single, analogical system. A number of researchers, including Pinker and Marcus, proposed a dual-route model in which regular morphology is calculated by a rule-based grammar, while 'irregular' morphology is calculated by an analogical system. Owing to the heated rhetoric surrounding the issue and the number of papers written on this topic (Albright & Hayes, 2003; Daugherty & Seidenberg, 1992; Marcus, 1995; Marcus, Brinkmann, Clahsen, Wiese, & Pinker, 1996; Pinker & Prince, 1988; Plunkett & Marchman, 1991; to name just a few), it has become known as the Past Tense Wars. Although there is not space to review this fascinating literature, it is mentioned here because cognitive computational modeling played such a prominent role in the debate – formal models were implemented in computer programs, which generated data that was then compared to child and/or adult production. Partly as a result of researchers' commitments to actual implemented models, a number of important discoveries were made. These included the observation that minority inflectional patterns can be marginally productive (e.g. spling → splung), the discovery of output-oriented processes (e.g. irregulars like burnt share surface commonalities with regularly inflected items, in this case the presence of a word-final coronal stop that is not present in the verb stem), and the discovery of 'islands of reliability' not only in irregularly inflected patterns but also in regular forms (for further discussion see Albright & Hayes, 2003).

5.1.4. Constraint-based stochastic phonology

Following the research program of Hayes (2004), and the insight of Goldwater and Johnson (2003) that Harmonic Grammar can be naturally extended to the log-linear framework, Hayes and Wilson (2008) describe and implement a phonotactic learner that is supplied with a proto-lexicon (a list of word forms) and a phonological feature set. The feature set defines a set of natural classes, following mainstream phonological theory. The software then considers grammars consisting of 'n-gram constraints', e.g. the bigram constraint '*[-son,+vcd][-son,-vcd]' might prohibit a sequence of obstruents O1O2 in which O1 is voiced while O2 is voiceless. For a given set of constraints, the software uses the finite-state methods of Riggle (2004, 2009) to rapidly determine the optimal weights. The grammar is built and pruned iteratively, by selecting new constraints from a very large hypothesis space according to various search heuristics, and then retaining those constraints which pass a complexity-penalized statistical criterion for improving the model fit to the training data. Hayes and Wilson (2008) demonstrate that the grammars learned by the model exhibit various empirically desirable properties. For example, when trained on onset clusters in the English lexicon, it assigns gradient well-formedness scores to legal and unattested onset clusters, which correlate quite tightly with the aggregate judgments of schoolchildren on the same onsets as reported in the body of the paper. Further computational work studying this model's predictions for sonority sequencing is given in Daland and Pierrehumbert (2011) and Hayes (2011). Hayes and White (2013) use the model as a baseline to test for 'phonetic naturalness' effects in learning, i.e. whether two putative constraints which receive equal support from the lexicon, but differ in the extent of phonetic motivation, are treated equally by adult English speakers in rating novel forms.
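In a maxent phonotactic grammar of this kind, a form's penalty is the weighted sum of its constraint violations, and its (unnormalized) well-formedness is the exponential of the negated penalty. A minimal sketch, with made-up bigram constraints and weights rather than anything learned by Hayes and Wilson's software:

```python
import math

# Hypothetical weighted bigram constraints: each maps a dispreferred
# two-segment sequence to a nonnegative weight (invented numbers).
weights = {('s', 'd'): 3.0,   # *[s][d]: obstruent voicing disagreement
           ('t', 'l'): 1.5}   # *[t][l]: marked onset cluster

def harmony(form):
    """Sum of weighted violations over all bigrams in the form."""
    return sum(weights.get((a, b), 0.0)
               for a, b in zip(form, form[1:]))

def maxent_score(form):
    """exp(-harmony): higher means more well-formed (unnormalized)."""
    return math.exp(-harmony(form))

print(maxent_score('sta'))   # no violations -> 1.0
print(maxent_score('sda'))   # violates *[s][d] -> exp(-3.0)
```

Learning, in this setting, amounts to choosing the weights (and the constraint set itself) so that the training lexicon is assigned high probability under the normalized version of this score.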

The work of Jarosz (2006, 2013) has concentrated particularly upon the problem of learning underlying representations in stochastic constraint-based phonology. For example, Jarosz (2013) contains a careful analysis of why Robust Interpretive Parsing (Tesar & Smolensky, 2000) fails in particular cases; among other things, Jarosz concludes that encoding a probability distribution across outputs allows the learner to recover from the 'traps' that caused Tesar and Smolensky's algorithm (which was cast in categorical, non-stochastic OT) to fail.

This moment is a very exciting one in the theory of phonological acquisition. The fieldwide shift to constraint-based theories has opened up multiple new lines of attack on the acquisition problem. As Hayes (2004) pointed out, the constraint-based approach is compatible with the developmental trajectory that is actually observed, under the interpretation that children set the relative prioritization of markedness constraints rather early in development. Nearly all of the papers reviewed in this section represent significant insights into the acquisition problem that would not have been possible under SPE-style rules. While there are no doubt additional subtleties in this approach that have not been discovered, the rather rapid progress that has been made in the last 10 years on phonological acquisition in particular arguably outstrips the progress that had been made in the preceding 30-40 years during which SPE-style rules were the dominant phonological framework.

One part of what has made this progress possible is that the constraint-based approach lends itself naturally to problem representations that are similar to, and adaptable to, problem representations in machine learning. The more that linguistic problems can be represented like problems in other scientific fields, the more we linguists are able to leverage the powerful computational tools that have been developed to solve them, such as maximum entropy models (Goldwater & Johnson, 2003; Hayes & Wilson, 2008; Jarosz, 2013). At the same time, the adoption of machine learning methods promises to help focus phonological theory on the substantive components which it adds, over and above theory-innocent machine learning methods. For example, Hayes repeatedly makes the point that a kitchen-sink approach to constraints fails with toy languages and otherwise successful learning algorithms (Hayes, 2004; Hayes & White, 2013). Analogously, it is common lore amongst theoretical phonologists that a successful OT analysis can be sunk by the wrong constraint, and this holds equally true in a computational setting where some of the candidate enumeration and scoring is done rigorously by the computer.

We can expect further, rapid progress on this domain in particular; the author is in communication with a number of scholars doing new and interesting things on this topic at this very moment. In the next subsection, we turn to another area where computational modeling has had a significant impact on rapid progress, word segmentation.

5.2. Word segmentation

Word segmentation is the perceptual process whereby listeners parse the speech stream into word-sized units. As evident from listening to speech in an unfamiliar language, many words are not followed by a silence or other language-general auditory boundary cue. However, fluent and normally-hearing listeners epiphenomenally report the sensation of hearing discrete words during speech perception, except under the most challenging listening conditions. Word segmentation refers to the cognitive process or processes that have applied between the auditory level and the listener's percept of discrete words in a sequence.

One of the earliest computational approaches to word segmentation was the seminal TRACE model of speech perception, published by the already-mentioned PDP research group (Elman & McClelland, 1985). In this model, the listener is equipped with a bank of phonological features, a phoneme (or allophone) inventory, and an inventory of words. The 'auditory input' is represented as a time-varying vector of feature values. The model is a specific instance of a general class of models, quite popular in the psycholinguistic literature, known as 'spreading activation': the perceptual information from the 'bottom' (in this case, auditory featural) level percolates up to 'higher' levels (phonemes, and then words), and in some cases 'top-down' information also percolates downward. As a result, the 'output' of the model is a time-varying vector of word activations. The model is deemed to have successfully parsed a sentence if, at the end of the sentence, all of the sentence's words are highly activated, and no other words are highly activated.
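The bottom-up half of this spreading-activation scheme can be caricatured in a few lines. This is a drastic simplification (real TRACE uses time-aligned feature, phoneme, and word layers with lateral inhibition and continuous update cycles), and the phoneme activations and mini-lexicon below are invented:

```python
# Toy bottom-up step in the spirit of spreading activation (illustrative
# only; not TRACE's actual architecture or parameters).
phoneme_activation = {'d': 0.9, 'o': 0.8, 'g': 0.7, 'k': 0.1}
lexicon = {'dog': ['d', 'o', 'g'], 'dock': ['d', 'o', 'k']}

def word_activations(phoneme_activation, lexicon):
    """Each word's activation is the mean activation of its phonemes:
    perceptual support percolating up from the phoneme level."""
    return {word: sum(phoneme_activation[p] for p in phones) / len(phones)
            for word, phones in lexicon.items()}

acts = word_activations(phoneme_activation, lexicon)
# 'dog' gathers more bottom-up support than its competitor 'dock'
```

Even this caricature shows the key property: partially matching competitors ('dock') are partially activated, which is what drives TRACE's lexical competition effects.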

As Strauss, Harris, and Magnuson (2007) write:

Although TRACE was introduced 20 years ago, it continues to be vital in current work in speech perception and SWR. Despite well-known limitations (acknowledged in the original 1986 article and discussed below), TRACE is still the best available model, with the broadest and deepest coverage of the literature... TRACE has proved extremely flexible and continues to spur new research and provide a means for theory testing. For example, it has provided remarkably good fits to eye tracking data from recent studies of the time course of lexical activation and competition (Allopenna, Magnuson, & Tanenhaus, 1998; Dahan, Magnuson, & Tanenhaus, 2001), including subtle effects of subphonemic stimulus manipulations (Dahan, Magnuson, Tanenhaus, & Hogan, 2001). (p. 20)

TRACE is mentioned here because, among other things, it has been claimed to account for word segmentation. The idea is that if you recognize the words themselves, the epiphenomenal percept of word segmentation has been explained. However, as hinted above, TRACE is not necessarily a viable model of acquisition. In particular, the model can only recognize words in its lexicon; no model-internal means is available for processing novel words and adding them to the lexicon. As is generally acknowledged in the literature on the acquisition of word segmentation, this is an essential aspect of the larger problem, since experimental evidence suggests that infants are able to segment previously unknown words, and indeed, this is the majority of new words that are learned (for argumentation see Daland & Pierrehumbert, 2011, and Goldwater, Griffiths, & Johnson, 2009).

Subsequent computational research on this topic employed corpus studies in combination with connectionist modeling (Aslin, Woodward, LaMendola, & Bever, 1996; Cairns, Shillcock, Chater, & Levy, 1997; Christiansen, Allen, & Seidenberg, 1998; Elman, 1990), with the promising result that relatively simple neural network models could predict word boundaries without necessarily recognizing the neighboring words. However, owing to the well-known difficulties with interpreting the internal representations of connectionist networks, this line of research stalled shortly after the initial wave, essentially because it proved impossible to reason from the modeling results to how infants actually solved the problem. Although this is a more general issue with modeling research, it proved especially acute here because it was not even possible to determine how the models solved the problem.

Nonetheless, the finding that prelexical segmentation was computationally practical had important consequences. Experimental evidence began pouring in around this time for phonotactic segmentation, meaning segmentation based on (knowledge of) likely, unlikely but permissible, and impermissible sequences within and across prosodic units such as words (e.g. Jusczyk, Hohne, & Baumann, 1999; Jusczyk, Houston, & Newsome, 1999; Mattys & Jusczyk, 2001; Saffran, Aslin, & Newport, 1996; for a more comprehensive review see Daland & Pierrehumbert, 2011). The experimental evidence shows quite clearly that infants can and do extract new wordforms from the speech stream, even from 'difficult' positions such as phrase-medially, when there are good phonotactic cues.

This prompted a wave of computational models which attempted to solve the segmentation problem using only phonotactic knowledge. Early instances include Xanthos (2004) and Fleck (2008), who used utterance boundary information to infer lexical phonotactic properties, as originally suggested by Aslin, Woodward, LaMendola, and Bever (1996). A probabilistically rigorous bootstrapping model was formulated and tested in Daland and Pierrehumbert (2011) using diphones, sequences of two segments; in English, individual diphones typically have positional distributions that are highly skewed toward being either word-internal or word-spanning, so that this phonotactic cue is an excellent one for word segmentation. Daland and Pierrehumbert advocate for a phonotactic approach to word segmentation because phonotactic segmentation becomes efficacious as soon as infants possess the necessary phonetic experience, around 9 months, consistent with the developmental evidence. Moreover, Daland and Pierrehumbert show that the phonotactic approach is robust to conversational reduction processes that occur in English. For example, it is well known that word-final coronal stops are often deleted in conversational English; Daland and Pierrehumbert show that this kind of process causes only a modest decrement to their phonotactic model, but has rather more drastic effects on lexical models which use wordform recognition to do word segmentation (since current-generation lexical models assume the surface pronunciation of a wordform is its canonical and only form, their distributional assumptions are violated by speech containing pronunciation variation). Adriaans and Kager (2010) propose an analogous model in the framework of OT, which induces segmentation constraints from featural co-occurrence information.
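The diphone idea can be caricatured in a few lines: estimate, for each two-segment sequence, the probability that a word boundary falls inside it, and posit boundaries wherever that probability is high. The probabilities below are invented; Daland and Pierrehumbert's model estimates them from the distribution of diphones relative to phrase boundaries, within a rigorous Bayesian formulation:

```python
# p_boundary[(x, y)]: estimated probability of a word boundary inside
# the diphone x+y (toy numbers for illustration only).
p_boundary = {('t', 'd'): 0.95, ('d', 'o'): 0.05, ('o', 'g'): 0.02,
              ('a', 't'): 0.10, ('k', 'a'): 0.03}

def segment(phones, threshold=0.5):
    """Insert a boundary wherever the diphone boundary probability
    exceeds the threshold; return the resulting chunks."""
    words, current = [], [phones[0]]
    for a, b in zip(phones, phones[1:]):
        if p_boundary.get((a, b), 0.0) > threshold:
            words.append(''.join(current))
            current = []
        current.append(b)
    words.append(''.join(current))
    return words

print(segment(['k', 'a', 't', 'd', 'o', 'g']))   # ['kat', 'dog']
```

The crucial point is that no lexicon is consulted: segmentation falls out of sub-lexical sequence statistics alone.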

The phonotactic approach has not panned out as well as its proponents originally hoped, however. As the empirical coverage widened to other languages, it became clear that phonotactic approaches always worked best for English (vs. Korean: Daland & Zuraw, 2013; vs. Spanish and Arabic: Fleck, 2008; vs. Japanese: Fourtassi, Börschinger, Johnson, & Dupoux, 2013; et alia). Moreover, the assumption (based on maternal questionnaires; Dale & Fenson, 1996) that 9-month-old infants barely knew any words was contradicted by experimental evidence (e.g. Mandel, Jusczyk, & Pisoni, 1995) suggesting that infants knew some wordforms as early as 4-6 months, even if they were not necessarily aware of the corresponding meanings.

In the meantime, the phonotactic approach to modeling word segmentation was overshadowed by the Bayesian, lexical approach developed by Goldwater, Johnson, and colleagues. This approach, which had its roots in the computational models of Batchelder (2002) and Brent and Cartwright (1996), returns to the view of word segmentation as an epiphenomenon of word recognition popularized in TRACE, but departs from TRACE in various ways. Most crucially, the models included means to add previously unencountered wordforms to the lexicon ('learn new words'); also, Brent and Cartwright (1996) defined an explicit and probabilistic mathematical objective which their model was supposed to maximize. Thus, Brent and Cartwright advocated for framing the segmentation problem at Marr's computational level ('What is the mathematical characterization of the function that humans optimize?') rather than the algorithmic level ('How do humans find the optimal solution for the function that they are optimizing?'). Goldwater, Griffiths, and Johnson (2009) extend the early work of Brent and Cartwright to a more general setting, factoring the learning problem so as to enable efficient optimization, reframing the objective in a Bayesian setting (rather than the related, but more restricted, Minimal Description Length approach used by Brent and Cartwright; for discussion and analysis see Goldwater, 2006), and extending the data model so as to be both more powerful and more flexible. For example, Goldwater et al. (2009) show that better segmentation is predicted if infants attend to dependencies between words, a prediction that was retroactively confirmed by an experimental study showing that 6-month-olds use their own names to segment the following word (Bortfeld, Morgan, Golinkoff, & Rathbun, 2005).
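At Marr's computational level, such a model simply assigns a score to every candidate segmentation of the corpus, and learning is search for the best-scoring one. The sketch below captures the flavor of an MDL-style objective of the kind Brent and Cartwright used, in a drastically simplified form; the per-phoneme lexicon penalty and toy corpus are invented for illustration:

```python
import math
from collections import Counter

def mdl_score(segmentation, alpha=1.0):
    """Lower-is-better description length: the cost of encoding the
    corpus under a unigram model over the induced lexicon, plus alpha
    bits per phoneme stored in the lexicon (MDL-flavored sketch, not
    Brent & Cartwright's actual objective)."""
    counts = Counter(segmentation)
    n = len(segmentation)
    corpus_cost = -sum(c * math.log2(c / n) for c in counts.values())
    lexicon_cost = alpha * sum(len(w) for w in counts)
    return corpus_cost + lexicon_cost

# Re-using one short word twice beats storing one long unrepeated "word":
print(mdl_score(['doggy', 'doggy']))   # lexicon cost 5, corpus cost 0
print(mdl_score(['doggydoggy']))       # lexicon cost 10, corpus cost 0
```

The trade-off is the essential insight: positing reusable words shrinks the lexicon-plus-corpus description, so good segmentations emerge from compression rather than from boundary cues.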

Numerous authors have followed up on Goldwater and colleagues' seminal work. For example, Blanchard, Heinz, and Golinkoff (2010) adapt the model in Goldwater et al. (2009) by including an incremental n-gram phonotactic model, whose parameters are discovered during the learning process. They found a significant but very modest gain in performance, suggesting that much of the problem-solving power of Goldwater's model is actually located in the prior distribution (owing to reasons of space, I am unable to describe this model in more detail here; the reader is encouraged to consult the original paper for clear exposition). Pearl and colleagues have experimented with the idea that 'adding performance back in' to computational-level models can yield more psycholinguistically valid (and sometimes more accurate) performance, by incorporating limited short-term memory and/or long-term forgetting into Goldwater-like models (Pearl, Goldwater, & Steyvers, 2011; Phillips & Pearl, 2012). Lignos (2012) presents an incremental model with a slightly different objective than in Goldwater et al. (2009); an innovation is the use of a lexical filter which prevents low-confidence words from being incorporated into the model's lexicon. A variety of lexical filters have been used in previous work, including especially the constraint that a word must contain a vowel (Brent & Cartwright, 1996) or that it must have a certain minimal frequency (Daland & Pierrehumbert, 2011; see Ch. 5 of Daland, 2009, for modeling, analysis, and discussion of 'error snowballs', and Pearl et al., 2011, for argumentation that memory limitations help prevent error snowballs by forgetting early misparses).

The rapid, intense progress that has taken place in our understanding of word segmentation acquisition has been driven by an interplay and dialogue between a variety of research traditions, most notably developmental psycholinguists (Jusczyk, Mattys, Morgan, Saffran, etc.) and cognitive computational modelers (Daland, Goldwater, Johnson, Pearl, etc.), as well as researchers who are able to mix these methodologies (Aslin, Kager, Swingley, etc.). This is, in the author's humble opinion, a wonderful thing, and it is to be hoped that this example spreads to other domains.

More generally, the impact of cognitive modeling in linguistic theory, and in cognitive science more generally, cannot be overstated. The interaction between domain-general and domain-specific representations and learning algorithms is a topic of perennial interest, and computational modeling has shed, and continues to shed, new light on the complexities. Modeling has in some cases clearly ruled out hypotheses as to cognitive processes that seemed a priori quite plausible, while in other cases it has shown that two formalisms which might naively be supposed to make completely divergent predictions actually offer statistically indistinguishable explanations for the very same data set (e.g. Jarosz, 2013). Just as with formal language theory for framework comparison, it is safe to predict that there will be more of this work in the future, not less. In the next and final content section of this review I turn briefly to the topic of corpus studies.

6. CORPUS STUDIES

A corpus study is any study in which the central data consists of a 'corpus' – a body of text representing some aspect of language use – and the central analysis consists of counting elements in the text and doing statistical comparisons. Corpus studies flourished in the early days of the CHILDES database (MacWhinney, 2000), an early crowdsourced project in which (usually orthographic) child-related corpora were assembled together under the auspices of a single research group. For example, much of the early work on morphological acquisition focused on order-of-morpheme acquisition, e.g. comparing the time and frequency of -ing, -ed, and other English functional morphemes (R. Brown, 1973).

Owing to the orthographic coding of most corpora, and the phonologically non-transparent nature of English (the analysis language for most corpus-based research to date), the bulk of corpus work has focused on morphology and syntax rather than phonology. Nonetheless, there is a significant body of corpus work in phonology. I will limit the review to a few examples, as much of this work is of a similar character.

Two studies which address phenomena of interest to theoretical phonology were done by Zuraw and colleagues. Zuraw (2006) collected a corpus of Tagalog loanwords using Internet blogs. Loanwords were desirable for this study since the research question pertained to the productivity of intervocalic tapping, and the productivity of phonological patterns from high-frequency native items is confounded with lexicalization. Using this corpus, Zuraw examined how morphological status interacts with a variable phonological process; she found interesting differences between the prefix+stem and stem+enclitic cases, which there is not space to discuss here. In a conceptually similar study, Hayes, Zuraw, Siptár, and Londe (2009) investigate the vowel harmony pattern of Hungarian, which is largely categorical, but exhibits variation in particular cases (notably, when an initial back vowel is followed by one or more 'neutral' vowels, which do not undergo acoustically obvious harmony processes themselves). Hayes et al. (2009) note several 'phonetically unnatural' aspects of the harmony system which appear, at least statistically, to not be due to chance alone (for example, associations between consonant place and vowel height that condition the application rate). They go on to assess the productivity of these 'unnatural' patterns, and compare them to the productivity of 'natural' patterns with similar statistical support, finding that Hungarian native speakers exhibit knowledge of both, but apparently exhibit more productivity for the 'natural' patterns (see the paper for details). Larsen and Heinz (2012) present a corpus study, also of vowel harmony, but in Korean, and particularly in its onomatopoetic sub-lexicon. Their analysis confirms some aspects of previous accounts of this sub-lexicon, but adds nuances, e.g. that the harmony class of a vowel may depend on its position in the word. Daland (2013) presents a corpus study of adult- versus child-directed speech, in which he compares the relative frequency of different segmental classes. Daland argues against the claim that adults tailor the segmental frequencies in their child-directed speech, by showing that the moment-to-moment variation in segmental frequencies dwarfs the putative aggregate differences that had been reported in previous research.

In all of these corpus studies, researchers take an existing corpus (or create one) and then analyze it and compare the counts against the predictions of some existing phonological theory or account. Corpus studies are relatively easy to conduct and replicate once the corpus has been created, so they are an appealing methodology. However, it is the norm to supplement corpus studies with additional computational studies and/or experimentation, so as to provide converging evidence. There are many corpus studies that could have been reviewed here, and I selected a mere handful to illustrate the 'flavor' of this style of research. (A number of corpus studies were also reviewed in the cognitive modeling section earlier.) This style of research is reviewed here, at least briefly, because it is considered to be 'computational phonology' by many researchers, including specialists on language acquisition.

7. SUMMARY AND CONCLUSIONS

In this paper I have reviewed a number of subfields which I or close colleagues consider to be 'computational phonology'. I began with formal language theory as it is specifically applied to phonology. After reviewing the fundamentals, I discussed recent theoretical work of interest, including the use of equivalencies between formal languages and logics to compare formal frameworks (like SPE and OT), as well as the application of finite-state methods for efficient optimization of large-scale constraint-based models. Next, I briefly discussed the influence of NLP/ASR (Natural Language Processing and Automatic Speech Recognition) on computational phonology; although those fields are not considered computational phonology, cognitive scientists owe a huge debt to these fields for introducing and demonstrating the utility of probabilistic models for natural language problems. In the section of the paper that corresponds the most closely to my own research interests, I discussed cognitive computational modeling in general, and focused in particular on computational approaches to phonological and phonotactic acquisition, as well as the acquisition of word segmentation by infants and children. Finally, I very briefly discussed corpus studies; there is a long tradition in corpus work and it is a very general methodology, so I only gave a few examples to illustrate what it can and cannot do.

Stepping back from the many and important details that go into making any one particular study, it is time to revisit the question with which this article began: What is computational phonology? Let us begin with what is common. As claimed in the introduction, many or most of the works reviewed above draw upon a common foundation of formal language theory. For example, some of the most exciting work on cognitive modeling of phonological acquisition makes use of finite-state OT (Hayes & Wilson, 2008). Similarly, most of the work on computational phonology relies on a shared body of methodological knowledge about corpus linguistics. For example, it is nearly always necessary to preprocess a corpus for one's particular research needs. Moreover, the Natural Language Processing field has repeatedly and forcefully demonstrated the dangers of overfitting; it is now received wisdom in this field that generalization must be assessed by testing on a different data set than the model was trained on (except in certain cases of unsupervised learning). Nearly all of the work reviewed above in cognitive computational modeling deals either with a corpus of phonological data, or with behavioral results from a 'corpus' of stimuli, or both. Finally, the bulk of the studies reviewed here deal specifically with first language acquisition (although, to be fair, that partially reflects the author's interests, in addition to the inherent biases of the field). This is quite a bit of shared knowledge and methodological commonality. However, if we examine the research questions that each subfield asks, despite the fact that there is a general preoccupation with language acquisition, we still see a greater amount of variation than is, I think, common for a coherent field.

Within formal language theory, the pursuit is really not of empirical phenomena that do or don't occur in natural languages; rather, the goal is to understand and elucidate the formal relationships between various formal models of 'language'. This subfield has largely resisted probabilistic approaches, and it has concentrated on formal restrictions on the generative capacity of formal models (such as regular versus context-free), at the expense of substantive restrictions (such as the implicational universal that words with consonant onsets are strictly less marked than onsetless words). A large amount of work in this field is devoted to acquisition, but it tends to proceed in a proof-based or algorithmic manner, asking if learning algorithm A is guaranteed to learn every language L in a given class. The psychological plausibility of the learning assumptions is not always a very important concern to such researchers; rather, they are interested in the mathematical and logical relationships between A and L.

Within Natural Language Processing (NLP), the goal is to solve real-world engineering problems, often ones in which money can be made. For example, it is worthy and important to translate documents from resource-rich languages like English to high information-demand languages (such as Mandarin Chinese). It is also worthy and important to translate documents from languages whose speakers produce goods and technologies (like Mandarin Chinese) to languages whose speakers consume goods and technologies (like English). Translators work slowly and must be paid a considerable amount of money; there is a lot of money to be made and saved in developing good machine translation. In this sort of application, the formal properties of a model are of interest only insofar as they impact the ultimate performance of the system as a whole. There are of course researchers whose interests span both NLP and more basic science, including researchers who believe that understanding the way humans do language may result in better NLP, and so on. Nonetheless, the field as a whole is oriented toward developing and applying statistical models which solve 'real-world' problems. There are many and interesting problems in this field, which this author is too distant from to review in the detail they deserve here. It is quite clear, however, that the types of problems this field is concerned with are quite different than the rather abstract questions that preoccupy formal language theorists.

In cognitive computational modeling, the goal is more specifically to elucidate how humans actually do some particular linguistic task. This is related to, but crucially different from, the formal language theory approach. At the risk of oversimplifying considerably, one might put it this way: formal language theory asks, "What does model X do?"; cognitive modelers ask, "Do humans do it like model X?". That is, in this field, computational researchers are concerned much more with psychological plausibility, and less with the abstract structure of the problem space. It is no surprise, then, that computational research in this field responds and is responded to more tightly with developmental research on language acquisition.

My goal, in reviewing these different subfields, is not to claim that one is superior to another. Rather, it has been to illustrate the rich tapestry of human thought that falls under the broad umbrella term 'computational phonology'. There are strands that connect each of these subfields, even as the core concerns differ from researcher to researcher and subfield to subfield. Computational phonology is getting bigger and bigger, and fragmenting more with each passing year. But, too, we are learning more and more.

REFERENCES

Adriaans, F., & Kager, R. (2010). Adding generalization to statistical learning: The induction of phonotactics from continuous speech. Journal of Memory and Language, 62, 311-331. http://dx.doi.org/10.1016/j.jml.2009.11.007
Albright, A., & Hayes, B. (2003). Rules vs. analogy in English past tenses: A computational/experimental study. Cognition, 90, 119-161. http://dx.doi.org/10.1016/S0010-0277(03)00146-X
Aslin, R. N., Woodward, J., LaMendola, N., & Bever, T. G. (1996). Models of word segmentation in fluent maternal speech to infants. In J. L. Morgan & K. Demuth (Eds.), Signal to syntax: Bootstrapping from speech to grammar in early acquisition (pp. 117-134). Mahwah, NJ: Erlbaum.
Baayen, R. H. (2001). Word frequency distributions. Dordrecht, Netherlands: Kluwer Academic. http://dx.doi.org/10.1007/978-94-010-0844-0
Baković, E. (2007). A revised typology of opaque generalisations. Phonology, 24, 217-259. http://dx.doi.org/10.1017/S0952675707001194
Baković, E. (2011). Opacity and ordering. In J. Goldsmith, J. Riggle & A. C. L. Yu (Eds.), The handbook of phonological theory (2nd ed.). Oxford, UK: Wiley-Blackwell. http://dx.doi.org/10.1002/9781444343069.ch2
Batchelder, E. O. (2002). Bootstrapping the lexicon: A computational model of infant speech segmentation. Cognition, 83, 167-206. http://dx.doi.org/10.1016/S0010-0277(02)00002-1
Baum, L. E., & Petrie, T. (1966). Statistical inference for probabilistic functions of finite state Markov chains. The Annals of Mathematical Statistics, 37(6), 1554-1563. http://dx.doi.org/10.1214/aoms/1177699147
Berger, A., Della Pietra, S., & Della Pietra, V. (1996). A maximum entropy approach to natural language processing. Computational Linguistics, 22(1), 39-71.
Blanchard, D., Heinz, J., & Golinkoff, R. (2010). Modeling the contribution of phonotactic cues to the problem of word segmentation. Journal of Child Language, 37, 487-511. http://dx.doi.org/10.1017/S030500090999050X
Boersma, P. (2003). [Review of the book Learnability in Optimality Theory, by B. Tesar & P. Smolensky]. Phonology, 20, 436-446. http://dx.doi.org/10.1017/S0952675704230111
Boersma, P., & Hayes, B. (2001). Empirical tests of the Gradual Learning Algorithm. Linguistic Inquiry, 32, 45-86. http://dx.doi.org/10.1162/002438901554586
Bortfeld, H., Morgan, J. L., Golinkoff, R. M., & Rathbun, K. (2005). Mommy and me: Familiar names help launch babies into speech stream segmentation. Psychological Science, 16, 298-304. http://dx.doi.org/10.1111/j.0956-7976.2005.01531.x
Brent, M. R., & Cartwright, T. A. (1996). Distributional regularity and phonotactic constraints are useful for segmentation. Cognition, 61, 93-125. http://dx.doi.org/10.1016/S0010-0277(96)00719-6
Brown, P., & Mercer, R. (2013). Twenty years of Bitext [Transcription and slides]. Invited talk, EMNLP Workshop Twenty Years of Bitext, Seattle, WA. Retrieved from: http://cs.jhu.edu/~post/bitext/
Brown, R. (1973). A first language: The early stages. Cambridge, MA: Harvard University Press.
Buccola, B., & Sonderegger, M. (2013). On the expressivity of Optimality Theory versus rules: An application to opaque patterns. Refereed presentation at the meeting Phonology 2013, UMass Amherst, 9 November 2013.
Cairns, P., Shillcock, R. C., Chater, N., & Levy, J. (1997). Bootstrapping word boundaries: A bottom-up corpus-based approach to speech segmentation. Cognitive Psychology, 33, 111-153. http://dx.doi.org/10.1006/cogp.1997.0649
Chomsky, N. (1956). Three models for the description of language. IRE Transactions on Information Theory, 2, 113-124. http://dx.doi.org/10.1109/TIT.1956.1056813
Chomsky, N. (1959). [Review of the book Verbal Behavior, by B. F. Skinner]. Language, 35(1), 26-58. http://dx.doi.org/10.2307/411334
Chomsky, N., & Halle, M. (1968). The sound pattern of English. New York: Harper & Row.
Christiansen, M. H., Allen, J., & Seidenberg, M. S. (1998). Learning to segment speech using multiple cues: A connectionist model. Language and Cognitive Processes, 13(2-3), 221-268. http://dx.doi.org/10.1080/016909698386528
Coetzee, A. W., & Pater, J. (2011). The place of variation in phonological theory. In J. Goldsmith, J. Riggle & A. C. L. Yu (Eds.), The handbook of phonological theory (2nd ed.). Oxford, UK: Wiley-Blackwell. http://dx.doi.org/10.1002/9781444343069.ch13
Coleman, J., & Pierrehumbert, J. (1997). Stochastic phonological grammars and acceptability. In 3rd Meeting of the ACL Special Interest Group in Computational Phonology: Proceedings of the Workshop, 12 July 1997 (pp. 49-56). Somerset, NJ: Association for Computational Linguistics.
Daland, R. (2009). Word segmentation, word recognition, and word learning: A computational model of first language acquisition (unpublished doctoral dissertation). Northwestern University, IL. Retrieved from: http://www.linguistics.northwestern.edu/docs/dissertations/dalandDissertation.pdf
Daland, R. (2013). Variation in child-directed speech: A case study of manner class frequencies. Journal of Child Language, 40(5), 1091-1122. http://dx.doi.org/10.1017/S0305000912000372


Daland, R., & Pierrehumbert, J. B. (2011). Learning diphone-based segmentation. Cognitive Science, 35(1), 119-155. http://dx.doi.org/10.1111/j.1551-6709.2010.01160.x
Daland, R., & Zuraw, K. (2013). Does Korean defeat phonotactic word segmentation? Short paper presented at the 51st Annual Meeting of the Association for Computational Linguistics, Sofia, Bulgaria, August 4-9, 2013.
Dale, P. S., & Fenson, L. (1996). Lexical development norms for young children. Behavior Research Methods, Instruments, & Computers, 28, 125-127. http://dx.doi.org/10.3758/BF03203646
Daugherty, K., & Seidenberg, M. S. (1992). Rules or connections? The past tense revisited. In Proceedings of the Fourteenth Annual Conference of the Cognitive Science Society (pp. 259-264). Hillsdale, NJ: Erlbaum.
Eisner, J. (2002). Parameter estimation for probabilistic finite-state transducers. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (pp. 1-8). East Stroudsburg, PA: Association for Computational Linguistics. http://dx.doi.org/10.3115/1073083.1073085
Ellison, M. T. (1994). Phonological derivation in optimality theory. In Proceedings of the 15th International Conference on Computational Linguistics (COLING) (Vol. 2, pp. 1007-1013). Kyoto, Japan: Association for Computational Linguistics. http://dx.doi.org/10.3115/991250.991312
Elman, J. L. (1990). Finding structure in time. Cognitive Science, 14, 179-211. http://dx.doi.org/10.1207/s15516709cog1402_1
Elman, J. L., & McClelland, J. L. (1985). An architecture for parallel processing in speech recognition: The TRACE model. In M. R. Schroeder (Ed.), Speech recognition (pp. 6-35). Göttingen, Germany: Biblioteca Phonetica.
Fleck, M. M. (2008). Lexicalized phonotactic word segmentation. In Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (pp. 130-138). Madison, WI: Omnipress.
Fourtassi, A., Börschinger, B., Johnson, M., & Dupoux, E. (2013). Why is English so easy to segment? In Proceedings of the 4th Workshop on Cognitive Modeling and Computational Linguistics (pp. 1-10). Sofia, Bulgaria, August 8, 2013.
Frank, R., & Satta, G. (1998). Optimality theory and the computational complexity of constraint violability. Computational Linguistics, 24, 307-315.
Gold, E. M. (1967). Language identification in the limit. Information and Control, 10(5), 447-474. http://dx.doi.org/10.1016/S0019-9958(67)91165-5
Goldsmith, J. (1976). Autosegmental phonology (doctoral dissertation). MIT, MA.
Goldsmith, J. A. (1990). Autosegmental and metrical phonology. Oxford, UK: Basil Blackwell.
Goldwater, S. (2006). Nonparametric Bayesian models of lexical acquisition (unpublished doctoral dissertation). Brown University, RI. Retrieved from: http://homepages.inf.ed.ac.uk/sgwater/papers/thesis_1spc.pdf
Goldwater, S., & Johnson, M. (2003). Learning OT constraint rankings using a maximum entropy model. In Proceedings of the Workshop on Variation within Optimality Theory (pp. 113-122). Stockholm University, Sweden.
Goldwater, S., Griffiths, T. L., & Johnson, M. (2009). A Bayesian framework for word segmentation: Exploring the effects of context. Cognition, 112(1), 21-54. http://dx.doi.org/10.1016/j.cognition.2009.03.008
Graf, T. (2010a). Comparing incomparable frameworks: A model theoretic approach to phonology. University of Pennsylvania Working Papers in Linguistics, 16(1), art. 10. Retrieved from: http://repository.upenn.edu/pwpl/vol16/iss1/10
Graf, T. (2010b). Formal parameters of phonology: From Government Phonology to SPE. In T. Icard & R. Muskens (Eds.), Interfaces: Explorations in logic, language and computation, Lecture Notes in Artificial Intelligence 6211 (pp. 72-86). Berlin, Germany: Springer.
Hayes, B. (2004). Phonological acquisition in Optimality Theory: The early stages. In R. Kager, J. Pater & W. Zonneveld (Eds.), Fixing priorities: Constraints in phonological acquisition (pp. 158-203). Cambridge, UK: Cambridge University Press.
Hayes, B. (2011). Interpreting sonority-projection experiments: The role of phonotactic modeling. In Proceedings of the 17th International Congress of Phonetic Sciences (ICPhS 11, Hong Kong) (pp. 835-838). Hong Kong, PRC.
Hayes, B., Zuraw, K., Siptár, P., & Londe, Z. C. (2009). Natural and unnatural constraints in Hungarian vowel harmony. Language, 85(4), 822-863. http://dx.doi.org/10.1353/lan.0.0169
Hayes, B., & White, J. (2013). Phonological naturalness and phonotactic learning. Linguistic Inquiry, 44(1), 45-75. http://dx.doi.org/10.1162/LING_a_00119
Hayes, B., & Wilson, C. (2008). A maximum entropy model of phonotactics and phonotactic learning. Linguistic Inquiry, 39(3), 379-440. http://dx.doi.org/10.1162/ling.2008.39.3.379
Heinz, J. (2007). The inductive learning of phonotactic patterns (doctoral dissertation). University of California, Los Angeles.
Heinz, J. (2010). String extension learning. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (pp. 897-906). Uppsala, Sweden.
Heinz, J. (2011a). Computational phonology - Part I: Foundations. Language and Linguistics Compass, 5(4), 140-152. http://dx.doi.org/10.1111/j.1749-818X.2011.00269.x
Heinz, J. (2011b). Computational phonology - Part II: Grammars, learning, and the future. Language and Linguistics Compass, 5(4), 153-168. http://dx.doi.org/10.1111/j.1749-818X.2011.00268.x
Heinz, J., & Koirala, C. (2010). Maximum likelihood estimation of feature-based distributions. In Proceedings of the 11th Meeting of the ACL Special Interest Group on Computational Morphology and Phonology (pp. 28-37). Uppsala, Sweden.
Heinz, J., & Lai, R. (2013). Vowel harmony and subsequentiality. In A. Kornai & M. Kuhlmann (Eds.), Proceedings of the 13th Meeting on Mathematics of Language. Sofia, Bulgaria.
Hirschberg, J. (1998). 'Every time I fire a linguist, my performance goes up', and other myths of the statistical natural language processing revolution. Invited talk, 15th National Conference on Artificial Intelligence, Madison, WI.
Jardine, A. (in press). Logic and the generative power of autosegmental phonology. In Supplemental Proceedings of Phonology 2013. Retrieved from: https://sites.google.com/site/adamajardine/research-interests
Jarosz, G. (2006). Rich lexicons and restrictive grammars - Maximum likelihood learning in Optimality Theory (doctoral dissertation). Johns Hopkins University. Retrieved from Rutgers Optimality Archive No. 884.
Jarosz, G. (2013). Learning with hidden structure in Optimality Theory and Harmonic Grammar: Beyond Robust Interpretive Parsing. Phonology, 30, 27-71. http://dx.doi.org/10.1017/S0952675713000031
Jelinek, F., Bahl, L., & Mercer, R. (1975). Design of a linguistic statistical decoder for the recognition of continuous speech. IEEE Transactions on Information Theory, 21(3), 250-256. http://dx.doi.org/10.1109/TIT.1975.1055384
Jusczyk, P. W., Hohne, E. A., & Bauman, A. (1999). Infants' sensitivity to allophonic cues for word segmentation. Perception and Psychophysics, 61, 1465-1476. http://dx.doi.org/10.3758/BF03213111
Jusczyk, P. W., Houston, D. M., & Newsome, M. (1999). The beginnings of word segmentation in English-learning infants. Cognitive Psychology, 39(3-4), 159-207. http://dx.doi.org/10.1006/cogp.1999.0716
Kaplan, R. M., & Kay, M. (1994). Regular models of phonological rule systems. Computational Linguistics, 20(3), 331-379.
Karttunen, L. (1998). The proper treatment of optimality theory in computational phonology. In Proceedings of the International Workshop on Finite State Methods in Natural Language Processing (pp. 1-12). Ankara, Turkey: Association for Computational Linguistics.
Kearns, M., & Valiant, L. (1994). Cryptographic limitations on learning boolean formulae and finite automata. Journal of the ACM, 41(1), 67-95. http://dx.doi.org/10.1145/174644.174647
Larsen, D., & Heinz, J. (2012). Neutral vowels in sound-symbolic vowel harmony in Korean. Phonology, 29, 433-464. http://dx.doi.org/10.1017/S095267571200022X


Legendre, G., Miyata, Y., & Smolensky, P. (1990). Harmonic Grammar: A formal multi-level connectionist theory of linguistic well-formedness: An application. In Proceedings of the Twelfth Annual Conference of the Cognitive Science Society (pp. 884-891). Cambridge, MA: Lawrence Erlbaum.
Li, M., & Vitányi, P. M. B. (1991). Learning simple concepts under simple distributions. SIAM Journal on Computing, 20, 911-935. http://dx.doi.org/10.1137/0220056
Lignos, C. (2012). Infant word segmentation: An incremental, integrated model. In Proceedings of the West Coast Conference on Formal Linguistics, 30, April 13-15, 2012.
MacWhinney, B. (2000). The CHILDES Project: Tools for analyzing talk. Volume I: Transcription format and programs. Volume II: The database. Mahwah, NJ: Lawrence Erlbaum.
Magri, G. (2012). Constraint promotion: Not only convergent but also efficient. In CLS 48: Proceedings of the 48th Annual Conference of the Chicago Linguistic Society. Chicago, IL.
Magri, G. (in press). Error-driven and batch models of the acquisition of phonotactics: David defeats Goliath. In Phonology 2013: Proceedings of the 2013 Phonology Conference, November 8-10, 2013, Amherst, MA.
Mandel, D. R., Jusczyk, P. W., & Pisoni, D. B. (1995). Infants' recognition of the sound patterns of their own names. Psychological Science, 6, 315-318. http://dx.doi.org/10.1111/j.1467-9280.1995.tb00517.x
Marcus, G. F. (1995). The acquisition of the English past tense in children and multi-layered connectionist networks. Cognition, 56, 271-279. http://dx.doi.org/10.1016/0010-0277(94)00656-6
Marcus, G. F., Brinkmann, U., Clahsen, H., Wiese, R., & Pinker, S. (1996). German inflection: The exception that proves the rule. Cognitive Psychology, 29, 189-256. http://dx.doi.org/10.1006/cogp.1995.1015
Mattys, S. L., & Jusczyk, P. W. (2001). Phonotactic cues for segmentation of fluent speech by infants. Cognition, 78, 91-121. http://dx.doi.org/10.1016/S0010-0277(00)00109-8
McCarthy, J. J. (1981). A prosodic theory of non-concatenative morphology. Linguistic Inquiry, 12(3), 373-418.
McCarthy, J. J. (2008). The gradual path to cluster simplification. Phonology, 25, 271-319. http://dx.doi.org/10.1017/S0952675708001486
McCarthy, J. J. (2011). Autosegmental spreading in Optimality Theory. In J. Goldsmith, A. E. Hume & L. Wetzels (Eds.), Tones and features (pp. 195-222). Berlin, Germany: Mouton de Gruyter. http://dx.doi.org/10.1515/9783110246223.195
McCarthy, J. J., & Prince, A. (1994). The emergence of the unmarked: Optimality in prosodic morphology. In Proceedings of the North East Linguistics Society 24. Amherst, MA.
Pearl, L., Goldwater, S., & Steyvers, M. (2011). Online learning mechanisms for Bayesian models of word segmentation. Research on Language and Computation, 8(2-3), 107-132. http://dx.doi.org/10.1007/s11168-011-9074-5
Phillips, L., & Pearl, L. (2012). Syllable-based Bayesian inference: A (more) plausible model of word segmentation. Workshop on Psychocomputational Models of Human Language Acquisition. Portland, OR.
Pierrehumbert, J. (1994). Syllable structure and word structure: A study of triconsonantal clusters in English. In P. Keating (Ed.), Papers in laboratory phonology III: Phonological structure and phonetic form (pp. 168-188). Cambridge, UK: Cambridge University Press.
Pinker, S., & Prince, A. (1988). On language and connectionism: Analysis of a Parallel Distributed Processing model of language acquisition. Cognition, 28(1-2), 73-193. http://dx.doi.org/10.1016/0010-0277(88)90032-7
Plunkett, K., & Marchman, V. (1991). U-shaped learning and frequency effects in a multi-layered perceptron: Implications for child language acquisition. Cognition, 38, 43-102. http://dx.doi.org/10.1016/0010-0277(91)90022-V
Potts, C., Pater, J., Jesney, K., Bhatt, R., & Becker, M. (2010). Harmonic Grammar with linear programming: From linear systems to linguistic typology. Phonology, 27, 77-117. http://dx.doi.org/10.1017/S0952675710000047
Potts, C., & Pullum, G. K. (2002). Model theory and the content of OT constraints. Phonology, 19, 361-393.
Prince, A., & Smolensky, P. (1993). Optimality Theory: Constraint interaction in generative grammar. Technical Report, RuCCS, Rutgers University, New Brunswick, NJ. Published in 2004 by Blackwell.
Prince, A., & Smolensky, P. (2002). Optimality Theory: Constraint interaction in generative grammar. Retrieved from: roa.rutgers.edu/files/537-0802/537-0802-PRINCE-0-0.PDF
Prince, A., & Smolensky, P. (2004). Optimality Theory: Constraint interaction in generative grammar. Oxford: Blackwell.
Riggle, J. (2004). Generation, recognition, and learning in finite state Optimality Theory (doctoral dissertation). UCLA, CA.
Riggle, J. (2009). Violation semirings in Optimality Theory. Research on Language and Computation, 7(1), 1-12. http://dx.doi.org/10.1007/s11168-009-9063-0
Riggle, J., & Wilson, C. (2005). Local optionality. In Proceedings of NELS 35. Amherst, MA: GLSA.
Rumelhart, D. E., & McClelland, J. L. (1986). On learning the past tenses of English verbs: Implicit rules or parallel distributed processing? In J. L. McClelland, D. E. Rumelhart, & the PDP Research Group (Eds.), Parallel distributed processing: Explorations in the microstructure of cognition (Vol. 2, pp. 216-271). Cambridge, MA: MIT Press.
Saffran, J. R., Aslin, R. N., & Newport, E. L. (1996). Statistical learning by 8-month-old infants. Science, 274, 1926-1928. http://dx.doi.org/10.1126/science.274.5294.1926
Shieber, S. M. (1985). Evidence against the context-freeness of natural language. Linguistics and Philosophy, 8(3), 333-343. http://dx.doi.org/10.1007/BF00630917
Smolensky, P., & Legendre, G. (2006). The harmonic mind: From neural computation to Optimality-Theoretic grammar (Vol. 1: Cognitive architecture, pp. xvii-563; Vol. 2: Linguistic and philosophical implications, pp. xvii-611). Cambridge, MA: MIT Press.
Stabler, E. (2009). Computational models of language universals. In M. H. Christiansen, C. Collins, & S. Edelman (Eds.), Language universals (Rev. ed., pp. 200-223). Oxford, UK: Oxford University Press. http://dx.doi.org/10.1093/acprof:oso/9780195305432.003.0010
Strauss, T. J., Harris, H. D., & Magnuson, J. S. (2007). jTRACE: A reimplementation and extension of the TRACE model of speech perception and spoken word recognition. Behavior Research Methods, 39(1), 19-30. http://dx.doi.org/10.3758/BF03192840
Tesar, B., & Smolensky, P. (2000). Learnability in Optimality Theory. Cambridge, MA: MIT Press.
Twain, M. (2006). Chapters from my autobiography - XX. North American Review DCXVIII. Project Gutenberg. Retrieved from: http://www.gutenberg.org/files/19987/19987-h/19987-h.htm (Original work published 1906).
Valiant, L. G. (1984). A theory of the learnable. Communications of the ACM, 27, 1134-1142. http://dx.doi.org/10.1145/1968.1972
Viterbi, A. J. (1967). Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Transactions on Information Theory, 13(2), 260-269. http://dx.doi.org/10.1109/TIT.1967.1054010
Xanthos, A. (2004). Combining utterance-boundary and predictability approaches to speech segmentation. In W. G. Sakas (Ed.), Proceedings of the First Workshop on Psycho-computational Models of Language Acquisition at COLING 2004 (pp. 93-100). Geneva, Switzerland.
Zipf, G. K. (1935). The psychobiology of language. Boston, MA: Houghton-Mifflin.
Zipf, G. K. (1949). Human behavior and the principle of least effort. Cambridge, MA: Addison-Wesley Press.
Zuraw, K. (2006). Using the web as a phonological corpus: A case study from Tagalog. In EACL-2006: Proceedings of the 11th Conference of the European Chapter of the Association for Computational Linguistics / Proceedings of the 2nd International Workshop on Web As Corpus (pp. 59-66). http://dx.doi.org/10.3115/1628297.1628306
