What is computational phonology
Loquens 1(1), January 2014, e004
eISSN 2386-2637
doi: http://dx.doi.org/10.3989/loquens.2014.004
What is computational phonology?

Robert Daland
University of California, Los Angeles. e-mail: rdaland@humnet.ucla.edu

Citation / Cómo citar este artículo: Daland, R. (2014). What is computational phonology? Loquens, 1(1), e004. doi: http://dx.doi.org/10.3989/loquens.2014.004

ABSTRACT: Computational phonology is not one thing. Rather, it is an umbrella term which may refer to work on formal language theory, computer-implemented models of cognitive processes, and corpus methods derived from the literature on natural language processing (NLP). This article gives an overview of these distinct areas, identifying commonalities and differences in the goals of each area, as well as highlighting recent results of interest. The overview is necessarily brief and subjective. Broadly speaking, it is argued that learning is a pervasive theme in these areas, but the core questions and concerns vary too much to define a coherent field. Computational phonologists are more united by a shared body of formal knowledge than they are by a shared sense of what the important questions are.
KEYWORDS: computational phonology
RESUMEN: ¿Qué es la fonología computacional?.- La fonología computacional no representa un campo unitario, sino que es un término genérico que puede hacer referencia a obras sobre teorías de lenguajes formales; a modelos de procesos cognitivos implementados por ordenador; y a métodos de trabajo con corpus, derivados de la bibliografía sobre procesamiento del lenguaje natural (PLN). Este artículo ofrece una visión de conjunto de estas distintas áreas, identifica los puntos comunes y las diferencias en los objetivos de cada una, y pone de relieve algunos de los últimos resultados más relevantes. Esta visión de conjunto es necesariamente breve y subjetiva. En términos generales, se argumenta que el aprendizaje es un tema recurrente en estos ámbitos, pero las preguntas y los problemas centrales varían demasiado como para definir un área de estudio unitaria y coherente. Los fonólogos computacionales están unidos por un cúmulo común de conocimientos formales más que por un parecer compartido acerca de cuáles son las preguntas importantes.

PALABRAS CLAVE: fonología computacional
1. INTRODUCTION
What does it mean to be a scientific field of inquiry? Proceeding inductively, we might observe that well-established fields tend to exhibit the following properties:

(i) a core set of observable phenomena, which the field seeks to explain
(ii) a core set of research questions the field asks about those phenomena
(iii) a shared set of background knowledge that is in part specific to the field
(iv) a shared 'toolbox' of research methods used for gaining new knowledge
These properties exhibit a granularity of scale; within one field there may be sub-fields which ask more specific questions, assume greater amounts of shared knowledge than the field as a whole, and utilize a restricted set of methodologies. For example, linguistics is a rather wide field of inquiry; within this field there is a sub-field devoted to the study of syntax specifically. Because science is a dynamic and evolving enterprise, scientific fields exhibit the same kind of taxonomic structure as other evolutionary systems, such as species and languages – subfields may have sub-subfields of their own, and particular sub-fields may have more in common with a different field than the 'parent' field. For example, psycholinguistics can be considered a sub-field of linguistics, but the research methods and the specialized knowledge specific to that field are arguably closer to the field of psychology. But, then, which of properties (i)-(iv) are essential to a field? The answer to this question will inform our answer to the question, "What is computational phonology?"

Copyright: © 2014 CSIC. This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial (by-nc) Spain 3.0 License.
Some perspective on this question can be gained by considering the historical development of a field. Fields can occasionally form, or shift dramatically in character, with the emergence of a charismatic and persuasive thinker or a seminal publication. This was arguably the case in linguistics with Chomsky's review of B. F. Skinner's Verbal Behavior (1959) and other related publications (Chomsky, 1956). Fields may also stratify to the extent that it is worth considering them as two different fields. For instance, most of the scientific fields we know today have their roots in philosophy. Fields may coalesce by the identification of similar strands of thought in fields that were formerly separate; such is arguably the case with the field of cognitive science, or more specifically with psycholinguistics. In the case of newer, less-established fields, especially those which coalesced from multiple other fields, there is a much smaller core of shared, field-specific knowledge. Arguably, the codification of a shared body of field-specific knowledge is the consequence of establishing academic programs/departments for a given field, rather than a cause or necessary property of fieldhood. As for the research methods of a field, they are ever-changing. Methodology might be used to characterize a field at a particular historical moment, but most fields persist through several methodological turnovers. For example, the increase in computer resources over the last 50 years has revolutionized linguistic methodology, but the questions we ask now are arguably the same ones that Chomsky laid out in the 1950s: How do children learn language? Out of the space of logically imaginable linguistic patterns, why do many systematically not occur? To what extent can the occurrence/non-occurrence of linguistic patterns be explained by functional aspects of communication, and to what extent is it determined by properties of the cognitive system(s) that process and represent language?
There is room for legitimate disagreement on this point, but for many researchers, a field revolves around a set of empirical phenomena, and a key set of research questions the field seeks to answer about those phenomena. By this standard, I will suggest that computational phonology is not really a single field. Rather, the phrase 'computational phonology' is used as an umbrella term for research which generally presupposes a shared, specific set of background knowledge and uses a common set of research methodologies, but often with radically diverging questions. I will make this case by surveying recent progress in four different subfields, all of which I or colleagues have identified as 'computational phonology'. We shall see there is a general emphasis on learning, and that all or most practitioners have a common background in corpus and finite-state methods, but the sub-fields themselves differ quite radically in what the research questions are.
Prior to the survey, it is necessary to voice a caveat. The view of the field that I present is my own. I make no claim that the survey below is comprehensive, or unbiased; in fact, I avow that this review is strongly biased toward my own research interests, the readings I have done, and by informal conversations I have had with colleagues. I have surely omitted mention of a great deal of important and interesting work, either from time/space constraints or because I have not yet had the honor of being exposed to it. Still, as a multidisciplinary researcher I hope that all readers will find something new within these pages, and I have aimed for the fairest, most scrupulous and scholarly tone for the works I was able to review here. Prior to the body of the paper I briefly review background material.

2. BACKGROUND

2.1. What is phonology?
I assume that the reader of this article has some background in formal linguistics, perhaps equivalent to a one-year undergraduate sequence covering phonetics, phonology, and other core areas. For example, I assume the reader is familiar with the concept of underlying representation (UR; also called lexical representation, or input) versus surface representation (SR; also called output), and the convention that URs are indicated with slashes / / while SRs are indicated with brackets [ ]; I assume knowledge of the terms 'segment', 'syllable', 'onset', 'coda', et cetera, and the International Phonetic Alphabet. Still, as I anticipate some readers will come from a computational background where the study of speech sounds is not emphasized, I will briefly describe here core concepts which figure prominently in the paper.

2.1.1. Markedness
Cross-linguistically some structural configurations appear to be dispreferred. For example, French has a complex process known as schwa deletion, in which the weak schwa vowel tends to delete, except if the deletion would create a triconsonantal cluster (Riggle & Wilson, 2005). Moreover, triconsonantal clusters do not appear in many languages, and tend to have a restricted distribution in languages that allow them at all. It appears as if French and many other phonologies are specifically avoiding this 'marked' configuration. The proper treatment of markedness is a core concern in phonological theory. What structural configurations are marked? How is markedness represented in the minds of speakers? How is markedness acquired – is it learned from phonetics, projected from the lexicon, or something else?
2.1.2. Alternations
Alternation is the name given to cases in which the same phonological entity appears with two or more forms. For example, compare my casual pronunciations of the English words pentagon and pentagonal:
(1)
pentagon     [ˈpʰɛ̃ɾ̃əˌɡɑn]
pentagonal   [ˌpʰɛnˈtʰæɡən-əl]
In (1), segment-to-segment identity is indicated by vertical alignment. Non-identical correspondents are vertically aligned, but indicated with a vertical bar or slash. Every corresponding vowel is different between these two forms, owing to the different position of stress. In addition, the medial coronal stop /t/ is aspirated in pentagonal because it precedes the stressed vowel, while it lenites to a flap in pentagon because it precedes an unstressed vowel (and additionally coalesces with the nasal to yield a nasalized flap). The proper treatment of alternations, wherein the 'same' phonological unit varies according to its context, is also a core concern of phonological theory.

2.1.3. Opacity

Opacity arises when the surface evidence for a phonological process is inconsistent. For example, Baković (2007) gives the following, well-known example from Yokuts Yawelmani:

(2)
UR                             /ʔiliː+l/
Long High Vowel Lowering        ʔileːl
Closed Syllable Shortening      ʔilel
SR                             [ʔilel]

Evidently, the Long High Vowel Lowering process serves to avoid long high vowels, a marked outcome which never appears on the surface in this language (even though many URs contain underlying long high vowels). The Closed Syllable Shortening process is similarly motivated by the observation that long vowels never co-occur with coda consonants. The 'problem' in (2) is that there is no reason for both processes to apply. Closed Syllable Shortening alone would avoid both marked structures, but Long High Vowel Lowering appears to apply anyway, gratuitously 'hiding' the underlying height of the vowel. Opacity, or at least certain types of opaque patterns, are believed to present a significant learning problem.

2.1.4. The Sound Pattern of English (SPE/Rules)

Chomsky and Halle (1968) proposed a phonological analysis of English using string rewrite rules of the form AXB → AYB, typically abbreviated X → Y / A__B and read out loud as 'X goes to Y when it occurs after A and before B'. The formal mechanisms they introduced – including the treatment of segments as 'feature bundles', to which rules could refer, and language-specific rule orderings – became the dominant paradigm within the field of phonology for many years afterwards. Even as constraint-based formalisms have replaced SPE-style rules as the preferred vehicle for phonological analysis, many linguists still use rules as a convenient shorthand for describing phonological processes, e.g. in (2) above.

2.1.5. Optimality Theory

Optimality Theory, like SPE, defines the phonological grammar as a cognitive mechanism which implements the mapping from an input/UR to an output/SR, and may make reference to 'hidden' phonological structure such as metrical feet, syllables, etc. Unlike SPE, OT posits that there are multiple possible candidates for a given input, and there is a parallel computation to identify the optimal ('most harmonic') output candidate, rather than the serial/derivational process of ordered rules in SPE. Seminal works on OT (McCarthy & Prince, 1994; Prince & Smolensky, 1993, 2002, 2004; Smolensky & Legendre, 2006) define the core components of a broad class of constraint-based theories: there must be a component which proposes output candidates (GEN), a set of constraints (CON), and an evaluation/selection mechanism (EVAL) which chooses the winning candidate based on some language-specific prioritization of the constraints. Some authors use "OT" to refer broadly to any such constraint-based theory of phonology. I will use "OT" to refer to the subclass of constraint-based theories with the "total ordering" evaluation method described in Prince and Smolensky (1993, 2002, 2004) and McCarthy and Prince (1994). That is, for the purposes of this article, "OT" means that constraint conflicts are resolved in favor of the highest-ranked constraint, regardless of whether the winning candidate incurs more violations of lower-ranked constraints than alternate candidates. (Constraint conflict arises for particular inputs when it is impossible for an output to satisfy one constraint without violating another. An example is shown below in (3).)

2.1.6. Harmonic Grammar

Later in the article, I will make frequent reference to MaxEnt HG (Goldwater & Johnson, 2003; Hayes & Wilson, 2008), a probabilistic extension of Harmonic Grammar (Legendre, Miyata, & Smolensky, 1990; Smolensky & Legendre, 2006) in the log-linear framework. As the reader may not be familiar with Harmonic Grammar, I describe it very briefly here. Harmonic Grammar is a close variant of OT which differs in the evaluation procedure: constraints are weighted, rather than totally ordered, and the harmony
of a form is determined by the weighted sum of its constraint violations. As with OT, this is straightforwardly illustrated with a tableau; an example of word-final devoicing is shown in (3):

(3)
/ɡad/        IdentVce[-son]   *[-son,+vcd]]PrWd   Harmony
             wt = -1          wt = -5
   [ɡad]                      *                   -1·0 + -5·1 = -5
☞ [ɡat]      *                                    -1·1 + -5·0 = -1

The UR /ɡad/ is given in the top left cell, while candidate SRs are listed below. Constraint names are given in the top row after the input; IDENTVCE[-SON] penalizes obstruents for which the output voicing value does not match the underlying voicing specification, while *[-SON,+VCD]]PRWD penalizes voiced obstruents at the end of a word. Constraint violations are marked in the cells with an '*'. For inputs with underlyingly voiced final obstruents, it is impossible to satisfy both constraints at once; thus this is an example of constraint conflict. The constraint weights are listed directly underneath the constraints themselves, and are required to be nonpositive.¹ The final column indicates the harmony value of the output candidate, defined as the weighted sum of the constraint violations. As with OT, the most harmonic output candidate (or, equivalently, the least disharmonic) is selected as the winner; this is conventionally indicated with the "OT hand" ☞. In cases where only two constraint violations trade off against one another, Harmonic Grammar is equivalent to OT; however, the two theories make different predictions when a single constraint violation conflicts with multiple violations of a different constraint (counting cumulativity) or violations of multiple constraints (ganging cumulativity).

2.2. Probability
I assume the reader is familiar with elementary statistics and probability theory. For example, I assume the reader is familiar with the concepts of p-value and t-test, and the use of the binomial formula to calculate the probability of a series of coin tosses. I also assume the reader is familiar with exponentiation and the inverse operation of taking the logarithm. Below I describe the concept of odds, and briefly outline log-linear models.

2.2.1. Odds and log-odds
The odds of two events, sometimes written a:b, indicate the relative probability of the two events. For example, if the odds are 3:2 that Lucky Horse will win the race, it means that Lucky Horse is expected to win 3 times for every 2 times that Lucky Horse does not win. Odds can always be converted to probabilities and vice versa; for example, 3:2 means that Lucky Horse will win 3 times out of 5 trials, for a probability of 3/(3+2) = 0.6. Odds can be represented as single numbers by simple division, e.g. 3:2 = 3/2 = 1.5. Thus, when there are only two possibilities, an odds of 1.5 corresponds to a probability of 0.6. The log-odds of two events A and B is simply the logarithm of their odds, i.e. log(a:b). (In general, I will assume the natural logarithm unless otherwise specified.) The log-odds has several intuitively attractive properties. It is zero when A and B are equiprobable, positive when A is more probable than B, and negative when A is less probable than B. Moreover, the greater the asymmetry in probability between A and B, the greater the magnitude of the log-odds. Finally, in many of the systems where log-odds are used, probability differences can be many orders of magnitude. The log operation makes the relative likelihood of these outcomes easier to grasp for normal readers.

2.2.2. Log-linear models
Log-linear models express the probability of input-outcome pairs in terms of some feature functions and associated weights. The score H_M(w) of an input-outcome pair is the weighted sum of its feature values. The output of the model M(w) is then determined by stipulating that the probability of an input-outcome pair is proportional to the exponential of its score. Formally, a log-linear model M(w) consists of a vector of feature functions f = {f_k} and a relation GEN which gives the set of possible outcomes y_ij for each input x_i. In addition, the vector w is a parameter of M, and represents the weights that are associated with the feature functions:
(4)
Pr_M(w)(y_ij | x_i) = exp(H_M(w)(x_i, y_ij)) / Z(x_i)
H_M(w)(x_i, y_ij) = Σ_k w_k · f_k(x_i, y_ij)
Z(x_i) = Σ_{y_ij′ ∈ GEN(x_i)} exp(H_M(w)(x_i, y_ij′))
Log-linear models have several attractive computational properties. One of them is that it is easy to interpret the relative probability of two different outputs: for a given input x_i, the log-odds of outcome y_ia versus y_ib is simply the difference in their scores H_M(w)(x_i, y_ia) – H_M(w)(x_i, y_ib). Another, especially important property is that for fixed f and GEN, only mild assumptions are needed to ensure that the probability of a dataset X = {(x_i, y_ij)}_{i=1..N} is convex in the space of all possible weight vectors (Berger, Della Pietra, & Della Pietra, 1996). In more everyday language, this means two things. First, there is a unique 'best' weight vector (w_max) which maximizes the likelihood of the observed data. Second, it is possible to find this unique best solution in a computationally efficient manner, using well-established numerical techniques like (conjugate) gradient ascent. As we will see later, log-linear models offer a natural probabilistic extension for Harmonic Grammar, which offers the exciting potential for a theory of phonological learning that is machine-implementable and testable on natural language data.

¹ Some authors instead require that weights be nonnegative. The formalism itself requires that the weights all be the same sign. (Otherwise, the theory would not exhibit harmonic bounding, and would lose the desirable typological restrictiveness that comes from an explicit theory of markedness.) I prefer negative weights, since this aligns intuitively with the definition of harmony: candidates with more constraint violations are less harmonic. As we shall see later, negative weights also align with the natural extension of Harmonic Grammar to a log-linear model.
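The definitions in (4), together with the word-final devoicing tableau in (3), can be made concrete in a few lines of code. The following is my own minimal sketch (not code from the article), hard-coding the violation counts and the weights wt = (-1, -5) from (3):

```python
import math

# Constraint violation counts for each candidate output of /gad/, as in (3):
# (IdentVce[-son], *[-son,+vcd]]PrWd) -- nonnegative violation counts.
CANDIDATES = {
    "gat": (1, 0),  # devoiced: one faithfulness violation
    "gad": (0, 1),  # faithful: one markedness violation
}
WEIGHTS = (-1.0, -5.0)  # negative weights, as in example (3)

def harmony(violations, weights):
    """Weighted sum of constraint violations (the score H in (4))."""
    return sum(w * v for w, v in zip(weights, violations))

def maxent_probs(candidates, weights):
    """Normalize exp(harmony) over GEN, per the log-linear model in (4)."""
    scores = {c: math.exp(harmony(v, weights)) for c, v in candidates.items()}
    z = sum(scores.values())  # the partition function Z
    return {c: s / z for c, s in scores.items()}

probs = maxent_probs(CANDIDATES, WEIGHTS)
# Harmonies match (3): H(gad) = -5, H(gat) = -1, so [gat] is most harmonic,
# and the log-odds of gat versus gad is H(gat) - H(gad) = 4.
```

Note that the log-odds of the two candidates falls out as the difference of their harmonies, exactly the "attractive property" described above.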
This completes the survey of background material. The next section begins the body of the paper. In that section, I briefly survey the field known as 'formal language theory', whence modern linguistics began.

3. FORMAL LANGUAGE THEORY
Formal language theory is an axiomatic, logical/mathematical approach to language. A 'language' is defined as a set of strings, often according to some process that generates the set. Researchers who work in this area are concerned with the classification of languages according to the 'complexity' of the process required to generate the language, as well as the assumptions needed to learn languages in the various classes identified. Two of the best-known concepts to have emerged from this line of research are the Chomsky-Schützenberger hierarchy (Chomsky, 1956) and the concept of identification in the limit (Gold, 1967), both of which will be briefly covered later. Two strands of work in this line of special relevance to phonology include comparisons of the expressive power of different phonological frameworks (e.g. Buccola & Sonderegger, 2013; Graf, 2010a, 2010b; Jardine, in press) and the elaboration of finite-state techniques which 'count' constraint violations for entire classes of strings, enabling efficient machine optimization (e.g. Eisner, 2002; Hayes & Wilson, 2008; Riggle, 2009).
As this material is somewhat technical and unlikely to be known to the average linguist, I begin with an overview of basic concepts. Furthermore, because the article aims to cover other topics besides just formal language theory, the overview is necessarily somewhat superficial; it is meant to describe the intuitions, the most common notation, and the most widely cited results. Readers who are already acquainted with this material may wish to skip directly to the Framework comparison subsection. Conversely, readers who wish to learn more are advised to peruse a source devoted to formal language theory: Heinz (2011a, 2011b) for phonology specifically, Stabler (2009) for a survey of formal language theory as it relates to natural language universals, or an introductory computer science textbook for the basics.
3.1. General overview
In formal language theory, 'language' does not refer to a shared linguistic code like English or Amharic or Tashlhiyt Berber. Rather, it is a formal object with precisely specified properties, which can be studied in a mathematical, axiomatic, logical fashion. Conventionally, formal language theory assumes an alphabet Σ and defines a string as an ordered sequence of elements from Σ. For example, if Σ = {a, b} then σ = ab is a (rather short) string over Σ. The set of all possible strings over Σ is denoted Σ* (where * is called the Kleene star and has the conventionalized meaning of "0 or more repetitions"). Normally in formal language theory, a language L is defined as a subset of Σ*. Note that the elements of the alphabet do not have any intrinsic meaning, or any internal structure; they are simply algebraic elements that are distinct from one another.
For example, we could define Σ = {C, V} and L = (CV)+ (where + means "1 or more repetitions"); the resulting set of strings would look to a phonologist like a strict CV language: {CV, CVCV, CVCVCV, ...}. But the formalism does not know that C means consonant and V means vowel in the same way that human speakers do. Humans know that vowels are characterized by partially complementary articulatory and acoustic properties, as well as sequencing facts (e.g. words must begin with a C, every C must be followed by a V, V can end a word or be followed by a C). The formalism merely knows the sequencing facts, and that C is a different symbol than V. Indeed, the language L = (ab)+ over Σ = {a, b} has the same abstract structure as L = (CV)+ over Σ = {C, V}; from a formal language perspective, these are notational variants, meaning that they express the same pattern after a transparent, structure-preserving change in notation.
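The claim that (CV)+ and (ab)+ are notational variants can be checked mechanically. A small sketch of my own, using Python's regular expressions (which implement exactly this class of string patterns):

```python
import re

# The language L = (CV)+ over the alphabet {C, V}: one or more CV syllables.
cv_lang = re.compile(r"(?:CV)+\Z")
# The 'same' language after the structure-preserving renaming C -> a, V -> b.
ab_lang = re.compile(r"(?:ab)+\Z")

def in_cv(s):
    return cv_lang.match(s) is not None

def in_ab(s):
    return ab_lang.match(s) is not None

# Membership is preserved under the renaming, illustrating that the two
# languages have the same abstract structure (notational variants).
rename = str.maketrans("CV", "ab")
for s in ["CV", "CVCV", "CVCVCV", "CVC", "VCV", ""]:
    assert in_cv(s) == in_ab(s.translate(rename))
```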
Interest in formal language theory is motivated by the assumption that natural languages can be mapped onto some particular class of formal languages (or vice versa), and that the properties of the formal language class will yield clear insights into how language is learned, represented, and computed in the minds of speakers. For example, it is widely believed that syntax is mildly context-sensitive, while phonology is (sub)regular (e.g. Heinz, 2011a, 2011b; Stabler, 2009). We will unpack this assertion later. In the meantime, it must be acknowledged that this idea of identifying languages with string sets, and dividing them up into classes based upon certain properties, is a large assumption, whose full implications we do not have space to assess here. I will point out one implication here, however: the literature on learning formal languages ('learnability') assumes that knowledge that is 'outside' the grammar is not brought to bear on grammar learning. For example, phonetic knowledge does not figure in the formal language learnability literature on phonology, just as semantic/pragmatic knowledge does not figure in the learnability literature on syntax. With this kind of caveat in mind, let us consider what formal language theorists mean by a language class.
3.2. The Chomsky Hierarchy
It may be helpful to begin with an example. Chomsky (1956) describes a way of generating strings that is now known as a phrase-structure grammar. Phrase-structure grammars are predicated on a system of "rewrite rules", in which one string is rewritten as another. Here is an example of an especially simple phrase-structure grammar:
(5)
rewrite to nonterminals:
S → NP VP
VP → V NP
rewrite to terminals:
V → likes
NP → the boy
NP → the dog
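Grammar (5) is small enough that a machine can enumerate its language exhaustively. Here is a minimal enumerator of my own (an illustration, not code from the article), which repeatedly rewrites the leftmost nonterminal:

```python
# Rewrite rules of grammar (5); uppercase tokens are nonterminals.
RULES = {
    "S":  [["NP", "VP"]],
    "VP": [["V", "NP"]],
    "V":  [["likes"]],
    "NP": [["the", "boy"], ["the", "dog"]],
}

def expand(tokens):
    """Yield all terminal strings derivable from a sequence of tokens."""
    for i, tok in enumerate(tokens):
        if tok in RULES:  # found the leftmost nonterminal: try each rewrite
            for rhs in RULES[tok]:
                yield from expand(tokens[:i] + rhs + tokens[i + 1:])
            return
    # No nonterminals remain: the derivation has terminated.
    yield " ".join(tokens)

language = sorted(set(expand(["S"])))
# Enumerates the four strings of the language generated by (5).
```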
The S symbol is the unique start symbol. This grammar generates strings by beginning with the start symbol, and generating all possible outputs by applying any rule that can apply, at any time. For example, this grammar generates the string the boy likes the boy by rewriting 'S → NP VP', 'NP → the boy', 'VP → V NP', and 'NP → the boy' again. The sequence of rewrite operations, together with the final output of the derivation, has an elegant visual representation as a tree:

(6)

The grammar in (5) is simple enough to enumerate the entire language it generates: the boy likes the boy, the boy likes the dog, the dog likes the boy, the dog likes the dog.

More formally, a phrase-structure grammar G consists of a start symbol S, a set of terminal symbols Σ, a set of nonterminal symbols V (which must not share any symbols with Σ), and a collection of rewrite rules R, where each rewrite rule maps a sequence containing a nonterminal to a sequence of terminals and nonterminals. The language generated by such a grammar is defined as the set of strings generated by all derivations that terminate (i.e. strings containing only terminal symbols).

3.2.1. Context-sensitive languages

The set of languages that can be generated when the rewrite rules are restricted only in that they may not decrease the number of symbols is called the context-sensitive class. It is possible to define context-sensitive languages which are completely unlike natural languages, for example languages in which if the grammar generates a sentence X = x1 x2 ... xn, it also generates the mirror-image X′ = xn xn-1 ... x1. Natural languages exhibit certain kinds of regularities, such as constituency structure, which are not expected if rewrite rules are completely unrestricted. Therefore, the class of context-sensitive languages is 'too rich'; it does not explain the structural constraints that natural languages have.

3.2.2. Context-free languages

Chomsky (1959) defined the context-free class as the set of languages which can be generated by a grammar in which the left-hand side of every rewrite rule is a single nonterminal. In other words, the rewrite rules substitute a unique nonterminal for something else – crucially, without regard to what surrounds the nonterminal. Grammar (5) is an example: every rewrite rule contains a single nonterminal on the left-hand side. Intuitively, this means that the eventual output that corresponds to a nonterminal cannot 'look outside' the nonterminal itself. In other words, context-freeness imposes a type of locality restriction on how substrings may share dependencies. This is one means of enforcing constituent structure in context-free languages.

3.2.3. Regular languages

The regular languages are those that can be generated by rewrite rules in which the left-hand side consists of a single nonterminal, and the right-hand side may contain at most one nonterminal. Moreover, the nonterminals on the right-hand side of the rewrite rules must always be final in the rewrite string (in which case the language is called right regular), or must always be initial in the rewrite string (in which case the language is called left regular). Here is an example of a right regular grammar and a string that it generates:

(7)
The symbols in (7) were chosen suggestively, to illustrate to readers how formal languages might encode structures and relations used in mainstream phonology.

Now we are in a position to understand the contrasting claims that "phonology is (sub)regular" while "syntax is mildly context-sensitive". The former phrase expresses the belief that for every natural language L, there is a grammar GL which can generate all and only the licit phonological strings of L, and GL can be written as a regular grammar (possibly even as some proper subset of the regular languages). The latter phrase expresses the belief that this is not true for syntax, since there exist syntactic patterns which (it has been claimed) cannot be captured by regular rewrite rules, or even context-free rewrite rules. For example, Shieber (1985) gives the following Swiss German clause as an example of a cross-serial dependency:
(8)
Languages which admit of an arbitrary number of such dependencies are probably non-context-free, and Shieber argues that Swiss German is just such a case.

3.3. Equivalency of Finite State Automata and Regular Languages

An overview of formal language theory would not be complete without mention of finite state machines (FSMs, also called FSAs, for finite state automata). Practically speaking, an FSM is an alternative representation of a regular language. Historically, the two were conceived of separately, but the formal equivalence was noted and proved in early work. An FSM consists of a set of states, conventionally indicated with circles and an optional state label. In addition to the states, an FSM contains transitions between states, which must be labeled in most formulations. At least one state is designated as a start state, and at least one state is designated as an end state; these may be the same state. Conventionally, the start state is indicated with a thick circle, while other states are indicated with a single circle. Here are two examples:
(9)
(10)

Finite state machines can be considered as generators or parsers, but either way, they describe the same set of strings. Example (9) describes exactly two strings: Hello father, and Hello world. Example (10) describes an infinite number of strings, including This is the cat, This is the cat that chased the rat, This is the cat that chased the rat that ate the cheese, This is the cat that ate the cheese that chased the rat, etc. In generation mode, the FSM works by beginning at the start state. If it is at an end state, it may stop, having generated a complete string. If there are one or more transitions out of the current state, the machine selects one randomly and follows it, emitting a symbol along the way (the label on the transition). However, when there is only one transition out of a non-end-state, the machine must follow that unique transition. In parsing mode, the machine is said to consume symbols from an input string. It begins at the start state. When it receives the next symbol from the input string, it looks for a transition with a matching label. If there is a matching transition, it follows it and advances to the next input symbol. If there is not a matching label, the machine is said to reject the string. If the machine is in an end state when the input string is entirely consumed, the machine is said to accept the string; otherwise the machine rejects the string. In other words, the FSM accepts the string if and only if it can match every symbol in the input string with a transition and end up in an end state when the input is consumed.

An example of a string that (10) does not accept is This was the cat that chased the rat. Initially, the machine is in the start state (S1). The first input symbol, This, is presented and matches the label for the transition leading out of S1 and into S2, so This is consumed and the machine enters state S2. Now the input symbol was is considered, but the only available transition label is is, so the machine rejects the string. In formal language theory, accepting or rejecting a string is akin to offering a grammaticality judgment. Languages are defined as sets of strings, so in principle it is possible to make a binary judgment for every string, whether it is in the language. In the case of regular languages, there is guaranteed to be a finite state machine which can not only do this in principle, but can do so straightforwardly and efficiently in a computer implementation. For this reason, finite state methods have been applied throughout computer science, for both natural language processing and various other applications (such as programming language parsing and compilers).
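The parsing-mode procedure described above is straightforward to implement. The sketch below is my own reconstruction; since the diagram for (10) is not reproduced here, the state names and transition table are an assumption modeled on the example strings:

```python
# A deterministic FSM in parsing mode. The transition table is a guess at
# the machine in (10), whose diagram is not reproduced in this text.
TRANSITIONS = {
    ("S1", "This"): "S2", ("S2", "is"): "S3", ("S3", "the"): "S4",
    ("S4", "cat"): "S5", ("S4", "rat"): "S5", ("S4", "cheese"): "S5",
    ("S5", "that"): "S6", ("S6", "chased"): "S3", ("S6", "ate"): "S3",
}
START, END_STATES = "S1", {"S5"}

def accepts(sentence):
    """Consume one word at a time; accept iff every word matches a
    transition and the machine halts in an end state."""
    state = START
    for word in sentence.split():
        nxt = TRANSITIONS.get((state, word))
        if nxt is None:        # no matching transition label: reject
            return False
        state = nxt
    return state in END_STATES

# accepts("This is the cat that chased the rat") -> True
# accepts("This was the cat that chased the rat") -> False (rejects at 'was')
```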
It is worth noting here that there are alternative formulations of finite state machines. For example, it is possible to make the state labels correspond to symbols being generated/consumed, while the transitions are unlabelled. It is also possible to augment the transitions and/or states with extra information, beyond the symbol being consumed/generated. Indeed, there is a great deal of work on this topic, which is omitted for space reasons. A final type of finite state automaton is known as a finite state transducer (FST). An FST is just like an FSM, except that it parses an input string and generates a corresponding output string. That is, the FST behaves just like an FSM in terms of parsing, but its transition labels have been augmented; the label consists of both the input symbol to match, and an output symbol to generate upon a successful match. Here is an example which implements an intervocalic lenition rule (d → e / a__a):
(11)
The symbol ε is a special symbol, conventionally used to indicate an empty output. The finite state transducer in (11) will first match an /a/ and output an [a]; then it will match a /d/ but output nothing, waiting to see if it gets another /a/. If it gets another /a/, it will then output the 'delayed' [e] along with the [a]; otherwise, the FST will reject the string, indicating that the lenition rule does not apply to this input.
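The delayed-output behavior just described can be sketched directly. This is my own illustration (the machine in (11) itself is not reproduced here), with state names invented for exposition:

```python
# A finite state transducer for the toy lenition rule d -> e / a__a,
# following the 'delayed output' behavior described for (11).
def transduce(s):
    """Return the output string, or None if the FST rejects the input."""
    out = []
    state = "start"      # 'start': expecting /a/; 'saw_a': just emitted [a];
    for ch in s:         # 'saw_ad': holding a delayed /d/ (epsilon output)
        if state == "start" and ch == "a":
            out.append("a"); state = "saw_a"
        elif state == "saw_a" and ch == "d":
            state = "saw_ad"          # match /d/, emit nothing yet
        elif state == "saw_ad" and ch == "a":
            out.append("e"); out.append("a"); state = "saw_a"
        else:
            return None               # no matching transition: reject
    # Accept only if no delayed /d/ is left pending at the end.
    return "".join(out) if state == "saw_a" else None

# transduce("ada") -> "aea"  (the rule applies)
# transduce("ad")  -> None   (rejected: the delayed /d/ never surfaces)
```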
3.4. Identification in the limit, and other notions of learnability

Gold (1967) provided the first formalization of learnability for a formal language. In Gold's conception, the input to a learner is defined as a text T – an infinite sequence (t1, t2, ...) of grammatical items from a language L, which is guaranteed to contain every item in L at least once, but not in any particular order. A grammar G(L) is defined as a finite representation that can generate all and only the strings of L. A learner A is defined as a function which accepts a finite subsequence Tn = (t1, t2, ..., tn) from a text T and returns a hypothesized grammar. For example, A(T5) is the grammar that learner A would posit after hearing the first 5 sentences of L in text T. By feeding a learner A successively longer subsequences from an input text T, we obtain a sequence of posited grammars A(T1), A(T2), ... A learner is said to identify L in the limit if for every text T, there is a finite amount of input N such that A(TN) = G(L), and A(Tm) = A(TN) for all m > N. In other words, the learner A is said to identify L in the limit if they are guaranteed to converge on a grammar that generates L in a finite amount of time.
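For at least one simple class — the finite languages — identification in the limit is achievable by a learner that simply conjectures the set of items observed so far. A minimal sketch (my own illustration, with a made-up target language, not an example from Gold):

```python
# Identification in the limit for the class of FINITE languages:
# the learner's hypothesis is just the set of items observed so far.
# Once every item of L has appeared in the text, the hypothesis equals L
# and never changes again -- the learner has identified L in the limit.
def learner(finite_subsequence):
    return frozenset(finite_subsequence)

L = {"ba", "baba", "bababa"}                          # target finite language
text = ["ba", "baba", "ba", "bababa", "baba", "ba"]   # every item of L occurs

# The sequence of posited grammars A(T1), A(T2), ...
hypotheses = [learner(text[:n]) for n in range(1, len(text) + 1)]
# After the 4th item, the hypothesis equals L and is stable thereafter.
```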
Prior to presenting Gold's main result, it is worth considering how this framework compares with the child's learning situation. In the framework described above, a learner has access only to positive evidence, that is, only to sentences which are actually in the language. This is now referred to as unsupervised learning, since the learner does not have access to an external metric or 'objective function' which unambiguously indicates the nature of the solution to be learned. (Gold also considered supervised learning, in the form of an informant who presents both sentences from the language and sentences not from the language, while indicating which is which.) It is generally believed that children acquire the syntax of their language from positive evidence only, and tend to ignore the negative evidence they do get (R. Brown, 1973). On the other hand, Gold's framework does not allow for 'meaning', either the semantic meaning of the words that sentences contain, or the 'phonetic' meaning of the phonemes that make up those words. Moreover, Gold's notion of identification in the limit does not impose any constraints upon the input text, such as some kind of 'representativeness' criterion. That is, it is a safe bet that the words 'momma' or 'mother' appear in the first million words that every English-acquiring infant hears, but there is nothing in Gold's formulation which requires input texts to exhibit this kind of real-world distributional property. Thus, Gold's assumptions line up with the child's learning situation in one way, but differ from it in other ways.
There were two key results in Gold (1967). The first is that the class of regular languages is not identifiable in the limit. The second was that regular languages (and even higher classes in the Chomsky hierarchy) are learnable in the limit from an informant, i.e. supervised learning with positive and negative examples. Since phonology is believed to be (sub)regular, and syntax is believed to be at least context-free, and it is widely believed that children do eventually learn the correct grammar for their language, this result is interpreted by many theorists as proving that children possess innate constraints on the space of hypotheses that they consider as grammars for their language. This conclusion does not actually follow from Gold's theorem. In general, one can only reason 'backwards' from a model to reality when one is confident that the model is an accurate portrayal of the reality it is modeling. That is, modeling results depend on a host of assumptions; they are akin to a logical proposition of the form, 'If A and B and C and D, then X'. We cannot conclude from the truth
of X that A and B and C and D are true. Moreover, we cannot conclude from the falsity of X that a particular assumption (e.g. C) is false; we can only conclude that some assumption is false. So from Gold's theorem, what we can conclude is quite limited. It could be that children are born with innate constraints on the grammars that they consider. But it also could be that the input is far more constrained than Gold assumes. It could be that children leverage multiple types of information (such as semantics and phonetics) in language acquisition, and that this provides extra constraints on the space of possible grammars. It could be that human grammars are not strictly comparable with the classes of string-generators that Gold considers. It is possible that humans do not actually converge on a single final grammar state, and actually do update their grammars on the basis of new input throughout the lifespan. These possibilities are all compatible with Gold's theorem.

Close inspection of the proof for Gold's theorem reveals that it depends crucially on the order in which examples are presented. Gold shows that it is possible to construct a text which continually forces the learner to update their hypothesis, because the class of regular languages is rich enough that one can 'maliciously' deny crucial evidence to the learner ad infinitum. Valiant (1984) introduced a probabilistic framework for studying (machine) learning known as probably approximately correct (PAC). Abstracting away from the technical details, the key difference is that texts are required to be 'representative', in the sense that training examples must be drawn from a probability distribution, and the learner is counted as 'approximately correct' if its generalization error on this distribution falls below an arbitrary threshold ε (which can be made as small as desired, as long as it is still greater than 0). A language class is said to be PAC-learnable if a learner can identify an 'approximately correct' language in the hypothesis space from a finite sample of the target language. It is efficiently PAC-learnable if there is an algorithm which is guaranteed to do this while requiring a number of examples that is polynomial in the size of the language. Kearns and Valiant (1994) show that regular languages are not efficiently PAC-learnable, while Li and Vitányi (1991) show that regular languages are efficiently PAC-learnable under the additional assumption that 'simple' examples are more likely to be drawn than complex ones (as assessed by a measure called Kolmogorov complexity).
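To give a flavor of what sample-efficiency means in the PAC setting, here is the standard sample-complexity bound for a consistent learner over a finite hypothesis class (a textbook result, offered as illustration; it is not the Kearns & Valiant construction itself). Here ε is the error tolerance and δ the allowed failure probability:

```python
import math

def pac_sample_bound(h_size, epsilon, delta):
    """Number of examples sufficient for a consistent learner over a finite
    hypothesis class H to be probably (prob >= 1 - delta) approximately
    (error <= epsilon) correct: m >= (1/epsilon) * (ln|H| + ln(1/delta))."""
    return math.ceil((math.log(h_size) + math.log(1.0 / delta)) / epsilon)

# Tightening the error tolerance raises the required sample size linearly:
print(pac_sample_bound(1024, 0.1, 0.05))   # 100
print(pac_sample_bound(1024, 0.01, 0.05))  # 993
```

The bound grows only logarithmically in the number of hypotheses, which is why restricting the hypothesis space (as innate constraints would) pays off so dramatically for a learner.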
Researchers have interpreted these learnability results in many ways. Some researchers believe that the way forward is to develop increasingly fine-grained specification of the assumptions and increasingly fine-grained classifications of the classes of languages. Some researchers believe that this kind of work simply has no bearing on the learning problem that children actually face. One generalization that many parties can agree to is that the learnability results offered so far are fragile, in the sense that seemingly small changes in the assumptions can result in large changes in the nature of the conclusion (while intuitively similar changes may also yield no meaningful difference). Thus, one way to view the careful work done by Gold, Valiant and others is as an ongoing attempt to characterize which assumptions actually matter for learnability.
There is a vast amount of work on formal language theory and learnability that cannot be surveyed here; I trust the presentation above was detailed enough to give the lay reader a sense of what formal language theory aims to accomplish. In the remainder of this section, I turn to two new lines of work in formal language theory with the potential to inform basic questions in phonology. The first concerns what might be called framework comparison – a methodology for comparing two distinct linguistic formalisms via the formal languages they generate. The second concerns the use of finite-state techniques for efficient implementations of constraint-based phonology, which I will refer to as finite-state OT.

3.5. Framework comparison
Modern linguistics has taken seriously the task of formalizing theoretical intuitions. From seminal works to the modern day, theorists are apt to propose new frameworks like SPE (Chomsky & Halle, 1968) and OT (Prince & Smolensky, 1993, 2002, 2004), or non-trivial departures from existing frameworks, such as autosegmental phonology (Goldsmith, 1976, 1990; McCarthy, 1981), Harmonic Serialism (McCarthy, 2008, 2011), and others. The formalist bent has paid off in theoretical precision: so long as the linguistic atoms and operations are specified, the reader of a paper can make new predictions from a theory which the original writer would agree with. This kind of precision enables rapid progress, and surely reduces the frequency and severity of fruitless debates in the field over misinterpretations of a theory. Still, as pointed out in Stabler (2009), the proliferation of theories does come with a cost. In many cases there are competing formalisms, but since the surface character of the explanation is so different, it is difficult to tell whether the theories actually make different predictions. Formal language theory offers a way to directly compare the expressive and restrictive powers of two different frameworks. This kind of work is already well-established in the syntactic domain, as evident from the following quotation in a review by Stabler (2009):
In the work of Joshi, Vijay-Shanker, and Weir (1991), Seki et al. (1991), and Vijay-Shanker and Weir (1994) four independently proposed grammar formalisms are shown to define exactly the same languages: a kind of head-based phrase structure grammars (HGs), combinatory categorial grammars (CCGs), tree adjoining grammars (TAGs), and linear indexed grammars (LIGs). Furthermore, this class of languages is included in an infinite hierarchy of languages that are defined by multiple context free grammars (MCFG), multiple component tree adjoining grammars (MCTAGs), linear context free rewrite systems (LCFRSs), and other systems. Later, it was shown that a certain kind of "minimalist grammar" (MG), a formulation of the core mechanisms of Chomskian syntax – using
the operations merge, move, and a certain strict 'shortest move condition' – define exactly the same class of languages (Michaelis, 2001; Harkema, 2001; Michaelis, 1998). These classes of languages are positioned between the languages defined by context free grammars (CFGs) and the languages defined by context sensitive grammars (CSGs) like this.
The works cited by Stabler indicate that despite the surface differences between formal frameworks, they are sometimes "notational variants", in the deep sense that they describe the same set of languages. The details of these proofs are beyond the scope of this article, but the general nature of the argument is clear: provide a schema for translating one formalism into a particular kind of logic, which can be expressed as a formal language. Then do the same for the other formalism, and show that the two resulting formal languages are the same (or different) according to known properties of the formal language. In the view of the researchers who do this work, formal language theory has a certain potential to tell us what our formal mechanisms are actually buying for us.
Kaplan and Kay (1994) arguably supplied the first such example of this line of work in phonology. They proved that the rule-based rewrite system presented in SPE belongs to the class of regular languages, by embedding it in a class of logics known as Monadic Second Order (MSO) logics, known to be equivalent to the regular languages. More precisely, Kaplan and Kay claimed that SPE was regular even with 'cyclical rules' that are allowed to feed their own environment, as long as they are forbidden from feeding their own targets (for discussion and clarification see Kaplan & Kay, 1994). Potts & Pullum (2002) did essentially the same thing with OT, embedding a class of OT constraints into MSO. In addition, Potts & Pullum demonstrated that particular classes of OT constraints that had been proposed (e.g. ALIGN constraints) exceeded the power of regular languages, and in some cases they proposed regular alternatives.
Graf (2010a, b) compared a formalism known as Government Phonology with SPE. For readers not already familiar with Government Phonology, Graf (2010a) very readably points out the vast surface differences between it and SPE:
GP as defined in Kaye et al. (1985, 1990) and Kaye (2000) differs from SPE in that it uses privative features (features without values) rather than binary ones, assembles these features in operator-head pairs instead of feature matrices, builds its structures according to an elaborate syllable template, employs empty categories and allows all features to spread (just like tone features in autosegmental phonology). (p. 83)
Graf begins by translating each of these formalisms into a kind of propositional logic. Like Kaplan and Kay (1994), Graf embeds SPE in MSO. Graf goes on to show that if Government Phonology allows unbounded feature spreading, it can be embedded in MSO; if it allows only bounded spreading, it can be embedded in a strictly less expressive logic. In other words, Graf argues that despite the many differences between these formalisms, the property that really matters is bounded vs. unbounded spreading, since with unbounded spreading the two theories can express the same languages.

Graf goes on to address the 'empirical bite' of this theory by asking whether any natural phonological phenomena do require unbounded feature spreading. He proposes two candidates – Sanskrit nati and Cairene stress assignment. According to Graf, the nati rule causes an underlying /n/ (the TARGET) to become retroflexed if it is the first postvocalic /n/ after a continuant retroflex consonant (/ṣ/ or /r/; the TRIGGER), provided that no coronal intervenes between the trigger and target, that the nasal target is immediately followed by a nonliquid sonorant, and that there is no retroflex continuant in the string after the target. As for Cairene stress assignment, the rule is to stress the final if it is superheavy or the penult if it is heavy. If both the final syllable and the penult are light, the rule is to stress the penult or the antepenult, whichever is an even number of syllables from the closest preceding heavy syllable. This suggests the presence of an 'invisible' trochaic footing system, in which secondary stresses propagate in an iterative/bounded manner from the rightmost heavy to the penult or the antepenult. Of course, as Graf points out, the ability to analyze bounded/iterative spreading of an 'invisible' feature is empirically impossible to distinguish from unbounded spreading of a visible feature. Therefore, he proposes to ban bounded/iterative spreading of invisible features for the purposes of theory comparison. This suggests that unbounded feature spreading is required – an important theoretical claim.

Two other recent studies in this line of research are Jardine (in press) and Buccola and Sonderegger (2013). Jardine (in press), following directly from Graf (2010a, b), asks whether autosegmental phonology belongs to the same class as SPE. Jardine does not give a complete answer to this question, owing to phenomena such as floating tones and dissociation rules. However, Jardine does show that MSO is expressive enough to cover the 'simple' phenomena that initially motivated autosegmental phonology, such as rightward feature-spreading. Buccola and Sonderegger address Canadian raising, an opaque phonological pattern in which allophonic variation is triggered by an underlying contrast that is erased at the surface:

(12) a. Raising before voiceless consonants
        ride /ɹaɪd/ → [ɹaɪd]    write /ɹaɪt/ → [ɹʌɪt]
     b. Foot-medial tapping
        batter /bætɚ/ → [bæɾɚ]    badder /bædɚ/ → [bæɾɚ]
     c. Interaction
        rider /ɹaɪdɚ/ → [ɹaɪɾɚ]    writer /ɹaɪtɚ/ → [ɹʌɪɾɚ]
Patterns like (12) are easily captured by a rules-based analysis in which the tapping rule applies after the Canadian raising rule. It is generally believed that such patterns cannot be accommodated by 'normal' OT, and a considerable body of work has been devoted to accommodating the theory to this type of pattern. What Buccola and Sonderegger show is that for any version of OT in which there is a single stratum (that is, one input representation and one selected output representation, with no intervening representational levels subject to competition), and in which faithfulness constraints assess the relationship only between an input segment and its output correspondent (i.e. without reference to its neighbors), the OT theory is strictly unable to account for the opacity pattern. They show this by translating OT constraints into finite-state transducers, in the manner proposed by Riggle (2004) and described in the next subsection. However, Buccola and Sonderegger also acknowledge that the highly related formalism of Harmonic Grammar (in which constraint competitions are resolved through linear combination rather than strict domination) actually can accommodate cases like Canadian raising, without allowing for so-called positional faithfulness constraints. (Finally, there is an analysis in which the [ʌɪ] ~ [aɪ] contrast is treated as phonemic, although many linguists disprefer this, since it requires stipulating that [ʌɪ] is only licensed before coronal obstruents and [ɾ], and crucially fails to link this fact to the nearly complementary distribution of [aɪ].)
In summary, formal language theory has begun to deliver on the promise of framework comparison in phonology. If one is willing to accept the premise that a language is a set of strings, this kind of technically exacting work has the capacity to reveal surprising equivalences between formalisms, and to zoom in on key properties which distinguish expressivity. Still, it must be acknowledged that existing work seems to depend sensitively on details of the analysis which are not themselves rock-solid. For example, the claim that syntax is not context-free rests on phenomena like cross-serial dependencies, and more specifically on the claim that Swiss German allows an unbounded number of them. In practice, it is likely quite rare for natural usage to yield more than 1 crossing dependency. While Graf's (2010a) work does not strictly depend on whether unbounded feature spreading actually occurs in phonology, it does suggest that this is a critical distinction phonologists should attend to. However, as he acknowledges, the two putative cases he gives have been contentious in the literature. Buccola and Sonderegger (2013) discuss Canadian raising and more generally counterfeeding on environment (Baković, 2011) and seem to endorse a rules-based approach, but there are alternative analyses that do not require ad hoc modifications to existing theories.
In conclusion, formal language theory offers a rigorous, string-based and axiomatic approach to phonology as a formal system. Many researchers believe that this kind of logic- or model-based approach to phonology is the key to discovering what formal properties of our frameworks make for meaningful contrasts in empirical coverage and restrictiveness. Other researchers are uneasy with this approach, a feeling which Stabler (2009) aptly summarized thusly:
But many linguists feel that even the strong claim that human languages are universally in the classes boxed in (1) is actually rather weak. They think this because, in terms of the sorts of things linguists describe in human languages, these computational claims tell us little about what human languages are like. (p. 203)
Looking back over the works reviewed above, it is clear that the formal language approach has relatively little to say about markedness, alternations, opacity, or many other core concerns of mainstream theoretical phonology.
For example, one of the most appealing aspects of constraint-based grammars is that they formally encode a substantive bias against marked structures, by directly including markedness constraints in the theory. Indeed, the success of OT in predicting the typology of syllable structures arises from the combination of an ONSET constraint (which punishes words that begin with a vowel) with a formal property called harmonic bounding (if candidate B is equal or worse on every dimension than candidate A, B can never win). OT thereby predicts the existence of languages which require words to begin with a consonant, and of languages which allow words to begin with a vowel, while correctly predicting the absence of languages which require words to begin with a vowel. But there is nothing about "regularity" which forces this property. Rather, it is part of the substantive content of the theory. Formal language theory simply has nothing to say about it.
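Harmonic bounding itself is easy to state computationally. The sketch below (with hypothetical constraint names and violation profiles) checks whether one candidate's violation vector is harmonically bounded by another's; a bounded candidate loses under every possible ranking or weighting of the constraints:

```python
def harmonically_bounded(b, a):
    """True iff the candidate with violation profile `b` is harmonically
    bounded by `a`: `a` does at least as well on every constraint and
    strictly better on at least one, so `b` can never win under any
    ranking (OT) or positive weighting (Harmonic Grammar).
    Violations are non-negative counts; lower is better."""
    assert len(a) == len(b)
    return (all(ai <= bi for ai, bi in zip(a, b))
            and any(ai < bi for ai, bi in zip(a, b)))

# Toy syllabification profiles, constraints ordered (ONSET, *CODA, DEP, MAX):
cv_cand       = (0, 0, 0, 0)  # candidate with an onset, no coda
onsetless     = (1, 0, 0, 0)  # identical except it violates ONSET
print(harmonically_bounded(onsetless, cv_cand))  # True: onsetless never wins
print(harmonically_bounded(cv_cand, onsetless))  # False
```

This is exactly why no ranking of these constraints yields a language that requires vowel-initial words: the vowel-initial candidate is bounded whenever an otherwise identical consonant-initial candidate is available.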
In any case, it is clear that the formal language theory approach to framework comparison has just begun to affect phonology. There will be more of this work in the near future, not less. The eventual theoretical impact of this line of work cannot be determined yet, and is likely to depend on the extent to which theorists engage with well-established natural language data.
3.6. Finite-state OT
Mainstream phonological theory has undergone a paradigm shift with the innovation of constraint-based theories such as Optimality Theory (McCarthy & Prince, 1994; Prince & Smolensky, 1993, 2002, 2004). It was Ellison (1994) who first proposed a finite-state implementation of OT. The essence of the proposal was to construct an individual FST for each constraint. For example, with particular representational assumptions, the constraint *CODA can be encoded with (13):
(13) [FST diagram: each filled coda position is transduced to a violation of -1; all other positions are transduced to 0]
In (13), the input is coded as pairs of syllable slots and segmental material, with O indicating an onset position, N the nucleus, C a coda, ε the empty string (a syllable position that is not filled), and any nonempty string a syllable position that is filled. Thus, for example, when the syllabified form [al.qal.mu] is run through the FST in (13), it is represented as in (14a), and the output is as in (14b):

(14) a.  O  N  C  O  N  C  O  N  C
         ε  a  l  q  a  l  m  u  ε
     b.  0  0 -1  0  0 -1  0  0  0
In other words, the input string is transduced to a string of constraint violations, whose sum indicates the number of constraint violations for the candidate as a negative integer. Moreover, by constructing a regular expression which generates all possible syllabifications of /alqalmu/ and performing an operation known as intersection (also called the product), one can obtain the constraint violations for every possible syllabification. The advantage of doing this with finite-state methods is that they are amenable to memory- and operation-efficient computer implementation; in fact, standard finite-state libraries have been developed for most major computer programming languages.
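The violation-counting idea in (13)-(14) can be mimicked in a few lines of ordinary code (a sketch in Python rather than a true FST; the slot/material pairs and the word /alqalmu/ follow the example above):

```python
# Ellison-style constraint evaluation, sketched without FST machinery:
# a syllabified candidate is a sequence of (slot, material) pairs, and
# *CODA emits -1 for every coda slot whose material is nonempty.

EMPTY = ""  # plays the role of epsilon: an unfilled syllable position

def star_coda(candidate):
    """Transduce a candidate to its per-position *CODA violation vector."""
    return [-1 if slot == "C" and material != EMPTY else 0
            for slot, material in candidate]

# [al.qal.mu] as in (14): O N C / O N C / O N C
alqalmu = [("O", EMPTY), ("N", "a"), ("C", "l"),
           ("O", "q"),   ("N", "a"), ("C", "l"),
           ("O", "m"),   ("N", "u"), ("C", EMPTY)]

violations = star_coda(alqalmu)
print(violations)       # [0, 0, -1, 0, 0, -1, 0, 0, 0]
print(sum(violations))  # -2: two filled codas
```

What the genuine finite-state construction adds is the ability to evaluate not one listed candidate but the entire (regular, possibly infinite) candidate set at once, via intersection.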
Subsequent work has elaborated on this conception in various ways, although the core idea of writing constraints as FSTs has remained. For example, Karttunen (1998) proposed to compose constraints according to their ranking in a particular language with lenient composition, which efficiently removes candidates from the computation as soon as they become suboptimal, while allowing candidates to violate high-ranked constraints when there is no better competitor. Frank and Satta (1998) study the generative power of OT, and conclude that it is regular only if individual constraints can assign at most an n-ary distinction in well-formedness for some finite n. For example, the ALIGN family of constraints, which might penalize an element according to its (potentially unbounded) distance from the edge of a word, is suspect by these criteria.
Finite-state OT is particularly exciting to me because of its potential for the study of learning. The key ideas can be traced to a variety of papers. Goldwater and Johnson (2003) first noticed that Harmonic Grammars could be extended to log-linear (maximum entropy) models, simply by treating constraints as the feature functions. Berger et al. (1996) proved that under mild assumptions the likelihood function of log-linear models is convex in the weight space, which means that there is a unique maximum and it can be found efficiently using the gradient (the vector of derivatives with respect to each weight). Berger et al. further observed that the gradient can be calculated as O – E, where Oi is the observed violation count for constraint fi in the training data, and Ei is the expected violation count. Eisner (2002, et seq.) and Riggle (2004, 2009) extended the finite-state conception of constraints with a special product operation that tracks the violation vector for an entire grammar, along with the vector's (log-)probability, using an algebraic structure referred to as a 'violation semiring'. The violation semiring construction offers computationally efficient computation of the weighted violation vectors for any regular class of strings. Therefore, it can be used to calculate the expected violation count E when that value is well-defined. Together, these results imply that a machine-implemented log-linear grammatical model can feasibly be trained. Hayes and Wilson (2008) actually implemented such a model in Java, and have been producing interesting work with it in subsequent papers. I will return to this model in the Cognitive Modeling section.
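The O – E gradient is simple to compute when the candidate set is small enough to enumerate (the finite-state machinery exists precisely for the general case, where it is not). A minimal sketch with made-up violation vectors; sign conventions for weights vary across implementations:

```python
import math

def expected_violations(candidates, weights):
    """E_i = sum_x p(x) * f_i(x), where p(x) is proportional to
    exp(sum_i w_i * f_i(x)). Each candidate is its violation vector f(x)."""
    scores = [math.exp(sum(w * f for w, f in zip(weights, fvec)))
              for fvec in candidates]
    z = sum(scores)
    probs = [s / z for s in scores]
    return [sum(p * fvec[i] for p, fvec in zip(probs, candidates))
            for i in range(len(weights))]

def gradient(observed, candidates, weights):
    """Berger et al.'s O - E: observed minus expected violation counts."""
    expected = expected_violations(candidates, weights)
    return [o - e for o, e in zip(observed, expected)]

# Two candidates, two constraints; at zero weights the model is uniform:
cands = [(0, 1), (1, 0)]
print(expected_violations(cands, [0.0, 0.0]))  # [0.5, 0.5]
print(gradient([0, 1], cands, [0.0, 0.0]))     # [-0.5, 0.5]
```

Because the likelihood is convex, repeatedly stepping the weights in the direction of this gradient converges to the unique maximum-likelihood grammar.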
Heinz and colleagues have applied finite-state techniques to the acquisition of phonology. For example, Heinz (2007) treats the acquisition of long-distance phonological patterns (such as sibilant harmony, vowel harmony, and stress assignment) using finite-state learning. Heinz observes that all such long-distance phenomena exhibit a property he calls neighborhood distinctness, a property which enforces certain kinds of generalization, and which falls out naturally from applying a 'state merging' operation during construction of the FSM. Later work by Heinz considers learning various classes of subregular languages, often directly motivated by particular phenomena such as vowel harmony, and sometimes with proofs of identifiability in the limit (Heinz, 2010; Heinz & Koirala, 2010; Heinz & Lai, 2013).
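The flavor of state-merging inference can be conveyed with a toy (my simplification for exposition, not Heinz's actual algorithm): build a prefix-tree acceptor from positive data, then group states by a crude 'neighborhood' signature; merging the states within a group introduces loops, which is exactly how such a learner generalizes from finitely many harmonic forms to unboundedly long ones.

```python
from collections import defaultdict

def prefix_tree(words):
    """Prefix-tree acceptor: states are string prefixes of the data."""
    states = {""}
    finals = set()
    for w in words:
        for i in range(1, len(w) + 1):
            states.add(w[:i])
        finals.add(w)
    return states, finals

def neighborhood_groups(states, finals):
    """Group states by (outgoing symbols, final?) -- a simplified signature."""
    groups = defaultdict(set)
    for q in states:
        out = frozenset(q2[-1] for q2 in states
                        if len(q2) == len(q) + 1 and q2[:-1] == q)
        groups[(out, q in finals)].add(q)
    return groups

# Sibilant-harmony-like toy data:
states, finals = prefix_tree(["sas", "sasas"])
groups = neighborhood_groups(states, finals)
# "sa" and "sasa" (and the start state) share a signature; merging them
# creates a loop, so the merged machine accepts sas, sasas, sasasas, ...
print(groups[(frozenset({"s"}), False)])
```

Heinz's result is that when the target pattern is neighborhood-distinct, this kind of merging provably yields the right generalization.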
Formal language theory has developed a large body of axiomatic results on classes of 'languages', defined as string sets generated intensionally by some finite, compact generative mechanisms. Work on this topic is generally concerned with 'learnability', which is typically formulated at an abstract, algebraic level. For example, a class of languages is identifiable in the limit if an optimal learning algorithm can be guaranteed to converge upon the correct language in the class given an arbitrary sample of some size. Recent work on this topic has illustrated surprising insights on the expressive equivalence of formal frameworks with very different surface characteristics, and has provided powerful tools for implementing constraint-based phonology in computationally efficient finite state machines.
4. NATURAL LANGUAGE PROCESSING AND AUTOMATIC SPEECH RECOGNITION (NLP/ASR)
Every time I fire a linguist, the performance of the speech recognizer goes up. -- Fred Jelinek (in Hirschberg, 1998)

There are three kinds of lies: lies, damn lies, and statistics. -- Benjamin Disraeli (Twain, 2006, p. 471)
Computational phonology is generally used to refer to basic research. However, there is extensive overlap with the fields of Natural Language Processing (NLP) and Automatic Speech Recognition (ASR), since all three deal with computations involving (representations of) speech sounds. Despite the overlap, there is a certain tension between the goals of the scientist and the goals of engineers who wish to apply the science to solve real-world problems, as revealed in Jelinek's oft-repeated quip, above. This review will not address cutting-edge work in NLP or ASR, since 'computational phonology' is not generally used to describe this kind of work. Still, current computational work owes a huge debt to NLP and ASR for the application of statistical methods to natural language. I will briefly describe two concepts which originated from NLP/ASR but which have spread to computational linguistics in general.

4.1. Zipfian distributions

It seems trivial, almost to the point of banality, to observe that some things happen more than others; for example, some words are repeated more frequently than others. However, the nature of the distribution can have powerful consequences for language acquisition and processing. It turns out that the variation in word frequencies is not completely random; it follows what has come to be known as a Zipfian distribution (Zipf, 1935, 1949). This means that a small number of items have a large frequency, and a large number of items have a small frequency. It is also sometimes informally described as 'most events are rare'.

Zipfian distributions are found at every level of linguistic structure. Baayen (2001) considers the implications of this fact for morphology. An essential point is that for any natural language text, the probability of encountering a new item never drops to zero. Therefore, a functioning model of language use must always allow for unseen items. The reader might be surprised to learn how much research does not provide for this. For example, the best-known and most-successful model of word recognition, TRACE (Elman & McClelland, 1985), does not have any explicit mechanism for handling so-called Out-of-Vocabulary (OoV) items. Daland and Pierrehumbert (2011) found that even segmental diphones (a consonant or vowel, followed by another consonant or vowel) exhibit a Zipfian distribution in English. Daland and Pierrehumbert go on to show that an English listener gets enough input in one day to approximate the frequency distribution over (frequent) diphones, yet might still encounter new pairs of speech sounds throughout their life. The enormous range of variation in frequencies has important but sometimes underappreciated implications for how learners might acquire phonology.

4.2. Statistical models

As noted above, the goal of many NLP and ASR researchers is to build language technologies that work, rather than focusing on the cognitive principles that underlie language use. Of course, those two goals are not mutually exclusive, but they are not identical either. In fact, the general experience of the NLP and ASR community has been that 'dumb' models with lots of training data perform better than 'smart' models with less training data:

I don't know how many of you work in IT have had this experience, but it's really awfully depressing to spend a year working on an interesting research idea and then discover you can get a bigger BLEU score increase by, say, doubling the size of your language model training data. I see a couple of nodding heads. -- Philip Resnik (in P. Brown & Mercer, 2013)
An example of a 'dumb' model in syntax is the Markov/n-gram models that Chomsky (1956) attacked as insufficient to explain various long-distance phenomena. From the perspective of NLP/ASR researchers, linguistic theory is good to the extent that it is useful and necessary for building systems that work:
It's not that we were against the use of linguistics theory, linguistic rules, or linguistic intuition. We just didn't know any linguistics. We knew how to build statistical models from very large quantities of data, and that was pretty much the only arrow in our quiver. We took an engineering approach and were perfectly happy to do whatever it took to make progress. In fact, soon after we began to translate some sentences with our crude word-based model, we realized the need to introduce some linguistics into those models... We replaced the words with morphs, and included some naïve syntactic transformations to handle things like questions, modifier position, complex verb tenses and the like... Now this is not the type of syntactic or morphological analysis that sets the linguist's heart aflutter, but it dramatically reduces vocabulary sizes and in turn improves the quality of the EM parameter estimates... From our point of view, it was not linguistics versus statistics; we saw linguistics and statistics fitting together synergistically. -- Peter Brown (in P. Brown & Mercer, 2013)
A crucial contribution of NLP/ASR has been the insight that a probabilistic approach to language modeling is necessary for developing real-world applications. Arguably, it is also inspiring a revolution in how we conceptualize language acquisition, or at least phonological acquisition.
This community has also developed machine-learning techniques that enable efficient estimation of model
parameters. For example, commercial ASR technologies like Nuance Dragon rely on an acoustic model which relies on a 'dumb' Hidden Markov Model (HMM). An HMM is a close relative of a probabilistic FSM, with two key differences. First, the states themselves are latent variables (in the sense that the model builder posits that they exist, and they condition the model's output, but their parameters/relationships to other model components are learned during training). Second, emission of a string is not directly associated with state transitions; rather, each state is associated with a probability distribution over observations. The acoustic observations are a time series {ot}, t = 1..M, where each ot is some kind of vector, typically generated by some kind of spectral decomposition of overlapping time frames from the waveform. For example, a simple HMM is shown in (15):
(15) [HMM diagram: states 'C' and 'V', each with a dominant self-transition and a per-state emission probability box]
In this case, the task is to parse an acoustic sequence by labeling each discrete time frame as belonging to one of the categories 'C' or 'V'. Self-transitions take up the bulk of the probability in each case, since normally the same vowel/consonant is spread over many observation frames. The 'emission probability' boxes characterize the likelihood of emitting the current observation ot given the posited state st using a multi-dimensional normal distribution. For example, the 'C' label is associated with relatively lower amplitude and less periodicity than vowels in the 2-5 kHz band, and relatively higher amplitude but still less periodicity than vowels in the 5-10 kHz band. Much of the early, seminal work in these fields focused on developing dynamic programming techniques to train these models efficiently from limited or very large amounts of training data. Especially well-known are the Viterbi algorithm for finding the most likely sequence of states given an observation sequence (Viterbi, 1967), and the Baum-Welch (or forward-backward) algorithm for finding the unknown parameters of an HMM (Baum & Petrie, 1966; Jelinek, Bahl, & Mercer, 1975). These algorithms, or modest adaptations/generalizations of them, are still used in many NLP papers published today, as well as in the finite-state OT methods described earlier and elaborated in more detail in later sections.
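The Viterbi idea can be sketched in a few lines (discrete emission probabilities stand in for the Gaussian densities described above; all probabilities here are illustrative, not estimated from data):

```python
import math

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most likely state sequence for `obs` (log-space)."""
    best = {s: (math.log(start_p[s]) + math.log(emit_p[s][obs[0]]), [s])
            for s in states}
    for o in obs[1:]:
        new = {}
        for s in states:
            # Best predecessor: maximize previous score + transition + emission.
            score, path = max(
                (best[p][0] + math.log(trans_p[p][s]) + math.log(emit_p[s][o]),
                 best[p][1])
                for p in states)
            new[s] = (score, path + [s])
        best = new
    return max(best.values())[1]

states = ("C", "V")
start_p = {"C": 0.6, "V": 0.4}
trans_p = {"C": {"C": 0.7, "V": 0.3}, "V": {"C": 0.4, "V": 0.6}}  # self-loops dominate
emit_p = {"C": {"lo": 0.8, "hi": 0.2}, "V": {"lo": 0.2, "hi": 0.8}}  # 'hi' ~ periodic energy
print(viterbi(["lo", "lo", "hi", "hi", "lo"], states, start_p, trans_p, emit_p))
# ['C', 'C', 'V', 'V', 'C']
```

The self-loop-heavy transition matrix is what lets the decoder keep a label stable across many frames of the same segment.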
The discussion of NLP/ASR is necessarily brief. As emphasized throughout this discussion, NLP has made significant contributions to what now might be called computational phonology, although in practice NLP is interested in engineering applications (such as ASR) and is normally considered a separate field. The use of statistical models has transformed cognitive modeling in phonology, to which I turn next.

5. COGNITIVE MODELING
The advent of statistical models in NLP offered up new avenues for more cognitively minded researchers. Early examples of this include the work of the Parallel Distributed Processing group, who formulated the TRACE model of speech perception (Elman & McClelland, 1985) as well as a hotly-contested single-route model of past tense formation (Rumelhart & McClelland, 1986). The 'connectionist' approach they employed, emphasizing so-called Artificial Neural Networks (ANNs), has largely been abandoned in contemporary cognitive science, for reasons too complex to discuss here. Nonetheless, the PDP group deserves credit for ushering in a new era in cognitive science by attempting to explicitly link (psycho-)linguistic theories with human behavioral data.
5.1. Phonotactic and phonological learning

The bulk of cognitive computational modeling of phonology that this author is aware of is concentrated in the areas of phonotactic and phonological learning. There are two key messages that this literature suggests to me. The first is that a constraint-based approach to
phonological learning makes sense from a range of standpoints. The second is that a stochastic approach to phonological variation makes sense from a range of standpoints.
5.1.1. Factoring the learning problem
As nicely set forth in Hayes (2004), a constraint-based approach makes sense of the empirical data we see on phonological development. More specifically, Hayes (2004) reviews a range of studies suggesting that infants acquire significant aspects of the phonotactics of their language by 9-11 months of age, while there is no or little evidence of unambiguously phonological alternations until 15-24 months of age. In a constraint-based framework, this pattern can be captured by a theory in which markedness constraints are learned early. While Hayes (2004) does not claim that infants have no command of faithfulness constraints, it seems intuitively plausible that it is easier to learn about which surface structures do and do not occur (phonotactics) than it is to also learn about non-transparent relationships between UR and SR.
5.1.2. Learnability proofs for constraints
Although it is in principle possible to reason about acquisition within SPE-style rules, the nature of the OT formalism has evidently been more amenable to formal analysis. The advent of OT was followed in short order by learning algorithms, and formal proofs of their efficacy. For example, Tesar and Smolensky (2000) summarize a large body of earlier work treating the phonological acquisition problem from the perspective of OT. One aspect of the learning problem is learning the production grammar – the component which maps underlying representations to fully specified surface representations. They give a formal proof of the 'correctness' of an algorithm they refer to as Error-Driven Constraint Demotion (EDCD), which solves this problem. That is, if the learner is given correct underlying forms and correct surface forms from an OT grammar with constraints C = {Ck}, EDCD provably converges to the correct total ordering over C which generated the learning data. Of course, the learning problem for infants is more difficult – they must infer not only the grammar, but the underlying forms and the correct surface forms (including hidden structure). Tesar and Smolensky describe the process of assigning a fully specified surface representation to an observable form as Robust Interpretive Parsing (RIP; although Boersma, 2003, points out this could simply be called perception). Tesar and Smolensky further propose Lexicon Optimization, the assumption that when multiple input forms map to the same hypothesized surface representation, the most faithful UR is selected. They show in a series of simulations that this combination (EDCD + RIP + Lexicon Optimization) correctly learns a significant majority of stress patterns in a factorial typology, although there were cases in which the learner got 'stuck', failing to converge on any correct grammar.
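The demotion step at the heart of EDCD can be sketched as follows (a simplified rendering for exposition, not Tesar and Smolensky's published code: constraints live in ranked strata, and an error triggers demotion of the loser-preferring constraints that dominate the highest winner-preferrer; constraint names and violation counts below are hypothetical):

```python
def edcd_step(strata, winner_viols, loser_viols):
    """One Error-Driven Constraint Demotion update.

    strata: list of sets of constraint names, highest-ranked stratum first.
    winner_viols / loser_viols: dicts mapping constraint -> violations
    incurred by the intended winner and by the grammar's (wrong) output.
    """
    w_pref = {c for c in winner_viols if winner_viols[c] < loser_viols[c]}
    l_pref = {c for c in winner_viols if loser_viols[c] < winner_viols[c]}
    # Highest stratum containing a constraint that prefers the winner:
    pivot = min(i for i, s in enumerate(strata) if s & w_pref)
    # Loser-preferrers ranked at or above the pivot are demoted below it.
    offenders = set().union(*[strata[i] & l_pref for i in range(pivot + 1)])
    if not offenders:
        return strata  # the winner already beats this loser
    new = [s - offenders for s in strata]
    new.insert(pivot + 1, offenders)
    return [s for s in new if s]

# /tat/ -> intended winner [ta] (deletes the coda) vs. wrong output *[tat]:
strata = [{"ONSET", "*CODA", "MAX", "DEP"}]
winner = {"ONSET": 0, "*CODA": 0, "MAX": 1, "DEP": 0}
loser  = {"ONSET": 0, "*CODA": 1, "MAX": 0, "DEP": 0}
print(edcd_step(strata, winner, loser))  # MAX demoted below the other three
```

Iterating this step over informative winner/loser pairs drives the hierarchy toward a ranking consistent with all of the learning data.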
The adoption of scalar-valued weights has opened up additional analytic possibilities in constraint-based learning. For example, Potts, Pater, Jesney, Bhatt, and Becker (2010) showed that the simplex algorithm could be used to identify weights for a Harmonic Grammar. This provides a learnability proof for Harmonic Grammar that is entirely analogous to the correctness proof of Tesar and Smolensky's EDCD for OT, except that Potts et al. employ a pre-existing mathematical approach with a well-established pedigree. In a series of papers, Magri (2012, in press) analyses the phonotactic learning problem using a scalar-valued variant of OT in which the winning input-output candidate is determined by a total ordering of constraints, which is projected from underlying scalar-valued constraint weights. Magri gives bounds under which the use of scalar weights and error-driven re-weighting is sufficient to render learning algorithms tolerant to noise (i.e. occasional data points which violate the grammar). However, Magri's work generally deals with the grammar as a function, meaning that an input must be mapped to the same output on every occasion. Boersma and colleagues have shown that a stochastic approach provides graceful handling not only of noise, but of free variation. For example, Boersma applied stochastic gradient ascent to a probabilistic variant of OT (some readers may know this as the Gradual Learning Algorithm). Boersma and Hayes (2001) tested this algorithm on a number of empirical phenomena, finding that it was able to handle not only exceptional data points, but to accurately model genuine free variation. A more comprehensive review of this topic is given in Section 4 of Coetzee & Pater (2011).

5.1.3. Stochastic phonology
The work of Pierrehumbert and colleagues reflects some of the advantages of adopting a statistical perspective in the study of phonology and phonological acquisition. For example, Pierrehumbert (1994) conducted a study of the triconsonantal clusters observed word-medially in English. As a crude first pass, she proposed that the expected occurrences of a medial cluster in monomorphemes could be determined compositionally from the probabilities of generating the cluster from a syllable coda and a following syllable onset, e.g., E[lfr] = |L| · Pr(l]σ) · Pr([σfr), where |L| is the size of the monomorphemic lexicon. Pierrehumbert found that of the 8708 potentially grammatical medial clusters that could be generated in this way, only 50 were actually attested monomorphemically. Naively, one might imagine this means there is a lot of work for linguistic theory to do, explaining why so many possible events don't occur. However, Pierrehumbert pointed out, over 8500 of these 8708 clusters had an expected frequency
below 1. In other words, a proper linguistic explanation was only needed for the 150 or so triconsonantal clusters which had expected frequencies well above 1, but observed frequencies of 0. 'Chance' alone was enough to explain the absence of most unattested clusters, alleviating the burden on linguists.
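The baseline reasoning can be sketched as follows (the coda and onset probabilities here are invented toy values, not Pierrehumbert's actual estimates):

```python
def expected_counts(lex_size, coda_probs, onset_probs):
    """Expected monomorphemic count of each medial cluster, assuming a
    cluster is generated by an independent coda + onset combination."""
    return {coda + onset: lex_size * p_c * p_o
            for coda, p_c in coda_probs.items()
            for onset, p_o in onset_probs.items()}

# With a 10,000-word lexicon, a common coda + common onset is expected
# often, but a rarer combination has an expected count below 1, so its
# absence needs no grammatical explanation.
counts = expected_counts(10000,
                         {"n": 0.1, "l": 0.02},
                         {"t": 0.05, "fr": 0.003})
```

Here "nt" is expected 50 times, while "lfr" is expected only 0.6 times; the latter's absence from a real lexicon would be unsurprising under chance alone.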
Coleman and Pierrehumbert (1997) further elaborated this idea by formalizing a syllable parser as a probabilistic context-free grammar (a PCFG is a CFG like in example (6), but with probabilities attached to the rewrite rules). They added prosodic features to distinguish stressed from unstressed syllables, as well as initial versus noninitial and final versus nonfinal syllables. Coleman and Pierrehumbert validated their model against human judgments from a nonce-word acceptability task. They found that the aggregate acceptability of their nonwords was almost perfectly correlated with the log-probability their model assigned, a finding that has since been replicated with numerous other probabilistic models (Daland & Pierrehumbert, 2011). In addition to comparing the model output to behavioral data, Pierrehumbert and colleagues' work represents an early instance of a critical aspect of computational cognitive modeling – specifying a meaningful baseline, against which the utility of a particular formal device can be measured.
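A stripped-down version of this style of model scores a word as the log-probability of independently chosen syllable constituents (a toy illustration in the spirit of Coleman and Pierrehumbert's grammar; the probabilities and the onset/rhyme decomposition are invented, and the prosodic conditioning is omitted):

```python
import math

def word_logprob(syllables, onset_probs, rhyme_probs):
    """Log-probability of a word given as a list of (onset, rhyme) pairs,
    multiplying independent constituent probabilities."""
    return sum(math.log(onset_probs[o]) + math.log(rhyme_probs[r])
               for o, r in syllables)

# Invented toy probabilities: a word built from frequent constituents
# scores higher than one with a vanishingly rare onset, mirroring the
# gradient acceptability judgments the model was validated against.
onset_probs = {"b": 0.10, "bl": 0.02, "zr": 0.0001}
rhyme_probs = {"ik": 0.05, "ag": 0.01}
```
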
Another domain in which a stochastic approach has had some success is in non-deterministic morphophonology. As mentioned above, the PDP group proposed an influential connectionist model of past tense production in English (Rumelhart & McClelland, 1986). This paper was very polarizing, since it suggested that both regular and 'irregular' morphophonology could be explained by a single, analogical system. A number of researchers, including Pinker and Marcus, proposed a dual-route model in which regular morphology is calculated by a rule-based grammar, while 'irregular' morphology is calculated by an analogical system. Owing to the heated rhetoric surrounding the issue and the number of papers written on this topic (Albright & Hayes, 2003; Daugherty & Seidenberg, 1992; Marcus, 1995; Marcus, Brinkmann, Clahsen, Wiese, & Pinker, 1996; Pinker & Prince, 1988; Plunkett & Marchman, 1991; to name just a few), it has become known as the Past Tense Wars. Although there is not space to review this fascinating literature, it is mentioned here because cognitive computational modeling played such a prominent role in the debate – formal models were implemented in computer programs, which generated data that was then compared to child and/or adult production. Partly as a result of researchers' commitments to actual implemented models, a number of important discoveries were made. These included the observation that minority inflectional patterns can be marginally productive (e.g., spling → splung), the discovery of output-oriented processes (e.g., irregulars like burnt share surface commonalities with regularly inflected items, in this case the presence of a word-final coronal stop that is not present in the verb stem), and the discovery of 'islands of reliability' not only in irregularly inflected patterns but also in regular forms (for further discussion see Albright & Hayes, 2003).
5.1.4. Constraint-based stochastic phonology

Following the research program of Hayes (2004), and the insight of Goldwater and Johnson (2003) that Harmonic Grammar can be naturally extended to the log-linear framework, Hayes and Wilson (2008) describe and implement a phonotactic learner that is supplied with a proto-lexicon (a list of word forms) and a phonological feature set. The feature set defines a set of natural classes, following mainstream phonological theory. The software then considers grammars consisting of 'n-gram constraints', e.g., the bigram constraint '*[-son, +vcd][-son, -vcd]' might prohibit a sequence of obstruents O1O2 in which O1 is voiced while O2 is voiceless. For a given set of constraints, the software uses the finite-state methods of Riggle (2004, 2009) to rapidly determine the optimal weights. The grammar is built and pruned iteratively, by selecting new constraints from a very large hypothesis space according to various search heuristics, and then retaining those constraints which pass a complexity-penalized statistical criterion for improving the model fit to the training data. Hayes and Wilson (2008) demonstrate that the grammars learned by the model exhibit various empirically desirable properties. For example, when trained on onset clusters in the English lexicon, it assigns gradient well-formedness scores to legal and unattested onset clusters, which correlate quite tightly with the aggregate judgments of schoolchildren on the same onsets as reported in the body of the paper. Further computational work studying this model's predictions for sonority sequencing is given in Daland and Pierrehumbert (2011) and Hayes (2011). Hayes and White (2013) use the model as a baseline to test for 'phonetic naturalness' effects in learning, i.e., whether two putative constraints which receive equal support from the lexicon, but differ in the extent of phonetic motivation, are treated equally by adult English speakers in rating novel forms.
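The log-linear scoring at the heart of such a model is compact enough to sketch (this toy shows only the maxent probability computation over a fixed candidate set; Hayes and Wilson's constraint induction, feature system, and weight fitting are not modeled, and the single constraint and forms below are invented):

```python
import math

def harmony(form, weighted_constraints):
    """Negative weighted sum of constraint violations for a form."""
    return -sum(w * constraint(form) for constraint, w in weighted_constraints)

def maxent_probs(forms, weighted_constraints):
    """Maxent (log-linear) distribution: P(x) proportional to exp(harmony(x))."""
    scores = [math.exp(harmony(f, weighted_constraints)) for f in forms]
    z = sum(scores)
    return {f: s / z for f, s in zip(forms, scores)}

# Toy bigram constraint: penalize a voiced+voiceless obstruent sequence 'bt'.
no_bt = lambda form: form.count("bt")
probs = maxent_probs(["abta", "apta"], [(no_bt, 2.0)])
```

With weight 2.0 on the constraint, the violating form receives gradiently lower, but non-zero, probability — the hallmark of a stochastic rather than categorical grammar.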
The work of Jarosz (2006, 2013) has concentrated particularly upon the problem of learning underlying representations in stochastic constraint-based phonology. For example, Jarosz (2013) contains a careful analysis of why Robust Interpretive Parsing (Tesar & Smolensky, 2000) fails in particular cases; among other things, Jarosz concludes that encoding a probability distribution across outputs allows the learner to recover from the 'traps' that caused Tesar and Smolensky's algorithm (which was cast in categorical, non-stochastic OT) to fail.
This moment is a very exciting one in the theory of phonological acquisition. The field-wide shift to constraint-based theories has opened up multiple new lines of attack on the acquisition problem. As Hayes (2004) pointed out, the constraint-based approach is compatible
with the developmental trajectory that is actually observed, under the interpretation that children set the relative prioritization of markedness constraints rather early in development. Nearly all of the papers reviewed in this section represent significant insights into the acquisition problem that would not have been possible under SPE-style rules. While there are no doubt additional subtleties in this approach that have not been discovered, the rather rapid progress that has been made in the last 10 years on phonological acquisition in particular arguably outstrips the progress that had been made in the preceding 30-40 years during which SPE-style rules were the dominant phonological framework.
One part of what has made this progress possible is that the constraint-based approach lends itself naturally to problem representations that are similar, and adaptable to, problem representations in machine learning. The more that linguistic problems can be represented like problems in other scientific fields, the more we linguists are able to leverage the powerful computational tools that have been developed to solve them, such as maximum entropy models (Goldwater & Johnson, 2003; Hayes & Wilson, 2008; Jarosz, 2013). At the same time, the adoption of machine learning methods promises to help focus phonological theory on the substantive components which it adds, over and above theory-innocent machine learning methods. For example, Hayes repeatedly makes the point that a kitchen-sink approach to constraints fails with toy languages and otherwise successful learning algorithms (Hayes, 2004; Hayes & White, 2013). Analogously, it is common lore amongst theoretical phonologists that a successful OT analysis can be sunk by the wrong constraint, and this holds equally true in a computational setting where some of the candidate enumeration and scoring is done rigorously by the computer.
We can expect further, rapid progress in this domain in particular; the author is in communication with a number of scholars doing new and interesting things on this topic at this very moment. In the next subsection, we turn to another area where computational modeling has had a significant impact on rapid progress: word segmentation.
5.2. Word segmentation
Word segmentation is the perceptual process whereby listeners parse the speech stream into word-sized units. As is evident from listening to speech in an unfamiliar language, many words are not followed by a silence or other language-general auditory boundary cue. However, fluent and normally-hearing listeners epiphenomenally report the sensation of hearing discrete words during speech perception, except under the most challenging listening conditions. Word segmentation refers to the cognitive process or processes that have applied between the auditory level and the listener's percept of discrete words in a sequence.
One of the earliest computational approaches to word segmentation was the seminal TRACE model of speech perception, published by the already-mentioned PDP research group (Elman & McClelland, 1985). In this model, the listener is equipped with a bank of phonological features, a phoneme (or allophone) inventory, and an inventory of words. The 'auditory input' is represented as a time-varying vector of feature values. The model is a specific instance of a general class of models, quite popular in the psycholinguistic literature, known as 'spreading activation': the perceptual information from the 'bottom' (in this case, auditory featural) level percolates up to 'higher' levels (phonemes, and then words), and in some cases 'top-down' information also percolates downward. As a result, the 'output' of the model is a time-varying vector of word activations. The model is deemed to have successfully parsed a sentence if, at the end of the sentence, all of the sentence's words are highly activated, and no other words are highly activated.
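The bottom-up percolation idea can be sketched in a few lines (this toy shows a single feed-forward pass only; TRACE itself adds time-aligned unit copies, lateral inhibition, and top-down feedback, none of which are modeled here, and the units and weights below are invented):

```python
def bottom_up(lower_activations, weights):
    """One spreading-activation step: each higher-level unit receives the
    weighted sum of the activations of its lower-level supporters."""
    return {unit: sum(w * lower_activations.get(src, 0.0)
                      for src, w in supporters.items())
            for unit, supporters in weights.items()}

# Features activate phonemes, and phonemes in turn activate words.
features = {"voiced": 1.0, "labial": 1.0, "high": 0.0}
phoneme_weights = {"b": {"voiced": 0.5, "labial": 0.5},
                   "p": {"labial": 0.5},
                   "i": {"high": 1.0}}
word_weights = {"bee": {"b": 0.6, "i": 0.4},
                "pea": {"p": 0.6, "i": 0.4}}
phonemes = bottom_up(features, phoneme_weights)
words = bottom_up(phonemes, word_weights)
```

Given voiced labial input features, the word unit consistent with /b/ ends up more activated than its voiceless competitor.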
As Strauss, Harris, and Magnuson (2007) write:
Although TRACE was introduced 20 years ago, it continues to be vital in current work in speech perception and SWR. Despite well-known limitations (acknowledged in the original 1986 article and discussed below), TRACE is still the best available model, with the broadest and deepest coverage of the literature... TRACE has proved extremely flexible and continues to spur new research and provide a means for theory testing. For example, it has provided remarkably good fits to eye tracking data from recent studies of the time course of lexical activation and competition (Allopenna, Magnuson, & Tanenhaus, 1998; Dahan, Magnuson, & Tanenhaus, 2001), including subtle effects of subphonemic stimulus manipulations (Dahan, Magnuson, Tanenhaus, & Hogan, 2001). (p. 20)
TRACE is mentioned here because, among other things, it has been claimed to account for word segmentation. The idea is that if you recognize the words themselves, the epiphenomenal percept of word segmentation has been explained. However, as hinted above, TRACE is not necessarily a viable model of acquisition. In particular, the model can only recognize words in its lexicon; no model-internal means is available for processing novel words and adding them to the lexicon. As is generally acknowledged in the literature on the acquisition of word segmentation, this is an essential aspect of the larger problem, since experimental evidence suggests that infants are able to segment previously unknown words, and indeed, this is the majority of new words that are learned (for argumentation see Daland & Pierrehumbert, 2011, and Goldwater, Griffiths, & Johnson, 2009).
Subsequent computational research on this topic employed corpus studies in combination with connectionist modeling (Aslin, Woodward, LaMendola, & Bever, 1996; Cairns, Shillcock, Chater, & Levy, 1997; Christiansen, Allen, & Seidenberg, 1998; Elman, 1990),
with the promising result that relatively simple neural network models could predict word boundaries without necessarily recognizing the neighboring words. However, owing to the well-known difficulties with interpreting the internal representations of connectionist networks, this line of research stalled shortly after the initial wave, essentially because it proved impossible to reason from the modeling results to how infants actually solved the problem. Although this is a more general issue with modeling research, it proved especially acute here because it was not even possible to determine how the models solved the problem.
Nonetheless, the finding that prelexical segmentation was computationally practical had important consequences. Experimental evidence began pouring in around this time for phonotactic segmentation, meaning segmentation based on (knowledge of) likely, unlikely but permissible, and impermissible sequences within and across prosodic units such as words (e.g., Jusczyk, Hohne, & Baumann, 1999; Jusczyk, Houston, & Newsome, 1999; Mattys & Jusczyk, 2001; Saffran, Aslin, & Newport, 1996; for a more comprehensive review see Daland & Pierrehumbert, 2011). The experimental evidence shows quite clearly that infants can and do extract new wordforms from the speech stream, even from 'difficult' positions such as phrase-medially, when there are good phonotactic cues.
This prompted a wave of computational models which attempted to solve the segmentation problem using only phonotactic knowledge. Early instances include Xanthos (2004) and Fleck (2008), who used utterance boundary information to infer lexical phonotactic properties, as originally suggested by Aslin, Woodward, LaMendola, and Bever (1996). A probabilistically rigorous bootstrapping model was formulated and tested in Daland and Pierrehumbert (2011) using diphones, sequences of two segments; in English, individual diphones typically have positional distributions that are highly skewed toward being either word-internal or word-spanning, so that this phonotactic cue is an excellent one for word segmentation. Daland and Pierrehumbert advocate for a phonotactic approach to word segmentation because phonotactic segmentation becomes efficacious as soon as infants possess the necessary phonetic experience, around 9 months, consistent with the developmental evidence. Moreover, Daland and Pierrehumbert show that the phonotactic approach is robust to conversational reduction processes that occur in English. For example, it is well known that word-final coronal stops are often deleted in conversational English; Daland and Pierrehumbert show that this kind of process causes only a modest decrement to their phonotactic model, but has rather more drastic effects on lexical models which use wordform recognition to do word segmentation (since current-generation lexical models assume the surface pronunciation of a wordform is its canonical and only form, their distributional assumptions are violated by speech containing pronunciation variation). Adriaans and Kager (2010) propose an analogous model in the framework of OT, which induces segmentation constraints from featural co-occurrence information.
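The core of a diphone-based segmenter is easy to sketch (a toy threshold version; Daland and Pierrehumbert's model estimates boundary probabilities from phonetic experience and operates on phonetic transcriptions — the orthographic example and the probability value below are invented):

```python
def diphone_segment(utterance, boundary_prob, threshold=0.5):
    """Posit a word boundary inside every diphone whose estimated
    probability of spanning a word boundary exceeds the threshold."""
    words, current = [], utterance[0]
    for a, b in zip(utterance, utterance[1:]):
        if boundary_prob.get((a, b), 0.0) > threshold:
            words.append(current)   # close off the current word
            current = b
        else:
            current += b
    words.append(current)
    return words
```

Because most diphones are strongly skewed toward word-internal or word-spanning positions, even this crude thresholding recovers many boundaries once the diphone statistics are known.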
The phonotactic approach has not panned out as well as its proponents originally hoped, however. As the empirical coverage widened to other languages, it became clear that phonotactic approaches always worked best for English (vs. Korean: Daland & Zuraw, 2013; vs. Spanish and Arabic: Fleck, 2008; vs. Japanese: Fourtassi, Börschinger, Johnson, & Dupoux, 2013; et alia). Moreover, the assumption (based on maternal questionnaires; Dale & Fenson, 1996) that 9-month-old infants barely knew any words was contradicted by experimental evidence (e.g., Mandel, Jusczyk, & Pisoni, 1995) suggesting that infants knew some wordforms as early as 4-6 months, even if they were not necessarily aware of the corresponding meanings.
In the meantime, the phonotactic approach to modeling word segmentation was overshadowed by the Bayesian, lexical approach developed by Goldwater, Johnson, and colleagues. This approach, which had its roots in the computational models of Batchelder (2002) and Brent and Cartwright (1996), returns to the view of word segmentation as an epiphenomenon of word recognition popularized in TRACE, but departs from TRACE in various ways. Most crucially, the models included means to add previously unencountered wordforms to the lexicon ('learn new words'); also, Brent and Cartwright (1996) defined an explicit and probabilistic mathematical objective which their model was supposed to maximize. Thus, Brent and Cartwright advocated for framing the segmentation problem at Marr's computational level ('What is the mathematical characterization of the function that humans optimize?') rather than the algorithmic level ('How do humans find the optimal solution for the function that they are optimizing?'). Goldwater, Griffiths, and Johnson (2009) extend the early work of Brent and Cartwright to a more general setting, factoring the learning problem so as to enable efficient optimization, reframing the objective in a Bayesian setting (rather than the related, but more restricted, Minimal Description Length approach used by Brent and Cartwright; for discussion and analysis see Goldwater, 2006), and extending the data model so as to be both more powerful and more flexible. For example, Goldwater et al. (2009) show that better segmentation is predicted if infants attend to dependencies between words, a prediction that was retroactively confirmed by an experimental study showing that 6-month-olds use their own names to segment the following word (Bortfeld, Morgan, Golinkoff, & Rathbun, 2005).
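The computational-level framing can be illustrated with a tiny Viterbi-style search for the segmentation that maximizes total log-probability under a unigram word model (this sketches the objective only, not the Gibbs-sampling inference Goldwater et al. actually use; the lexicon probabilities and the novel-word penalty are invented):

```python
from functools import lru_cache

def best_segmentation(utterance, word_logprob, unseen=-20.0):
    """Return the segmentation maximizing the sum of word log-probabilities;
    substrings absent from the lexicon receive a fixed novel-word penalty."""
    @lru_cache(maxsize=None)
    def best(i):
        # Best (score, words) for the suffix of the utterance starting at i.
        if i == len(utterance):
            return 0.0, ()
        options = []
        for j in range(i + 1, len(utterance) + 1):
            word = utterance[i:j]
            tail_lp, tail = best(j)
            options.append((word_logprob.get(word, unseen) + tail_lp,
                            (word,) + tail))
        return max(options, key=lambda t: t[0])
    return list(best(0)[1])
```

The novel-word penalty stands in for the probability mass a Bayesian model reserves for unseen words; because it is finite, the model can 'learn new words' rather than being restricted to a fixed lexicon, the crucial departure from TRACE.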
Numerous authors have followed up on Goldwater and colleagues' seminal work. For example, Blanchard, Heinz, and Golinkoff (2010) adapt the model in Goldwater et al. (2009) by including an incremental n-gram phonotactic model, whose parameters are discovered during the learning process. They found a significant but very modest gain in performance, suggesting that much of the problem-solving power of Goldwater's
model is actually located in the prior distribution (owing to reasons of space, I am unable to describe this model in more detail here; the reader is encouraged to consult the original paper for clear exposition). Pearl and colleagues have experimented with the idea that 'adding performance back in' to computational-level models can yield more psycholinguistically valid (and sometimes more accurate) performance, by incorporating limited short-term memory and/or long-term forgetting into Goldwater-like models (Pearl, Goldwater, & Steyvers, 2011; Phillips & Pearl, 2012). Lignos (2012) presents an incremental model with a slightly different objective than in Goldwater et al. (2009); an innovation is the use of a lexical filter which prevents low-confidence words from being incorporated into the model's lexicon. A variety of lexical filters have been used in previous work, including especially the constraint that a word must contain a vowel (Brent & Cartwright, 1996) or that it must have a certain minimal frequency (Daland & Pierrehumbert, 2011; see Ch. 5 of Daland, 2009, for modeling, analysis, and discussion of 'error snowballs', and Pearl et al., 2011, for argumentation that memory limitations help prevent error snowballs by forgetting early misparses).
The rapid, intense progress that has taken place in our understanding of word segmentation acquisition has been driven by an interplay and dialogue between a variety of research traditions, most notably developmental psycholinguists (Jusczyk, Mattys, Morgan, Saffran, etc.) and cognitive computational modelers (Daland, Goldwater, Johnson, Pearl, etc.), as well as researchers who are able to mix these methodologies (Aslin, Kager, Swingley, etc.). This is, in the author's humble opinion, a wonderful thing, and it is to be hoped that this example spreads to other domains.
More generally, the impact of cognitive modeling cannot be overstated in linguistic theory and in cognitive science more generally. The interaction between domain-general and domain-specific representations and learning algorithms is a topic of perennial interest, and computational modeling has shed, and continues to shed, new light on the complexities. Modeling has in some cases clearly ruled out hypotheses as to cognitive processes that seemed a priori quite plausible; while in other cases it has shown that two formalisms which might naively be supposed to make completely divergent predictions actually offer statistically indistinguishable explanations for the very same data set (e.g., Jarosz, 2013). Just as with formal language theory for framework comparison, it is safe to predict that there will be more of this work in the future, not less. In the next and final content section of this review I turn briefly to the topic of corpus studies.

6. CORPUS STUDIES
A corpus study is any study in which the central data consists of a 'corpus' – a body of text representing some aspect of language use – and the central analysis consists of counting elements in the text and doing statistical comparisons. Corpus studies flourished in the early days of the CHILDES database (MacWhinney, 2000), an early crowdsourced project in which (usually orthographic) child-related corpora were assembled together under the auspices of a single research group. For example, much of the early work on morphological acquisition focused on order-of-morpheme acquisition, e.g., comparing the time and frequency of -ing, -ed, and other English functional morphemes (R. Brown, 1973).
Owing to the orthographic coding of most corpora, and the phonologically non-transparent nature of English (the analysis language for most corpus-based research to date), the bulk of corpus work has focused on morphology and syntax rather than phonology. Nonetheless, there is a significant body of corpus work in phonology. I will limit the review to a few examples, as much of this work is of a similar character.
Two studies which address phenomena of interest to theoretical phonology were done by Zuraw and colleagues. Zuraw (2006) collected a corpus of Tagalog loanwords using Internet blogs. Loanwords were desirable for this study since the research question pertained to the productivity of intervocalic tapping, and the productivity of phonological patterns from high-frequency native items is confounded with lexicalization. Using this corpus, Zuraw examined how morphological status interacts with a variable phonological process; she found interesting differences between the prefix+stem and stem+enclitic cases, which there is not space to discuss here. In a conceptually similar study, Hayes, Zuraw, Siptár, and Londe (2009) investigate the vowel harmony pattern of Hungarian, which is largely categorical, but exhibits variation in particular cases (notably, when an initial back vowel is followed by one or more 'neutral' vowels, which do not undergo acoustically obvious harmony processes themselves). Hayes et al. (2009) note several 'phonetically unnatural' aspects of the harmony system which appear, at least statistically, to not be due to chance alone (for example, associations between consonant place and vowel height that condition the application rate). They go on to assess the productivity of these 'unnatural' patterns, and compare them to the productivity of 'natural' patterns with similar statistical support, finding that Hungarian native speakers exhibit knowledge of both, but apparently exhibit more productivity for the 'natural' patterns (see the paper for details). Larsen and Heinz (2012) present a corpus study, also of vowel harmony, but in Korean, and particularly in its onomatopoetic sub-lexicon. Their analysis confirms some aspects of previous accounts of this sub-lexicon, but adds nuances, e.g., that the harmony class of a vowel may depend on its position in the word. Daland (2013) presents a corpus study of adult- versus child-directed speech, in which he compares the relative frequency of different segmental classes. Daland argues against the claim that adults tailor the segmental frequencies in their
child-directed speech, by showing that the moment-to-moment variation in segmental frequencies dwarfs the putative aggregate differences that had been reported in previous research.
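The kind of counting such studies rest on is straightforward to sketch (a toy example with an invented mini-corpus and segment classes; real studies use phonetically transcribed corpora and statistical tests over much larger samples):

```python
from collections import Counter

def class_frequencies(corpus, classes):
    """Relative frequency of each segment class across all utterances."""
    counts, total = Counter(), 0
    for utterance in corpus:
        for seg in utterance:
            total += 1
            for name, members in classes.items():
                if seg in members:
                    counts[name] += 1
    return {name: counts[name] / total for name in classes}

corpus = ["badu", "tipa"]
classes = {"stop": set("bdtpkg"), "vowel": set("aeiou")}
freqs = class_frequencies(corpus, classes)
```

Comparing such frequency profiles across adult- and child-directed subcorpora, utterance by utterance, is what allows the within-register variation to be measured against the between-register differences.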
In all of these corpus studies, researchers take an existing corpus (or create one) and then analyze it, comparing the counts against the predictions of some existing phonological theory or account. Corpus studies are relatively easy to conduct and replicate once the corpus has been created, so they are an appealing methodology. However, it is the norm to supplement corpus studies with additional computational studies and/or experimentation, so as to provide converging evidence. There are many corpus studies that could have been reviewed here, and I selected a mere handful to illustrate the 'flavor' of this style of research. (A number of corpus studies were also reviewed in the cognitive modeling section earlier.) This style of research is reviewed here, at least briefly, because it is considered to be 'computational phonology' by many researchers, including specialists on language acquisition.

7. SUMMARY AND CONCLUSIONS
In this paper I have reviewed a number of subfields which I or close colleagues consider to be 'computational phonology'. I began with formal language theory as it is specifically applied to phonology. After reviewing the fundamentals, I discussed recent theoretical work of interest, including the use of equivalencies between formal languages and logics to compare formal frameworks (like SPE and OT), as well as the application of finite-state methods for efficient optimization of large-scale constraint-based models. Next, I briefly discussed the influence of NLP/ASR (Natural Language Processing and Automatic Speech Recognition) on computational phonology; although those fields are not considered computational phonology, cognitive scientists owe a huge debt to these fields for introducing and demonstrating the utility of probabilistic models for natural language problems. In the section of the paper that corresponds the most closely to my own research interests, I discussed cognitive computational modeling in general, and focused in particular on computational approaches to phonological and phonotactic acquisition, as well as the acquisition of word segmentation by infants and children. Finally, I very briefly discussed corpus studies; there is a long tradition in corpus work and it is a very general methodology, so I only gave a few examples to illustrate what it can and cannot do.
Stepping back from the many and important details that go into making any one particular study, it is time to revisit the question with which this article began: What is computational phonology? Let us begin with what is common. As claimed in the introduction, many or most of the works reviewed above draw upon a common foundation of formal language theory. For example, some of the most exciting work on cognitive modeling of phonological acquisition makes use of finite-state OT (Hayes & Wilson, 2008). Similarly, most of the work on computational phonology relies on a shared body of methodological knowledge about corpus linguistics. For example, it is nearly always necessary to preprocess a corpus for one's particular research needs. Moreover, the Natural Language Processing field has repeatedly and forcefully demonstrated the dangers of overfitting; it is now received wisdom in this field that generalization must be assessed by testing on a different data set than the model was trained on (except in certain cases of unsupervised learning). Nearly all of the work reviewed above in cognitive computational modeling deals either with a corpus of phonological data, or with behavioral results from a 'corpus' of stimuli, or both. Finally, the bulk of the studies reviewed here deal specifically with first language acquisition (although, to be fair, that partially reflects the author's interests, in addition to the inherent biases of the field). This is quite a bit of shared knowledge and methodological commonality. However, if we examine the research questions that each subfield asks, despite the fact that there is a general preoccupation with language acquisition, we still see a greater amount of variation than is, I think, common for a coherent field.
Within formal language theory, the pursuit is really not of empirical phenomena that do or don't occur in natural languages; rather, the goal is to understand and elucidate the formal relationships between various formal models of 'language'. This subfield has largely resisted probabilistic approaches, and it has concentrated on formal restrictions on the generative capacity of formal models (such as regular versus context-free), at the expense of substantive restrictions (such as the implicational universal that words with consonant onsets are strictly less marked than onsetless words). A large amount of work in this field is devoted to acquisition, but it tends to proceed in a proof-based or algorithmic manner, asking if learning algorithm A is guaranteed to learn every language L in a given class. The psychological plausibility of the learning assumptions is not always a very important concern to such researchers; rather they are interested in the mathematical and logical relationships between A and L.
Within Natural Language Processing (NLP), the goal is to solve real-world engineering problems, often ones in which money can be made. For example, it is worthy and important to translate documents from resource-rich languages like English to high information-demand languages (such as Mandarin Chinese). It is also worthy and important to translate documents from languages whose speakers produce goods and technologies (like Mandarin Chinese) to languages whose speakers consume goods and technologies (like English). Translators work slowly and must be paid a considerable amount of money; there is a lot of money to be made and saved in developing good machine translation. In this sort of application, the formal properties of a model are of interest only insofar as they impact the ultimate performance of the system as a whole. There are of course researchers whose interests span both NLP and more basic science, including researchers who believe that understanding the way humans do language may result in better NLP, and so on. Nonetheless, the field as a whole is oriented toward developing and applying statistical models which solve 'real-world' problems. There are many and interesting problems in this field, which this author is too distant from to review in the detail they deserve here. It is quite clear, however, that the types of problems this field is concerned with are quite different than the rather abstract questions that preoccupy formal language theorists.
In cognitive computational modeling, the goal is more specifically to elucidate how humans actually do some particular linguistic task. This is related to, but crucially different from, the formal language theory approach. At the risk of oversimplifying considerably, one might put it this way: formal language theory asks, "What does model X do?"; cognitive modelers ask, "Do humans do it like model X?". That is, in this field, computational researchers are concerned much more with psychological plausibility, and less with the abstract structure of the problem space. It is no surprise, then, that computational research in this field interacts more tightly with developmental research on language acquisition.
My goal, in reviewing these different subfields, is not to claim that one is superior to another. Rather, it has been to illustrate the rich tapestry of human thought that falls under the broad umbrella term 'computational phonology'. There are strands that connect each of these subfields, even as the core concerns differ from researcher to researcher and subfield to subfield. Computational phonology is getting bigger and bigger, and fragmenting more with each passing year. But, too, we are learning more and more.

REFERENCES
Adriaans,F.,&Kager,R.(2010).Addinggeneralizationtostatistical
learning:Theinductionofphonotacticsfromcontinuousspeech.JournalofMemoryandLanguage,62,311-331.http://dx.doi.org/10.1016/j.jml.2009.11.007
Albright,A.,&Hayes,B.(2003).Rulesvs.analogyinEnglishpast
tenses:Acomputational/experimentalStudy.Cognition,90,119-161.http://dx.doi.org/10.1016/S0010-0277(03)00146-XAslin,R.N.,Woodward,J.,LaMendola,N.,&Bever,T.G.(1996).
Modelsofwordsegmentationinfluentmaternalspeechtoinfants.InJ.L.Morgan&K.Demuth(Eds.),Signaltosyntax:Bootstrappingfromspeechtogrammarinearlyacquisition(pp.117-134).Mahwah,NJ:Erlbaum.
Baayen,R.H.(2001).Wordfrequencydistributions.Dordrecht,
Netherlands:KluwerAcademic.http://dx.doi.org/10.1007/978-94-010-0844-0
Bakovi?,E.(2007).Arevisedtypologyofopaquegeneralisations.
Phonology,24,217-259.http://dx.doi.org/10.1017/S0952675707001194
Bakovi?,E.(2011).Opacityandordering.InJ.Goldsmith,J.Riggle
&A.C.L.Yu(Eds.),Thehandbookofphonologicaltheory(2nded.).Oxford,UK:Wiley-Blackwell.http://dx.doi.org/10.1002/9781444343069.ch2
Batchelder, E. O. (2002). Bootstrapping the lexicon: A computational model of infant speech segmentation. Cognition, 83, 167-206. http://dx.doi.org/10.1016/S0010-0277(02)00002-1
Baum, L. E., & Petrie, T. (1966). Statistical inference for probabilistic functions of finite state Markov chains. The Annals of Mathematical Statistics, 37(6), 1554-1563. http://dx.doi.org/10.1214/aoms/1177699147
Berger, A., Della Pietra, S., & Della Pietra, V. (1996). A maximum entropy approach to natural language processing. Computational Linguistics, 22(1), 39-71.
Blanchard, D., Heinz, J., & Golinkoff, R. (2010). Modeling the contribution of phonotactic cues to the problem of word segmentation. Journal of Child Language, 37, 487-511. http://dx.doi.org/10.1017/S030500090999050X
Boersma, P. (2003). [Review of the book Learnability in Optimality Theory, by B. Tesar & P. Smolensky]. Phonology, 20, 436-446. http://dx.doi.org/10.1017/S0952675704230111
Boersma, P., & Hayes, B. (2001). Empirical tests of the Gradual Learning Algorithm. Linguistic Inquiry, 32, 45-86. http://dx.doi.org/10.1162/002438901554586
Bortfeld, H., Morgan, J. L., Golinkoff, R. M., & Rathbun, K. (2005). Mommy and me: Familiar names help launch babies into speech stream segmentation. Psychological Science, 16, 298-304. http://dx.doi.org/10.1111/j.0956-7976.2005.01531.x
Brent, M. R., & Cartwright, T. A. (1996). Distributional regularity and phonotactic constraints are useful for segmentation. Cognition, 61, 93-125. http://dx.doi.org/10.1016/S0010-0277(96)00719-6
Brown, P., & Mercer, R. (2013). Twenty years of Bitext [Transcription and slides]. Invited talk, EMNLP Workshop "Twenty Years of Bitext", Seattle, WA. Retrieved from: http://cs.jhu.edu/~post/bitext/
Brown, R. (1973). A first language: The early stages. Cambridge, MA: Harvard University Press.
Buccola, B., & Sonderegger, M. (2013). On the expressivity of Optimality Theory versus rules: An application to opaque patterns. Refereed presentation at the meeting Phonology 2013, UMass Amherst, 9 November 2013.
Cairns, P., Shillcock, R. C., Chater, N., & Levy, J. (1997). Bootstrapping word boundaries: A bottom-up corpus-based approach to speech segmentation. Cognitive Psychology, 33, 111-153. http://dx.doi.org/10.1006/cogp.1997.0649
Chomsky, N. (1956). Three models for the description of language. IRE Transactions on Information Theory, 2, 113-124. http://dx.doi.org/10.1109/TIT.1956.1056813
Chomsky, N. (1959). [Review of the book Verbal Behavior, by B. F. Skinner]. Language, 35(1), 26-58. http://dx.doi.org/10.2307/411334
Chomsky, N., & Halle, M. (1968). The sound pattern of English. New York: Harper & Row.
Christiansen, M. H., Allen, J., & Seidenberg, M. S. (1998). Learning to segment speech using multiple cues: A connectionist model. Language and Cognitive Processes, 13(2-3), 221-268. http://dx.doi.org/10.1080/016909698386528
Coetzee, A. W., & Pater, J. (2011). The place of variation in phonological theory. In J. Goldsmith, J. Riggle, & A. C. L. Yu (Eds.), The handbook of phonological theory (2nd ed.). Oxford, UK: Wiley-Blackwell. http://dx.doi.org/10.1002/9781444343069.ch13
Coleman, J., & Pierrehumbert, J. (1997). Stochastic phonological grammars and acceptability. In 3rd Meeting of the ACL Special Interest Group in Computational Phonology: Proceedings of the Workshop, 12 July 1997 (pp. 49-56). Somerset, NJ: Association for Computational Linguistics.
Daland, R. (2009). Word segmentation, word recognition, and word learning: A computational model of first language acquisition (Unpublished doctoral dissertation). Northwestern University, IL. Retrieved from: http://www.linguistics.northwestern.edu/docs/dissertations/dalandDissertation.pdf
Daland, R. (2013). Variation in child-directed speech: A case study of manner class frequencies. Journal of Child Language, 40(5), 1091-1122. http://dx.doi.org/10.1017/S0305000912000372
Daland, R., & Pierrehumbert, J. B. (2011). Learning diphone-based segmentation. Cognitive Science, 35(1), 119-155. http://dx.doi.org/10.1111/j.1551-6709.2010.01160.x
Daland, R., & Zuraw, K. (2013). Does Korean defeat phonotactic word segmentation? Short paper presented at the 51st Annual Meeting of the Association for Computational Linguistics, Sofia, Bulgaria, August 4-9, 2013.
Dale, P. S., & Fenson, L. (1996). Lexical development norms for young children. Behavior Research Methods, Instruments, & Computers, 28, 125-127. http://dx.doi.org/10.3758/BF03203646
Daugherty, K., & Seidenberg, M. S. (1992). Rules or connections? The past tense revisited. In Proceedings of the Fourteenth Annual Conference of the Cognitive Science Society (pp. 259-264). Hillsdale, NJ: Erlbaum.
Eisner, J. (2002). Parameter estimation for probabilistic finite-state transducers. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (pp. 1-8). East Stroudsburg, PA: Association for Computational Linguistics. http://dx.doi.org/10.3115/1073083.1073085
Ellison, M. T. (1994). Phonological derivation in optimality theory. In Proceedings of the 15th International Conference on Computational Linguistics (COLING) (Vol. 2, pp. 1007-1013). Kyoto, Japan: Association for Computational Linguistics. http://dx.doi.org/10.3115/991250.991312
Elman, J. L. (1990). Finding structure in time. Cognitive Science, 14, 179-211. http://dx.doi.org/10.1207/s15516709cog1402_1
Elman, J. L., & McClelland, J. L. (1985). An architecture for parallel processing in speech recognition: The TRACE model. In M. R. Schroeder (Ed.), Speech recognition (pp. 6-35). Göttingen, Germany: Biblioteca Phonetica.
Fleck, M. M. (2008). Lexicalized phonotactic word segmentation. In Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (pp. 130-138). Madison, WI: Omnipress.
Fourtassi, A., Börschinger, B., Johnson, M., & Dupoux, E. (2013). Why is English so easy to segment. In Proceedings of the 4th Workshop on Cognitive Modeling and Computational Linguistics (pp. 1-10). Sofia, Bulgaria, August 8, 2013.
Frank, R., & Satta, G. (1998). Optimality theory and the computational complexity of constraint violability. Computational Linguistics, 24, 307-315.
Gold, E. M. (1967). Language identification in the limit. Information and Control, 10(5), 447-474. http://dx.doi.org/10.1016/S0019-9958(67)91165-5
Goldsmith, J. (1976). Autosegmental phonology (Doctoral dissertation). MIT, MA.
Goldsmith, J. A. (1990). Autosegmental and metrical phonology. Oxford, UK: Basil Blackwell.
Goldwater, S. (2006). Nonparametric Bayesian models of lexical acquisition (Unpublished doctoral dissertation). Brown University, RI. Retrieved from: http://homepages.inf.ed.ac.uk/sgwater/papers/thesis_1spc.pdf
Goldwater, S., & Johnson, M. (2003). Learning OT constraint rankings using a maximum entropy model. In Proceedings of the Workshop on Variation within Optimality Theory (pp. 113-122). Stockholm University, Sweden.
Goldwater, S., Griffiths, T. L., & Johnson, M. (2009). A Bayesian framework for word segmentation: Exploring the effects of context. Cognition, 112(1), 21-54. http://dx.doi.org/10.1016/j.cognition.2009.03.008
Graf, T. (2010a). Comparing incomparable frameworks: A model theoretic approach to phonology. University of Pennsylvania Working Papers in Linguistics, 16(1), art. 10. Retrieved from: http://repository.upenn.edu/pwpl/vol16/iss1/10
Graf, T. (2010b). Formal parameters of phonology: From Government Phonology to SPE. In T. Icard & R. Muskens (Eds.), Interfaces: Explorations in logic, language and computation, Lecture Notes in Artificial Intelligence 6211 (pp. 72-86). Berlin, Germany: Springer.
Hayes, B. (2004). Phonological acquisition in Optimality Theory: The early stages. In R. Kager, J. Pater, & W. Zonneveld (Eds.), Fixing priorities: Constraints in phonological acquisition (pp. 158-203). Cambridge, UK: Cambridge University Press.
Hayes, B. (2011). Interpreting sonority-projection experiments: The role of phonotactic modeling. In Proceedings of the 17th International Congress of Phonetic Sciences (ICPhS 11, Hong Kong) (pp. 835-838). Hong Kong, PRC.
Hayes, B., Zuraw, K., Siptár, P., & Londe, Z. C. (2009). Natural and unnatural constraints in Hungarian vowel harmony. Language, 85(4), 822-863. http://dx.doi.org/10.1353/lan.0.0169
Hayes, B., & White, J. (2013). Phonological naturalness and phonotactic learning. Linguistic Inquiry, 44(1), 45-75. http://dx.doi.org/10.1162/LING_a_00119
Hayes, B., & Wilson, C. (2008). A maximum entropy model of phonotactics and phonotactic learning. Linguistic Inquiry, 39(3), 379-440. http://dx.doi.org/10.1162/ling.2008.39.3.379
Heinz, J. (2007). The inductive learning of phonotactic patterns (Doctoral dissertation). University of California, Los Angeles.
Heinz, J. (2010). String extension learning. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (pp. 897-906). Uppsala, Sweden.
Heinz, J. (2011a). Computational phonology - Part I: Foundations. Language and Linguistics Compass, 5(4), 140-152. http://dx.doi.org/10.1111/j.1749-818X.2011.00269.x
Heinz, J. (2011b). Computational phonology - Part II: Grammars, learning, and the future. Language and Linguistics Compass, 5(4), 153-168. http://dx.doi.org/10.1111/j.1749-818X.2011.00268.x
Heinz, J., & Koirala, C. (2010). Maximum likelihood estimation of feature-based distributions. In Proceedings of the 11th Meeting of the ACL Special Interest Group on Computational Morphology and Phonology (pp. 28-37). Uppsala, Sweden.
Heinz, J., & Lai, R. (2013). Vowel harmony and subsequentiality. In A. Kornai & M. Kuhlmann (Eds.), Proceedings of the 13th Meeting on Mathematics of Language. Sofia, Bulgaria.
Hirschberg, J. (1998). 'Every time I fire a linguist, my performance goes up', and other myths of the statistical natural language processing revolution. Invited talk, 15th National Conference on Artificial Intelligence, Madison, WI.
Jardine, A. (in press). Logic and the generative power of autosegmental phonology. In Supplemental Proceedings of Phonology 2013. Retrieved from: https://sites.google.com/site/adamajardine/research-interests
Jarosz, G. (2006). Rich lexicons and restrictive grammars - Maximum likelihood learning in Optimality Theory (Doctoral dissertation). Johns Hopkins University. Retrieved from Rutgers Optimality Archive No. 884.
Jarosz, G. (2013). Learning with hidden structure in Optimality Theory and Harmonic Grammar: Beyond Robust Interpretive Parsing. Phonology, 30, 27-71. http://dx.doi.org/10.1017/S0952675713000031
Jelinek, F., Bahl, L., & Mercer, R. (1975). Design of a linguistic statistical decoder for the recognition of continuous speech. IEEE Transactions on Information Theory, 21(3), 250-256. http://dx.doi.org/10.1109/TIT.1975.1055384
Jusczyk, P. W., Hohne, E. A., & Bauman, A. (1999). Infants' sensitivity to allophonic cues for word segmentation. Perception and Psychophysics, 61, 1465-1476. http://dx.doi.org/10.3758/BF03213111
Jusczyk, P. W., Houston, D. M., & Newsome, M. (1999). The beginnings of word segmentation in English-learning infants. Cognitive Psychology, 39(3-4), 159-207. http://dx.doi.org/10.1006/cogp.1999.0716
Kaplan, R. M., & Kay, M. (1994). Regular models of phonological rule systems. Computational Linguistics, 20(3), 331-379.
Karttunen, L. (1998). The proper treatment of optimality theory in computational phonology. In Proceedings of the International Workshop on Finite State Methods in Natural Language Processing (pp. 1-12). Ankara, Turkey: Association for Computational Linguistics.
Kearns, M., & Valiant, L. (1994). Cryptographic limitations on learning boolean formulae and finite automata. Journal of the ACM, 41(1), 67-95. http://dx.doi.org/10.1145/174644.174647
Larsen, D., & Heinz, J. (2012). Neutral vowels in sound-symbolic vowel harmony in Korean. Phonology, 29, 433-464. http://dx.doi.org/10.1017/S095267571200022X
Legendre, G., Miyata, Y., & Smolensky, P. (1990). Harmonic Grammar: A formal multi-level connectionist theory of linguistic well-formedness: An application. In Proceedings of the Twelfth Annual Conference of the Cognitive Science Society (pp. 884-891). Cambridge, MA: Lawrence Erlbaum.
Li, M., & Vitányi, P. M. B. (1991). Learning simple concepts under simple distributions. SIAM Journal of Computing, 20, 911-935. http://dx.doi.org/10.1137/0220056
Lignos, C. (2012). Infant word segmentation: An incremental, integrated model. In Proceedings of the West Coast Conference on Formal Linguistics, 30, April 13-15, 2012.
MacWhinney, B. (2000). The CHILDES Project: Tools for analyzing talk. Volume I: Transcription format and programs. Volume II: The database. Mahwah, NJ: Lawrence Erlbaum.
Magri, G. (2012). Constraint promotion: Not only convergent but also efficient. In CLS 48: Proceedings of the 48th Annual Conference of the Chicago Linguistic Society. Chicago, IL.
Magri, G. (in press). Error-driven and batch models of the acquisition of phonotactics: David defeats Goliath. In Phonology 2013: Proceedings of the 2013 Phonology Conference, November 8-10, 2013, Amherst, MA.
Mandel, D. R., Jusczyk, P. W., & Pisoni, D. B. (1995). Infants' recognition of the sound patterns of their own names. Psychological Science, 6, 315-318. http://dx.doi.org/10.1111/j.1467-9280.1995.tb00517.x
Marcus, G. F. (1995). The acquisition of the English past tense in children and multi-layered connectionist networks. Cognition, 56, 271-279. http://dx.doi.org/10.1016/0010-0277(94)00656-6
Marcus, G. F., Brinkmann, U., Clahsen, H., Wiese, R., & Pinker, S. (1996). German inflection: The exception that proves the rule. Cognitive Psychology, 29, 189-256. http://dx.doi.org/10.1006/cogp.1995.1015
Mattys, S. L., & Jusczyk, P. W. (2001). Phonotactic cues for segmentation of fluent speech by infants. Cognition, 78, 91-121. http://dx.doi.org/10.1016/S0010-0277(00)00109-8
McCarthy, J. J. (1981). A prosodic theory of non-concatenative morphology. Linguistic Inquiry, 12(3), 373-418.
McCarthy, J. J. (2008). The gradual path to cluster simplification. Phonology, 25, 271-319. http://dx.doi.org/10.1017/S0952675708001486
McCarthy, J. J. (2011). Autosegmental spreading in Optimality Theory. In J. Goldsmith, A. E. Hume, & L. Wetzels (Eds.), Tones and features (pp. 195-222). Berlin, Germany: Mouton de Gruyter. http://dx.doi.org/10.1515/9783110246223.195
McCarthy, J. J., & Prince, A. (1994). The emergence of the unmarked: Optimality in prosodic morphology. In Proceedings of the North East Linguistics Society 24. Amherst, MA.
Pearl, L., Goldwater, S., & Steyvers, M. (2011). Online learning mechanisms for Bayesian models of word segmentation. Research on Language and Computation, 8(2-3), 107-132. http://dx.doi.org/10.1007/s11168-011-9074-5
Phillips, L., & Pearl, L. (2012). Syllable-based Bayesian inference: A (more) plausible model of word segmentation. Workshop on Psychocomputational Models of Human Language Acquisition. Portland, OR.
Pierrehumbert, J. (1994). Syllable structure and word structure: A study of triconsonantal clusters in English. In P. Keating (Ed.), Papers in laboratory phonology III: Phonological structure and phonetic form (pp. 168-188). Cambridge, UK: Cambridge University Press.
Pinker, S., & Prince, A. (1988). On language and connectionism: Analysis of a Parallel Distributed Processing model of language acquisition. Cognition, 28(1-2), 73-193. http://dx.doi.org/10.1016/0010-0277(88)90032-7
Plunkett, K., & Marchman, V. (1991). U-shaped learning and frequency effects in a multi-layered perceptron: Implications for child language acquisition. Cognition, 38, 43-102. http://dx.doi.org/10.1016/0010-0277(91)90022-V
Potts, C., Pater, J., Jesney, K., Bhatt, R., & Becker, M. (2010). Harmonic Grammar with linear programming: From linear systems to linguistic typology. Phonology, 27, 77-117. http://dx.doi.org/10.1017/S0952675710000047
Potts, C., & Pullum, G. K. (2002). Model theory and the content of OT constraints. Phonology, 19, 361-393.
Prince, A., & Smolensky, P. (1993). Optimality Theory: Constraint interaction in generative grammar. Technical Report, RuCCS, Rutgers University, New Brunswick, NJ. Published in 2004 by Blackwell.
Prince, A., & Smolensky, P. (2002). Optimality Theory: Constraint interaction in generative grammar. Retrieved from: roa.rutgers.edu/files/537-0802/537-0802-PRINCE-0-0.PDF
Prince, A., & Smolensky, P. (2004). Optimality Theory: Constraint interaction in generative grammar. Oxford: Blackwell.
Riggle, J. (2004). Generation, recognition, and learning in finite state Optimality Theory (Doctoral dissertation). UCLA, CA.
Riggle, J. (2009). Violation semirings in Optimality Theory. Research on Language and Computation, 7(1), 1-12. http://dx.doi.org/10.1007/s11168-009-9063-0
Riggle, J., & Wilson, C. (2005). Local optionality. In Proceedings of NELS 35. Amherst, MA: GLSA.
Rumelhart, D. E., & McClelland, J. L. (1986). On learning the past tenses of English verbs: Implicit rules or parallel distributed processing? In J. L. McClelland, D. E. Rumelhart, & the PDP Research Group (Eds.), Parallel distributed processing: Explorations in the microstructure of cognition (Vol. 2, pp. 216-271). Cambridge, MA: MIT Press.
Saffran, J. R., Aslin, R. N., & Newport, E. L. (1996). Statistical learning by 8-month-old infants. Science, 274, 1926-1928. http://dx.doi.org/10.1126/science.274.5294.1926
Shieber, S. M. (1985). Evidence against the context-freeness of natural language. Linguistics and Philosophy, 8(3), 333-343. http://dx.doi.org/10.1007/BF00630917
Smolensky, P., & Legendre, G. (2006). The harmonic mind: From neural computation to Optimality-Theoretic grammar (Vol. 1: Cognitive architecture, pp. xvii-563; Vol. 2: Linguistic and philosophical implications, pp. xvii-611). Cambridge, MA: MIT Press.
Stabler, E. (2009). Computational models of language universals. In M. H. Christiansen, C. Collins, & S. Edelman (Eds.), Language universals (Rev. ed., pp. 200-223). Oxford, UK: Oxford University Press. http://dx.doi.org/10.1093/acprof:oso/9780195305432.003.0010
Strauss, T. J., Harris, H. D., & Magnuson, J. S. (2007). jTRACE: A reimplementation and extension of the TRACE model of speech perception and spoken word recognition. Behavior Research Methods, 39(1), 19-30. http://dx.doi.org/10.3758/BF03192840
Tesar, B., & Smolensky, P. (2000). Learnability in Optimality Theory. Cambridge, MA: MIT Press.
Twain, M. (2006). Chapters from my autobiography - XX. North American Review DCXVIII. Project Gutenberg. Retrieved from: http://www.gutenberg.org/files/19987/19987-h/19987-h.htm (Original work published 1906).
Valiant, L. G. (1984). A theory of the learnable. Communications of the ACM, 27, 1134-1142. http://dx.doi.org/10.1145/1968.1972
Viterbi, A. J. (1967). Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Transactions on Information Theory, 13(2), 260-269. http://dx.doi.org/10.1109/TIT.1967.1054010
Xanthos, A. (2004). Combining utterance-boundary and predictability approaches to speech segmentation. In W. G. Sakas (Ed.), Proceedings of the First Workshop on Psycho-computational Models of Language Acquisition at COLING 2004 (pp. 93-100). Geneva, Switzerland.
Zipf, G. K. (1935). The psychobiology of language. Boston, MA: Houghton-Mifflin.
Zipf, G. K. (1949). Human behavior and the principle of least effort. Cambridge, MA: Addison-Wesley Press.
Zuraw, K. (2006). Using the web as a phonological corpus: A case study from Tagalog. In EACL-2006: Proceedings of the 11th Conference of the European Chapter of the Association for Computational Linguistics / Proceedings of the 2nd International Workshop on Web As Corpus (pp. 59-66). http://dx.doi.org/10.3115/1628297.1628306