XLE/LFG parsing → Discriminant
LFG Parsebanker
TREPIL Norwegian Treebank Pilot Project
Introduction
We present the LFG Parsebanker, a comprehensive toolkit for interactive incremental construction of a treebank as a parsed corpus. The tool which we have developed supports the process flow in semi-automatic treebank construction, as illustrated in the following scheme:
XLE/LFG parsing → Discriminant disambiguation → Database storage
The toolkit has the following components:
XLE-Web, an interface to the XLE parser on a web page; this interface includes a new display of packed structures and offers discriminants [1], designed and implemented for LFG grammars, to select an analysis;
a parsebanking page which offers views and disambiguation as in XLE-Web, but also additional parsebank management operations, such as subcorpus and grammar selection and a search window based on TigerSearch extended for f-structures;
an overview page providing navigation, information and sorting of utterances;
a discriminant statistics page displaying statistics on chosen discriminants.
Most of these components are implemented in Common Lisp and use XML, XSLT and JavaScript to serve the interface web pages. C-structure trees (and graphs) are drawn using Scalable Vector Graphics (SVG), and MySQL is used to store the parsebank.
Disambiguation with discriminants
In building a treebank, the annotator’s choice between different possible grammatical structures is complicated by several factors. A major challenge is the sheer number of possible structures, which may run into the hundreds or thousands for longer sentences. Another challenge is the high level of detail recorded in the structures, which is desirable in the treebank but can be daunting for the annotator. Consider the f-structures in (2) for the sentence in example (1), where hver dag can be an object or an adjunct.
(1) Barn-a leker hver dag.
    child-DEF.PL play every day
    “The children play every day.”
(2)
The difference indicated with green shading in the structures in (2) is presented to the annotator as the choice in (3). These simple, local differences are called discriminants [1]. By choosing whether hver dag is an object or an adjunct, the annotator decides on the intended analysis but avoids examining the whole, complicated structures.
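As a minimal sketch of the idea (not the Parsebanker's own code, which is mostly written in Common Lisp), suppose each analysis has been flattened into a set of simple properties. A discriminant is then any property that holds for some analyses but not for all of them. The property strings and analysis names below are invented for example (1).

```python
def find_discriminants(analyses):
    """Return the properties that hold for some, but not all, analyses."""
    all_props = set().union(*analyses.values())
    shared = set.intersection(*analyses.values())
    return sorted(all_props - shared)


# Hypothetical flattened f-structure properties for "Barna leker hver dag."
analyses = {
    "A1": {"PRED leke", "SUBJ barn", "OBJ dag"},
    "A2": {"PRED leke", "SUBJ barn", "ADJUNCT dag"},
}

discriminants = find_discriminants(analyses)
print(discriminants)  # ['ADJUNCT dag', 'OBJ dag']
```

The shared properties (here the predicate and the subject) never reach the annotator; only the local OBJ/ADJUNCT difference does.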
(3)
NULL OBJ
NULL ADJUNCT
Parsebanking interface with discriminant disambiguation
The interface for identifying the intended analysis is shown in the following screenshot. Here we see the list of discriminants on the left, the packed constituent structure in the middle, and the packed functional structure on the right. The analyses shown are for example (4), in which til fjells has two possible attachments. The annotator basically chooses discriminants by clicking to choose or reject them, but other advanced actions are also available [3].
Victoria Rosén, Paul Meurer and Koenraad de Smedt
University of Bergen and Unifob AKSIS
(4) Ta med barn-a til fjells.
    take along child-DEF.PL to mountain-LOC
    “Take the children along to the mountains” or “Take the children in the mountains along”
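The effect of choosing or rejecting a discriminant can be sketched as a filter over the remaining analyses. This is an illustrative Python sketch under the assumption that analyses are sets of properties; the function name, analysis names and property strings for example (4) are hypothetical and do not come from the tool.

```python
def apply_decisions(analyses, chosen=(), rejected=()):
    """Keep analyses that have every chosen and no rejected discriminant."""
    return {
        name: props
        for name, props in analyses.items()
        if all(d in props for d in chosen)
        and not any(d in props for d in rejected)
    }


# Hypothetical properties for the two attachments of "til fjells" in (4).
analyses = {
    "verb-attachment": {"til fjells modifies ta med"},
    "noun-attachment": {"til fjells modifies barna"},
}

# The annotator rejects the reading "the children in the mountains".
remaining = apply_decisions(analyses, rejected=["til fjells modifies barna"])
print(sorted(remaining))  # ['verb-attachment']
```

When a single analysis remains, the sentence is disambiguated; with longer sentences several decisions may be needed before that point is reached.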
Discriminant types
1. Lexical discriminant (a word form and its part of speech)
2. Morphological discriminant (a base form with its tags from morphological preprocessing)
3. C-structure discriminant (a labeled or unlabeled bracketing of a substring)
4. F-structure discriminant (a minimal path through an f-structure)
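One possible way to represent the four types in code is shown below. This is only a sketch: the class names, the textual encodings and the tag names are assumptions made for illustration, not the toolkit's internal format.

```python
from dataclasses import dataclass
from enum import Enum


class DiscriminantType(Enum):
    LEXICAL = "lexical"              # word form + part of speech
    MORPHOLOGICAL = "morphological"  # base form + morphological tags
    C_STRUCTURE = "c-structure"      # bracketing of a substring
    F_STRUCTURE = "f-structure"      # minimal path through an f-structure


@dataclass(frozen=True)
class Discriminant:
    kind: DiscriminantType
    description: str


# Hypothetical instances; the notation and tags are invented.
examples = [
    Discriminant(DiscriminantType.LEXICAL, "leker: V"),
    Discriminant(DiscriminantType.MORPHOLOGICAL, "leke +Verb +Pres"),
    Discriminant(DiscriminantType.C_STRUCTURE, "[PP til fjells]"),
    Discriminant(DiscriminantType.F_STRUCTURE, "PRED leke > OBJ > PRED dag"),
]

for d in examples:
    print(f"{d.kind.value:14} {d.description}")
```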
Treebank overview page
The overview page, shown in the following screenshot, lists all sentences in the corpus together with information about number of parse solutions, whether the analysis is fragmented, number of discriminants, number of chosen analyses, sentence length, and whether the chosen analysis is the intended one. Any comments added by the annotator during the disambiguation process are also shown.
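The columns listed above map naturally onto one record per sentence, which can then be sorted for navigation. The following Python sketch assumes such a record; the field names, the sort order and all values are invented for illustration and are not taken from the actual page or database.

```python
from dataclasses import dataclass


@dataclass
class OverviewRow:
    sentence: str
    solutions: int        # number of parse solutions
    fragmented: bool      # whether the analysis is fragmented
    discriminants: int    # number of discriminants
    chosen_analyses: int  # number of chosen analyses
    length: int           # sentence length in words
    intended: bool        # whether the chosen analysis is the intended one
    comment: str = ""     # annotator comment, if any


# Invented values, for illustration only.
rows = [
    OverviewRow("Barna leker hver dag.", 2, False, 2, 1, 4, True),
    OverviewRow("Ta med barna til fjells.", 4, False, 6, 0, 5, False,
                "check PP attachment"),
]


def undisambiguated_first(rows):
    """Sort sentences without a chosen analysis first, most ambiguous on top."""
    return sorted(rows, key=lambda r: (r.chosen_analyses > 0, -r.solutions))


for row in undisambiguated_first(rows):
    print(row.sentence, row.solutions, row.comment)
```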
Discriminant statistics page
The discriminant statistics page presents a frequency list of chosen discriminants for a subcorpus. Each discriminant is listed with its type, the number of times it is chosen (i.e. marked as good) and the number of times its complement is chosen (i.e. marked as bad). (Note: The statistics shown were compiled before lexical discriminants were added to the system.)
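Such a frequency list can be derived from a log of annotator decisions. The sketch below assumes a hypothetical log format of (type, discriminant, marked-as-good) triples; the real system stores its data in MySQL and its schema is not described here, so the decisions shown are invented.

```python
from collections import Counter


def discriminant_statistics(decisions):
    """Count, per discriminant, how often it was marked good and bad."""
    good, bad = Counter(), Counter()
    for disc_type, discriminant, is_good in decisions:
        key = (disc_type, discriminant)
        (good if is_good else bad)[key] += 1
    keys = set(good) | set(bad)
    table = [(t, d, good[(t, d)], bad[(t, d)]) for t, d in keys]
    # Most frequently chosen first; ties broken alphabetically.
    return sorted(table, key=lambda row: (-row[2], row[1]))


# Invented decision log for a tiny subcorpus.
decisions = [
    ("f-structure", "OBJ dag", False),
    ("f-structure", "ADJUNCT dag", True),
    ("f-structure", "ADJUNCT dag", True),
]

for row in discriminant_statistics(decisions):
    print(row)
```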
Results and prospects
Our work builds on previous parsebanking efforts such as the Treebanker [1], Alpino [4] and LinGO Redwoods [2]. Our toolkit, however, is specifically designed for LFG grammars. We have implemented TIGER-based search on f-structures as well as c-structures, and we can train parse ranking based on our LFG discriminants.
The tool which we have developed is functional and will be further developed in the remainder of the project. Although it was originally primarily intended for Norwegian, it has been implemented in a language-independent fashion. This means that it may be used for building a treebank for any language for which a suitable LFG grammar is available.
The TREPIL project runs from April 1, 2004 to December 31, 2008. Its website is: http://gandalf.aksis.uib.no/trepil/.
References
[1] David Carter. The TreeBanker: A tool for supervised training of parsed corpora. In Proceedings of the Fourteenth National Conference on Artificial Intelligence, pages 598–603, Providence, Rhode Island, 1997.
[2] Stephan Oepen, Dan Flickinger, Kristina Toutanova, and Christopher D. Manning. LinGO Redwoods, a rich and dynamic treebank for HPSG. Research on Language & Computation, 2(4):575–596, December 2004.
[3] Victoria Rosén, Koenraad De Smedt, and Paul Meurer. Towards a toolkit linking treebanking to grammar development. In Proceedings of the Fifth Workshop on Treebanks and Linguistic Theories, pages 55–66, 2006.
[4] Leonoor Van der Beek, Gosse Bouma, Robert Malouf, and Gertjan Van Noord. The Alpino dependency treebank. In Computational Linguistics in the Netherlands (CLIN) 2001, Twente University, 2002.