Empirical Project Monitor: A Tool for Mining Multiple Project Data



Masao Ohira, Reishi Yokomori, Makoto Sakai, Ken-ichi Matsumoto, Katsuro Inoue, Koji Torii

Nara Institute of Science and Technology

ohira@empirical.jp, {matumoto,torii}@is.aist-nara.ac.jp

Graduate School of Information Science and Technology, Osaka University

{yokomori,inoue}@ist.osaka-u.ac.jp

SRA Key Technology Laboratory, Inc.

sakai@sra.co.jp

Abstract

Project management for effective software process improvement must be based on quantitative data. However, because data collection for measurement requires high costs and collaboration with developers, it is difficult to collect coherent, quantitative data continuously and to utilize the data for practicing software process improvement. In this paper, we describe Empirical Project Monitor (EPM), which automatically collects and measures data from three kinds of repositories in widely used software development support systems: configuration management systems, mailing list managers, and issue tracking systems. By providing integrated measurement results graphically, EPM helps developers/managers keep projects under control in real time.

1 Introduction

In software development in recent years, the improvement of software processes has been gaining increasing attention. Its practice in software organizations consists of repeatedly measuring development activities, finding potential problems in the processes, assessing improvement plans, and providing feedback into the processes. Project management for effective software process improvement must be based on quantitative data.

Many software measurement methods have been proposed to better understand, monitor, control, and predict software processes and products [4]. For instance, the Goal-Question-Metric (GQM) paradigm [2] provides a sophisticated measurement technique. GQM guides practitioners to set up measurement goals, create questions based on the goals, and determine measurement models and procedures based on the questions. Measurement based on GQM is a logical and reasonable method.

However, in practice, members who participate in measurement activities need to attend to every last detail of the measurement process. Data collection for measurement in general requires high costs and collaboration with developers. It is difficult to collect coherent, quantitative data continuously and, moreover, to utilize the collected data for practicing software process improvement. Few studies have proposed measurement tools for dealing with data from a number of projects, especially in a large-scale software organization.

As a measurement-based approach to the above issues, we have been studying empirical software engineering [1, 3], which evaluates various technologies and tools based on quantitative data obtained through actual use. Our goal is to develop an environment composed of a variety of tools for supporting measurement-based software process improvement, which we call the Empirical Software Engineering Environment (ESEE).

In this paper, we introduce Empirical Project Monitor (EPM) as a partial implementation of ESEE. EPM automatically collects and measures quantitative data from three kinds of repositories in widely used software development support systems: configuration management systems, mailing list managers, and issue tracking systems. By collecting such data automatically and providing integrated measurement results graphically, EPM helps developers/managers keep their projects under control in real time.

2 Empirical Project Monitor (EPM)

We have developed Empirical Project Monitor (EPM) [9], which automatically collects and analyzes data from multiple software repositories. Figure 1 shows the architecture of EPM in the ESEE framework. The ESEE framework is designed to support measurement-based process improvement in software organizations by providing various pluggable tools. EPM consists of four components according to the ESEE framework: data collection, format translation, data store, and data analysis/visualization. This section gives an overview of EPM and the basic data flow through EPM.

Figure 1. The architecture of EPM in the ESEE framework

Automatic data collection: EPM automatically collects multiple project data from three kinds of repositories in widely used software development support systems. For instance, EPM collects versioning histories from configuration management systems (e.g. CVS¹), mail archives from mailing list managers (e.g. Mailman², Majordomo³, fml⁴), and issue tracking records from (bug) issue tracking systems (e.g. GNATS⁵, Bugzilla⁶). Because these data are accumulated through everyday development activities using common GUI tools (e.g. SourceShare™⁷, WinCVS⁸), developers/managers do not need to do additional work for data collection. Also, it does not cost much to introduce EPM into projects/organizations because the systems serving as the sources of data collection are open source freeware.
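As a rough illustration of the collection step for a configuration management system, the sketch below turns text in the style of `cvs log` output into checkin records. The function name `parse_cvs_log`, the record fields, and the sample log are assumptions made for this example; the paper does not describe how EPM itself reads CVS histories.

```python
import re
from datetime import datetime

# Line that opens each revision block in `cvs log` style output.
REV_RE = re.compile(r"^revision (?P<rev>[\d.]+)$")
# Metadata line that follows it: "date: ...;  author: ...;  state: ...;"
META_RE = re.compile(
    r"^date: (?P<date>\d{4}[/-]\d{2}[/-]\d{2} \d{2}:\d{2}:\d{2})[^;]*;"
    r"\s+author: (?P<author>[^;]+);"
)


def parse_cvs_log(text):
    """Return one record per revision found in `cvs log` style text."""
    checkins = []
    current_file = None
    pending_rev = None
    for line in text.splitlines():
        if line.startswith("Working file: "):
            current_file = line[len("Working file: "):].strip()
            continue
        match = REV_RE.match(line)
        if match:
            pending_rev = match.group("rev")
            continue
        match = META_RE.match(line)
        if match and pending_rev is not None:
            stamp = match.group("date").replace("-", "/")
            checkins.append({
                "file": current_file,
                "revision": pending_rev,
                "author": match.group("author").strip(),
                "time": datetime.strptime(stamp, "%Y/%m/%d %H:%M:%S"),
            })
            pending_rev = None
    return checkins


# Hypothetical sample input, written by hand for this sketch.
SAMPLE_LOG = """\
RCS file: /cvsroot/demo/src/main.c,v
Working file: src/main.c
head: 1.2
----------------------------
revision 1.2
date: 2004/02/10 09:15:00;  author: alice;  state: Exp;  lines: +12 -3
Fix null check.
----------------------------
revision 1.1
date: 2004/02/01 18:30:00;  author: bob;  state: Exp;
Initial revision
=============================================================================
"""

if __name__ == "__main__":
    for record in parse_cvs_log(SAMPLE_LOG):
        print(record)
```

Because the input is the history that the repository already keeps, a collector of this kind needs no extra action from developers, which is the point made in the paragraph above.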

¹ CVS

² Mailman

³ Majordomo, /majordomo/

⁴ fml, /index.html.en

⁵ GNATS, /software/gnats/

⁶ Bugzilla

⁷ SourceShare™

⁸ WinCVS

Format translation and data store: EPM converts the collected data into an XML format called the standardized empirical software engineering data, so that EPM can deal with not only the above three kinds of software repositories but also various other kinds of repositories, depending on the purpose of measurement. Data from other systems become available through small adjustments of parameters. The data converted into the XML format are stored in a PostgreSQL⁹ database.

Analysis and visualization: EPM analyzes the data stored in the PostgreSQL database. For instance, to analyze data related to CVS, EPM extracts process data about events such as checkin/checkout, transitions of source code size, version histories of components, and so forth. Then, EPM visualizes various measurement results such as the growth of lines of code and the relationship between checkins and checkouts. EPM also provides summaries of each repository, such as information from CVS logs. All the measurement results are available through common web browsers (e.g. see Figure 2), so that users can easily share the results.
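The translation and storage steps described above can be sketched as follows. This is only an illustration under stated assumptions: the `event` element layout is invented for the example (the paper does not give the actual XML schema), and SQLite from the Python standard library stands in for the PostgreSQL database so that the sketch runs without a server.

```python
import sqlite3
import xml.etree.ElementTree as ET

FIELDS = ("project", "actor", "time", "target")


def to_xml(record):
    """Wrap one repository event in a common (hypothetical) XML element."""
    event = ET.Element("event", {"source": record["source"], "kind": record["kind"]})
    for key in FIELDS:
        ET.SubElement(event, key).text = record[key]
    return ET.tostring(event, encoding="unicode")


def store(conn, xml_text):
    """Read the common XML form back and insert it as one database row."""
    event = ET.fromstring(xml_text)
    conn.execute(
        "INSERT INTO events VALUES (?, ?, ?, ?, ?, ?)",
        (event.get("source"), event.get("kind"))
        + tuple(event.findtext(key) for key in FIELDS),
    )


def demo():
    """Push events from three kinds of repositories through the same path."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events "
        "(source TEXT, kind TEXT, project TEXT, actor TEXT, time TEXT, target TEXT)"
    )
    # Hand-written sample events, not real project data.
    records = [
        {"source": "cvs", "kind": "checkin", "project": "demo",
         "actor": "alice", "time": "2004-02-10T09:15:00", "target": "src/main.c"},
        {"source": "gnats", "kind": "issue-opened", "project": "demo",
         "actor": "bob", "time": "2004-02-11T10:00:00", "target": "issue-12"},
        {"source": "mailman", "kind": "mail", "project": "demo",
         "actor": "carol", "time": "2004-02-11T10:30:00", "target": "dev-list"},
        {"source": "cvs", "kind": "checkin", "project": "demo",
         "actor": "bob", "time": "2004-02-12T14:00:00", "target": "src/util.c"},
    ]
    for record in records:
        store(conn, to_xml(record))
    return conn.execute(
        "SELECT source, COUNT(*) FROM events GROUP BY source ORDER BY source"
    ).fetchall()


if __name__ == "__main__":
    print(demo())
```

Routing every source through one intermediate form means that supporting another repository type only requires a new producer of `event` elements; the storage and analysis code stays unchanged.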

In this way, EPM helps users obtain quantitative data at low cost in real time and provides them with various measurement results for understanding the current development status. This helps users keep their projects under control.

3 Visualizations of measurement results

Data mining techniques for software repositories have been proposed to understand the reasons for software changes [7], to identify how communication delays among developers in physically distributed environments affect software development [8], to detect potential software changes and incomplete changes [11], and so forth. In contrast to these tools, the distinguishing features of EPM are that it visualizes combinations of measurement results from three kinds of software repositories and that it can deal with data from multiple projects simultaneously.

Figure 2. Measurement results through web browsers

⁹ PostgreSQL

3.1 Combinations of measurement results

In addition to providing visualizations of measurement results from each software repository, EPM also visualizes combinations of measurement results from the three kinds of repositories. The following are two examples.

Bug issues and checkins: Figure 3 represents the relationship between the transition of the cumulative total of issues (the line graph) and the times of checkins (the gray vertical lines on the X-axis) in our EASE project [6]. The numbers of issues and checkins are measured from the repositories in GNATS and CVS, respectively. A checkin often occurs after bug issues are reported because developers try to modify the code or resolve the issues. The graph helps users (developers/managers) recall the situation in which issues were raised for each file version. Conversely, a file checked in to CVS may itself include bugs if the graph indicates that issues were reported after the checkin.
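The reading of Figure 3 described above can be approximated numerically. The sketch below computes the cumulative issue curve and, for each checkin, counts the issues reported in the days that follow. The function names, the three-day window, and the sample dates are assumptions chosen for illustration and are not taken from EPM.

```python
from bisect import bisect_right
from datetime import date, timedelta


def cumulative_issues(issue_dates, days):
    """Cumulative number of issues reported on or before each day."""
    ordered = sorted(issue_dates)
    return [bisect_right(ordered, day) for day in days]


def issues_after_checkins(checkin_dates, issue_dates, window_days=3):
    """For each checkin day, count issues reported in the following window."""
    ordered = sorted(issue_dates)
    counts = {}
    for checkin in sorted(set(checkin_dates)):
        first = bisect_right(ordered, checkin)  # strictly after the checkin day
        last = bisect_right(ordered, checkin + timedelta(days=window_days))
        counts[checkin] = last - first
    return counts


# Hand-written sample data for the sketch.
ISSUES = [date(2004, 3, 2), date(2004, 3, 3), date(2004, 3, 3), date(2004, 3, 9)]
CHECKINS = [date(2004, 3, 1), date(2004, 3, 5)]
DAYS = [date(2004, 3, 1) + timedelta(days=i) for i in range(10)]

if __name__ == "__main__":
    print(cumulative_issues(ISSUES, DAYS))
    print(issues_after_checkins(CHECKINS, ISSUES))
```

A checkin followed by several new issues, such as the first one in the sample data, is the kind of file version that the graph would lead a manager to inspect.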

Bug issues and e-mails among developers: Figure 4 illustrates the communication history among developers in the EASE project. The black line graph is the transition of the cumulative total of e-mails exchanged through Mailman. The shorter/longer vertical dashed lines represent when bug issues were raised/resolved. The light-gray vertical lines indicate when files were checked in to CVS by developers. From the graph, users can confirm the state of communication among developers and identify the file versions which might have problems. Because discussions on issues usually become active when issues are reported to an issue tracking system, [...] Communication problems among developers decrease software productivity and reliability [8].

Figure 3. Relationship between issues and checkins

Figure 4. History of bug issues and e-mails among developers

The integrated measurement results based on data from configuration management systems, mailing list managers, and issue tracking systems help developers understand current and past events in development activities.

3.2 Visualizations of multiple project data

[...] Comparing current projects with past ones would be helpful for managers to estimate the progress of projects and to detect unusual status in projects.


Comparison of measurement results among multiple projects: EPM makes measurement results comparable across multiple projects. Figure 5 represents the growth of lines of code in two projects (the upper line: SPARS [10], the lower line: EASE). Both projects have been proceeding as collaborative research between the authors’ universities and some software companies. Some researchers and developers have been participating in both projects. Although the two projects actually have different purposes and aspects, suppose here that they have been developing software systems under similar conditions. The project managers can confirm some common characteristics and roughly estimate the progress of the later project (EASE) from the graph. For instance, SPARS has two phases in which it evolved rapidly for releasing major versions. EASE has just released its first major version. The managers can easily guess the near future of the progress of EASE: the development of EPM will stop for a while to test EPM, to reconsider the design, and so forth.

Figure 5. Comparison of two projects
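One way to make the comparison in Figure 5 concrete is to extract the rapid-growth phases from each project's lines-of-code history and compare how many phases each project has passed through. The sketch below is an assumption-laden illustration: the function name, the rate threshold, and both sample series are invented and do not reproduce the SPARS or EASE data.

```python
def rapid_growth_phases(samples, min_rate):
    """Return (start_day, end_day) spans where growth is at least min_rate.

    samples: list of (day_offset, lines_of_code) pairs sorted by day_offset.
    min_rate: threshold in lines of code per day.
    """
    phases = []
    for (day0, loc0), (day1, loc1) in zip(samples, samples[1:]):
        rate = (loc1 - loc0) / (day1 - day0)
        if rate >= min_rate:
            if phases and phases[-1][1] == day0:
                # Extend the previous phase when the intervals are adjacent.
                phases[-1] = (phases[-1][0], day1)
            else:
                phases.append((day0, day1))
    return phases


# Invented series: days since project start and total lines of code.
OLDER_PROJECT = [(0, 0), (30, 3000), (60, 12000), (90, 13000),
                 (120, 14000), (150, 26000), (180, 27000)]
NEWER_PROJECT = [(0, 0), (30, 2000), (60, 9000), (90, 9500)]

if __name__ == "__main__":
    print(rapid_growth_phases(OLDER_PROJECT, min_rate=200))
    print(rapid_growth_phases(NEWER_PROJECT, min_rate=200))
```

Measuring time as days since each project started puts both histories on the same axis, so a manager can see that the newer project has completed only its first rapid phase and is likely to enter a flat period next.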

Distribution maps of multiple projects: Using measurement results from three kinds of repositories in multiple projects, EPM can generate distribution maps. Figure 6 is a distribution map of 100 Open Source Software Development (OSSD) projects¹⁰,¹¹, which represents the relationship between lines of code (the X-axis) and the number of checkins (the Y-axis). Suppose here that these projects are managed by one software organization. The graph can be used to help managers identify “unusual” projects which show extremely high or low values.

¹⁰ [...]

¹¹ We selected the 100 projects in Figure 6 randomly from the list of most active projects in [...].

Figure 6. Distribution map of 100 OSSD projects
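Spotting "unusual" projects on a distribution map such as Figure 6 can also be done programmatically. The sketch below flags projects whose lines of code or number of checkins lie far from the median, measured in units of the median absolute deviation. This particular outlier rule, the threshold of 5, and the sample projects are assumptions for illustration; the paper leaves the identification to visual inspection.

```python
from statistics import median


def robust_scores(values):
    """Distance of each value from the median, in median absolute deviations."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return [0.0] * len(values)
    return [(v - med) / mad for v in values]


def unusual_projects(projects, threshold=5.0):
    """Return names of projects that are extreme on either axis.

    projects: list of (name, lines_of_code, checkins) tuples.
    """
    loc_scores = robust_scores([p[1] for p in projects])
    checkin_scores = robust_scores([p[2] for p in projects])
    return [
        name
        for (name, _, _), loc_s, chk_s in zip(projects, loc_scores, checkin_scores)
        if abs(loc_s) > threshold or abs(chk_s) > threshold
    ]


# Invented sample: (name, lines of code, number of checkins).
SAMPLE = [
    ("proj-a", 10000, 200),
    ("proj-b", 12000, 240),
    ("proj-c", 9000, 180),
    ("proj-d", 11000, 220),
    ("proj-e", 10500, 2500),   # far more checkins than its peers
    ("proj-f", 300000, 210),   # far more code than its peers
    ("proj-g", 9500, 190),
]

if __name__ == "__main__":
    print(unusual_projects(SAMPLE))
```

Median-based scores are used here instead of the mean and standard deviation because a single extreme project would otherwise inflate the spread and hide itself.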

3.3 Customizations of measurement parameters

EPM currently provides users with the [...] Using the database schema for EPM, which is open to the public, users can enter SQL statements to create bar graphs, line graphs, and distribution maps such as Figure 6. Because we would like to support various projects and organizations, each of which has its own problems, we decided to provide a minimum set of graph types and summary information rather than to provide many of them in advance. After receiving feedback from software organizations using EPM, we will add other types of graphs in the near future. Currently, EPM can be viewed as a tool for exploratory data analysis [5].
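A user-defined measurement of the kind described above might look like the following sketch, which runs one SQL query and renders the result as a text bar graph. The `checkins` table and its columns are hypothetical (the actual EPM schema is not reproduced in this paper), the rows are invented, and SQLite stands in for PostgreSQL so that the example is self-contained.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical checkins table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE checkins "
        "(project TEXT, author TEXT, committed_on TEXT, lines_added INTEGER)"
    )
    conn.executemany(
        "INSERT INTO checkins VALUES (?, ?, ?, ?)",
        [
            ("EASE", "alice", "2004-01-10", 120),
            ("EASE", "bob", "2004-01-20", 80),
            ("EASE", "alice", "2004-02-05", 300),
            ("SPARS", "carol", "2004-01-15", 50),
        ],
    )
    return conn


# The user-supplied part: lines added per month for one project.
MONTHLY_GROWTH_SQL = """
SELECT strftime('%Y-%m', committed_on) AS month, SUM(lines_added) AS added
FROM checkins
WHERE project = ?
GROUP BY month
ORDER BY month
"""


def text_bar_graph(rows, unit=50):
    """Render (label, value) rows as text bars, one '#' per `unit`."""
    return [f"{label} {'#' * (value // unit)} {value}" for label, value in rows]


if __name__ == "__main__":
    conn = build_demo_db()
    rows = conn.execute(MONTHLY_GROWTH_SQL, ("EASE",)).fetchall()
    print("\n".join(text_bar_graph(rows)))
```

Keeping the query separate from the rendering function mirrors the idea in the text: the tool supplies a few generic graph types, and each organization supplies the queries that match its own problems.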

4 Discussion

In this section, we report a case study of applying EPM to our own project, in order to observe the actual usage of the five predefined types of graphs mentioned above. We interviewed four developers on the advantages and disadvantages of using EPM. The development environment of this project is summarized in Table 1.

One of the advantages is that the graphs make it easy for developers to understand the status of the project by identifying distinctive parts of the graphs. For instance, the flat part of the line in the LoC graph reminded them why the development seemed to have stopped. In fact, all developers were on a business trip at the time. This could help them increase accountability to their managers. Another is that the graphs generated in real time motivated developers to fix bugs, since they could be aware that there were still unresolved issues.

Table 1. EPM development project

In contrast to these advantages, some problems related to the usage of EPM were found. One is that the visualizations are in some cases too complicated for understanding the status of the project. For instance, developers could not distinguish which file versions corresponded to which vertical lines in Figure 3, because one developer checked in his files to CVS every day as a backup, and therefore a large number of checkins occurred. In this case, developers might need to use two CVS repositories (e.g. one for software releases and another for backups).

The above results are still initial evaluations of EPM. EPM will be introduced in some software companies in the near future. We intend to evaluate the usefulness of EPM with respect to (1) the effects on software development and process improvement of providing measurement results from multiple software repositories, and (2) the benefit of giving the capability to manage multiple projects.

5 Conclusion and Future Work

The goal of this research is to construct an environment for supporting measurement-based software development according to the ESEE framework. In this paper, we introduced Empirical Project Monitor (EPM) as a partial implementation of ESEE, which helps developers/managers keep projects under control by providing various visualizations of measurement results related to project activities. Nowadays, we can gather and analyze massive data on software development on a large scale using rapidly growing hardware capabilities. By analyzing such huge data collected from thousands of software development projects, we would like to provide useful knowledge and benefit not only to individual developers/managers but also to organizations.

Empirical study on software development is an active area in the field of Empirical Software Engineering (ESE). However, the approaches of ESE have not been sufficiently applied to software development in the software industry, although companies have many problems. Data related to software development in the industrial world have seldom been provided for university research. We are collaborating with some software development companies in the EASE project. We therefore expect it to be a strong trigger for overcoming obstacles to technical progress in software engineering.

Acknowledgment

This work is supported by the Comprehensive Development of e-Society Foundation Software program of the Ministry of Education, Culture, Sports, Science and Technology. We thank Satoru Iwamura, Eiji Ono and Taira Shinkai for supporting the development of Empirical Project Monitor.

References

[1] A. Aurum, R. Jeffery, C. Wohlin, and M. Handzic. Managing Software Engineering Knowledge. Springer, Germany, 2003.

[2] V. Basili. Goal Question Metric Paradigm, in Encyclopedia of Software Engineering (J. Marciniak, ed.), pages 528–532. John Wiley and Sons, 1994.

[3] V. Basili. The experimental software engineering group: A perspective. ICSE’00 award presentation, June 2000. Limerick, Ireland.

[4] L. Briand, C. Differding, and D. Rombach. Practical guidelines for measurement-based process improvement. Technical Report ISERN-96-05, Department of Computer Science, University of Kaiserslautern, Germany, 1996.

[5] S. Card, J. Mackinlay, and B. Shneiderman. Readings in Information Visualization: Using Vision to Think. Morgan-Kaufmann Publishers, San Mateo, CA, 1999.

[6] EASE. The EASE (Empirical Approach to Software Engineering) project, http://www.empirical.jp/intex-e.html.

[7] D. German and A. Mockus. Automating the measurement of open source projects. In Proceedings of the 3rd Workshop on Open Source Software Engineering, pages 63–67, Portland, Oregon, 2003.

[8] J. D. Herbsleb, A. Mockus, T. A. Finholt, and R. E. Grinter. An empirical study of global software development: Distance and speed. In Proceedings of the 23rd International Conference on Software Engineering (ICSE’01), pages 81–90, Toronto, Canada, 2001.

[9] M. Ohira, R. Yokomori, M. Sakai, K. Matsumoto, K. Inoue, and K. Torii. Empirical project monitor: Automatic data collection and analysis toward software process improvement. In Proceedings of 1st Workshop on Dependable Software System, pages 141–150, Tokyo, Japan, 2004.

[10] SPARS. The SPARS (Software Product Archiving and Retrieving System) project, http://iip-lab.ics.es.osaka-u.ac.jp/SPARS/index.html.en.

[11] T. Zimmermann, P. Weissgerber, S. Diehl, and A. Zeller. Mining version histories to guide software changes. In Proceedings of the 26th International Conference on Software Engineering (ICSE’04), Edinburgh, Scotland, UK, 2004 (to appear).
