ABSTRACT Leakage Power Modeling and Optimization in Interconnection Networks


Power will be the key limiter to system scalability as interconnection networks take up an increasingly significant portion of system power. In this paper, we propose an architectural leakage power modeling methodology that achieves 95-98 % accuracy agains

Leakage Power Modeling and Optimization in Interconnection Networks

Xuning Chen and Li-Shiuan Peh
Dept. of Electrical Engineering, Princeton University, NJ 08544
{xuningc,peh}@ee.princeton.edu

ABSTRACT

Power will be the key limiter to system scalability as interconnection networks take up an increasingly significant portion of system power. In this paper, we propose an architectural leakage power modeling methodology that achieves 95-98% accuracy against HSPICE estimates. When applied to interconnection networks, combined with previously proposed dynamic power models, we gain valuable insights on total network power consumption. Our modeling shows router buffers to be a prime candidate for leakage power optimization. We thus investigate the design space of power-aware buffer policies, propose a suite of policies, and explore the impact of various circuit mechanisms on these policies. Simulations show power-aware buffers saving up to 96.6% of total buffer leakage power.

Categories and Subject Descriptors

C.2.1 [Computer-Communication Networks]: Network architecture and design

General Terms

Measurement, Design

Keywords

Leakage power, interconnection networks, power optimization

1. INTRODUCTION

As power becomes the dominant constraint in many computer systems, research into power-efficient systems has thrived. In many of these systems, the network fabric is a significant consumer of power. This has resulted in researchers modeling [9] and optimizing [8, 10] the dynamic power consumption of interconnection networks. As technology scales to deep sub-micron processes, leakage power becomes increasingly significant as compared to dynamic power. There is thus a growing need to characterize and optimize network leakage power as well.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

ISLPED'03, August 25–27, 2003, Seoul, Korea.

Copyright 2003 ACM 1-58113-682-X/03/0008...$5.00.

In this paper, we propose a new architectural methodology for estimating leakage power that distinguishes technology-dependent from technology-independent variables, providing the flexibility of an architecture-level power model where architectural parameters suffice, together with the rigorous accuracy of a low-level model. An accurate model allows architects to rapidly estimate leakage power as they iterate across alternative designs. We applied our methodology to both on-chip and chip-to-chip interconnection networks, and validated our estimates against HSPICE, obtaining 95-98% accuracy.

By combining our proposed leakage power model with a dynamic power model [9], we were able to gather insights on the total power consumption of networks, characterizing the power breakdown of various network components as technology scales. Our modeling guided us to investigate and propose power-aware buffers as a leakage power optimization technique. We then explore the design space of architectural policies for power-aware buffers, and propose a suite of techniques that are able to save up to 96.6% of total buffer leakage power.

2. AN ARCHITECTURAL LEAKAGE POWER MODELING METHODOLOGY

Leakage current has five basic components: reverse-biased pn junction current, sub-threshold leakage current, gate-induced drain leakage, punchthrough current and gate tunneling current. These leakage current components have an almost linear relation with transistor width. For instance, subthreshold current $I_{sub}$, which currently dominates leakage current, is defined as follows [1]:

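$$I_{sub} = I_0 \left[1 - \exp\left(-\frac{V_{ds}}{V_t}\right)\right] \exp\left(\frac{V_{gs} - V_{th} - V_{off}}{n V_t}\right) \tag{1}$$

$$I_0 = \mu \frac{W}{L} \sqrt{\frac{q \varepsilon_{si} N_{DEP}}{2 \Phi_s}} \, V_t^2 \tag{2}$$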

For a given circuit type i and input state s at a process technology, subthreshold current is almost proportional to the transistor width W. Although different components will have different impact on leakage current as technology scales, e.g. gate-induced drain current will become more and more significant, the total leakage current still keeps an almost linear relation with transistor width.


So to derive an architectural leakage power model, we can separate the technology-independent variables such as transistor width from those that stay invariant for a specific process technology:

$$I_{leak}(i,s) = \frac{W(type(i,s))}{L} \cdot \hat{I}_{leak}(i,s) \tag{3}$$

where $I_{leak}$ is the total leakage current and $\hat{I}_{leak}$ is the leakage current per unit transistor width over length. W(type(i,s)) refers to the transistor width of NMOS when NMOS determines the leakage current (i.e. type(i,s) is N), or PMOS (when type(i,s) is P). As transistor width has a negligible effect on $\hat{I}_{leak}(i,s)$, $\hat{I}_{leak}(i,s)$ is fixed for a given circuit type and input state under a certain technology and temperature. With this approximation, armed with $\hat{I}_{leak}$ for various kinds of circuit components at different input states, architects can estimate the leakage power for architectural units composed from these circuit components. Our proposed modeling methodology is as follows:

1. Identify the fundamental circuit components, and derive $\hat{I}_{leak}(i,s)$ for each at different input states. Examples are single NMOS and PMOS transistors, NAND gates, inverters, etc.
2. Define major architectural building blocks. For interconnection networks, typical building blocks will be buffers, crossbars, arbiters and links [9]. For microprocessors, suitable building blocks will be cache lines, adders, etc.
3. Identify the distribution of the input states based on operation characteristics or simulation and derive architectural equations that estimate the leakage power for each building block.

We believe this is the first leakage power modeling methodology that truly separates technology-dependent and independent variables. In [2], a single $k_{design}$ is used to reflect the composition of device types (N/P), geometries (W/L), states (on/off), and stacking factors. As a result, $k_{design}$ is extremely sensitive to changes in any of the variables and the impact of architectural parameters is hard to isolate. In [6], $P_{leak}^{lib} = \chi_{lib} \cdot Cells^{S_{lib}}$ is used to estimate the leakage power in an ASIC design environment, where $\chi_{lib}$ and $S_{lib}$ are technology-dependent parameters derived through experiments and "Cells" is the number of cells in the design. This model targets a later design stage than the architectural stage, when designers explore various circuit designs for a selected architecture.
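As a rough illustration of step 1 and Eq. (3), the sketch below looks up a per-unit leakage current for a (component, input state) pair and scales it by W/L of the dominant transistor. The table contents, the geometry values and the names `PER_UNIT_LEAKAGE` and `leakage_current` are assumptions made for this example; they are not values or identifiers from the paper.

```python
# Hypothetical per-unit leakage table: (component, input state) -> (dominant type, current).
# The currents are placeholders for illustration, not values from the paper.
PER_UNIT_LEAKAGE = {
    ("INV", 0): ("N", 8.0e-9),
    ("INV", 1): ("P", 4.0e-9),
}


def leakage_current(component, state, width_n, width_p, length):
    """Eq. (3): per-unit current scaled by W/L of the dominant transistor."""
    dominant, per_unit = PER_UNIT_LEAKAGE[(component, state)]
    width = width_n if dominant == "N" else width_p
    return width / length * per_unit


# Inverter in input state 0 with assumed geometry (same length unit for W and L).
i_inv0 = leakage_current("INV", 0, width_n=0.2, width_p=0.4, length=0.1)
```

Because only the lookup table depends on the process technology, moving to another technology node would, under this sketch, mean swapping the table while the function stays unchanged.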

2.1 Derivation of $\hat{I}_{leak}$

For each component i, we simulate $\hat{I}_{leak}(i,s)$ using HSPICE and the Berkeley Predictive Technology Model [1] for the range of process technologies and associated parameters listed in Table 1. Table 2 lists the $\hat{I}_{leak}(i,s)$ simulated for each fundamental circuit component i (leakage currents differ at different states due to stacking and body bias effects). Circuit structures can then be hierarchically composed from these fundamental circuit components.

Table 1: Parameters for various technologies (columns include $V_{th0}$ and $V_{dd}$).

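Table 2: $\hat{I}_{leak}(i,s)$ for each fundamental circuit component i at different input states s at temperature 80°C. Type(i,s) indicates whether the NMOS or PMOS transistor is dominant in determining leakage current. (The component column and the per-technology column headers are not recoverable.)

s     Type(i,s)   $\hat{I}_{leak}(i,s)$
1     P           4.0e-9    9.7e-9    80.4e-9
01    N           7.9e-9    10.8e-9   46.0e-9
10    N           4.7e-9    5.1e-9    44.0e-9
11    P           8.1e-9    19.4e-9   159.5e-9
01    P           3.6e-9    5.9e-9    45.3e-9
10    P           4.3e-9    9.7e-9    77.5e-9
11    P           0.9e-9    0.7e-9    5.9e-9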

Figure 1: A FIFO buffer with 1 read port and 1 write port (adapted from [4]). $T_c$ is the pre-charging transistor, $T_{wd}$ the wordline driver, $T_{bd}$ the write bitline driver, $T_m$ the memory cell inverter, and $T_{pr}$ and $T_{pw}$ the pass transistors connecting read and write ports to memory cells respectively.

2.2 Leakage power modeling of router buffers

We applied our methodology to the major building blocks of interconnection networks as identified in [9]: buffers, crossbars, arbiters, and links. Here, we walk through our modeling of router buffers to demonstrate the methodology. Fig. 1 sketches the circuit structure of a router buffer pool with B flit¹ buffers, each F bits wide, with $P_r$ read ports and $P_w$ write ports. It shows a FIFO buffer that is composed of the fundamental circuit components of PMOS, NMOS transistors and inverters. Dimensions of the circuit structure such as $h_{cell}$, $w_{cell}$, $d_w$ are estimated by Orion [9] from architectural parameters.

¹ A flit is short for flow control unit, and is a fixed-length segment of a packet.

Input state probabilistic analysis. Next, we analyze the probability distribution of each input state of a circuit component by examining how architectural units function. For instance, the wordline inverter $T_{wd}$ is set (s = 0) whenever that buffer/row is read or written. Thus, at any point in time, only one out of B wordline inverters will be set. Hence,

$$I_{leak}(T_{wd}) = \frac{1}{B} \cdot \frac{W(type(INV,0))}{L} \cdot \hat{I}_{leak}(INV,0) + \frac{B-1}{B} \cdot \frac{W(type(INV,1))}{L} \cdot \hat{I}_{leak}(INV,1)$$

Basically, given these $\hat{I}_{leak}(i,s)$, and the probabilities of each input state Prob(i,s), the leakage current for a building block is:

$$I_{leak}(Block) = \sum_i \sum_s Prob(i,s) \cdot \frac{W(type(i,s))}{L} \cdot \hat{I}_{leak}(i,s) \tag{4}$$

where W(type(i,s)) refers to the transistor width of NMOS (when type(i,s) is N) or PMOS (when type(i,s) is P).
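The probability-weighted sum of Eq. (4) can be sketched as follows, using the wordline inverter as the worked case: one of B inverters is in state 0 and the remaining B-1 are in state 1. All widths, lengths and per-unit currents are placeholder numbers, and the function names are hypothetical.

```python
def block_leakage(components, length):
    """Eq. (4): sum of Prob(i,s) * W(type(i,s)) / L * per-unit current over all i and s.

    components: one list per component i, holding (prob, width, per_unit) per state s.
    """
    total = 0.0
    for states in components:
        for prob, width, per_unit in states:
            total += prob * width / length * per_unit
    return total


def wordline_inverter_states(num_buffers, width_s0, width_s1, per_unit_s0, per_unit_s1):
    """One of B wordline inverters is set (s=0); the other B-1 are in state s=1."""
    p_set = 1.0 / num_buffers
    return [(p_set, width_s0, per_unit_s0), (1.0 - p_set, width_s1, per_unit_s1)]


# Placeholder geometry and currents, chosen only to make the arithmetic easy to follow.
twd = wordline_inverter_states(4, width_s0=0.2, width_s1=0.4,
                               per_unit_s0=8.0e-9, per_unit_s1=4.0e-9)
i_twd = block_leakage([twd], length=0.1)
```

With B = 4, the set state contributes 0.25 × 2 × 8.0e-9 and the other state 0.75 × 4 × 4.0e-9, so the expected per-inverter leakage in this toy setup is 1.6e-8.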

Input state simulation. Input states can also be tracked through network simulation:

$$I_{leak}(Block,t) = \sum_i \frac{W(type(i,s(t)))}{L} \cdot \hat{I}_{leak}(i,s(t)) \tag{5}$$

where $I_{leak}(Block,t)$ is the leakage current at time t, and s(t) is the state of circuit type i at time t within this circuit block.
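A minimal sketch of the simulation-driven variant in Eq. (5): each time step supplies the observed state of every component instance, and the block leakage is summed per step. The trace format, the lookup tables and the averaging helper are assumptions made for illustration; a real network simulator would supply its own state trace.

```python
def block_leakage_at(states, width, per_unit, length):
    """Eq. (5): sum over component instances of W/L * per-unit current in the current state."""
    return sum(width[key] / length * per_unit[key] for key in states)


def average_block_leakage(trace, width, per_unit, length):
    """Average Eq. (5) over every time step of a simulated state trace."""
    samples = [block_leakage_at(states, width, per_unit, length) for states in trace]
    return sum(samples) / len(samples)


# Placeholder tables keyed by (component, state); not values from the paper.
WIDTH = {("INV", 0): 0.2, ("INV", 1): 0.4}
PER_UNIT = {("INV", 0): 8.0e-9, ("INV", 1): 2.0e-9}

# Two inverters observed for two cycles.
trace = [
    [("INV", 0), ("INV", 1)],
    [("INV", 1), ("INV", 1)],
]
avg = average_block_leakage(trace, WIDTH, PER_UNIT, length=0.1)
```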

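Finally, we can estimate the total leakage current of a router buffer (Eq. 6), while its leakage power is leakage current multiplied by supply voltage (Eq. 7).

$$I_{leak}(buffer) = (P_r + P_w) B \, I_{leak}(T_{wd}) + 2 P_w F \, I_{leak}(T_{bd}) + 2 B F \, I_{leak}(T_c) + 2 B F \, I_{leak}(T_m) + 2 B F \left(P_w I_{leak}(T_{pw}) + P_r I_{leak}(T_{pr})\right) \tag{6}$$

$$P_{leak}(buffer) = I_{leak}(buffer) \cdot V_{dd} \tag{7}$$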
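Eqs. (6) and (7) translate directly into a small function, sketched here. B = 64 and F = 128 mirror the on-chip router parameters quoted in the paper; the per-transistor currents and the supply voltage are placeholders, and the dictionary keys are simply the transistor names from Fig. 1.

```python
def buffer_leakage_current(B, F, P_r, P_w, I):
    """Eq. (6). I maps transistor names (Twd, Tbd, Tc, Tm, Tpw, Tpr) to leakage current in A."""
    return ((P_r + P_w) * B * I["Twd"]
            + 2 * P_w * F * I["Tbd"]
            + 2 * B * F * I["Tc"]
            + 2 * B * F * I["Tm"]
            + 2 * B * F * (P_w * I["Tpw"] + P_r * I["Tpr"]))


def buffer_leakage_power(B, F, P_r, P_w, I, vdd):
    """Eq. (7): leakage power is leakage current times supply voltage."""
    return buffer_leakage_current(B, F, P_r, P_w, I) * vdd


# Placeholder currents (1 nA each) and supply voltage; not values from the paper.
currents = {name: 1.0e-9 for name in ("Twd", "Tbd", "Tc", "Tm", "Tpw", "Tpr")}
i_buf = buffer_leakage_current(B=64, F=128, P_r=1, P_w=1, I=currents)
p_buf = buffer_leakage_power(B=64, F=128, P_r=1, P_w=1, I=currents, vdd=1.0)
```

In this toy setting the terms that scale with B·F (pre-charge, memory cell and pass transistors) account for 65536 of the 65920 unit currents, which shows how the storage array dominates the buffer total.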

2.3 Validation

We validated our model with HSPICE simulation of each complete functional unit of a chip-to-chip router (crossbar, arbiter, and buffers) in 0.07µm technology. Leakage currents under different input states were estimated with our model and compared with the leakage currents obtained from HSPICE simulation for the same functional unit with the same input states, the exact structure and feature sizes. For instance, a 5-by-5 matrix crossbar unit has 5 data inputs and 25 control signals. The combination of their values determines the input state of the crossbar and thus the leakage current. For such functional units with a vast number of possible input states, we select a random sample of typical input states for validation. The accuracy of our model for these functional units is computed by averaging across different input states. Table 3 shows the mean and standard deviation of our model's error in 0.07µm technology compared with HSPICE simulation. Since leakage current is large at 0.07µm, we expect the magnitude of error to be larger than that in earlier process technology.

Table 3: Validation of our model vs. HSPICE simulation for each major building block of a router.

3. DYNAMIC AND LEAKAGE POWER CHARACTERIZATION OF INTERCONNECTION NETWORKS

Combined with Orion, an architectural dynamic power model for networks [9], we characterized the total power consumption of both an on-chip network and a chip-to-chip network. The on-chip network is parameterized as in [3], with a 4-by-4 mesh network on a 12mm² chip, each node clocked at 1GHz, with 5 input/output ports (one of which is the injection/ejection port), 64 flit buffers per input port (each flit 128 bits wide), connected with a 5-by-5 matrix crossbar and five 5:1 arbiters. The router in the chip-to-chip network has 256 128-bit flit buffers per input port instead, other parameters remaining the same as in the on-chip network. The feature size of the transistors is derived by Orion [9] from architectural parameters based on the timing delay requirements and assuming minimum area.

Effect of process technology. Tables 4 and 5 show the estimates for a router in a chip-to-chip and on-chip network respectively at 50% flit arrival rate in 1s at 80°C. As technology scales, leakage power becomes increasingly significant, starting from 2.5% of total (leakage + switching) power at current 0.18µm technology, to a hefty 60% at 0.07µm technology if clock frequency is kept invariant for the chip-to-chip network. Even assuming doubling clock frequencies as we scale process technology, leakage power remains a significant 27% at 0.07µm. Though the on-chip network has fewer storage elements, leakage power still rises to a significant 21% at 0.07µm, assuming clock frequency doubles each process generation.

Table 4: Dynamic and leakage power estimates of a router in a chip-to-chip network.

Distribution of leakage power between router and links. From Table 5, it is evident that full-swing on-chip link drivers and wires consume substantial dynamic power, overwhelming that of the router core in 0.10 and 0.07µm processes. However, when you look at leakage power consumption of router vs. links, the converse is true. As wires do not dissipate leakage power, the leakage power consumption of just the drivers is minimal compared to that of the router core. This prompted us to delve into a leakage power breakdown of various functional units within an on-chip router.

Breakdown of leakage power within a router. Fig. 2 shows the leakage power consumed by the various major functional units of an on-chip router and its links at different process technologies. It shows buffers consuming approximately 64% of the leakage power of the total node (router + link) for all process technologies, standing as the largest leakage power consumer. Our characterization highlights router buffers as a prime candidate for leakage power optimization.

4. POWER-AWARE BUFFERS

As interconnection networks experience significant temporal and spatial variance in workload that leads to highly varying buffer utilization, we propose power-aware buffers as an architectural technique for leakage power optimization in interconnection networks, i.e. buffers that regulate their own leakage power consumption based on actual utilization.


Table 5: Dynamic and leakage power estimates of an on-chip router and its links.

Figure 2: Leakage power distribution across the major functional units of an on-chip router: buffers, arbiters, crossbar and links. Arbiter leakage power is negligible and not visible in the figure.

To explore the potential of power-aware buffers, we first characterize network buffer utilization with the traffic model proposed in [8], with Poisson task inter-arrival rate, and self-similar packet inter-arrival rates within each task session. This workload exhibits the high temporal and spatial variance present in many real-life networks. We simulate the chip-to-chip network described in Sec. 3 (2 virtual channels per port). Fixed-length packets of 20 flits are assumed. Fig. 3 graphs the average and minimum number of idle buffers as traffic increases. As expected, a large number of buffers is left idle at low injection rates. Interestingly, while there are routers whose buffers are fully occupied (minimum number of idle buffers = 0) at high network load, average buffer utilization remains rather low, with about 85% idle buffers. This is reflective of the high variance in the workload that results in a large gap between average and maximum network utilization that is inherent in many actual workloads. Clearly, placing these idle buffers in an inactive mode that uses less leakage power will result in significant leakage power savings.

4.1 Power-aware buffer policy design

A router buffer is utilized in a stream-like fashion. When a flit enters a router, it gets written into an unoccupied buffer, and sits there while a series of router operations is triggered: routing, virtual-channel allocation and switch allocation. When it is scheduled to leave the router, the flit is read from the buffer pool, and the buffer is then marked as unoccupied and released back to the free list, ready to be reused when a new flit enters the router.

Figure 3: Average and minimum number of idle buffers out of 128 flits/buffer.

We term a policy that turns a buffer to inactive mode only when it is unoccupied single, and one that switches a buffer to inactive mode anytime it is not being accessed, i.e. whether it is unoccupied or occupied, double. To evaluate the effectiveness of any policy, we need a yardstick, so we define two theoretically ideal, though unachievable, policies: Ideal-Single, which reduces leakage power to zero instantly for buffers that are unoccupied with no additional power overhead, and Ideal-Double, which does so similarly for buffers both when they are unoccupied and when they are occupied but not being accessed.

A power-aware buffer policy can be oblivious, i.e. it does not take current buffer utilization or workload into account; or adaptive, tuning the policy according to current utilization. It can also be conservative, making sure network performance is not impacted, vs. aggressive, targeting as much leakage power savings as possible, even if this comes at the expense of network performance.
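The two ideal yardsticks can be read as simple counting rules over a per-cycle buffer trace, as in this sketch. The three-state encoding ("idle", "holding", "access") and the function name are assumptions introduced here; the paper does not prescribe a trace format.

```python
def ideal_savings(trace):
    """Return (Ideal-Single, Ideal-Double) leakage savings as fractions of buffer-cycles.

    trace: list of cycles; each cycle lists one state per buffer:
      "idle"    - unoccupied
      "holding" - occupied but not being read or written
      "access"  - being read or written this cycle
    """
    cells = [state for cycle in trace for state in cycle]
    single = sum(s == "idle" for s in cells) / len(cells)
    double = sum(s in ("idle", "holding") for s in cells) / len(cells)
    return single, double


# Two buffers over four cycles: one flit is written, held for two cycles, then read.
trace = [
    ["access", "idle"],
    ["holding", "idle"],
    ["holding", "idle"],
    ["access", "idle"],
]
single, double = ideal_savings(trace)
```

Here Ideal-Single saves 50% of buffer-cycles and Ideal-Double 75%; the gap between them is exactly the time flits spend sitting in buffers between their write and read.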

Figure 4: Design space of power-aware buffer policies (axis labels: Adaptive, Oblivious; Conservative, Aggressive).

Fig. 4 shows the design space of power-aware buffer policies that we envision, and several simple policies that we propose at each design point. Each policy can target either single or double leakage power savings. First, we propose a conservative policy, Lookahead, that obliviously places buffers in low-leakage mode and wakes them up N cycles before they are accessed. When a flit is read from the buffer queue, that buffer will be switched to the low-leakage inactive mode (if there are more than N free buffers), and when a flit arrives and is written into a router buffer at the tail of the queue, the buffer that is N cells ahead will be switched to normal operating mode. The policy is conservative as it sets the lookahead window of N to the number of cycles needed to switch a buffer from inactive to active mode (transition delay), so a free buffer will always be available when flits arrive, and network performance will never be affected. Clearly, if the buffer size B is less than N, our policy will result in no leakage power savings. An aggressive variant of this policy, Lookahead-Agg, simply shortens N to less than the transition delay, trading off performance for higher leakage power savings. Our implementation of Lookahead inserts a newly freed buffer back at the head of the free list, so an active buffer has the highest chance of reuse, significantly reducing the impact on network performance.

A simple adaptive policy, which we call Predictive, uses prior buffer utilization history to predict future usage, adjusting the lookahead window N accordingly. We use a simple statistic: when there are more writes than reads to a buffer in a time window W, N is incremented until it hits an upper bound $N_{high}$. Otherwise, it is decremented until it reaches a lower bound $N_{low}$. The intuition is that when buffer writes outnumber reads, the buffer pool is building up, with fewer and fewer free buffers, so an adaptive policy should be less aggressive in switching buffers to inactive mode in order to enhance network performance. Conversely, when more flits are leaving rather than entering the router, an adaptive policy can more aggressively switch off buffers, guessing that fewer will be needed.
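The two rules can be sketched in a few lines. `inactive_buffers` is a steady-state simplification assumed for this example (every free buffer beyond the N kept awake ahead of the tail is inactive), not the paper's cycle-level implementation; `update_window` follows the Predictive rule as described, with the caller expected to tally writes and reads over each window of W cycles.

```python
def inactive_buffers(total, occupied, n):
    """Lookahead, steady state: keep N free buffers awake, put the other free ones to sleep."""
    free = total - occupied
    return max(0, free - n)


def update_window(n, writes, reads, n_low, n_high):
    """Predictive: grow N when writes outnumber reads in the last window, else shrink it."""
    if writes > reads:
        return min(n + 1, n_high)
    return max(n - 1, n_low)


# 256-buffer pool, empty: N = 10 versus N = 1.
sleeping_n10 = inactive_buffers(256, 0, 10)
sleeping_n1 = inactive_buffers(256, 0, 1)

# Window bounds N_low = 1 and N_high = 2, as in the simulated Predictive policy.
n = 1
n = update_window(n, writes=6, reads=4, n_low=1, n_high=2)  # pool filling up
```

Under this simplification a pool smaller than N never has an inactive buffer, matching the remark that B < N yields no savings, and an empty 256-buffer pool has 9 fewer inactive buffers at N = 10 than at N = 1.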

4.2 Circuit-level mechanisms

Power-aware buffers require circuit-level mechanisms that allow buffers to be put into inactive mode for leakage power savings. Several circuit-level mechanisms have been proposed for leakage power savings in SRAMs [4, 7], targeted for microprocessor caches. Since router buffers are usually constructed with SRAMs, these can be readily applied to power-aware buffers.

The characteristics of circuit-level mechanisms that are critical to power-aware buffers are: (1) transition delay: the time it takes to switch a buffer between the normal operating mode and the inactive mode; (2) transition energy: the dynamic energy incurred each time to effect a transition; (3) leakage power savings: the difference between the leakage power incurred at normal operating mode and that at inactive mode; and (4) data preservation: whether the inactive mode preserves the contents of the SRAMs, i.e. whether this circuit technique can be applied to both single and double power-aware buffer policies.
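One way to see how these four characteristics interact is a break-even estimate: putting a buffer to sleep pays off only if the leakage energy saved exceeds the energy of switching it off and back on. The sketch below is an illustrative calculation under that assumption, with a hypothetical `Mechanism` record and placeholder numbers; it is not a model taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Mechanism:
    transition_delay_cycles: int   # (1) cycles to switch between modes
    transition_energy_j: float     # (2) dynamic energy per transition, joules
    leakage_saving_w: float        # (3) leakage power saved while inactive, watts
    preserves_data: bool           # (4) usable for "double" policies only if True


def net_energy_saved(m, inactive_seconds):
    """Leakage energy saved minus the cost of one sleep and one wake transition."""
    return m.leakage_saving_w * inactive_seconds - 2 * m.transition_energy_j


def break_even_seconds(m):
    """Shortest inactive period for which going inactive pays off."""
    return 2 * m.transition_energy_j / m.leakage_saving_w


# Placeholder numbers, not taken from the paper or its Table 6.
example = Mechanism(transition_delay_cycles=1, transition_energy_j=1e-12,
                    leakage_saving_w=1e-6, preserves_data=True)
```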

In this paper, we choose two circuit-level mechanisms with fairly different characteristics: Drowsy [4], and Gated Vdd SRAMs [7]². Drowsy SRAMs have faster transition delays than Gated Vdd SRAMs and preserve data content, but deliver less leakage energy savings in the inactive mode, as shown in Table 6. Both techniques have negligible effect on the access time.

² While we assumed the characteristics of Gated Vdd SRAMs as published, that it does not preserve data in inactive mode, it can be sized to ensure data preservation, though with a poorer transition delay.

5. EXPERIMENTAL RESULTS

We extend a C++ network simulator to investigate the power-performance of power-aware buffers [5]. The same set of router parameters as that in Section 3 with an 8-by-8 mesh in 0.07µm technology is used. Here, latency refers to the time from the creation of the first flit of the packet till the ejection of its last flit from the network at the destination, throughput refers to the injection rate at which average network latency exceeds twice the latency at zero network load, and leakage power savings is expressed as a percentage of the total leakage power consumed by router buffers. Simulations are run for 1 million cycles.
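The throughput definition above (the injection rate at which average latency exceeds twice the zero-load latency) can be applied to a measured latency curve as in this sketch. The function name and the sample points are made up for illustration and are not results from the paper.

```python
def saturation_throughput(points, zero_load_latency):
    """Return the lowest injection rate whose average latency exceeds 2x zero-load latency.

    points: iterable of (injection_rate, average_latency) pairs from simulation runs.
    Returns None if the latency never crosses the threshold in the sampled range.
    """
    for rate, latency in sorted(points):
        if latency > 2 * zero_load_latency:
            return rate
    return None


# Made-up latency curve (rate in flits/cycle/node, latency in cycles).
curve = [(0.1, 20.0), (0.2, 24.0), (0.3, 35.0), (0.4, 90.0)]
throughput = saturation_throughput(curve, zero_load_latency=20.0)
```

Since this only inspects sampled rates, the reported value is as coarse as the injection-rate sweep; a finer sweep near the crossing would tighten it.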

Leakage power savings of Lookahead policy. Fig. 5 compares the effectiveness of the conservative Lookahead policy (N = 10 for Gated Vdd, and 1 for Drowsy cells) against the ideal policies. Ideal-Double saves close to 100% of buffer leakage power, since it only keeps a buffer active during accesses. Ideal-Single gets savings close to that of Ideal-Double at low traffic workloads as flits do not stay in buffers for long. As traffic increases, however, not shutting buffers off when they are occupied in between writes and reads results in almost 10% less leakage power savings. A similar difference is observed between Lookahead-Single and Lookahead-Double with Drowsy cells.

With 256 flit buffers at each router input port, Lookahead-Single saves more leakage power with Gated Vdd rather than Drowsy cells. While the long transition delay of Gated Vdd results in a large N = 10, potentially leading to up to 9 fewer buffers turned inactive, this is overwhelmed by the remaining substantial number of buffers that can still be leveraged. Thus, with large buffers, the higher leakage power savings per SRAM cell of Gated Vdd leads to higher overall network power savings as compared to Drowsy SRAMs.

The converse is however true with smaller buffers (Fig. 6). Here, the large N of Lookahead (Gated Vdd) constrains the number of buffers that can be turned inactive, and the low transition delay of Drowsy cells wins over. Note that as traffic rate increases, however, flits occupy buffers for a longer time, so Lookahead-Single (Drowsy) is unable to exploit its fast transition delay. Lookahead-Double (Drowsy) however leverages this for higher leakage power savings at high traffic injection rates.

Leakage power savings of aggressive and predictive policies. We simulated single Lookahead-Agg policies, with a lookahead window N shortened from 10 to 4 and 2, for Gated Vdd. The Predictive policy simulated has W = 10, $N_{low}$ = 1, $N_{high}$ = 2. Fig. 7 shows that, as expected, Lookahead-Agg improves the leakage power savings of Lookahead, pushing savings up to 81% at low traffic. Predictive pushes it even further, up to 88% savings at low traffic. Even at very high traffic loads, Predictive still saves 71% leakage power, as it better adapts to actual utilization. This shows that even a simple adaptive policy can outperform oblivious policies.

Performance impact of power-aware buffer policies. Lookahead, being a conservative policy, does not have an impact on performance as it always ensures there will be at least one active buffer available awaiting an arriving flit. However, the aggressive Lookahead-Agg and Predictive policies can potentially cause performance penalties. Fig. 8 simulates the latency-throughput performance of these two policies, showing negligible performance degradation for both policies as compared to a network with no power-aware buffers.

6. CONCLUSIONS

We have proposed a methodology for modeling leakage power at the architecture level. To facilitate the use of this methodology, we will distribute the $\hat{I}_{leak}$ tables online. We have also incorporated our network architectural leakage power models into Orion [9] so architects can easily factor in dynamic and leakage power estimates when evaluating network architectures.

By delineating the design space for power-aware buffer policies, and exploring the impact of several simple alternatives, we hope our work will motivate the proposal of sophisticated policies in the future.

Figure 6: Leakage power savings under Lookahead-Single policy for 64-flit buffer.

Acknowledgments

The authors are grateful to K. Flautner of University of Michigan and S. Kim of Carnegie Mellon for providing detailed parameters of Drowsy Cache and Gated Vdd respectively. At Princeton, we wish to thank Hang-Sheng Wang for his help in characterizing dynamic power using Orion and Li Shang for assistance with the PopNet network simulator. This work is partially funded by NSF CAREER grant CCR-0237540.

Figure 7: Leakage power savings under Lookahead-Aggressive and Predictive policies for 64-flit buffer.

Figure 8: Average latency under different policies for Gated Vdd SRAMs.

7. REFERENCES

[1] Berkeley Predictive Technology Model and BSIM4. Available at http://www-device.eecs.berkeley.edu/research.html.

[2] J. A. Butts and G. S. Sohi, "A static power model for architects", In Proc. Intl. Symp. Microarchitecture, California, Dec. 2000, pp. 191-201.

[3] W. J. Dally and B. Towles, "Route packets, not wires: On-chip interconnection networks", In Proc. Design Automation Conference, Las Vegas, June 2001.

[4] K. Flautner, N. S. Kim, S. Martin, D. Blaauw, and T. Mudge, "Drowsy caches: simple techniques for reducing leakage power", In Proc. Intl. Symp. Computer Architecture, Alaska, May 2002, pp. 219-230.

[5] http://www.ee.princeton.edu/~lshang/popnet.html

[6] R. Kumar, C. P. Ravikumar, "Leakage power estimation for deep submicron circuits in an ASIC design environment", In Proc. ASP-DAC/VLSI Design, Bangalore, India, 2002, pp. 45-50.

[7] M. Powell, S.-H. Yang, B. Falsafi, K. Roy, and T. N. Vijaykumar, "Gated-Vdd: a circuit technique to reduce leakage in deep-submicron cache memories", In Proc. Intl. Symp. Low Power Electronics and Design, Italy, July 2000, pp. 90-95.

[8] L. Shang, L.-S. Peh, and N. K. Jha, "Dynamic voltage scaling with links for power optimization of interconnection networks", In Proc. Intl. Symp. on High-Performance Computer Architecture, California, Jan. 2003, pp. 79-90.

[9] H. Wang, X. Zhu, L.-S. Peh, and S. Malik, "Orion: a power-performance simulator for interconnection networks", In Proc. Intl. Symp. Microarchitecture, Istanbul, Turkey, Nov. 2002, pp. 294-305. [http://www.ee.princeton.edu/~peh/orion.html]

[10] F. Worm, P. Ienne, P. Thiran and G. D. Micheli, "An adaptive low power transmission scheme for on-chip networks", In Proc. Intl. Symp. Systems Synthesis, Kyoto, Japan, October 2002, pp. 92-100.
