Base-Resolution Analysis of 5-Hydroxymethylcytosine in the Mammalian Genome


Miao Yu,1,5 Gary C. Hon,2,5 Keith E. Szulwach,3,5 Chun-Xiao Song,1 Liang Zhang,1 Audrey Kim,2 Xuekun Li,3 Qing Dai,1 Yin Shen,2 Beomseok Park,4 Jung-Hyun Min,4 Peng Jin,3,* Bing Ren,2,* and Chuan He1,*

1Department of Chemistry and Institute for Biophysical Dynamics, The University of Chicago, 929 E. 57th Street, Chicago, IL 60637, USA
2Ludwig Institute for Cancer Research, Department of Cellular and Molecular Medicine, UCSD Moores Cancer Center and Institute of Genome Medicine, University of California, San Diego School of Medicine, 9500 Gilman Drive, La Jolla, CA 92093-0653, USA
3Department of Human Genetics, Emory University School of Medicine, 615 Michael Street, Atlanta, GA 30322, USA
4Department of Chemistry, The University of Illinois at Chicago, 845 West Taylor Street, Chicago, IL 60606, USA
5These authors contributed equally to this work
*Correspondence: peng.jin@emory.edu (P.J.), biren@ucsd.edu (B.R.), chuanhe@uchicago.edu (C.H.)
DOI 10.1016/j.cell.2012.04.027

SUMMARY

The study of 5-hydroxymethylcytosine (5hmC) has been hampered by the lack of a method to map it at single-base resolution on a genome-wide scale. Affinity purification-based methods can neither precisely locate 5hmC nor accurately determine its relative abundance at each modified site. We here present a genome-wide approach, Tet-assisted bisulfite sequencing (TAB-Seq), that when combined with traditional bisulfite sequencing can be used for mapping 5hmC at base resolution and quantifying the relative abundance of 5hmC as well as 5mC. Application of this method to embryonic stem cells not only confirms widespread distribution of 5hmC in the mammalian genome but also reveals sequence bias and strand asymmetry at 5hmC sites. We observe high levels of 5hmC and reciprocally low levels of 5mC near but not on transcription factor-binding sites. Additionally, the relative abundance of 5hmC varies significantly among distinct functional sequence elements, suggesting different mechanisms for 5hmC deposition and maintenance.

INTRODUCTION

5-methylcytosine (5mC) in mammalian genomic DNA is essential for normal development and impacts a variety of biological functions. In 2009, 5-hydroxymethylcytosine (5hmC) was discovered as another relatively abundant form of cytosine modification in embryonic stem cells (ESCs) and Purkinje neurons (Kriaucionis and Heintz, 2009; Tahiliani et al., 2009). The TET proteins, which are responsible for conversion of 5mC to 5hmC, have been shown to function in ESC regulation, myelopoiesis, and zygote development (Dawlaty et al., 2011; Gu et al., 2011; Iqbal et al., 2011; Ito et al., 2010; Ko et al., 2010; Koh et al., 2011; Wossidlo et al., 2011). 5hmC was found to be widespread in many tissues and cell types, although with diverse levels of abundance (Globisch et al., 2010; Münzel et al., 2010; Song et al., 2011; Szwagierczak et al., 2010). Proteins that can recognize 5hmC-containing DNA have also been investigated (Frauer et al., 2011; Yildirim et al., 2011). In addition, 5hmC can be further oxidized to 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC) by TET proteins (He et al., 2011; Ito et al., 2011; Pfaffeneder et al., 2011), and demethylation pathways through these modified cytosines have been shown (Cortellino et al., 2011; Guo et al., 2011; He et al., 2011; Maiti and Drohat, 2011; Zhang et al., 2012). Together, these studies provide an emerging paradigm in which 5mC oxidation plays important roles in sculpting a cell's epigenetic landscape and developmental potential through the regulation of dynamic DNA methylation states.

Strategies to label and/or enrich 5hmC in genomic DNA have been developed to investigate the distribution and function of 5hmC in the genome (Ficz et al., 2011; Pastor et al., 2011; Robertson et al., 2011, 2012; Song et al., 2011; Stroud et al., 2011; Williams et al., 2011; Wu et al., 2011; Xu et al., 2011). Although 5hmC is more enriched in gene bodies than at transcription start sites in mouse cerebellum (Song et al., 2011; Szulwach et al., 2011b), all genome-wide maps of 5hmC in human ESCs (hESCs) and mouse ESCs (mESCs) indicate that 5hmC tends to exist in gene bodies, promoters, and enhancers (Ficz et al., 2011; Pastor et al., 2011; Stroud et al., 2011; Szulwach et al., 2011a; Williams et al., 2011; Wu et al., 2011; Xu et al., 2011). However, in all cases, the resolution of these maps was restricted by the size of the immunoprecipitated or chemically captured DNA, which varied from several hundred to over a thousand bases.

The study of 5mC has been facilitated by the development of whole-genome bisulfite sequencing methods that can resolve the genomic location of methylcytosine at single-base resolution (Cokus et al., 2008; Lister et al., 2008, 2009). However, current bisulfite sequencing methods cannot distinguish between 5mC and 5hmC (Huang et al., 2010; Jin et al., 2010). Therefore, the genome-wide bisulfite sequencing maps generated in recent years may not accurately capture the true abundance of 5mC

Cell 149, 1–13, June 8, 2012 ©2012 Elsevier Inc.

[Figure 1 image not reproduced. Recoverable labels: (A) reaction scheme in which βGT with UDP-Glc converts 5hmC to 5gmC and Tet converts 5mC to 5caC; (B) sequence traces of the original oligo and of the 5mC and 5hmC oligos after TAB-Seq (only one strand is shown; X = 5mC or 5hmC); (C) MALDI-TOF/TOF spectra, percent intensity, for 5'-GAC(5-mC)GGAGT-3' (m/z = 2777.9) and 5'-GAC(5-hmC)GGAGT-3' (m/z = 2793.9) after βGT + UDP-Glc and after mTet1, with calculated product masses of 2807.9 (5caC oligo) and 2956.1 (5gmC oligo); (D) mES DNA, untreated, βGT-treated, and βGT/mTet1-treated, probed with anti-5mC, anti-5hmC, anti-5fC, and anti-5caC.]

Figure 1. TAB-Seq Strategy and Validation

(A) Schematic diagram of TAB-Seq. 5hmCs in genomic DNA are protected by glucosylation, and then 5mCs are converted to 5caCs by Tet-mediated oxidation. After bisulfite treatment, both 5caC (generated from 5mC) and C display as T, whereas 5gmC (generated from original 5hmC) displays as C.

(B) TAB-Seq of 76-mer dsDNA with 5mC or 5hmC. The 76-mer dsDNA with 5mC (left) or 5hmC (right) modification was subjected to TAB-Seq as described in (A). Sanger sequencing results showed that 5mC was completely converted to T (left) and 5hmC was still read as C (right).

(C) Mass spectrometry characterization of the products from TAB-Seq with a model DNA. The dsDNA contains a 5mC (left) or 5hmC (right) on a 9-mer strand annealed to an 11-mer complementary strand. The DNA was subjected to βGT-mediated glucosylation and mTet1-mediated oxidation. The reactions were monitored by MALDI-TOF/TOF with the calculated and observed molecular weight indicated.


at each base in the genome. A more detailed understanding of the function of 5hmC as well as 5mC has, therefore, been hampered by the lack of a single-base resolution sequencing technology capable of detecting the relative abundance of 5hmC per cytosine.

Here we present a Tet-assisted bisulfite sequencing (TAB-Seq) strategy, which provides a method for single-base resolution detection of 5hmC amenable to both genome-wide and loci-specific sequencing. Applying this method, we have generated genome-wide, single-base resolution maps of 5hmC in ESCs. Distinct classes of functional elements exhibit variable abundance of 5hmC, with promoter-distal regulatory elements harboring the highest levels of 5hmC. High levels of 5hmC and reciprocally low levels of 5mC can be found near binding sites of transcription factors. In contrast to 5mC, 5hmC sites display strand asymmetry and sequence bias. Finally, the base-resolution maps of 5hmC provide more accurate estimates of both 5hmC and 5mC levels at each modified cytosine than previous whole-genome bisulfite sequencing approaches. Our results support a dynamic DNA methylation process at distal-regulatory elements and suggest that different mechanisms of DNA modification may be involved at distinct classes of functional sequences in the genome.

RESULTS

TAB-Seq of Model DNA and Specific Loci

Traditional bisulfite sequencing cannot discriminate 5mC from 5hmC because both resist deamination by bisulfite treatment (Huang et al., 2010; Jin et al., 2010). We have recently found that TET proteins not only oxidize 5mC to 5hmC but also further oxidize 5hmC to 5caC, and that 5caC exhibits behavior similar to that of unmodified cytosine after bisulfite treatment (He et al., 2011; Ito et al., 2011). This deamination difference between 5caC and 5mC/5hmC under standard bisulfite conditions inspired us to explore TAB-Seq. In this approach, we use β-glucosyltransferase (βGT) to introduce a glucose onto 5hmC, generating β-glucosyl-5-hydroxymethylcytosine (5gmC) to protect 5hmC from further TET oxidation. After blocking of 5hmC, all 5mC is converted to 5caC by oxidation with an excess of recombinant Tet1 protein. Bisulfite treatment of the resulting DNA then converts all C and 5caC (derived from 5mC) to uracil or 5caU, respectively, whereas the original 5hmC bases remain protected as 5gmC. Thus, subsequent sequencing will reveal 5hmC as C, which, when combined with traditional bisulfite sequencing results, will provide an accurate assessment of abundance of this modification at each cytosine (Figure 1A).

We first confirmed that 5gmC is read as C in traditional bisulfite sequencing (data not shown). We cloned and expressed the catalytic domain of mouse Tet1 (mTet1) (Figure S1A available online), as previously reported (Ito et al., 2010). We tested a double-stranded DNA (dsDNA) with site-specifically incorporated 5mC or 5hmC modification (Figure 1B). Application of our method with Sanger sequencing of the PCR-amplified products showed that the original 5mC was completely converted into T after treatment, indicating efficient oxidation of 5mC to 5caC by mTet1 (Figure 1B). However, the original 5hmC was sequenced as C, confirming that the protected 5gmC is resistant to deamination under bisulfite treatment (Figure 1B). The products of each step were confirmed by MALDI-TOF/TOF using a shorter model duplex DNA (Figure 1C). Full conversion of 5mC in the context of genomic DNA was also confirmed by conventional bisulfite, PCR, and both Sanger and semiconductor sequencing (Figures S1B and S1C). Additionally, application to genomic DNA confirmed conversion of 5mC to 5caC and protection of 5hmC, and that 5fC is undetectable by immunoblot on the final reaction products (Figure 1D). Thus, coupling βGT-mediated transfer of glucose to 5hmC with mTet1-catalyzed oxidation of 5mC to 5caC enables the distinction of 5hmC from both C and 5mC after sodium bisulfite treatment.
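The read-out logic described above can be summarized as a small lookup. The following is a minimal sketch, assuming idealized (complete) conversion and protection; the dictionary and function names are illustrative and not part of the published method.

```python
# Idealized read-out of each original cytosine state after bisulfite treatment.
# "C" = still read as cytosine; "T" = deaminated and read as thymine after PCR.
# Traditional bisulfite sequencing: 5mC and 5hmC both resist deamination.
BS_SEQ = {"C": "T", "5mC": "C", "5hmC": "C"}
# TAB-Seq: 5hmC is protected as 5gmC; 5mC is oxidized to 5caC and deaminated.
TAB_SEQ = {"C": "T", "5mC": "T", "5hmC": "C"}


def infer_state(bs_call: str, tab_call: str) -> str:
    """Infer the original cytosine state from an idealized pair of read-outs."""
    if bs_call == "C" and tab_call == "C":
        return "5hmC"
    if bs_call == "C" and tab_call == "T":
        return "5mC"
    if bs_call == "T" and tab_call == "T":
        return "C"
    # C in TAB-Seq but T in bisulfite sequencing matches no state in the table.
    return "inconsistent"


for state in ("C", "5mC", "5hmC"):
    print(state, "->", infer_state(BS_SEQ[state], TAB_SEQ[state]))
```

In real data each cytosine is covered by many reads from a population of cells, so the two assays yield rates rather than single calls; the lookup only illustrates why the combination of both assays is needed to separate 5mC from 5hmC.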

The ability to distinguish 5hmC at base resolution offers a significant opportunity to further parse DNA methylation/hydroxymethylation states at specific genomic loci. We applied traditional bisulfite sequencing and TAB-Seq to known 5hmC-enriched loci in mouse cerebellum that were identified previously (Song et al., 2011; Szulwach et al., 2011b). Comparing the sequencing results, we were able to identify genuine 5hmC and 5mC sites (Figure S1D).

Generation of Base-Resolution Maps of 5hmC in ESCs

We next applied TAB-Seq to genomic DNA from H1 hESCs and E14Tg2a mESCs and sequenced to an average depth of 26.5× and 17× per cytosine, respectively. Successful detection of 5hmC is governed by three key parameters: (1) efficient conversion of unmodified cytosine to uracil; (2) efficient conversion of 5mC to 5caU/U; and (3) efficient protection of 5hmC. To directly assess these conversion rates in the context of genomic DNA, sequenced samples were spiked with fragments of lambda DNA amplified by PCR to contain three distinct domains having either unmodified cytosine, 5mC, or 5hmC. We observe low nonconversion rates for unmodified cytosine (0.38%) and 5mC (2.21%), in contrast to a high nonconversion rate for 5hmC (84.4%) (Figure S2B). Further analysis indicates that this latter value is an underestimate of the true 5hmC protection rate in H1, which is close to 92.0% (Figures S2D and S2E). These data further confirm the capability of TAB-Seq for robust distinction of 5hmC from 5mC and unmodified cytosine in the context of genomic DNA.
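As an illustration of how such nonconversion rates are obtained from spike-in controls, the sketch below divides the number of reads still called C by the total coverage at control cytosines. The read counts are invented so that the rates match the percentages quoted above; they are not the study's data, and the names are hypothetical.

```python
def nonconversion_rate(c_reads: int, t_reads: int) -> float:
    """Fraction of reads at control cytosines that are still read as C."""
    total = c_reads + t_reads
    if total == 0:
        raise ValueError("no coverage at control cytosines")
    return c_reads / total


# Hypothetical (C reads, T reads) pooled over each lambda spike-in domain.
spike_in = {
    "unmodified C": (38, 9962),
    "5mC": (221, 9779),
    "5hmC": (844, 156),
}

rates = {name: nonconversion_rate(c, t) for name, (c, t) in spike_in.items()}
for name, rate in rates.items():
    print(f"{name}: {100 * rate:.2f}% read as C")
```

For the unmodified and 5mC domains a low rate is desirable (reads should convert to T), whereas for the 5hmC domain the same quantity measures protection and should be high.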

We next focused our analysis on the map of H1 hESCs. To confidently identify 5hmC-modified bases, we took advantage of the highly annotated H1 methylome generated with methylC-Seq, which identifies both 5mC and 5hmC. Accordingly, we restricted our search for 5hmC to the subset of methylated bases previously identified by methylC-Seq (Lister et al., 2009). The probability that a cytosine can be confidently

(D) Validation with western blotting of 5mC and 5hmC conversion in genomic DNA (mouse ES). The untreated DNA, βGT-treated DNA, and βGT/mTet1-treated DNA were tested with dot blot assays using antibodies against 5mC, 5hmC, 5fC, and 5caC, respectively. No 5hmC could be observed after glucosylation. Almost all 5mCs were converted into 5caCs after the mTet1-mediated oxidation. See also Figure S1.


Figure 2. Generation of Genome-wide Base-Resolution Maps of 5hmC

(A) Snapshot of base-resolution 5hmC maps (red) compared to affinity-based 5hmC maps (gray) in H1 cells near the POU5F1 gene. Also shown are base-resolution maps of traditional bisulfite sequencing in H1 cells (black/gray). Positive values (darker shades) indicate cytosines on the Watson strand, whereas negative values (lighter shades) indicate cytosines on the Crick strand. For 5hmC, the vertical axis limits are −50% to +50%. For traditional bisulfite sequencing, the limits are −100% to +100%. Only cytosines sequenced to depth ≥5 are shown.

(B) Overlap of 5hmC with 82,221 genomic regions previously identified as enriched with 5hmC by affinity mapping (black), in comparison to randomly chosen 5mC (white) (see Extended Experimental Procedures).

(C) Sequence context of 5hmC sites compared to the reference human genome.

(D) Heatmap of estimated abundances of 5hmC and 5mC for modified cytosines significantly enriched with 5hmC. 5mC was estimated as the rate from traditional bisulfite sequencing (5hmC + 5mC) minus the measured 5hmC rate.

(E) The distribution of estimated abundances of 5hmC (red) and 5mC (green) at 5hmC sites. m: median. Error bars indicate standard deviation (SD). See also Figure S2.

identified as 5hmC is governed by the sequencing depth at the cytosine and abundance of the modification (Figure S2C). Modeling this probabilistic event with a binomial distribution (Lister et al., 2009) with N as the depth of sequencing at the cytosine and p as the 5mC nonconversion rate, we identified a total of 691,414 5hmCs with a false discovery rate of 5% (Figure S2F; see Extended Experimental Procedures). Given an average sequencing depth of 26.5, our assay can on average resolve 5hmC having an abundance of 20% or higher (Figure S2C). Genomic profiles of absolute 5hmC levels are comparable to a map previously generated with an affinity-based approach (Szulwach et al., 2011a) (Figure 2A). As sequenced fragments are equally distributed among the population of cells, TAB-Seq provides a steady-state glimpse of 5hmC in the entire population. This is in contrast to affinity-based approaches, which


bias sequencing toward 5hmC-enriched DNA fragments. By TAB-Seq, identified 5hmCs are highly clustered, unlike 5mCs (Figure S3A), and track well with peaks of 5hmC enrichment previously identified by affinity sequencing (Figure 2A). There are 7.6 times as many 5hmCs overlapping affinity-identified regions as expected by chance (Figure 2B, Z-score = 1,579). Furthermore, 81.5% of these 82,221 affinity-identified regions were recovered by at least one 5hmC. In contrast, only 35.6% of 5hmCs are recovered by affinity-based approaches [...]. Using semiconductor sequencing, we verified the presence/absence of 5hmC at 57 out of 59 individual cytosines (9 out of 11 hydroxymethylated CpGs, with depth ≥30) within regions that previously escaped detection by 5hmC affinity capture (Figure S2A), underscoring the sensitivity and specificity of our approach.
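The binomial calling model described in this section (N as the sequencing depth and p as the 5mC nonconversion rate) can be illustrated with a short calculation. This is a sketch only: it computes an upper-tail p value for one cytosine and uses a fixed per-site threshold for illustration, whereas the study controlled the false discovery rate at 5% across sites by a procedure not reproduced here. The function names and the threshold are assumptions.

```python
from math import comb
from typing import Optional


def binomial_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def min_c_reads(n: int, p: float, alpha: float) -> Optional[int]:
    """Smallest number of C reads at depth n whose tail probability is <= alpha."""
    for k in range(n + 1):
        if binomial_tail(k, n, p) <= alpha:
            return k
    return None


# Null hypothesis: every C read in TAB-Seq is an unconverted 5mC (rate 2.21%).
P_NONCONVERSION = 0.0221
depth = 26      # close to the average H1 depth quoted in the text
c_reads = 6     # hypothetical number of reads still called C at this site

p_value = binomial_tail(c_reads, depth, P_NONCONVERSION)
print(f"p value for {c_reads}/{depth} C reads: {p_value:.2e}")
print("minimum C reads at alpha = 0.01:", min_c_reads(depth, P_NONCONVERSION, 0.01))
```

The calculation makes the dependence on depth explicit: at shallow coverage only sites with a high fraction of C reads can reach significance, which is why low-abundance 5hmC requires deeper sequencing to detect.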

Application of TAB-Seq to mESCs resulted in 2,057,636 high-confidence 5hmCs. This larger number of sites is likely attributable to higher-level expression of both Tet1 and Tet2 in mESCs as revealed by RNA-Seq analysis (Lister et al., 2011; Myers et al., 2011) (B.R., unpublished data). As in H1, these 5hmCs are also significantly enriched at genomic loci recovered by affinity sequencing (Figure S2J). In addition, these 5hmC sites are significantly enriched for previously mapped binding sites of Tet1 (Williams et al., 2011; Wu et al., 2011), confirming the TAB-Seq approach.

Base Composition and Genomic Distribution of 5hmC

DNA methylation of cytosines can exist in several contexts: CpG (denoted CG), CHG, and CHH (H = A, C, or T). Although it has been suggested that mESCs may harbor 5hmC in non-CG context (Ficz et al., 2011) and although non-CG methylation is present in human and mESCs (Lister et al., 2009; Stadler et al., 2011), we found that nearly all (99.89%) of H1 5hmCs exist in the CG context (Figure 2C). Similarly, this figure is 98.7% in mESCs (Figure S2G).

The combination of traditional methylC-Seq and TAB-Seq maps allows us to estimate the true abundance of both 5hmC and 5mC. We observe that, in a steady-state population of cells, 5mC and 5hmC often coexist at the same cytosine (Figure 2D). The median observed abundance of 5hmC at 5hmC-rich cytosines is 19.2%, compared to 60.7% for 5mC as estimated from traditional bisulfite sequencing (Figure 2E). Adjusting for the 92.0% protection rate of 5hmC by TAB-Seq, we estimate the corrected median 5hmC and 5mC abundance to be 20.9% and 59.0%, respectively. These results suggest that, at the base level, the abundance of 5hmC is lower than that of 5mC. This observation is corroborated in mESCs (Figures S2H and S2I) and is consistent with a previous estimate of global 5hmC levels in ESCs (Tahiliani et al., 2009).
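The corrected medians quoted above can be reproduced with simple arithmetic. The sketch below assumes that the correction divides the observed TAB-Seq rate by the protection rate and then subtracts the corrected 5hmC from the total (5mC + 5hmC) bisulfite signal; the text does not spell out the exact procedure, so this is an interpretation that happens to match the reported values. The function name is illustrative.

```python
def corrected_abundances(tab_5hmc: float, est_5mc: float, protection: float):
    """Return (corrected 5hmC, corrected 5mC) in percent.

    tab_5hmc:   5hmC rate observed by TAB-Seq (percent)
    est_5mc:    5mC estimated as bisulfite rate minus observed 5hmC (percent)
    protection: fraction of true 5hmC that survives as C in TAB-Seq
    """
    total_modified = tab_5hmc + est_5mc       # traditional bisulfite signal
    hmc = tab_5hmc / protection               # undo incomplete protection
    mc = total_modified - hmc                 # remainder is attributed to 5mC
    return hmc, mc


hmc, mc = corrected_abundances(19.2, 60.7, 0.92)
print(f"corrected 5hmC = {hmc:.1f}%, corrected 5mC = {mc:.1f}%")
# prints: corrected 5hmC = 20.9%, corrected 5mC = 59.0%
```

The correction is small because the protection rate is high; with poorer protection the observed TAB-Seq rate would understate 5hmC, and overstate 5mC, by a larger margin.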

Previous studies using affinity-based approaches have demonstrated that 5hmC is enriched at promoters, enhancers, CTCF-binding sites, exons, and gene bodies (Ficz et al., 2011; Pastor et al., 2011; Stroud et al., 2011; Szulwach et al., 2011a; Williams et al., 2011; Wu et al., 2011; Xu et al., 2011), suggesting an extensive role for this modification in gene regulation. Supporting a functional role of 5hmC, we observe a trend of increasing sequence conservation for increasing abundance of 5hmC (Figure S3B). However, the absolute abundance of 5hmC cannot be assessed from affinity-based detection methods, therefore precluding further quantitative analysis of 5hmC's role at each class of regulatory elements. In H1, we found that almost half (46.4%) of the 5hmCs reside in distal-regulatory elements mapped by ChIP-Seq and DNase-Seq (Figure 3A). Assessing relative enrichment of 5hmC at each class of regulatory element by normalizing with genomic coverage, H1 distal-regulatory elements including p300-binding sites (observed/expected [o/e] = 7.6), predicted enhancers (o/e = 7.8), CTCF-binding sites (o/e = 5.1), and DNase I hypersensitive sites (o/e = 3.4) are more enriched with 5hmC than genic regions (Figure 3B). Intriguingly, the subset of cytosines showing nearly equal levels of 5mC and 5hmC are more enriched in distal-regulatory elements and less enriched at promoters and genic features (Figure S3E), suggesting that active demethylation is strongest outside of genes. In support of this observation, promoter-distal ChIP-Seq peaks for OCT4, SOX2, NANOG, KLF4, and TAFII are also more enriched with 5hmC than genic features (Figure S3D). Finally, we observe that increasing DNase I hypersensitivity signal correlates well with increased 5hmC and decreased 5mC enrichment (Figure S3C). These results are also supported by observations in mESCs (Figures S3F and S3G), though we observe an increase in intragenic 5hmC occupancy.
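An observed/expected (o/e) ratio of the kind quoted above can be sketched as the share of sites falling in an element class divided by the share of the reference sequence that class covers. The counts below are invented for illustration, and the study's own background (the Figure 3B legend mentions random samplings of 5mC) may be defined differently; the function name is hypothetical.

```python
def observed_over_expected(sites_in_class: int, sites_total: int,
                           bp_in_class: int, bp_total: int) -> float:
    """Enrichment of sites in an element class, normalized by its coverage."""
    observed_fraction = sites_in_class / sites_total
    expected_fraction = bp_in_class / bp_total   # assumes a uniform background
    return observed_fraction / expected_fraction


# Hypothetical example: 7.6% of sites fall in a class covering 1% of the bases.
ratio = observed_over_expected(sites_in_class=7_600, sites_total=100_000,
                               bp_in_class=30_000_000, bp_total=3_000_000_000)
print(f"o/e = {ratio:.1f}")
```

A ratio above 1 indicates more sites than coverage alone would predict. Because 5hmC calls were restricted to methylated cytosines, a background drawn from 5mC positions is a stricter control than uniform genomic coverage.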

Examining only those genomic elements having significant 5hmC enrichment, we found that the absolute levels of 5hmC at all classes of distal-regulatory elements are significantly higher than those of promoter-proximal elements (Figure 3C). In contrast, gene bodies with significant levels of 5hmC show statistically lower levels of 5hmC. Furthermore, examining the estimated level of 5mC at these loci, we observed an inverse relationship between 5mC and 5hmC (Figure 3C). Distal-regulatory elements have the lowest levels of 5mC, with p300 and enhancers having median abundances of 42.2% and 53.7%, respectively. This suggests that highly demethylated elements such as p300 contain more cytosines in a non-5mC/5hmC form, implicating stronger demethylation at these regulatory elements. In combination with the observations that (1) between 44% and 74% of distal-regulatory elements are significantly enriched with 5hmC in hESCs and mESCs (Figures 3D and S3H); (2) the same classes of elements are also enriched with 5hmC in mESCs (Figures 3E and S3G); and (3) the sequence-conserved distal-regulatory elements in H1 are conserved for 5hmC in mESCs (Figure 3F), our data suggest that the marking of functional regulatory elements with 5hmC is an evolutionarily conserved phenomenon with potential functional consequences. Together, these data show that 5hmC is both most abundant and most enriched at promoter-distal regulatory elements.

Besides distal-regulatory elements, we observe significant enrichment of 5hmC at genes of all tiers, but lowly expressed genes are more enriched than highly expressed genes (Figure S3I), consistent with previous studies (Pastor et al., 2011). In contrast to the abundant 5hmC found at regulatory elements in H1, the vast majority of repetitive elements are highly enriched with 5mC but not 5hmC (Figure S3J). Between 3.5% and 7.5% of repetitive elements are significantly enriched with 5hmC, with long-terminal repeats (LTRs) being the highest (Figure S3K). At these significant loci, the absolute abundance of 5hmC is on par with promoters but less than distal-regulatory elements (Figure S3L).

Profiles of Hydroxymethylcytosine at Distal-Regulatory Elements

5mC is thought to confer specificity to gene regulation by influencing transcription factor binding or serving as a substrate of recognition for chromatin regulators (Bird, 2011; Chen and Riggs, 2011; Jaenisch and Bird, 2003; Quenneville et al., 2011). Similarly, it has been suggested that 5hmC offers a different platform upon which transcription factors may bind or 5mC-specific binding proteins may be excluded (Hashimoto et al., 2012; Kriaucionis and Heintz, 2009; Valinluck et al., 2004; Yildirim et al., 2011). As 5hmC is enriched near enhancers, one possibility is that this modified base is specifically recognized by


[Figure 3 image not reproduced. Recoverable labels: hmC bases by element class (p300, enhancer, CTCF, DNase I, intragenic, intergenic); #hmC per Mb (×1000), observed versus random (×10), with obs/ave(rand) ratios; hmCG-enriched elements (H1), %hmCG and %mCG; % covered sites w/ significant 5hmC; category axes listing TSS windows, enhancer, p300, CTCF, DNase I, exonic, intronic, intragenic, and intergenic.]

Figure 3. Genomic Distribution of 5hmC Sites

(A) Overlap of H1 5hmC with genomic elements. Genic features were extracted from the UCSC Known Genes database (Hsu et al., 2006). Promoter-distal regulatory elements (>5 kb from TSS) reflect those experimentally mapped in H1 cells from ChIP-Seq and DNase-Seq experiments. Each 5hmC base is counted once: the overlap of a genomic element excludes all previously overlapped cytosines counterclockwise to the arrow. Green: promoter-proximal elements; red: promoter-distal regulatory elements; gray: genic regions; white: intergenic regions.

(B) The relative enrichment of H1 5hmC (black) and random sites (gray) at genomic elements, normalized to the total coverage of the element type. Random consists of 10 random samplings of 5mC (see Extended Experimental Procedures).

(C) The levels of 5hmCG (left) and 5mCG (right) for several classes of genomic elements significantly enriched with 5hmCG in H1 (p = 0.01, binomial). The dotted line indicates the 5mC nonconversion rate. Colors as in (A).

(D) The percentage of distal-regulatory elements significantly enriched with 5hmCG in H1.

(E) In mESCs, the absolute level of 5hmCG for several classes of genomic elements significantly enriched with 5hmCG (p = 0.01, Fisher's exact test). Colors as in (A).

(F) For genomic elements significantly enriched with 5hmCG in H1 ESCs and conserved in mouse, the distribution of 5hmCG in mESCs. Colors as in (A). In all panels, definitions of enhancers, p300, CTCF, and DNase I sites are promoter distal (>5 kb from TSS).

For boxplots, notches indicate median, boxes extend to the 25th and 75th percentiles, and whiskers extend to nonoutliers. Error bars indicate SD. See also Figure S3.

[Figure 4 image not reproduced. Recoverable panel titles: (A) TSS-distal p300 ChIP-Seq peaks; (B) TSS-distal p300 ChIP-Seq peaks w/ hmC.]

Figure 4. Profiles of 5hmC at Distal-Regulatory Elements

(A) Frequency of 5hmC around distal p300-binding sites.

(B) Absolute levels of 5hmCG (red) and 5mCG + 5hmCG (black) around the distal p300-binding sites containing an OCT4/SOX2/TCF4/NANOG motif (blue bar, center; consensus: ATTTGCATAACAATG). 5mC (green) was estimated as the rate from traditional bisulfite sequencing (5hmC + 5mC) minus the measured 5hmC rate. The top half indicates enrichment on the strand containing the motif, with the bottom half indicating the opposite strand.

(C) Frequency of 5hmC around distal CTCF-binding sites, relative to the CTCF motif (blue bar, bottom). The different lines represent different strands, oriented with respect to the CTCF motif (consensus: ATAGTGCCACCTGGTGGCCA). Opp, opposite.

(D) Absolute levels of 5hmCG, 5mCG, and 5mCG + 5hmCG around distal CTCF-binding sites anchored at the CTCF motif (blue bar, center). Colors as in (B). See also Figure S4.

[Figure 4 axis labels: position relative to p300 summit; position relative to OCT4/SOX2/TCF4/NANOG motif; hmC frequency; TSS-distal CTCF ChIP-Seq peaks (panels C and D), position relative to CTCF motif.]

transcription factors as a core base in binding motifs. But as sequence motifs are typically shorter than 20 bp, the resolution of affinity-based approaches is not sufficient to resolve whether 5hmC is actually present within or outside of the binding site. We observed that whereas 5hmC is abundant within 500 bp of distal p300-binding sites, there is a local depletion near the expected transcription factor-binding site (Figures 4A and S4A). To increase resolution, we anchored p300 binding with the OCT4/SOX2/TCF4/NANOG consensus motif (Lister et al., 2009). Total DNA methylation (5mC + 5hmC) decreases toward the motif, in agreement with a recent study (Stadler et al., 2011), whereas 5hmC displays a bimodal peak of enrichment centered at the motif with a maximum average abundance of 12.3% (Figure 4B). Similarly, for CTCF-binding sites, we observed a bimodal enrichment profile of 5hmC abundance ~150 bp around the motif, with almost no 5hmC within the motif itself (Figure 4C). 5hmC increases to a maximum abundance of 13.4%, coinciding with a dramatic depletion of 5mC from an average high of 86.2% to a low of 21.0% (Figure 4D). We also observed similar results for NANOG-binding sites (Figures S4B and S4C). Together, these data suggest that 5hmC is typically not observed within potential binding sites of transcription factors but rather is most enriched in regions immediately adjacent to sequence motifs. The reciprocal profiles of 5hmC and 5mC are consistent with a model of dynamic DNA methylation associated with DNA-binding transcription factors and provide additional evidence supporting a role for 5hmC in the locally reduced levels of 5mC at distal-regulatory elements (Stadler et al., 2011).

Asymmetric Hydroxymethylation at CG Sequences

Cytosine methylation in CG context is symmetric, and the maintenance methyltransferase DNMT1 ensures efficient propagation of symmetric 5mCG during cell division, thus providing one of the central modes of epigenetic inheritance (Bird, 2011; Chen and Riggs, 2011; Goll and Bestor, 2005; Jaenisch and Bird, 2003; Wigler et al., 1981). Our observation that the bimodal distribution of 5hmC around CTCF is strand asymmetric (Figures 4C and 4D) prompted us to examine whether 5hmC is strand biased in H1. Whereas 91.8% of 5mCs are symmetrically modified, we found that only 21.0% of 5hmCs are symmetric. However, because the abundance of 5hmC is low at any given cytosine (median 19.2%; Figure 2E), it is possible that sequencing depth was not sufficient to identify all 5hmCs, making this an underestimate. To address this issue, we compared the pool of all called 5hmCs with the pooled 5hmC content on the opposite cytosine (Figure 5A). The average abundance of 5hmC is 20.0% at called 5hmCs, compared to 10.9% at the opposite cytosine, which corresponds to an 83.8% enrichment of 5hmC (Figure 5B, p < 1 × 10^-15, binomial). As a control, the baseline 5hmC content of all methylated cytosines in CG context is symmetric and comparable to the methylcytosine nonconversion rate (Figure 5C). At promoters and within gene bodies, we found that strand bias is not dependent on the orientation of the transcript (Figure S5A) (p promoter = 0.0339, p gene body = 0.0719).
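The relative enrichment of 5hmC on the called strand follows directly from the two pooled abundances. The sketch below is illustrative arithmetic only: with the rounded values quoted above it gives roughly 83.5%, close to the reported 83.8%, which presumably was computed from unrounded abundances. The function name is an assumption.

```python
def percent_enrichment(called: float, opposite: float) -> float:
    """Relative excess of the called-strand abundance over the opposite strand."""
    if opposite <= 0:
        raise ValueError("opposite-strand abundance must be positive")
    return 100.0 * (called / opposite - 1.0)


# Pooled 5hmC abundance (percent) at called cytosines and at the paired
# cytosine on the opposite strand, as rounded in the text.
enrichment = percent_enrichment(called=20.0, opposite=10.9)
print(f"enrichment on the called strand: {enrichment:.1f}%")
```

Pooling reads across all called sites before taking the ratio is what allows the comparison despite the low per-site abundance: individual opposite-strand cytosines may lack the depth to be called, but their combined signal is well measured.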

To confirm the asymmetry of 5hmCG, we examined the difference in methylation state of called 5hmCs and the cytosines located at the opposite strands. From traditional bisulfite sequencing, the median difference in total methylation (5mCG + 5hmCG) between called and opposite cytosines is 0%. In contrast, TAB-Seq reveals a shifted distribution with a median of 10.9% less hydroxymethylation on the opposite cytosine (Figure 5D, p < 1 × 10^-15, Wilcoxon). Simultaneous examination of the absolute levels of 5hmC on both called and opposite cytosines showed that the shift in hydroxymethylation

[Figure 5 image not reproduced. Recoverable axis labels: %5mCG; %5mCG + 5hmCG; hmC frequency.]

Figure 5. Asymmetry around 5hmCG

(A) A schematic of nomenclature. The cytosine with 5hmC (red) is designated as "called," whereas the cytosine on the opposite strand (green) is designated as "opposite."

(B) The average 5hmC abundance of called 5hmCG residues (red) compared to the opposite cytosine residues (green). Called: called cytosine; opp: opposite cytosine.

(C) The average 5hmC (black) and 5mC (white) abundance at called and opposite cytosines, for called cytosines having 5hmC (left) or 5mC + 5hmC (right). 5mC (white excluding black) was estimated as the rate from traditional bisulfite sequencing (5hmC + 5mC) minus the measured 5hmC rate. Gray line: 5mC nonconversion rate.

(D) The distribution of differences in 5hmCG (red) between called and opposite cytosines, in comparison to differences observed from traditional bisulfite sequencing (green, 5mCG + 5hmCG). Called and opposite cytosines are each sequenced to at least depth 10.

(E) For 5hmC-called sites, a heatmap of 5hmCG abundance at called and opposite cytosine pairs (left). For the 5mC-called sites from traditional bisulfite sequencing, a heatmap of 5mCG + 5hmCG abundance at called and opposite cytosine pairs (right) is shown. See also Figure S5.

state toward the called cytosine is evident, in contrast to DNA methylation levels that remain symmetric (Figure 5E). Our analysis of the spike-in lambda DNA showed neither strand nor sequence bias of the TAB-Seq method (Figures S5B and S2D). This conclusion was further supported by analyzing the βGT-catalyzed glucosylation efficiency of a fully hydroxymethylated model dsDNA, which is over 90% (Figure S5C).

5hmC Is Strand Biased toward G-Rich Sequences

The asymmetry of 5hmC in H1 suggests that, on a population average, one strand is more likely to be hydroxymethylated than the other strand. One possible explanation for this phenomenon is a sequence preference of 5hmC for one strand compared to the other. To examine this systematically, we aligned all 5hmCs in CG context and examined base composition (Figure 6A). On the strand containing 5hmC, we observed a modest increase in local guanine abundance with depletion of adenine and thymine content. Within a window of 100 bp around 5hmCs, the local sequence content of guanine increases to an average of 29.9%, significantly higher than the 25.6% observed for randomly sampled methylated cytosines (Figure S6A, p < 1 × 10^-15, Wilcoxon). These observations are not a function of regulatory element class, as similar


trends hold for subsets of 5hmC found at promoters, distal-regulatory elements, and genic regions (Figure S6C). Furthermore, similar trends are observed in mESCs (Figure S6D), and analysis of the spike-in lambda DNA shows that this observation is not a systematic bias of the TAB-Seq method (Figure S6B).

Our observations suggest that 5hmC deposition is biased toward the strand with a higher local density of guanine. To test this hypothesis, we developed a predictive algorithm: given that a strand-biased hydroxymethylation event exists at a particular CG (p value = 0.01, Fisher's exact test) and that one strand has local guanine content significantly different from the other strand (p value = 0.01, Fisher's exact test), we predict the strand with higher guanine content to have the hydroxymethylation event. This model correctly predicts the hydroxymethylated strand with 82.7% accuracy, significantly better than the 50% expected by chance (Figure 6B, p < 1 × 10^-15, binomial), confirming that local sequence content plays a role in strand-specific hydroxymethylation. However, although both hESCs and mESCs exhibit a bias of 5hmCG to occur on the strand with more guanine content (Figures S6C and S6D), the effect is weaker in mESCs (Figures S6E and S6F), which is one potential reason that guanine content does not predict 5hmC in mESCs.

Figure 6. Local Sequence Context around 5hmCG

[Figure 6 image not reproduced. Recoverable labels: panel A, 5hmCG and 5mCG (rand); y axis, % base; x axes, genomic position relative to hmCG and genomic position relative to mCG; inset sequence logos spanning −10 to +10 around the CG.]

(A) Sequence context ±150 bp around 5hmCG sites (left), compared to the same number of randomly chosen mCG sites (right). Shown sequences are on the same strand as 5hmC. Inset: sequence context ±10 bp around 5hmCG sites that are on the Watson or Crick strands. Positive coordinates indicate the 3′ direction.

(B) Shown here is the frequency at which the following two events co-occur: cytosines show significant difference in 5hmCG between Watson and Crick strands (p = 0.01, Fisher's exact test), and cytosines with abundance of guanine ±50 bp around the site show significant strand bias (p = 0.01, Fisher's exact test). See also Figure S6.

[Figure 6B labels: strand with higher %hmCG (Watson or Crick) versus strand with higher %G (Watson or Crick); categories ww, wc, cw, cc.]

One possible explanation is the large difference in the expression levels of TET1 and TET2 in hESCs and mESCs.
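The strand-prediction rule described in this section can be sketched with a small, self-contained Fisher's exact test. This is an illustrative reconstruction under stated assumptions, not the authors' code: the function names, the 100-base window, the layout of the 2×2 tables, and the example counts are all hypothetical.

```python
from math import comb
from typing import Optional


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d
    denom = comb(n, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def predict_hmc_strand(hmc_w: int, depth_w: int, hmc_c: int, depth_c: int,
                       g_w: int, g_c: int, window: int,
                       alpha: float = 0.01) -> Optional[str]:
    """Predict the hydroxymethylated strand at one CG, or None if not testable.

    hmc_*/depth_*: 5hmC-supporting reads and total reads on each strand.
    g_*: guanines counted on each strand within `window` bases around the CG.
    """
    p_hmc = fisher_exact_two_sided(hmc_w, depth_w - hmc_w, hmc_c, depth_c - hmc_c)
    p_g = fisher_exact_two_sided(g_w, window - g_w, g_c, window - g_c)
    if p_hmc > alpha or p_g > alpha or g_w == g_c:
        return None  # both strand biases must be significant to make a call
    return "watson" if g_w > g_c else "crick"


# Hypothetical site: 20/40 vs 2/40 5hmC reads, 45 vs 15 guanines per 100 bases.
print(predict_hmc_strand(20, 40, 2, 40, g_w=45, g_c=15, window=100))
```

Accuracy would then be the fraction of testable sites at which the predicted strand is also the strand with more 5hmC reads, compared against the 50% expected by chance.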

5hmC Is Most Enriched near Low-CpG Regions

Recent affinity-based studies in mESCs have observed 5hmC to be frequently enriched at CpG island-containing promoters (Ficz et al., 2011; Pastor et al., 2011; Williams et al., 2011), and that the highest levels of 5hmC correspond to the highest density of CpGs (Ficz et al., 2011). In contrast, an affinity-based 5hmC map produced in H1 found 5hmC-rich regions to be depleted of CpG dinucleotides (Szulwach et al., 2011a). These confounding results prompted us to examine the relationship between absolute steady-state 5hmC level and CpG content at promoters. We found that promoters with the highest levels of 5hmCG are almost exclusively of low CpG content (Figure 7A) and are also the promoters most likely to have the highest 5mCG (Figure S7A). In agreement with this observation, when we divide promoters by CpG content, we observe that the density of 5hmC is lowest at high-CpG promoters (HCPs), whereas at low-CpG promoters (LCPs) and intermediate-CpG promoters (ICPs), 5hmC is at least 3.3 times more abundant (Figure S7H). Analyses of mESCs give similar results (Figure 7B). In both hESCs and mESCs, CpG-rich promoters are almost devoid of steady-state 5hmC. Moreover, these results apply to promoters containing H3K4me3 or bivalent chromatin modifications (Figure S7G).

Consideringtheabovetogetherwithourobservationofanincreasedlocaldensityofguanineonthestrandofhydroxyme-thylation,wepostulatedthatpromoterswithhighGCcontentbutlowCpGdensityaremorelikelytobehydroxymethylated.Indeed,suchbivalent(p<1310À300)andH3K4me3-onlypromoters(p=7.8310À286)aremoreenrichedwith5hmC(Figure7C).
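As an illustration of the quantities compared here, the sketch below computes the G+C percentage, the CpG dinucleotide percentage, and their ratio for a promoter sequence. The function names and the normalization (CpG counted per dinucleotide position) are assumptions for illustration; the study's own definitions of GC content and CpG density may differ.

```python
def gc_and_cpg_percent(seq):
    """Return (%G+C, %CpG) for a DNA sequence.

    Assumed definitions: %G+C is the share of bases that are G or C;
    %CpG is the share of dinucleotide positions that read 'CG'.
    """
    seq = seq.upper()
    n = len(seq)
    if n < 2:
        raise ValueError("sequence must contain at least two bases")
    gc_count = sum(1 for base in seq if base in "GC")
    cpg_count = sum(1 for i in range(n - 1) if seq[i:i + 2] == "CG")
    return 100.0 * gc_count / n, 100.0 * cpg_count / (n - 1)


def gc_to_cpg_ratio(seq):
    """Ratio %(G+C) / %CpG; infinite when the sequence has no CpG."""
    pct_gc, pct_cpg = gc_and_cpg_percent(seq)
    return float("inf") if pct_cpg == 0 else pct_gc / pct_cpg


if __name__ == "__main__":
    # Toy sequence, not a real promoter.
    print(gc_and_cpg_percent("GGCCGGCCAT"), gc_to_cpg_ratio("GGCCGGCCAT"))
```

Under these assumed definitions, a sequence that is G/C rich but poor in CpG dinucleotides gives a large ratio, which is the property the text associates with hydroxymethylated promoters.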

To determine whether hydroxymethylation at distal-regulatory elements is also biased toward low CpG density, we examined three classes of DNase I-hypersensitive sites (DHSs): (1) those lacking the enhancer histone modifications H3K4me1 and H3K27ac; (2) putative poised enhancers bearing only H3K4me1; and (3) putative active enhancers with both modifications (Hawkins et al., 2010; Myers et al., 2011). Poised and active enhancers exhibit the strongest enrichment of 5hmC (Figures 7D–7F), which almost exclusively corresponds to low-CpG density regions. Like promoters, the few distal DHSs with high CpG density are generally composed of low 5hmCG content. We also observed similar results at distal p300-binding sites (Figures S7B and S7C). Together, these results suggest that the highest levels of 5hmC occur at regions of the genome with low CpG density.

Comparing DHSs lacking the H3K4me1 and H3K27ac enhancer marks to poised enhancers having only H3K4me1, 5mCG drops by 12.6%, and 5hmC increases by 2.7-fold (Figures S7D–S7F). In contrast, active enhancers having both H3K4me1 and H3K27ac have 8.3% less 5mCG than poised enhancers but with only a 1.08-fold increase in 5hmC. These results suggest that whereas 5mCG is inversely related to both H3K4me1 and H3K27ac, 5hmC is primarily proportional to H3K4me1.

DISCUSSION

Bisulfite sequencing has been broadly used to analyze the genomic distribution and abundance of 5mC (Bernstein et al., 2007; Clark et al., 1994; Lister et al., 2008; Meissner, 2010; Pelizzola and Ecker, 2011). However, because traditional bisulfite sequencing cannot distinguish 5mC from 5hmC, results from such approaches cannot yet accurately reveal 5mC abundance (Huang et al., 2010; Jin et al., 2010). Recent experiments show that 5hmC is widespread in the mammalian genome, and at least two functions have been proposed for this cytosine modification: (1) 5hmC serves as an intermediate in the process of DNA demethylation, either passively (Inoue and Zhang, 2011) or actively through further oxidation (He et al., 2011; Ito et al., 2011; Maiti and Drohat, 2011; Zhang et al., 2012); (2) 5hmC may be recognized by chromatin factors (Frauer et al., 2011; Yildirim et al., 2011), and its presence could reduce binding of certain methyl-CpG-binding proteins (Hashimoto et al., 2012; Kriaucionis and Heintz, 2009; Valinluck et al., 2004). These functions implicate two opposing notions about the relative stability of 5hmC at distinct genomic loci. As the first step toward understanding these molecular mechanisms associated with 5hmC function, it is important to not only precisely locate 5hmC in the genome but also determine the relative abundance at each modified site. Here we describe a modified bisulfite sequencing method that when combined with traditional bisulfite sequencing can determine the location of 5hmC at single-base resolution and quantitatively assess the abundance of 5mC and 5hmC at each modified cytosine.

Cell 149, 1–13, June 8, 2012 ©2012 Elsevier Inc.

Figure 7. 5hmCG Is Biased toward Low-CpG Regions

[Figure panels A–F. Axis labels: % sites; % 5hmC; %(G+C)/%CpG. Promoter classes in panel C: bivalent; H3K4me3 only.]

(A, B, and D–F) Shown are heat maps of percent 5hmCG (±250 bp from TSS or DHS) as a function of CpG density for (A) promoters in H1 ESCs, (B) promoters in mESCs, (D) DHS sites lacking H3K4me1 and H3K27ac, (E) DHS sites with a poised enhancer chromatin signature, and (F) DHS sites with an active enhancer chromatin signature.

(C) The GC content relative to the CpG content for the 5hmC-enriched versus the 5hmC not-enriched promoters.

For box plots, notches indicate median, boxes extend to the 25th and 75th percentiles, and whiskers extend to nonoutliers. See also Figure S7.

Using model DNA, we demonstrated that coupling βGT-mediated protection of 5hmC with mTet1-based oxidation of 5mC allows for the distinction of 5hmC from unmodified cytosine and 5mC by sequencing. 5fC and 5caC presented in the original genomic DNA do not interfere with TAB-Seq because they behave like unmodified cytosine under bisulfite treatment (He et al., 2011). We also utilized this method to examine previously reported 5hmC-enriched loci and successfully identified genuine 5hmC sites. These results show the general utility of TAB-Seq to assess 5hmC in a loci-specific manner, much the same as how traditional bisulfite sequencing is currently used.

We applied this technique to mammalian genomes by generating single-base resolution maps of 5hmC in hESCs and mESCs. We show that these maps agree well with previous maps generated with affinity-based 5hmC profiling. Importantly, these single-base maps also revealed a significant number of new 5hmC sites.

Analyses of two 5hmC maps in ESCs identified several unique sequence-based characteristics of 5hmC. We observed that, much like 5mC, 5hmC tends to occur primarily at CpG-dinucleotides yet, unlike 5mC, exhibits an asymmetric strand bias. We also observed a relatively strong local sequence preference surrounding 5hmC, with 5hmC occurring within a G-rich context. This observation is consistent with a previous report that 5hmC regions are GC skewed (Stroud et al., 2011). These sequence-based features associated with 5hmC may provide a basis for future mechanistic insight into the means by which 5hmC is deposited, recognized, and dynamically regulated.

The ability to quantify 5hmC abundance with base resolution offered the unique opportunity to assess its relative abundance at various regulatory elements and genomic annotations without bias. In contrast to the nearly uniform distribution of 5mC outside of promoter regions, we found that the abundance of 5hmC varies among different classes of functional sequences. It is most enriched at distal-regulatory regions where levels of 5mC are correspondingly lower than the genome average. This observation agrees with recent findings from others (Stadler et al., 2011) and suggests that active demethylation occurs at active regulatory elements through 5hmC. This active demethylation is distributed around, but not within, transcription factor consensus motifs. Supporting the notion of active demethylation, total DNA methylation exhibits a strong negative correlation with 5hmC at distal-regulatory elements (Spearman correlation = −0.30). One interesting observation of these distal cis-regulatory elements is that 5hmC and 5mC often occur together at the same position. Currently, the exact mechanisms that determine the dynamics of 5hmC and 5mC at these cis-regulatory sequences are unclear.

Previous affinity-based studies have suggested enrichment of 5hmC at CpG-rich transcription start sites. However, these observations relied heavily on antibody-based detection, which has been shown to exhibit bias toward 5hmC-dense regions. Here we find that, in general, 5hmC is most abundant at regions of low CpG content. Furthermore, even promoters with relatively high 5hmC content tend to have low CpG content in both mESCs and hESCs. These findings highlight the utility of a base-resolution method for measuring 5hmC abundance and provide insight into the dynamic regulation of 5hmC at promoter sites with distinct CpG content.
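The Spearman correlation quoted for distal-regulatory elements is a rank-based statistic. A minimal sketch of how such a value could be computed from per-element methylation and hydroxymethylation levels is given below; the function names and the example values are hypothetical, and the study's actual implementation is not described in this section.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long lists with at least two values")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


if __name__ == "__main__":
    # Made-up per-element percentages of total methylation and 5hmC.
    total_methylation = [85.0, 70.0, 60.0, 40.0, 30.0]
    hydroxymethylation = [3.0, 9.0, 6.0, 15.0, 12.0]
    print(spearman(total_methylation, hydroxymethylation))
```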

Tahiliani and colleagues (Tahiliani et al., 2009) recently estimated the genome-wide abundance of 5hmC to be about 14 times less than that of 5mC, which would correspond to ~4.4 million 5hmCs in human. However, as our results indicate that the base-level abundance of 5hmC is several times lower than that of 5mC, this is likely an underestimate. The comparatively low number of 5hmCs confidently detected in our study (691,414) is likely explained by the frequent hydroxymethylation of gene bodies previously observed in affinity-based studies (Ficz et al., 2011; Pastor et al., 2011; Stroud et al., 2011; Szulwach et al., 2011a; Williams et al., 2011; Wu et al., 2011; Xu et al., 2011). Because genic cytosines likely exist at a relatively low abundance of 5hmC (3%–4%), they would have escaped detection at our current sequencing depth. In order to resolve low-abundance 5hmCs at single-base precision, significantly more sequencing would be required. This observation highlights the biases inherent in affinity-based 5hmC mapping, which can amplify frequent weak signals found in gene bodies to overshadow rare but stronger ones at distal-regulatory elements.

In summary, we have developed a genome-wide approach to determine 5hmC distribution at base resolution and have generated base-resolution maps of 5hmC in both hESCs and mESCs. These maps provide a template for further understanding the biological roles of 5hmC in stem cells as well as gene regulation in general. In conjunction with methylC-Seq, the TAB-Seq method described here represents a general approach to measure the absolute abundance of 5mC and 5hmC at specific sites or genome-wide, which could be widely applied to various cell types and tissues.

EXPERIMENTAL PROCEDURES

Glucosylation and Oxidation of Genomic DNA

Glucosylation reaction was performed in a 50 µl solution with 50 mM HEPES buffer (pH 8.0), 25 mM MgCl2, 100 ng/µl sonicated genomic DNA with spike-in control, 200 µM UDP-Glc, and 1 µM wild-type βGT. The reaction was incubated at 37°C for 1 hr. After the reaction, the DNA was purified by a QIAquick Nucleotide Removal Kit (QIAGEN). The oxidation reaction was performed in a 50 µl solution with 50 mM HEPES buffer (pH 8.0), 100 µM ammonium iron (II) sulfate, 1 mM α-ketoglutarate, 2 mM ascorbic acid, 2.5 mM DTT, 100 mM NaCl, 1.2 mM ATP, 10 ng/µl glucosylated DNA, and 3 µM recombinant mTet1. The reaction was incubated at 37°C for 1.5 hr. After proteinase K treatment, the DNA was purified with Micro Bio-Spin 30 Columns (Bio-Rad) and then by a QIAquick PCR Purification Kit (QIAGEN).

Quantifying %5hmCG and %5mCG

For a given genomic interval, the abundance of hydroxymethylation (%hmCG) is estimated as the number of cytosine base calls in the interval divided by the number of cytosine plus thymine base calls in the interval from TAB-Seq reads, where the reference is in CG context. To estimate the %5mC level, we subtracted the %5hmC level obtained from TAB-Seq from the total methylation level obtained from methylC-Seq. In all instances, only base calls with Phred score ≥ 20 were considered.

ACCESSION NUMBERS

Sequencing data have been deposited to GEO (accession number GSE36173).
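The %hmCG and %5mC estimates described under Quantifying %5hmCG and %5mCG can be sketched in a few lines. The snippet below is a minimal illustration under stated assumptions: base calls are given as (base, Phred score) pairs already restricted to reference CG cytosines in the interval, the function names are hypothetical, and clamping the subtracted 5mC estimate at zero is an added choice that the text does not mention.

```python
MIN_PHRED = 20  # quality cutoff stated in the text


def count_c_and_t(calls, min_phred=MIN_PHRED):
    """Count C and T base calls that pass the Phred cutoff.

    `calls` is an iterable of (base, phred) pairs at reference CG cytosines
    (an assumed input format).
    """
    c_calls = t_calls = 0
    for base, phred in calls:
        if phred < min_phred:
            continue
        if base == "C":
            c_calls += 1
        elif base == "T":
            t_calls += 1
    return c_calls, t_calls


def percent_c(c_calls, t_calls):
    """Percentage of C among C + T calls, or None when there is no coverage."""
    total = c_calls + t_calls
    return None if total == 0 else 100.0 * c_calls / total


def estimate_levels(tab_seq_calls, methylc_seq_calls):
    """Return %5hmC (TAB-Seq) and %5mC (methylC-Seq total minus %5hmC)."""
    pct_5hmc = percent_c(*count_c_and_t(tab_seq_calls))
    pct_total = percent_c(*count_c_and_t(methylc_seq_calls))
    if pct_5hmc is None or pct_total is None:
        return None
    # Clamping at zero is an assumption to avoid negative estimates from noise.
    return {"pct_5hmC": pct_5hmc, "pct_5mC": max(0.0, pct_total - pct_5hmc)}


if __name__ == "__main__":
    # Made-up calls: TAB-Seq has 2 C and 8 T above the cutoff, plus 5 low-quality C.
    tab = [("C", 30)] * 2 + [("T", 30)] * 8 + [("C", 10)] * 5
    methylc = [("C", 30)] * 9 + [("T", 30)] * 1
    print(estimate_levels(tab, methylc))
```

Low-quality calls are dropped before counting, so they affect neither the numerator nor the denominator of either percentage.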

SUPPLEMENTAL INFORMATION

Supplemental Information includes Extended Experimental Procedures and seven figures and can be found with this article online at doi:10.1016/j.cell.2012.04.027.

ACKNOWLEDGMENTS

This study was supported by National Institutes of Health (GM071440 to C.H., NS051630 and P50AG025688 to P.J., U01ES017166 to B.R.), a Catalyst Award (C.H. and J.-H.M.) from the Chicago Biomedical Consortium with support from the Searle Funds at The Chicago Community Trust, the Emory Genetics Discovery Fund (P.J.), the Simons Foundation Autism Research Initiative (P.J.), the Autism Speaks grant (#7660 to X.L.), and the Ludwig Institute for Cancer Research (B.R.).

Received: March 7, 2012
Revised: April 2, 2012
Accepted: April 19, 2012
Published online: May 17, 2012

REFERENCES

Bernstein, B.E., Meissner, A., and Lander, E.S. (2007). The mammalian epigenome. Cell 128, 669–681.

Bird, A. (2011). The dinucleotide CG as a genomic signalling module. J. Mol. Biol. 409, 47–53.

Chen, Z.X., and Riggs, A.D. (2011). DNA methylation and demethylation in mammals. J. Biol. Chem. 286, 18347–18353.

Clark, S.J., Harrison, J., Paul, C.L., and Frommer, M. (1994). High sensitivity mapping of methylated cytosines. Nucleic Acids Res. 22, 2990–2997.

Cokus, S.J., Feng, S., Zhang, X., Chen, Z., Merriman, B., Haudenschild, C.D., Pradhan, S., Nelson, S.F., Pellegrini, M., and Jacobsen, S.E. (2008). Shotgun bisulphite sequencing of the Arabidopsis genome reveals DNA methylation patterning. Nature 452, 215–219.

Cortellino, S., Xu, J., Sannai, M., Moore, R., Caretti, E., Cigliano, A., Le Coz, M., Devarajan, K., Wessels, A., Soprano, D., et al. (2011). Thymine DNA glycosylase is essential for active DNA demethylation by linked deamination-base excision repair. Cell 146, 67–79.

Dawlaty, M.M., Ganz, K., Powell, B.E., Hu, Y.C., Markoulaki, S., Cheng, A.W., Gao, Q., Kim, J., Choi, S.W., Page, D.C., and Jaenisch, R. (2011). Tet1 is dispensable for maintaining pluripotency and its loss is compatible with embryonic and postnatal development. Cell Stem Cell 9, 166–175.


Ficz, G., Branco, M.R., Seisenberger, S., Santos, F., Krueger, F., Hore, T.A., Marques, C.J., Andrews, S., and Reik, W. (2011). Dynamic regulation of 5-hydroxymethylcytosine in mouse ES cells and during differentiation. Nature 473, 398–402.

Frauer, C., Hoffmann, T., Bultmann, S., Casa, V., Cardoso, M.C., Antes, I., and Leonhardt, H. (2011). Recognition of 5-hydroxymethylcytosine by the Uhrf1 SRA domain. PLoS ONE 6, e21306.

Globisch, D., Münzel, M., Müller, M., Michalakis, S., Wagner, M., Koch, S., Brückl, T., Biel, M., and Carell, T. (2010). Tissue distribution of 5-hydroxymethylcytosine and search for active demethylation intermediates. PLoS ONE 5, e15367.

Goll, M.G., and Bestor, T.H. (2005). Eukaryotic cytosine methyltransferases. Annu. Rev. Biochem. 74, 481–514.

Gu, T.P., Guo, F., Yang, H., Wu, H.P., Xu, G.F., Liu, W., Xie, Z.G., Shi, L., He, X., Jin, S.G., et al. (2011). The role of Tet3 DNA dioxygenase in epigenetic reprogramming by oocytes. Nature 477, 606–610.

Guo, J.U., Su, Y., Zhong, C., Ming, G.L., and Song, H. (2011). Hydroxylation of 5-methylcytosine by TET1 promotes active DNA demethylation in the adult brain. Cell 145, 423–434.

Hashimoto, H., Liu, Y., Upadhyay, A.K., Chang, Y., Howerton, S.B., Vertino, P.M., Zhang, X., and Cheng, X. (2012). Recognition and potential mechanisms for replication and erasure of cytosine hydroxymethylation. Nucleic Acids Res. Published online February 22, 2012. 10.1093/nar/gks155.

Hawkins, R.D., Hon, G.C., Lee, L.K., Ngo, Q., Lister, R., Pelizzola, M., Edsall, L.E., Kuan, S., Luu, Y., Klugman, S., et al. (2010). Distinct epigenomic landscapes of pluripotent and lineage-committed human cells. Cell Stem Cell 6, 479–491.

He, Y.F., Li, B.Z., Li, Z., Liu, P., Wang, Y., Tang, Q., Ding, J., Jia, Y., Chen, Z., Li, L., et al. (2011). Tet-mediated formation of 5-carboxylcytosine and its excision by TDG in mammalian DNA. Science 333, 1303–1307.

Hsu, F., Kent, W.J., Clawson, H., Kuhn, R.M., Diekhans, M., and Haussler, D. (2006). The UCSC known genes. Bioinformatics 22, 1036–1046.

Huang, Y., Pastor, W.A., Shen, Y., Tahiliani, M., Liu, D.R., and Rao, A. (2010). The behaviour of 5-hydroxymethylcytosine in bisulfite sequencing. PLoS ONE 5, e8888.

Inoue, A., and Zhang, Y. (2011). Replication-dependent loss of 5-hydroxymethylcytosine in mouse preimplantation embryos. Science 334, 194.

Iqbal, K., Jin, S.G., Pfeifer, G.P., and Szabó, P.E. (2011) A108, 3642–3647.

Ito, S., D'Alessio, A.C., Taranova, O.V., Hong, K., Sowers, L.C., and Zhang, Y. (2010). Role of Tet proteins in 5mC to 5hmC conversion, ES-cell self-renewal and inner cell mass specification. Nature 466, 1129–1133.

Ito, S., Shen, L., Dai, Q., Wu, S.C., Collins, L.B., Swenberg, J.A., He, C., and Zhang, Y. (2011). Tet proteins can convert 5-methylcytosine to 5-formylcytosine and 5-carboxylcytosine. Science 333, 1300–1303.

Jaenisch, R., and Bird, A. (2003). Epigenetic regulation of gene expression: how the genome integrates intrinsic and environmental signals. Nat. Genet. Suppl. 33, 245–254.

Jin, S.G., Kadam, S., and Pfeifer, G.P. (2010). Examination of the specificity of DNA methylation profiling techniques towards 5-methylcytosine and 5-hydroxymethylcytosine. Nucleic Acids Res. 38, e125.

Ko, M., Huang, Y., Jankowska, A.M., Pape, U.J., Tahiliani, M., Bandukwala, H.S., An, J., Lamperti, E.D., Koh, K.P., Ganetzky, R., et al. (2010). Impaired hydroxylation of 5-methylcytosine in myeloid cancers with mutant TET2. Nature 468, 839–843.

Koh, K.P., Yabuuchi, A., Rao, S., Huang, Y., Cunniff, K., Nardone, J., Laiho, A., Tahiliani, M., Sommer, C.A., Mostoslavsky, G., et al. (2011). Tet1 and Tet2 regulate 5-hydroxymethylcytosine production and cell lineage specification in mouse embryonic stem cells. Cell Stem Cell 8, 200–213.

Kriaucionis, S., and Heintz, N. (2009). The nuclear DNA base 5-hydroxymethylcytosine is present in Purkinje neurons and the brain. Science 324, 929–930.


Lister, R., O'Malley, R.C., Tonti-Filippini, J., Gregory, B.D., Berry, C.C., Millar, A.H., and Ecker, J.R. (2008). Highly integrated single-base resolution maps of the epigenome in Arabidopsis. Cell 133, 523–536.

Lister, R., Pelizzola, M., Dowen, R.H., Hawkins, R.D., Hon, G., Tonti-Filippini, J., Nery, J.R., Lee, L., Ye, Z., Ngo, Q.M., et al. (2009). Human DNA methylomes at base resolution show widespread epigenomic differences. Nature 462, 315–322.

Lister, R., Pelizzola, M., Kida, Y.S., Hawkins, R.D., Nery, J.R., Hon, G., Antosiewicz-Bourget, J., O'Malley, R., Castanon, R., Klugman, S., et al. (2011). Hotspots of aberrant epigenomic reprogramming in human induced pluripotent stem cells. Nature 471, 68–73.

Maiti, A., and Drohat, A.C. (2011). Thymine DNA glycosylase can rapidly excise 5-formylcytosine and 5-carboxylcytosine: potential implications for active demethylation of CpG sites. J. Biol. Chem. 286, 35334–35338.

Meissner, A. (2010). Epigenetic modifications in pluripotent and differentiated cells. Nat. Biotechnol. 28, 1079–1088.

Münzel, M., Globisch, D., Brückl, T., Wagner, M., Welzmiller, V., Michalakis, S., Müller, M., Biel, M., and Carell, T. (2010). Quantification of the sixth DNA base hydroxymethylcytosine in the brain. Angew. Chem. Int. Ed. Engl. 49, 5375–5377.

Myers, R.M., Stamatoyannopoulos, J., Snyder, M., Dunham, I., Hardison, R.C., Bernstein, B.E., Gingeras, T.R., Kent, W.J., Birney, E., Wold, B., et al; ENCODE Project Consortium. (2011). A user's guide to the encyclopedia of DNA elements (ENCODE). PLoS Biol. 9, e1001046.

Pastor, W.A., Pape, U.J., Huang, Y., Henderson, H.R., Lister, R., Ko, M., McLoughlin, E.M., Brudno, Y., Mahapatra, S., Kapranov, P., et al. (2011). Genome-wide mapping of 5-hydroxymethylcytosine in embryonic stem cells. Nature 473, 394–397.

Pelizzola, M., and Ecker, J.R. (2011). The DNA methylome. FEBS Lett. 585, 1994–2000.

Pfaffeneder, T., Hackner, B., Truss, M., Münzel, M., Müller, M., Deiml, C.A., Hagemeier, C., and Carell, T. (2011). The discovery of 5-formylcytosine in embryonic stem cell DNA. Angew. Chem. Int. Ed. Engl. 50, 7008–7012.

Quenneville, S., Verde, G., Corsinotti, A., Kapopoulou, A., Jakobsson, J., Offner, S., Baglivo, I., Pedone, P.V., Grimaldi, G., Riccio, A., and Trono, D. (2011). In embryonic stem cells, ZFP57/KAP1 recognize a methylated hexanucleotide to affect chromatin and DNA methylation of imprinting control regions. Mol. Cell 44, 361–372.

Robertson, A.B., Dahl, J.A., Vågbø, C.B., Tripathi, P., Krokan, H.E., and Klungland, A. (2011). A novel method for the efficient and selective identification of 5-hydroxymethylcytosine in genomic DNA. Nucleic Acids Res. 39, e55.

Robertson, A.B., Dahl, J.A., Ougland, R., and Klungland, A. (2012). Pull-down of 5-hydroxymethylcytosine DNA using JBP1-coated magnetic beads. Nat. Protoc. 7, 340–350.

Song, C.X., Szulwach, K.E., Fu, Y., Dai, Q., Yi, C., Li, X., Li, Y., Chen, C.H., Zhang, W., Jian, X., et al. (2011). Selective chemical labeling reveals the genome-wide distribution of 5-hydroxymethylcytosine. Nat. Biotechnol. 29, 68–72.

Stadler, M.B., Murr, R., Burger, L., Ivanek, R., Lienert, F., Schöler, A., van Nimwegen, E., Wirbelauer, C., Oakeley, E.J., Gaidatzis, D., et al. (2011). DNA-binding factors shape the mouse methylome at distal regulatory regions. Nature 480, 490–495.

Stroud, H., Feng, S., Morey Kinney, S., Pradhan, S., and Jacobsen, S.E. (2011). 5-Hydroxymethylcytosine is associated with enhancers and gene bodies in human embryonic stem cells. Genome Biol. 12, R54.

Szulwach, K.E., Li, X., Li, Y., Song, C.X., Han, J.W., Kim, S., Namburi, S., Hermetz, K., Kim, J.J., Rudd, M.K., et al. (2011a). Integrating 5-hydroxymethylcytosine into the epigenomic landscape of human embryonic stem cells. PLoS Genet. 7, e1002154.

Szulwach, K.E., Li, X., Li, Y., Song, C.X., Wu, H., Dai, Q., Irier, H., Upadhyay, A.K., Gearing, M., Levey, A.I., et al. (2011b). 5-hmC-mediated epigenetic dynamics during postnatal neurodevelopment and aging. Nat. Neurosci. 14, 1607–1616.

Szwagierczak, A., Bultmann, S., Schmidt, C.S., Spada, F., and Leonhardt, H. (2010). Sensitive enzymatic quantification of 5-hydroxymethylcytosine in genomic DNA. Nucleic Acids Res. 38, e181.

Tahiliani, M., Koh, K.P., Shen, Y., Pastor, W.A., Bandukwala, H., Brudno, Y., Agarwal, S., Iyer, L.M., Liu, D.R., Aravind, L., and Rao, A. (2009). Conversion of 5-methylcytosine to 5-hydroxymethylcytosine in mammalian DNA by MLL partner TET1. Science 324, 930–935.

Valinluck, V., Tsai, H.H., Rogstad, D.K., Burdzy, A., Bird, A., and Sowers, L.C. (2004). Oxidative damage to methyl-CpG sequences inhibits the binding of the methyl-CpG binding domain (MBD) of methyl-CpG binding protein 2 (MeCP2). Nucleic Acids Res. 32, 4100–4108.

Wigler, M., Levy, D., and Perucho, M. (1981). The somatic replication of DNA methylation. Cell 24, 33–40.

Williams, K., Christensen, J., Pedersen, M.T., Johansen, J.V., Cloos, P.A., Rappsilber, J., and Helin, K. (2011). TET1 and hydroxymethylcytosine in transcription and DNA methylation fidelity. Nature 473, 343–348.

Wossidlo, M., Nakamura, T., Lepikhov, K., Marques, C.J., Zakhartchenko, V., Boiani, M., Arand, J., Nakano, T., Reik, W., and Walter, J. (2011). 5-Hydroxymethylcytosine in the mammalian zygote is linked with epigenetic reprogramming. Nat. Commun. 2, 241.

Wu, H., D'Alessio, A.C., Ito, S., Wang, Z., Cui, K., Zhao, K., Sun, Y.E., and Zhang, Y. (2011). Genome-wide analysis of 5-hydroxymethylcytosine distribution reveals its dual function in transcriptional regulation in mouse embryonic stem cells. Genes Dev. 25, 679–684.

Xu, Y., Wu, F., Tan, L., Kong, L., Xiong, L., Deng, J., Barbera, A.J., Zheng, L., Zhang, H., Huang, S., et al. (2011). Genome-wide regulation of 5hmC, 5mC, and gene expression by Tet1 hydroxylase in mouse embryonic stem cells. Mol. Cell 42, 451–464.

Yildirim, O., Li, R., Hung, J.H., Chen, P.B., Dong, X., Ee, L.S., Weng, Z., Rando, O.J., and Fazzio, T.G. (2011). Mbd3/NURD complex regulates expression of 5-hydroxymethylcytosine marked genes in embryonic stem cells. Cell 147, 1498–1510.

Zhang, L., Lu, X., Lu, J., Liang, H., Dai, Q., Xu, G.L., Luo, C., Jiang, H., and He, C. (2012). Thymine DNA glycosylase specifically recognizes 5-carboxylcytosine-modified DNA. Nat. Chem. Biol. 8, 328–330.
