Impact of regulatory variation from RNA to protein


Reports

Alexis Battle,1,2* Zia Khan,3 Sidney H. Wang,3 Amy Mitrano,3 Michael J. Ford,4 Jonathan K. Pritchard,1,2,5§ Yoav Gilad3§

1Department of Genetics, Stanford University, Stanford, CA 94305, USA. 2Howard Hughes Medical Institute, Stanford University, Stanford, CA 94305, USA. 3Department of Human Genetics, University of Chicago, Chicago, IL 60637, USA. 4MS Bioworks, LLC, 3950 Varsity Drive, Ann Arbor, MI 48108, USA. 5Department of Biology, Stanford University, Stanford, CA 94305, USA.

*Present address: Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA.
Present address: Department of Computer Science, University of Maryland, College Park, MD 20742, USA.
These authors contributed equally to this work.
§Corresponding author. E-mail: pritch@stanford.edu (J.K.P.); gilad@uchicago.edu (Y.G.)

The phenotypic consequences of expression quantitative trait loci (eQTLs) are presumably due to their effects on protein expression levels. Yet, the impact of genetic variation, including eQTLs, on protein levels remains poorly understood. To address this, we mapped genetic variants that are associated with transcript expression (eQTLs), ribosome occupancy (rQTLs), or protein abundance (pQTLs). We found that most QTLs are associated with transcript expression levels, with consequent effects on ribosome and protein levels. However, eQTLs tend to have significantly reduced effect sizes on protein levels, suggesting that their potential impact on downstream phenotypes is often attenuated or buffered. Additionally, we identified a class of cis QTLs that affect protein abundance with little or no effect on mRNA or ribosome levels, suggesting that they may arise from differences in post-translational regulation.

To understand the links between genetic and phenotypic variation it may be essential to first understand how genetic variation impacts the regulation of gene expression. Previous studies have evaluated the association between variation and transcript expression in humans (1–3). Yet protein abundances are more direct determinants of cellular functions (4), and the impact of genetic differences on the multi-stage process of gene expression, through transcription and translation to steady state protein levels, has not been fully characterized. Studies in model organisms have shown that variation in mRNA and protein expression levels is often uncorrelated (5–8). Comparative studies (9–13) have suggested that protein expression evolves under greater evolutionary constraint than transcript levels (14), and provided evidence consistent with buffering of protein expression with respect to variation introduced at the transcript level. Yet, in contrast to comparative work, there are few reports of QTLs associated with protein levels (pQTLs) in humans (15–17).

Here we present a unified analysis of the association of genetic variation with transcript expression, ribosome profiling (18), and steady state protein levels in a set of HapMap Yoruba (Ibadan, Nigeria) lymphoblastoid cell lines (LCLs). We collected ribosome profiling data for 72 Yoruba LCLs and quantified protein abundance in 62 of these lines. Genome-wide genotypes and RNA-sequencing data were available for all lines (19).

Ribosome profiling is an effective way to measure changes in translational regulation using sequencing (11). We obtained a median coverage of 12 million mapped reads per sample and, as expected, the ribosome profiling reads are highly concentrated within coding regions and show an enrichment of a 3-bp periodicity, reflecting the progression of a translating ribosome (figs. S1-S3, table S1).

We collected relative protein expression measurements using a SILAC internal standard sample (20) and quantitative protein mass spectrometry (fig. S4). To confirm the quality of the proteomics data (table S2) we evaluated the agreement between measurements of distinct groups of peptides from the same protein. Differences between these measurements can reflect true biological variation (e.g., splicing), or experimental noise. The high correlations (Spearman's rho 0.7-0.9; R2 of 0.3-0.7; depending on the sample) confirmed that we are able to precisely quantify inter-individual variation in protein levels (fig. S5). We also analyzed quantifications of peptides that overlapped non-synonymous SNPs that were heterozygous in either the analyzed or the internal standard sample (fig. S6). The median ratios measured from these peptides matched the expected values closely, indicating that our protein measurements were likely not subject to ratio compression (figs. S7 and S8).

As a final quality check, we considered variation in expression levels within and between genes. We found that transcript and protein expression levels (which are the furthest removed processes studied here) are the least correlated (figs. S9 and S10). Our observations are in agreement with most high-throughput studies that considered large numbers of samples, although smaller studies have often observed higher correlations (18, 21, 22).

We mapped genetic associations with regulatory phenotypes. First, we evaluated QTLs for each phenotype independently by testing for association between the phenotype and all genetic variants with minor allele frequency >10% in a 20 kb window around the corresponding gene. We used a shared standardization, normalization, regression, and permutation pipeline for all three phenotypes. At an FDR of 10% we detected 2,355 eQTLs, 939 rQTLs, and 278 pQTLs (Table 1, fig. S11).
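The per-gene scan described above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the function names, the reading of the 20 kb window as extending from both gene boundaries, the use of the best absolute correlation as the gene-level statistic, and the permutation count are all assumptions made for the example.

```python
import random
import statistics


def maf(genotypes):
    """Minor allele frequency from 0/1/2 allele counts."""
    freq = sum(genotypes) / (2 * len(genotypes))
    return min(freq, 1 - freq)


def standardize(values):
    """Center to mean 0 and scale to unit (population) variance."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


def abs_corr(x, y):
    """Absolute Pearson correlation between two equal-length lists."""
    zx, zy = standardize(x), standardize(y)
    return abs(sum(a * b for a, b in zip(zx, zy)) / len(x))


def cis_scan(phenotype, snps, gene_start, gene_end,
             window=20_000, min_maf=0.10, n_perm=1000, seed=0):
    """Return (best_snp_id, gene_level_permutation_p), or None if no SNP qualifies.

    snps is a list of (snp_id, position, genotypes) tuples (hypothetical layout).
    """
    # Keep SNPs inside the assumed cis window that pass the MAF filter.
    candidates = [
        (snp_id, geno) for snp_id, pos, geno in snps
        if gene_start - window <= pos <= gene_end + window
        and maf(geno) > min_maf
    ]
    if not candidates:
        return None

    def best(pheno):
        # Strongest association across all candidate SNPs for this gene.
        return max((abs_corr(geno, pheno), snp_id) for snp_id, geno in candidates)

    observed, best_id = best(phenotype)

    # Permute phenotype labels to get a gene-level null for the best statistic.
    rng = random.Random(seed)
    shuffled = list(phenotype)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if best(shuffled)[0] >= observed:
            hits += 1
    return best_id, (hits + 1) / (n_perm + 1)
```

Taking the maximum statistic over SNPs within each permutation accounts for the number of variants tested per gene, so the resulting gene-level P values can be passed to an FDR procedure across genes.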

There is substantial overlap among detected QTLs (fig. S12). Among the 4,322 genes quantified for all three phenotypes, 54% of the genes with pQTLs also have a significant rQTL and/or eQTL. Given the incomplete statistical power to detect QTLs in each dataset independently, we performed replication testing across datasets, using the specific SNP-gene pairs underlying each class of QTLs. This analysis is less sensitive to power limitations than genome-wide testing. The results confirm that many QTLs are shared across all three phenotypes (example in Fig. 1A). In particular, most (90%) genetic variants associated with ribosome occupancy are also associated with transcript levels (fig. S13). In contrast, eQTLs showed the lowest overlap with pQTLs (35%), as expected (Fig. 1B).
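One way to picture the replication test: take the SNP-gene pairs discovered in one phenotype, look up their P values in a second phenotype, and ask what fraction remain significant at FDR 10% within that restricted set. The sketch below uses the Benjamini-Hochberg step-up procedure as a stand-in; the FDR estimator actually used in the study is not stated in this section, and the function names are hypothetical.

```python
def benjamini_hochberg(pvalues, fdr=0.10):
    """Return a list of booleans marking hypotheses rejected at the given FDR."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff_rank = 0
    # Find the largest rank k with p_(k) <= fdr * k / m.
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= fdr * rank / m:
            cutoff_rank = rank
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


def replication_rate(replication_pvalues, fdr=0.10):
    """Fraction of discovery SNP-gene pairs that replicate in a second phenotype.

    replication_pvalues maps (snp_id, gene_id) -> P value in the second dataset.
    """
    flags = benjamini_hochberg(list(replication_pvalues.values()), fdr)
    return sum(flags) / len(flags)
```

Because only the discovered pairs are tested, the multiple-testing burden is far smaller than in a genome-wide scan, which is why this kind of analysis is less limited by power.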

Our observation that many SNPs identified as eQTLs are not associated with differences in protein levels is consistent with the notion that, across species, protein levels diverge less than transcript levels (12–17). Yet some QTLs may not replicate at the protein level simply due to incomplete mapping power. To address this, and to avoid over-estimation of effect sizes due to ascertainment bias at significant QTLs, we focused on eQTLs detected previously in European samples by the GEUVADIS study (2). We then attempted to replicate the GEUVADIS eQTLs using our transcript, ribosome profiling, and protein data and considered the mean QTL effect size in each data type. Mean effect sizes calculated in this way are expected to be unbiased with respect to either technical or biological variance.

Using this approach, we observed a reduced mean effect size for the GEUVADIS eQTLs in the protein data compared to either the RNA-seq data (t-test P = 6.7×10^-3) or ribosome data (P = 5.6×10^-3; Fig. 1C). In contrast, the average effect sizes observed for the RNA-seq and ribosome data are not significantly different from each other, and their effect sizes are highly correlated across the tested eQTLs (Pearson r = 0.79, P < 10^-96, fig. S14). The reduction in effect size observed in protein data is robust with respect to potential technical confounders, including iBAQ intensity level and transcript model complexity (fig. S15). We thus conclude that the majority of genetic variants affecting transcript levels also alter ribosomal occupancy, typically with a similar magnitude of effect. Yet, both eQTL mapping and effect-size analyses indicate that many eQTLs have attenuated (or absent) effects on steady state protein levels (fig. S16).
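The effect-size comparison can be illustrated as follows. Each effect is first polarized so that the allele increasing expression in the external reference study counts as positive, then means are compared across data types. This is a sketch under stated assumptions: the helper names are hypothetical, and a paired t statistic is used here only for illustration, since the text does not say whether the reported t-test was paired.

```python
import math
import statistics


def polarize(betas, reference_betas):
    """Flip the sign of each effect so it follows the reference study's direction."""
    return [b if ref >= 0 else -b for b, ref in zip(betas, reference_betas)]


def mean_and_sem(values):
    """Mean and standard error of the mean."""
    return statistics.fmean(values), statistics.stdev(values) / math.sqrt(len(values))


def paired_t(a, b):
    """Paired t statistic and degrees of freedom for the difference a - b."""
    diffs = [x - y for x, y in zip(a, b)]
    mean, sem = mean_and_sem(diffs)
    return mean / sem, len(diffs) - 1


# Toy per-allele log2 effects for four reference eQTLs (made-up numbers).
reference = [0.5, -0.4, 0.3, -0.2]
rna = polarize([0.4, -0.5, 0.2, -0.3], reference)
protein = polarize([0.1, -0.2, 0.15, 0.05], reference)
rna_mean, rna_sem = mean_and_sem(rna)
protein_mean, protein_sem = mean_and_sem(protein)
t_stat, dof = paired_t(rna, protein)
```

Polarizing by an independent study matters: if the sign were taken from the same data used to estimate the effect, the mean would be inflated by noise even where no true effect exists.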

In addition to the observation of generally attenuated effect sizes in pQTLs compared with eQTLs, we identified a subset of variants that appear to affect levels of proteins but not mRNA, and hence are candidates to affect post-transcriptional gene regulation. To evaluate evidence for these, we tested each SNP for association with one regulatory phenotype, while treating one or both of the other phenotypes as covariates (conditional model). Considering protein levels, with RNA levels as a covariate, we identified 146 protein-specific QTLs (psQTLs) at FDR = 10% (Fig. 2A). The identification of psQTLs is generally robust to the choice of technology used to characterize transcript expression (fig. S17).
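A conditional test of this kind can be sketched as a likelihood ratio test comparing "protein ~ RNA" against "protein ~ RNA + genotype". The version below is a minimal illustration, assuming Gaussian errors and a single covariate; the exact model specification is in the paper's supplement, and the function names here are hypothetical. It fits the two-predictor model by residualizing both protein and genotype on RNA (the Frisch-Waugh approach), which gives the same genotype coefficient as a joint least-squares fit.

```python
import math
import statistics


def residuals(y, x):
    """Residuals of a least-squares fit of y on x with an intercept."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return [(b - my) - slope * (a - mx) for a, b in zip(x, y)]


def conditional_lrt(protein, rna, genotype):
    """Test the genotype effect on protein with RNA as a covariate.

    Returns (beta, lrt_statistic, p_value); the P value uses a
    chi-square distribution with 1 degree of freedom.
    """
    ry = residuals(protein, rna)    # protein with RNA regressed out
    rg = residuals(genotype, rna)   # genotype with RNA regressed out
    rss_null = sum(r * r for r in ry)
    sgg = sum(r * r for r in rg)
    beta = sum(a * b for a, b in zip(rg, ry)) / sgg
    rss_full = rss_null - beta * beta * sgg
    stat = max(len(protein) * math.log(rss_null / rss_full), 0.0)
    # Survival function of chi-square(1): P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return beta, stat, p_value
```

A SNP that acts on protein only through mRNA should lose its signal once RNA is in the model, so a small P value here points to an effect that RNA levels do not explain.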

Using an alternative approach, an interaction model, we identified 68 psQTLs with significantly larger effects in protein than mRNA (LRT; FDR = 10%). We also used the interaction model to identify 76 expression-specific QTLs (esQTLs, interaction model, LRT; FDR = 10%). We then considered the ribosomal data. We found that the effect sizes for ribosomal occupancy are similar to the esQTL effect sizes (Fig. 2B). Yet, for psQTLs, low ribosome effect sizes are observed. Thus, for QTLs with discordant effects between transcript and protein, the ribosome data usually tracked with levels of RNA. Put together, these results allow us to identify loci where genetic variants have specific impacts on protein levels that are not fully mediated by regulation of either transcription or translation and hence may affect rates of protein degradation.
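The idea behind an interaction test can be shown with a simplified likelihood ratio test: stack the standardized protein and RNA measurements, then compare a model with one shared genotype effect against a model with a separate effect per phenotype. This sketch is an assumption-laden illustration, not the published model: it treats the two measurements from the same individual as independent, which a full analysis would not, and all names are hypothetical.

```python
import math
import statistics


def standardize(values):
    """Center to mean 0 and scale to unit (population) variance."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


def interaction_lrt(protein, rna, genotype):
    """Test whether the genotype effect differs between protein and RNA.

    Returns (beta_protein, beta_rna, lrt_statistic, p_value).
    """
    zp, zr = standardize(protein), standardize(rna)
    g_mean = statistics.fmean(genotype)
    gc = [g - g_mean for g in genotype]
    sgg = sum(g * g for g in gc)

    # Phenotype-specific slopes (full model).
    beta_p = sum(g * y for g, y in zip(gc, zp)) / sgg
    beta_r = sum(g * y for g, y in zip(gc, zr)) / sgg
    # With an identical design for both phenotypes, the shared slope
    # (null model) is the average of the two specific slopes.
    beta_shared = (beta_p + beta_r) / 2

    def rss(y, beta):
        return sum((v - beta * g) ** 2 for v, g in zip(y, gc))

    rss_null = rss(zp, beta_shared) + rss(zr, beta_shared)
    rss_full = rss(zp, beta_p) + rss(zr, beta_r)
    stat = max(2 * len(genotype) * math.log(rss_null / rss_full), 0.0)
    p_value = math.erfc(math.sqrt(stat / 2))  # chi-square, 1 degree of freedom
    return beta_p, beta_r, stat, p_value
```

Unlike the conditional model, which asks whether protein carries signal beyond RNA, this comparison asks directly whether the two effect sizes differ, so it can flag both protein-specific and expression-specific QTLs depending on which slope is larger.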

Finally, we performed enrichment analysis in which we considered each tested gene-SNP pair separately, and evaluated the full distribution of P values from the conditional model (rather than choosing a significance threshold) for different genomic and functional annotations. SNPs within transcribed regions (exonic and UTR) are enriched for more significant psQTL effects, compared to intergenic or intronic SNPs, even within the narrow 20 kb windows tested (figs. S18, S19, and S20). In addition, psQTLs are further enriched for non-synonymous sites (compared to all exonic SNPs, Table 2).
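A threshold-free enrichment test compares the whole distribution of P values for annotated SNPs against a background set. The specific "continuous test" used in the study is described in its supplement and is not reproduced here; as a stand-in, the sketch below uses a one-sided Wilcoxon rank-sum test with a normal approximation (no tie correction in the variance), and the function name is hypothetical.

```python
import math


def rank_sum_enrichment(annotated_p, background_p):
    """One-sided test that annotated P values tend to be smaller than background.

    Returns (z, p_value); a strongly negative z indicates enrichment
    of small P values among the annotated SNPs.
    """
    pooled = sorted([(p, 0) for p in annotated_p] + [(p, 1) for p in background_p])
    rank_sum = 0.0
    i = 0
    while i < len(pooled):
        # Group tied values and give each the average of their ranks.
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2
        rank_sum += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(annotated_p), len(background_p)
    u = rank_sum - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p_value = 0.5 * math.erfc(-z / math.sqrt(2))  # P(Z <= z)
    return z, p_value
```

Working with ranks rather than a significance cutoff lets weak but consistent shifts across many SNPs contribute, which suits annotations where few individual SNPs reach significance.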

Investigating additional annotations (Table 2, fig. S21), we found that non-synonymous SNPs near acetylation sites showed nominal enrichment for psQTLs. This possibly reflects the functional role of lysine acetylation in modulating protein degradation (23). Overall, the enrichment results suggest that genetic variants involved in post-transcriptional regulation are functionally distinct from genetic variants that primarily affect transcription: they are more likely to fall within translated regions of the gene, and more likely to occur at non-synonymous sites.

In summary, we have shown that while a substantial fraction of regulatory genetic variants influence gene expression at all levels from mRNA to steady state protein abundance, there are also a number of effects with specific impact on particular expression phenotypes. QTLs affecting mRNA levels are on average attenuated or buffered at the protein level, as has been observed between species (14). Our analysis indicates that this attenuation is not evident at the stage of translation. While the overall phenotypic similarity between ribosome occupancy and protein abundance is high, cis-regulatory genetic effects on ribosome occupancy appear to be more strongly shared with mRNA than with protein. These observations, along with the phenotype-specific QTL analysis, indicate a scarcity of translation-specific QTLs, and minimal attenuation of genetic impact between mRNA and ribosome phenotypes.

REFERENCES AND NOTES

1. A. Battle, S. Mostafavi, X. Zhu, J. B. Potash, M. M. Weissman, C. McCormick, C. D. Haudenschild, K. B. Beckman, J. Shi, R. Mei, A. E. Urban, S. B. Montgomery, D. F. Levinson, D. Koller, Characterizing the genetic basis of transcriptome diversity through RNA-sequencing of 922 individuals. Genome Res. 24, 14–24 (2014).

2. T. Lappalainen, M. Sammeth, M. R. Friedländer, P. A. ’t Hoen, J. Monlong, M. A. Rivas, M. Gonzàlez-Porta, N. Kurbatova, T. Griebel, P. G. Ferreira, M. Barann, T. Wieland, L. Greger, M. van Iterson, J. Almlöf, P. Ribeca, I. Pulyakhina, D. Esser, T. Giger, A. Tikhonov, M. Sultan, G. Bertier, D. G. MacArthur, M. Lek, E. Lizano, H. P. Buermans, I. Padioleau, T. Schwarzmayr, O. Karlberg, H. Ongen, H. Kilpinen, S. Beltran, M. Gut, K. Kahlem, V. Amstislavskiy, O. Stegle, M. Pirinen, S. B. Montgomery, P. Donnelly, M. I. McCarthy, P. Flicek, T. M. Strom, H. Lehrach, S. Schreiber, R. Sudbrak, A. Carracedo, S. E. Antonarakis, R. Häsler, A. C. Syvänen, G. J. van Ommen, A. Brazma, T. Meitinger, P. Rosenstiel, R. Guigó, I. G. Gut, X. Estivill, E. T. Dermitzakis; Geuvadis Consortium, Transcriptome and genome sequencing uncovers functional variation in humans. Nature 501, 506–511 (2013).

3. E. Grundberg, K. S. Small, Å. K. Hedman, A. C. Nica, A. Buil, S. Keildson, J. T. Bell, T. P. Yang, E. Meduri, A. Barrett, J. Nisbett, M. Sekowska, A. Wilk, S. Y. Shin, D. Glass, M. Travers, J. L. Min, S. Ring, K. Ho, G. Thorleifsson, A. Kong, U. Thorsteindottir, C. Ainali, A. S. Dimas, N. Hassanali, C. Ingle, D. Knowles, M. Krestyaninova, C. E. Lowe, P. Di Meglio, S. B. Montgomery, L. Parts, S. Potter, G. Surdulescu, L. Tsaprouni, S. Tsoka, V. Bataille, R. Durbin, F. O. Nestle, S. O’Rahilly, N. Soranzo, C. M. Lindgren, K. T. Zondervan, K. R. Ahmadi, E. E. Schadt, K. Stefansson, G. D. Smith, M. I. McCarthy, P. Deloukas, E. T. Dermitzakis, T. D. Spector; Multiple Tissue Human Expression Resource (MuTHER) Consortium, Mapping cis- and trans-regulatory effects across multiple tissues in twins. Nat. Genet. 44, 1084–1089 (2012).

4. C. Vogel, E. M. Marcotte, Insights into the regulation of protein abundance from proteomic and transcriptomic analyses. Nat. Rev. Genet. 13, 227–232 (2012).

5. A. Ghazalpour, B. Bennett, V. A. Petyuk, L. Orozco, R. Hagopian, I. N. Mungrue, C. R. Farber, J. Sinsheimer, H. M. Kang, N. Furlotte, C. C. Park, P. Z. Wen, H. Brewer, K. Weitz, D. G. Camp 2nd, C. Pan, R. Yordanova, I. Neuhaus, C. Tilford, N. Siemers, P. Gargalovic, E. Eskin, T. Kirchgessner, D. J. Smith, R. D. Smith, A. J. Lusis, Comparative analysis of proteome and transcriptome variation in mouse. PLOS Genet. 7, e1001393 (2011).

6. E. J. Foss, D. Radulovic, S. A. Shaffer, D. M. Ruderfer, A. Bedalov, D. R. Goodlett, L. Kruglyak, Genetic basis of proteome variation in yeast. Nat. Genet. 39, 1369–1375 (2007).

7. P. Picotti, M. Clément-Ziza, H. Lam, D. S. Campbell, A. Schmidt, E. W. Deutsch, H. Röst, Z. Sun, O. Rinner, L. Reiter, Q. Shen, J. J. Michaelson, A. Frei, S. Alberti, U. Kusebauch, B. Wollscheid, R. L. Moritz, A. Beyer, R. Aebersold, A complete mass-spectrometric map of the yeast proteome applied to quantitative trait analysis. Nature 494, 266–270 (2013).

8. F. W. Albert, S. Treusch, A. H. Shockley, J. S. Bloom, L. Kruglyak, Genetics of single-cell protein abundance variation in large yeast populations. Nature 506, 494–497 (2014).

9. J. M. Laurent, C. Vogel, T. Kwon, S. A. Craig, D. R. Boutz, H. K. Huse, K. Nozue, H. Walia, M. Whiteley, P. C. Ronald, E. M. Marcotte, Protein abundances are more conserved than mRNA abundances across diverse taxa. Proteomics 10, 4209–4212 (2010).

10. S. P. Schrimpf, M. Weiss, L. Reiter, C. H. Ahrens, M. Jovanovic, J. Malmström, E. Brunner, S. Mohanty, M. J. Lercher, P. E. Hunziker, R. Aebersold, C. von Mering, M. O. Hengartner, Comparative functional analysis of the Caenorhabditis elegans and Drosophila melanogaster proteomes. PLOS Biol. 7, e48 (2009).

11. M. Stadler, A. Fire, Conserved translatome remodeling in nematode species executing a shared developmental transition. PLOS Genet. 9, e1003739 (2013).

12. C. G. Artieri, H. B. Fraser, Evolution at two levels of gene expression in yeast. Genome Res. 24, 411–421 (2014).

13. C. J. McManus, G. E. May, P. Spealman, A. Shteyman, Ribosome profiling reveals post-transcriptional buffering of divergent gene expression in yeast. Genome Res. 24, 422–430 (2014).

14. Z. Khan, M. J. Ford, D. A. Cusanovich, A. Mitrano, J. K. Pritchard, Y. Gilad, Primate transcript and protein expression levels evolve under compensatory selection pressures. Science 342, 1100–1104 (2013). doi:10.1126/science.1242379

15. L. Wu, S. I. Candille, Y. Choi, D. Xie, L. Jiang, J. Li-Pook-Than, H. Tang, M. Snyder, Variation and genetic control of protein abundance in humans. Nature 499, 79–82 (2013).

16. R. J. Hause, A. L. Stark, N. N. Antao, L. K. Gorsic, S. H. Chung, C. D. Brown, S. S. Wong, D. F. Gill, J. L. Myers, L. A. To, K. P. White, M. E. Dolan, R. B. Jones, Identification and validation of genetic variants that influence transcription factor and cell signaling protein levels. Am. J. Hum. Genet. 95, 194–208 (2014).

17. N. Garge, H. Pan, M. D. Rowland, B. J. Cargile, X. Zhang, P. C. Cooley, G. P. Page, M. K. Bunger, Identification of quantitative trait loci underlying proteome variation in human lymphoblastoid cells. Mol. Cell. Proteomics 9, 1383–1399 (2010).

18. N. T. Ingolia, S. Ghaemmaghami, J. R. Newman, J. S. Weissman, Genome-wide analysis in vivo of translation with nucleotide resolution using ribosome profiling. Science 324, 218–223 (2009). doi:10.1126/science.1168978

19. J. K. Pickrell, J. C. Marioni, A. A. Pai, J. F. Degner, B. E. Engelhardt, E. Nkadori, J. B. Veyrieras, M. Stephens, Y. Gilad, J. K. Pritchard, Understanding mechanisms underlying human gene expression variation with RNA sequencing. Nature 464, 768–772 (2010).

20. S. E. Ong, B. Blagoev, I. Kratchmarova, D. B. Kristensen, H. Steen, A. Pandey, M. Mann, Stable isotope labeling by amino acids in cell culture, SILAC, as a simple and accurate approach to expression proteomics. Mol. Cell. Proteomics 1, 376–386 (2002).

21. M. Wilhelm, J. Schlegl, H. Hahne, A. M. Gholami, M. Lieberenz, M. M. Savitski, E. Ziegler, L. Butzmann, S. Gessulat, H. Marx, T. Mathieson, S. Lemeer, K. Schnatbaum, U. Reimer, H. Wenschuh, M. Mollenhauer, J. Slotta-Huspenina, J. H. Boese, M. Bantscheff, A. Gerstmair, F. Faerber, B. Kuster, Mass-spectrometry-based draft of the human proteome. Nature 509, 582–587 (2014).

22. B. Schwanhäusser, D. Busse, N. Li, G. Dittmar, J. Schuchhardt, J. Wolf, W. Chen, M. Selbach, Global quantification of mammalian gene expression control. Nature 473, 337–342 (2011).

23. X.-J. Yang, E. Seto, Lysine acetylation: Codified crosstalk with other posttranslational modifications. Mol. Cell 31, 449–461 (2008). doi:10.1016/j.molcel.2008.07.002

24. M. N. Lee, C. Ye, A. C. Villani, T. Raj, W. Li, T. M. Eisenhaure, S. H. Imboywa, P. I. Chipendo, F. A. Ran, K. Slowikowski, L. D. Ward, K. Raddassi, C. McCabe, M. H. Lee, I. Y. Frohlich, D. A. Hafler, M. Kellis, S. Raychaudhuri, F. Zhang, B. E. Stranger, C. O. Benoist, P. L. De Jager, A. Regev, N. Hacohen, Common genetic variants modulate pathogen-sensing responses in human dendritic cells. Science 343, 1246980 (2014).

25. J. F. Degner, A. A. Pai, R. Pique-Regi, J. B. Veyrieras, D. J. Gaffney, J. K. Pickrell, S. De Leon, K. Michelini, N. Lewellen, G. E. Crawford, M. Stephens, Y. Gilad, J. K. Pritchard, DNase I sensitivity QTLs are a major determinant of human expression variation. Nature 482, 390–394 (2012). doi:10.1038/nature10808

26. Y. Guan, M. Stephens, Practical issues in imputation-based association mapping. PLOS Genet. 4, e1000279 (2008).

27. P. Scheet, M. Stephens, A fast and flexible statistical model for large-scale population genotype data: Applications to inferring missing genotypes and haplotypic phase. Am. J. Hum. Genet. 78, 629–644 (2006).

28. K. A. Frazer, D. G. Ballinger, D. R. Cox, D. A. Hinds, L. L. Stuve, R. A. Gibbs, J. W. Belmont, A. Boudreau, P. Hardenbol, S. M. Leal, S. Pasternak, D. A. Wheeler, T. D. Willis, F. Yu, H. Yang, C. Zeng, Y. Gao, H. Hu, W. Hu, C. Li, W. Lin, S. Liu, H. Pan, X. Tang, J. Wang, W. Wang, J. Yu, B. Zhang, Q. Zhang, H. Zhao, H. Zhao, J. Zhou, S. B. Gabriel, R. Barry, B. Blumenstiel, A. Camargo, M. Defelice, M. Faggart, M. Goyette, S. Gupta, J. Moore, H. Nguyen, R. C. Onofrio, M. Parkin, J. Roy, E. Stahl, E. Winchester, L. Ziaugra, D. Altshuler, Y. Shen, Z. Yao, W. Huang, X. Chu, Y. He, L. Jin, Y. Liu, Y. Shen, W. Sun, H. Wang, Y. Wang, Y. Wang, X. Xiong, L. Xu, M. M. Waye, S. K. Tsui, H. Xue, J. T.-F. Wong, L. M. Galver, J.-B. Fan, K. Gunderson, S. S.


Murray, A. R. Oliphant, M. S. Chee, A. Montpetit, F. Chagnon, V. Ferretti, M. Leboeuf, J.-F. Olivier, M. S. Phillips, S. Roumy, C. Sallée, A. Verner, T. J. Hudson, P.-Y. Kwok, D. Cai, D. C. Koboldt, R. D. Miller, L. Pawlikowska, P. Taillon-Miller, M. Xiao, L.-C. Tsui, W. Mak, Y. Q. Song, P. K. Tam, Y. Nakamura, T. Kawaguchi, T. Kitamoto, T. Morizono, A. Nagashima, Y. Ohnishi, A. Sekine, T. Tanaka, T. Tsunoda, P. Deloukas, C. P. Bird, M. Delgado, E. T. Dermitzakis, R. Gwilliam, S. Hunt, J. Morrison, D. Powell, B. E. Stranger, P. Whittaker, D. R. Bentley, M. J. Daly, P. I. W. de Bakker, J. Barrett, Y. R. Chretien, J. Maller, S. McCarroll, N. Patterson, I. Pe’er, A. Price, S. Purcell, D. J. Richter, P. Sabeti, R. Saxena, S. F. Schaffner, P. C. Sham, P. Varilly, D. Altshuler, L. D. Stein, L. Krishnan, A. Vernon Smith, M. K. Tello-Ruiz, G. A. Thorisson, A. Chakravarti, P. E. Chen, D. J. Cutler, C. S. Kashuk, S. Lin, G. R. Abecasis, W. Guan, Y. Li, H. M. Munro, Z. S. Qin, D. J. Thomas, G. McVean, A. Auton, L. Bottolo, N. Cardin, S. Eyheramendy, C. Freeman, J. Marchini, S. Myers, C. Spencer, M. Stephens, P. Donnelly, L. R. Cardon, G. Clarke, D. M. Evans, A. P. Morris, B. S. Weir, T. Tsunoda, J. C. Mullikin, S. T. Sherry, M. Feolo, A. Skol, H. Zhang, C. Zeng, H. Zhao, I. Matsuda, Y. Fukushima, D. R. Macer, E. Suda, C. N. Rotimi, C. A. Adebamowo, I. Ajayi, T. Aniagwu, P. A. Marshall, C. Nkwodimmah, C. D M. Royal, M. F. Leppert, M. Dixon, A. Peiffer, R. Qiu, A. Kent, K. Kato, N. Niikawa, I. F. Adewole, B. M. Knoppers, M. W. Foster, E. W. Clayton, J. Watkin, R. A. Gibbs, J. W. Belmont, D. Muzny, L. Nazareth, E. Sodergren, G. M. Weinstock, D. A. Wheeler, I. Yakub, S. B. Gabriel, R. C. Onofrio, D. J. Richter, L. Ziaugra, B. W. Birren, M. J. Daly, D. Altshuler, R. K. Wilson, L. L. Fulton, J. Rogers, J. Burton, N. P. Carter, C. M. Clee, M. Griffiths, M. C. Jones, K. McLay, R. W. Plumb, M. T. Ross, S. K. Sims, D. L. Willey, Z. Chen, H. Han, L. Kang, M. Godbout, J. C. 
Wallenburg, P. L’Archevêque, G. Bellemare, K. Saeki, H. Wang, D. An, H. Fu, Q. Li, Z. Wang, R. Wang, A. L. Holden, L. D. Brooks, J. E. McEwen, M. S. Guyer, V. O. Wang, J. L. Peterson, M. Shi, J. Spiegel, L. M. Sung, L. F. Zacharia, F. S. Collins, K. Kennedy, R. Jamieson, J. Stewart; International HapMap Consortium, A second generation human haplotype map of over 3.1 million SNPs. Nature 449, 851–861 (2007).

29. G. R. Abecasis, D. Altshuler, A. Auton, L. D. Brooks, R. M. Durbin, R. A. Gibbs, M. E. Hurles, G. A. McVean; 1000 Genomes Project Consortium, A map of human genome variation from population-scale sequencing. Nature 467, 1061–1073 (2010).

30. J. K. Pickrell, A. A. Pai, Y. Gilad, J. K. Pritchard, Noisy splicing drives mRNA isoform diversity in human cells. PLOS Genet. 6, e1001236 (2010).

31. N. T. Ingolia, G. A. Brar, S. Rouskin, A. M. McGeachy, J. S. Weissman, The ribosome profiling strategy for monitoring translation in vivo by deep sequencing of ribosome-protected mRNA fragments. Nat. Protoc. 7, 1534–1550 (2012).

32. J. D. Storey, A direct approach to false discovery rates. J. R. Stat. Soc. Series B Stat. Methodol. 64, 479–498 (2002). doi:10.1111/1467-9868.00346

33. J. D. Storey, J. E. Taylor, D. Siegmund, Strong control, conservative point estimation and simultaneous conservative consistency of false discovery rates: A unified approach. J. R. Stat. Soc. Series B Stat. Methodol. 66, 187–205 (2004).

34. S. E. Calvo, D. J. Pagliarini, V. K. Mootha, Upstream open reading frames cause widespread reduction of protein expression and are polymorphic among humans. Proc. Natl. Acad. Sci. U.S.A. 106, 7507–7512 (2009).

35. Y. Wan, K. Qu, Q. C. Zhang, R. A. Flynn, O. Manor, Z. Ouyang, J. Zhang, R. C. Spitale, M. P. Snyder, E. Segal, H. Y. Chang, Landscape and variation of RNA secondary structure across the human transcriptome. Nature 505, 706–709 (2014).

36. D. Karolchik, G. P. Barber, J. Casper, H. Clawson, M. S. Cline, M. Diekhans, T. R. Dreszer, P. A. Fujita, L. Guruvadoo, M. Haeussler, R. A. Harte, S. Heitner, A. S. Hinrichs, K. Learned, B. T. Lee, C. H. Li, B. J. Raney, B. Rhead, K. R. Rosenbloom, C. A. Sloan, M. L. Speir, A. S. Zweig, D. Haussler, R. M. Kuhn, W. J. Kent, The UCSC Genome Browser database: 2014 update. Nucleic Acids Res. 42 (D1), D764–D770 (2014).

37. R. D. Finn, A. Bateman, J. Clements, P. Coggill, R. Y. Eberhardt, S. R. Eddy, A. Heger, K. Hetherington, L. Holm, J. Mistry, E. L. Sonnhammer, J. Tate, M. Punta, Pfam: The protein families database. Nucleic Acids Res. 42 (D1), D222–D230 (2014).

38. P. V. Hornbeck, J. M. Kornhauser, S. Tkachev, B. Zhang, E. Skrzypek, B. Murray, V. Latham, M. Sullivan, PhosphoSitePlus: A comprehensive resource for investigating the structure and function of experimentally determined post-translational modifications in man and mouse. Nucleic Acids Res. 40 (D1), D261–D270 (2012).

39. Z. Dosztányi, V. Csizmok, P. Tompa, I. Simon, IUPred: Web server for the prediction of intrinsically unstructured regions of proteins based on estimated energy content. Bioinformatics 21, 3433–3434 (2005).

40. I. A. Adzhubei, S. Schmidt, L. Peshkin, V. E. Ramensky, A. Gerasimova, P. Bork, A. S. Kondrashov, S. R. Sunyaev, A method and server for predicting damaging missense mutations. Nat. Methods 7, 248–249 (2010).

41. X. Liu, X. Jian, E. Boerwinkle, dbNSFP v2.0: A database of human non-synonymous SNVs and their functional predictions and annotations. Hum. Mutat. 34, E2393–E2402 (2013). doi:10.1002/humu.22376

42. Z. Khan, J. S. Bloom, S. Amini, M. Singh, D. H. Perlman, A. A. Caudy, L. Kruglyak, Quantitative measurement of allele-specific protein expression in a diploid yeast hybrid by LC-MS. Mol. Syst. Biol. 8, 602 (2012). doi:10.1038/msb.2012.34

43. A. A. Pai, C. E. Cain, O. Mizrahi-Man, S. De Leon, N. Lewellen, J. B. Veyrieras, J. F. Degner, D. J. Gaffney, J. K. Pickrell, M. Stephens, J. K. Pritchard, Y. Gilad, The contribution of RNA decay quantitative trait loci to inter-individual variation in steady-state gene expression levels. PLOS Genet. 8, e1003000 (2012).

ACKNOWLEDGMENTS

We thank M. Stephens, T. Flutre, and members of the Pritchard and Gilad labs for helpful discussions. This work was supported by NIH grants GM077959, HG007036, and MH084703. A.B. and J.K.P. were supported by the Howard Hughes Medical Institute. Z.K. was supported by F32HG006972. The proteomics data are available at ProteomeXchange (accession PXD001406). The ribosome profiling data are available at GEO (accession GSE61742). J.K.P. is on the Senior Advisory Board for 23andMe and DNANexus and holds stock in both. A.B. holds stock in Google, Inc.

SUPPLEMENTARY MATERIALS

/cgi/content/full/science.1260793/DC1
Materials and Methods
Figs. S1 to S21
Tables S1 to S3
References (25–43)
Supplementary Data Tables S1 to S4

03 September 2014; accepted 09 December 2014
Published online 18 December 2014
10.1126/science.1260793


Fig. 1. Comparisons of QTLs at three levels of gene regulation. (A) Many QTLs exhibit shared effects across mRNA, ribosome occupancy and protein. This example illustrates a shared QTL for the schlafen family member 5 (SLFN5) gene (24). The upper panels show mean sequence depth (per bp) for mRNA and ribosome occupancy, averaged among individuals with each genotype at the QTL SNP. The lower panel shows median log2 SILAC ratios at each detected peptide, relative to the shared internal standard. (B) Replication rates between independently tested cis-QTLs for each phenotype pair, at FDR=10%. QTLs detected for the phenotype labeled on each row were tested in the phenotype listed for each column, considering only the 4,322 genes quantified in all three phenotypes. (C) On average, eQTLs exhibit attenuated effects on protein abundance but not on ribosome occupancy. We used eQTLs detected by the GEUVADIS study to avoid ascertainment bias, and we polarized the alleles according to the direction of effect in GEUVADIS. The plot shows mean effect sizes and standard errors on the means, measured as expected fold-change per allele copy on a log2 scale.


Fig. 2. Protein-specific and RNA-specific QTLs. (A) An example of a protein-specific QTL, for the apolipoprotein L, 2 (APOL2) gene, detected by both the interaction model and the conditional models, indicating both larger effect (LRT, P = 3.3×10^-6, interaction model; P = 5.1×10^-13, conditional model) in protein than mRNA, and that the effect on protein is not mediated by either mRNA or ribosome occupancy (LRT, P = 2.1×10^-12, conditional model). Plotting details as in Fig. 1A. While the causal variant underlying this pQTL is unknown, several linked variants near the 3′ end of APOL2 are all strongly associated with protein levels, including rs8142325 shown here and missense variant rs7285167 (βg = 0.83, P = 9.8×10^-9; LRT, P = 2.1×10^-5, interaction model; P = 5.5×10^-13, conditional model). (B) Effect sizes for ribosome occupancy tend to track with RNA, not protein. (Top) Effect sizes in all three phenotypes are shown for protein-specific QTLs. Effect sizes were estimated using linear regression in each of the phenotypes independently. The signs of the effects were set to be positive in protein. Solid lines reflect predicted effects based on a linear model. (Bottom) Similarly, effect sizes in all three phenotypes for esQTLs. Here, signs of the effects were set to be positive in RNA.


Table 1. Number of cis-QTLs identified at False Discovery Rate (FDR) 10%.

Phenotype            Genes     Cell lines   QTLs
Protein abundance    4,381     62           278
mRNA expression      16,614    75           2,355

Table 2. Enrichment of genomic annotations among expression and protein-specific QTLs. Enrichments were evaluated by a continuous test using QTL results from the conditional model (see Supplementary Information). Columns (from left to right) describe the annotation being considered, the number of SNPs matching this annotation, the set of SNPs used as background for the corresponding test, the enrichment P values for protein-specific QTLs (psQTLs), and expression-specific QTLs (esQTLs), respectively.

Annotation         Number of SNPs   Background
Exonic             12,568           Intergenic
5′ UTR             6,488            Intergenic
3′ UTR             15,139           Intergenic
Intronic           628,591          Intergenic
RiboSNitch SNPs    414              Exonic

Enrichment P values (exponents missing in this copy): 2.8 × 10, 3.2 × 10, 2.0 × 10, 7.1 × 10, 5.2 × 10, 2.3 × 10, 5.9 × 10, 1.7 × 10, 2.9 × 10*, 2.5 × 10
*Depletion relative to background.
