ADVICE: Automated Detection and Validation of Interaction by Co-Evolution
Soon-Heng Tan*, Zhuo Zhang and See-Kiong Ng
Knowledge Discovery Department, Institute for Infocomm Research, 21 Heng Mui Keng Terrace, Singapore 119613, Singapore
Received February 15, 2004; Revised and Accepted April 30, 2004
ABSTRACT
ADVICE (Automated Detection and Validation of Interaction by Co-Evolution) is a web tool for predicting and validating protein–protein interactions using the observed co-evolution between interacting proteins. Interacting proteins are known to share similar evolutionary histories since they undergo coordinated evolutionary changes to preserve interactions and functionalities. The web tool automates a commonly adopted methodology to quantify the similarities in proteins' evolutionary histories for postulating potential protein–protein interactions. ADVICE can also be used to validate experimental data against spurious protein interactions by identifying those that have few similarities in their evolutionary histories. The web tool accepts a list of protein sequences or sequence pairs as input and retrieves orthologous sequences to compute the similarities in the proteins' evolutionary histories. To facilitate hypothesis generation, detected co-evolved proteins can be visualized as a network at the website. ADVICE is available at 6ce7e00c76c66137ee061912.sg.
INTRODUCTION
Co-evolution is a process whereby two or more species interact and influence genetic changes in one another. The process is also evident at the molecular level, where interacting proteins exhibit coordinated mutations to evolve at a similar rate (1). Mutation, a mechanism of evolution, disrupts protein interactions when residue changes occur within inter-protein contact sites or at regions implicated in the structural integrity of proteins. When a disrupted interaction leads to reduced fitness, the mutated sequence will be selected against and removed by natural selection. However, the mutated sequence will be retained if compensatory mutations that preserve the interaction occur in its interacting partners. As a result, interacting proteins will seem to evolve at the same rate and have similar evolutionary histories. This is a phenomenon that has been well characterized in various receptor–ligand systems (2–4) such as two-component signal transduction (5).
Observed co-evolution between interacting proteins has been used previously to predict protein interaction sites (6) and to improve docking algorithms (7,8). Recently, Goh et al. (9) adopted a statistical method to quantify the similarities in the evolutionary histories of proteins to predict the interactions of chemokines with their receptors based on the high correlation in the distance matrices constructed from multiple sequence alignments. Pazos and Valencia (10) extended the idea to genome-wide prediction of protein–protein interactions in Escherichia coli. The co-evolution approach was later further exploited to successfully pinpoint a family of ligands to its specific receptors (11). In these works, the methodology adopted to detect co-evolved interacting proteins consisted of the following sequential steps: (i) searching and retrieving pairs of orthologous sequences from databases, (ii) constructing distance matrices from the multiple sequence alignments of the retrieved orthologous sequences and (iii) measuring similarities in evolutionary histories of proteins by comparing the distance matrices constructed.
We have implemented ADVICE (Automated Detection and Validation of Interaction by Co-Evolution), a web-based tool that automates the steps needed to compute the similarities between proteins' evolutionary histories. The web tool can aid biologists in postulating potential protein–protein interactions using co-evolution. We also propose to use co-evolution between interacting proteins to rapidly validate experimentally derived protein–protein interactions against artificial interactions. It is possible that non-biological interactions that do not occur in nature may be detected under experimental conditions. However, these artificial interactions will not be subject to natural selection and so will not exhibit co-evolution. As a consequence, ADVICE can be used to identify such spurious experimental interactions by finding interacting pairs that have little or no similarity in their evolutionary histories. ADVICE can be useful for rapidly assessing the quality of large volumes of interaction data from high-throughput detection methods such as yeast two-hybrid (12,13), affinity purification (14,15) and protein chip experiments (16).
*To whom correspondence should be addressed. Tel: +65 6874 6929; Fax: +65 6774 8056; Email: soonheng@6ce7e00c76c66137ee061912.sg
The authors wish it to be known that, in their opinion, the first two authors should be regarded as joint First Authors
The online version of this article has been published under an open access model. Users are entitled to use, reproduce, disseminate, or display the open access version of this article provided that: the original authorship is properly and fully attributed; the Journal and Oxford University Press are attributed as the original place of publication with the correct citation details given; if an article is subsequently reproduced or disseminated not in its entirety but only in part or as a derivative work this must be clearly indicated. © 2004, the authors
Nucleic Acids Research, Vol. 32, Web Server issue © Oxford University Press 2004; all rights reserved
Nucleic Acids Research, 2004, Vol. 32, Web Server issue W69–W72
DOI: 10.1093/nar/gkh471
INPUTS
ADVICE supports both interactive and batch modes of processing. In the interactive mode, a user submits either a pair of protein sequences in raw or FASTA format, or a list of protein sequences, in which case ADVICE automatically generates all possible pairwise combinations of the sequences for processing. When more than one pair of protein sequences is provided as input, ADVICE allows the detected co-evolved protein pairs to be visualized as a network. In the batch mode, the web tool accepts a list of sequence pairs for processing. The computed results will be sent to an email address provided by the user.
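As an illustration of how a list of sequences expands into sequence pairs, the following minimal sketch enumerates every unordered pair. It assumes that the order within a pair does not matter; the function name `all_pairs` and the protein names and residues are hypothetical placeholders, not part of ADVICE.

```python
from itertools import combinations


def all_pairs(sequences):
    """Return every unordered pair of names from a name -> sequence mapping."""
    # Sorting the names makes the pair order reproducible.
    return list(combinations(sorted(sequences), 2))


# Hypothetical input: three named protein sequences (placeholder residues).
proteins = {"protA": "MKTAYIAK", "protB": "MSLLTEVE", "protC": "MGDVEKGK"}
pairs = all_pairs(proteins)
print(pairs)
# [('protA', 'protB'), ('protA', 'protC'), ('protB', 'protC')]
```

A list of n sequences yields n(n-1)/2 pairs, so the number of pairs to process grows quadratically with the length of the input list.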
METHODOLOGY
Identifying orthologous sequences
The pair of sequences submitted by the user is used to search sequence databases for orthologous sequences based on sequence similarities. Identified orthologous sequences will be used to compute each input protein's evolutionary history. ADVICE allows users the option to search for orthologous sequences either from one of the four kingdoms of life (Eukaryota, Prokaryota, Archaebacteria and Viridae) or from the Swiss-Prot (release 42.9) and/or TrEMBL (release 25.9) databases (17). BLAST v2.2.4 (18) is used to search these databases and the user can control the sensitivity of the search by setting an E-value threshold for the BLAST hits.
Constructing distance matrices
To detect co-evolved proteins from their evolutionary histories, we use only pairs of orthologous sequences occurring together in the same species for constructing the distance matrices. By default, ADVICE uses sequence pairs from the top 10 species (based on highest average E-value of the BLAST hits) to construct the respective distance matrices from multiple sequence alignments, excluding those species where more than one orthologous sequence of the input sequences is found (since it would be difficult to determine which is the actual ortholog). In the interactive mode, the user can manually inspect annotations of the sequences and remove/add orthologous sequence pairs (Figure 1). ClustalW v1.84 (19) is used to construct the two distance matrices from respective multiple sequence alignments of the pairs of orthologous sequences.
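The selection rules described above (an E-value threshold, one ortholog pair per species, species with ambiguous orthologs excluded, and a cap on the number of species) can be sketched as follows. This is not the ADVICE implementation: the hit format `(species, sequence_id, e_value)`, the function name `select_species` and the default threshold are hypothetical, and the sketch assumes that a lower (more significant) average E-value ranks a species higher, which is one interpretation of the ranking criterion and may differ from the tool's behavior.

```python
from collections import defaultdict


def select_species(hits_a, hits_b, e_threshold=1e-5, top_n=10):
    """Pick at most top_n species that hold exactly one ortholog of each input.

    hits_a and hits_b are lists of (species, sequence_id, e_value) tuples,
    one list per input protein (a hypothetical, simplified hit format).
    """
    def by_species(hits):
        grouped = defaultdict(list)
        for species, seq_id, e_value in hits:
            if e_value <= e_threshold:  # keep only hits passing the threshold
                grouped[species].append((seq_id, e_value))
        return grouped

    group_a, group_b = by_species(hits_a), by_species(hits_b)
    candidates = []
    for species in group_a.keys() & group_b.keys():
        # Exclude species where the true ortholog would be ambiguous.
        if len(group_a[species]) != 1 or len(group_b[species]) != 1:
            continue
        (id_a, e_a), (id_b, e_b) = group_a[species][0], group_b[species][0]
        candidates.append(((e_a + e_b) / 2, species, id_a, id_b))
    candidates.sort()  # assumption: lower average E-value ranks first
    return [(species, id_a, id_b) for _, species, id_a, id_b in candidates[:top_n]]


# Hypothetical hits for input proteins A and B.
hits_a = [("human", "A_hs", 1e-50), ("mouse", "A_mm", 1e-40),
          ("yeast", "A_sc1", 1e-20), ("yeast", "A_sc2", 1e-18),
          ("fly", "A_dm", 0.5)]
hits_b = [("human", "B_hs", 1e-30), ("mouse", "B_mm", 1e-45),
          ("yeast", "B_sc", 1e-25), ("fly", "B_dm", 1e-10)]
selected = select_species(hits_a, hits_b)
print(selected)
# [('mouse', 'A_mm', 'B_mm'), ('human', 'A_hs', 'B_hs')]
```

In this example, yeast is dropped because protein A has two candidate orthologs there, and fly is dropped because the hit for protein A fails the E-value threshold.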
Figure 1. Pairs of orthologous sequences identified in different species using protein sequences input by users (sequence A and sequence B). Users can select the desired set of orthologous pairs to compute the similarity in the proteins' evolutionary histories.
Measuring similarities in evolutionary distances
The correlation coefficient ($r$) between two distance matrices is computed using Pearson's correlation coefficient equation:

$$
r = \frac{\sum_{i=1}^{N-1}\sum_{j=i+1}^{N}\left(X_{ij}-\bar{X}\right)\left(Y_{ij}-\bar{Y}\right)}{\sqrt{\sum_{i=1}^{N-1}\sum_{j=i+1}^{N}\left(X_{ij}-\bar{X}\right)^{2}}\;\sqrt{\sum_{i=1}^{N-1}\sum_{j=i+1}^{N}\left(Y_{ij}-\bar{Y}\right)^{2}}},
$$

where $X$ and $Y$ are two $N \times N$ distance matrices and $N$ is equal to the number of orthologous sequence pairs retrieved (here, $N$ is equal to the number of species as we allow only one sequence pair per species). $X_{ij}$ refers to the pairwise distance between sequences $x_i$ and $x_j$ from species $S_i$ and $S_j$, respectively. Similarly, $Y_{ij}$ refers to the pairwise distance between sequences $y_i$ and $y_j$ from species $S_i$ and $S_j$, respectively. This statistical approach is the same method used by Goh et al. (9) to quantify the correlation between two distance matrices for measuring the similarities in proteins' evolutionary histories.
OUTPUT
ADVICE outputs the computed correlation coefficient (r), ranging from −1 to 1, on the web page for each pair of input sequences. The distance matrices used to compute the correlation coefficient are also presented on the web page.
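For readers who wish to reproduce the coefficient from the two distance matrices shown on the web page, a minimal sketch is given here. It computes Pearson's correlation over the upper-triangle entries (i < j) of two symmetric matrices, assuming that the means are taken over those same entries; the function names and the example distances are hypothetical.

```python
import math


def upper_triangle(matrix):
    """Return the entries above the diagonal (i < j) of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n - 1) for j in range(i + 1, n)]


def matrix_correlation(x, y):
    """Pearson correlation between the upper triangles of two N x N matrices."""
    if len(x) != len(y):
        raise ValueError("matrices must have the same size")
    xs, ys = upper_triangle(x), upper_triangle(y)
    if len(xs) < 2:
        raise ValueError("at least three species are needed")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    denominator = (math.sqrt(sum((a - mean_x) ** 2 for a in xs))
                   * math.sqrt(sum((b - mean_y) ** 2 for b in ys)))
    if denominator == 0:
        raise ValueError("correlation is undefined for constant distances")
    return numerator / denominator


# Hypothetical distances among orthologs from three species.
X = [[0.0, 0.1, 0.4],
     [0.1, 0.0, 0.5],
     [0.4, 0.5, 0.0]]
Y = [[0.0, 0.2, 0.8],   # every distance is twice that in X
     [0.2, 0.0, 1.0],
     [0.8, 1.0, 0.0]]
Z = [[0.0, 0.5, 0.2],   # every distance is 0.6 minus that in X
     [0.5, 0.0, 0.1],
     [0.2, 0.1, 0.0]]
r_xy = matrix_correlation(X, Y)  # close to 1: identical relative distances
r_xz = matrix_correlation(X, Z)  # close to -1: reversed relative distances
```

Only the upper triangle is used because the matrices are symmetric with a zero diagonal, so the remaining entries carry no additional information.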
In batch processing, the output data will be sent to an email address provided by the user.
When more than one pair of proteins is provided as input, in addition to computing the correlation coefficient score between proteins' evolutionary histories, ADVICE also provides the facility to visualize the computed co-evolved associations between proteins as a non-directional weighted graphical network (Figure 2). Each node on the network corresponds to an input protein. The edge thickness between proteins corresponds to the computed correlation coefficient: the thickness of the edges increases linearly with the coefficient score. In this way, users can identify highly co-evolved protein pairs. Users can also filter out edges by specifying a correlation coefficient threshold. All these facilities provide users with a global view of the detected associations between proteins.
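The network construction described above can be sketched as a small undirected weighted graph. This is an illustrative sketch only: the function name `build_network`, the width range and the linear mapping from the interval [−1, 1] to an edge width are assumptions, since the exact scaling used by the website is not stated.

```python
def build_network(scores, threshold=-1.0, min_width=1.0, max_width=6.0):
    """Build nodes and weighted edges from pairwise correlation coefficients.

    scores maps an unordered protein pair (a, b) to its coefficient r.
    Edges with r below the threshold are filtered out; every input protein
    remains a node even if all of its edges are removed.
    """
    nodes = sorted({protein for pair in scores for protein in pair})
    edges = []
    for (a, b), r in sorted(scores.items()):
        if r < threshold:
            continue
        # Assumed linear mapping of r in [-1, 1] onto [min_width, max_width].
        width = min_width + (max_width - min_width) * (r + 1.0) / 2.0
        edges.append((a, b, r, width))
    return nodes, edges


# Hypothetical coefficients for three input proteins.
scores = {("P1", "P2"): 0.9, ("P1", "P3"): 0.2, ("P2", "P3"): 0.85}
nodes, edges = build_network(scores, threshold=0.8)
for a, b, r, width in edges:
    print(f"{a} -- {b}  r={r:.2f}  width={width:.2f}")
```

With a threshold of 0.8, the edge between P1 and P3 is removed while all three proteins stay in the node list, which mirrors the filtering facility described in the text.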
INTERPRETATION
Figure 2. Detected co-evolved proteins visualized as a protein network. The edge thickness increases linearly with the computed correlation coefficient. Users can specify the coefficient cut-off value for the construction of the network.
The computed correlation coefficient ranges from −1 to 1. A correlation coefficient of 1 corresponds to 100% correlation, or similarity, in the input proteins' evolutionary histories, while a score of −1 implies 100% anti-correlation. A coefficient of 0 means that there is no correlation. Goh et al. and Pazos et al., in their separate works, have determined a lower coefficient limit of 0.8 to be a good indicator of interacting proteins; users can therefore use this value to identify potential interacting proteins. To assess the sensitivity of this particular threshold, we have also computed the correlation coefficient for 111 yeast protein–protein interactions (20) (supplementary data), which represent a confident set of true interactions as they have been detected by multiple methods. Figure 3 shows the distribution of computed coefficients. The result indicates that the user can detect ~45% of these high-confidence interactions using a cut-off value of 0.8. In addition, we also tested ADVICE on a set of 63 putative non-interacting yeast protein pairs where one protein is localized in the nuclear membrane while the other is localized in the mitochondrial inner membrane. Of these protein pairs, <5% were found to have correlation coefficients >0.8. For a suitable upper bound for detecting spurious interactions, we have observed that ~23% of these false interactions have coefficients <0.3. For the high-confidence interactions, only 2.7% have correlation coefficients <0.3. Thus, for the purpose of validating experimental interactions, users can adopt a cut-off value of ~0.3 to detect potential spurious interactions. The use of a higher cut-off will need to be treated prudently or done in conjunction with other validation methods, such as gene expression, for best results.
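The two cut-off values discussed above can be applied as a simple three-way rule. The sketch below is only one possible reading of the text: the function name, the label strings and the treatment of the range between the two cut-offs as inconclusive are assumptions, not output produced by ADVICE.

```python
def classify_pair(r, interacting_cutoff=0.8, spurious_cutoff=0.3):
    """Label a protein pair from its correlation coefficient r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    if r >= interacting_cutoff:
        return "potentially interacting"
    if r < spurious_cutoff:
        return "potentially spurious"
    return "inconclusive"  # assumed label for the range between the cut-offs


# Hypothetical coefficients for three experimentally reported pairs.
for name, r in [("pair1", 0.91), ("pair2", 0.55), ("pair3", 0.12)]:
    print(name, classify_pair(r))
# pair1 potentially interacting
# pair2 inconclusive
# pair3 potentially spurious
```

Because only about 45% of the high-confidence interactions exceed 0.8 and only about 23% of the putative non-interacting pairs fall below 0.3, most pairs will land in the middle range, where other evidence is needed.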
SUPPLEMENTARY MATERIAL
Supplementary Material is available at NAR Online.
ACKNOWLEDGEMENTS
We thank Suisheng Tang and Han Hao for proofreading the manuscript.
REFERENCES
1. Fraser,H.B., Hirsh,A.E., Steinmetz,L.M., Scharfe,C. and Feldman,M.W. (2002) Evolutionary rate in the protein interaction network. Science, 296, 750–752.
2. Moyle,W.R., Campbell,R.K., Myers,R.V., Bernard,M.P., Han,Y. and Wang,X. (1994) Co-evolution of ligand–receptor pairs. Nature, 368, 251–255.
3. van Kesteren,R.E., Tensen,C.P., Smit,A.B., van Minnen,J., Kolakowski,L.F., Meyerhof,W., Richter,D., van Heerikhuizen,H., Vreugdenhil,E. and Geraerts,W.P. (1996) Co-evolution of ligand–receptor pairs in the vasopressin/oxytocin superfamily of bioactive peptides. J. Biol. Chem., 271, 3619–3626.
4. Hughes,A.L. and Yeager,M. (1999) Coevolution of the mammalian chemokines and their receptors. Immunogenetics, 49, 115–124.
5. Koretke,K.K., Lupas,A.N., Warren,P.V., Rosenberg,M. and Brown,J.R. (2000) Evolution of two-component signal transduction. Mol. Biol. Evol., 17, 1956–1970.
6. Pazos,F., Helmer-Citterich,M., Ausiello,G. and Valencia,A. (1997) Correlated mutations contain information about protein–protein interaction. J. Mol. Biol., 271, 511–523.
7. Jespers,L., Lijnen,H.R., Vanwetswinkel,S., Van Hoef,B., Brepoels,K., Collen,D. and De Maeyer,M. (1999) Guiding a docking mode by phage display: selection of correlated mutations at the staphylokinase-plasmin interface. J. Mol. Biol., 290, 471–479.
8. Jucovic,M. and Hartley,R.W. (1996) Protein–protein interaction: a genetic selection for compensating mutations at the barnase–barstar interface. Proc. Natl Acad. Sci. USA, 93, 2343–2347.
9. Goh,C.S., Bogan,A.A., Joachimiak,M., Walther,D. and Cohen,F.E. (2000) Co-evolution of proteins with their interaction partners. J. Mol. Biol., 299, 283–293.
10. Pazos,F. and Valencia,A. (2001) Similarity of phylogenetic trees as indicator of protein–protein interaction. Protein Eng., 14, 609–614.
11. Ramani,A.K. and Marcotte,E.M. (2003) Exploiting the co-evolution of interacting proteins to discover interaction specificity. J. Mol. Biol., 327, 273–284.
12. Giot,L., Bader,J.S., Brouwer,C., Chaudhuri,A., Kuang,B., Li,Y., Hao,Y.L., Ooi,C.E., Godwin,B., Vitols,E. et al. (2003) A protein interaction map of Drosophila melanogaster. Science, 302, 1727–1736.
13. Li,S., Armstrong,C.M., Bertin,N., Ge,H., Milstein,S., Boxem,M., Vidalain,P.O., Han,J.D., Chesneau,A., Hao,T. et al. (2004) A map of the interactome network of the metazoan C. elegans. Science, 303, 540–543.
14. Ho,Y., Gruhler,A., Heilbut,A., Bader,G.D., Moore,L., Adams,S.L., Millar,A., Taylor,P., Bennett,K., Boutilier,K. et al. (2002) Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry. Nature, 415, 180–183.
15. Gavin,A.C., Bosche,M., Krause,R., Grandi,P., Marzioch,M., Bauer,A., Schultz,J., Rick,J.M., Michon,A.M., Cruciat,C.M. et al. (2002) Functional organization of the yeast proteome by systematic analysis of protein complexes. Nature, 415, 141–147.
16. Zhu,H., Bilgin,M., Bangham,R., Hall,D., Casamayor,A., Bertone,P., Lan,N., Jansen,R., Bidlingmaier,S., Houfek,T. et al. (2001) Global analysis of protein activities using proteome chips. Science, 293, 2101–2105.
17. Boeckmann,B., Bairoch,A., Apweiler,R., Blatter,M.C., Estreicher,A., Gasteiger,E., Martin,M.J., Michoud,K., O'Donovan,C., Phan,I. et al. (2003) The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003. Nucleic Acids Res., 31, 365–370.
18. Altschul,S.F., Gish,W., Miller,W., Myers,E.W. and Lipman,D.J. (1990) Basic local alignment search tool. J. Mol. Biol., 215, 403–410.
19. Thompson,J.D., Higgins,D.G. and Gibson,T.J. (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res., 22, 4673–4680.
20. von Mering,C., Krause,R., Snel,B., Cornell,M., Oliver,S.G., Fields,S. and Bork,P. (2002) Comparative assessment of large-scale data sets of protein–protein interactions. Nature, 417, 399–403.
Figure 3. Distribution of computed correlation coefficients between high-confidence interacting proteins and putative non-interacting protein pairs in yeast.