Bioinformatics-2016-homework



“Bioinformatics” homework for undergraduates (2016)

#1

How many nucleotide sequences from maize (Zea mays) have been stored in the public DNA databases (such as GenBank)? How many Waxy (granule-bound starch synthase) gene sequences from maize are in the database?

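Answer: On Sunday, November 6, 2016, I accessed NCBI (https://www.ncbi.nlm.nih.gov/) and searched the Nucleotide database for "zea mays", with the Species filter set to Plants and the Molecule types filter set to genomic DNA/RNA. The search reported 446964 Zea mays nucleotide sequences stored in NCBI. Entering (zea mays[Organism]) AND waxy[Gene name] in the search box then returned 175 Waxy gene sequences from maize. The steps and results are shown in the accompanying screenshots.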
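The same kind of count can also be requested programmatically. The sketch below only builds a request URL for the NCBI E-utilities ESearch endpoint and does not contact the network. The helper name `count_url` and the use of the `nuccore` database are illustrative assumptions, the Species and Molecule type filters chosen in the web interface are not reproduced, and the counts returned today will differ from the 2016 figures above.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities ESearch endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def count_url(term, db="nuccore"):
    """Build an ESearch URL that asks only for the number of matching records.

    `count_url` is a hypothetical helper written for this example.
    """
    params = {"db": db, "term": term, "rettype": "count"}
    return ESEARCH + "?" + urlencode(params)


# The two queries used in the answer above.
all_maize = count_url("zea mays[Organism]")
maize_waxy = count_url("(zea mays[Organism]) AND waxy[Gene name]")
```

Opening either URL in a browser returns a small XML document whose Count element holds the number of hits.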

#2

A sequence was generated by a suppression subtractive hybridization (SSH) experiment. Please find the best hit(s) of the unknown sequence in the public database and predict its potential function.

>an unknown sequence

CCTCGGAGATCTTCATGGGGGGCAAGAGCACCATCGTGCTgCACAACACCTGCGAGGACTCGCTCCTCGCTGCACCCATCATTCTTGATCTGGTGCTCCTGGCGGAGCTCAGCACCAGGATTCAGCTGAAGGCCGAGGGAGAGGTAAGAGTCTGACGAGATATGTTGCTAGTCTACTCTGTAGTCGAGATATACTTTGGGAGCCAAACTGAAGATTTCGCTGCTCCACTTGCATTTGTGCAGGACAAGTTCCATTCCTTCCATCCGGTTGCCACCATCCTGAGCTACCTCACCAAGGCACCCCTGGTAAGAAACAATTCTCGACTGTTTGCTCTAAATAACCTATAGATAAATAAAGACGATTAACTGACGTGCCACTGAATTCCTCTGTTAACAGGTTCCTCCTGGCACGCCGGTGGTGAACGCCCTGGCGAAGCAAAGGGCGATGCTGGAGAACATCATGAGGGCGTGTGTCGGCCTGGCGCCCGAAAACAACATGATCCTGGAGTACAAGTGAGGAGCGTGGCCCAAGCTCGCGGAGCCGAGAGCGACCGTACGTACGTAGCAAGTGGCGAGGGGCGACGGGAGGGCAGGACGAAGAAGAAGGCGAGATCGGCTGTGGAATTATTTGGCGGCTTGTCTTTAGTTTCCTTTGCGAATCTTTCCCTGGTTAAGTTTACCCCAGTGAGTGTGTGTCCTTGCGAGAAAAG

Answer: Run BLAST at NCBI (http://blast.ncbi.nlm.nih.gov/Blast.cgi): choose Blastx, paste the sequence above into the query box, keep the default parameters, and start the search. The best hit is Inositol-3-phosphate synthase [Dichanthelium oligosanthes]. Repeating the search at EMBL-EBI (http://www.ebi.ac.uk/Tools/sss/ncbiblast/), again with Blastx and default parameters, also gives Inositol-3-phosphate synthase as the best hit. Based on both results, the unknown sequence probably has the same function as Inositol-3-phosphate synthase.
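Blastx compares a nucleotide query against a protein database by first translating the query in all six reading frames. The sketch below illustrates only that translation step with the standard genetic code; it is not the BLAST implementation, and the function names `translate` and `six_frames` are chosen here for illustration.

```python
# Standard genetic code, written in the conventional TCAG codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frames(dna):
    """Return the three forward and three reverse-complement translations."""
    forward = dna.upper()
    reverse = forward.translate(COMPLEMENT)[::-1]
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(forward[offset:])
        frames[f"-{offset + 1}"] = translate(reverse[offset:])
    return frames
```

Calling `six_frames` on the unknown sequence gives the six candidate protein strings that a translated search would then compare against protein entries.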

#3

Use dynamic programming method, the Needleman-Wunsch algorithm, to perform global alignment of the sequences:

P1 = HEAGAWGHEP
P2 = EPAWHEAEAG

Scoring system: BLOSUM50 scoring matrix with gap penalty 8.

BLOSUM50 (partial)

      A   E   G   H   P   W
A     5  -1   0  -2  -1  -3
E         6  -3   0  -1  -3
G             8  -2  -2  -3
H                10  -2  -3
P                    10  -4
W                        15

Answer: The complete dynamic programming matrix is given below. Rows correspond to P1 and columns to P2; in the original figure, the cells highlighted in yellow mark each step on the path to the optimal alignment.

           E    P    A    W    H    E    A    E    A    G
      0   -8  -16  -24  -32  -40  -48  -56  -64  -72  -80
H    -8    0   -8  -16  -24  -22  -30  -38  -46  -54  -62
E   -16   -2   -1   -9  -17  -24  -16  -24  -32  -40  -48
A   -24  -10   -3    4   -4  -12  -20  -11  -19  -27  -35
G   -32  -18  -11   -3    1   -6  -14  -19  -14  -19  -19
A   -40  -26  -19   -6   -6   -1   -7   -9  -17   -9  -17
W   -48  -34  -27  -14    9    1   -4  -10  -12  -17  -12
G   -56  -42  -35  -22    1    7   -1   -4  -12  -12   -9
H   -64  -50  -43  -30   -7   11    7   -1   -4  -12  -14
E   -72  -58  -51  -38  -15    3   17    9    5   -3  -11
P   -80  -66  -48  -46  -23   -5    9   16    8    4   -4

The resulting optimal alignment is shown below, where an underscore denotes a gap.

P1: HEAGAWGH_EP_
P2: _EP_AWHEAEAG

Score: -8 + 6 - 1 - 8 + 5 + 15 - 2 + 0 - 8 + 6 - 1 - 8 = -4
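The matrix fill and traceback can be reproduced with a short script. This is a minimal sketch that assumes the partial BLOSUM50 values and the linear gap penalty of 8 given in the question. The tie-breaking order in the traceback (diagonal, then up, then left) is a choice made here, so other equally scoring alignments may exist.

```python
GAP = 8
ORDER = "AEGHPW"
# Upper triangle of the partial BLOSUM50 matrix from the question.
UPPER = [
    [5, -1, 0, -2, -1, -3],
    [6, -3, 0, -1, -3],
    [8, -2, -2, -3],
    [10, -2, -3],
    [10, -4],
    [15],
]
SCORE = {}
for i, row in enumerate(UPPER):
    for k, value in enumerate(row):
        a, b = ORDER[i], ORDER[i + k]
        SCORE[a, b] = SCORE[b, a] = value  # the matrix is symmetric


def needleman_wunsch(p1, p2, gap=GAP):
    """Return (matrix, aligned p1, aligned p2) for a global alignment."""
    n, m = len(p1), len(p2)
    F = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        F[i][0] = -gap * i
    for j in range(1, m + 1):
        F[0][j] = -gap * j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            F[i][j] = max(
                F[i - 1][j - 1] + SCORE[p1[i - 1], p2[j - 1]],  # (mis)match
                F[i - 1][j] - gap,                              # gap in p2
                F[i][j - 1] - gap,                              # gap in p1
            )
    # Traceback from the bottom-right corner; '_' marks a gap.
    a1, a2 = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and F[i][j] == F[i - 1][j - 1] + SCORE[p1[i - 1], p2[j - 1]]:
            a1.append(p1[i - 1]); a2.append(p2[j - 1]); i -= 1; j -= 1
        elif i > 0 and F[i][j] == F[i - 1][j] - gap:
            a1.append(p1[i - 1]); a2.append("_"); i -= 1
        else:
            a1.append("_"); a2.append(p2[j - 1]); j -= 1
    return F, "".join(reversed(a1)), "".join(reversed(a2))


F, aligned1, aligned2 = needleman_wunsch("HEAGAWGHEP", "EPAWHEAEAG")
```

The bottom-right cell `F[-1][-1]` holds the optimal global score, and `aligned1` and `aligned2` hold one optimal alignment.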

#4

Please find genes in a genomic segment of bamboo (Download).

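Answer: Open http://www.softberry.com/berry.phtml?topic=fgenesh&group=programs&subgroup=gfind , set Organism to Monocot plants (there is no bamboo-specific option), and run the online program. The output, reproduced below, lists the predicted genes together with their mRNA sequences and the protein sequences they encode.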

FGENESH 2.6 Prediction of potential genes in Monocot genomic DNA
Time : Sun Nov 6 04:05:56 2016
Seq name: test sequence
Length of sequence: 49600
Number of predicted genes 10: in +chain 2, in -chain 8.
Number of predicted exons 22: in +chain 10, in -chain 12.
Positions of predicted genes and exons: Variant 1 from 1, Score:299.051538

 G Str   Feature  Start     End   Score       ORF         Len

 1 -     PolA   6777              0.44
 1 -   1 CDSo   6884 -  7057    4.33   6884 -  7057   174
 1 -     TSS    8311             -1.78

 2 +     TSS   17591             -4.18
 2 +   1 CDSf  17742 - 17811   16.41  17742 - 17810    69
 2 +   2 CDSl  19834 - 20792   87.86  19836 - 20792   957
 2 +     PolA  21649              0.44

 3 +     TSS   21801             -7.58
 3 +   1 CDSf  22009 - 22085   19.33  22009 - 22083    75
 3 +   2 CDSi  22583 - 22652    3.65  22584 - 22652    69
 3 +   3 CDSi  23070 - 23145    6.56  23070 - 23144    75
 3 +   4 CDSi  23236 - 23353   18.37  23238 - 23351   114
 3 +   5 CDSi  24144 - 24233    8.37  24145 - 24231    87
 3 +   6 CDSi  24306 - 24381    6.47  24307 - 24381    75
 3 +   7 CDSi  24523 - 24650    5.26  24523 - 24648   126
 3 +   8 CDSl  24731 - 24800    8.36  24732 - 24800    69
 3 +     PolA  25006              0.44

 4 -     PolA  26777             -1.06
 4 -   1 CDSl  27135 - 28019   59.89  27135 - 28019   885
 4 -   2 CDSf  28097 - 28504   40.08  28097 - 28504   408
 4 -     TSS   28623             -6.38

 5 -     PolA  30964              0.44
 5 -   1 CDSl  30993 - 31177    8.88  30993 - 31175   183
 5 -   2 CDSi  31212 - 31431   -7.39  31213 - 31431   219
 5 -   3 CDSf  31504 - 31548    8.15  31504 - 31548    45
 5 -     TSS   31608             -1.28

 6 -     PolA  33364              0.44
 6 -   1 CDSo  33766 - 33954    4.42  33766 - 33954   189
 6 -     TSS   34021             -3.18

 7 -     PolA  34094             -1.06
 7 -   1 CDSo  34700 - 34975   17.27  34700 - 34975   276
 7 -     TSS   35444             -6.08

 8 -     PolA  35848              0.44
 8 -   1 CDSo  36075 - 36458   20.75  36075 - 36458   384
 8 -     TSS   37019             -5.38

 9 -     PolA  40341             -1.06
 9 -   1 CDSo  40879 - 41067    9.51  40879 - 41067   189
 9 -     TSS   41777             -5.68

10 -     PolA  43349             -1.06
10 -   1 CDSl  44280 - 44545    7.17  44280 - 44543   264
10 -   2 CDSf  46131 - 46686   33.96  46132 - 46686   555
10 -     TSS   46908             -8.18
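The gene and exon counts in such a table can be cross-checked with a few lines of code. The sketch below parses the rows for genes 1 and 2 only, copied from the output above; the regular expression reflects the layout seen in this particular listing and is an assumption rather than a documented FGENESH format specification.

```python
import re
from collections import Counter, defaultdict

# Rows for genes 1 and 2, copied from the table above.
SAMPLE = """\
1 - PolA 6777 0.44
1 - 1 CDSo 6884 - 7057 4.33 6884 - 7057 174
1 - TSS 8311 -1.78
2 + TSS 17591 -4.18
2 + 1 CDSf 17742 - 17811 16.41 17742 - 17810 69
2 + 2 CDSl 19834 - 20792 87.86 19836 - 20792 957
2 + PolA 21649 0.44
"""

# gene, strand, exon number, CDS type, start - end, score
CDS_ROW = re.compile(
    r"^\s*(\d+)\s+([+-])\s+(\d+)\s+(CDS[ofil])\s+(\d+)\s+-\s+(\d+)\s+(-?\d+\.\d+)"
)


def summarize(text):
    """Count CDS exons per strand and total coding length per gene."""
    exons_per_strand = Counter()
    cds_length = defaultdict(int)
    for line in text.splitlines():
        match = CDS_ROW.match(line)
        if match is None:
            continue  # PolA and TSS rows carry no exon coordinates
        gene, strand, _, _, start, end, _ = match.groups()
        exons_per_strand[strand] += 1
        cds_length[int(gene)] += int(end) - int(start) + 1
    return exons_per_strand, dict(cds_length)


exons, lengths = summarize(SAMPLE)
```

For these two genes the summed exon lengths agree with the mRNA lengths stated in the FGENESH sequence headers (174 bp and 1029 bp).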

Predicted protein(s):

>FGENESH:[mRNA] 1 1 exon (s) 6884 - 7057 174 bp, chain - ATGGGGGTGAATATGAAGGGTAAGCAGCACATGCCGCGGCCATGTGCGTCGGTGGTTCAC TGGTTCAGTTTCCACGTCCACGAGTGGCCTCGCACTGTCGATAGCGATCGAATGAACGTT CTTTGCTGCTGCACGGCGGGAGCTTCTCCGGAACAGTCAGGGCTGATTGGTTAG

>FGENESH: 1 1 exon (s) 6884 - 7057 57 aa, chain - MGVNMKGKQHMPRPCASVVHWFSFHVHEWPRTVDSDRMNVLCCCTAGASPEQSGLIG

>FGENESH:[mRNA] 2 2 exon (s) 17742 - 20792 1029 bp, chain + ATGCGCCGGGTAGCGCTGTTGCTGCTGCTCGTCTGCGCGGCGGCGCGCGCCGCCGCGGTC GTCACCGACGGGCTTCTTCCGAACGGCAACTTCGAGGATGGCCCGCCCAAGTCGGCGCTG GTGAACGGCACTGTGGTGTCGGGCGCCAACGCCATCCCTAGCTGGGAGACCTCCGGCTTC GTGGAGTACATCGAGTCGGGGCACAAGCAGGGCGACATGCTCCTGGTGGTGCCCCAGGGC GCCCACGCCGTGCGCCTGGGCAACGAGGCCTCCATCCGGCAGCGCCTCTCCGTCACCCGG GGCGCCTACTACTCCATCACCTTCAGCGCGGCGCGCACCTGCGCGCAGGCCGAGCGCCTC AACGTCTCCGTGTCCCCCGAGTGGGGCGTCCTCCCGATGCAGACCATCTACGGCAGCAAC GGGTGGGACTCGTACGCCTGGGCCTTCAAGGCCAAGCTGGACACGGTGACGCTCGTCCTC CACAACCCCGGCGTCGAGGAGGACCCGGCCTGCGGCCCGCTCATCGACGGCGTCGCCATC CGGGCCCTGTACCCGCCCACGCTGGCCCGCGGCGGCAACATGCTCAAGAACGGCGGCTTC GAGGAGGGGCCCTACTTTTTACCCAACGCGTCGTGGGGCGTGCTCGTGCCGCCCAACATC GAGGACGACCACTCCCCGCTCCCGGCCTGGATGATCGTGTCCTCCAAGGCCGTCAAGTAC GTGGACGCCGCGCACTTTAAGGTCCCCAGGGCGCGGCGCGCCGTGGAGCCTGGTGGCCCC GGGGAGGGAAGCGGCTGGTGCAGGAGGTGGCGCCACCGTGCGGTGGAGCTACCACCCTGG CCTTCGCCGTGGGGGACGCCGCCGACGGGTGCGAGGGGTCGCATGGTGGGGCCGAGGCGT ACACCGGCGCGGCCCACCCGTGAAGGTGGGCCGTACGAGTCCCAAGGGGACGGGAACTTC CTTTTTTCTTCTTCACGGCCATCGCCAGCCGCACCCGGGTCGTGTTCCAGAGCACCTTCT ACCACATGA

>FGENESH: 2 2 exon (s) 17742 - 20792 342 aa, chain + MRRVALLLLLVCAAARAAAVVTDGLLPNGNFEDGPPKSALVNGTVVSGANAIPSWETSGF VEYIESGHKQGDMLLVVPQGAHAVRLGNEASIRQRLSVTRGAYYSITFSAARTCAQAERL NVSVSPEWGVLPMQTIYGSNGWDSYAWAFKAKLDTVTLVLHNPGVEEDPACGPLIDGVAI RALYPPTLARGGNMLKNGGFEEGPYFLPNASWGVLVPPNIEDDHSPLPAWMIVSSKAVKY VDAAHFKVPRARRAVEPGGPGEGSGWCRRWRHRAVELPPWPSPWGTPPTGARGRMVGPRR TPARPTREGGPYESQGDGNFLFSSSRPSPAAPGSCSRAPSTT

>FGENESH:[mRNA] 3 8 exon (s) 22009 - 24800 705 bp, chain + ATGCGGCTGCTCCTGCTCCTCCTCGCCGGCGCCGCCGCCCGCGCCTCCGACGACCCCTTC CTCTCCGGCGGACGGCGGCGCTCCCCAATCAGCAGACGGTGGACTACCCCAGCTTCAAGC TCGTCATCGTCGGCGATGGTGGCACAGTCGTCTCTGCATCTTGTAGGCAAAACCACCTTT GTGAAGAGGCATCTGACTGGTGAGTTTGAGAAGAAGTATGAACCCACCATTGGTGTTGAG GTTCATCCCCTGGACTTCTACACCAACCGCGGGAAGATCCGGTTCTACTGCTGGGACACT GCAGGGCAGGAGAAGTTTGGTGGGCTCAGGGATGGATACTACGTCCATGGACAGTGTGGG

ATCATTATGTTTGATGTAACCTCACGGCTGAGTTACAAGAATGTTCCAACTTGGCACCGT GATTTATCCAGGGTCTGTGACAACATCCCAATTGTGCTTTGTGGGAACAAGGTCGACGTG AAGAACAGGCAGGTCAAGGCAAAGCAGCAACCTATTTATTGGACGTGGGTAAACCAACCC CTTTTTTGTTGTGACAGTGATGCCAATCTCCACTTTGTTGAAAGCCCTGCTCTCGTTCCT CCAGATGTCACAATTGACATGGTCGCCCAGCAGCAGCATGAAGCTGAGCTGTTAATCGCT GTAGCCCAACCACTGCCTGATGATGACGATGACCTCATCGAGTAG

>FGENESH: 3 8 exon (s) 22009 - 24800 234 aa, chain + MRLLLLLLAGAAARASDDPFLSGGRRRSPISRRWTTPASSSSSSAMVAQSSLHLVGKTTF VKRHLTGEFEKKYEPTIGVEVHPLDFYTNRGKIRFYCWDTAGQEKFGGLRDGYYVHGQCG IIMFDVTSRLSYKNVPTWHRDLSRVCDNIPIVLCGNKVDVKNRQVKAKQQPIYWTWVNQP LFCCDSDANLHFVESPALVPPDVTIDMVAQQQHEAELLIAVAQPLPDDDDDLIE

>FGENESH:[mRNA] 4 2 exon (s) 27135 - 28504 1293 bp, chain - ATGCGGATCAGGAAAGGGAGTCATGTGGAGGTGTGGACGCAGGACGCGGCGTCGCCGGTG GGCGCGTGGCGCGTCGGGGAGGTCACCTGGGGCAACGGCCACTCGTACACCATGCGGTGG CACGACGGCGGCGGCGAGGTCTCCGGCCGCATCTCGAGGAAGTCGGTCCGCCCCCGCCCG CCGCCCGCCCCCGTGCCGCGGGACCTCGACGCCGGGGACATGGTCGAGGTGTTCGACCAC GACGACTGCCTCTGGAAGTGCGCCGAGGTCAAGGGCGCCGCCGCCGACGACGACCGCCGC TTCGTCGTCAAGGTCGTCGGCGCCACCAATGTCCTGACGGTCCCGCCGCAGAGGCTCCGC ATCCGGCAGGTTCTCAGGGACGACGACGTCTGGGTCGCGCTCCACAAGAGCTCGTTTCCT GACACCTCGCCGTGGTTCTTTGCTTCTCAGGACAACCAGATCGCCGTCCCTAGCGCGACG CCGCCGTTCCACGCCTACGGCGGAGGCGCTGGCATGGGCATCGGCAGAACCAAAGGCGGC CATAAGCCCATGGCGCCAGGCTTCACGCCGCTGCTGCAGAAGAGGAGCCCGCTGCTGCAG AAGAGAAGCTTCGGTATGCTGGGTTCGAGCACAATAACCCCCAATGGCAAGAGATTCGAC GACACCGCCAAGAGGATTTGTGCCAAGGAAGAGCCCAGATATGAAGTAGAAGTGGTCGTC CCAAACGTGCGCCTGAACAAGCAAGACGAGATGAGCGGCGAAGATGTTGACGTGCTTGGG ACACGCAGTGATTCCGATGATGATCATCATCAGCAGCAGCAGCAGCACGAGGACGAGGAT GACGATGACGATAGTGATGATTCTGCATCATCATCCTCGGATGATGACAGCAGCAGTGAC AGCAGTAACAGCGACAGCAGAACCAGGAGCACCGGAGCCGGCAAGAATTGCACGGCAGCT CTCGCAAGCAGGCCTTGTAACGATCAGAAGGCCGATCAGCTGCAACCCAGCGAGAAAGAA CATCGTGACGACATATCTGAATCGCATCACGAGACCCTGAACGATGAGAAGGCGGCGGTG GTGCAGGAACACATCCACCGTCTGGAGCTGGAGGCCTACACTAATCTGATGAAGGCGTTC CATGCATGTGGCAAAGCGCTGAGCTGGGAGAAGGCCGAACTGCTCACTGACCTCCGCGTG CATCTCCATATCTCTAACGATGAGCACCTGCGGGTGCTTAACATGATCTTGAACCGCAAG GGCAGATTTGGAGGATCACATGCAAATTCTTAA

>FGENESH: 4 2 exon (s) 27135 - 28504 430 aa, chain - MRIRKGSHVEVWTQDAASPVGAWRVGEVTWGNGHSYTMRWHDGGGEVSGRISRKSVRPRP PPAPVPRDLDAGDMVEVFDHDDCLWKCAEVKGAAADDDRRFVVKVVGATNVLTVPPQRLR IRQVLRDDDVWVALHKSSFPDTSPWFFASQDNQIAVPSATPPFHAYGGGAGMGIGRTKGG HKPMAPGFTPLLQKRSPLLQKRSFGMLGSSTITPNGKRFDDTAKRICAKEEPRYEVEVVV PNVRLNKQDEMSGEDVDVLGTRSDSDDDHHQQQQQHEDEDDDDDSDDSASSSSDDDSSSD SSNSDSRTRSTGAGKNCTAALASRPCNDQKADQLQPSEKEHRDDISESHHETLNDEKAAV VQEHIHRLELEAYTNLMKAFHACGKALSWEKAELLTDLRVHLHISNDEHLRVLNMILNRK GRFGGSHANS

>FGENESH:[mRNA] 5 3 exon (s) 30993 - 31548 450 bp, chain -

ATGGCAGCCAAGATATTTGCCCTTCCTGCTCTCCTTGCTCTTTCGTACTTCCCATCAACG ACAACTATGGGATTGGCACACTCGTTCGTGCATACCTACAGGCAACAGCAGGCATTTGTG CCAAGCATCTCACCGTTTTCAACTGTCATCCCACAATTCCCATACCTATACAATCAATTG CCCATCTCGCAACTAGCACACCTATACAACCAATTTGTCATCTCACAGTTGCCATTTCTG TACAACCATCTTGTCATCTCACAATCTTCCATATGTGTACAACCAGCTGCCATATCTATA CAACCAATTGGCTATATCCCACAACTGCCAAACGTGTACAACCAGCTAGCTGTTGCGAAC GCTGGAACCTTCCTGCCATTCAACGAGCTGGCTTTGAGGAACCCTGCCACTTTCTGGCAA CAACCCATTATTGGCAGCGTCTTCTTTTAG

>FGENESH: 5 3 exon (s) 30993 - 31548 149 aa, chain - MAAKIFALPALLALSYFPSTTTMGLAHSFVHTYRQQQAFVPSISPFSTVIPQFPYLYNQL PISQLAHLYNQFVISQLPFLYNHLVISQSSICVQPAAISIQPIGYIPQLPNVYNQLAVAN AGTFLPFNELALRNPATFWQQPIIGSVFF

>FGENESH:[mRNA] 6 1 exon (s) 33766 - 33954 189 bp, chain - ATGGGTGTCCTGCTAGAGAAAGGAGACGGACGCCACACGATGGTCGCGCGTGTCATGATA ATGCCACAGAACAGTGGAGCTGCATCGAATTGGAAAGGATCAAACATGATGGCCACATCA GAGGATTCAGATCCAATAGACAAGGCGCATGGGACAAAGAGGAGGGTGTTGAGAAGTGAA ATATCATAG

>FGENESH: 6 1 exon (s) 33766 - 33954 62 aa, chain - MGVLLEKGDGRHTMVARVMIMPQNSGAASNWKGSNMMATSEDSDPIDKAHGTKRRVLRSE IS

>FGENESH:[mRNA] 7 1 exon (s) 34700 - 34975 276 bp, chain - ATGCCGGGCCGCCGCGTGCTCCTCTGCCTCACGGCGCTGGCGGCGGCACTAGCACCAGTA GCAGAGGGCGACGACCCCTCCACACCAGCCAGGGCTACCGCGGCACCTACAAGTGGCTTG AGCCATCATGGGGCTCCACTCCCCAAGACCGTAGACAGCACATGGTCGAGGGACTCGAGG CCCATGATGAGACGGAAGAGTGCAGAGATGTTGCGTGTCCTTGTAGCTTGTGACGGGTCA TTGGAGACAGGCGCAGAGGGGCCTGGGGCCCCCTAA

>FGENESH: 7 1 exon (s) 34700 - 34975 91 aa, chain - MPGRRVLLCLTALAAALAPVAEGDDPSTPARATAAPTSGLSHHGAPLPKTVDSTWSRDSR PMMRRKSAEMLRVLVACDGSLETGAEGPGAP

>FGENESH:[mRNA] 8 1 exon (s) 36075 - 36458 384 bp, chain - ATGGCGACAGGGAAGGAGAAGAGGGGCGGCGCCGTCGGGGAGGGAGGGGCAGCGCGGCGG CGGACAAACCTGAGCAGGTGGGGTGGAGGTTGCGAATGGGAGGGGGCTGACCCTGCCCAC GCAATCCACGCGGCGGCGGCGGCACTGGAGATCCACGCGACGGCATTGTTGGAGAGGAGA GCGGCGGCGGCGATCCACGAAGCGCAGCCCTGCCCACGCAATCCACGCGGCGGCGGCACT GGAGATCCACGCGACGGCAGTGTTGGAGAGGAGAGCGGCGGCGGCGATCCACGAAGCGGC GGCGTGGCGGCGTTGGAGAGGAGAGGGGAACCGGCGGAGGGGGAAGAACGAGCGTTGCGG GATGCGGGTTTGGCGACGGCGTAG

>FGENESH: 8 1 exon (s) 36075 - 36458 127 aa, chain - MATGKEKRGGAVGEGGAARRRTNLSRWGGGCEWEGADPAHAIHAAAAALEIHATALLERR AAAAIHEAQPCPRNPRGGGTGDPRDGSVGEESGGGDPRSGGVAALERRGEPAEGEERALR DAGLATA

>FGENESH:[mRNA] 9 1 exon (s) 40879 - 41067 189 bp, chain - ATGGACGTGGGCCACCTTCCGACGTACGACCCGCGGTCGGACGCGGCGAAGAAGGAGGCC CTGGACGCCTCGCGCGCCGAACTTGCCCGCACCCTCGTCCACCTCGTCCCCGTCGCCGTG

CTCCTCTGCGGCCTCCTGCTCTGGTCGCTCTCCAGTGGCGACGTCCCCGGTAACGCATCG CAAGCCTAA

>FGENESH: 9 1 exon (s) 40879 - 41067 62 aa, chain - MDVGHLPTYDPRSDAAKKEALDASRAELARTLVHLVPVAVLLCGLLLWSLSSGDVPGNAS QA

>FGENESH:[mRNA] 10 2 exon (s) 44280 - 46686 822 bp, chain - ATGCGCCCGCCACGGCTCCGCGCCGCCGCCGCCGCCGCCGCCTCGTTGCCGCTCCCACGG CGCGGCTTATGCACCTCCGCCGCTGACCCCACCTCCTCCGCGGCATCTTCCTCTCCTTCC CCGCCGCAGCAGTGCGCCACCCAGACCCTCTCCGCTCTATTCGCCAGACTCCCCGCCGCC CGTGGCCCGGCCGTCGCCGATGACCTCGCCTCCTCTCTCCGCGCGCTTCTCGCCTCCTCG CCGACCCACCCGCGCGCCTTCCCGCTTCTCCGGTCCGCCGCCCTGGAGAAGCGCCTCCCA CCCGACGCGCTCGTCGATGCCGTTCTCTCCGCCGCCGACGCCGGCTCGCCGGCTGCGGCC GCGCTCCTCAGCAGACTCCTCGCCTGCCTCTCCCGCACCGCCCGCGACTTCTCGGCAGCC ACGGCCGCGTACGCCCGCATGGTCGCACGGGGCGTCGTCCCGGACGCCAAGTCGCGCACC GACCTGATCGTCGTCACGGCGCGGGGCGCGTCGGCCGCGGACGCGCTCGCGCTGTTCGAT GAAATGCGGGGCAAGGCGTTAACACACGGCCTTTGTCGAAGTGGAGACATTGATGATGCT AAGAAGTTATTGGACGAGATGAGAAGATTAGATGTACATCCAAATACTCTTACTTATAAC ATGCTGATAAATGCATACATCCGTGATGGTAAGCTGCAAGAGGCATTCCAGTTGCATGAT GAAATGCTCAACAGCGGTGTTGTCCCTGATGATACTACACATGATATACTAGTTAGCTTG AAACCTGTGGAAGCAAGCCACACAGATGCAGAAATCCTATGA

>FGENESH: 10 2 exon (s) 44280 - 46686 273 aa, chain - MRPPRLRAAAAAAASLPLPRRGLCTSAADPTSSAASSSPSPPQQCATQTLSALFARLPAA RGPAVADDLASSLRALLASSPTHPRAFPLLRSAALEKRLPPDALVDAVLSAADAGSPAAA ALLSRLLACLSRTARDFSAATAAYARMVARGVVPDAKSRTDLIVVTARGASAADALALFD EMRGKALTHGLCRSGDIDDAKKLLDEMRRLDVHPNTLTYNMLINAYIRDGKLQEAFQLHD EMLNSGVVPDDTTHDILVSLKPVEASHTDAEIL
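A quick sanity check on the listing above is that each mRNA length equals three times the protein length plus one stop codon. The sketch below parses a few of the FASTA headers, copied verbatim, and tests that relation; the header pattern is inferred from this output and the helper names are illustrative.

```python
import re

HEADER = re.compile(
    r"^>FGENESH:(\[mRNA\])?\s+(\d+)\s+\d+ exon \(s\)\s+\d+\s+-\s+\d+"
    r"\s+(\d+)\s+(bp|aa), chain [+-]$"
)

# Headers for genes 1, 2 and 5, copied from the prediction above.
SAMPLE_HEADERS = [
    ">FGENESH:[mRNA] 1 1 exon (s) 6884 - 7057 174 bp, chain -",
    ">FGENESH: 1 1 exon (s) 6884 - 7057 57 aa, chain -",
    ">FGENESH:[mRNA] 2 2 exon (s) 17742 - 20792 1029 bp, chain +",
    ">FGENESH: 2 2 exon (s) 17742 - 20792 342 aa, chain +",
    ">FGENESH:[mRNA] 5 3 exon (s) 30993 - 31548 450 bp, chain -",
    ">FGENESH: 5 3 exon (s) 30993 - 31548 149 aa, chain -",
]


def lengths_by_gene(headers):
    """Collect the stated bp and aa lengths for each gene number."""
    lengths = {}
    for header in headers:
        match = HEADER.match(header.strip())
        if match is None:
            continue
        gene, size, unit = int(match.group(2)), int(match.group(3)), match.group(4)
        lengths.setdefault(gene, {})[unit] = size
    return lengths


def is_consistent(entry):
    """True when the mRNA covers every codon of the protein plus a stop codon."""
    return entry["bp"] == 3 * (entry["aa"] + 1)


stated = lengths_by_gene(SAMPLE_HEADERS)
```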

#5

Please find consensus sequences (domains/motifs) in plant disease resistance genes(Download).

Answer: Go to http://www.ebi.ac.uk/Tools/msa/, click Launch Clustal Omega, paste the sequences to be aligned into the input box, set OUTPUT FORMAT to MSF, and submit. The result is as follows:

!!AA_MULTIPLE_ALIGNMENT 1.0

squid.msf MSF: 2362 Type: P November 06, 2016 09:15 Check: 8558 ..

Name: N     Len: 2362  Check: 2542  Weight: -1.00
Name: M     Len: 2362  Check: 3314  Weight: -1.00
Name: L6    Len: 2362  Check: 8870  Weight: -1.00
Name: Cre3  Len: 2362  Check: 7828  Weight: -1.00
Name: Xa1   Len: 2362  Check:  517  Weight: -1.00
Name: RPS5  Len: 2362  Check: 5106  Weight: -1.00
Name: RPS2  Len: 2362  Check: 1172  Weight: -1.00
Name: RPP8  Len: 2362  Check: 7193  Weight: -1.00
Name: Mi    Len: 2362  Check: 3945  Weight: -1.00
Name: rx    Len: 2362  Check: 8071  Weight: -1.00

//

1 50 N ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ M ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ L6 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Cre3 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Xa1 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ RPS5 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ RPS2 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ RPP8 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Mi MEKRKDNEEA NNSLVLFSAL SKDIADVLVF LENEENQKAL DKDQVEKIKL rx ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~

51 100 N ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ M ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ L6 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Cre3 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Xa1 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ RPS5 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ RPS2 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ RPP8 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Mi KMAFICTYVQ LSCSDFEQFE DIMTRKRQEV ENLLQPLLDD DVFTSLTSNM rx ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~

101 150 N ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ M ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ L6 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Cre3 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Xa1 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ RPS5 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ RPS2 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ RPP8 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Mi DDCISLYHRS YKSDAIMMDE QLDFLLLNLY HLSKHHAEKI FPGVTQYEVL rx ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~

151 200 N ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ M ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ L6 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~

Cre3 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Xa1 ~~~~~~MEEV EAGW...... .......... ......LEGG IRWLAETI.. RPS5 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ RPS2 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ RPP8 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Mi QNICGNIRDF HGLIVNGCIK HEMVENVLPL FQLMADRVGH FLWDDQTDED rx ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~

201 250 N ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ M ~~~~~~~~~~ ~~~~~~~~~~ MSYLRDVATA VALLLDNLCC GRPNLNNDN. L6 ~~~~~~~~~~ ~~~~~~~~~~ MSYLREVATA VALLLPFILL NKFWRPNSK. Cre3 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Xa1 ..LDNLDADK L......... .......... .......... .......... RPS5 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ RPS2 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ RPP8 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Mi SRLSELDEDE QNDRDSRLFK LAHLLLKIVP VELEVIHICY TNLKASTSAE rx ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~

251 300 N ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~M ASSSSSSR.. M .EDTIQQT.D STSPVVDPSS SSQSM..DST SVVDAISDST NPSASFPS.. L6 .......... .......DSI VNDDD..DST SEVDAISDST NPSGSFPS.. Cre3 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Xa1 .DEWIRQIRL AAD....TEK LRAEIE.KVD GVVAAVKGRA IGNRSLARSL RPS5 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ RPS2 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~M RPP8 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Mi VGLFIKQLLE .TS....PDI LREYLIPLQE HMVTVITPST SGARNIHVMM rx ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~

301 350 N .....WSY.. .DVFLSFRGE DTRKTFTSHL YEVLNDKGI. ...KTFQDDK M .....VEY.. .DVFLSFRGP DTRYQITDIL YRFLCRSKI. ...HTFKDDD L6 .....VEY.. .EVFLSFRGP DTREQFTDFL YQSLRRYKI. ...HTFRDDD Cre3 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Xa1 GRLRGLLYDA DDAVD...EL D........Y FRLQQQVEGG VTTR.FEAEE RPS5 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~MGG CVSVSLSCDR RPS2 D......... ...FI..... .......... ....SSLIVG CAQVLCESM. RPP8 ~~~~~~~~~M AEAFVSFGLE K........L WDLLSRESER .......... Mi EFLLLILSDM PKDFI..HHD K........L FDLLDRVGVL TREV.STLVR rx ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~MA YAAV.TSLMR

351 400 N RLEYGAT... .......... .......... .......... .......... M ELHKGEE... .......... .......... .......... .......... L6 ELLKGKE... .......... .......... .......... .......... Cre3 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Xa1 TVGDGAEDED DIPM...DNT DVPEAVAAGS SKKRSKAWEH FTTVEFTADG RPS5 EVNQ...FSQ WLCVSGSYIQ NLSENLAS.. ...LQKAMGV L......... RPS2 .......... NMAERRGHKT DLRQAITD.. ...LETAIGD L......... RPP8 .......... .......... .......... .......... .......... Mi DLEEEPRNKE GNNQTNCATL DLLENIEL.. ...LKKDLKH V......... rx TIHQ...... SMELTGCDLQ PFYEKLKS.. ...L...... ..........

401 450 N .IPGELCKAI EESQFAIVVF SEN....... .....YAT.. ..SRWCLNEL M .IKVNLLRAI DQSKIYVPII SRG....... .....YAD.. ..SKWCLMEL L6 .IGPNLLRAI DQSKIYVPII SSG....... .....YAD.. ..SKWCLMEL Cre3 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Xa1 KDSKARCKYC HKDLCCTSKN GTSALRNHL. .......... .......... RPS5 .......NAK RDD....... .......... .......... .VQGRINREE RPS2 .......KAI RDD....... .......... .......... .LTLRIQQDG RPP8 .......... .......... .......... .......... ...LQGVDEQ Mi .....YLKAL DSSQCCFPMS DGPLFMHLLH IHLNDLLDSN AYSIALIKEE rx .......RAI LEK.SCNIMG DH........ .......... ..........

451 500 N VKIMECK.TR FKQTVIPIFY DVDPSHVR.. .......... .......... M AKIVRHQKLD TRQIIIPIFY MVDPKDVR.. .......... .......... L6 AEIVRRQEED PRRIILPIFY MVDPSDVR.. .......... .......... Cre3 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Xa1 .NVCKR.... KR..VTSTDQ PVNPSSAGEG A......... .....SNATG RPS5 FTGHRR.... RL........ .........A QVQVWLTRIQ TIENQFNDLL RPS2 LEGRSC.... .S........ .........N RAREWLSAVQ VTETKTALLL RPP8 IDGLKR.... QLRSLQSLLK DADAKKHGSD RVRNFLEDVK DLVFDAEDII Mi IELVKQ.... DLKFIRSFFV DAEQ.....G LYKDLWARVL DVAYEAKDVI rx .......... .......... ...E.....G .LTILEVEIV EVAYTTEDMV

501 550 N .......... ......NQKE S.FAKAFE.. .EHETKYKDD VEGIQRWRIA M .......... ......HQTG P.YRKAFQ.. .KHSTRYDE. .MTIRSWKNA L6 .......... ......HQTG C.YKKAFR.. .KHANKFDG. .QTIQNWKDA Cre3 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~MSN KKLVKSLKKI ENIINEAH.. Xa1 NSVGRKRMRM DGTS..THHE A.VSTHPWNK AELSNRIQCM THQLEEAV.. RPS5 S.....TCNA E..IQ..... R.LCLCGF.. .CS.KNVKMS YLYGKRVIVL RPS2 V.....RFRR R..EQRTRMR RRYLSCFG.. .CA......D YKLCKKVSAI

RPP8 ESYVLNKLRG EGKGVKKHVR R.LARFLT.. .DRHKVASDI EGITKRIS.. Mi DSI....... ...IVRDN.. G.LLHLIF.. .SLPITIKKI KLIKEE.... rx DSESRNVFLA QNLEERSR.. A.MWEIFF.. .VLEQALECI DSTVKQWM..

551 600 N LNEAANLKGS CDNRDKTDAD CIRQIVDQ.I ..SSKLCKIS .LSYLQNIVG M LNEVGALKGW HVKNNDEQGA IADEVSAN.I ..WSHISKEN FILETDELVG L6 LKKVGDLKGW HIGKNDKQGA IADKVSAD.I ..WSHISKEN LILETDELVG Cre3 .QILEKL... .......NLS SISDGN.IRH TMVV.NPTTT .AVSPQKVFG Xa1 .NEVMRL... .......CRS SSSNQSRQGT PPAT.NATTS SYLPEPIVYG RPS5 LREVEGLSSQ GVFDIVTEAA P..IAEVEEL P......... ...IQSTIVG RPS2 LKSIGELRER SE.AIKTDGG SI.QVTCREI P......... ...IKS.VVG RPP8 .DVIGEMQSF GIQQIIDGVR SLSLQE..RQ RVQREIRQTY PDSSESDLVG Mi ...ISAL... ..DENIPKDR GLIVVNSPKK PVER.....K SLTTDKITVG rx .ATSDSM... ..KDLKPQTS SL..VSLPEH .........D VEQPENIMVG

601 650 N IDTHLEKIES LLEIG..... .....INGVR IMGIWGMGGV GKTTIARAIF M IDDHVEVILE MLSLD..... .....SKSVT MVGLYGMGGI GKTTTAKAVY L6 IDDHITAVLE KLSLD..... .....SENVT MVGLYGMGGI GKTTTAKAVY Cre3 RDNDRDKIIA MLHEKEGGLD PSTSKGLCFS VIGIHGVSGS GKSTLAQFVY Xa1 RAAEMETIKQ LIMS...... ...NRSNGIT VLPIVGNGGI GKTTLAQLVC RPS5 QDSMLDKVWN CLM....... .....EDKVW IVGLYGMGGV GKTTLLTQIN RPS2 NTTMMEQVLE FLSE...... .....EEERG IIGVYGPGGV GKTTLMQSIN RPP8 VEQSVKELVG HLVE...... .....NDVHQ VVSIAGMGGI GKTTLARQVF Mi FEEETNLILR KLTSG..... .....SADLD VISITGMPGS GKTTLAYKVY rx RENEFEMMLD QLARG..... .....GRELE VVSIVGMGGI GKTTLATKLY

651 700 N DTLLGRMD.. SSYQFDGACF LKDIK...EN KRGMHSLQNA LLSELLREK. M NKIS...... ..SHFDRCCF VDNVRAMQEQ KDGIFILQKK LVSEILRMD. L6 NKIS...... ..SCFDCCCF IDNIRETQE. KDGVVVLQKK LVSEILRIDS Cre3 AHEKNDKQDN KEDHFDLVMW VHV.....SQ DFSVWGIFKE LYEAASDPKV Xa1 KDL......V IKSQFNVKIW VYV.....SD KFDVVKITRQ ILDHVSNQ.. RPS5 NKFS.....K LGGGFDVVIW VVV.....SK NATVHKIQKS IGEKLGLVGK RPS2 NELI.....T KGHQYDVLIW VQM.....SR EFGECTIQQA VGARLGL... RPP8 HHDL.....V .RRHFDGFAW VCV.....SQ QFTQKHVWQR ILQELQPHDG Mi NDKS.....V .SSRFDLRAW CTV.....DQ GCDEKKLLNT IFSQVSDSDS rx SDPC.....I .MSRFDIRAK ATV.....SQ EYCVRNVLLG LLSLTSDE..

701 750 N ..ANYNNEED GKHQMASRLR SKKVLIVLDD IDNKD....H YLEYLAGDLD M .SVGFTNDSG GRKMIKERVS KSKILVVLDD VDEKF....K FEDILGCP.K L6 GSVGFNNDSG GRKTIKERVS RFKILVVLDD VDEKF....K FEDMLGSP.K

Cre3 PCPQFNNLNA LEEELERKLD GKRFLLVLDD VWCNADVGNQ ELPKLLSPLK Xa1 SHEGISNLDT LQQDLEEQMK SKKFLIVLDD VWEIRT...D DWKKLLAPLR RPS5 NWDEKNK.NQ RALDIHNVLR RKKFVLLLDD IWEKV..... ELKVIGVPYP RPS2 SWDEKETGEN RALKIYRALR QKRFLLLLDD VWEEI..... DLEKTGVPRP RPP8 DILQ.MDEYA LQRKLFQLLE AGKYLVVLDD VWKKE..... DWDVIKAVFP Mi KLS...ENID VADKLRKQLF GKRYLIVLDD VWDTT..... TWDELTRPFP rx ......PDDQ LADRLQKHLK GRRYLVVIDD IWTTE..... AWDDIKLCFP

751 800 N WFG....... ..NGSRIIIT TRDKHLI... .EK.NDIIYE VTALPDHESI M DFD....... ..SGTRFIIT SRNQNVLSRL NEN.QCKLYE VGSMSEQHSL L6 DFI....... ..SQSRFIIT SRSMRVLGTL NEN.QCKLYE VGSMSKPRSL Cre3 KGK....... ..KGSKILVT TRSKYALPDL CPGVRYTAMP ITEVDDTAFF Xa1 PNDQVNSSQE EATGNMIILT TRIQSIAKSL GT...VQSIK LEALKDDDIW RPS5 SGE....... ..NGCKVAFT THSKEVCGRM G.V..DNPME ISCLDTGNAW RPS2 DRE....... ..NKCKVMFT TRSIALCNNM G.A..EYKLR VEFLEKKHAW RPP8 R.K....... ..RGWKMLLT SRNEGVGIHA DPT..CLTFR ASILNPEESW Mi ESK....... ..KGSRIILT TREKEVALHG KLN..TDPLD LRLLRPDESW rx DCY....... ..NGSRILLT TRNVEVAEYA SSG..KPPHH MRLMNFDESW

801 850 N QLFKQHAFGK EV.....PNE NFEKLSLEVV NYAKGLPLAL KVWGSLLHNL M ELFSKHAFKK NT.....PPS DYETLANDIV STTGGLPLTL KVTGSFLFRQ L6 ELFSKHAFKK NT.....PPS YYETLANDVV DTTAGLPLTL KVIGSLLFKQ Cre3 ELFMHYALE. .D...GQDQS MFQNIGVEIA KKLKGSPLAA RTVGGNLRRQ Xa1 SLFKVHAFGN DK...HDSSP GLQVLGKQIA SELKGNPLAA KTVGSLLGTN RPS5 DLLKKKVGEN T....LGSHP DIPQLARKVS EKCCGLPLAL NVIGETMSFK RPS2 ELFCSKVWRK D....LLESS SIRRLAEIIV SKCGGLPLAL ITLGGAMAHR RPP8 KLCERIVFPR RDETEVRLDE EMEAMGKEMV THCGGLPLAV KVLGGLLANK Mi ELLEKRAFGN E....S.CPD ELLDVGKEIA ENCKGLPLVA DLIAGVIAGR rx NLLHKKIFEK E....GSYSP EFENIGKQIA LKCGGLPLAI TVIAGLLSKM

851 900 N ..RLTEWKSA IEHMKNN... .......SYS GIIDKLKISY DGLEPK.QQE M ..EIGVWEDT LEQLRKTL.. .......DLD EVYDRLKISY DALKAE.AKE L6 ..EIAVWEDT LEQLRRTL.. .......NLD EVYDRLKISY DALNPE.AKE Cre3 .QDVDHWRRV GDQDLFKV.. .....WT... ...GPLWWSY YQLGEQ.ARR Xa1 .LTIDHWDSI IKSEEWKS.. .....LQQAY GIMQALKLSY DHLSNP.LQQ RPS5 .RTIQEWRHA TEVLTSATD. ....FSGMED EILPILKYSY DSLNGEDAKS RPS2 .ETEEEWIHA SEVLTRFPA. ....EMKGMN YVFALLKFSY DNLESDLLRS RPP8 .HTVPEWKRV SDNIGSQIVG GSCLDDNSLN SVYRILSLSY EDLPTH.LKH Mi EKKRSVWLEV QSSLSSFI.. .....LNSEV EVMKVIELSY DHLPHH.LKP rx GQRLDEWQRI GENVSSVVS. .....TDPEA QCMRVLALSY HHLPSH.LKP

901 950 N MFLDIACFLR GEE..KDYIL QILESCHIGA EYG....... .....LRILI M IFLDIACFFI GRN..KEMPY YMWSECKFYP KSN....... .....IIFLI L6 IFLDIACFFI GQN..KEEPY YMWTDCNFYP ASN....... .....IIFLI Cre3 CFAYCSIFPR RHRLYRDELV RLWMAEGFIR NTDEGADAED VGLGIFNELL Xa1 CVSYCSLFPK GYSFSKAQLI QIWIAQGFVE ESSEKL..EQ KGWKYLAELV RPS5 CFLYCSLFPE DFEIRKEMLI EYWICEGFIK EKQGREKAFN QGYDILGTLV RPS2 CFLYCALFPE EHSIEIEQLV EYWVGEGFLT SSHGVNTIY. KGYFLIGDLK RPP8 RFLFLAHFPE YSKISAYDLF NYWAVEGIYD ...GSTIQ.D SGEYYLEELV Mi CLLYFASFPK DTSLTIYELN VYFGAEGFVG KTEMNSME.E VVKIYMDDLI rx CFLYFAIFTE DEQISVNELV ELWPVEGFLN EEEGKSIE.E VATTCINELI

951 1000 N DKSLVFIS.. .....EYNQV QMHDLIQDMG KYIVNFQK.. DPGE...RS. M QRCMIQVG.. .....DDGVL EMHDQLRDMG REIVRREDVQ RPWK...RS. L6 QRCMIQVG.. .....DDDEF KMHDQLRDMG REIVRREDV. LPWK...RS. Cre3 SISFLQPGGQ DWYNHGKEYY LVHDLLYDLA GAVAGTDCFR IDNNMIQKGE Xa1 NSGFLQQVES T..RFSSEYF VMHDLMHDLA QKVSQTEYAT IDGSECT... RPS5 RSSLLLEG.. ...AKDKDVV SMHDMVREMA LWIFSDLGKH KERCIVQ.AG RPS2 AAC.LLET.. ...GDEKTQV KMHNVVRSFA LWMASEQGTY KELILVE.PS RPP8 RRNLVIADNR .YLSSHSKNC QMHDMMREVC LSKAKEENFL QIIK....DP Mi YSSLVIC..F .NEIGYALNF QIHDLVHDFC LIKARKENLF DQIR....SS rx DRSLIFIHNF .SFRGTIESC GMHDVTRELC LREARNMNFV NVIR....GK

1001 1050 N .......... .......... ...RLWLAKE VEEVMSNNTG TMAMEAI... M .......... .......... ...RIWSREE GIDLLLNKKG SSQVKAI... L6 .......... .......... ...RIWSAEE GIDLLLNKKG SSKVKAI... Cre3 SWAKDVPRDV RHLFVQS... YDA....... ..ALITGKIL VLE....... Xa1 ....ELAPSI RHLSIVTDSA YRKEKYRNIS RNEVFEKRLM KVK......S RPS5 IGLDELPE.. .......... ..VENWRAVK RMSLMNNNFE KILGSPE... RPS2 MGHTEAPK.. .......... ..AENWRQAL VISLLDNRIQ TLPEKLI... RPP8 TSTSTINAQ. ......S... ..PS...RSR RLSIHSGKAF HLLGHKN... Mi APSDLLPRQ. ......I... ..TIDCDEEE HFGLN...FV MFDSNKKRHS rx SDQNSCAQS. ......M... ..QRSFKSRS RIRIHKVEEL AWCRNSEAHS

1051 1100 N ..WVSS...Y ......SSTL RFSNQAVKNM KRLRVFNMGR SSTHYA.I.. M ..SIPNNMLY ..AWESGVKY EFKSECFLNL SELRLFFVGS TTLLTGDF.. L6 ..SIPW.... ......GVKY EFKSECFLNL SELRYLH.AR EAMLTGDF.. Cre3 ..NLHTLVIY SVGGDTTVEE IVIKNILKSL PKLRVLAIAL CL.EKDGFIC Xa1 RSKLRSLVLI ..GQYDSHFF KYFKDAFKEA QHLRLLQITA TYADSDS... RPS5 CVELITLFLQ ..NNYK.LV. DISMEFFRCM PSLAVLDLSE NH........ RPS2 CPKLTTLMLQ ..QNSS.LK. KIPTGFFMHM PVLRVLDLSF .T........

RPP8 NTKVRSLIVW ..DE...DFG IRSASVFHNL TLLRVLDLYW VK........ Mi GKHLYSLRII ..GDQL.DDS VSDAFHLRHL RLLRVLDLHT SF........ rx ......IIML ..GGFE.CVT LE.....LSF KLVRVLDLGL NT........

1101 1150 N ....DYLPNN LRCFVCTNYP WESFPS.... .......... .......... M ....NNL... .....LPNLK WLDLPRYA.. ....H..... ..GLYDPPVT L6 ....NNL... .....LPNLK WLELPFYK.. ....H..... ..GEDDPPLT Cre3 RPNILSVPES ISQ..LKHLR YLAFRTDI.. .......... .ECRVILPSS Xa1 ......FLSS LVN..STHLR YLKIVTEE.. .......... ..SGRTLPRS RPS5 S..LSELPEE ISE..LVSLQ YLDLSGTY.. .......... ...IERLPHG RPS2 S..ITEIPLS IKY..LVELY HLSMSGTK.. .......... ...ISVLPQE RPP8 F.EGGKLPSS IGG..LIHLR YLSLFLAG.. .......... ...VSHLPST Mi IMVKDSLLNE ICM..LNHLR YLSIDTQ... .......... ...VKYLPLS rx WPI...FPSG VLS..LIHLR YLSLRFNPCL QQYQGSKEAV PSSIIDIPLS

1151 1200 N TFELKMLVHL QLRHNSL.RH LWTETKHLPS LRRIDLSWSK RL..TRTPDF M NFTMKKLVIL VSTNS..... .KTEWSHMI. .......... ....KMA... L6 NYTMKNLIIV ILEHSHITAD DWGGWRHMM. .......... ....KVC... Cre3 LNQLYQMQLL DFGVC..... .......... .......MN. .LVFSC.GD. Xa1 LRKYYHLQVL DIGYR..... .......... .......FG. .IPRIS.NDI RPS5 LHELRKLVHL KLERT..... .......... .........R RLESISG..I RPS2 LGNLRKLKHL DLQRT..... .......... .........Q FLQTIPRDAI RPP8 MRNLKLLL.. YLNLS..... .......... .......VNN KEPIHVPNVL Mi FSNLWNLESL FVSTN..... .......... .......... RSILVLLPRI rx ISSLCYLQTF KLNLP..... .......... .......FPS YYPFILPSEI

1201 1250 N TGMPNLEYVN LYQCSNLEEV HHSLGCCSKV IG...LYLND CKSLKRFPCV M ...PRLKVVR LYSDYGVSQ. ..RLSFCWRF PKSIEVLSMS GIEIKEVDIG L6 ...CFSA.VH MKVCYLLIC. ..SSYFCFNL LSDG~~~~~~ ~~~~~~~~~~ Cre3 ........LI NL...RH... .....VCSG. .......... .PGLQFSNIG Xa1 NNLLSLRHLV AY...DE... .....VCS.. .......... ....SIANIG RPS5 SYLSSLRTLR LRDSKTT... .....LDTG. .......... ....LMKELQ RPS2 CWLSKLEVLN LYYSYAG... .....WELQS FG.....EDE AEELGFADLE RPP8 KEMIQLRYLS LP........ .....LKKD. DK........ .TKLEL...G Mi LDLVKLRVLS VD........ .....ACSF. FD.....MDA DESILIAEDT rx LTMPQLRTLC MG........ .....WNYL. RS.....HEP TEN.....RL

1251 1300 N NVESLEYLGL RSCDSLEKLP EIYGRMKPEI QIHMQGSG.. .IRELPSSIF M ELKNLKTLDL TSCRIQKISG GTFGMLKGLI ELRLDSIKCT NLREVVADIG L6 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~

Cre3 RLVSLQTIPA FKVSHE.... .......... ....QGHEAK QLRYLNRLSG Xa1 KMTSLQELGN FIVQNN.... .......... ...LSGFEVT QLKSMNKLV. RPS5 LLEHLELIT. TDISSG.... .......... ...L...... .VGELFC.YP RPS2 YLENLTTLGI TVLSLE.... .......... ...T...... .LKTLFE.FG RPP8 DLVNLEFLF. ...GFS.... .......... ...T...... QHSSVTDL.L Mi KLENLRILTE LLISYS.... .......... ...K...... DTKNIFKRFP rx VLKNLQCLNQ LNPRY..... .......... .......... CTGSFFRLFP

1301 1350 N QY......KT HVTKLLLWN. ...MKN.LV. .......... .......... M QLSSLKVLKT EGAQEVQFEF PLALKE.LST S......... .......... L6 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Cre3 ELSIYG.LQS VESREEALAF DLAAKKRLAE LTLSF..... .......... Xa1 QLSVSQ.LEN VRTQEEACGA KLKDKQHLEK LHLSWKDAWN GYDSDESYED RPS5 .......... .......... ..RVGRCIQH I..YIRDHWE R......... RPS2 .......... .......... ..ALHKHIQH L..HVEECNE L......... RPP8 .......... .......... ..HMTK.LRY LAVSLSERCN FETL...... Mi .......... .......... ..NLQL.LS. F..ELKESWD YSTE.....Q rx .......... .......... ..NLKK.LQV F..GVPEDFR NSQD.....L

1351 1400 N ALPSSICRLK S.LVSLSVSG CSKLESLPEE ....IGDLDN LRV.....FD M SRIPNLSQLL D.LEVLKVYG CNDGFDIPPA ....KSTEDE GSV.....WW L6 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Cre3 .......... .......... .......... .........G GSS....... Xa1 EYGSDMNIET E.GEELSVGD ANGAQSLQHH .......SNI SSE....... RPS5 .......... .PEESVGVLV LPAIHNLCYI ....S..... ........IW RPS2 LYF.NLPSLT NHGRNLRRLS IKSCHDLEYL ....VTPAD. ........FE RPP8 ..SSSLRELR N.LETLYVLF SPEIFMVDYM GEFVL..... .........D Mi HWFSELDFLT E.LETLSVGF .........K ...SSNTNDS GSS....... rx Y...DFRYLY Q.LEELTFRL Y.....YPYA ACFLKNTAPS GSTQDPLRFQ

1401 1450 N .ASD...... TLILRPPSSI IRLNKLIILM FRGFKDG.VH FEFPP..... M KASK...... LKSLKLYRTR IN......IN VVDASSG.GR YLLPSSLTSL L6 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Cre3 ....EVAAEV LEGLCPPVGL VT......LD IRDYDGLVYP KWMVGRQNGA Xa1 ....LASSEV LEGLEPHHGL KY......LR ISGYNGSTSP TWLPSSLTCL RPS5 NCWM.WE... .....IMIEK TP........ ........WK KNLT..NPNF RPS2 NDWL.PSLEV LTLHSLHNLT RV........ ........WG NSVS..QDCL RPP8 HFIHLKEL.. ..GLAVRMSK IP........ ........DQ HQLP..PH.L Mi .......... ......VATN RP........ ........WD FHFP..SN.L rx TEILHKEID. FGGTAPPTLL LP........ ........PP DAFP..QN.L

1451 1500 N .......... ........VA EGLHSLEYLN ......LSYC NLIDGGLP.. M ......EIYW C...KEPTWL PGIENLENLT SLVVDDVDIF QTLGGDL... L6 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Cre3 PE..KLQQLG LSGWSQPGPA PA........ .......... .......... Xa1 QT...LHLEK CGKWQIL.PL ERLGLLVKLV ......LIKM .........R RPS5 SNLSNVRIEG CDGLKDLTWL LFAPNLINLR ......VWGC KHLEDIISKE RPS2 RNIRCINISH CNKLKNVSWV QKLPKLEVIE ......LFDC REIEELISEH RPP8 AQI.Y.ICNC RMEEDPMPIL EKLLHLKSVK ......LTFK AFAG.....R Mi KIL.W.LREF PLTSDSLSTI ARLPNLEELS ......LYHT IIHG.....E rx KSLTF.RGEF SVAWKDLSIV GKLPKLEVLI ......LSWN AFIG.....K

1501 1550 N .......EEI GSLSSLKKLD LSRN...... .......... .......... M .......DGL QGLRSLETLT ITEVNGLTRI KGLMDLLCSS TCKLEKLEIK L6 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Cre3 .......... .....LKAFN HLRCLNL... .......... .........M Xa1 NA......TE LSIPSLEELV LIALPSL... .......... .........N RPS5 KAASVLEKEI LPFQKLECLN LYQLSEL... .......... .......... RPS2 ESPSVEDPTL FP..SLKTLR TRDLPEL... .......... .......... RPP8 R.MVCSK.G. .GFTQLCALE ISEQSEL... .......... .......... Mi E.WNMGEED. .TFENLKFLN .FNQVSI... .......... .......... rx E.WEVVE.E. .GFPHLKFLF .LDDVYI... .......... ..........

1551 1600 N ....NFEHLP S....SIAQL GALQSL.DLK DCQRLTQLPE LPP....... M ACHDLTEILP CELHDQTVVV PSFEKL.TIR DCPRLEVGPM IRSLPKFP.. L6 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Cre3 HCS..WNALP ....CNMEHL SSLETV.III KCLNIRSLPT LPQSL..... Xa1 TCS..CTSIR .......NLN SSLKVL.KIK NCPVLKVFPL FEISQKFEIE RPS5 ......KSIY ....WNALPF QRLRCLDILN NCPKLRKLPL DSKSVV.... RPS2 ......NSIL ....PSRFSF QKVETL.VIT NCPRVKKLPF QERRTQ.... RPP8 ......EEWI ....VEEGSM PCLRTL.TIH DCEKLKELPD GLKYIT.... Mi ......SKWE ....VGEESF PNLEKL.KLR GCHKLEEIPP SFGDIY.... rx ......RYWR ....ASSDHF PYLERV.ILR DCRNLDSIPR DFADIT....

1601 1650 N .......ELN ELHVDCHMAL KFIHYLVTK. ...R....KK L......... M .......MLK KLDLAV.... ....ANITK. ...E....ED LDVIGSLQEL L6 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Cre3 .TYFWLL... .......... KCDDGFMESC QTVGHPNWKK I...QHICRK Xa1 RTSSWLPHLS KLTIYN.CPL SCVHSSLPPS AISGYGEYGR CTLPQSLEEL RPS5 .......KVE EFVIKY.KEK KWIERVEWE. DEAT....QY RFLP...... RPS2 .........M NLPTVY.CEE KWWKALEKD. QPNE....EL CYLP...... RPP8 .......SLK ELKIEG.MKR EWKE...KL. VPGG....ED YYKVQHIP.. Mi .......SLK SIKIVK.SPQ LE.DSALKI. KEYA....ED MRGGDELQIL rx .......TLA LIDIDY.CQQ SVVNSAKQI. QQDI....QD NYGS.SIEVH

1651 1700 N .......... .....HRVKL DDAHNDTMYN LFAYTMFQNI SSMRHDI... M .......... ...VDLRIEL DDTSSG.IER IASLSKLKKL TTLRVKVPSL L6 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Cre3 YFSE~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Xa1 YIHEYSQETL QPCFSGNLTL LRK.L..... ....HVLGNS NLVSLQLHSC RPS5 ........TC RLR~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ RPS2 ........RF VPN~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ RPP8 .....DV.QF INC....... DQ~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Mi GQK..NIPLF K~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ rx .....TRHLF IPKSVTTVED DDDSVTT... ....DEDDDD DDFEKEVASC

1701 1750 N ......SASD SLSLTVFTGQ PYPEKIPSWF HHQ.....GW DSSVSV.NLP M REIEELAALK SLQRLILEGC TSLERLRL.. .......... E...KL.KEP L6 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Cre3 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Xa1 TALEELIIQS CESLSSLDGL QLLGNLRLLR AHRCLSGHGE DGRCILPQSL RPS5 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ RPS2 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ RPP8 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Mi ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ rx RNNVE~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~

1751 1800 N ENWYIPD... KFLGFAVCYS .......... ....RSLIDT TAHLIPVCD. M DIGGCPDLTE LVQTVVVCPS .......... ......LVEL TIRDCPRLEV L6 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Cre3 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Xa1 EELYIHEYS. .QETLQPCFS GNLTLLRKLH VLGNSNLVSL QLHSCTALEE RPS5 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ RPS2 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ RPP8 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Mi ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ rx ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~

1801 1850 N DKMSRMTQK. .....LA.LS ECDTE..... SSNYSEWDIH FFFVPFAGLW M GPMIRSLPK. .....FPMLK KLDLA..... VANIIEEDLD VIG.SLEELV L6 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Cre3 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Xa1 .LIIQSCESL SSLDGLQLLG NLRLLRAHRC LSGHGEDGRC ILPQSLEELY RPS5 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ RPS2 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ RPP8 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Mi ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ rx ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~

1851 1900 N D.TSKANGKT PNDYGIIRLS FS........ .......... .......... M ILSLKLD..D TSSSSIERIS FLSKLQKLFR LR........ .......... L6 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Cre3 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Xa1 IHEYSQETLQ P........C FSGNLTLLRK LHVLGNSNFV SLQLHSCTAL RPS5 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ RPS2 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ RPP8 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Mi ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ rx ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~

1901 1950 N .......... .....GEEKM YGLRLLYKEG PEV....... ......NALL M .....VKVSS LREIEGLAEL KSLQLLFLKG CTS....... ......LERL L6 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Cre3 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Xa1 EELIIQSCES LSSLDGLQLL GNLRLLQAHR CLSGHGEDGR CILPQSLEEL RPS5 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ RPS2 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ RPP8 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Mi ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ rx ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~

1951 2000 N QMRENSNEPT EHS.....TG IRR.TQYNNR TSFYE..... ......LING M W.......PD .......... ..E.QQLDNN KSMRI..... ......DIRG L6 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Cre3 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Xa1 YIHEYSQETL QPCFSGNLTL LRKLHVLGNS NLVSLQLHSC TALEELIIQS RPS5 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ RPS2 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ RPP8 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Mi ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ rx ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~

2001 2050 N ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ M CKSLSVDH.L SAL....... .......... ..KSTLPPNV KIRWP....D L6 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Cre3 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Xa1 CESLSSLDGL QLLGNLRLLQ AHRCLSGHGE DGRCILPQSL EELYIHEYSQ RPS5 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ RPS2 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ RPP8 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Mi ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ rx ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~

2051 2100 N ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ M EKYK~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ L6 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Cre3 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Xa1 ETLQPCFSGN LTLLRKLHVL GNSNLVSLQL HSCTSLEELK IQSCESLSSL RPS5 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ RPS2 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ RPP8 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Mi ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ rx ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~

2101 2150 N ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ M ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ L6 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Cre3 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Xa1 DGLQLLGNLR LLQAHRCLSG HGEDGRCILP QSLEELFISE YSLETLQPCF RPS5 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ RPS2 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ RPP8 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Mi ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ rx ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~

2151 2200 N ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ M ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ L6 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Cre3 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Xa1 LTNLTCLKQL EVSGTTSLKS LELQSCTALE HLKIQGCASL ATLEGLQFLH RPS5 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ RPS2 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ RPP8 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Mi ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ rx ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~

2201 2250 N ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ M ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ L6 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Cre3 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Xa1 ALRHMKVFRC PGLPPYLGSS SEQGYELCPR LERLDIDDPS ILTTSFCKHL RPS5 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ RPS2 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ RPP8 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Mi ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ rx ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~

2251 2300 N ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ M ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ L6 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Cre3 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Xa1 TSLQRLELNY CGSEVARLTD EQERALQLLT SLQELRFKYC YNLIDLPAGL RPS5 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ RPS2 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ RPP8 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Mi ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ rx ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~

2301 2350 N ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ M ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ L6 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Cre3 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Xa1 HSLPSLKRLE IRSCRSIARL PEKGLPPSFE ELDIIACSNE LAQQCRTLAS RPS5 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ RPS2 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ RPP8 ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ Mi ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ rx ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~ ~~~~~~~~~~

2351 2362 N ~~~~~~~~~~ ~~ M ~~~~~~~~~~ ~~ L6 ~~~~~~~~~~ ~~ Cre3 ~~~~~~~~~~ ~~ Xa1 TLKVKINGGY VN RPS5 ~~~~~~~~~~ ~~ RPS2 ~~~~~~~~~~ ~~ RPP8 ~~~~~~~~~~ ~~ Mi ~~~~~~~~~~ ~~ rx ~~~~~~~~~~ ~~

Based on the alignment, MEGA 7.0 can then be used to look for domains and motifs; part of the result was captured as a screenshot.
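One simple way to spot candidate motifs in an alignment is to score how conserved each column is. The sketch below is only an illustration of that idea and is not what MEGA does internally. It uses the ten-residue slice at alignment positions 1581-1590 (the fourth group of the 1551-1600 block above), and it assumes that `.`, `~` and `-` all mark gaps or padding. The function name `column_conservation` is made up for this example.

```python
from collections import Counter

GAP_CHARS = set("-.~")  # assumption: all three symbols mean "no residue here"

# Alignment positions 1581-1590, copied from the 1551-1600 block above.
slice_1581_1590 = {
    "N":    "DCQRLTQLPE",
    "M":    "DCPRLEVGPM",
    "L6":   "~~~~~~~~~~",
    "Cre3": "KCLNIRSLPT",
    "Xa1":  "NCPVLKVFPL",
    "RPS5": "NCPKLRKLPL",
    "RPS2": "NCPRVKKLPF",
    "RPP8": "DCEKLKELPD",
    "Mi":   "GCHKLEEIPP",
    "rx":   "DCRNLDSIPR",
}


def column_conservation(aligned):
    """Per column: fraction of non-gap residues equal to the most common residue."""
    length = len(next(iter(aligned.values())))
    scores = []
    for col in range(length):
        residues = [seq[col] for seq in aligned.values()
                    if seq[col] not in GAP_CHARS]
        if not residues:
            scores.append(0.0)  # column contains only gaps
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        scores.append(top_count / len(residues))
    return scores


scores = column_conservation(slice_1581_1590)
# Columns where every sequence that has a residue shares the same one.
fully_conserved = [1581 + i for i, s in enumerate(scores) if s == 1.0]
print(fully_conserved)
```

In this slice the fully conserved columns carry a cysteine and a proline. Runs of high-scoring columns like these are the places worth checking against a domain database before calling anything a motif.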

#7

Construct a phylogenetic tree for the above plant disease resistance genes

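Answer: As in the previous question, the Clustal Omega online program can generate a phylogenetic tree while it performs the alignment. Alternatively, MEGA 7.0 can align the sequences locally and then build the tree from that alignment. The result is shown in the figure below.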

Figure. Evolutionary relationships of taxa

The evolutionary history was inferred using the Neighbor-Joining method [1]. The bootstrap consensus tree inferred from 1000 replicates [2] is taken to represent the evolutionary history of the taxa analyzed [2]. Branches corresponding to partitions reproduced in less than 50% bootstrap replicates are collapsed. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (1000 replicates) are shown next to the branches [2]. The evolutionary distances were computed using the Poisson correction method [3] and are in the units of the number of amino acid substitutions per site. The analysis involved 10 amino acid sequences. All ambiguous positions were removed for each sequence pair. There were a total of 2131 positions in the final dataset. Evolutionary analyses were conducted in MEGA7 [4].

1. Saitou N. and Nei M. (1987). The neighbor-joining method: A new method for reconstructing phylogenetic trees. Molecular Biology and Evolution 4:406-425.

2. Felsenstein J. (1985). Confidence limits on phylogenies: An approach using the bootstrap. Evolution 39:783-791.

3. Zuckerkandl E. and Pauling L. (1965). Evolutionary divergence and convergence in proteins. Edited in Evolving Genes and Proteins by V. Bryson and H.J. Vogel, pp. 97-166. Academic Press, New York.

4. Kumar S., Stecher G., and Tamura K. (2016). MEGA7: Molecular Evolutionary Genetics Analysis version 7.0 for bigger datasets. Molecular Biology and Evolution 33:1870-1874.
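The two computational steps named in the caption, Poisson-corrected distances with pairwise removal of gapped sites and Neighbor-Joining, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not MEGA's implementation: the function names, the gap symbols, the toy sequences and the five-taxon distance matrix are all made up for the example, and bootstrapping is left out.

```python
import math

GAP_CHARS = set("-.~")  # assumption: symbols treated as gaps or padding


def poisson_distance(seq_a, seq_b):
    """Poisson-corrected distance d = -ln(1 - p), skipping gapped sites pairwise."""
    compared = differences = 0
    for a, b in zip(seq_a, seq_b):
        if a in GAP_CHARS or b in GAP_CHARS:
            continue  # pairwise deletion: ignore sites gapped in either sequence
        compared += 1
        differences += a != b
    if compared == 0:
        raise ValueError("no comparable sites")
    p = differences / compared  # proportion of differing sites
    if p == 0:
        return 0.0
    return math.inf if p >= 1.0 else -math.log(1.0 - p)


def neighbor_joining(names, dist):
    """Unrooted NJ tree as a Newick string; dist maps frozenset({a, b}) -> distance."""
    if len(names) < 3:
        raise ValueError("need at least three taxa")
    nodes = list(names)
    d = dict(dist)

    def get(a, b):
        return d[frozenset((a, b))]

    while len(nodes) > 3:
        n = len(nodes)
        # Net divergence of each node from all others.
        r = {a: sum(get(a, b) for b in nodes if b != a) for a in nodes}
        best = None
        for i in range(n):
            for j in range(i + 1, n):
                a, b = nodes[i], nodes[j]
                q = (n - 2) * get(a, b) - r[a] - r[b]
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best
        dab = get(a, b)
        len_a = 0.5 * dab + (r[a] - r[b]) / (2 * (n - 2))
        len_b = dab - len_a
        new = f"({a}:{len_a:.3f},{b}:{len_b:.3f})"  # joined pair becomes one node
        rest = [c for c in nodes if c not in (a, b)]
        for c in rest:
            d[frozenset((new, c))] = 0.5 * (get(a, c) + get(b, c) - dab)
        nodes = rest + [new]

    # Three nodes left: resolve the central trifurcation.
    a, b, c = nodes
    dab, dac, dbc = get(a, b), get(a, c), get(b, c)
    return (f"({a}:{(dab + dac - dbc) / 2:.3f},"
            f"{b}:{(dab + dbc - dac) / 2:.3f},"
            f"{c}:{(dac + dbc - dab) / 2:.3f});")


# Toy data, not taken from the homework sequences.
print(poisson_distance("AC.E", "AD~E"))
toy = {("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
       ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
       ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3}
tree = neighbor_joining(list("abcde"),
                        {frozenset(k): v for k, v in toy.items()})
print(tree)
```

For the real data, `poisson_distance` would be applied to every pair of the ten aligned sequences to fill the matrix passed to `neighbor_joining`. Because gapped sites are skipped per pair, each distance uses all sites that the two sequences share, which is what "removed for each sequence pair" in the caption describes.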


