Bioinformatics, Experiment 3: Nucleic Acid Sequence Analysis and Use of Common Molecular Informatics Software

Updated: 2023-04-28 12:15:01

1. Sequence analysis with BioEdit and similar software

Open the FASTA-format sequence in BioEdit. The commands below were run from the Sequence-->Nucleic acid menu:

1)Sequence-->Nucleic acid-->Nucleotide composition

DNA molecule: gi|725605238|ref|XM_010330964.1| PREDICTED: Saimiri boliviensis boliviensis interferon, lambda 3 (IFNL3), mRNA

Length = 744 base pairs

Molecular Weight = 227718.00 Daltons, single stranded

Molecular Weight = 453776.00 Daltons, double stranded

G+C content = 61.56%

A+T content = 38.44%

Nucleotide   Number   Mol%
A            141      18.95
C            251      33.74
G            207      27.82
T            145      19.49
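The composition figures above can be checked by hand: G+C content is (G + C) / length, and each Mol% is that base's count divided by the length. The following is a minimal Python sketch of that arithmetic, not BioEdit's own code; the function name `nucleotide_composition` and the stand-in sequence `demo` (which only reproduces the base counts, not the real base order) are assumptions made for illustration.

```python
from collections import Counter


def nucleotide_composition(seq: str) -> dict:
    """Count A/C/G/T in a DNA string and derive Mol%, G+C% and A+T%."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return {
        "length": total,
        "counts": {b: counts[b] for b in "ACGT"},
        "mol_percent": {b: round(100 * counts[b] / total, 2) for b in "ACGT"},
        "gc_percent": round(100 * (counts["G"] + counts["C"]) / total, 2),
        "at_percent": round(100 * (counts["A"] + counts["T"]) / total, 2),
    }


# Hypothetical stand-in with the same base counts as the report above.
demo = "A" * 141 + "C" * 251 + "G" * 207 + "T" * 145
result = nucleotide_composition(demo)
print(result["gc_percent"], result["at_percent"])  # 61.56 38.44
```

With the counts from the report (A 141, C 251, G 207, T 145) this gives 458/744 = 61.56% G+C, which agrees with the BioEdit output.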

2)Sequence-->Nucleic acid-->Base composition and mass export(monoisotopic)

Field                       Value
Title                       gi|725605238|ref|XM_010330964.1| PREDICTED: Saimiri boliviensis boliviensis interferon, lambda 3 (IFNL3), mRNA
LENGTH                      744
A                           141
G                           207
C                           251
T                           145
Gap                         0
Other                       0
Mono_exact_mass             228830.3528025
BaseComp                    A141 G207 C251 T145
OppositeStrand_exact_mass   230626.6695859
OppositeStrand_BaseComp     A145 G251 C207 T141

3)Sequence-->Nucleic acid-->Base composition and mass export with average masses

Title LENGTH A_COUNT G_COUNT C_COUNT T_COUNT GAP (not used in db) OTHER_COUNT MONO_EXACT_MASS BASE_COMP

OPPOSITE_STRAND_MASS OPPOSITE_STRAND_BASE_COMP SEQUENCE (not used in db) OppositeStrand_Ave_mass

OppositeStrand_BaseComp

gi|725605238|ref|XM_010330964.1| PREDICTED: Saimiri boliviensis boliviensis interferon, lambda 3 (IFNL3), mRNA 744 141 207 251 145 0 0 228939.3442020 A141 G207 C251 T145 230736.4659940 A145 G251 C207 T141

4)Sequence-->Nucleic acid-->complement CACTGTCGGAGTCTCACCCCAGGAAGACGACTGCTTCTGGTCTCTAGTCTTTACTTTGCTCCGTACTCGTCCCCGA CGTACCGTCACGACCACGACTACCGGTGTCACGACTGACAGTGACCTCGTCAAGGACAGTGGTCCGGGGGGTCCCG GGAGGGCCTACGTTCCCCGACGGTGTATCGGGTCAAGTTCAGAGACAGGGGTGTCCTCGACGTTCGGAAATTCTCT CGGTTTCTACGGAATCTTCTCAGCGAAGACGACTTCCTGACGTCCACGGCGAGGGCGGAGAAGGGGTCCTGGACCC TGGACTCCGTCGACGTCCACTCCCTCGCGGGGCACCGGAACCTCCGACTCGACCGGGACTGCGACCTCCAAGACCT CCGGTGGCGACTGTTACTGTACCGGGACCCACTGCAGGACCTGGCCGGGGAAGTGTGGGACGTGGTGCAGGAGAGG GTCGAGGCCCGGACACAGGTCGGAGTCGGGTGCCGTCCCGGGTCCGGGACCCCAGCGGAGGTGGTGACCGACGTGG CCGAGGTCCTCCGGGGTTTTTTCCTCAGGAGACCGACGGACCTCCGGAGACAGTGGAAGTTGGAGAAGGCGGAGGA GTGCGCCCTGGACTTTACGCAGCGGTCGCCCCTGGACACACGGACTGGAAGGGTGGAGGGACGGTGGGTAGTACGT TAGACTCTAAAATAAATATGTAGTTGGTGAACAGAATTAAATAACGGTGGGTTAGCGATA

5)Sequence-->Nucleic acid-->reverse complement TATCGCTAACCCACCGTTATTTAATTCTGTTCACCAACTACATATTTATTTTAGAGTCTAACGTACTACCCACCGT CCCTCCACCCTTCCAGTCCGTGTGTCCAGGGGCGACCGCTGCGTAAAGTCCAGGGCGCACTCCTCCGCCTTCTCCA ACTTCCACTGTCTCCGGAGGTCCGTCGGTCTCCTGAGGAAAAAACCCCGGAGGACCTCGGCCACGTCGGTCACCAC CTCCGCTGGGGTCCCGGACCCGGGACGGCACCCGACTCCGACCTGTGTCCGGGCCTCGACCCTCTCCTGCACCACG TCCCACACTTCCCCGGCCAGGTCCTGCAGTGGGTCCCGGTACAGTAACAGTCGCCACCGGAGGTCTTGGAGGTCGC AGTCCCGGTCGAGTCGGAGGTTCCGGTGCCCCGCGAGGGAGTGGACGTCGACGGAGTCCAGGGTCCAGGACCCCTT CTCCGCCCTCGCCGTGGACGTCAGGAAGTCGTCTTCGCTGAGAAGATTCCGTAGAAACCGAGAGAATTTCCGAACG TCGAGGACACCCCTGTCTCTGAACTTGACCCGATACACCGTCGGGGAACGTAGGCCCTCCCGGGACCCCCCGGACC

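ACTGTCCTTGACGAGGTCACTGTCAGTCGTGACACCGGTAGTCGTGGTCGTGACGGTACGTCGGGGACGAGTACGG AGCAAAGTAAAGACTAGAGACCAGAAGCAGTCGTCTTCCTGGGGTGAGACTCCGACAGTG

Note: this output begins TATCGCTAACC and ends CTCCGACAGTG, so it is the input sequence written backwards rather than its reverse complement. This appears to be the result of running the command on the entry that had already been complemented in step 4. The reverse complement of the original sequence would begin ATAGCGATTGG and end GAGGCTGTCAC.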
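Complement and reverse complement are simple enough to verify independently of BioEdit. The sketch below is one way to do it in Python; the helper names `complement` and `reverse_complement` are illustrative, and only the first 15 and last 12 bases of the sequence are used so that the example stays short.

```python
# Translation table for the four DNA bases, upper and lower case.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def complement(seq: str) -> str:
    """Base-by-base complement, same left-to-right order as the input."""
    return seq.translate(COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """Complement read in the opposite direction (the other strand, 5'->3')."""
    return complement(seq)[::-1]


head = "GTGACAGCCTCAGAG"  # first 15 bases of XM_010330964.1 as listed above
tail = "CCCAATCGCTAT"     # last 12 bases

print(complement(head))          # CACTGTCGGAGTCTC
print(reverse_complement(tail))  # ATAGCGATTGGG

# Running both operations in sequence only reverses the input.
print(reverse_complement(complement(tail)))  # TATCGCTAACCC
```

The first printed line matches the start of the complement output in step 4.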

6)Sequence-->Nucleic acid-->DNA-->RNA GUGACAGCCUCAGAGUGGGGUCCUUCUGCUGACGAAGACCAGAGAUCAGAAAUGAAACGAGGCAUGAGCAGGGGCU GCAUGGCAGUGCUGGUGCUGAUGGCCACAGUGCUGACUGUCACUGGAGCAGUUCCUGUCACCAGGCCCCCCAGGGC CCUCCCGGAUGCAAGGGGCUGCCACAUAGCCCAGUUCAAGUCUCUGUCCCCACAGGAGCUGCAAGCCUUUAAGAGA GCCAAAGAUGCCUUAGAAGAGUCGCUUCUGCUGAAGGACUGCAGGUGCCGCUCCCGCCUCUUCCCCAGGACCUGGG ACCUGAGGCAGCUGCAGGUGAGGGAGCGCCCCGUGGCCUUGGAGGCUGAGCUGGCCCUGACGCUGGAGGUUCUGGA GGCCACCGCUGACAAUGACAUGGCCCUGGGUGACGUCCUGGACCGGCCCCUUCACACCCUGCACCACGUCCUCUCC CAGCUCCGGGCCUGUGUCCAGCCUCAGCCCACGGCAGGGCCCAGGCCCUGGGGUCGCCUCCACCACUGGCUGCACC GGCUCCAGGAGGCCCCAAAAAAGGAGUCCUCUGGCUGCCUGGAGGCCUCUGUCACCUUCAACCUCUUCCGCCUCCU CACGCGGGACCUGAAAUGCGUCGCCAGCGGGGACCUGUGUGCCUGACCUUCCCACCUCCCUGCCACCCAUCAUGCA AUCUGAGAUUUUAUUUAUACAUCAACCACUUGUCUUAAUUUAUUGCCACCCAAUCGCUAU

7)Sequence-->Nucleic acid-->RNA-->DNA GTGACAGCCTCAGAGTGGGGTCCTTCTGCTGACGAAGACCAGAGATCAGAAATGAAACGAGGCATGAGCAGGGGCT GCATGGCAGTGCTGGTGCTGATGGCCACAGTGCTGACTGTCACTGGAGCAGTTCCTGTCACCAGGCCCCCCAGGGC CCTCCCGGATGCAAGGGGCTGCCACATAGCCCAGTTCAAGTCTCTGTCCCCACAGGAGCTGCAAGCCTTTAAGAGA GCCAAAGATGCCTTAGAAGAGTCGCTTCTGCTGAAGGACTGCAGGTGCCGCTCCCGCCTCTTCCCCAGGACCTGGG ACCTGAGGCAGCTGCAGGTGAGGGAGCGCCCCGTGGCCTTGGAGGCTGAGCTGGCCCTGACGCTGGAGGTTCTGGA GGCCACCGCTGACAATGACATGGCCCTGGGTGACGTCCTGGACCGGCCCCTTCACACCCTGCACCACGTCCTCTCC CAGCTCCGGGCCTGTGTCCAGCCTCAGCCCACGGCAGGGCCCAGGCCCTGGGGTCGCCTCCACCACTGGCTGCACC GGCTCCAGGAGGCCCCAAAAAAGGAGTCCTCTGGCTGCCTGGAGGCCTCTGTCACCTTCAACCTCTTCCGCCTCCT CACGCGGGACCTGAAATGCGTCGCCAGCGGGGACCTGTGTGCCTGACCTTCCCACCTCCCTGCCACCCATCATGCA ATCTGAGATTTTATTTATACATCAACCACTTGTCTTAATTTATTGCCACCCAATCGCTAT
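The DNA-->RNA and RNA-->DNA conversions in steps 6 and 7 only swap T and U; the base order is unchanged. A minimal Python sketch follows, with the illustrative function names `dna_to_rna` and `rna_to_dna`:

```python
def dna_to_rna(seq: str) -> str:
    """Write a DNA sequence in the RNA alphabet (T becomes U)."""
    return seq.upper().replace("T", "U")


def rna_to_dna(seq: str) -> str:
    """Write an RNA sequence in the DNA alphabet (U becomes T)."""
    return seq.upper().replace("U", "T")


start = "GTGACAGCCTCAGAGTGGGGTCCTTCTG"  # first 28 bases of the sequence
rna = dna_to_rna(start)
print(rna)              # GUGACAGCCUCAGAGUGGGGUCCUUCUG
print(rna_to_dna(rna))  # GTGACAGCCTCAGAGTGGGGTCCTTCTG
```

Applying the two conversions one after the other returns the starting sequence, which is why the output of step 7 is identical to the original mRNA entry.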

8)Sequence-->Nucleic acid-->Translate

Frame 1:

>gi|725605238|ref|XM_010330964.1| PREDICTED: Saimiri boliviensis boliviensis interferon, lambda 3 (IFNL3), mRNA

1 GTG ACA GCC TCA GAG TGG GGT CCT TCT GCT GAC GAA GAC CAG AGA 45

1 Val Thr Ala Ser Glu Trp Gly Pro Ser Ala Asp Glu Asp Gln Arg 15

46 TCA GAA ATG AAA CGA GGC ATG AGC AGG GGC TGC ATG GCA GTG CTG 90

16 Ser Glu Met Lys Arg Gly Met Ser Arg Gly Cys Met Ala Val Leu 30

91 GTG CTG ATG GCC ACA GTG CTG ACT GTC ACT GGA GCA GTT CCT GTC 135

31 Val Leu Met Ala Thr Val Leu Thr Val Thr Gly Ala Val Pro Val 45

136 ACC AGG CCC CCC AGG GCC CTC CCG GAT GCA AGG GGC TGC CAC ATA 180

46 Thr Arg Pro Pro Arg Ala Leu Pro Asp Ala Arg Gly Cys His Ile 60

181 GCC CAG TTC AAG TCT CTG TCC CCA CAG GAG CTG CAA GCC TTT AAG 225

61 Ala Gln Phe Lys Ser Leu Ser Pro Gln Glu Leu Gln Ala Phe Lys 75

226 AGA GCC AAA GAT GCC TTA GAA GAG TCG CTT CTG CTG AAG GAC TGC 270

76 Arg Ala Lys Asp Ala Leu Glu Glu Ser Leu Leu Leu Lys Asp Cys 90

271 AGG TGC CGC TCC CGC CTC TTC CCC AGG ACC TGG GAC CTG AGG CAG 315

91 Arg Cys Arg Ser Arg Leu Phe Pro Arg Thr Trp Asp Leu Arg Gln 105

316 CTG CAG GTG AGG GAG CGC CCC GTG GCC TTG GAG GCT GAG CTG GCC 360

106 Leu Gln Val Arg Glu Arg Pro Val Ala Leu Glu Ala Glu Leu Ala 120

361 CTG ACG CTG GAG GTT CTG GAG GCC ACC GCT GAC AAT GAC ATG GCC 405

121 Leu Thr Leu Glu Val Leu Glu Ala Thr Ala Asp Asn Asp Met Ala 135

406 CTG GGT GAC GTC CTG GAC CGG CCC CTT CAC ACC CTG CAC CAC GTC 450

136 Leu Gly Asp Val Leu Asp Arg Pro Leu His Thr Leu His His Val 150

451 CTC TCC CAG CTC CGG GCC TGT GTC CAG CCT CAG CCC ACG GCA GGG 495

151 Leu Ser Gln Leu Arg Ala Cys Val Gln Pro Gln Pro Thr Ala Gly 165

496 CCC AGG CCC TGG GGT CGC CTC CAC CAC TGG CTG CAC CGG CTC CAG 540

166 Pro Arg Pro Trp Gly Arg Leu His His Trp Leu His Arg Leu Gln 180

541 GAG GCC CCA AAA AAG GAG TCC TCT GGC TGC CTG GAG GCC TCT GTC 585

181 Glu Ala Pro Lys Lys Glu Ser Ser Gly Cys Leu Glu Ala Ser Val 195

586 ACC TTC AAC CTC TTC CGC CTC CTC ACG CGG GAC CTG AAA TGC GTC 630

196 Thr Phe Asn Leu Phe Arg Leu Leu Thr Arg Asp Leu Lys Cys Val 210

631 GCC AGC GGG GAC CTG TGT GCC TGA CCT TCC CAC CTC CCT GCC ACC 675

211 Ala Ser Gly Asp Leu Cys Ala End Pro Ser His Leu Pro Ala Thr 225

676 CAT CAT GCA ATC TGA GAT TTT ATT TAT ACA TCA ACC ACT TGT CTT 720

226 His His Ala Ile End Asp Phe Ile Tyr Thr Ser Thr Thr Cys Leu 240

721 AAT TTA TTG CCA CCC AAT CGC TAT 744

241 Asn Leu Leu Pro Pro Asn Arg Tyr
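The translation above reads the sequence three bases at a time, starting at base 1 for Frame 1, base 2 for Frame 2 and base 3 for Frame 3. The sketch below reproduces this with the standard genetic code in one-letter notation (`*` for End). The names `CODON_TABLE` and `translate` are illustrative, and only the first 45 bases are translated to keep the example short.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq: str, frame: int = 1) -> str:
    """Translate a forward frame (1, 2 or 3); incomplete trailing codons are dropped."""
    seq = seq.upper().replace("U", "T")
    start = frame - 1
    codons = (seq[i:i + 3] for i in range(start, len(seq) - 2, 3))
    # Codons with unknown characters are written as X.
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


first45 = "GTGACAGCCTCAGAGTGGGGTCCTTCTGCTGACGAAGACCAGAGA"
for frame in (1, 2, 3):
    print(frame, translate(first45, frame))
# 1 VTASEWGPSADEDQR
# 2 *QPQSGVLLLTKTR
# 3 DSLRVGSFC*RRPE
```

The Frame 1 result corresponds to Val Thr Ala Ser Glu Trp Gly Pro Ser Ala Asp Glu Asp Gln Arg in the first row of the listing above.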

Each codon is read as left nucleotide, top nucleotide, right nucleotide

Each entry is organized as follows:

The number of occurrences of the codon in the sequence

Preference of that codon in organism represented by the codon table

(as a fraction of all codons coding for the same amino acid)

Three-letter code for the amino acid coded for according to the codon table

  |A     C     G     T     |
-----------------------------
A |4     3     2     1     |A
  |0.76  0.12  0.04  0.07  |
  |Lys   Thr   Arg   Ile   |
-----------------------------
A |1     7     2     1     |C
  |0.61  0.43  0.27  0.46  |
  |Asn   Thr   Ser   Ile   |
-----------------------------
A |4     3     9     5     |G
  |0.24  0.23  0.03  1     |
  |Lys   Thr   Arg   Met   |
-----------------------------
A |3     3     0     1     |T
  |0.39  0.21  0.13  0.47  |
  |Asn   Thr   Ser   Ile   |
-----------------------------
C |1     3     1     0     |A
  |0.31  0.2   0.05  0.03  |
  |Gln   Pro   Arg   Leu   |
-----------------------------
C |8     9     6     10    |C
  |0.48  0.1   0.37  0.1   |
  |His   Pro   Arg   Leu   |
-----------------------------
C |9     1     4     20    |G
  |0.69  0.55  0.08  0.55  |
  |Gln   Pro   Arg   Leu   |
-----------------------------
C |2     5     0     3     |T
  |0.52  0.16  0.42  0.1   |
  |His   Pro   Arg   Leu   |
-----------------------------
G |3     5     1     0     |A
  |0.7   0.22  0.09  0.17  |
  |Glu   Ala   Gly   Val   |
-----------------------------
G |10    17    4     7     |C
  |0.41  0.25  0.4   0.2   |
  |Asp   Ala   Gly   Val   |
-----------------------------
G |11    0     2     6     |G
  |0.3   0.34  0.13  0.34  |
  |Glu   Ala   Gly   Val   |
-----------------------------
G |3     3     3     2     |T
  |0.59  0.19  0.38  0.29  |
  |Asp   Ala   Gly   Val   |
-----------------------------
T |0     3     2     2     |A
  |0.62  0.12  0.3   0.11  |
  |End   Ser   End   Leu   |
-----------------------------
T |0     5     6     4     |C
  |0.47  0.17  0.57  0.49  |
  |Tyr   Ser   Cys   Phe   |
-----------------------------
T |0     1     4     2     |G
  |0.09  0.13  1     0.11  |
  |End   Ser   Trp   Leu   |
-----------------------------
T |2     4     3     2     |T
  |0.53  0.19  0.43  0.51  |
  |Tyr   Ser   Cys   Phe   |
-----------------------------
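The first number in each cell of the table is a plain codon count for the chosen reading frame, addressed by left nucleotide (first base), top nucleotide (second base) and right nucleotide (third base). The Python sketch below shows how such counts could be produced and looked up. It covers only the counts: the preference fractions come from a reference codon table that is not reproduced here. The names `codon_counts`, `table_cell` and `table_row` are illustrative.

```python
from collections import Counter


def codon_counts(seq: str, frame: int = 1) -> Counter:
    """Count complete codons in a forward frame (1, 2 or 3)."""
    seq = seq.upper()
    start = frame - 1
    return Counter(seq[i:i + 3] for i in range(start, len(seq) - 2, 3))


def table_cell(counts: Counter, left: str, top: str, right: str) -> int:
    """Look up one cell the way the table is read: left, top, right nucleotide."""
    return counts[left + top + right]


def table_row(counts: Counter, left: str, right: str) -> list:
    """One table row: counts for top nucleotide A, C, G, T in that order."""
    return [table_cell(counts, left, top, right) for top in "ACGT"]


first45 = "GTGACAGCCTCAGAGTGGGGTCCTTCTGCTGACGAAGACCAGAGA"
counts = codon_counts(first45, frame=1)
print(table_cell(counts, "G", "A", "C"))  # 2  (GAC occurs twice in these 15 codons)
print(table_row(counts, "G", "C"))        # [2, 1, 0, 0] for GAC, GCC, GGC, GTC
```

Run over the full 744-base sequence instead of the 45-base excerpt, the same lookup would correspond to the count line of each row in the table.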

Frame 2:

>gi|725605238|ref|XM_010330964.1| PREDICTED: Saimiri boliviensis boliviensis interferon, lambda 3 (IFNL3), mRNA

2 TGA CAG CCT CAG AGT GGG GTC CTT CTG CTG ACG AAG ACC AGA GAT 46

1 End Gln Pro Gln Ser Gly Val Leu Leu Leu Thr Lys Thr Arg Asp 15

47 CAG AAA TGA AAC GAG GCA TGA GCA GGG GCT GCA TGG CAG TGC TGG 91

16 Gln Lys End Asn Glu Ala End Ala Gly Ala Ala Trp Gln Cys Trp 30

92 TGC TGA TGG CCA CAG TGC TGA CTG TCA CTG GAG CAG TTC CTG TCA 136

31 Cys End Trp Pro Gln Cys End Leu Ser Leu Glu Gln Phe Leu Ser 45

137 CCA GGC CCC CCA GGG CCC TCC CGG ATG CAA GGG GCT GCC ACA TAG 181

46 Pro Gly Pro Pro Gly Pro Ser Arg Met Gln Gly Ala Ala Thr End 60

182 CCC AGT TCA AGT CTC TGT CCC CAC AGG AGC TGC AAG CCT TTA AGA 226

61 Pro Ser Ser Ser Leu Cys Pro His Arg Ser Cys Lys Pro Leu Arg 75

227 GAG CCA AAG ATG CCT TAG AAG AGT CGC TTC TGC TGA AGG ACT GCA 271

76 Glu Pro Lys Met Pro End Lys Ser Arg Phe Cys End Arg Thr Ala 90

272 GGT GCC GCT CCC GCC TCT TCC CCA GGA CCT GGG ACC TGA GGC AGC 316

91 Gly Ala Ala Pro Ala Ser Ser Pro Gly Pro Gly Thr End Gly Ser 105

317 TGC AGG TGA GGG AGC GCC CCG TGG CCT TGG AGG CTG AGC TGG CCC 361

106 Cys Arg End Gly Ser Ala Pro Trp Pro Trp Arg Leu Ser Trp Pro 120

362 TGA CGC TGG AGG TTC TGG AGG CCA CCG CTG ACA ATG ACA TGG CCC 406

121 End Arg Trp Arg Phe Trp Arg Pro Pro Leu Thr Met Thr Trp Pro 135

407 TGG GTG ACG TCC TGG ACC GGC CCC TTC ACA CCC TGC ACC ACG TCC 451

136 Trp Val Thr Ser Trp Thr Gly Pro Phe Thr Pro Cys Thr Thr Ser 150

452 TCT CCC AGC TCC GGG CCT GTG TCC AGC CTC AGC CCA CGG CAG GGC 496

151 Ser Pro Ser Ser Gly Pro Val Ser Ser Leu Ser Pro Arg Gln Gly 165

497 CCA GGC CCT GGG GTC GCC TCC ACC ACT GGC TGC ACC GGC TCC AGG 541

166 Pro Gly Pro Gly Val Ala Ser Thr Thr Gly Cys Thr Gly Ser Arg 180

542 AGG CCC CAA AAA AGG AGT CCT CTG GCT GCC TGG AGG CCT CTG TCA 586

181 Arg Pro Gln Lys Arg Ser Pro Leu Ala Ala Trp Arg Pro Leu Ser 195

587 CCT TCA ACC TCT TCC GCC TCC TCA CGC GGG ACC TGA AAT GCG TCG 631

196 Pro Ser Thr Ser Ser Ala Ser Ser Arg Gly Thr End Asn Ala Ser 210

632 CCA GCG GGG ACC TGT GTG CCT GAC CTT CCC ACC TCC CTG CCA CCC 676

211 Pro Ala Gly Thr Cys Val Pro Asp Leu Pro Thr Ser Leu Pro Pro 225

677 ATC ATG CAA TCT GAG ATT TTA TTT ATA CAT CAA CCA CTT GTC TTA 721

226 Ile Met Gln Ser Glu Ile Leu Phe Ile His Gln Pro Leu Val Leu 240

722 ATT TAT TGC CAC CCA ATC GCT 742

241 Ile Tyr Cys His Pro Ile Ala 247

Each codon is read as left nucleotide, top nucleotide, right nucleotide

Each entry is organized as follows:

The number of occurrences of the codon in the sequence

Preference of that codon in organism represented by the codon table (as a fraction of all codons coding for the same amino acid)

Three-letter code for the amino acid coded for according to the codon table

|A C G T |

-----------------------------

A |2 4 2 1 |A

|0.76 0.12 0.04 0.07 |

|Lys Thr Arg Ile |

-----------------------------

A |1 10 7 2 |C

|0.61 0.43 0.27 0.46 |

|Asn Thr Ser Ile |

-----------------------------

A |4 3 10 4 |G

|0.24 0.23 0.03 1 |

|Lys Thr Arg Met |

-----------------------------

A |1 2 5 2 |T

|0.39 0.21 0.13 0.47 |

|Asn Thr Ser Ile |

-----------------------------

C |4 12 0 0 |A

|0.31 0.2 0.05 0.03 |

|Gln Pro Arg Leu |

-----------------------------

C |2 13 3 2 |C

|0.48 0.1 0.37 0.1 |

|His Pro Arg Leu |

-----------------------------

C |7 2 2 10 |G

|0.69 0.55 0.08 0.55 |

|Gln Pro Arg Leu |

-----------------------------

C |1 11 0 3 |T

|0.52 0.16 0.42 0.1 |

|His Pro Arg Leu |

-----------------------------

G |0 4 1 0 |A

|0.7 0.22 0.09 0.17 |

|Glu Ala Gly Val |

-----------------------------

G |1 7 7 3 |C

|0.41 0.25 0.4 0.2 |

|Asp Ala Gly Val |

-----------------------------

G |4 2 10 3 |G

|0.3 0.34 0.13 0.34 |

|Glu Ala Gly Val |

-----------------------------

G |1 5 1 0 |T

|0.59 0.19 0.38 0.29 |

|Asp Ala Gly Val |

-----------------------------

T |0 6 10 3 |A

|0.62 0.12 0.3 0.11 |

|End Ser End Leu |

-----------------------------

T |0 11 9 4 |C

|0.47 0.17 0.57 0.49 |

|Tyr Ser Cys Phe |

-----------------------------

T |2 1 12 0 |G

|0.09 0.13 1 0.11 |

|End Ser Trp Leu |

-----------------------------

T |1 4 2 1 |T

|0.53 0.19 0.43 0.51 |

|Tyr Ser Cys Phe |

-----------------------------

Frame 3:

>gi|725605238|ref|XM_010330964.1| PREDICTED: Saimiri boliviensis boliviensis interferon, lambda 3 (IFNL3), mRNA

3 GAC AGC CTC AGA GTG GGG TCC TTC TGC TGA CGA AGA CCA GAG ATC 47

0 Asp Ser Leu Arg Val Gly Ser Phe Cys End Arg Arg Pro Glu Ile 14

48 AGA AAT GAA ACG AGG CAT GAG CAG GGG CTG CAT GGC AGT GCT GGT 92

15 Arg Asn Glu Thr Arg His Glu Gln Gly Leu His Gly Ser Ala Gly 29

93 GCT GAT GGC CAC AGT GCT GAC TGT CAC TGG AGC AGT TCC TGT CAC 137

30 Ala Asp Gly His Ser Ala Asp Cys His Trp Ser Ser Ser Cys His 44

138 CAG GCC CCC CAG GGC CCT CCC GGA TGC AAG GGG CTG CCA CAT AGC 182

45 Gln Ala Pro Gln Gly Pro Pro Gly Cys Lys Gly Leu Pro His Ser 59

183 CCA GTT CAA GTC TCT GTC CCC ACA GGA GCT GCA AGC CTT TAA GAG 227

60 Pro Val Gln Val Ser Val Pro Thr Gly Ala Ala Ser Leu End Glu 74

228 AGC CAA AGA TGC CTT AGA AGA GTC GCT TCT GCT GAA GGA CTG CAG 272

75 Ser Gln Arg Cys Leu Arg Arg Val Ala Ser Ala Glu Gly Leu Gln 89

273 GTG CCG CTC CCG CCT CTT CCC CAG GAC CTG GGA CCT GAG GCA GCT 317

90 Val Pro Leu Pro Pro Leu Pro Gln Asp Leu Gly Pro Glu Ala Ala 104

318 GCA GGT GAG GGA GCG CCC CGT GGC CTT GGA GGC TGA GCT GGC CCT 362

105 Ala Gly Glu Gly Ala Pro Arg Gly Leu Gly Gly End Ala Gly Pro 119

363 GAC GCT GGA GGT TCT GGA GGC CAC CGC TGA CAA TGA CAT GGC CCT 407

120 Asp Ala Gly Gly Ser Gly Gly His Arg End Gln End His Gly Pro 134

408 GGG TGA CGT CCT GGA CCG GCC CCT TCA CAC CCT GCA CCA CGT CCT 452

135 Gly End Arg Pro Gly Pro Ala Pro Ser His Pro Ala Pro Arg Pro 149

453 CTC CCA GCT CCG GGC CTG TGT CCA GCC TCA GCC CAC GGC AGG GCC 497

150 Leu Pro Ala Pro Gly Leu Cys Pro Ala Ser Ala His Gly Arg Ala 164

498 CAG GCC CTG GGG TCG CCT CCA CCA CTG GCT GCA CCG GCT CCA GGA 542

165 Gln Ala Leu Gly Ser Pro Pro Pro Leu Ala Ala Pro Ala Pro Gly 179

543 GGC CCC AAA AAA GGA GTC CTC TGG CTG CCT GGA GGC CTC TGT CAC 587

180 Gly Pro Lys Lys Gly Val Leu Trp Leu Pro Gly Gly Leu Cys His 194

588 CTT CAA CCT CTT CCG CCT CCT CAC GCG GGA CCT GAA ATG CGT CGC 632

195 Leu Gln Pro Leu Pro Pro Pro His Ala Gly Pro Glu Met Arg Arg 209

633 CAG CGG GGA CCT GTG TGC CTG ACC TTC CCA CCT CCC TGC CAC CCA 677

210 Gln Arg Gly Pro Val Cys Leu Thr Phe Pro Pro Pro Cys His Pro 224

678 TCA TGC AAT CTG AGA TTT TAT TTA TAC ATC AAC CAC TTG TCT TAA 722

225 Ser Cys Asn Leu Arg Phe Tyr Leu Tyr Ile Asn His Leu Ser End 239

723 TTT ATT GCC ACC CAA TCG CTA 743

240 Phe Ile Ala Thr Gln Ser Leu 246

Each codon is read as left nucleotide, top nucleotide, right nucleotide

Each entry is organized as follows:

The number of occurrences of the codon in the sequence

Preference of that codon in organism represented by the codon table

(as a fraction of all codons coding for the same amino acid)

Three-letter code for the amino acid coded for according to the codon table

|A C G T |

-----------------------------

A |2 1 7 0 |A

|0.76 0.12 0.04 0.07 |

|Lys Thr Arg Ile |

-----------------------------

A |1 2 5 2 |C

|0.61 0.43 0.27 0.46 |

|Asn Thr Ser Ile |

-----------------------------

A |1 1 2 1 |G

|0.24 0.23 0.03 1 |

|Lys Thr Arg Met |

-----------------------------

A |2 0 3 1 |T

|0.39 0.21 0.13 0.47 |

|Asn Thr Ser Ile |

-----------------------------

C |5 11 1 1 |A

|0.31 0.2 0.05 0.03 |

|Gln Pro Arg Leu |

-----------------------------

C |10 7 2 5 |C

|0.48 0.1 0.37 0.1 |

|His Pro Arg Leu |

-----------------------------

C |7 6 1 10 |G

|0.69 0.55 0.08 0.55 |

|Gln Pro Arg Leu |

-----------------------------

C |4 17 4 6 |T

|0.52 0.16 0.42 0.1 |

|His Pro Arg Leu |

-----------------------------

G |3 5 14 0 |A

|0.7 0.22 0.09 0.17 |

|Glu Ala Gly Val |

-----------------------------

G |4 7 12 4 |C

|0.41 0.25 0.4 0.2 |

|Asp Ala Gly Val |

-----------------------------

G |5 2 5 3 |G

|0.3 0.34 0.13 0.34 |

|Glu Ala Gly Val |

-----------------------------

G |1 12 3 1 |T

|0.59 0.19 0.38 0.29 |

|Asp Ala Gly Val |

-----------------------------

T |2 3 5 1 |A

|0.62 0.12 0.3 0.11 |

|End Ser End Leu |

-----------------------------

T |1 2 6 2 |C

|0.47 0.17 0.57 0.49 |

|Tyr Ser Cys Phe |

-----------------------------

T |0 2 2 1 |G

|0.09 0.13 1 0.11 |

|End Ser Trp Leu |

-----------------------------

T |1 4 4 2 |T

|0.53 0.19 0.43 0.51 |

|Tyr Ser Cys Phe |

-----------------------------

Section:

>gi|725605238|ref|XM_010330964.1| PREDICTED: Saimiri boliviensis boliviensis interferon, lambda 3 (IFNL3), mRNA

1 GTG ACA GCC TCA GAG TGG GGT CCT TCT GCT GAC GAA GAC CAG AGA 45

1 Val Thr Ala Ser Glu Trp Gly Pro Ser Ala Asp Glu Asp Gln Arg 15

46 TCA GAA ATG AAA CGA GGC ATG AGC AGG GGC TGC ATG GCA GTG CTG 90

16 Ser Glu Met Lys Arg Gly Met Ser Arg Gly Cys Met Ala Val Leu 30

91 GTG CTG ATG GCC ACA GTG CTG ACT GTC ACT GGA GCA GTT CCT GTC 135

31 Val Leu Met Ala Thr Val Leu Thr Val Thr Gly Ala Val Pro Val 45

136 ACC AGG CCC CCC AGG GCC CTC CCG GAT GCA AGG GGC TGC CAC ATA 180

46 Thr Arg Pro Pro Arg Ala Leu Pro Asp Ala Arg Gly Cys His Ile 60

181 GCC CAG TTC AAG TCT CTG TCC CCA CAG GAG CTG CAA GCC TTT AAG 225

61 Ala Gln Phe Lys Ser Leu Ser Pro Gln Glu Leu Gln Ala Phe Lys 75

226 AGA GCC AAA GAT GCC TTA GAA GAG TCG CTT CTG CTG AAG GAC TGC 270

76 Arg Ala Lys Asp Ala Leu Glu Glu Ser Leu Leu Leu Lys Asp Cys 90

271 AGG TGC CGC TCC CGC CTC TTC CCC AGG ACC TGG GAC CTG AGG CAG 315

91 Arg Cys Arg Ser Arg Leu Phe Pro Arg Thr Trp Asp Leu Arg Gln 105

316 CTG CAG GTG AGG GAG CGC CCC GTG GCC TTG GAG GCT GAG CTG GCC 360

106 Leu Gln Val Arg Glu Arg Pro Val Ala Leu Glu Ala Glu Leu Ala 120

361 CTG ACG CTG GAG GTT CTG GAG GCC ACC GCT GAC AAT GAC ATG GCC 405

121 Leu Thr Leu Glu Val Leu Glu Ala Thr Ala Asp Asn Asp Met Ala 135

406 CTG GGT GAC GTC CTG GAC CGG CCC CTT CAC ACC CTG CAC CAC GTC 450

136 Leu Gly Asp Val Leu Asp Arg Pro Leu His Thr Leu His His Val 150

451 CTC TCC CAG CTC CGG GCC TGT GTC CAG CCT CAG CCC ACG GCA GGG 495

151 Leu Ser Gln Leu Arg Ala Cys Val Gln Pro Gln Pro Thr Ala Gly 165

496 CCC AGG CCC TGG GGT CGC CTC CAC CAC TGG CTG CAC CGG CTC CAG 540

166 Pro Arg Pro Trp Gly Arg Leu His His Trp Leu His Arg Leu Gln 180

541 GAG GCC CCA AAA AAG GAG TCC TCT GGC TGC CTG GAG GCC TCT GTC 585

181 Glu Ala Pro Lys Lys Glu Ser Ser Gly Cys Leu Glu Ala Ser Val 195

586 ACC TTC AAC CTC TTC CGC CTC CTC ACG CGG GAC CTG AAA TGC GTC 630

196 Thr Phe Asn Leu Phe Arg Leu Leu Thr Arg Asp Leu Lys Cys Val 210

631 GCC AGC GGG GAC CTG TGT GCC TGA CCT TCC CAC CTC CCT GCC ACC 675

211 Ala Ser Gly Asp Leu Cys Ala End Pro Ser His Leu Pro Ala Thr 225

676 CAT CAT GCA ATC TGA GAT TTT ATT TAT ACA TCA ACC ACT TGT CTT 720

226 His His Ala Ile End Asp Phe Ile Tyr Thr Ser Thr Thr Cys Leu 240

721 AAT TTA TTG CCA CCC AAT CGC TAT 744

241 Asn Leu Leu Pro Pro Asn Arg Tyr

Each codon is read as left nucleotide, top nucleotide, right nucleotide

Each entry is organized as follows:

The number of occurrences of the codon in the sequence

Preference of that codon in organism represented by the codon table

(as a fraction of all codons coding for the same amino acid)

Three-letter code for the amino acid coded for according to the codon table

|A C G T |

-----------------------------

A |4 3 2 1 |A

|0.76 0.12 0.04 0.07 |

|Lys Thr Arg Ile |

-----------------------------

A |1 7 2 1 |C

|0.61 0.43 0.27 0.46 |

|Asn Thr Ser Ile |

-----------------------------

A |4 3 9 5 |G

|0.24 0.23 0.03 1 |

|Lys Thr Arg Met |

-----------------------------

A |3 3 0 1 |T

|0.39 0.21 0.13 0.47 |

|Asn Thr Ser Ile |

-----------------------------

C |1 3 1 0 |A

|0.31 0.2 0.05 0.03 |

|Gln Pro Arg Leu |

-----------------------------

C |8 9 6 10 |C

|0.48 0.1 0.37 0.1 |

|His Pro Arg Leu |

-----------------------------

C |9 1 4 20 |G

|0.69 0.55 0.08 0.55 |

|Gln Pro Arg Leu |

-----------------------------

C |2 5 0 3 |T

|0.52 0.16 0.42 0.1 |

|His Pro Arg Leu |

-----------------------------

G |3 5 1 0 |A

|0.7 0.22 0.09 0.17 |

|Glu Ala Gly Val |

-----------------------------

G |10 17 4 7 |C

|0.41 0.25 0.4 0.2 |

|Asp Ala Gly Val |

-----------------------------

G |11 0 2 6 |G

|0.3 0.34 0.13 0.34 |

|Glu Ala Gly Val |

-----------------------------

G |3 3 3 2 |T

|0.59 0.19 0.38 0.29 |

|Asp Ala Gly Val |

-----------------------------

T |0 3 2 2 |A

|0.62 0.12 0.3 0.11 |

|End Ser End Leu |

-----------------------------

T |0 5 6 4 |C

|0.47 0.17 0.57 0.49 |

|Tyr Ser Cys Phe |

-----------------------------

T |0 1 4 2 |G

|0.09 0.13 1 0.11 |

|End Ser Trp Leu |

-----------------------------

T |2 4 3 2 |T

|0.53 0.19 0.43 0.51 |

|Tyr Ser Cys Phe |

-----------------------------

9)Sequence-->Nucleic acid-->Find next ORF ATGAAACGAGGCATGAGCAGGGGCTGCATGGCAGTGCTGGTGCTGATGGCCACAGTGCTGACTGTCACTGGAGCAG TTCCTGTCACCAGGCCCCCCAGGGCCCTCCCGGATGCAAGGGGCTGCCACATAGCCCAGTTCAAGTCTCTGTCCCC ACAGGAGCTGCAAGCCTTTAAGAGAGCCAAAGATGCCTTAGAAGAGTCGCTTCTGCTGAAGGACTGCAGGTGCCGC TCCCGCCTCTTCCCCAGGACCTGGGACCTGAGGCAGCTGCAGGTGAGGGAGCGCCCCGTGGCCTTGGAGGCTGAGC TGGCCCTGACGCTGGAGGTTCTGGAGGCCACCGCTGACAATGACATGGCCCTGGGTGACGTCCTGGACCGGCCCCT TCACACCCTGCACCACGTCCTCTCCCAGCTCCGGGCCTGTGTCCAGCCTCAGCCCACGGCAGGGCCCAGGCCCTGG GGTCGCCTCCACCACTGGCTGCACCGGCTCCAGGAGGCCCCAAAAAAGGAGTCCTCTGGCTGCCTGGAGGCCTCTG TCACCTTCAACCTCTTCCGCCTCCTCACGCGGGACCTGAAATGCGTCGCCAGCGGGGACCTGTGTGCCTGA

10)Sequence-->Nucleic acid-->Create Plasmid from Sequence

11)Restriction map

BioEdit version 7.0.9.0 (6/27/07) Restriction Mapping Utility

(c)1998, Tom Hall

gi|725605238|ref|XM_010330964.1| PREDICTED: Saimiri boliviensis boliviensis interferon, lambda 3 (IFNL3), mRNA Restriction Map

2014/11/29 17:27:57

744 base pairs

Translations: 3 2 1 -1 -2 -3

Restriction Enzyme Map:

D S L R V G S F C * R R P

E I R N E T R H E Q G L H

* Q P Q S G V L L L T K T R D Q K * N E A * A G A A

V T A S E W G P S A D E D Q R S E M K R G M S R G C M

1 GTGACAGCCTCAGAGTGGGGTCCTTCTGCTGACGAAGACCAGAGATCAGAAATGAAACGAGGCATGAGCAGGGGCTGCAT 80

1 CACTGTCGGAGTCTCACCCCAGGAAGACGACTGCTTCTGGTCTCTAGTCTTTACTTTGCTCCGTACTCGTCCCCGACGTA 80

H C G * L P T R R S V F V L S * F H F S A H A P A A H

V A E S H P G E A S S S W L D S I F R P M L L P Q M

S L R L T P D K Q Q R L G S I L F S V L C S C P S C

BslI NlaIV Hin4I BbsI MnlI Hin4I TspDTI HpyF10VI

MnlI Hin4I MboII Hin4I BstAPI

EcoO109I BbvI MwoI

PpuMI AlwNI

BspCNI

BseMII

G S A G A D G H S A D C H W S S S C H Q A P Q G P P G

W Q C W C * W P Q C * L S L E Q F L S P G P P G P S R

A V L V L M A T V L T V T G A V P V T R P P R A L P

81 GGCAGTGCTGGTGCTGATGGCCACAGTGCTGACTGTCACTGGAGCAGTTCCTGTCACCAGGCCCCCCAGGGCCCTCCCGG 160

81 CCGTCACGACCACGACTACCGGTGTCACGACTGACAGTGACCTCGTCAAGGACAGTGGTCCGGGGGGTCCCGGGAGGGCC 160

C H Q H Q H G C H Q S D S S C N R D G P G G P G E R

A T S T S I A V T S V T V P A T G T V L G G L A R G S

P L A P A S P W L A S Q * Q L L E Q * W A G W P G G P

MslI TspRI EaeI TspRI TspRI AlwNI BpmI BsaJI ApaI

BtsI MscI BsrI Eco57MI EcoO109I

HphI EcoO109I EcoO109I

NlaIV PspOMI BbvI

BsaJI BanII

SfaNI

NlaIV

Bme1580I

Bsp1286I

C K G L P H S P V Q V S V P T G A A S L * E S Q R C

M Q G A A T * P S S S L C P H R S C K P L R E P K M P

D A R G C H I A Q F K S L S P Q E L Q A F K R A K D A

161 ATGCAAGGGGCTGCCACATAGCCCAGTTCAAGTCTCTGTCCCCACAGGAGCTGCAAGCCTTTAAGAGAGCCAAAGATGCC 240

161 TACGTTCCCCGACGGTGTATCGGGTCAAGTTCAGAGACAGGGGTGTCCTCGACGTTCGGAAATTCTCTCGGTTTCTACGG 240

I C P A A V Y G L E L R Q G W L L Q L G K L S G F I G

A L P Q W M A W N L D R D G C S S C A K L L A L S A

H L P S G C L G T * T E T G V P A A L R * S L W L H R

MnlI FokI BsrI BsmAI AlwNI SfaNI MwoI

BstF5I BmrI BsmFI BbvI Cac8I HpyF10VI

FalI

EarI

L R R V A S A E G L Q V P L P P L P Q D L G P E A A A

* K S R F C * R T A G A A P A S S P G P G T * G S C

L E E S L L L K D C R C R S R L F P R T W D L R Q L Q

241 TTAGAAGAGTCGCTTCTGCTGAAGGACTGCAGGTGCCGCTCCCGCCTCTTCCCCAGGACCTGGGACCTGAGGCAGCTGCA 320

241 AATCTTCTCAGCGAAGACGACTTCCTGACGTCCACGGCGAGGGCGGAGAAGGGGTCCTGGACCCTGGACTCCGTCGACGT 320

* F L R K Q Q L V A P A A G A E E G P G P V Q P L Q L

K S S D S R S F S Q L H R E R R K G L V Q S R L C S C

L L T A E A S P S C T G S G G R G W S R P G S A A A

PleI SfcI NlaIV MwoI FauI MnlI MnlI AarI MspA1I

MboII FalI MwoI HpyF10VI EarI BseMII Bsu36I PvuII

MlyI PstI BsrBI BsaJI BslI BspMI BsmFI

AarI BanI MboII EcoO109I SfcI

BspMI HpyF10VI PpuMI EcoO109I MnlI

Eco57I PflMI

Eco57MI AlwNI

BspCNI

BsaJI

BbvI

PpuMI

NlaIV

G E G A P R G L G G * A G P D A G G S G G H R * Q * H

R * G S A P W P W R L S W P * R W R F W R P P L T M T

V R E R P V A L E A E L A L T L E V L E A T A D N D

321 GGTGAGGGAGCGCCCCGTGGCCTTGGAGGCTGAGCTGGCCCTGACGCTGGAGGTTCTGGAGGCCACCGCTGACAATGACA 400

321 CCACTCCCTCGCGGGGCACCGGAACCTCCGACTCGACCGGGACTGCGACCTCCAAGACCTCCGGTGGCGACTGTTACTGT 400

H P L A G H G Q L S L Q G Q R Q L N Q L G G S V I V

T L S R G T A K S A S S A R V S S T R S A V A S L S M

P S P A G R P R P P Q A P G S A P P E P P W R Q C H C

PstI HphI MnlI BglI MwoI MnlI HgaI BpmI BpmI

BbvI HaeII BseMII BlpI HpyF10VI BslI MnlI Eco57MI Eco57MI

BsaJI BspCNI Cac8I Hpy188III MspA1I TaqII

BtgI BsaJI

StyI HpyF10VI

MwoI

G P G * R P G P A P S H P A P R P L P A P G L C P A

W P W V T S W T G P F T P C T T S S P S S G P V S S L

M A L G D V L D R P L H T L H H V L S Q L R A C V Q P

401 TGGCCCTGGGTGACGTCCTGGACCGGCCCCTTCACACCCTGCACCACGTCCTCTCCCAGCTCCGGGCCTGTGTCCAGCCT 480

401 ACCGGGACCCACTGCAGGACCTGGCCGGGGAAGTGTGGGACGTGGTGCAGGAGAGGGTCGAGGCCCGGACACAGGTCGGA 480

H G Q T V D Q V P G K V G Q V V D E G L E P G T D L R

A R P S T R S R G R * V R C W T R E W S R A Q T W G

P G P H R G P G A G E C G A G R G R G A G P R H G A E

BsaJI BsaHI HphI NlaIV BsaXI BmgBI BseYI BslI BbvCI

BsaJI ZraI BsrFI MnlI BsaXI Bpu10I

AatII BsgI BslI

S A H G R A Q A L G S P P P L A A P A P G G P K K G V

S P R Q G P G P G V A S T T G C T G S R R P Q K R S

Q P T A G P R P W G R L H H W L H R L Q E A P K K E S

481 CAGCCCACGGCAGGGCCCAGGCCCTGGGGTCGCCTCCACCACTGGCTGCACCGGCTCCAGGAGGCCCCAAAAAAGGAGTC 560

481 GTCGGGTGCCGTCCCGGGTCCGGGACCCCAGCGGAGGTGGTGACCGACGTGGCCGAGGTCCTCCGGGGTTTTTTCCTCAG 560

L G R C P G P G P T A E V V P Q V P E L L G W F L L G

* G V A P G L G Q P R R W W Q S C R S W S A G F F S D

A W P L A W A R P D G G G S A A G A G P P G L F P T

BsaJI BspCNI EcoO109I BsgI BpmI TspRI MnlI EcoO109I BbvI

BtgI BslI ApaI BceAI BbvI Eco57MI BsrFI BslI NlaIV BslI

MnlI NlaIV BsaJI MnlI MwoI

BslI BsaJI BslI BsrI HpyF10VI

BseMII PflMI NlaIV

EcoO109I AlwNI

PspOMI BsaJI

BanII

Bme1580I

Bsp1286I

L W L P G G L C H L Q P L P P P H A G P E M R R Q R G

P L A A W R P L S P S T S S A S S R G T * N A S P A G

S G C L E A S V T F N L F R L L T R D L K C V A S G

561 CTCTGGCTGCCTGGAGGCCTCTGTCACCTTCAACCTCTTCCGCCTCCTCACGCGGGACCTGAAATGCGTCGCCAGCGGGG 640

561 GAGACCGACGGACCTCCGGAGACAGTGGAAGTTGGAGAAGGCGGAGGAGTGCGCCCTGGACTTTACGCAGCGGTCGCCCC 640

R A A Q L G R D G E V E E A E E R P V Q F A D G A P

E P Q R S A E T V K L R K R R R V R S R F H T A L P S

R Q S G P P R Q * R * G R G G G * A P G S I R R W R P

PleI MnlI HphI MnlI BseRI MnlI BslI BsmFI MspA1I

MlyI StuI MboII EarI MnlI FauI EcoO109I

MnlI EciI FauI EcoO109I MwoI PpuMI

BpmI PpuMI HpyF10VI

Eco57MI HgaI Cac8I

MnlI

NlaIV

P V C L T F P P P C H P S C N L R F Y L Y I N H L S

T C V P D L P T S L P P I M Q S E I L F I H Q P L V L

D L C A * P S H L P A T H H A I * D F I Y T S T T C L

641 ACCTGTGTGCCTGACCTTCCCACCTCCCTGCCACCCATCATGCAATCTGAGATTTTATTTATACATCAACCACTTGTCTT 720

641 TGGACACACGGACTGGAAGGGTGGAGGGACGGTGGGTAGTACGTTAGACTCTAAAATAAATATGTAGTTGGTGAACAGAA 720

V Q T G S R G V E R G G M M C D S I K N I C * G S T K

R H A Q G E W R G A V W * A I Q S K I * V D V V Q R

G T H R V K G G G Q W G D H L R L N * K Y M L W K D *

NlaIV BsmFI MnlI BspCNI TaqII

BseMII

* F I A T Q S L

I Y C H P I A

N L L P P N R Y

721 AATTTATTGCCACCCAATCGCTAT 744

721 TTAAATAACGGTGGGTTAGCGATA 744

I * Q W G I A I

L K N G G L R *

N I A V W D S

Restriction table:

Enzyme    Recognition                Frequency  Positions
__________________________________________________________________________
AarI      CACCTGCnnnn'nnnn_          2          261, 310
AatII     G_ACGT'C                   1          417
AlwNI     CAG_nnn'CTG                5          75, 131, 211, 300, 504
ApaI      G_GGCC'C                   2          154, 498
BanI      G'GyrC_C                   1          273
BanII     G_rGCy'C                   2          154, 498
BbsI      GAAGACnn'nnnn_             1          42
BbvI      GCAGCnnnnnnnn'nnnn_        7          62, 158, 198, 303, 325, 513, 554
BbvCI     CC'TCA_GC                  1          480
BceAI     ACGGCnnnnnnnnnnnn'nn_      1          504
BglI      GCCn_nnn'nGGC              1          347
BlpI      GC'TnA_GC                  1          351
Bme1580I  G_kGCm'C                   2          154, 498
BmgBI     CAC'GTC                    1          448
BmrI      ACTGGGnnnn_n'              1          178
BpmI      CTGGAGnnnnnnnnnnnnnn_nn'   5          141, 389, 398, 521, 593
Bpu10I    CC'TnA_GC                  1          480
BsaHI     Gr'CG_yC                   1          414
BsaJI     C'CnnG_G                   12         146, 147, 293, 300, 336, 342, 405, 406, 486, 497, 503, 504
BsaXI     ACnnnnnCTCCnnnnnnn_nnn'    1          467
BsaXI     GGAGnnnnnGTnnnnnnnnn_nnn'  1          437
BseMII    CTCAGnnnnnnnn_nn'          5          24, 299, 342, 494, 679
