


HPC for Bioinformatics
Jazz Wang (Yao-Tsung Wang), jazz@fc918d4fe518964bcf847c73.tw
PART 1 : ( 60% ) HPC = High Performance Computing
    What is HPC ?
    Types of HPC ?
    Can I solve my problem with HPC ?
PART 2 : ( 30% ) HPC & Bioinformatics Application
PART 3 : ( 10% ) Open Source for Bioinformatics

PART 1 : HPC 101
What is HPC ? & Why HPC ?

Source: fc918d4fe518964bcf847c73/whatishpc/WhatIsHPC.pdf

Types of HPC ?

Source: http://blog.tice.de/a_icons/icons/512%20Time%20Machine.png
Back to Year 1960s ...
Brief History of Computing (1/5)
1960 PDP-1
1965 PDP-7
1969 1st Unix
Source: fc918d4fe518964bcf847c73/2007/07/
Mainframe Super Computer

Evolution of Computing Architecture (1/5)
Multiple Users
Mainframe Super Computer
Single CPU
Shared Memory
One Admin.
Single Super Computer
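The User's Inner Voice (1/5)
"Running a program means waiting in the queue for ages~"
"I really wish I had a computer of my own to run it on!!"
"Damn, the program crashed again, and now I have to queue all over again."
"Supercomputers are toys that only the rich can afford~"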

Back to Year 1970s ...

Back to Year 1980s ...
1991 Linux

Brief History of Computing (2/5)
Source: fc918d4fe518964bcf847c73.tw
Mainframe Super Computer
PC / Linux Cluster Parallel
Evolution of Computing Architecture (2/5)
Multiple Users
Separate CPU
Separate Memory
One Admin.
Multiple PC in One Location
Mainframe Super Computer
PC / Linux Cluster Parallel

Brief History of Computing (3/5)
Source: http://www.scei.co.jp/folding/en/dc
Mainframe Super Computer
PC / Linux Cluster Parallel
Internet Distributed Computing
Evolution of Computing Architecture (3/5)
Multiple Users
One Admin.
Single Powerful Server: Single CPU, Shared Memory
Network
Multiple Users
One Admin.
Single Powerful Server: Single CPU, Shared Memory
Single Broker
One Admin.
PC / Linux Cluster Parallel
Internet Distributed Computing

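The User's Inner Voice (3/5)
"Why are distributed objects so abstract~ XD"
"Argh! The network is down~ Nothing works~"
"Everyone, please donate your idle computers!!"
"Give me online games, and forget the rest!"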
Back to Year 2000s ...
1997 Volunteer Computing
1999 SETI@HOME
2002 Berkeley BOINC
2003 Globus Toolkit 2
2004 EGEE gLite

Brief History of Computing (4/5)
Source: http://gridcafe.web.cern.ch/gridcafe/whatisgrid/whatis
Mainframe Super Computer
PC / Linux Cluster Parallel
Internet Distributed Computing
Virtual Org. Grid Computing
Evolution of Computing Architecture (4/5)
Multiple Users
Multiple PC in one location
One Admin.
Multiple PC in other location
Multiple Users
One Admin.
Grid Middleware
Network
Internet Distributed Computing
Virtual Org. Grid Computing
Virtual Organization
Heterogeneous CyberInfrastructure

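The User's Inner Voice (4/5)
"I've already been given my credentials, so why can't I get any resources?"
"What? The available resources are in the US, so take your time moving the files over!"
"Why is Google so good at computing?!"
"Boss, please go and negotiate a resource-sharing policy for us!"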
Back to Year 2007 ...
2001 Autonomic Computing (IBM)
2005 Utility Computing (Amazon EC2 / S3)
2006 Apache Hadoop
2007 Cloud Computing (Google + IBM)

Brief History of Computing (5/5)
Source: fc918d4fe518964bcf847c73/2008/02/14/cloud-computing/
Mainframe Super Computer
PC / Linux Cluster Parallel
Internet Distributed Computing
Virtual Org. Grid Computing
Data Explode Cloud Computing
Evolution of Computing Architecture (5/5)
Each User || Virtual Admin.
Access any time, any where with mobile device
Multiple PC in different locations
Multiple Admin.
Virtual World
Physical World
Virtual Org. Grid Computing
Data Explode Cloud Computing
What is NEXT ?! Mobile Computing ?!

Source: http://cyberpingui.free.fr/humour/evolution-white.jpg

Falling to the Ground ...


PART 2 : HPC & Bioinformatics Application
BLAST (Basic Local Alignment Search Tool)
- fc918d4fe518964bcf847c73/
- National Center for Biotechnology Information
- BLAST is an algorithm for comparing primary biological sequence information, for example:
    - the amino-acid sequences of different proteins
    - the nucleotides of DNA sequences
- Use: search whether an unknown gene from another species (e.g. mouse) also exists in the human genome.
- Advantage: it uses a heuristic search to find related sequences, about 50 times faster than dynamic programming.
- Drawback: it cannot guarantee how closely the sequences it finds are related to the query sequence.
- Technical problem: the query must be compared against a huge sequence database. How do we compute this fast?
Source: fc918d4fe518964bcf847c73/w/index.php?title=BLAST_(生物資訊學)&variant=zh-tw
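To make the trade-off between dynamic programming and heuristic search concrete, here is a minimal Python sketch. It is not BLAST's actual implementation: `smith_waterman_score` is a textbook dynamic-programming local alignment (the slow but exhaustive baseline), and `shared_words` only illustrates the general "seed" idea of looking for short exact word matches first. The function names and the scoring values (`match`, `mismatch`, `gap`, `k`) are assumptions chosen for illustration.

```python
def smith_waterman_score(query, subject, match=2, mismatch=-1, gap=-2):
    """Best local alignment score by dynamic programming (linear gap penalty).

    Every cell of a len(query) x len(subject) table is filled in, which is
    why this approach is slow on a huge sequence database.
    """
    prev = [0] * (len(subject) + 1)
    best = 0
    for q in query:
        curr = [0]
        for j, s in enumerate(subject, start=1):
            diag = prev[j - 1] + (match if q == s else mismatch)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def shared_words(query, subject, k=3):
    """Return the k-letter words found in both sequences (the 'seed' idea).

    A heuristic search can skip subjects that share no word with the query,
    at the price of possibly missing weakly related sequences.
    """
    q_words = {query[i:i + k] for i in range(len(query) - k + 1)}
    s_words = {subject[j:j + k] for j in range(len(subject) - k + 1)}
    return q_words & s_words
```

Calling `shared_words(query, subject)` before `smith_waterman_score(query, subject)` shows the idea: the cheap word lookup decides whether the expensive table is worth filling in at all.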

PART 2.1 : Cluster 101 & mpiBLAST
At first, we have a "4 + 1" PC cluster
It'd better be
Manage Scheduler

Then, we connect the 5 PCs with a Gigabit Ethernet switch
GbE Switch
10/100/1000 Mbps
Add 1 NIC for WAN
Compute Nodes
The 4 compute nodes communicate via the LAN switch. Only the manage node has Internet access, for security!
Manage Node
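The "4 + 1" layout above suggests one way to speed up a search over a large sequence database: the manage node splits the database into one shard per compute node, each compute node searches only its own shard, and the manage node merges the hits. The Python sketch below imitates that pattern on a single machine. It rests on several assumptions: threads stand in for the 4 compute nodes (a real cluster would use message passing, as a tool such as mpiBLAST does), `word_score` is a toy similarity measure rather than a BLAST score, and all function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

NUM_COMPUTE_NODES = 4  # the "4" in the "4 + 1" cluster; threads stand in for nodes


def word_score(query, subject, k=3):
    """Toy similarity: number of distinct k-letter words shared by both sequences."""
    q_words = {query[i:i + k] for i in range(len(query) - k + 1)}
    s_words = {subject[j:j + k] for j in range(len(subject) - k + 1)}
    return len(q_words & s_words)


def split_database(database, parts):
    """Deal the (name, sequence) records round-robin into `parts` shards."""
    return [database[i::parts] for i in range(parts)]


def search_shard(query, shard):
    """Work of one compute node: score the query against its own shard only."""
    return [(word_score(query, seq), name) for name, seq in shard]


def cluster_search(query, database, top=3):
    """Manage-node role: scatter the shards, gather the hits, keep the best."""
    shards = split_database(database, NUM_COMPUTE_NODES)
    with ThreadPoolExecutor(max_workers=NUM_COMPUTE_NODES) as pool:
        partial_hits = list(pool.map(lambda shard: search_shard(query, shard), shards))
    hits = [hit for part in partial_hits for hit in part]
    # Highest score first; ties are broken by record name for a stable order.
    return sorted(hits, key=lambda hit: (-hit[0], hit[1]))[:top]
```

For example, `cluster_search("ACGT", [("seq1", "ACGTACGT"), ("seq2", "TTTTTTTT")])` ranks `seq1` ahead of `seq2`. Each shard is searched independently, so the compute nodes need no communication with each other until the final merge on the manage node.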

