IBM Minicomputer Routine Maintenance Manual

Updated: 2024-04-09 10:37:01


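Routine Maintenance Operations

Maintaining the minicomputer mainly means checking how the system is running: CPU utilization, memory usage, disk space, and so on. The commands used most often are described below.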

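1. Checking the CPU

sar

sar [interval] [count]    Samples average CPU usage

For example, sampling the CPU 60 times at 10-second intervals (sar 10 60) covers 10 minutes in total; the last row of the output is the average.

Reference values: %idle > 30%, %wio < 30%

%wio < 30% is the critical limit, but %wio deserves attention as soon as it exceeds 10%. If %wio rises above 40-50%, the system is close to becoming unresponsive.

Example: sar 3 10 samples every 3 seconds, 10 times.

# sar 3 10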

10:42:13    %usr    %sys    %wio   %idle
10:42:16       6       7       1      87
10:42:19       6       5       2      87
10:42:23       8       7       1      84
10:42:26       4       6       1      89
10:42:29       5       7       2      87
10:42:32       7       6       1      86
10:42:35       6       4       1      89
10:42:38       5       6       1      88
10:42:41       6       7       2      85
10:42:44       6       6       1      87

Average        6       6       1      87
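The reference values in this section can be checked mechanically. The following is a minimal sketch, not part of the original procedure: it assumes the four-column AIX layout shown in the sample (%usr %sys %wio %idle) and reads sar output from standard input. The function name check_sar_average and the OK/WATCH/ALERT labels are hypothetical.

```shell
# check_sar_average: read `sar <interval> <count>` output on stdin and
# compare the final "Average" row with the reference values from this section.
# Assumes the column order %usr %sys %wio %idle (an assumption about the layout).
check_sar_average() {
    awk '
        $1 == "Average" { wio = $4 + 0; idle = $5 + 0; seen = 1 }
        END {
            if (!seen) { print "no Average row found"; exit 2 }
            status = "OK"
            # %wio above 10 deserves attention.
            if (wio > 10) status = "WATCH"
            # Reference values: %idle > 30 and %wio < 30.
            if (wio >= 30 || idle <= 30) status = "ALERT"
            printf "%s idle=%s%% wio=%s%%\n", status, idle, wio
        }
    '
}
```

On the host it would be used as: sar 10 60 | check_sar_average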

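2. Checking memory: vmstat

vmstat [interval] [count]

Real-time memory check

Watch the paging columns pi and po over time to find resource bottlenecks.

pi and po should ideally stay at 0. If they remain non-zero, memory is overloaded and the system is paging heavily.

Example: vmstat 2 5 samples every 2 seconds, 5 times.

# vmstat 2 5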

kthr      memory              page              faults         cpu
----- -------------- ----------------------- ---------------- -----------
 r  b     avm    fre  re  pi  po  fr  sr  cy   in    sy    cs us sy id wa
 2  1 1167414 352340   0   0   0   0   0   0   25 27005 15592  5  5 88  2
 1  0 1167421 352297   0   0   0   0   0   0 4146 30346 18289  5  6 88  2
 0  0 1168033 351662   0   0   0   0   0   0 4053 30121 15829  4  6 89  1
 0  0 1167518 352157   0   0   0   0   0   0 4078 26353 15764  3  5 90  2
 1  0 1167490 352148   0   0   0   0   0   0 4109 34246 18417  7  6 86  1
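To see at a glance whether pi and po stay at 0, the samples with paging activity can be counted. This is a sketch under assumptions: the function name count_paging_samples is hypothetical, and the pi/po positions are taken from the header row rather than hard-coded, since the column set may differ between AIX levels.

```shell
# count_paging_samples: read `vmstat <interval> <count>` output on stdin and
# report how many samples had page-ins (pi) or page-outs (po).
count_paging_samples() {
    awk '
        # Header row: remember where the pi and po columns are.
        $1 == "r" && $2 == "b" {
            for (i = 1; i <= NF; i++) {
                if ($i == "pi") pi = i
                if ($i == "po") po = i
            }
            next
        }
        # Data rows start with a number once the header has been seen.
        pi && po && $1 ~ /^[0-9]+$/ {
            total++
            if ($pi + 0 > 0 || $po + 0 > 0) paging++
        }
        END { printf "%d of %d samples show paging\n", paging, total }
    '
}
```

On the host it would be used as: vmstat 2 5 | count_paging_samples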

# topas

Topas Monitor for host:    HZ_SMC_SMC1B         EVENTS/QUEUES    FILE/TTY
Sat Jan 17 10:48:31 2004   Interval:  2         Cswitch   18496  Readch  1132.3K
                                                Syscall   33397  Writech   96691
Kernel    6.3   |##                          |  Reads       571  Rawin         0
User      5.3   |##                          |  Writes       82  Ttyout        0
Wait      1.4   |                            |  Forks        11  Igets         6
Idle     86.8   |########################    |  Execs        13  Namei       838
                                                Runqueue    0.0  Dirblk       29
Network  KBPS   I-Pack  O-Pack   KB-In  KB-Out  Waitqueue   1.0
lo0     249.2     1441    1441   249.2   249.2
en5     193.1      763     827   174.7   211.5  PAGING           MEMORY
en0      88.4      468     527    56.6   120.1  Faults     4457  Real,MB    8191
en4      26.8       36      64     2.0    51.6  Steals        0  % Comp     56.2
                                                PgspIn        0  % Noncomp  28.0
Disk    Busy%     KBPS     TPS KB-Read KB-Writ  PgspOut       0  % Client    0.8
hdisk0   11.4     83.4      20     0.0   166.8  PageIn        2
hdisk1    9.4     83.4      20     0.0   166.8  PageOut      48  PAGING SPACE
hdisk3    1.9     95.3       6     0.0   190.6  Sios         35  Size,MB   10240
hdisk2    0.0     11.9       1     0.0    23.8                   % Used      0.5
                                                NFS (calls/sec)  % Free     99.4
WLM-Class       CPU%    Mem%    Disk-I/O%       ServerV2      0
                                                ClientV2      0   Press:
                                                ServerV3      0
Name            PID   CPU%    PgSp  Owner       ClientV3     18
smcapp        37250    2.1  1014.9  smc
mtiserver     22188    1.8    77.4  smc
mapserver     47100    1.6    39.4  smc
ctiserver     39880    0.7    26.7  smc
dbdaemon      21014    0.7   144.3  smc
icdcomm       20124    0.5     2.2  smc
billcreat     21796    0.5    10.0  smc
topas         44376    0.4     1.7  root

The output of this command gives an overview of how the system is running. In particular, PAGING SPACE usage (% Used) should not exceed 20%.

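4. Real-time I/O check

Watch the usage of the hot disks over time to find resource bottlenecks.

Use the output to work out which disks carry most of the load, then balance the load across disks for best performance.

# iostat 2 2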

tty:      tin         tout   avg-cpu:  % user    % sys     % idle    % iowait
          0.0          0.5               0.5      0.7       98.7       0.1

Disks:        % tm_act     Kbps      tps    Kb_read   Kb_wrtn
hdisk0           1.1        8.6      2.0      52500   2406090
hdisk1           0.7        6.3      1.5       4650   1804326
hdisk2           0.0        0.0      0.0          0         0
hdisk3           0.0        0.0      0.0          0         0
cd0              0.0        0.0      0.0          0         0

tty:      tin         tout   avg-cpu:  % user    % sys     % idle    % iowait
          0.0        269.0               0.7      2.6       96.7       0.0

Disks:        % tm_act     Kbps      tps    Kb_read   Kb_wrtn
hdisk0           0.0        2.0      0.5          4         0
hdisk1           0.0        0.0      0.0          0         0
hdisk2           0.0        0.0      0.0          0         0
hdisk3           0.0        0.0      0.0          0         0
cd0              0.0        0.0      0.0          0         0
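One way to spot the hot disks is to rank the hdisk rows by their peak "% tm_act" across all reports. The sketch below assumes the column layout of the sample output; the function name rank_disks_by_activity is hypothetical.

```shell
# rank_disks_by_activity: read `iostat <interval> <count>` output on stdin and
# list each hdisk with its peak "% tm_act" value, busiest first.
rank_disks_by_activity() {
    awk '
        # Disk rows: device name in field 1, % tm_act in field 2.
        $1 ~ /^hdisk[0-9]+$/ {
            if (!($1 in peak) || $2 + 0 > peak[$1]) peak[$1] = $2 + 0
        }
        END { for (d in peak) printf "%s %.1f\n", d, peak[d] }
    ' | LC_ALL=C sort -k2,2nr -k1,1
}
```

On the host it would be used as: iostat 2 2 | rank_disks_by_activity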

5. Checking paging space

High paging space usage indicates that the system is short of memory. Reference value: Percent Used < 20%; ideally it stays at about 1%.

# lsps -s
Total Paging Space   Percent Used
      4768MB               1%
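The 20% reference value can be checked from the lsps -s output. This is a minimal sketch that assumes the two-column layout shown above; the function name check_paging_space and the optional limit argument are hypothetical.

```shell
# check_paging_space: read `lsps -s` output on stdin and compare
# "Percent Used" with a limit (default 20, the reference value in this section).
check_paging_space() {
    limit=${1:-20}
    awk -v limit="$limit" '
        # The data row looks like: 4768MB 1%
        $1 ~ /^[0-9]+MB$/ && $2 ~ /^[0-9]+%$/ {
            used = $2
            sub(/%/, "", used)
            if (used + 0 >= limit) print "ALERT paging space " used "% used"
            else                   print "OK paging space " used "% used"
        }
    '
}
```

On the host it would be used as: lsps -s | check_paging_space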

6. Checking disk space: df -k

# df -k
Filesystem                     1024-blocks      Free %Used    Iused %Iused Mounted on
/dev/hd4                            196608    155820   21%     2938     3% /
/dev/hd2                           1114112    297456   74%    26672    10% /usr
/dev/hd9var                         851968    763064   11%      853     1% /var
/dev/hd3                           1572864   1520364    4%      107     1% /tmp
/dev/hd1                           3080192   2393932   23%      389     1% /home
/proc                                    -         -    -         -     -  /proc
/dev/hd10opt                         65536     55908   15%      387     3% /opt
/dev/lv00                          3080192    987876   68%    42554     6% /home/oracle
HZ_SMC_SMC1A:/home/smc/config      3080192   2452140   21%      300     1% /home/smc/mnt
/dev/lv2                          20971520  20310064    4%       40     1% /bpsbill
/dev/lv3                           5242880   5077964    4%       19     1% /checkbill
/dev/lv4                          20971520  20312948    4%       20     1% /checkbill/checkbak
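File systems that are filling up can be picked out of the df -k output automatically. The sketch below assumes the AIX column order shown above (%Used in column 4, mount point in column 7). The default limit of 80% is an assumption for illustration, not a value from this manual, and the function name list_full_filesystems is hypothetical.

```shell
# list_full_filesystems: read AIX `df -k` output on stdin and print every
# mount point whose %Used is at or above a limit.
# The default limit of 80 is an assumption, not a value from the manual.
list_full_filesystems() {
    limit=${1:-80}
    awk -v limit="$limit" '
        # Skip the header and rows such as /proc that have no percentage.
        NR > 1 && $4 ~ /^[0-9]+%$/ {
            used = $4
            sub(/%/, "", used)
            if (used + 0 >= limit) print $7, used "%"
        }
    '
}
```

On the host it would be used as: df -k | list_full_filesystems 70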

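7. Checking which host the disk enclosure is attached to: lsvg -o

lsvg -o lists the volume groups that are currently varied on (active) on this host.

# lsvg -o
data1vg
data2vg
rootvg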

8. Viewing volume group attributes: lsvg datavg (or rootvg)

# lsvg data1vg
VOLUME GROUP:   data1vg                  VG IDENTIFIER:  000dda1a00004c00000000f616a8a678
VG STATE:       active                   PP SIZE:        64 megabyte(s)
VG PERMISSION:  read/write               TOTAL PPs:      8688 (556032 megabytes)
MAX LVs:        512                      FREE PPs:       586 (37504 megabytes)
LVs:            280                      USED PPs:       8102 (518528 megabytes)
OPEN LVs:       215                      QUORUM:         2
TOTAL PVs:      1                        VG DESCRIPTORS: 2
STALE PVs:      0                        STALE PPs:      0
ACTIVE PVs:     1                        AUTO ON:        no
MAX PPs per PV: 10160                    MAX PVs:        12
LTG size:       128 kilobyte(s)          AUTO SYNC:      no
HOT SPARE:      no
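The TOTAL PPs and FREE PPs fields show how much room is left in the volume group (in the sample, 586 of 8688 physical partitions of 64 MB each). As an illustrative sketch, the free share can be computed from the lsvg output; the function name vg_free_percent is hypothetical, and the labels are matched as text so that the two-column layout does not matter.

```shell
# vg_free_percent: read `lsvg <vgname>` output on stdin and print the share
# of physical partitions (PPs) that are still free.
vg_free_percent() {
    awk '
        # Return the first number that follows label on the current line.
        function value_after(label,    rest) {
            rest = substr($0, index($0, label) + length(label))
            sub(/^[ \t]+/, "", rest)
            sub(/[^0-9].*$/, "", rest)
            return rest + 0
        }
        index($0, "TOTAL PPs:") { total = value_after("TOTAL PPs:") }
        index($0, "FREE PPs:")  { free  = value_after("FREE PPs:") }
        END {
            if (total > 0)
                printf "%.1f%% free (%d of %d PPs)\n", 100 * free / total, free, total
        }
    '
}
```

On the host it would be used as: lsvg data1vg | vg_free_percent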

9. Viewing the error log: errpt

# errpt | more
IDENTIFIER TIMESTAMP  T C RESOURCE_NAME  DESCRIPTION
0619C49F   0114170104 U U topsvcs        Contact with a neighboring adapter lost
173C787F   0114162304 I S topsvcs        Possible malfunction on local adapter
FE2DEE00   0114161904 P S SYSXAIXIF      DUPLICATE IP ADDRESS DETECTED IN THE NET
FE2DEE00   0114161904 P S SYSXAIXIF      DUPLICATE IP ADDRESS DETECTED IN THE NET
0619C49F   0114160404 U U topsvcs        Contact with a neighboring adapter lost
0619C49F   0114155704 U U topsvcs        Contact with a neighboring adapter lost
FE2DEE00   0114154604 P S SYSXAIXIF      DUPLICATE IP ADDRESS DETECTED IN THE NET
FE2DEE00   0114154604 P S SYSXAIXIF      DUPLICATE IP ADDRESS DETECTED IN THE NET
FE2DEE00   0114154604 P S SYSXAIXIF      DUPLICATE IP ADDRESS DETECTED IN THE NET
FE2DEE00   0114154604 P S SYSXAIXIF      DUPLICATE IP ADDRESS DETECTED IN THE NET
173C787F   0114154404 I S topsvcs        Possible malfunction on local adapter
FE2DEE00   0114154404 P S SYSXAIXIF      DUPLICATE IP ADDRESS DETECTED IN THE NET
FE2DEE00   0114154404 P S SYSXAIXIF      DUPLICATE IP ADDRESS DETECTED IN THE NET

To view hardware errors:

# errpt -d H
IDENTIFIER TIMESTAMP  T C RESOURCE_NAME  DESCRIPTION
BFE4C025   0227155606 P H sysplanar0     UNDETERMINED ERROR
BFE4C025   0220174806 P H sysplanar0     UNDETERMINED ERROR
2F3E09A4   1221171405 I H sysplanar0     REPAIR ACTION
2F3E09A4   1221171405 I H sys0           REPAIR ACTION

In the T (type) column, T means a temporary error and P means a permanent error.

To view the detailed description of a particular error, pass its IDENTIFIER to errpt -a -j:

# errpt -a -j BFE4C025
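Because the summary timestamps are written as MMDDhhmmYY, permanent hardware errors are easier to review once they are filtered and the timestamp is rewritten. The sketch below assumes the six-column summary layout shown above and a 20xx century; the function name list_permanent_hw_errors is hypothetical.

```shell
# list_permanent_hw_errors: read `errpt` summary output on stdin and print
# permanent (T = P) hardware (C = H) entries with a readable timestamp.
# errpt timestamps are MMDDhhmmYY; the century is assumed to be 20xx.
list_permanent_hw_errors() {
    awk '
        $3 == "P" && $4 == "H" && length($2) == 10 && $2 ~ /^[0-9]+$/ {
            ts = $2
            # Output: YYYY-MM-DD hh:mm IDENTIFIER RESOURCE_NAME
            printf "20%s-%s-%s %s:%s %s %s\n",
                substr(ts, 9, 2), substr(ts, 1, 2), substr(ts, 3, 2),
                substr(ts, 5, 2), substr(ts, 7, 2), $1, $5
        }
    '
}
```

On the host it would be used as: errpt | list_permanent_hw_errors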

10. Viewing the system configuration (memory, CPU, disks, model, serial number, and so on)

# prtconf
System Model: IBM,7026-6M1
Machine Serial Number: 10BACFA
Processor Type: PowerPC_RS64-IV
Number Of Processors: 4
Processor Clock Speed: 752 MHz
CPU Type: 64-bit
Kernel Type: 32-bit
LPAR Info: -1 NULL
Memory Size: 4096 MB
Good Memory Size: 4096 MB
Firmware Version: IBM,M2P020806
Console Login: enable
Auto Restart: false
Full Core: false

Network Information
        Host Name: NJ_MT2SMC2
        IP Address: 10.33.25.34
        Sub Netmask: 255.255.255.192
        Gateway: 10.33.25.1
        Name Server:
        Domain Name:

Paging Space Information
        Total Paging Space: 4768MB
        Percent Used: 1%

Volume Groups Information
==============================================================================
rootvg:
PV_NAME           PV STATE          TOTAL PPs   FREE PPs    FREE DISTRIBUTION
hdisk0            active            542         24          12..06..00..06..00
hdisk1            active            542         47          00..00..00..00..47
==============================================================================

12. Backing up configuration files

tar -cvf config20060508.tar /home/smc/config
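The archive name above embeds the backup date (20060508). To avoid typing the date by hand, the name can be built from the current date. This is a sketch only: the function name backup_config and its two arguments (source directory and destination directory) are hypothetical, and the tar invocation mirrors the one above without the verbose flag.

```shell
# backup_config: archive a configuration directory into a date-stamped tar
# file, following the configYYYYMMDD.tar naming used in this section.
# Arguments: $1 = directory to back up, $2 = directory to write the archive to.
backup_config() {
    src=$1
    dest=$2
    archive="$dest/config$(date +%Y%m%d).tar"
    # Print the archive path only if tar succeeded.
    tar -cf "$archive" "$src" && echo "$archive"
}
```

For example, backup_config /home/smc/config /tmp would write /tmp/configYYYYMMDD.tar for the current date and print its path.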

13. Checking the dual-host cluster status: lssrc -g cluster

# lssrc -g cluster
Subsystem         Group            PID     Status
 clstrmgrES       cluster          26584   active
 clsmuxpdES       cluster          19696   active
 clinfoES         cluster          26864   active
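A quick way to confirm that every cluster subsystem is running is to look at the last column of each row. The sketch below assumes the layout shown above, in which Status is the last field; the function name check_cluster_subsystems is hypothetical.

```shell
# check_cluster_subsystems: read `lssrc -g cluster` output on stdin and
# report any subsystem whose status is not "active".
check_cluster_subsystems() {
    awk '
        # Skip the header row.
        NR == 1 && $1 == "Subsystem" { next }
        NF > 0 {
            total++
            # Status is the last field; rows without a PID still end with it.
            if ($NF != "active") {
                bad++
                print "NOT ACTIVE: " $1 " (" $NF ")"
            }
        }
        END {
            if (total > 0 && bad == 0) print "all " total " subsystems active"
        }
    '
}
```

On the host it would be used as: lssrc -g cluster | check_cluster_subsystems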

14. Killing a process: kill -9 PID

kill -9 10554 kills the process whose PID is 10554.

15. Viewing Oracle processes: ps -u oracle

# ps -u oracle
  UID   PID TTY     TIME CMD
  203 16710   -     0:02 tnslsnr
  203 24982   -    82:26 oracle
  203 26054   -     3:57 oracle
  203 27726   -     6:23 oracle
  203 28324   -     3:59 oracle
  203 28590   -     0:32 oracle
  203 28816   -     3:51 oracle
  203 29072   -   116:18 oracle
  203 29284   -     4:04 oracle
  203 29642   -     3:58 oracle
  203 29834   -  3488:56 oracle
  203 30008   -   122:31 oracle
  203 30326   -    19:56 oracle
  203 30612   -     0:02 oracle
  203 31582   -   185:31 oracle
  203 31752   -    12:43 oracle
  203 32042   -     3:59 oracle
  203 32330   -    19:59 oracle
  203 32552   -     4:30 oracle
  203 32834   -     4:01 oracle
  203 33154   -  3478:51 oracle
  203 33578   -     3:54 oracle
  203 33816   -    12:42 oracle
  203 34094   -     4:14 oracle
  203 34350   -     3:55 oracle
  203 41428   -     2:08 oracle
  203 53266   -     0:00 oracle

16. Viewing the Oracle log

The Oracle alert log is $ORACLE_BASE/admin/ora92/bdump/alert_ora92.log

$ tail -20 alert_ora92.log
Thread 1 advanced to log sequence 6005
  Current log# 2 seq# 6005 mem# 0: /dev/rlvora_redo2
Thu Mar 2 19:41:11 2006
Thread 1 advanced to log sequence 6006
  Current log# 3 seq# 6006 mem# 0: /dev/rlvora_redo3
Thu Mar 2 20:27:40 2006
Thread 1 advanced to log sequence 6007
  Current log# 1 seq# 6007 mem# 0: /dev/rlvora_redo1
Thu Mar 2 21:10:24 2006
Thread 1 advanced to log sequence 6008
  Current log# 2 seq# 6008 mem# 0: /dev/rlvora_redo2
Thu Mar 2 21:50:50 2006
Thread 1 advanced to log sequence 6009
  Current log# 3 seq# 6009 mem# 0: /dev/rlvora_redo3
Thu Mar 2 22:29:17 2006
Thread 1 advanced to log sequence 6010
  Current log# 1 seq# 6010 mem# 0: /dev/rlvora_redo1
Thu Mar 2 23:19:07 2006
Thread 1 advanced to log sequence 6011
  Current log# 2 seq# 6011 mem# 0: /dev/rlvora_redo2

When a fault is found, first run:

snap -r

If it reports "nothing to clean up", you can continue; otherwise consult an IBM engineer. Running snap -gc then creates a file named snap.pax.Z under /tmp/ibmsupt.

Source: https://www.bwwdw.com/article/0f7r.html
