Practical RAC Deployment Guide

Updated: 2024-01-27 17:19:01


***Node host 1

Public NIC    Public IP:  192.168.80.40    VIP: 192.168.1.31

Private NIC   Private IP: 10.10.10.10

***Node host 2

Public NIC    Public IP:  192.168.80.41    VIP: 192.168.1.32

Private NIC   Private IP: 10.10.10.11

//////////////////////////////////////////////////////////////////////////////////////////////////////

***Restart the network service

service network restart

//////////////////////////////////////////////////////////////////////////////////////////////////////

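***Edit rac1.vmx / rac2.vmx

Copy all of rac1's files to another folder, then edit the copied configuration file and change displayname to rac2. The following three settings must be added so that both nodes can be started at the same time; they are usually placed below the line Ethernet1.present = "TRUE". Their values are missing from this copy of the notes.

disk.locking = (value missing)
diskLib.dataCacheMaxSize = (value missing)
scsi1.sharedBus = (value missing)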

//////////////////////////////////////////////////////////////////

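***Adjust the CPU settings

The virtual machine reports the following error at startup:

Your host's BIOS does not have valid NUMA information. Please update the host's BIOS or associate the virtual machine with the processors in a single NUMA node (CEC). Please read VMware Knowledge Base articles 928 and 1236.

The virtual machine then exits, and it is never possible to log in to Linux.

Searching online showed that the dual-core CPU causes this, so adding the following statements to the vmx file fixes it (their values are missing from this copy of the notes):

processor0.use = (value missing)
processor1.use = (value missing)

On a quad-core CPU the following statements are required:

processor0.use = TRUE
processor1.use = FALSE
processor2.use = FALSE
processor3.use = FALSE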

//////////////////////////////////////////////////////////////////

***Change the hostname

Run:

hostname rac1

***Edit /etc/sysconfig/network so that it reads:

[root@rac1 ~]# vi /etc/sysconfig/network

NETWORKING=yes
HOSTNAME=rac1

//////////////////////////////////////////////////////////////////////////////////////////////

***Configure IP addresses for the NICs

Configure the IP settings by editing the interface files:

[root@rac1 ~]# vi /etc/sysconfig/network-scripts/ifcfg-eth0

DEVICE=eth0

ONBOOT=yes

BOOTPROTO=none

IPADDR=192.168.80.82

NETMASK=255.255.255.0

TYPE=Ethernet

[root@rac1 ~]# vi /etc/sysconfig/network-scripts/ifcfg-eth1

DEVICE=eth1

ONBOOT=yes

BOOTPROTO=none

IPADDR=10.10.11.2

NETMASK=255.255.255.0

TYPE=Ethernet

***The following NIC obtains its IP address automatically

[root@rac1 ~]# vi /etc/sysconfig/network-scripts/ifcfg-eth2

DEVICE=eth2

BOOTPROTO=dhcp

ONBOOT=yes

TYPE=Ethernet

***Restart the network service

service network restart

//////////////////////////////////////////////////////////////////////////////////////////////

***Edit the /etc/hosts file

[root@rac1 ~]# vi /etc/hosts

# Do not remove the following line, or various programs

# that require network functionality will fail.

127.0.0.1 localhost.localdomain localhost

192.168.80.81 rac1

192.168.80.83 rac1-vip

10.10.11.1 rac1-priv

192.168.80.82 rac2

192.168.80.84 rac2-vip

10.10.11.2 rac2-priv

Note: the 127.0.0.1 localhost entry must exist in /etc/hosts; otherwise the RAC installation may fail later.
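The host names listed above can be checked mechanically before the installer is started. The following is only a sketch: the function name check_hosts is hypothetical, and it assumes the usual hosts layout of an IP address followed by one or more names.

```shell
#!/bin/bash
# check_hosts FILE NAME...
# Prints every NAME that has no entry in FILE; returns 1 if any are missing.
# (check_hosts is a hypothetical helper, not part of any Oracle tooling.)
check_hosts() {
    local file=$1 missing=0 name
    shift
    for name in "$@"; do
        # Skip comment lines, then look for NAME as a whole field after the IP column.
        if ! awk -v n="$name" '
                /^[[:space:]]*#/ { next }
                { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
                END { exit found ? 0 : 1 }' "$file"; then
            echo "missing: $name"
            missing=1
        fi
    done
    return $missing
}

# Example call for the names used in this guide:
# check_hosts /etc/hosts localhost rac1 rac1-vip rac1-priv rac2 rac2-vip rac2-priv
```

Running it on both nodes gives a quick confirmation that the two hosts files agree on every name.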

//////////////////////////////////////////////////////////////////////////////////////////////////////////

***Create the groups and the user, and set permissions

groupadd oinstall
groupadd dba
mkdir /u01
useradd -g oinstall -G dba -d /u01/oracle oracle
passwd oracle

***Change the ownership of the /u01 directory

chown -R oracle:oinstall /u01

//////////////////////////////////////////////////////////////////////////////////////////////////////////

***Configure the oracle user's environment variables

vi /u01/oracle/.bash_profile

export ORACLE_BASE=/u01/oracle
export ORACLE_TERM=xterm
export NLS_LANG=AMERICAN_AMERICA.ZHS16GBK
export CRS_HOME=$ORACLE_BASE/product/crs
export ORACLE_HOME=$ORACLE_BASE/product/database
export ORACLE_ADMIN=$ORACLE_HOME/network/admin
export ORA_NLS33=$ORACLE_HOME/ocommon/nls/admin/data
export PATH=/usr/bin:/usr/sbin:/usr/local/bin:/usr/X11R6/bin:$PATH
export ORACLE_SID=rac2
export PATH=/u01/oracle/product/database/bin:$ORACLE_BASE/product/crs/bin:$PATH

//////////////////////////////////////////////////////////////////////////////////////////////////////////

***Kernel settings

Append the following to the end of /etc/sysctl.conf:

vi /etc/sysctl.conf

kernel.shmall = 2097152
kernel.shmmax = 2147483648
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
fs.file-max = 65536
net.ipv4.ip_local_port_range = 1024 65000
net.core.rmem_default = 262144
net.core.rmem_max = 262144
net.core.wmem_default = 262144
net.core.wmem_max = 262144

***Apply the settings

sysctl -p
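Before relying on the settings, it can help to confirm that the file really contains values at least as large as the ones listed above. The sketch below parses a sysctl.conf-style file; conf_value and check_min are hypothetical helper names, and only the first number of a multi-value setting such as kernel.sem is compared.

```shell
#!/bin/bash
# conf_value FILE KEY: print the value assigned to KEY in a sysctl.conf-style file.
# (hypothetical helper; the last assignment in the file wins)
conf_value() {
    awk -F= -v k="$2" '
        /^[[:space:]]*#/ { next }
        {
            key = $1; gsub(/[[:space:]]/, "", key)
            if (key == k) {
                val = $2
                sub(/^[[:space:]]+/, "", val)
                sub(/[[:space:]]+$/, "", val)
            }
        }
        END { if (val != "") print val }' "$1"
}

# check_min FILE KEY MIN: succeed if the first number of KEY's value is >= MIN.
check_min() {
    local val
    val=$(conf_value "$1" "$2")
    val=${val%%[[:space:]]*}    # keep only the first field (kernel.sem has four)
    [ -n "$val" ] && [ "$val" -ge "$3" ]
}

# Example:
# check_min /etc/sysctl.conf kernel.shmmax 2147483648 || echo "kernel.shmmax too small"
```

The values actually in effect can be read with sysctl -n KEY after sysctl -p has been run.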

//////////////////////////////////////////////////////////////////////////////////////////////

***Configure ssh between all nodes

[root@rac1 ~]# su - oracle
[oracle@rac1 ~]$ mkdir .ssh
[oracle@rac1 ~]$ chmod 700 .ssh
[oracle@rac1 ~]$ cd .ssh
[oracle@rac1 .ssh]$ ssh-keygen -t rsa
[oracle@rac1 .ssh]$ ssh-keygen -t dsa

Repeat the same steps on the other node.

***Add the keys to the authorization file

These steps only need to be run on one of the nodes (rac1 here):

[oracle@rac1 .ssh]$ touch authorized_keys

Collect the public keys of every node in the file just created:

[oracle@rac1 .ssh]$ ssh rac1 cat /u01/oracle/.ssh/id_rsa.pub >> authorized_keys
[oracle@rac1 .ssh]$ ssh rac2 cat /u01/oracle/.ssh/id_rsa.pub >> authorized_keys
[oracle@rac1 .ssh]$ ssh rac1 cat /u01/oracle/.ssh/id_dsa.pub >> authorized_keys
[oracle@rac1 .ssh]$ ssh rac2 cat /u01/oracle/.ssh/id_dsa.pub >> authorized_keys

***On rac1, copy the authorization file that holds the public keys to rac2

[oracle@rac1 .ssh]$ pwd
/home/oracle/.ssh
[oracle@rac1 .ssh]$ scp authorized_keys rac2:/u01/oracle/.ssh

***Set the permissions of the authorization file

Run on every node:

chmod 600 /u01/oracle/.ssh/authorized_keys

***Enable user equivalence

Run as the oracle user on the node from which OUI will be run (rac1 here):

[oracle@rac1 .ssh]$ exec /usr/bin/ssh-agent $SHELL
[oracle@rac1 .ssh]$ ssh-add

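***Verify the ssh configuration

Run the following as the oracle user on every node:

ssh rac1 date
ssh rac2 date
ssh rac1-priv date
ssh rac2-priv date

If each command prints the date without asking for a password, ssh authentication is configured correctly. The commands must be run on both nodes, and the first time each one runs you have to answer yes.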
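Once the first interactive round has been completed, the same check can be repeated non-interactively. This is a sketch only: verify_ssh is a hypothetical function name and SSH_CMD is a hypothetical override used for dry runs. With BatchMode=yes, ssh fails instead of prompting, so a host that still needs a password or a yes answer is reported as FAILED.

```shell
#!/bin/bash
# verify_ssh HOST...: run `date` on every host without allowing any prompt.
# SSH_CMD is a hypothetical hook that lets the loop be dry-run with another command.
verify_ssh() {
    local host failed=0
    for host in "$@"; do
        if ${SSH_CMD:-ssh} -o BatchMode=yes -o ConnectTimeout=5 "$host" date >/dev/null 2>&1; then
            echo "$host ok"
        else
            echo "$host FAILED"
            failed=1
        fi
    done
    return $failed
}

# Example, run as oracle on each node:
# verify_ssh rac1 rac2 rac1-priv rac2-priv
```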

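If these commands are not run, the Clusterware installation fails with the following error even though ssh authentication has been configured:

The specified nodes are not clusterable

The reason is that after ssh is set up you still have to answer yes on the first connection; only then can the other server be reached without any prompt.

//////////////////////////////////////////////////////////////////////////////////////////////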

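***Synchronize one of the hosts with standard time

Available time servers:

Microsoft time server (USA)                               time.windows.com

Taiwan Central Police University time center (Taiwan)     asia.pool.ntp.org

Chinese Academy of Sciences time center (Xi'an)           210.72.145.44

China Netcom time center (Beijing)                        219.158.14.130

/usr/sbin/ntpdate asia.pool.ntp.org

***Edit the scheduled task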

crontab -e

*/5 * * * * /usr/sbin/ntpdate asia.pool.ntp.org >> /var/log/ntpdate.log

***Synchronize time between the two machines

RAC1 acts as the NTP server.

[root@rac1 ~]# mv /etc/ntp.conf /etc/ntp.conf_bak
[root@rac1 ~]# vi /etc/ntp.conf
server 127.127.1.0 #local clock
fudge 127.127.1.0 stratum 10
driftfile /var/lib/ntp/drift
broadcastdelay 0.008

[root@rac1 ~]# chkconfig --level 2345 ntpd on

[root@rac1 ~]# ps -ef|grep ntp

root 22113 4333 0 18:43 pts/2 00:00:00 grep ntp

[root@rac1 ~]# /etc/init.d/ntpd start

Starting ntpd: [ OK ]

[root@rac1 ~]# ps -ef|grep ntp

ntp 22127 1 0 18:44 ? 00:00:00 ntpd -u ntp:ntp -p /var/run/ntpd.pid -g
root 22130 4333 0 18:44 pts/2 00:00:00 grep ntp

[root@rac-01 ~]# ntpq -p

remote refid st t when poll reach delay offset jitter

==============================================================================

LOCAL(0) .LOCL. 10 l 3 64 1 0.000 0.000 0.001

[root@rac-01 ~]# netstat -tulnp|grep ntp

udp 0 0 10.10.10.20:123 0.0.0.0:* 3335/ntpd

udp 0 0 192.168.0.140:123 0.0.0.0:* 3335/ntpd

udp 0 0 127.0.0.1:123 0.0.0.0:* 3335/ntpd

udp 0 0 0.0.0.0:123 0.0.0.0:* 3335/ntpd

udp 0 0 fe80::20c:29ff:feee:123 :::* 3335/ntpd

udp 0 0 fe80::20c:29ff:feee:123 :::* 3335/ntpd

udp 0 0 ::1:123 :::* 3335/ntpd

udp 0 0 :::123 :::* 3335/ntpd

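***Edit the cron file

Put the following in /etc/cron.d/ntp (vi /etc/cron.d/ntp). An entry in /etc/cron.d also needs the five schedule fields and a user name in front of the command.

/usr/sbin/ntpdate 192.168.80.81 > /dev/null 2>&1

Here 192.168.80.81 is timeserver_ip, the IP address of the time server. Restart crond:

/etc/rc.d/init.d/crond restart

Test command:

/usr/sbin/ntpdate timeserver_ip

***Configure the other machine

[root@rac2 ~]# /usr/sbin/ntpdate 192.168.80.81

17 Jun 18:57:55 ntpdate[22800]: step time server 192.168.0.130 offset -0.579328 sec

This command can be scheduled to run periodically.

/usr/sbin/ntpdate -s timeserver_ip && hwclock --systohc

Set the following parameter to yes on both the client and the server, so that the hardware clock (BIOS) is updated automatically after a successful synchronization.

vi /etc/sysconfig/ntpd

SYNC_HWCLOCK=yes

***Create a scheduled task that sets how often to synchronize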

crontab -e

*/5 * * * * /usr/sbin/ntpdate 192.168.80.82 >> /var/log/ntpdate.log

//////////////////////////////////////////////////////////////////////////////////////////////////////////

***Kernel version

[root@rac1 ~]# uname -r
2.6.18-194.el5
[root@rac1 ~]#

***Required ASM packages

oracleasmlib-2.0.4-1.el5.x86_64.rpm
oracleasm-support-2.1.7-1.el5.x86_64.rpm
oracleasm-2.6.18-194.el5-2.0.5-1.el5.x86_64.rpm

***Check the standard packages

rpm -q binutils compat-db compat-libstdc++-33 control-center gcc gcc-c++ glibc glibc-common gnome-libs libstdc++ libstdc++-devel make pdksh sysstat xscreensaver setarch glibc-devel libaio libaio-devel compat-gcc-34 compat-gcc-34-c++ libXp openmotif22

rpm -ivh libXp-1.0.0-8.1.el5.i386.rpm
rpm -ivh compat-db-4.2.52-5.1.x86_64.rpm
rpm -ivh kernel-headers-2.6.18-194.el5.x86_64.rpm
rpm -ivh glibc-headers-2.5-49.x86_64.rpm
rpm -ivh glibc-devel-2.5-49.x86_64.rpm
rpm -ivh libgomp-4.4.0-6.el5.x86_64.rpm
rpm -ivh gcc-4.1.2-48.el5.x86_64.rpm
rpm -ivh libstdc++-devel-4.1.2-48.el5.x86_64.rpm
rpm -ivh gcc-c++-4.1.2-48.el5.x86_64.rpm
rpm -ivh pdksh-5.2.14-36.el5.x86_64.rpm
rpm -ivh sysstat-7.0.2-3.el5.x86_64.rpm
rpm -ivh libaio-devel-0.3.106-5.x86_64.rpm
rpm -ivh libstdc++44-devel-4.4.0-6.el5.x86_64.rpm
rpm -ivh compat-gcc-34-3.4.6-4.x86_64.rpm
rpm -ivh libXp-1.0.0-8.1.el5.x86_64.rpm
rpm -ivh openmotif22-2.2.3-18.x86_64.rpm
rpm -ivh compat-libstdc++-33-3.2.3-61.x86_64.rpm
rpm -ivh compat-gcc-34-c++-3.4.6-4.x86_64.rpm
rpm -ivh oracleasm-support-2.1.7-1.el5.x86_64.rpm
rpm -ivh oracleasm-2.6.18-194.el5-2.0.5-1.el5.x86_64.rpm
rpm -ivh oracleasmlib-2.0.4-1.el5.x86_64.rpm
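To see at a glance which packages are still missing after the installs above, the rpm -q check can be wrapped in a loop. This is a sketch under stated assumptions: missing_packages is a hypothetical function name, RPM_QUERY is a hypothetical override for dry runs, and it relies only on rpm -q returning a non-zero status for a package that is not installed.

```shell
#!/bin/bash
# missing_packages NAME...: print every package that `rpm -q` does not find.
# RPM_QUERY is a hypothetical hook so the loop can be tried without rpm.
missing_packages() {
    local pkg
    for pkg in "$@"; do
        ${RPM_QUERY:-rpm -q} "$pkg" >/dev/null 2>&1 || echo "$pkg"
    done
}

# Example with some of the packages from the list above:
# missing_packages binutils compat-db gcc gcc-c++ libaio libaio-devel libXp openmotif22
```

An empty result means every listed package is installed.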

//////////////////////////////////////////////////////////////////////////////////////////////////////////

***Set user resource limits

All Oracle processes run as the oracle user, so the amount of system resources that user may consume has to be defined.

vi /etc/security/limits.conf

#ftp hard nproc 0
#@student - maxlogins 4

# End of file

oracle soft memlock 5242880
oracle hard memlock 5242880
oracle soft nproc 2047
oracle hard nproc 16384
oracle soft nofile 65536
oracle hard nofile 65536

***View the resource limits

[root@rac1 src]# su - oracle
[oracle@rac1 ~]$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
max nice (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 14400
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 65536
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
max rt priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 2047
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited

//////////////////////////////////////////////////////////////////////////////////////////////////////////

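***Configure the hangcheck-timer module

This is a kernel-level I/O fencing module provided by Linux. It monitors the state of the running Linux kernel; if Linux hangs for a long time, the module reboots the system automatically. The module runs in kernel space and is not affected by system load.

The module takes two parameters:

hangcheck_tick: how often the check runs (30 seconds in this setup)
hangcheck_margin: the maximum tolerated delay (180 seconds in this setup)

The hangcheck-timer module checks the kernel periodically according to hangcheck_tick. As long as the interval between two checks is less than hangcheck_tick + hangcheck_margin, the kernel is considered healthy; otherwise the system is considered abnormal and the module reboots it automatically.

CRS itself has one more parameter: MissCount.

These three parameters affect RAC reconfiguration. If the heartbeat between nodes is lost, Clusterware must make sure that the failed node really is dead before it reconfigures the cluster.

The serious case: a temporary load spike on one node causes the heartbeat to be lost and the other nodes start reconfiguring, but the node has not rebooted (it is not dead). This can corrupt the database.

MissCount must therefore be greater than the sum hangcheck_tick + hangcheck_margin. This guarantees that by the time the surviving nodes start reconfiguring, the failed node has already been rebooted by the hangcheck-timer module.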
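The constraint above is simple arithmetic and can be checked with a few lines of shell. This is only a sketch; misscount_ok is a hypothetical function name and the three numbers are passed in by hand.

```shell
#!/bin/bash
# misscount_ok TICK MARGIN MISSCOUNT
# Succeeds only if MISSCOUNT > TICK + MARGIN, the rule stated above.
# (misscount_ok is a hypothetical helper name)
misscount_ok() {
    local tick=$1 margin=$2 misscount=$3
    [ "$misscount" -gt $((tick + margin)) ]
}

# With hangcheck_tick=30 and hangcheck_margin=180 the sum is 210 seconds,
# so MissCount has to be at least 211:
# misscount_ok 30 180 211 && echo "MissCount is large enough"
```

On a running 10.2 cluster the current value can usually be read with crsctl get css misscount; treat that as an assumption to confirm against your release.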

[root@rac1 etc]# find /lib/modules/ -name hangcheck-timer.ko
/lib/modules/2.6.18-8.el5/kernel/drivers/char/hangcheck-timer.ko
[root@rac1 etc]# modprobe -v hangcheck-timer
insmod /lib/modules/2.6.18-8.el5/kernel/drivers/char/hangcheck-timer.ko hangcheck_tick=30 hangcheck_margin=180

***Load the module automatically at boot. Add the following to /etc/rc.d/rc.local

echo (the rest of this command is missing from this copy of the notes)

[root@rac-02 ~]# vi /etc/modprobe.conf

alias scsi_hostadapter mptbase
alias scsi_hostadapter1 mptspi
alias eth0 pcnet32
alias eth1 pcnet32
options hangcheck-timer hangcheck_tick=30 hangcheck_margin=180

***Configure the module parameters

[root@rac-02 ~]# modprobe hangcheck-timer

[root@rac-02 ~]# grep Hangcheck /var/log/messages |tail -2

Jun 19 04:14:13 rac-02 kernel: Hangcheck: starting hangcheck timer 0.9.0 (tick is 30 seconds, margin is 180 seconds).

Jun 19 04:14:13 rac-02 kernel: Hangcheck: Using get_cycles().

The parameters were clearly wrong at first; reloading the module solved the problem. Perform the same steps on both nodes.

//////////////////////////////////////////////////////////////////////////////////////////////////////////


***Configure raw devices (AS5)

(The udev rules that belong here, four ACTION== lines and four KERNEL== lines, were lost in this copy of the notes.)

[root@rac2 ~]# vi /etc/rc.local

#!/bin/sh
#
# This script will be executed *after* all the other init scripts.
# You can put your own initialization stuff in here if you don't
# want to do the full Sys V style init stuff.

touch /var/lock/subsys/local
modprobe -v hangcheck-timer
partprobe
raw /dev/raw/raw1 /dev/sdb1
raw /dev/raw/raw2 /dev/sdb2
raw /dev/raw/raw3 /dev/sde1
raw /dev/raw/raw4 /dev/sde2
chown -R oracle:dba /dev/raw/
chmod 660 /dev/raw/raw1
chmod 660 /dev/raw/raw2
chmod 660 /dev/raw/raw3
chmod 660 /dev/raw/raw4

***If the machine has not been rebooted, bind the devices manually first

[root@rac2 ~]# partprobe
raw /dev/raw/raw1 /dev/sdb1
raw /dev/raw/raw2 /dev/sdb2
raw /dev/raw/raw3 /dev/sde1
raw /dev/raw/raw4 /dev/sde2

[root@rac2 ~]# ll /dev/raw
total 0
crw------- 1 root root 162, 1 Jun 17 13:22 raw1
crw------- 1 root root 162, 2 Jun 17 13:22 raw2

Run on the other node:

chkconfig --list rawdevices

service rawdevices restart

//////////////////////////////////////////////////////////////////////////////////////////////////////////

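***Create the ASM disks

There are several ways to create ASM disks. This guide uses the ASMLib method, which requires the ASMLib RPM packages installed earlier.

Create the crs and database directories under /u01/oracle/product and change the owner to oracle:dba:

mkdir /u01/oracle/product/crs -p
mkdir /u01/oracle/product/database -p
chown -R oracle:dba /u01/

***Before creating the ASM disks, first create the underlying disk partitions

/dev/sdc /dev/sdc1 /dev/sdd /dev/sdd1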

[root@rac1 ~]# fdisk /dev/sdc

Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
Building a new DOS disklabel. Changes will remain in memory only,
until you decide to write them. After that, of course, the previous
content won't be recoverable.

Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-391, default 1):
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-391, default 391):
Using default value 391

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.

***Configure the ASM library driver

[root@rac1 ~]# /etc/init.d/oracleasm configure
Configuring the Oracle ASM library driver.

This will configure the on-boot properties of the Oracle ASM library
driver. The following questions will determine whether the driver is
loaded on boot and what permissions it will have. The current values
will be shown in brackets ('[]'). Hitting <ENTER> without typing an
answer will keep that current value. Ctrl-C will abort.

Default user to own the driver interface []: oracle
Default group to own the driver interface []: dba
Start Oracle ASM library driver on boot (y/n) [n]: y
Fix permissions of Oracle ASM disks on boot (y/n) [y]: y
Writing Oracle ASM library driver configuration: [ OK ]
Creating /dev/oracleasm mount point: [ OK ]
Loading module "oracleasm": [ OK ]
Mounting ASMlib driver filesystem: [ OK ]
Scanning system for ASM disks: [ OK ]

***Create the ASM disks with the configured driver

/etc/init.d/oracleasm createdisk VOL1 /dev/sdc1
/etc/init.d/oracleasm createdisk VOL2 /dev/sdc2
/etc/init.d/oracleasm createdisk VOL3 /dev/sdd1
/etc/init.d/oracleasm createdisk VOL4 /dev/sdd2

[root@rac1 ~]# /etc/init.d/oracleasm createdisk VOL1 /dev/sdc1
Marking disk "/dev/sdc1" as an ASM disk: [ OK ]
[root@rac1 ~]# /etc/init.d/oracleasm createdisk VOL2 /dev/sdc2
Marking disk "/dev/sdc2" as an ASM disk: [ OK ]

***Two ASM disks have now been created on RAC1.

oracleasm listdisks

***Configure the ASM driver on RAC2; be sure to use the full path

[root@rac2 ~]# /etc/init.d/oracleasm enable
Writing Oracle ASM library driver configuration: [ OK ]
Creating /dev/oracleasm mount point: [ OK ]
Loading module "oracleasm": [ OK ]
Mounting ASMlib driver filesystem: [ OK ]
Scanning system for ASM disks: [ OK ]
[root@rac2 ~]# /etc/init.d/oracleasm status
Checking if ASM is loaded: [ OK ]
Checking if /dev/oracleasm is mounted: [ OK ]
[root@rac2 ~]# /etc/init.d/oracleasm listdisks
VOL1
VOL2

[root@rac2 ~]# /etc/init.d/oracleasm configure
Configuring the Oracle ASM library driver.

This will configure the on-boot properties of the Oracle ASM library
driver. The following questions will determine whether the driver is
loaded on boot and what permissions it will have. The current values
will be shown in brackets ('[]'). Hitting <ENTER> without typing an
answer will keep that current value. Ctrl-C will abort.

Default user to own the driver interface []: oracle
Default group to own the driver interface []: dba
Start Oracle ASM library driver on boot (y/n) [y]: y
Fix permissions of Oracle ASM disks on boot (y/n) [y]: y
Writing Oracle ASM library driver configuration: [ OK ]
Scanning system for ASM disks: [ OK ]

//////////////////////////////////////////////////////////////////////////////////////////////////////////

***Report a lower OS version to the installer (the real release is Red Hat Enterprise Linux Server release 5.5 (Tikanga))

[root@rac-01 ~]# vi /etc/redhat-release

Red Hat Enterprise Linux Server release 4 (Tikanga)

//////////////////////////////////////////////////////////////////////////////////////////////////////////

******Install the Oracle Clusterware software

***Unpack the installation media

gzip -d 10201_clusterware_linux_x86_64.cpio.gz
cpio -idmv < 10201_clusterware_linux_x86_64.cpio

***After unpacking, delete the archive to save disk space

rm -f

//////////////////////////////////////////////////////////////////////////////////////////////////////////

***Create the crs path on both nodes. It matches the environment variable set earlier.

export CRS_HOME=$ORACLE_BASE/product/crs

[oracle@rac1 crs]$ pwd
/u01/oracle/product/crs
[oracle@rac1 crs]$

//////////////////////////////////////////////////////////////////////////////////////////////////////////

***Xstart settings

/usr/bin/xterm -ls -display $DISPLAY

//////////////////////////////////////////////////////////////////////////////////////////////////////////

su - oracle
cd clusterware
./runInstaller

//////////////////////////////////////////////////////////////////////////////////////////////////////////

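***Work around the bug

After Clusterware is installed, and before running the two root scripts, do the following.

[oracle@rac2 bin]$ vi srvctl

Comment out the following two lines (the file is in the /crs/bin directory):

#LD_ASSUME_KERNEL=2.4.19
#export LD_ASSUME_KERNEL

[oracle@rac2 bin]$ vi vipca

#if [ \
#then
# LD_ASSUME_KERNEL=2.4.19
# export LD_ASSUME_KERNEL
#fi
#End workaround

***Run the /crs/bin/vipca script to configure and register the VIPs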

//////////////////////////////////////////////////////////////////////////////////////////////////////////

After the VIPs are registered, test the connections:

ssh rac1 date
ssh rac2 date
ssh rac1-priv date
ssh rac2-priv date
ssh rac1-vip date
ssh rac2-vip date

//////////////////////////////////////////////////////////////////////////////////////////////////////////

[root@rac1 ~]# mount|grep asm

oracleasmfs on /dev/oracleasm type oracleasmfs (rw)

//////////////////////////////////////////////////////////////////////////////////////////////////////////

vi /etc/sysconfig/rawdevices

/dev/raw/raw1 /dev/sdb1
/dev/raw/raw2 /dev/sdb2
/dev/raw/raw3 /dev/sde1
/dev/raw/raw4 /dev/sde2

//////////////////////////////////////////////////////////////////////////////////////////////////////////


//////////////////////////////////////////////////////////////////////////////////////////////////////////

Solution to issue #3:

To work around issue #3 (vipca failing on non-routable VIP IP ranges, either when run manually or during root.sh): if the OUI window is still open, click OK and it will create the "oifcfg" information. cluvfy will then fail because vipca did not complete successfully. Skip the rest of this note, run vipca manually, then return to the installer, and cluvfy will succeed. Otherwise you may configure the interfaces for RAC manually using the oifcfg command as root, as in the following example (from any node):

/bin # ./oifcfg setif -global eth0/172.16.89.0:public
/bin # ./oifcfg setif -global eth1/192.168.0.0:cluster_interconnect
/bin # ./oifcfg getif
eth0 192.168.1.0 global public
eth1 10.10.10.0 global cluster_interconnect

Then run vipca manually to add the nodeapps resources.

The details are recorded in Oracle note 414163.1.

For the environment in this guide:

/bin # ./oifcfg setif -global eth0/192.168.80.0:public

/bin # ./oifcfg setif -global eth1/10.10.11.0:cluster_interconnect
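The value passed to oifcfg setif is the network address of the interface, not the host address. A minimal sketch of how to derive it from an IP address and netmask with plain bash arithmetic follows; network_addr is a hypothetical helper name.

```shell
#!/bin/bash
# network_addr IP NETMASK: print the network address (bitwise AND per octet).
# (network_addr is a hypothetical helper; dotted-decimal IPv4 input is assumed)
network_addr() {
    local IFS=.
    local -a ip mask
    read -r -a ip <<< "$1"
    read -r -a mask <<< "$2"
    echo "$(( ip[0] & mask[0] )).$(( ip[1] & mask[1] )).$(( ip[2] & mask[2] )).$(( ip[3] & mask[3] ))"
}

# For the public and private addresses used in this guide:
# network_addr 192.168.80.81 255.255.255.0   # prints 192.168.80.0
# network_addr 10.10.11.1 255.255.255.0      # prints 10.10.11.0
```

These two results are the subnets used in the oifcfg setif commands above.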

//////////////////////////////////////////////////////////////////////////////////////////////////////////

*************Maintenance**************

//////////////////////////////////////////////////////////////////////////////////////////////////////////

***View the raw devices

ll /dev/raw

***View the ASM disks

/etc/init.d/oracleasm listdisks

When RAC starts, crs and the related processes are started automatically:

[root@rac1 init.d]# ls -l /etc/init.d/init.*
-r-xr-xr-x 1 root root 1951 Feb 26 22:38 /etc/init.d/init.crs
-r-xr-xr-x 1 root root 4714 Feb 26 22:38 /etc/init.d/init.crsd
-r-xr-xr-x 1 root root 35394 Feb 26 22:38 /etc/init.d/init.cssd
-r-xr-xr-x 1 root root 3190 Feb 26 22:38 /etc/init.d/init.evmd

***Check the crs status:

[root@rac1 bin]# ./crs_stat -t
[oracle@rac1 ~]$ crs_stat -t

Name           Type         Target  State    Host
------------------------------------------------------------
ora.kofu.db    application  ONLINE  UNKNOWN  rac1
ora....kofu.cs application  ONLINE  UNKNOWN  rac1
ora....fu1.srv application  ONLINE  UNKNOWN  rac2
ora....fu2.srv application  ONLINE  UNKNOWN  rac1
ora....u1.inst application  ONLINE  OFFLINE
ora....u2.inst application  ONLINE  OFFLINE
ora....SM2.asm application  ONLINE  UNKNOWN  rac1
ora....C1.lsnr application  ONLINE  UNKNOWN  rac1
ora.rac1.gsd   application  ONLINE  UNKNOWN  rac1
ora.rac1.ons   application  ONLINE  UNKNOWN  rac1
ora.rac1.vip   application  ONLINE  ONLINE   rac1
ora....SM1.asm application  ONLINE  UNKNOWN  rac2
ora....C2.lsnr application  ONLINE  UNKNOWN  rac2
ora.rac2.gsd   application  ONLINE  UNKNOWN  rac2
ora.rac2.ons   application  ONLINE  UNKNOWN  rac2
ora.rac2.vip   application  ONLINE  ONLINE   rac2

***Solution:

1. Use crs_stat to view the full information for every resource:

[root@rac2 bin]# ./crs_stat

2. Resources that are OFFLINE or UNKNOWN can be stopped first and then started again:

crs_stop -f ora.comb.comb.comb1.srv
crs_stop -f ora.comb.comb.comb2.srv
crs_stop -f ora.comb.comb.cs
crs_stop -f ora.comb.comb1.inst
crs_stop -f ora.comb.comb2.inst
crs_stop -f ora.comb.db
crs_stop -f ora.rac1.ASM1.asm
crs_stop -f ora.rac1.LISTENER_RAC1.lsnr
crs_stop -f ora.rac1.gsd
crs_stop -f ora.rac1.ons
crs_stop -f ora.rac1.vip
crs_stop -f ora.rac2.ASM2.asm
crs_stop -f ora.rac2.LISTENER_RAC2.lsnr
crs_stop -f ora.rac2.gsd
crs_stop -f ora.rac2.ons
crs_stop -f ora.rac2.vip

crs_start ora.comb.comb.comb1.srv
crs_start ora.comb.comb.comb2.srv
crs_start ora.comb.comb.cs
crs_start ora.comb.comb1.inst
crs_start ora.comb.comb2.inst
crs_start ora.comb.db
crs_start ora.rac1.ASM1.asm
crs_start ora.rac1.LISTENER_RAC1.lsnr
crs_start ora.rac1.gsd
crs_start ora.rac1.ons
crs_start ora.rac1.vip
crs_start ora.rac2.ASM2.asm
crs_start ora.rac2.LISTENER_RAC2.lsnr
crs_start ora.rac2.gsd
crs_start ora.rac2.ons
crs_start ora.rac2.vip

3. If crs_stop cannot stop a resource or crs_start cannot start it, there are two ways to resolve this. The first is to use crs_stop -f to shut down the services whose state is UNKNOWN and then start all services with crs_start -f (adding the -f option). Do this on both nodes.

4. The second is to switch to the root user, stop crs with /etc/init.d/init.crs stop, and then start it again with /etc/init.d/init.crs start. Once crs is running it starts its whole set of services automatically. This method also has to be carried out on both nodes.

5. All related resources can be stopped and started with a single command:

[root@rac2 bin]# ./crs_stop -all

[root@rac2 bin]# ./crs_start -all
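Rather than typing a crs_stop and crs_start line for every resource, the full resource names can be taken from the plain crs_stat output (crs_stat -t truncates them). The sketch below assumes the NAME= / STATE= layout that crs_stat prints without options; not_online is a hypothetical function name.

```shell
#!/bin/bash
# not_online: read plain `crs_stat` output on stdin and print the NAME of
# every resource whose STATE line does not start with ONLINE.
# (hypothetical helper; assumes NAME=... precedes STATE=... for each resource)
not_online() {
    awk -F= '
        $1 == "NAME"  { name = $2 }
        $1 == "STATE" && $2 !~ /^ONLINE/ { print name }'
}

# Possible use, restarting only the resources that are OFFLINE or UNKNOWN:
# crs_stat | not_online | while read -r res; do
#     crs_stop -f "$res"
#     crs_start "$res"
# done
```

Resources depend on one another (for example, an instance depends on ASM), so the order in which the loop restarts them may still need manual adjustment.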

***Check the listener

netstat -an|grep 1521

Start the listener: lsnrctl start
Check the listener: lsnrctl status
Stop the listener: lsnrctl stop

***Check the system status

/u01/oracle/product/crs/bin/crs_stat -t -v

//////////////////////////////////////////////////////////////////////////////////////////////////////////

Instance rac1 has been started on rac1:

select instance_name from v$instance;

//////////////////////////////////////////////////////////////////////////////////////////////////////////

OEM installation

emca -config dbcontrol db -repos recreate

The OEM garbled-character problem

Start the service:

emctl start dbconsole

First stop the OEM dbconsole service:

emctl stop dbconsole

*Fix the garbled characters

Back up the font.properties files:

cp $ORACLE_HOME/jdk/jre/lib/font.properties $ORACLE_HOME/jdk/jre/lib/font.properties.bak
cp $ORACLE_HOME/jre/1.4.2/lib/font.properties $ORACLE_HOME/jre/1.4.2/lib/font.properties.bak
cp $ORACLE_HOME/javavm/lib/ojvmfonts/font.properties $ORACLE_HOME/javavm/lib/ojvmfonts/font.properties.bak

Then replace font.properties with font.properties.zh_CN.Redhat:

cp $ORACLE_HOME/jdk/jre/lib/font.properties.zh_CN.Redhat $ORACLE_HOME/jdk/jre/lib/font.properties
cp $ORACLE_HOME/jre/1.4.2/lib/font.properties.zh_CN.Redhat $ORACLE_HOME/jre/1.4.2/lib/font.properties
cp $ORACLE_HOME/javavm/lib/ojvmfonts/font.properties.zh_CN.Redhat $ORACLE_HOME/javavm/lib/ojvmfonts/font.properties

After the files are overwritten, open each font.properties file and look at the last line:

vi $ORACLE_HOME/jdk/jre/lib/font.properties
vi $ORACLE_HOME/jre/1.4.2/lib/font.properties
vi $ORACLE_HOME/javavm/lib/ojvmfonts/font.properties

filename.-misc-zysong18030-medium-r-normal--*-%d-*-*-c-*-iso10646-1=/usr/share/fonts/zh_CN/TrueType/zysong.ttf

The font file /usr/share/fonts/zh_CN/TrueType/zysong.ttf does not exist at all. On some systems it is enough to create a link to a font file that does exist, but on my system the characters were still garbled after creating the link. I therefore had to change the last line of the replaced font.properties file in all three directories to the following:

filename.-misc-zysong18030-medium-r-normal--*-%d-*-*-c-*-iso10646-1=/usr/share/fonts/chinese/TrueType/uming.ttf

This change only works if the font file exists on the system:

[oracle@rhel5 lib]$ ls /usr/share/fonts/chinese/TrueType/
fonts.dir fonts.scale ukai.ttf uming.ttf

Look for the Chinese font file that matches your own system.

***If no such font exists, install the Chinese font packages manually

fonts-chinese-3.02-12.el5.noarch.rpm
m17n-db-common-cjk-1.3.3-48.el5.noarch.rpm
m17n-db-chinese-1.3.3-48.el5.noarch.rpm
fonts-ISO8859-2-1.0-17.1.noarch.rpm
fonts-ISO8859-2-75dpi-1.0-17.1.noarch.rpm
fonts-ISO8859-2-100dpi-1.0-17.1.noarch.rpm

Next, delete the OEM cache files. The Chinese image cache is normally in $ORACLE_HOME/oc4j/j2ee/oc4j_applications/applications/em/em/cabo/images/cache/zhs:

rm $ORACLE_HOME/oc4j/j2ee/oc4j_applications/applications/em/em/cabo/images/cache/zhs/*

Restart OEM:

emctl start dbconsole

//////////////////////////////////////////////////////////////////////////////////////////////////////////

RAC verification


是 -rf

//////////////////////////////////////////////////////////////////////////////////////////////////////////


//////////////////////////////////////////////////////////////////////////////////////////////////////////

When installing Oracle 10g on 64-bit RHEL 5, the following error appears:

Preparing to launch Oracle Universal Installer from /tmp/OraInstall2009-05-05_03-41-43PM. Please wait ...
[oracle@iparkdb database]$ Exception in thread "main" java.lang.UnsatisfiedLinkError: /tmp/OraInstall2009-05-05_03-41-43PM/jre/1.4.2/lib/i386/libawt.so: libXp.so.6: wrong ELF class: ELFCLASS64
at java.lang.ClassLoader$NativeLibrary.load(Native Method)
at java.lang.ClassLoader.loadLibrary0(Unknown Source)
at java.lang.ClassLoader.loadLibrary(Unknown Source)
at java.lang.Runtime.loadLibrary0(Unknown Source)
at java.lang.System.loadLibrary(Unknown Source)
at sun.security.action.LoadLibraryAction.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at sun.awt.NativeLibLoader.loadLibraries(Unknown Source)
at sun.awt.DebugHelper.<clinit>(Unknown Source)
at java.awt.Component.<clinit>(Unknown Source)

The following resolves the problem:

rpm -ivh libXp-1.0.0-8.1.el5.i386.rpm

Although both the operating system and Oracle are 64-bit, the 32-bit libXp package still has to be installed, because the installer's bundled JRE is 32-bit (note the i386 path in the error).

In addition, the LD_LIBRARY_PATH environment variable should be set to /lib:/usr/lib.

//////////////////////////////////////////////////////////////////////////////////////////////////////////

Uninstall

rm /etc/oracle/*

rm -f /etc/init.d/init.cssd
rm -f /etc/init.d/init.crs
rm -f /etc/init.d/init.crsd
rm -f /etc/init.d/init.evmd
rm -f /etc/rc2.d/K96init.crs
rm -f /etc/rc2.d/S96init.crs
rm -f /etc/rc3.d/K96init.crs
rm -f /etc/rc3.d/S96init.crs
rm -f /etc/rc5.d/K96init.crs
rm -f /etc/rc5.d/S96init.crs
rm -Rf /etc/oracle/scls_scr
rm -f /etc/inittab.crs

////////////////////////////////////////////////////////////////////////////////////////////////////////

CRS-1006 and CRS-0215

====The installation log is very important: it shows the cause of the error

[oracle@rac1 /]$ cat /u01/oracle/product/10.2.0/crs_1/log/solaris10b/racg/ora.rac1.vip.log

Oracle Database 10g CRS Release 10.2.0.1.0 Production Copyright 1996, 2005 Oracle. All rights reserved.

2008-05-10 23:16:10.407: [ RACG][3086899968] [15730][3086899968][ora.rac1.vip]: checkIf: Default gateway is not defined (host=rac1)
Interface eth0 checked failed (host=rac1)
checkIf: Default gateway is not defined (host=rac1)
Interface eth1 checked failed (host=rac1)
Invalid parameters, or failed to bring up VIP (host=rac1)

2008-05-10 23:16:10.408: [ RACG][3086899968] [15730][3086899968][ora.rac1.vip]: clsrcexecut: env ORACLE_CONFIG_HOME=/u01/app/oracle/product/10.2.0/crs_1

2008-05-10 23:16:10.408: [ RACG][3086899968] [15730][3086899968][ora.rac1.vip]: clsrcexecut: cmd = /u01/app/oracle/product/10.2.0/crs_1/bin/racgeut -e _USR_ORA_DEBUG=0 54 /u01/app/oracle/product/10.2.0/crs_1/bin/racgvip start rac1

Solution: set the default gateways 192.168.18.1 and 10.10.10.1

vi /etc/defaultrouter ====add 192.168.4.254===

A reboot is required for the change to take effect. To check it:

netstat -r

net 10.10.10.0 gateway 10.10.10.254 metric 1 passive
net 192.168.4.0 gateway 192.168.4.254 metric 1 passive

When I ran vipca while installing CRS, I hit the CRS-1006 and CRS-0215 errors. I had no Metalink account at the time, so solving this cost me a week and a half: I tried different network settings and installed CRS more than ten times. In desperation I reinstalled the operating system, disabled the DNS and NIS name services and used only the hosts file, and the error disappeared. After the installation was finished my Metalink account arrived, and it has the following solution for this error on 10.1.0.4:

10.1.0.4 and above introduced a parameter FAIL_WHEN_DEFAULTGW_NOT_FOUND in the $ORA_CRS_HOME/bin/racgvip script to address this problem.

The following steps will fix the VIP starting problem for the above-mentioned scenario.

- stop nodeapps

- As root, edit the script $ORA_CRS_HOME/bin/racgvip and set the variable FAIL_WHEN_DEFAULTGW_NOT_FOUND=0.

- start nodeapps and you should see the resources ONLINE

You may proceed with netca and dbca to create a RAC database after this.

The error is caused by the gateway not being found, but my gateway settings should have been fine, unless the check also looks at the gateway of the private address, which is rather strange.

////////////////////////////////////////////////////////////////////////////////////////////////////

CRS-0184: Cannot communicate with the CRS daemon (2008-10-10 08:47)

Hi all,

Recently, when we rebooted our RAC system, we observed that the cluster crs was not able to start. When we tried to issue crs_stat -t we got the above error.

# crs_stat -t

CRS-0184: Cannot communicate with the CRS daemon.

There are many causes for this error, which can be determined by checking the crs logs in the location $ORA_CRS_HOME/crs/log. Here it is necessary to look at crs.log and ocssd.log. My crs.log contained the following:

2007-04-11 14:37:34.020: [ COMMCRS][1693]clsc_connect: (100f78610) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_drdb1_crs))

2007-04-11 14:37:34.020: [ CSSCLNT][1]clsssInitNative: connect failed, rc 9

2007-04-11 14:37:34.021: [ CRSRTI][1] CSS is not ready. Received status 3 from CSS. Waiting for good status ..

2007-04-11 14:37:35.740: [ COMMCRS][1695]clsc_connect: (100f78610) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_drdb1_crs))

2007-04-11 14:37:35.740: [ CSSCLNT][1]clsssInitNative: connect failed, rc 9

When we checked ocssd.log, it contained the following:

[ CSSD]2007-04-11 12:53:56.211 [6] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (0//dev/rdsk/c5t8d0s5)

[ CSSD]2007-04-11 12:53:56.211 [10] >TRACE: clssnmvKillBlockThread: spawned for disk 1 (/dev/rdsk/c5t9d0s5) initial sleep interval (1000)ms

[ CSSD]2007-04-11 12:53:56.211 [11] >TRACE: clssnmvKillBlockThread: spawned for disk 0 (/dev/rdsk/c5t8d0s5) initial sleep interval (1000)ms

[ CSSD]2007-04-11 12:53:56.228 [1] >TRACE: clssnmFatalInit: fatal mode enabled

[ CSSD]2007-04-11 12:53:56.269 [13] >TRACE: clssnmconnect: connecting to node 1, flags 0x0001, connector 1

[ CSSD]2007-04-11 12:53:56.274 [13] >TRACE: clssnmClusterListener: Listening on (ADDRESS=(PROTOCOL=tcp)(HOST=drdb1-priv)(PORT=49895))

[ CSSD]2007-04-11 12:53:56.274 [13] >TRACE: clssnmconnect: connecting to node 0, flags 0x0000, connector 1

[ CSSD]2007-04-11 12:53:56.279 [14] >TRACE: clsclisten: Permission denied for (ADDRESS=(PROTOCOL=ipc)(KEY=Oracle_CSS_LclLstnr_crs_1))

[ CSSD]2007-04-11 12:53:56.279 [14] >ERROR: clssgmclientlsnr: listening failed for (ADDRESS=(PROTOCOL=ipc)(KEY=Oracle_CSS_LclLstnr_crs_1)) (3)

[ CSSD]2007-04-11 12:53:56.279 [14] >TRACE: clssgmclientlsnr: listening on (ADDRESS=(PROTOCOL=ipc)(KEY=Oracle_CSS_LclLstnr_crs_1))

[ CSSD]2007-04-11 12:53:56.279 [14] >TRACE: clssgmclientlsnr: listening on (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_drdb1_crs))

[ CSSD]2007-04-11 13:07:36.516 >USER: Oracle Database 10g CSS Release 10.2.0.2.0 Production Copyright 1996, 2004 Oracle. All rights reserved.

[ clsdmt]Fail to listen to (ADDRESS=(PROTOCOL=ipc)(KEY=drdb1DBG_CSSD))

[ CSSD]2007-04-11 13:07:36.516 >USER: CSS daemon log for node drdb1, number 1, in cluster crs

[ clsdmt]Terminating clsdm listening thread

[ CSSD]2007-04-11 13:07:36.536 [1] >TRACE: clssscmain: local-only set to false

[ CSSD]2007-04-11 13:07:36.545 [1] >TRACE: clssnmReadNodeInfo: added node 1 (drdb1) to cluster

[ CSSD]2007-04-11 13:07:36.588 [5] >TRACE: clssnm_skgxnmon: skgxn init failed, rc 1

[ CSSD]2007-04-11 13:07:36.588 [1] >TRACE: clssnm_skgxnonline: Using vacuous skgxn monitor

From the above logs we realised that the listener of the CSS daemon was unable to start.

The reason is that each time the server reboots, it creates a socket in the /tmp/.oracle or /var/tmp/.oracle directory.

Sockets that already exist in this .oracle directory can neither be reused nor are they deleted automatically.

The problem was therefore solved by deleting all the files inside the .oracle directory in /var/tmp or /tmp.

After that, crs started and the cluster came up.
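The cleanup described above can be sketched as a small shell function. The function name clean_crs_sockets is hypothetical, and the sketch assumes that CRS has been stopped first so that no live socket is removed; it empties each directory but keeps the directory itself.

```shell
#!/bin/bash
# clean_crs_sockets DIR...: remove everything inside each existing DIR,
# keeping the directory itself. (hypothetical helper; stop CRS before use)
clean_crs_sockets() {
    local dir
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        # -mindepth 1 spares the directory itself; -delete removes its contents.
        find "$dir" -mindepth 1 -delete
        echo "cleaned $dir"
    done
}

# Possible sequence, run as root on the affected node:
# /etc/init.d/init.crs stop
# clean_crs_sockets /var/tmp/.oracle /tmp/.oracle
# /etc/init.d/init.crs start
```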
