Tuesday, February 27, 2007

Oracle 10g RAC on Solaris 10

I had been longing to do this. Maybe because of a lack of resources, or of learning, I couldn't get around to it all these days. Honestly, I was searching for a way to get shared storage working on my laptop.

I came across iSCSI (OpenFiler) and immediately tried to get Solaris working with it. I downloaded the pre-installed version of Solaris 10 from Sun's site. It's pretty good, and I bet you would keep gazing at the look and feel of the Java console. Pretty decent.

I configured two Solaris boxes and did some work using iscsiadm on Solaris. Before this, I added the Solaris nodes' IPs in the OpenFiler console to allow them to access the shared devices.
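For what it's worth, the iscsiadm side of it on each Solaris node goes roughly along these lines (the OpenFiler address below is just a placeholder for your own):

    # point the Solaris initiator at the OpenFiler target and enable SendTargets discovery
    iscsiadm add discovery-address 192.168.1.50:3260
    iscsiadm modify discovery --sendtargets enable
    # rebuild the device tree so the iSCSI LUNs show up under /dev/dsk and /dev/rdsk
    devfsadm -i iscsi
    # confirm the targets are visible
    iscsiadm list target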

And bingo!!! I could see the partitions on the Solaris nodes.

Fortunately or unfortunately, there was a power outage at my home, and both Solaris boxes, which were on an external disk, crashed. As a result, they didn't come up cleanly and dropped into maintenance mode. I wasted an hour running fsck, rebuilding the boot archive, and clearing the boot-archive service with svcadm. They are breathing fine now. I am going to install 10g RAC tomorrow. I need to configure raw devices for the OCR and voting disk, and use the ASM option for the database files.
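If you run into the same maintenance-mode mess, the recovery is roughly this (the root device name is just an example; use your own):

    # from the maintenance shell: check the filesystem, rebuild the boot archive,
    # then clear the failed boot-archive service so SMF carries on booting
    fsck -y /dev/rdsk/c0t0d0s0
    bootadm update-archive
    svcadm clear system/boot-archive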


My second stint, with Oracle 10g R2 RAC on RHEL4, was a bit tiresome, but a lot of learning nonetheless. When I installed RAC 10g on RHEL4 earlier, I had learnt how to configure the shared storage in VMware, and which options allow both nodes to share a common disk: set disk.locking="FALSE" in both of the VMs' .vmx files. There are other parameters for the diskLib cache and so on, but disk.locking alone worked fine for me. I used GSX 3.2.1 on Windows 2003 Server, with OCFS2 for the OCR, voting disk, and datafiles.
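For reference, the relevant lines in each .vmx look something like this (disk.locking is the one that mattered for me; the cache setting is one of the diskLib parameters mentioned above, shown only for completeness):

    disk.locking = "FALSE"
    diskLib.dataCacheMaxSize = "0"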
I had tried the option of having a shared ORACLE_HOME and CRS_HOME and was successful. The problem I faced while configuring the shared disks was that the UUID of the shared disk (the one you can see in ocfs2console) did not match between the two nodes, and as a result the nodes were treating the shared disks as individual disks. I realised this later when I ran mounted.ocfs2 -d and it showed two different things on the two nodes.
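The check itself is trivial; run it on both nodes and compare the UUID reported for the shared device (the device name is just an example):

    # both nodes should report the same UUID for /dev/sdb1
    mounted.ocfs2 -d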
So what I learnt was to stop, offline, and unload the OCFS2 cluster stack using /etc/init.d/o2cb, and then start it again. Also, o2cb seems tied to the IP and the MAC address of the NIC cards, so once o2cb is started there should not be any change in the IP or MAC address; otherwise, you get the 'Transport endpoint is not connected' error.
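The reset sequence I mean is roughly this, run on the affected node ('ocfs2' is the default cluster name; yours may differ):

    # tear the cluster stack down on the node
    /etc/init.d/o2cb offline ocfs2
    /etc/init.d/o2cb unload
    # bring it back once the IP/MAC situation is stable
    /etc/init.d/o2cb load
    /etc/init.d/o2cb online ocfs2
    # check where things stand
    /etc/init.d/o2cb status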
First, start the o2cb cluster on the first node by selecting 'configure nodes'. If you don't get the node names and their IPs on the second node when you select 'configure nodes', copy cluster.conf from /etc/ocfs2 on the first node to the second node; that should resolve the issue. Then start the cluster stack on the second node. The OCFS2 heartbeat will show offline at this point, as we haven't yet formatted and mounted the shared disks. Once we format and mount the shared disk on the first node (see the sketch below), we should see the /dev/sdx device on the second node as well. Don't forget to match the UUIDs by running mounted.ocfs2 -d.
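Putting the format-and-mount step in one place, it goes roughly like this (device, label, slot count, and mount point are just examples):

    # format once, on the first node only
    mkfs.ocfs2 -b 4k -C 32k -N 2 -L ocr_vote /dev/sdb1
    # then mount it on both nodes; datavolume/nointr were the options generally
    # recommended for Oracle files on OCFS2 at the time
    mkdir -p /u02/oradata
    mount -t ocfs2 -o datavolume,nointr /dev/sdb1 /u02/oradata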
