Thursday, May 11, 2006

VCS Howto

http://pzi.net/VCS-HOWTO
http://www.blacksheepnetworks.com/security/resources/veritas-cluster-server-install-notes.html

Veritas Cluster Server (VCS) HOWTO:
===================================
$Id: VCS-HOWTO,v 1.25 2002/09/30 20:05:38 pzi Exp $
Copyright (c) Peter Ziobrzynski, pzi@pzi.net

Contents:
---------
- Copyright
- Thanks
- Overview
- VCS installation
- Summary of cluster queries
- Summary of basic cluster operations
- Changing cluster configuration
- Configuration of a test group and test resource type
- Installation of a test agent for a test resource
- Home directories service group configuration
- NIS service groups configuration
- Time synchronization services
- ClearCase configuration

Copyright:
----------

This HOWTO document may be reproduced and distributed in whole or in
part, in any medium physical or electronic, as long as this copyright
notice is retained on all copies. Commercial redistribution is allowed
and encouraged; however, the author would like to be notified of any
such distributions.

All translations, derivative works, or aggregate works incorporating
any part of this HOWTO document must be covered under this copyright notice.
That is, you may not produce a derivative work from a HOWTO and impose
additional restrictions on its distribution. Exceptions to these rules
may be granted under certain conditions.

In short, I wish to promote dissemination of this information through
as many channels as possible. However, I do wish to retain copyright
on this HOWTO document, and would like to be notified of any plans to
redistribute the HOWTO.

If you have questions, please contact me: Peter Ziobrzynski <pzi@pzi.net>

Thanks:
-------

- Veritas Software provided numerous consultations that led to the
cluster configuration described in this document.

- Parts of this document are based on the work I have done for
Kestrel Solutions, Inc.

- Basis Inc. for assisting in selecting hardware components and help
in resolving installation problems.

- comp.sys.sun.admin Usenet community.

Overview:
---------

This document describes the configuration of a two-or-more-node Solaris
cluster using Veritas Cluster Server VCS 1.1.2 on Solaris 2.6. A number
of standard UNIX services are configured as cluster service groups:
user home directories, NIS naming services, time synchronization (NTP).
In addition a popular Software Configuration Management system from
Rational - ClearCase - is configured as a set of cluster service groups.

Configuring the various software components as cluster service groups
provides high availability of the applications as well as load balancing
(fail-over or switch-over). Besides that, a cluster configuration lets you
free a node in the network for upgrades, testing or reconfiguration and
then bring it back into service very quickly with little or no additional
work.

- Cluster topology.

The cluster topology used here is called clustered pairs. Two nodes
share a disk on a single shared SCSI bus. Both computers and the disk
are connected in a chain on the SCSI bus. Either differential or fast-wide
SCSI buses can be used. Each SCSI host adapter in each node is assigned
a different SCSI id (called the initiator id) so both computers can coexist
on the same bus.

+ Two Node Cluster with single disk:

Node  Node
 |    /
 |   /
 |  /
 | /
 |/
Disk

A single shared disk can be replaced by two disks, each on its own private
SCSI bus connecting both cluster nodes. This allows for disk mirroring
across disks and SCSI buses.
Note: the disk here can be understood as a disk array or a disk pack.

+ Two Node Cluster with disk pair:

Node    Node
 |\    /|
 | \  / |
 |  \/  |
 |  /\  |
 | /  \ |
 |/    \|
Disk  Disk

A single pair can be extended by chaining an additional node and connecting
it to the pair with additional disks and SCSI buses. One or more nodes
can be added, creating an N-node configuration. The perimeter nodes have
two SCSI host adapters while the middle nodes have four.

+ Three Node Cluster:

Node    Node    Node
 |\    /|    |\    /|
 | \  / |    | \  / |
 |  \/  |    |  \/  |
 |  /\  |    |  /\  |
 | /  \ |    | /  \ |
 |/    \|    |/    \|
Disk  Disk  Disk  Disk

+ N Node Cluster:

Node    Node    Node    Node
 |\    /|    |\    /|\    /|
 | \  / |    | \  / | \  / |
 |  \/  |    |  \/  |...\/ |
 |  /\  |    |  /\  |  /\  |
 | /  \ |    | /  \ | /  \ |
 |/    \|    |/    \|/    \|
Disk  Disk  Disk  Disk   Disk

- Disk configuration.

Management of the shared storage of the cluster is performed with the
Veritas Volume Manager (VM). The VM controls which disks on the shared
SCSI bus are assigned to (owned by) which system. In Volume Manager, disks
are grouped into disk groups, and a group as a whole can be assigned for
access from one of the systems. The assignment can be changed quickly,
allowing for cluster fail/switch-over. Disks that compose a disk group can
be scattered across multiple disk enclosures (packs, arrays) and SCSI
buses. We used this feature to create disk groups that contain VM
volumes mirrored across devices. Below is a schematic of 3 cluster
nodes connected by SCSI buses to 4 disk packs (we use Sun MultiPacks).

Node 0 is connected to Disk Pack 0 and Node 1 on one SCSI bus and
to Disk Pack 1 and Node 1 on a second SCSI bus. Disks 0 in Packs 0 and 1
are put into Disk group 0, disks 1 in Packs 0 and 1 are put into Disk
group 1, and so on for all the disks in the Packs. We have four 9 GB disks
in each Pack, so we have 4 Disk groups between Nodes 0 and 1 that can be
switched from one node to the other.

Node 1 interfaces with Node 2 in the same way as with Node 0.
Two disk packs, Pack 2 and Pack 3, are configured with disk groups 4, 5,
6 and 7 as shared storage between those nodes. We have a total of 8 disk
groups in the cluster. Groups 0-3 can be visible from Node 0 or 1 and
groups 4-7 from Node 1 or 2. Node 1 is in a privileged position and
can access all disk groups.

Node 0            Node 1            Node 2   ...  Node N
------    -----------------------   ------
 |\              /|      |\              /|
 | \            / |      | \            / |
 |  \          /  |      |  \          /  |
 |   \        /   |      |   \        /   |
 |    \      /    |      |    \      /    |
 |     \    /     |      |     \    /     |
 |      \  /      |      |      \  /      |
 |      /  \      |      |      /  \      |
 |     /    \     |      |     /    \     |
 |    /      \    |      |    /      \    |
 |   /        \   |      |   /        \   |
 |  /          \  |      |  /          \  |
 | /            \ |      | /            \ |
 |/              \|      |/              \|
Disk Pack 0:    Disk Pack 1:    Disk Pack 2:    Disk Pack 3:

Disk group 0:                   Disk group 4:
+----------------------+        +----------------------+
| Disk0          Disk0 |        | Disk0          Disk0 |
+----------------------+        +----------------------+
Disk group 1:                   Disk group 5:
+----------------------+        +----------------------+
| Disk1          Disk1 |        | Disk1          Disk1 |
+----------------------+        +----------------------+
Disk group 2:                   Disk group 6:
+----------------------+        +----------------------+
| Disk2          Disk2 |        | Disk2          Disk2 |
+----------------------+        +----------------------+
Disk group 3:                   Disk group 7:
+----------------------+        +----------------------+
| Disk3          Disk3 |        | Disk3          Disk3 |
+----------------------+        +----------------------+
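With the disk groups laid out this way it is easy to check, from either
node, which groups that node currently owns. A minimal sketch using
standard Volume Manager commands; the cluster1-4 group names are the ones
used during the installation below and are an assumption at this point:

#!/bin/sh
# Show which of the shared disk groups are imported on this node and
# the state of their volumes (a group is visible only where imported).
vxdg list
for group in cluster1 cluster2 cluster3 cluster4;do
	vxprint -g $group -ht 2>/dev/null
done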

- Hardware details:

Below is a detailed listing of the hardware configuration of two
nodes. Sun part numbers are included so you can order it directly
from the Sun Store and put it on your Visa:

- E250:
+ Base: A26-AA
+ 2xCPU: X1194A
+ 2x256MB RAM: X7004A,
+ 4xUltraSCSI 9.1GB hard drive: X5234A
+ 100BaseT Fast/Wide UltraSCSI PCI adapter: X1032A
+ Quad Fastethernet controller PCI adapter: X1034A

- MultiPack:
+ 4x9.1GB 10000RPM disk
+ StorEdge MultiPack: SG-XDSK040C-36G

- Connections:

+ SCSI:
E250: E250:
X1032A-------SCSI----->Multipack<----SCSI---X1032A
X1032A-------SCSI----->Multipack<----SCSI---X1032A

+ VCS private LAN 0:
hme0----------Ethernet--->HUB<---Ethernet---hme0

+ VCS private LAN 1:
X1034A(qfe0)--Ethernet--->HUB<---Ethernet---X1034A(qfe0)

+ Cluster private LAN:
X1034A(qfe1)--Ethernet--->HUB<---Ethernet---X1034A(qfe1)

+ Public LAN:
X1034A(qfe2)--Ethernet--->HUB<---Ethernet---X1034A(qfe2)

Installation of VCS-1.1.2
----------------------------

Two systems are put into the cluster: foo_c and bar_c

- Set the scsi-initiator-id boot PROM environment variable to 5 on one
of the systems (say bar_c):

ok setenv scsi-initiator-id 5
ok boot -r
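Once the system is back up you can confirm the setting without dropping
back to the OBP prompt; a quick check, assuming the stock Solaris
eeprom(1M) command (note that the variable set this way applies to all
SCSI controllers in the box):

# eeprom scsi-initiator-id
scsi-initiator-id=5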

- Install Veritas Foundation Suite 3.0.1.

Follow Veritas manuals.

- Add entries to your c-shell environment:

set veritas = /opt/VRTSvmsa
setenv VMSAHOME $veritas
setenv MANPATH ${MANPATH}:$veritas/man
set path = ( $path $veritas/bin )

- Configure the ethernet connections to use hme0 and qfe0 as Cluster
private interconnects. Do not create /etc/hostname.{hme0,qfe0}.
Configure qfe2 as the public LAN network and qfe1 as Cluster main private
network. The configuration files on foo_c:

/etc/hosts:
127.0.0.1 localhost
# public network (192.168.0.0/16):
192.168.1.40 bar
192.168.1.51 foo
# Cluster private network (network address 10.2.0.0/16):
10.2.0.1 bar_c
10.2.0.3 foo_c loghost

/etc/hostname.qfe1:
foo_c

/etc/hostname.qfe2:
foo

The configuration files on bar_c:

/etc/hosts:
127.0.0.1 localhost
# Public network (192.168.0.0/16):
192.168.1.40 bar
192.168.1.51 foo
# Cluster private network (network address 10.2.0.0/16):
10.2.0.1 bar_c loghost
10.2.0.3 foo_c

/etc/hostname.qfe1:
bar_c

/etc/hostname.qfe2:
bar

- Configure at least two VM disk groups on the shared storage (MultiPacks),
working from one of the systems (e.g. foo_c):

+ Create cluster volume groups spanning both multipacks
using vxdiskadm '1. Add or initialize one or more disks':

cluster1: c1t1d0 c2t1d0
cluster2: c1t2d0 c2t2d0
...

Name the VM disks like this:

cluster1: cluster101 cluster102
cluster2: cluster201 cluster202
...

You can do it for 4 disk groups with this script:

#!/bin/sh
for group in 1 2 3 4;do
vxdisksetup -i c1t${group}d0
vxdisksetup -i c2t${group}d0
vxdg init cluster${group} cluster${group}01=c1t${group}d0
vxdg -g cluster${group} adddisk cluster${group}02=c2t${group}d0
done

+ Create volumes in each group mirrored across both multipacks.
You can do it with the script for 4 disk groups with this script:

#!/bin/sh
for group in 1 2 3 4;do
vxassist -b -g cluster${group} make vol01 8g layout=mirror \
	cluster${group}01 cluster${group}02
done

+ or do all diskgroup and volumes in one script:

#!/bin/sh
for group in 1 2 3 4;do
vxdisksetup -i c1t${group}d0
vxdisksetup -i c2t${group}d0
vxdg init cluster${group} cluster${group}01=c1t${group}d0
vxdg -g cluster${group} adddisk cluster${group}02=c2t${group}d0
vxassist -b -g cluster${group} make vol01 8g layout=mirror \
	cluster${group}01 cluster${group}02
done

+ Create veritas file systems on the volumes:

#!/bin/sh
for group in 1 2 3 4;do
mkfs -F vxfs /dev/vx/rdsk/cluster$group/vol01
done

+ Deport a group from one system: stop volume, deport a group:

# vxvol -g cluster2 stop vol01
# vxdg deport cluster2

+ Import a group and start its volume on the other system to
see if this works:

# vxdg import cluster2
# vxrecover -g cluster2 -sb

- With the shared storage configured it is important to know how to
manually move the volumes from one node of the cluster to the other.
I use a cmount command to do that. It is like an rc script with an
additional argument for the disk group.

To stop (deport) the group 1 on a node do:

# cmount 1 stop

To start (import) the group 1 on the other node do:

# cmount 1 start

The cmount script is as follows:

#!/bin/sh
set -x
group=$1
case $2 in
start)
vxdg import cluster$group
vxrecover -g cluster$group -sb
mount -F vxfs /dev/vx/dsk/cluster$group/vol01 /cluster$group
;;
stop)
umount /cluster$group
vxvol -g cluster$group stop vol01
vxdg deport cluster$group
;;
esac

- To remove all shared storage volumes and groups do:

#!/bin/sh
for group in 1 2 3 4; do
vxvol -g cluster$group stop vol01
vxdg destroy cluster$group
done

- Install VCS software:
(from install server on athena)

# cd /net/athena/export/arch/VCS-1.1.2/vcs_1_1_2a_solaris
# pkgadd -d . VRTScsga VRTSgab VRTSllt VRTSperl VRTSvcs VRTSvcswz clsp

+ correct the /etc/rc?.d scripts to be links:
If they are not symbolic links then it is hard to disable VCS
startup at boot. Once they are links, you can just rename /etc/init.d/vcs
to stop VCS from starting and stopping at boot.

cd /etc
rm rc0.d/K10vcs rc3.d/S99vcs
cd rc0.d
ln -s ../init.d/vcs K10vcs
cd ../rc3.d
ln -s ../init.d/vcs S99vcs

+ add -evacuate option to /etc/init.d/vcs:

This is optional but I find it important to switch over
all service groups from a node that is being shut down.
When I take a cluster node down I expect the rest of the
cluster to pick up the responsibility to run all services.
The default VCS does not do that. The only way to move a
group from one node to another is to crash the node or do a manual
switch-over using the hagrp command.

'stop')
$HASTOP -local -evacuate > /dev/null 2>&1
;;

- Add entry to your c-shell environment:

set vcs = /opt/VRTSvcs
setenv MANPATH ${MANPATH}:$vcs/man
set path = ( $vcs/bin $path )

- To remove the VCS software:
NOTE: required if demo installation fails.

# sh /opt/VRTSvcs/wizards/config/quick_start -b
# rsh bar_c 'sh /opt/VRTSvcs/wizards/config/quick_start -b'
# pkgrm VRTScsga VRTSgab VRTSllt VRTSperl VRTSvcs VRTSvcswz clsp
# rm -rf /etc/VRTSvcs /var/VRTSvcs
# init 6

- Configure /.rhosts on both nodes to allow each node transparent rsh
root access to the other:

/.rhosts:

foo_c
bar_c

- Run quick start script from one of the nodes:
NOTE: must run from /usr/openwin/bin/xterm - other xterms cause terminal
emulation problems

# /usr/openwin/bin/xterm &
# sh /opt/VRTSvcs/wizards/config/quick_start

Select hme0 and qfe0 network links for GAB and LLT connections.
The script will ask twice for the links interface names. Link 1 is hme0
and link2 is qfe0 for both foo_c and bar_c nodes.

You should see the heartbeat pings on the interconnection hubs.

The wizard creates LLT and GAB configuration files in /etc/llttab,
/etc/gabtab and llthosts on each system:

On foo_c:

/etc/llttab:

set-node foo_c
link hme0 /dev/hme:0
link qfe1 /dev/qfe:1
start

On bar_c:

/etc/llttab:

set-node bar_c
link hme0 /dev/hme:0
link qfe1 /dev/qfe:1
start

/etc/gabtab:

/sbin/gabconfig -c -n2

/etc/llthosts:

0 foo_c
1 bar_c

The LLT and GAB communication is started by rc scripts S70llt and S92gab
installed in /etc/rc2.d.

- Alternatively, the private interconnect can be configured by hand by
creating the above files.
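If you do create the files by hand, LLT and GAB can be brought up and torn
down without a reboot; a minimal sketch, assuming the standard lltconfig
and gabconfig utilities shipped with VCS:

+ start LLT and GAB by hand:

# lltconfig -c
# sh /etc/gabtab

+ tear them down again (e.g. before editing the files):

# gabconfig -U
# lltconfig -U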

- Check basic installation:

+ status of the gab:

# gabconfig -a

GAB Port Memberships
===============================================================
Port a gen 1e4c0001 membership 01
Port h gen dd080001 membership 01

+ status of the link:

# lltstat -n

LLT node information:
Node State Links
* 0 foo_c OPEN 2
1 bar_c OPEN 2

+ node parameters:

# hasys -display

- Set/update VCS super user password:

+ add root user:

# haconf -makerw
# hauser -add root
password:...
# haconf -dump -makero

+ change root password:

# haconf -makerw
# hauser -update root
password:...
# haconf -dump -makero

- Configure demo NFS service groups:

NOTE: You have to fix the VCS wizards first: the wizard perl scripts
have a bug that makes them core dump in the middle of filling out the
configuration forms. The solution is to provide a shell wrapper for one
binary and avoid running it with a specific set of parameters. Do the
following in VCS-1.1.2:

# cd /opt/VRTSvcs/bin
# mkdir tmp
# mv iou tmp
# cat << 'EOF' > iou
#!/bin/sh
echo "[$@]" >> /tmp/,.iou.log
case "$@" in
'-c 20 9 -g 2 2 3 -l 0 3') echo "skip bug" >> /tmp/,.iou.log ;;
*) /opt/VRTSvcs/bin/tmp/iou "$@" ;;
esac
EOF
# chmod 755 iou

+ Create NFS mount point directories on both systems:

# mkdir /export1 /export2

+ Run the wizard on foo_c node:

NOTE: must run from /usr/openwin/bin/xterm - other xterms cause
terminal emulation problems

# /usr/openwin/bin/xterm &
# sh /opt/VRTSvcs/wizards/services/quick_nfs

Select for groupx:
- public network device: qfe2
- group name: groupx
- IP: 192.168.1.53
- VM disk group: cluster1
- volume: vol01
- mount point: /export1
- options: rw
- file system: vxfs

Select for groupy:
- public network device: qfe2
- group name: groupy
- IP: 192.168.1.54
- VM disk group: cluster2
- volume: vol01
- mount point: /export2
- options: rw
- file system: vxfs

You should see: Congratulations!...

The /etc/VRTSvcs/conf/config directory should have main.cf and
types.cf files configured.

+ Reboot both systems:

# init 6

Summary of cluster queries:
----------------------------

- Cluster queries:

+ list cluster status summary:

# hastatus -summary

-- SYSTEM STATE
-- System State Frozen

A foo_c RUNNING 0
A bar_c RUNNING 0

-- GROUP STATE
-- Group        System     Probed    AutoDisabled    State

B  groupx       foo_c      Y         N               ONLINE
B  groupx       bar_c      Y         N               OFFLINE
B  groupy       foo_c      Y         N               OFFLINE
B  groupy       bar_c      Y         N               ONLINE

+ list cluster attributes:

# haclus -display
#Attribute Value
ClusterName my_vcs
CompareRSM 0
CounterInterval 5
DumpingMembership 0
Factor runque 5 memory 1 disk 10 cpu 25 network 5
GlobalCounter 16862
GroupLimit 200
LinkMonitoring 0
LoadSampling 0
LogSize 33554432
MajorVersion 1
MaxFactor runque 100 memory 10 disk 100 cpu 100 network 100
MinorVersion 10
PrintMsg 0
ReadOnly 1
ResourceLimit 5000
SourceFile ./main.cf
TypeLimit 100
UserNames root cDgqS68RlRP4k

- Resource queries:

+ list resources:

# hares -list
cluster1 foo_c
cluster1 bar_c
IP_192_168_1_53 foo_c
IP_192_168_1_53 bar_c
...

+ list resource dependencies:

# hares -dep
#Group Parent Child
groupx IP_192_168_1_53 groupx_qfe1
groupx IP_192_168_1_53 nfs_export1
groupx export1 cluster1_vol01
groupx nfs_export1 NFS_groupx_16
groupx nfs_export1 export1
groupx cluster1_vol01 cluster1
groupy IP_192_168_1_54 groupy_qfe1
groupy IP_192_168_1_54 nfs_export2
groupy export2 cluster2_vol01
groupy nfs_export2 NFS_groupy_16
groupy nfs_export2 export2
groupy cluster2_vol01 cluster2

+ list attributes of a resource:
# hares -display export1
#Resource Attribute System Value
export1 ConfidenceLevel foo_c 100
export1 ConfidenceLevel bar_c 0
export1 Probed foo_c 1
export1 Probed bar_c 1
export1 State foo_c ONLINE
export1 State bar_c OFFLINE
export1 ArgListValues foo_c /export1
/dev/vx/dsk/cluster1/vol01 vxfs rw ""
...

- Groups queries:

+ list groups:

# hagrp -list
groupx foo_c
groupx bar_c
groupy foo_c
groupy bar_c

+ list group resources:

# hagrp -resources groupx
cluster1
IP_192_168_1_53
export1
NFS_groupx_16
groupx_qfe1
nfs_export1
cluster1_vol01

+ list group dependencies:

# hagrp -dep groupx

+ list of group attributes:

# hagrp -display groupx
#Group Attribute System Value
groupx AutoFailOver global 1
groupx AutoStart global 1
groupx AutoStartList global foo_c
groupx FailOverPolicy global Priority
groupx Frozen global 0
groupx IntentOnline global 1
groupx ManualOps global 1
groupx OnlineRetryInterval global 0
groupx OnlineRetryLimit global 0
groupx Parallel global 0
groupx PreOnline global 0
groupx PrintTree global 1
groupx SourceFile global ./main.cf
groupx SystemList global foo_c 0 bar_c 1
groupx SystemZones global
groupx TFrozen global 0
groupx TriggerEvent global 1
groupx UserIntGlobal global 0
groupx UserStrGlobal global
groupx AutoDisabled foo_c 0
groupx AutoDisabled bar_c 0
groupx Enabled foo_c 1
groupx Enabled bar_c 1
groupx ProbesPending foo_c 0
groupx ProbesPending bar_c 0
groupx State foo_c |ONLINE|
groupx State bar_c |OFFLINE|
groupx UserIntLocal foo_c 0
groupx UserIntLocal bar_c 0
groupx UserStrLocal foo_c
groupx UserStrLocal bar_c

- Node queries:

+ list nodes in the cluster:

# hasys -list
foo_c
bar_c

+ list node attributes:

# hasys -display bar_c
#System Attribute Value
bar_c AgentsStopped 1
bar_c ConfigBlockCount 54
bar_c ConfigCheckSum 48400
bar_c ConfigDiskState CURRENT
bar_c ConfigFile /etc/VRTSvcs/conf/config
bar_c ConfigInfoCnt 0
bar_c ConfigModDate Wed Mar 29 13:46:19 2000
bar_c DiskHbDown
bar_c Frozen 0
bar_c GUIIPAddr
bar_c LinkHbDown
bar_c Load 0
bar_c LoadRaw runque 0 memory 0 disk 0 cpu 0 network 0
bar_c MajorVersion 1
bar_c MinorVersion 10
bar_c NodeId 1
bar_c OnGrpCnt 1
bar_c SourceFile ./main.cf
bar_c SysName bar_c
bar_c SysState RUNNING
bar_c TFrozen 0
bar_c UserInt 0
bar_c UserStr

- Resource types queries:

+ list resource types:
# hatype -list
CLARiiON
Disk
DiskGroup
ElifNone
FileNone
FileOnOff
FileOnOnly
IP
IPMultiNIC
Mount
MultiNICA
NFS
NIC
Phantom
Process
Proxy
ServiceGroupHB
Share
Volume

+ list all resources of a given type:
# hatype -resources DiskGroup
cluster1
cluster2

+ list attributes of the given type:
# hatype -display IP
#Type Attribute Value
IP AgentFailedOn
IP AgentReplyTimeout 130
IP AgentStartTimeout 60
IP ArgList Device Address NetMask Options ArpDelay IfconfigTwice
IP AttrChangedTimeout 60
IP CleanTimeout 60
IP CloseTimeout 60
IP ConfInterval 600
IP LogLevel error
IP MonitorIfOffline 1
IP MonitorInterval 60
IP MonitorTimeout 60
IP NameRule IP_ + resource.Address
IP NumThreads 10
IP OfflineTimeout 300
IP OnlineRetryLimit 0
IP OnlineTimeout 300
IP OnlineWaitLimit 2
IP OpenTimeout 60
IP Operations OnOff
IP RestartLimit 0
IP SourceFile ./types.cf
IP ToleranceLimit 0

- Agents queries:

+ list agents:
# haagent -list
CLARiiON
Disk
DiskGroup
ElifNone
FileNone
FileOnOff
FileOnOnly
IP
IPMultiNIC
Mount
MultiNICA
NFS
NIC
Phantom
Process
Proxy
ServiceGroupHB
Share
Volume

+ list status of an agent:
# haagent -display IP
#Agent Attribute Value
IP AgentFile
IP Faults 0
IP Running Yes
IP Started Yes

Summary of basic cluster operations:
------------------------------------

- Cluster Start/Stop:

+ stop VCS on all systems:
# hastop -all

+ stop VCS on bar_c and move all groups out:
# hastop -sys bar_c -evacuate

+ start VCS on local system:
# hastart

- Users:
+ add gui root user:
# haconf -makerw
# hauser -add root
# haconf -dump -makero
- Group:

+ group start, stop:
# hagrp -offline groupx -sys foo_c
# hagrp -online groupx -sys foo_c

+ switch a group to other system:
# hagrp -switch groupx -to bar_c

+ freeze a group:
# hagrp -freeze groupx

+ unfreeze a group:
# hagrp -unfreeze groupx

+ enable a group:
# hagrp -enable groupx

+ disable a group:
# hagrp -disable groupx

+ enable resources of a group:
# hagrp -enableresources groupx

+ disable resources of a group:
# hagrp -disableresources groupx

+ flush a group:
# hagrp -flush groupx -sys bar_c

- Node:

+ freeze node:
# hasys -freeze bar_c

+ thaw node:
# hasys -unfreeze bar_c

- Resources:

+ online a resource:
# hares -online IP_192_168_1_54 -sys bar_c

+ offline a resource:
# hares -offline IP_192_168_1_54 -sys bar_c

+ offline a resource and propagate to children:
# hares -offprop IP_192_168_1_54 -sys bar_c

+ probe a resource:
# hares -probe IP_192_168_1_54 -sys bar_c

+ clear faulted resource:
# hares -clear IP_192_168_1_54 -sys bar_c

- Agents:

+ start agent:
# haagent -start IP -sys bar_c

+ stop agent:
# haagent -stop IP -sys bar_c

- Reboot a node with evacuation of all service groups:
(groupy is running on bar_c)

# hastop -sys bar_c -evacuate
# init 6
# hagrp -switch groupy -to bar_c

Changing cluster configuration:
--------------------------------

You cannot edit the configuration files directly while the
cluster is running; that can be done only when the cluster is down.
The configuration files are in: /etc/VRTSvcs/conf/config

To change the configuration you can:

+ use hagui
+ stop the cluster (hastop), edit main.cf and types.cf directly,
regenerate main.cmd (hacf -generate .) and start the cluster (hastart)
+ use the following command line based procedure on running cluster

To change the cluster while it is running do this:

- Dump current cluster configuration to files and generate main.cmd file:

# haconf -dump
# hacf -generate .
# hacf -verify .

- Create new configuration directory:

# mkdir -p ../new

- Copy existing *.cf files in there:

# cp main.cf types.cf ../new

- Add new stuff to it:

# vi main.cf types.cf

- Regenerate the main.cmd file with low level commands:

# cd ../new
# hacf -generate .
# hacf -verify .

- Catch the diffs:

# diff ../config/main.cmd main.cmd > ,.cmd

- Prepend this to the top of the ,.cmd file to make the configuration writable:

# haconf -makerw

- Append this command to make the configuration read-only again:

# haconf -dump -makero

- Apply the diffs you need:

# sh -x ,.cmd
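The whole procedure can be wrapped in a small helper script; a minimal
sketch under the same assumptions as above (the ,.cmd file name, the new/
directory and running from /etc/VRTSvcs/conf/config are conventions from
this section - review ,.cmd before feeding it to sh):

#!/bin/sh
# Re-generate main.cmd, edit a copy of the configuration, and apply
# only the new low level commands to the running cluster.
set -x
cd /etc/VRTSvcs/conf/config
haconf -dump
hacf -generate .
hacf -verify .
mkdir -p ../new
cp main.cf types.cf ../new
${EDITOR:-vi} ../new/main.cf ../new/types.cf
( cd ../new && hacf -generate . && hacf -verify . )
# Lines present only in the new main.cmd are the commands to apply:
diff main.cmd ../new/main.cmd | sed -n 's/^> //p' > ,.cmd
( echo haconf -makerw; cat ,.cmd; echo haconf -dump -makero ) > ,.cmd.run
sh -x ,.cmd.run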

Cluster logging:
-----------------------------------------------------

VCS logs all activities into /var/VRTSvcs/log directory.
The most important log is the engine log engine.log_A.
Each agent also has its own log file.

The logging parameters can be displayed with halog command:

# halog -info
Log on hades_c:
path = /var/VRTSvcs/log/engine.log_A
maxsize = 33554432 bytes
tags = ABCDE
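When testing agents it helps to keep the engine log open in another
window, for example:

# tail -f /var/VRTSvcs/log/engine.log_A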

Configuration of a test group and test resource type:
=======================================================

To get comfortable with the cluster configuration it is useful to
create your own group that uses your own resource type. The example below
demonstrates the configuration of a "do nothing" group with one resource
of our own type.

- Add group test with one resource test. Add this to
/etc/VRTSvcs/conf/config/new/types.cf:

type Test (
str Tester
NameRule = resource.Name
int IntAttr
str StringAttr
str VectorAttr[]
str AssocAttr{}
static str ArgList[] = { IntAttr, StringAttr, VectorAttr, AssocAttr }
)

- Add this to /etc/VRTSvcs/conf/config/new/main.cf:

group test (
SystemList = { foo_c, bar_c }
AutoStartList = { foo_c }
)

Test test (
IntAttr = 100
StringAttr = "Testing 1 2 3"
VectorAttr = { one, two, three }
AssocAttr = { one = 1, two = 2 }
)

- Run the hacf -generate and diff as above. Edit it to get ,.cmd file:

haconf -makerw

hatype -add Test
hatype -modify Test SourceFile "./types.cf"
haattr -add Test Tester -string
hatype -modify Test NameRule "resource.Name"
haattr -add Test IntAttr -integer
haattr -add Test StringAttr -string
haattr -add Test VectorAttr -string -vector
haattr -add Test AssocAttr -string -assoc
hatype -modify Test ArgList IntAttr StringAttr VectorAttr AssocAttr
hatype -modify Test LogLevel error
hatype -modify Test MonitorIfOffline 1
hatype -modify Test AttrChangedTimeout 60
hatype -modify Test CloseTimeout 60
hatype -modify Test CleanTimeout 60
hatype -modify Test ConfInterval 600
hatype -modify Test MonitorInterval 60
hatype -modify Test MonitorTimeout 60
hatype -modify Test NumThreads 10
hatype -modify Test OfflineTimeout 300
hatype -modify Test OnlineRetryLimit 0
hatype -modify Test OnlineTimeout 300
hatype -modify Test OnlineWaitLimit 2
hatype -modify Test OpenTimeout 60
hatype -modify Test RestartLimit 0
hatype -modify Test ToleranceLimit 0
hatype -modify Test AgentStartTimeout 60
hatype -modify Test AgentReplyTimeout 130
hatype -modify Test Operations OnOff
haattr -default Test AutoStart 1
haattr -default Test Critical 1
haattr -default Test Enabled 1
haattr -default Test TriggerEvent 0
hagrp -add test
hagrp -modify test SystemList foo_c 0 bar_c 1
hagrp -modify test AutoStartList foo_c
hagrp -modify test SourceFile "./main.cf"
hares -add test Test test
hares -modify test Enabled 1
hares -modify test IntAttr 100
hares -modify test StringAttr "Testing 1 2 3"
hares -modify test VectorAttr one two three
hares -modify test AssocAttr one 1 two 2

haconf -dump -makero

- Feed it to sh:

# sh -x ,.cmd

- Both group test and resource Test should be added to the cluster

Installation of a test agent for a test resource:
-------------------------------------------------
This agent does not start or monitor any specific resource. It just
maintains its persistent state in ,.on file. This can be used as a
template for other agents that perform some real work.

- in /opt/VRTSvcs/bin create Test directory

# cd /opt/VRTSvcs/bin
# mkdir Test

- link in the precompiled agent binary used for script-implemented methods:

# cd Test
# ln -s ../ScriptAgent TestAgent

- create dummy agent scripts in /opt/VRTSvcs/bin/Test:
(make them executable - chmod 755 ...)

online:
#!/bin/sh
echo "`date` $0 $@" >> /opt/VRTSvcs/bin/Test/log
echo yes > /opt/VRTSvcs/bin/Test/,.on
offline:

#!/bin/sh
echo "`date` $0 $@" >> /opt/VRTSvcs/bin/Test/log
echo no > /opt/VRTSvcs/bin/Test/,.on
open:
#!/bin/sh
echo "`date` $0 $@" >> /opt/VRTSvcs/bin/Test/log
close:
#!/bin/sh
echo "`date` $0 $@" >> /opt/VRTSvcs/bin/Test/log
shutdown:
#!/bin/sh
echo "`date` $0 $@" >> /opt/VRTSvcs/bin/Test/log
clean:
#!/bin/sh
echo "`date` $0 $@" >> /opt/VRTSvcs/bin/Test/log
monitor:

#!/bin/sh
echo "`date` $0 $@" >> /opt/VRTSvcs/bin/Test/log
case "`cat /opt/VRTSvcs/bin/Test/,.on`" in
no) exit 100 ;;
*) exit 101 ;;
esac

- start the agent:

# haagent -start Test -sys foo_c

- distribute the agent code to other nodes:

# cd /opt/VRTSvcs/bin/
# rsync -av --rsync-path=/opt/pub/bin/rsync Test bar_c:/opt/VRTSvcs/bin

- start test group:

# hagrp -online test -sys foo_c

Note:

Distribution or synchronization of the agent code is very important for
cluster integrity. If the agents differ between cluster nodes,
unpredictable things can happen. I maintain a shell script in the
Veritas agent directory (/opt/VRTSvcs/bin) to distribute the code of all
agents I work on:

#!/bin/sh
set -x
mkdir -p /tmp/vcs
for dest in hades_c:/opt/VRTSvcs/bin /tmp/vcs;do
rsync -av --rsync-path=/opt/pub/bin/rsync --exclude=log \
	--exclude=,.on ,.sync CCViews CCVOBReg CCVOBMount ClearCase Test \
	CCRegistry NISMaster NISClient $dest
done
cd /tmp
tar cvf vcs.tar vcs

Home directories service group configuration:
=============================================

We configure home directories as a service group consisting of an IP address
and the directory containing all home directories.
Users can consistently connect (telnet, rsh, etc.) to the logical IP and expect
to find their home directories local on the system.
The directory that we use is the source directory for the automounter,
which mounts the individual directories as needed under /home. We put the
directories in the /cluster3/homes directory and mount it with /etc/auto_home:

* localhost:/cluster3/homes/&
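For completeness, the corresponding stock Solaris master map entry on
every node (reload the maps with automount after editing):

/etc/auto_master:

/home	auto_home

# automount -v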

We assume that all required user accounts are configured on all cluster
nodes. This can be done by rdist-ing the /etc/passwd and group files by
hand or by using NIS. We used both methods; the NIS one is described below.
All resources of the group are standard VCS-supplied ones, so we do not
have to implement any agent code for additional resources.

Group 'homes' has the following resources (types in brackets):

homes:

IP_homes (IP)
   |              |
   v              v
share_homes (Share)      qfe2_homes (NIC)
   |
   v
mount_homes (Mount)
   |
   v
volume_homes (Volume)
   |
   v
dgroup_homes (DiskGroup)

The service group definition for this group is as follows (main.cf):

group homes (
SystemList = { bar_c, foo_c }
AutoStartList = { bar_c }
)

DiskGroup dgroup_homes (
DiskGroup = cluster3
)

IP IP_homes (
Device = qfe2
Address = "192.168.1.55"
)

Mount mount_homes (
MountPoint = "/cluster3"
BlockDevice = "/dev/vx/dsk/cluster3/vol01"
FSType = vxfs
MountOpt = rw
)

Share share_homes (
PathName = "/cluster3"
Options = "-o rw=localhost"
OnlineNFSRestart = 0
OfflineNFSRestart = 0
)

NIC qfe2_homes (
Device = qfe2
NetworkType = ether
)

Volume volume_homes (
Volume = vol01
DiskGroup = cluster3
)

IP_homes requires qfe2_homes
IP_homes requires share_homes
mount_homes requires volume_homes
share_homes requires mount_homes
volume_homes requires dgroup_homes

NIS service group configuration:
=================================

NIS is configured as two service groups: one for the NIS Master server
and the other for the NIS clients. The server is configured to store all
NIS source data files on the shared storage in /cluster1/yp directory.
We copied the following files to /cluster1/yp:

auto_home ethers mail.aliases netmasks protocols services
auto_master group netgroup networks publickey timezone
bootparams hosts netid passwd rpc

The Makefile in /var/yp required some changes to reflect the non-default
(not /etc) location of the source files. Also, the use of sendmail to
generate new aliases while the NIS service was in the process of starting
up would hang, so we had to remove it from the standard map generation.
The limitation here is that new mail aliases can only be added when
NIS is completely up and running. The following diffs have been applied to
/var/yp/Makefile:

*** Makefile- Sun May 14 23:33:33 2000
--- Makefile.var.yp Fri May 5 07:38:02 2000
***************
*** 13,19 ****
# resolver for hosts not in the current domain.
#B=-b
B=
! DIR =/etc
#
# If the passwd, shadow and/or adjunct files used by rpc.yppasswdd
# live in directory other than /etc then you'll need to change the
--- 13,19 ----
# resolver for hosts not in the current domain.
#B=-b
B=
! DIR =/cluster1/yp
#
# If the passwd, shadow and/or adjunct files used by rpc.yppasswdd
# live in directory other than /etc then you'll need to change the
***************
*** 21,30 ****
# DO NOT indent the line, however, since /etc/init.d/yp attempts
# to find it with grep "^PWDIR" ...
#
! PWDIR =/etc
DOM = `domainname`
NOPUSH = ""
! ALIASES = /etc/mail/aliases
YPDIR=/usr/lib/netsvc/yp
SBINDIR=/usr/sbin
YPDBDIR=/var/yp
--- 21,30 ----
# DO NOT indent the line, however, since /etc/init.d/yp attempts
# to find it with grep "^PWDIR" ...
#
! PWDIR =/cluster1/yp
DOM = `domainname`
NOPUSH = ""
! ALIASES = /cluster1/yp/mail.aliases
YPDIR=/usr/lib/netsvc/yp
SBINDIR=/usr/sbin
YPDBDIR=/var/yp
***************
*** 45,51 ****
else $(MAKE) $(MFLAGS) -k all NOPUSH=$(NOPUSH);fi

all: passwd group hosts ethers networks rpc services protocols \
! netgroup bootparams aliases publickey netid netmasks c2secure timezone auto.master auto.home

c2secure:
--- 45,51 ----
else $(MAKE) $(MFLAGS) -k all NOPUSH=$(NOPUSH);fi

all: passwd group hosts ethers networks rpc services protocols \
! netgroup bootparams publickey netid netmasks timezone auto.master auto.home

c2secure:
***************
*** 187,193 ****
@cp $(ALIASES) $(YPDBDIR)/$(DOM)/mail.aliases;
@/usr/lib/sendmail -bi -oA$(YPDBDIR)/$(DOM)/mail.aliases;
$(MKALIAS) $(YPDBDIR)/$(DOM)/mail.aliases $(YPDBDIR)/$(DOM)/mail.byaddr;
- @rm $(YPDBDIR)/$(DOM)/mail.aliases;
@touch aliases.time;
@echo "updated aliases";
@if [ ! $(NOPUSH) ]; then $(YPPUSH) -d $(DOM) mail.aliases; fi
--- 187,192 ----

We need only one master server so only one instance of this service group
is allowed on the cluster (group is not parallel).

Group 'nis_master' has the following resources (types in brackets):

nis_master:

master_NIS (NISMaster)
|
v
mount_NIS (Mount)
|
v
volume_NIS (Volume)
|
v
dgroup_NIS (DiskGroup)

The client service group is designed to configure the domain name on the
node and then start ypbind in broadcast mode. We need the NIS client to
run on every node, so it is designed as a parallel group. Clients cannot
function without the Master server running somewhere on the cluster network,
so we include a dependency between the client and master service groups as
'online global'.
The client group unconfigures NIS completely from the node when it is
shut down. This may seem radical but it is required for consistency
with the startup.

To allow the master group to come online we also include automatic
configuration of the domain name in this group.

The nis_master group is defined as follows (main.cf):

group nis_master (
SystemList = { bar_c, foo_c }
AutoStartList = { bar_c }
)

DiskGroup dgroup_NIS (
DiskGroup = cluster1
)

Mount mount_NIS (
MountPoint = "/cluster1"
BlockDevice = "/dev/vx/dsk/cluster1/vol01"
FSType = vxfs
MountOpt = rw
)

NISMaster master_NIS (
Source = "/cluster1/yp"
Domain = mydomain
)

Volume volume_NIS (
Volume = vol01
DiskGroup = cluster1
)

master_NIS requires mount_NIS
mount_NIS requires volume_NIS
volume_NIS requires dgroup_NIS

Group 'nis_client' has the following resource (types in brackets):

nis_client:

client_NIS (NISClient)

The nis_client group is defined as follows (main.cf):

group nis_client (
SystemList = { bar_c, foo_c }
Parallel = 1
AutoStartList = { bar_c, foo_c }
)

NISClient client_NIS (
Domain = mydomain
)

requires group nis_master online global

Both the master and the client service groups use custom-built resources and
corresponding agent code. The resources are defined as follows (in types.cf):

type NISClient (
static str ArgList[] = { Domain }
NameRule = resource.Name
str Domain
)

type NISMaster (
static str ArgList[] = { Source, Domain }
NameRule = resource.Name
str Source
str Domain
)

The agent code for NISMaster:

- online:
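The original listing of this script is not included here; below is a
minimal sketch of what such an online agent could look like, written in
the same style as the other agents in this document. The use of ypstart
and the exact map-building step are assumptions, not the author's code;
ArgList = { Source, Domain } as defined above.

#!/bin/sh
# NISMaster online (sketch, not the original agent): configure this node
# as the NIS master serving maps built from sources on shared storage.
shift
source=$1
shift
domain=$1

# The shared source directory must be mounted by the Mount resource below us:
if [ ! -d $source ];then
	exit 1
fi

# Nothing to do if the maps are already being served:
if [ `ps -ef | grep ypserv | grep -v grep | wc -l` -gt 0 ];then
	exit 0
fi

domainname $domain
echo $domain > /etc/defaultdomain
mkdir -p /var/yp/$domain

# Start the NIS daemons and rebuild the maps; the customized Makefile
# above already points DIR and PWDIR at the shared source directory:
/usr/lib/netsvc/yp/ypstart
( cd /var/yp && make )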

Time synchronization services (xntp):
======================================
,,,

ClearCase configuration:
=========================

ClearCase is a client-server system providing so-called multi-version file
system (MVFS) functionality. The mvfs file systems are used to track the
contents of files, directories and symbolic links in versions of so-called
elements. Elements are stored in VOBs (mvfs objects) and are looked at
through View objects. Information about the objects, like their location,
permissions, etc., is stored in a distributed database called the registry.
For ClearCase to be configured on a system, the Registry, VOB and View
server processes have to be started. VOBs and Views store their data in
regular directory trees. The VOB and View storage directories can be located
on the shared storage of the cluster, and cluster service groups can be
configured to mount it and start the needed server processes.

We configured ClearCase as a set of four service groups: ccregistry, views,
vobs_group_mnt, vobs_group_reg. Each node in the cluster must have a
standard ClearCase installed and configured into the same region. All
views and VOBs need to be configured to use their storage directories
on the cluster shared storage. In our case we used /cluster2/viewstore
as the view storage directory and /cluster4/vobstore as the VOB
storage directory. All VOBs must be public.

The licensing of ClearCase in the cluster is resolved by configuring
each node in the cluster as the license server for itself. This is done
by transferring all your licenses from one node to the other while still
keeping the other as a license server. Since this may be a stretch of the
licensing agreement you may want to use a separate license server outside
of the cluster.

Groups and resources:
---------------------

All four service groups (ccregistry, views, vobs_group_mnt, vobs_group_reg)
perform a specialized clearcase function that can be isolated to a single
node of the cluster. All nodes of the cluster run the basic clearcase
installation and this is performed by the resource type named ClearCase.
Each of the service groups includes this resource.
The ccregistry service group transfers clearcase master registry server
to a particular cluster node. This is performed by the specialized resource
of type CCRegistry.
The Views are handled by a service group that includes a specialized resource
of type CCViews. This resource registers and starts all views sharing the
same storage directory.
The VOB functionality is handled by two separate service groups: one that
registers a VOB on a cluster node and another that mounts it on the same
or another cluster node. The VOB registration is performed by the specialized
resource of type CCVOBReg and the VOB mounting by the resource of type
CCVOBMount.
Detailed description of each service group and their resources follows:

ccregistry service group:
--------------------------

The ccregistry group is responsible for configuring a cluster node as
a primary registry server and if necessary unconfiguring it from any
other nodes on the cluster. All nodes in the cluster are configured as
registry backup servers that store a copy of the primary registry data.
The /var/adm/atria/rgy/rgy_hosts.conf has to be configured with all
cluster nodes as backups:

/var/adm/atria/rgy/rgy_hosts.conf:

foo_c
foo_c bar_c

This group uses two custom resources: ccregistry_primary and
ccase_ccregistry. The ccase_ccregistry is of type ClearCase and is
responsible for starting basic ClearCase services. No views or VOBs are
configured at this point. Other service groups will do that later. The
ccregistry_primary resource is changing configuration files to configure
a host as primary registry server.

ccregistry:

ccregistry_primary (CCRegistry)
|
|
v
ccase_ccregistry (ClearCase)

The ccregistry group is defined as follows (main.cf):

group ccregistry (
SystemList = { bar_c, foo_c }
AutoStartList = { bar_c }
)

CCRegistry ccregistry_primary (
)

ClearCase ccase_ccregistry (
)

ccregistry_primary requires ccase_ccregistry

The custom resource for the group CCRegistry and ClearCase are defined
as (in types.cf):

type CCRegistry (
static str ArgList[] = { }
NameRule = resource.Name
)

type ClearCase (
static str ArgList[] = { }
NameRule = resource.Name
static str Operations = OnOnly
)

ClearCase resource implementation:

The ClearCase 'online' agent is responsible for configuring the registry
configuration files and starting the ClearCase servers. Configuration is done
in such a way that ClearCase runs only one registry master server in the
cluster. The /var/adm/atria/rgy/rgy_hosts.conf file is configured to use
the current node as the master only if no ClearCase is running on other
cluster nodes. If a ClearCase service group is detected in the on-line state
anywhere in the cluster, the current node is started as a registry backup
server. It is assumed that the other node has claimed the master registry
status already. The master status file /var/adm/atria/rgy/rgy_svr.conf
is updated to indicate the current node's status. After the registry
configuration files are prepared, the standard ClearCase startup script
/usr/atria/etc/atria_start is run.

ClearCase/online agent:

> #!/bin/sh
> # ClearCase online:
> if [ -r /view/.specdev ];then
> # Running:
> exit 0
> fi
>
> this=`hostname`
> primary=`head -1 /var/adm/atria/rgy/rgy_hosts.conf`
> backups=`tail +2 /var/adm/atria/rgy/rgy_hosts.conf`
> master=`cat /var/adm/atria/rgy/rgy_svr.conf`
>
> online=
> for host in $backups;do
> if [ "$host" != "$this" ];then
> stat=`hagrp -state ccregistry -sys $host | grep ONLINE | wc -l`
> if [ $stat -gt 0 ];then
> online=$host
> break
> fi
> fi
> done
>
> if [ "$this" = "$primary" -a "$online" != "" ];then
> # Erase master status:
> cp /dev/null /var/adm/atria/rgy/rgy_svr.conf
>
> # Create configuration file with the on-line host as the master:
> cat <<-EOF > /var/adm/atria/rgy/rgy_hosts.conf
> $online
> $backups
> EOF
> fi
>
> # Normal ClearCase startup:
> /bin/sh -x /usr/atria/etc/atria_start start

The ClearCase resource is configured not to use 'offline' but only
'shutdown' agent. The 'offline' could be dangerous for clearcase if
VCS missed the monitor detection and decided to restart it.
The 'shutdown' ClearCase agent stops all clearcase servers using standard
clearcase shutdown script (/usr/atria/etc/atria_start).

ClearCase/shutdown:

> #!/bin/sh
> # ClearCase shutdown:
> # Normal ClearCase shutdown:
> /bin/sh -x /usr/atria/etc/atria_start stop

ClearCase/monitor:

> #!/bin/sh
> # ClearCase monitor:
> if [ -r /view/.specdev ];then
> # Running:
> exit 110
> else
> # Not running:
> exit 100
> fi

CCRegistry resource implementation:

This resource verifies whether the current node is configured as the registry
master server and, if not, performs a switch-over from the other node to this
one. The complication here was the sequence of events: when switching
over a group from one node to the other, the VCS engine first offlines it
on the node that is on-line and then brings it on-line on the one that
was offline.
With a registry switch-over the sequence of events has to be reversed: the
destination node must first perform the rgy_switchover and transfer the
master status to itself while the master is up, and only then can the old
master be shut down and configured as a backup.

For this sequence to be implemented, the offline agent (which is
called first on the current primary) does not perform the switch-over
but only marks the intent to transfer the master status by creating
a marker file ,.offline in the agent directory. The monitor script
that is called next on the current master reports the primary as
being down if it finds the ,.offline marker.

CCRegistry/offline:

> #!/bin/sh
> # CCRegistry offline:
> if [ `ps -ef | grep albd_server | grep -v grep | wc -l` -eq 0 ];then
> # No albd_server - no vobs:
> exit 1
> fi
>
> this=`hostname`
> primary=`head -1 /var/adm/atria/rgy/rgy_hosts.conf`
> backups=`tail +2 /var/adm/atria/rgy/rgy_hosts.conf`
>
> if [ "$this" != "$primary" ];then
> # This host is not configured as primary - do nothing:
> exit 1
> fi
>
> touch /opt/VRTSvcs/bin/CCRegistry/,.offline
>
> exit 0

Next the online agent on the target node performs the actual switch-over
using rgy_switchover.
Then the monitor script, on its following iteration on the old primary, sees
that the primary was transferred by looking into the rgy_hosts.conf file and
removes the ,.offline marker.

> #!/bin/sh
> # CCRegistry online:
> if [ `ps -ef | grep albd_server | grep -v grep | wc -l` -eq 0 ];then
> # No albd_server - no vobs:
> exit 1
> fi
>
> this=`hostname`
> primary=`head -1 /var/adm/atria/rgy/rgy_hosts.conf`
> backups=`tail +2 /var/adm/atria/rgy/rgy_hosts.conf`
>
> if [ "$this" = "$primary" ];then
> # This host is already configured as primary - do nothing:
> exit 1
> fi
>
> # Check if this host if on the backup list - if not do nothing.
> # Only backups can become primary.
>
> continue=0
> for backup in $backups; do
> if [ "$backup" = "$this" ];then
> continue=1
> fi
> done
> if [ $continue -eq 0 ];then
> exit 1
> fi
>
>
> # Check if backup data exists. If not do nothing:
> if [ ! -d /var/adm/atria/rgy/backup ];then
> exit 1
> fi
>
> # Check how old the backup data is. If it is too old do nothing:
> # ,,,
>
>
> # Put the backup on line and switch hosts. Change from $primary to $this host.
> # Assign last $backup host in backup list as backup:
>
> /usr/atria/etc/rgy_switchover -backup "$backups" $primary $this
>
> touch /opt/VRTSvcs/bin/CCRegistry/,.online
>
> exit 0

Sometimes the rgy_switchover running on the target node does not complete
the registry transfer and the operation has to be retried. To do this
the online agent leaves an ,.online marker in the agent directory right
after the rgy_switchover is run. Next the monitor agent looks for the
,.online marker and if it finds it it retries the rgy_switchover.
As soon as the monitor agent detects that the configuration files have
been properly updated and the switch-over was completed it removes the
,.online marker.
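The CCRegistry monitor script itself is not listed above; a minimal sketch
of the behaviour just described, written in the style of the other agents
(the exact original code may differ):

> #!/bin/sh
> # CCRegistry monitor (sketch): report 110 when this node is the
> # registry primary, 100 otherwise, honouring the marker files.
> dir=/opt/VRTSvcs/bin/CCRegistry
> this=`hostname`
> primary=`head -1 /var/adm/atria/rgy/rgy_hosts.conf`
> backups=`tail +2 /var/adm/atria/rgy/rgy_hosts.conf`
>
> if [ `ps -ef | grep albd_server | grep -v grep | wc -l` -eq 0 ];then
> # ClearCase is not running at all:
> exit 100
> fi
>
> # A switch-over away from this node was requested by the offline agent:
> if [ -f $dir/,.offline ];then
> if [ "$this" != "$primary" ];then
> # The new primary has taken over - forget the marker:
> rm -f $dir/,.offline
> fi
> # Report the primary as down here so it can be onlined elsewhere:
> exit 100
> fi
>
> # A switch-over to this node is in progress:
> if [ -f $dir/,.online ];then
> if [ "$this" = "$primary" ];then
> # The transfer completed - forget the marker:
> rm -f $dir/,.online
> else
> # Retry the transfer (see the online agent above):
> /usr/atria/etc/rgy_switchover -backup "$backups" $primary $this
> exit 100
> fi
> fi
>
> if [ "$this" = "$primary" ];then
> exit 110
> else
> exit 100
> fi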

To maintain integrity of the agent operation the open and close agents
remove both marker files (,.online and ,.offline) that may have been
left there from the previous malfunctioning or crashed system.

CCRegistry/open:

> cd /opt/VRTSvcs/bin/CCRegistry
> rm -f ,.offline ,.online

CCRegistry/close:

> cd /opt/VRTSvcs/bin/CCRegistry
> rm -f ,.offline ,.online

vobs_<group>_reg service group:
---------------------------------
The first step in configuring ClearCase is to create, register and
mount VOBs. This service group is designed to register the set of VOBs
that use a specific storage directory. All VOBs that are located in a
given directory are registered on the current cluster node. The <group>
is a parameter that should be replaced with a unique name indicating
a group of VOBs. We used this name to consistently name the Veritas Volume
Manager disk group, the mount point directory and the collection of cluster
resources designed to provide the VOB infrastructure. The vobs_<group>_reg
service group is built of the following resources:

- ccase_<group>_reg resource of type ClearCase. This resource powers up
ClearCase on the cluster node and makes it ready for use. See above for the
detailed description of this resource's implementation.

- ccvobs_<group>_reg resource of type CCVOBReg. This is a custom resource
that registers a given set of VOBs identified by their VOB tags and storage
directory.

- mount_<group>_reg resource of type Mount. This resource mounts a given
Veritas volume on a directory.

- volume_<group>_reg resource of type Volume. This resource starts the
indicated Veritas volume in a given disk group.

- dgroup_<group>_reg resource of type DiskGroup. This resource onlines the
given Veritas disk group.

Here is the dependency diagram of the resources of this group:

ccvobs_<group>_reg (CCVOBReg)
   |                    |
   v                    v
ccase_<group>_reg       mount_<group>_reg (Mount)
(ClearCase)                |
                           v
                        volume_<group>_reg (Volume)
                           |
                           v
                        dgroup_<group>_reg (DiskGroup)

There can be many instances of this service group - one for each collection
of VOBs. Each set can be managed separately, onlining it on various
cluster nodes and providing load-balancing functionality.
One of our implementations used "cluster4" as the name of the <group>.
We named the Veritas disk group "cluster4" and the VOB storage directory
/cluster4/vobstore. Here is the example definition of the vobs_cluster4_reg
group (in main.cf):

group vobs_cluster4_reg (
SystemList = { foo_c, bar_c }
AutoStartList = { foo_c }
)

CCVOBReg ccvobs_cluster4_reg (
Storage = "/cluster4/vobstore"
CCPassword = foobar
)

ClearCase ccase_cluster4_reg (
)

DiskGroup dgroup_cluster4_reg (
DiskGroup = cluster4
)

Mount mount_cluster4_reg (
MountPoint = "/cluster4"
BlockDevice = "/dev/vx/dsk/cluster4/vol01"
FSType = vxfs
MountOpt = rw
)

Volume volume_cluster4_reg (
Volume = vol01
DiskGroup = cluster4
)

requires group ccregistry online global

ccvobs_cluster4_reg requires ccase_cluster4_reg
ccvobs_cluster4_reg requires mount_cluster4_reg
mount_cluster4_reg requires volume_cluster4_reg
volume_cluster4_reg requires dgroup_cluster4_reg

CCVOBReg resource implementation:

The resource type is defined as follows:

type CCVOBReg (
static str ArgList[] = { CCPassword, Storage }
NameRule = resource.Name
str Storage
str CCPassword
)

The CCPassword is the ClearCase registry password. The Storage is the
directory where the VOB storage directories are located.

The online agent checks the storage directory and uses the basenames of all
directory entries with the suffix .vbs as the VOB tags to register.
First we try to unmount, remove tags, unregister and kill the VOB's servers.
Removing of tags is done with the send-expect engine (expect) running the
'cleartool rmtag' command so we can interactively provide the registry
password. When the VOB's previous instance is cleaned up, it is registered
and tagged.

> #!/bin/sh
> # CCVOBReg online:
>
> shift
> pass=$1
> shift
> vobstorage=$1
>
> if [ `ps -ef | grep albd_server | grep -v grep | wc -l` -eq 0 ];then
> # No albd_server - no views:
> exit 1
> fi
>
> # Handle all VOBs created in the VOB storage directory:
> if [ ! -d $vobstorage ];then
> exit
> fi
>
> for tag in `cd $vobstorage; ls | sed 's/.vbs//'`;do
> storage=$vobstorage/$tag.vbs
>
> # Try to cleanup first:
> cleartool lsvob /vobs/$tag
> status=$?
> if [ $status -eq 0 ];then
> cleartool umount /vobs/$tag
>
> expect -f - <<-EOF
> spawn cleartool rmtag -vob -all /vobs/$tag
> expect "Registry password:"
> send "$pass\n"
> expect eof
> EOF
>
> cleartool unregister -vob $storage
>
> pids=`ps -ef | grep vob | grep "$storage" | grep -v grep | awk '
> { print $2 }'`
>
> for pid in $pids;do
> kill -9 $pid
> done
> fi
>
> # Now register:
> cleartool register -vob $storage
> cleartool mktag -vob -pas $pass -public -tag /vobs/$tag $storage
> done
>

The monitor agent is implemented by checking 'ct lsvob' output and comparing
the vobs listed as registered on the current host versus the vobs found
in the VOB's storage directory:

> #!/bin/sh
> # CCVOBReg monitor:
>
> shift
> pass=$1
> shift
> vobstorage=$1
> host=`hostname`
>
> if [ `ps -ef | grep albd_server | grep -v grep | wc -l` -eq 0 ];then
> # No albd_server:
> exit 100
> fi
>
> # Handle all VOBs created in the VOB storage directory:
> if [ ! -d $vobstorage ];then
> exit 100
> fi
>
> # Number of VOBS found in the storage:
> nvobs_made=`cd $vobstorage; ls | sed 's/.vbs//' | wc -l`
>
> # Number of VOBS registered on this host:
> nvobs_reg=`cleartool lsvob | grep /net/$host$vobstorage | wc -l`
>
> #if [ $nvobs_reg -lt $nvobs_made ];then
> if [ $nvobs_reg -lt 1 ];then
> # Not running:
> exit 100
> else
> # Running:
> exit 110
> fi

The offline agent works in the same way as the online one, with the
exception of registering and tagging the VOB.

vobs_<group>_mnt service group:
--------------------------------
After the VOBs are registered and tagged on a cluster node they need to
be mounted. The mount can be done anywhere in the cluster, not necessarily
on the same node where they are registered.
The vobs_<group>_mnt service group performs the mounting operation. It is
designed to complement the vobs_<group>_reg service group and operate on a
set of VOBs.

The following resources compose this service group:

- ccase_<group>_mnt resource of type ClearCase. This resource powers up
clearcase on the cluster node and makes it ready for use.

- ccvobs_<group>_mnt resource of type CCVOBMount.
The work of mounting a set of VOBs is implemented in this resource.
The VOBs are defined as a list of tags.

Here is the dependency diagram of the resources of this group:

ccvobs_<group>_mnt (CCVOBMount)
|
v
ccase_<group>_mnt (ClearCase)

There may be many instances of the vobs_<group>_mnt - the <group> is used
as the name of the VOBs group. We used "cluster4" to match the name of
the vobs_cluster4_reg group. Here is how we defined it (in main.cf):

group vobs_cluster4_mnt (
SystemList = { foo_c, bar_c }
AutoStartList = { foo_c }
Parallel = 1
PreOnline = 1
)

CCVOBMount ccvobs_cluster4_mnt (
CCPassword = foobar
Tags = { cctest, admin }
)

ClearCase ccase_cluster4_mnt (
)

requires group vobs_cluster4_reg online global

ccvobs_cluster4_mnt requires ccase_cluster4_mnt

CCVOBMount resource implementation:

The resource type is defined as follows:

type CCVOBMount (
static str ArgList[] = { CCPassword, Tags }
NameRule = resource.Name
str CCPassword
str Tags[]
str Storage
)

The CCPassword is the ClearCase registry password. The Tags is the
list of VOB's tags to mount.

The online agent mounts and unlocks the list of VOBs. The NFS shares are
also refreshed to allow for remote VOB use.

> #!/bin/sh
> # CCVOBMount online:
> shift
> pass=$1
> shift
> shift
> tags=$*
>
> if [ `ps -ef | grep albd_server | grep -v grep | wc -l` -eq 0 ];then
> # No albd_server - no vobs:
> exit 1
> fi
> for tag in $tags;do
> cleartool mount /vobs/$tag
> cleartool unlock vob:/vobs/$tag
> done
>
> # Refresh share table - othewise remote nodes can't mount storage directory:
> shareall

The offline agent terminates all processes that use ClearCase file systems.
It unexports all views and then unmounts all VOBs, locking them first.

> #!/bin/sh
> # CCVOBMount offline:
> shift
> pass=$1
> shift
> shift
> tags=$*
>
> if [ `ps -ef | grep albd_server | grep -v grep | wc -l` -eq 0 ];then
> # No albd_server - no vobs:
> exit 1
> fi
>
> # Kill users of mvfs:
> tokill=`/usr/atria/sun5/kvm/5.6/fuser_mvfs -n /dev/ksyms`
> while [ -n "$tokill" ];do
> kill -HUP $tokill
> tokill=`/usr/atria/sun5/kvm/5.6/fuser_mvfs -n /dev/ksyms`
> done
>
> # Unexport views:
> /usr/atria/etc/export_mvfs -au
>
> for tag in $tags;do
> on=`cleartool lsvob /vobs/$tag | grep '^*' | wc -l`
> if [ $on -ne 0 ];then
> cleartool lock vob:/vobs/$tag
> cleartool umount /vobs/$tag
> fi
> done

views service group:
----------------------

The views service group manages a group of views configured to use
a specific directory as the parent of their view storage directories.
All views that are found in the provided directory are started, stopped
and monitored. The group uses the following resources:

views:

views_views (CCViews)
   |              |
   v              v
ccase_views (ClearCase)      mount_views (Mount)
                                |
                                v
                             volume_views (Volume)
                                |
                                v
                             dgroup_views (DiskGroup)

The views custom resource CCViews is defined as follows (in types.cf):

type CCViews (
static str ArgList[] = { CCPassword, Storage }
NameRule = resource.Name
str CCPassword
str Storage
)
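No listing of the views group definition itself is included above; a
minimal sketch of what it could look like in main.cf, assuming the cluster2
disk group, the /cluster2/viewstore storage directory mentioned earlier,
and a dependency on the registry group analogous to the VOB groups:

group views (
SystemList = { bar_c, foo_c }
AutoStartList = { bar_c }
)

CCViews views_views (
CCPassword = foobar
Storage = "/cluster2/viewstore"
)

ClearCase ccase_views (
)

DiskGroup dgroup_views (
DiskGroup = cluster2
)

Mount mount_views (
MountPoint = "/cluster2"
BlockDevice = "/dev/vx/dsk/cluster2/vol01"
FSType = vxfs
MountOpt = rw
)

Volume volume_views (
Volume = vol01
DiskGroup = cluster2
)

requires group ccregistry online global

views_views requires ccase_views
views_views requires mount_views
mount_views requires volume_views
volume_views requires dgroup_views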

ClearCase service groups and NIS:
---------------------------------------

,,,

Disk backup of the shared storage:
-------------------------------------

The backup node of the cluster should switch all of the shared storage to
itself before doing the backup. This can be done with a simple switch-over
of all the storage-related service groups to the backup node, doing the
backup and then switching the groups back to their intended locations.
We do it with the following shell script, which does a full backup of the
whole cluster to a DAT tape every night.

> #!/bin/sh
> # $Id: VCS-HOWTO,v 1.25 2002/09/30 20:05:38 pzi Exp $
> # Full backup script. All filesystem from vfstab in cpio format to DLT.
> # Logs in /backup/log_<date>.
>
> set -x
>
> SYSD=/opt/backup
> LOG=log_`date +%y%m%d_%H%M%S`
> ATRIAHOME=/usr/atria; export ATRIAHOME
> PATH=${PATH}:$ATRIAHOME/bin
> DEV=/dev/rmt/4ubn
>
> exec > $SYSD/$LOG 2>&1
>
> # Move all cluster shared storage to the backup node:
> groups="nis_master homes views vobs_cluster4_mnt vobs_cluster4_reg ccregistry"
> for group in $groups; do
> /opt/VRTSvcs/bin/hagrp -switch $group -to zeus_c
> done
>
> # Take all file systems in /etc/vfstab that are of type ufs or vxfs and
> # are not /backup:
> FSYS=`awk '$1 !~ /^#/ {
>	if ( ( $4 == "ufs" || $4 == "vxfs" )
>	    && $3 != "/backup" && $3 != "/backup1" && $3 != "/spare" )
>		{ print $3 }
>	}' /etc/vfstab`
>
> # Start and stop jobs for each file system:
> vobs=`cleartool lsvob | grep \* | awk '{ printf "vob:%s ", $2 }'`
> cluster4_start="cleartool lock -c Disk-Backup-Running-Now $vobs"
> cluster4_stop="cleartool unlock $vobs"
>
> mt -f $DEV rewind
>
> cd /
>
> for f in $FSYS;do
> f=`echo $f | sed 's/^\///'`
> eval $"${f}_start"
> echo $f
> find ./$f -mount | cpio -ocvB > $DEV
> eval $"${f}_stop"
> done
>
> mt -f $DEV rewind
>
> # Move cluster to the split state - hades_c runs all users homes, etc.
> groups="homes views"
> for group in $groups; do
>         /opt/VRTSvcs/bin/hagrp -switch $group -to hades_c
> done
>
> ( head -40 $SYSD/$LOG; echo '...'; tail $SYSD/$LOG ) |
>         mailx -s "backup on `hostname`" root
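
Each file system ends up as a separate cpio archive on the tape, written in
the order the file systems appear in /etc/vfstab. A hypothetical restore of
one of them could look like the sketch below (the tape device and the file
number 3 are only examples):

> # Position the tape at the archive of the file system to restore
> # (file numbers are 0-based and follow the /etc/vfstab order):
> mt -f /dev/rmt/4ubn rewind
> mt -f /dev/rmt/4ubn fsf 3
>
> # Restore relative to / because the archives were written with ./<path> names:
> cd /
> cpio -icvdumB < /dev/rmt/4ubn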

Veritas Cluster Server - VCS

Versions: 1.0.1, 1.0.2, 1.1, 1.1.1, 1.1.2, 1.3.0 (the difference
between 1.1.1 and 1.1.2 is just VRTSgab)

HEARTBEAT:
1) The heartbeat is on layer 2, with LLT/GAB (Low Latency
Transport / Group Membership and Atomic Broadcast).


Simple setup (before 1.3.0) (I've never tried this):
cd /opt/VRTSvcs/wizards/config
./quick_start

Simple setup (from 1.3.0) (it even installs the packages for you, but NOT
the Cluster Manager GUI):
cd /cdrom/cdrom0
./VCSinstall

Manual setup:

Create /etc/llttab, /etc/llthosts and /etc/gabtab:

# cat /etc/llttab (before 1.3.0):
set-node VXcluster1
link qfe0 /dev/qfe:0
link qfe1 /dev/qfe:1
start

# cat /etc/llttab (from 1.3.0):
set-node VXcluster1
set-cluster 0
link link1 /dev/qfe:0 - ether - -
link link2 /dev/qfe:1 - ether - -
start

# cat /etc/llthosts
0 VXcluster1
1 VXcluster2

LLT is started from /etc/rc2.d/S68llt.

# cat /etc/gabtab
/sbin/gabconfig -c -n 2

GAB is started from /etc/rc2.d/S92gab.
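
Once LLT and GAB are running on both nodes, membership can be checked with
gabconfig; port a is the GAB membership and port h appears once the VCS
engine (had) is running. The output below is only illustrative, not
captured from a real cluster:

# /sbin/gabconfig -a
GAB Port Memberships
===============================================================
Port a gen   a36e0003 membership 01
Port h gen   fd570002 membership 01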

Test of LLT/GAB:
On system A:
#/opt/VRTSllt/llttest -p <port>
>receive -c <count>

On system B:
#/opt/VRTSllt/llttest -p <port>
>transmit -n <dest-node-ID> -c <count>
>exit

DISKHEARTBEAT: This can be done in 2 ways:
1) globally, by GAB
2) per service group, as its own resource (ServiceGroupHB)

1: needs 128 blocks (64 KB) of a disk
# hahbsetup c1t3d0   (reinitialize; the disk can still be used by VxVM)
heartbeat slice 7, private slice 3, public slice 4

Set in /etc/rc2.d/S92gab:
/sbin/gabdisk -a /dev/rdsk/c1t3d0s7 -s 16 -p a
/sbin/gabdisk -a /dev/rdsk/c1t3d0s7 -s 144 -p h

blocks 0-15: partition-table
blocks 16-143: Seed port a
blocks 144-271: VCS port h

/etc/VRTSvcs/conf/sysname holds the name of the node. This does not need
to be the same as the system's 'nodename', but there is little reason to
make it different.

# cat /etc/VRTSvcs/conf/config/main.cf
include "types.cf"      (has to be copied from /etc/VRTSvcs/conf)
cluster <clustername>
snmp <clustername>
system VXcluster1
system VXcluster2

Example of a CONFIG:
Create a service group that imports a disk group, starts a volume,
mounts it and shares it over NFS.

Machines: VXcluster1, VXcluster2
cluster: VXcluster (193.216.23.205)
Volume: Vol01, dg= DataDG
Mountpoint: /export
ha(service)-group: hanfs

VXcluster1# haconf -makerw
# hagrp -add hanfs
group added; populating SystemList and setting the Parallel attribute
recommended before adding resources
# hagrp -modify hanfs SystemList VXcluster1 1 VXcluster2 2
# hagrp -autoenable hanfs -sys VXcluster1
# #---------------------------------------------------------
# hares -add nfsNIC NIC hanfs
resource added
NameRule and Enabled attributes must be set before agent monitors
# hares -modify nfsNIC Enabled 1
# hares -modify nfsNIC Device hme0
##-----------------------------------------------------------
# hares -add nfsIP IP hanfs
resource added
NameRule and Enabled attributes must be set before agent monitors
# hares -modify nfsIP Enabled 1
# hares -modify nfsIP Device hme0
# hares -modify nfsIP Address 193.216.23.205
# hares -modify nfsIP IfconfigTwice 1
# #---------------------------------------------------------
# hares -add nfsDG DiskGroup hanfs
resource added
NameRule and Enabled attributes must be set before agent monitors
# hares -modify nfsDG Enabled 1
# hares -modify nfsDG DiskGroup DataDG
# hares -modify nfsDG StartVolumes 0
# #---------------------------------------------------------
#
# hares -add nfsVOL Volume hanfs
resource added
NameRule and Enabled attributes must be set before agent monitors
# hares -modify nfsVOL Enabled 1
# hares -modify nfsVOL Volume Vol01
# hares -modify nfsVOL DiskGroup DataDG
# #---------------------------------------------------------
# hares -add nfsMOUNT Mount hanfs
resource added
NameRule and Enabled attributes must be set before agent monitors
# hares -modify nfsMOUNT Enabled 1
# hares -modify nfsMOUNT MountPoint /export
# hares -modify nfsMOUNT BlockDevice /dev/vx/dsk/DataDG/Vol01
# hares -modify nfsMOUNT Type vxfs
# #---------------------------------------------------------
# hares -add nfsNFS NFS hanfs
resource added
NameRule and Enabled attributes must be set before agent monitors
# hares -modify nfsNFS Enabled 1
# hares -modify nfsNFS Nservers 24
# #----------------------------------------------------------
# hares -add nfsSHARE Share hanfs
resource added
NameRule and Enabled attributes must be set before agent monitors
# hares -modify nfsSHARE Enabled 1
# hares -modify nfsSHARE PathName /export
# hares -modify nfsSHARE OnlineNFSRestart 1
# hares -modify nfsSHARE Options " -o rw,root=VxConsole"

# Finished with the config; make mount points and resource links:
# mkdir /export
# hares -link nfsIP nfsNIC
# hares -link nfsVOL nfsDG
# hares -link nfsMOUNT nfsVOL
# hares -link nfsSHARE nfsIP
# hares -link nfsSHARE nfsMOUNT
# hares -link nfsSHARE nfsNFS

# haconf -dump -makero
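
After the dump, the hanfs section of main.cf should look roughly like the
sketch below. It is reconstructed from the hares commands above, not
copied from a running cluster, and the attribute layout may differ
slightly between VCS versions:

group hanfs (
        SystemList = { VXcluster1 = 1, VXcluster2 = 2 }
        )

        DiskGroup nfsDG (
                DiskGroup = DataDG
                StartVolumes = 0
                )

        IP nfsIP (
                Device = hme0
                Address = "193.216.23.205"
                IfconfigTwice = 1
                )

        Mount nfsMOUNT (
                MountPoint = "/export"
                BlockDevice = "/dev/vx/dsk/DataDG/Vol01"
                Type = vxfs
                )

        NFS nfsNFS (
                Nservers = 24
                )

        NIC nfsNIC (
                Device = hme0
                )

        Share nfsSHARE (
                PathName = "/export"
                OnlineNFSRestart = 1
                Options = " -o rw,root=VxConsole"
                )

        Volume nfsVOL (
                Volume = Vol01
                DiskGroup = DataDG
                )

        nfsIP requires nfsNIC
        nfsVOL requires nfsDG
        nfsMOUNT requires nfsVOL
        nfsSHARE requires nfsIP
        nfsSHARE requires nfsMOUNT
        nfsSHARE requires nfsNFS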

- The MultiNICA resource will do network failover (like NAFO in Sun Cluster).
########################################################################
MANUALS:
hastart: VCS: usage:
had [-ts]
had [-ts] -stale
had [-ts] -force
had -help
had -version
hastop: usage:
hastop -local [-force|-evacuate]
hastop -sys <system> .... [-force|-evacuate]
hastop -all -force
hastop -all
hastop [-help]
# hatype -help
usage:
hatype -add <type>
hatype -delete <type>
hatype -display [<type>]
hatype -resources <type>
hatype -list
hatype [-help [-modify]]
# hatype -list
Types:
CLARiiON
Disk
DiskGroup
ElifNone
FileNone
FileOnOff
FileOnOnly
IP
IPMultiNIC
Mount
MultiNICA
NFS
NIC
Phantom
Process
Proxy
ServiceGroupHB
Share
Volume
# hares -help
usage:
hares -add <res> <type> <group>
hares -local <res> <attr>
hares -global <res> <attr> <value> ... | <key> ... | {<key> <value>} ...
hares -delete <res>
hares -link <res> <child>
hares -unlink <res> <child>
hares -clear <res> [-sys <system>]
hares -online <res> -sys <system>
hares -offline <res> -sys <system>
hares -offprop <res> -sys <system>
hares -state <res> -sys <system>
hares -display [<res>]
hares -list
hares [-help [-modify]]
# hares -help -modify
SCALAR:
hares -modify <resource> <attr> <value> [-sys <system>]
VECTOR:
hares -modify <resource> <attr> <value> ... [-sys <system>]
hares -modify <resource> <attr> -add <value> ... [-sys <system>]
hares -modify <resource> <attr> -delete -keys [-sys <system>]
NOTE: You cannot delete an individual element of a VECTOR
KEYLIST:
hares -modify <resource> <attr> <key> ... [-sys <system>]
hares -modify <resource> <attr> -add <key> ... [-sys <system>]
hares -modify <resource> <attr> -delete <key> ... [-sys <system>]
hares -modify <resource> <attr> -delete -keys [-sys <system>]
ASSOCIATION:
hares -modify <resource> <attr> {<key> <value>} ... [-sys <system>]
hares -modify <resource> <attr> -add {<key> <value>} ... [-sys <system>]
hares -modify <resource> <attr> -update {<key> <value>} ...
[-sys <system>]
hares -modify <resource> <attr> -delete <key> ... [-sys <system>]
hares -modify <resource> <attr> -delete -keys [-sys <system>]
# hagrp -help
usage:
hagrp -add <group>
hagrp -delete <group>
hagrp -link <group> <group> <relationship>
hagrp -unlink <group> <group>
hagrp -online <group> -sys <system>
hagrp -offline <group> -sys <system>
hagrp -state <group> -sys <system>
hagrp -switch <group> -to <system>
hagrp -freeze <group> [-persistent]
hagrp -unfreeze <group> [-persistent]
hagrp -enable <group> [-sys <system>]
hagrp -disable <group> [-sys <system>]
hagrp -display [<group>]
hagrp -resources <group>
hagrp -list
hagrp -enableresources <group>
hagrp -disableresources <group>
hagrp -flush <group> -sys <system>
hagrp -autoenable <group> -sys <system>
hagrp [-help [-modify]]
hagrp [-help [-link]]
# hagrp -modify -help
usage:
SCALAR:
hagrp -modify <group> <attr> <value> [-sys <system>]
VECTOR:
hagrp -modify <group> <attr> <value> ... [-sys <system>]
hagrp -modify <group> <attr> -add <value> ... [-sys <system>]
hagrp -modify <group> <attr> -delete -keys [-sys <system>]
NOTE: You cannot delete an individual element of a VECTOR
KEYLIST:
hagrp -modify <group> <attr> <key> ... [-sys <system>]
hagrp -modify <group> <attr> -add <key> ... [-sys <system>]
hagrp -modify <group> <attr> -delete <key> ... [-sys <system>]
hagrp -modify <group> <attr> -delete -keys [-sys <system>]
ASSOCIATION:
hagrp -modify <group> <attr> {<key> <value>} ... [-sys <system>]
hagrp -modify <group> <attr> -add {<key> <value>} ... [-sys <system> ]
hagrp -modify <group> <attr> -update {<key> <value>} ... [-sys <system>]
hagrp -modify <group> <attr> -delete <key> ... [-sys <system>]
hagrp -modify <group> <attr> -delete -keys [-sys <system>]
# hagrp -link -help
usage:
hagrp -link <group1> <group2> online global
hagrp -link <group1> <group2> online local
hagrp -link <group1> <group2> online remote
hagrp -link <group1> <group2> offline local
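
For example, an application group that must run on the same node as the
hanfs group, and only after hanfs is online, could be linked with an
"online local" dependency (the group name appgrp is hypothetical):

# hagrp -link appgrp hanfs online local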

# hastatus: usage:
hastatus [-sound]
hastatus -summary
hastatus [-sound] -group <group> [ -group <group> ... ]
# haconf: usage:
haconf -makerw
haconf -dump [-makero]
haconf -help

GUI:
The GUI is its own package and can be installed anywhere.
/opt/VRTSvcs/bin/hagui
/opt/VRTSvcs/bin/hauser -help

usage:
hauser -add <username>
hauser -update <username>
hauser -delete <username>
hauser -help
# hadebug -help
usage:
hadebug -handle
hadebug -hash [hashname]
hadebug -memory
hadebug -ping
hadebug -startmatch
hadebug -stopmatch
hadebug -time
hadebug -help
# haagent -help
usage:
haagent -start <agent> -sys <system>
haagent -stop <agent> -sys <system>
haagent -display [<agent>]
haagent -list
haagent [-help]
# hasys -help
usage:
hasys -add <sys>
hasys -delete <sys>
hasys -freeze [-persistent] [-evacuate] <sys>
hasys -unfreeze [-persistent] <sys>
hasys -display [<sys>]
hasys -force <sys>
hasys -load <sys> <value>
hasys -state <sys>
hasys -list
hasys -nodeid [<nodeid>]
hasys [-help [-modify]]

WIZARDS:
/opt/VRTSvcs/wizards/config/quick_start: sets up LLT+GAB
/opt/VRTSvcs/wizards/services/quick_nfs: sets up 2 NFS groups
