Check & Maintain clustering on RHEL
 
CLUSTER COMMANDS
 
Check Cluster Status
Check cluster status                            clustat
Check cluster status every x seconds            clustat -i [interval]
Check cluster status with extra detail          clustat -l
Check configured nodes and their votes          ccs_tool lsnode
Display cluster nodes                           cman_tool nodes
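For example, to watch the cluster status refresh every 5 seconds and then list the member nodes and overall cluster state:
# clustat -i 5
# cman_tool nodes
# cman_tool status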
 
Cluster Resource Group Administration
Restart Resource in place                       clusvcadm -R [resource name]
Relocate Resource to another member             clusvcadm -r [resource name] -m [member name]
Disable Resource                                clusvcadm -d [resource name]
Enable Resource                                 clusvcadm -e [resource name]
Freeze Resource                                 clusvcadm -Z [resource name]
Unfreeze Resource                               clusvcadm -U [resource name]
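As a concrete example, to relocate a service group named mainservicegroup to node2, and to freeze/unfreeze it around maintenance on the managed application (service and node names are placeholders):
# clusvcadm -r mainservicegroup -m node2
# clusvcadm -Z mainservicegroup
# clusvcadm -U mainservicegroup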
 
Testing Cluster Services
Status of resource in test mode                 rg_test test [cluster config] status service [resource name]
Start resource in test mode                     rg_test test [cluster config] start service [resource name]
Stop resource in test mode                      rg_test test [cluster config] stop service [resource name]
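For instance, to dry-run the start and stop of a service against the live configuration file (service name is a placeholder):
# rg_test test /etc/cluster/cluster.conf start service mainservicegroup
# rg_test test /etc/cluster/cluster.conf stop service mainservicegroup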
 
Cluster Software
cman               – Red Hat Cluster Manager
lvm2-cluster       – Cluster extensions for userland logical volume management tools
rgmanager          – Open Source HA Resource Group Failover for Red Hat Cluster
ricci              – Remote Cluster and Storage Management System
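On a typical RHEL 6 High Availability node these packages are installed with yum; ricci also needs a password set so that luci can authenticate to it (an assumption of a standard luci/ricci setup):
# yum install cman rgmanager lvm2-cluster ricci
# passwd ricci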
 
Cluster Services
Cluster Manager (cman) – manages the following services: qdiskd, fenced, dlm_controld and gfs_controld
Cluster Logical Volume Manager  (clvmd)                 
Cluster Resource Manager  (rgmanager)  
Cluster Management & Configuration Daemon  (ricci)
 
Cluster Configuration Files
Cluster configuration file                               /etc/cluster/cluster.conf 
Cluster resource scripts                                 /usr/share/cluster/ 
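A minimal two-node cluster.conf skeleton looks roughly like the following (cluster and node names are illustrative; a real file also needs fence devices, failover domains and resource definitions):
<cluster name="mycluster" config_version="1">
  <cman two_node="1" expected_votes="1"/>
  <clusternodes>
    <clusternode name="node1" nodeid="1"/>
    <clusternode name="node2" nodeid="2"/>
  </clusternodes>
  <rm>
    <!-- service and resource definitions go here -->
  </rm>
</cluster>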
 
Cluster Log File Locations
Resource Manager Log                            /var/log/cluster/rgmanager.log
Fencing Log                                     /var/log/cluster/fenced.log
Quorum Disk Log                                 /var/log/cluster/qdiskd.log
GFS Log                                         /var/log/cluster/gfs_controld.log
DLM Log                                         /var/log/cluster/dlm_controld.log
Corosync Log                                    /var/log/cluster/corosync.log
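During maintenance or troubleshooting it is useful to follow these logs live, e.g.:
# tail -f /var/log/cluster/rgmanager.log /var/log/cluster/fenced.log
# grep -i error /var/log/cluster/*.log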
 
Start ricci on both nodes and join the cluster from the luci GUI
[Stop Cluster]                                                  [Start Cluster]                    
# service luci stop                                           # service cman start
# service ricci stop                                          # service rgmanager start
# service rgmanager stop                             # service ricci start
# service cman stop                                       # service luci start
 
 
 
 
How to perform maintenance (e.g. offline PM) and a cluster failover/failback test
Procedures:
 
check cluster status and review the current configuration:
clustat
cat /etc/cluster/cluster.conf
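keep a dated backup of the configuration before making any changes, for example:
cp -p /etc/cluster/cluster.conf /root/cluster.conf.$(date +%F)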
 
stop applications and database
 
stop cluster processes on both hosts:
clusvcadm -d mainservicegroup
service rgmanager stop                 
service cman stop
service corosync stop                                    
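confirm the cluster stack is fully stopped on each node before continuing:
service rgmanager status
service cman status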
 
disable cluster services from starting on reboot:
chkconfig rgmanager off        (or: systemctl disable rgmanager)
chkconfig cman off             (or: systemctl disable cman)
chkconfig corosync off         (or: systemctl disable corosync)
 
verify that no database is loaded, then perform the maintenance work:
clustat
df -h
yum -y update
reboot
 
start cluster processes on both hosts:
service cman start
service rgmanager start
 
enable cluster services to start on reboot:
chkconfig rgmanager on         (or: systemctl enable rgmanager)
chkconfig cman on              (or: systemctl enable cman)
chkconfig corosync on          (or: systemctl enable corosync)
clusvcadm -e mainservicegroup    <-- enable on the production (primary) node first
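confirm the service group has started on the intended node:
clustat
cman_tool nodes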
 
 
 
 
Failover & Failback test
Relocate Resource to another member (for Physical Server)
clusvcadm -r [resource name] -m [member name]
 
Migrate Resource to another member (for Virtual Server)
clusvcadm -M [resource name] -m [member name]
 
from primaryNode:
Relocate Resource to secondaryNode (failover)
               clusvcadm -r mainservicegroup -m secondaryNode
 
from secondaryNode:
Relocate Resource back to primaryNode (failback)
               clusvcadm -r mainservicegroup -m primaryNode
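 
after each relocation, verify the new owner node and service state, e.g.:
clustat
tail -n 50 /var/log/cluster/rgmanager.log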