Check & Maintain clustering on RHEL
CLUSTER COMMANDS
Check Cluster Status
Check cluster status clustat
Check cluster status every x seconds clustat -i [interval]
Check cluster status with extra detail clustat -l
Check configured nodes and their votes ccs_tool lsnode
Display cluster nodes cman_tool nodes
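The status checks above can be scripted. A minimal sketch, assuming clustat prints a "Member Status: Quorate" line when the cluster has quorum; the helper name is_quorate is hypothetical, and it reads clustat output from standard input so it can also be tried against saved output:

```shell
# is_quorate: hypothetical helper; reads clustat output on stdin and
# succeeds only if a "Member Status: Quorate" line is present.
# Assumption: clustat prints that line when the cluster has quorum.
is_quorate() {
    grep -q '^[[:space:]]*Member Status:[[:space:]]*Quorate[[:space:]]*$'
}

# Usage on a cluster node:
#   clustat | is_quorate && echo "cluster is quorate"
```

The pattern is anchored so that a line reading "Member Status: Inquorate" does not count as a match.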
Cluster Resource Group Administration
Restart Resource in place clusvcadm -R [resource name]
Relocate Resource to another member clusvcadm -r [resource name] -m [member name]
Disable Resource clusvcadm -d [resource name]
Enable Resource clusvcadm -e [resource name]
Freeze Resource clusvcadm -Z [resource name]
Unfreeze Resource clusvcadm -U [resource name]
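Freeze and unfreeze are usually paired around a short piece of work. A minimal sketch of that pairing; run_frozen and the CLUSVCADM override are hypothetical names introduced here, not part of rgmanager:

```shell
# CLUSVCADM can be overridden (for example with "echo clusvcadm") to
# preview the calls without touching a cluster.
CLUSVCADM=${CLUSVCADM:-clusvcadm}

# run_frozen: hypothetical helper. Freezes service $1, runs the rest of
# the arguments as a command, then unfreezes the service even if the
# command failed. Returns the command's exit status.
run_frozen() {
    svc=$1; shift
    $CLUSVCADM -Z "$svc" || return 1   # freeze the service
    rc=0
    "$@" || rc=$?                      # run the work, remember its status
    $CLUSVCADM -U "$svc" || return 1   # always unfreeze afterwards
    return $rc
}

# Usage (the restart command is only an example):
#   run_frozen mainservicegroup service httpd restart
```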
Testing Cluster Services
Status resource in test mode rg_test test [cluster config] status service [resource-name]
Start resource in test mode rg_test test [cluster config] start service [resource-name]
Stop resource in test mode rg_test test [cluster config] stop service [resource-name]
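The three test-mode commands above can be run as one cycle. A minimal sketch, assuming the configuration file is at /etc/cluster/cluster.conf; test_service_cycle, RG_TEST and CLUSTER_CONF are hypothetical names used only for this example:

```shell
# RG_TEST may be overridden (for example with "echo rg_test") to preview
# the commands. CLUSTER_CONF is assumed to be the standard config path.
RG_TEST=${RG_TEST:-rg_test}
CLUSTER_CONF=${CLUSTER_CONF:-/etc/cluster/cluster.conf}

# test_service_cycle: hypothetical helper that starts, checks and stops
# service $1 in test mode, returning at the first failing step.
test_service_cycle() {
    for action in start status stop; do
        $RG_TEST test "$CLUSTER_CONF" "$action" service "$1" || return 1
    done
}

# Usage:
#   test_service_cycle mainservicegroup
```

It is usually safest to try this only on a service that is disabled in rgmanager, so the test run does not compete with the live service.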
Cluster Software
cman – Red Hat Cluster Manager
lvm2-cluster – Cluster extensions for userland logical volume management tools
rgmanager – Open Source HA Resource Group Failover for Red Hat Cluster
ricci – Remote Cluster and Storage Management System
Cluster Services
Cluster Manager (cman), which manages the following services: qdiskd, fenced, dlm_controld and gfs_controld
Cluster Logical Volume Manager (clvmd)
Cluster Resource Manager (rgmanager)
Cluster Management & Configuration Daemon (ricci)
Cluster Configuration Files
Cluster configuration file /etc/cluster/cluster.conf
Cluster resource scripts /usr/share/cluster/
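Every node should hold the same version of cluster.conf, so comparing the version number between nodes is a quick way to spot a stale copy. A minimal sketch, assuming the version is stored as a config_version="N" attribute on the <cluster> element; conf_version is a hypothetical helper name:

```shell
# conf_version: hypothetical helper; prints the config_version value
# from the cluster.conf-style file given as $1.
# Assumption: the attribute appears as config_version="N" on the
# <cluster ...> element.
conf_version() {
    sed -n 's/.*<cluster[^>]*config_version="\([0-9][0-9]*\)".*/\1/p' "$1" | head -n 1
}

# Usage:
#   conf_version /etc/cluster/cluster.conf
```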
Cluster Log File Locations
Resource Manager Log /var/log/cluster/rgmanager.log
Fencing Log /var/log/cluster/fenced.log
Quorum Disk Log /var/log/cluster/qdiskd.log
GFS Log /var/log/cluster/gfs_controld.log
DLM Log /var/log/cluster/dlm_controld.log
Corosync Log /var/log/cluster/corosync.log
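When something goes wrong it helps to scan all of these logs at once. A minimal sketch that looks for lines mentioning "error" or "fail" in whichever of the listed logs exist; scan_cluster_logs and LOG_DIR are hypothetical names, and the search words are only a starting point:

```shell
# Log directory from the list above; override LOG_DIR to scan copies.
LOG_DIR=${LOG_DIR:-/var/log/cluster}

# scan_cluster_logs: hypothetical helper; prints lines containing
# "error" or "fail" (case-insensitive) from each daemon log that exists,
# prefixed with the log file name.
scan_cluster_logs() {
    for name in rgmanager fenced qdiskd gfs_controld dlm_controld corosync; do
        f="$LOG_DIR/$name.log"
        [ -f "$f" ] || continue
        # grep exits non-zero when nothing matches; that is not an error here.
        grep -iE 'error|fail' "$f" | sed "s|^|$name.log: |" || true
    done
    return 0
}

# Usage:
#   scan_cluster_logs | tail -n 20
```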
Start ricci on both nodes and join the cluster from the GUI
[Stop Cluster]
# service luci stop
# service ricci stop
# service rgmanager stop
# service cman stop
[Start Cluster]
# service cman start
# service rgmanager start
# service ricci start
# service luci start
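The stop order is the start order reversed, which can be captured in a small script. A minimal sketch using the order listed above; cluster_start, cluster_stop, START_ORDER and SERVICE_CMD are hypothetical names:

```shell
# Start order from the listing above; the stop order is its reverse.
START_ORDER="cman rgmanager ricci luci"
# Override with "echo service" to preview the commands.
SERVICE_CMD=${SERVICE_CMD:-service}

# cluster_start: start each service in order, returning at the first failure.
cluster_start() {
    for s in $START_ORDER; do
        $SERVICE_CMD "$s" start || return 1
    done
}

# cluster_stop: walk START_ORDER backwards so the stop order mirrors it.
cluster_stop() {
    rev=""
    for s in $START_ORDER; do rev="$s $rev"; done
    for s in $rev; do
        $SERVICE_CMD "$s" stop || return 1
    done
}
```

On a node that does not run luci, START_ORDER would need to be trimmed, since the functions return at the first service that fails.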
How to perform maintenance such as offline PM and Cluster failover/failback test
Procedures:
clustat
/etc/cluster/cluster.conf
stop applications and database
stop the cluster processes on both hosts:
clusvcadm -d mainservicegroup
service rgmanager stop
service cman stop
service corosync stop
disable cluster services from starting on reboot
chkconfig rgmanager off    (or: systemctl disable rgmanager)
chkconfig cman off         (or: systemctl disable cman)
chkconfig corosync off     (or: systemctl disable corosync)
confirm that no database is loaded, then carry out the maintenance work
clustat
df -h
yum -y update
reboot
start the cluster processes on both hosts:
service cman start
service rgmanager start
enable cluster services to start on reboot
chkconfig rgmanager on    (or: systemctl enable rgmanager)
chkconfig cman on         (or: systemctl enable cman)
chkconfig corosync on     (or: systemctl enable corosync)
clusvcadm -e mainservicegroup    (enable on prod first)
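The boot-time steps in this procedure differ only by init system. A minimal sketch that prints the matching command for review; boot_cmd and its sysv/systemd argument are hypothetical and introduced only for this example:

```shell
# boot_cmd: hypothetical helper; prints the command that sets whether a
# service starts at boot. $1 = sysv|systemd, $2 = service, $3 = on|off.
boot_cmd() {
    case "$1:$3" in
        sysv:on|sysv:off) echo "chkconfig $2 $3" ;;
        systemd:on)       echo "systemctl enable $2" ;;
        systemd:off)      echo "systemctl disable $2" ;;
        *) echo "usage: boot_cmd sysv|systemd SERVICE on|off" >&2; return 2 ;;
    esac
}

# Usage: print the three disable commands for a SysV host.
#   for s in rgmanager cman corosync; do boot_cmd sysv "$s" off; done
```

Printing rather than running the commands keeps the step reviewable; the output can be piped to sh once it looks right.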
Failover & Failback test
Relocate Resource to another member (for Physical Server)
clusvcadm -r [resource name] -m [member name]
Migrate Resource to another member (for Virtual Server)
clusvcadm -M [resource name] -m [member name]
from primaryNode (failover):
Relocate Resource
clusvcadm -r mainservicegroup -m secondaryNode
from secondaryNode (failback):
Relocate Resource
clusvcadm -r mainservicegroup -m primaryNode
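After each relocation it is worth confirming where the service ended up. A minimal sketch, assuming clustat lists each service on a row of the form "service:NAME  OWNER  STATE"; service_state is a hypothetical helper name, and it reads clustat output from standard input:

```shell
# service_state: hypothetical helper; reads clustat output on stdin and
# prints "OWNER STATE" for the service named in $1.
# Assumption: service rows look like "service:NAME  OWNER  STATE".
service_state() {
    awk -v svc="service:$1" '$1 == svc { print $2, $3 }'
}

# Usage after a failover test:
#   clustat | service_state mainservicegroup
```

If the row format differs on a given release, the field numbers in the awk expression would need adjusting.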