RHCS Stable 3 Tutorial - Multinode VM Cluster
Warning: This document is very much a work in progress. Any and all data here could be wrong, inaccurate or missing important bits of information. In fact, it's little more than a dumping ground for my notes. You really don't want to take anything below seriously until this warning has been removed.
Overview
This tutorial will walk you through building two distinct clusters:
- A 2-Node cluster using DRBD for real-time replicated storage backing an iSCSI SAN using Pacemaker for high availability.
- A 5-Node cluster hosting KVM virtual servers, each VM hosted on a dedicated LUN from the SAN cluster, using Pacemaker for high availability.
Pacemaker
notes:
- Create a multicast calculator.
- ais_addr is set in beekhof's scriptlet to use the last interface on the system. Manually choose the BCN interface IP.
- Check the ais_* values with `env | grep ais_`.
- Does pacemaker 1.1 in EL6 support a second ring?
- for f in /etc/corosync/corosync.conf /etc/corosync/service.d/pcmk /etc/hosts; do scp $f pcmk-2:$f ; done
yum install pacemaker
cp /etc/corosync/corosync.conf.example /etc/corosync/corosync.conf
vim /etc/corosync/corosync.conf
Add/edit the bindnetaddr, mcastaddr and mcastport lines in the 'interface { }' section:
interface {
        ringnumber: 0
        # Network address of the interface to use for cluster comms (BCN).
        bindnetaddr: 192.168.3.0
        # Multicast IP used for CPG. Must be unique per cluster.
        mcastaddr: 226.94.1.1
        # Multicast UDP port. Must be unique per ring (corosync also uses mcastport - 1).
        mcastport: 4000
        ttl: 1
}
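As a sanity check on bindnetaddr, here is a minimal sketch that computes the network address from a node's IP and netmask. The helper name network_addr is made up for this example, and the 255.255.255.0 netmask is an assumption (these notes don't state it); 192.168.3.71 is an-node01's address as it shows up in the logs further down.

```shell
#!/bin/sh
# network_addr is a hypothetical helper, not part of corosync.
# Usage: network_addr <ipv4-address> <dotted-quad-netmask>
network_addr() {
    ip=$1
    mask=$2
    old_ifs=$IFS
    IFS=.
    # Split both arguments on dots: $1-$4 are the address octets,
    # $5-$8 are the netmask octets.
    set -- $ip $mask
    IFS=$old_ifs
    # AND each address octet with the matching netmask octet.
    echo "$(($1 & $5)).$(($2 & $6)).$(($3 & $7)).$(($4 & $8))"
}

# Assumed /24 netmask on the BCN; prints 192.168.3.0
network_addr 192.168.3.71 255.255.255.0
```

The result should match the bindnetaddr value above. If it doesn't, either the netmask assumption is wrong or the wrong interface was picked.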
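Create the pacemaker service file. With 'ver: 1', corosync only loads the Pacemaker plugin; the Pacemaker daemons are then started separately by the pacemaker init script (the startup log below reports this as MCP mode).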
vim /etc/corosync/service.d/pcmk
service {
        # Load the Pacemaker Cluster Resource Manager
        name: pacemaker
        ver: 1
}
Copy the two files to the other node.
rsync -av /etc/corosync/service.d/pcmk root@an-node02:/etc/corosync/service.d/
sending incremental file list
pcmk
sent 178 bytes received 31 bytes 59.71 bytes/sec
total size is 106 speedup is 0.51
rsync -av /etc/corosync/corosync.conf root@an-node02:/etc/corosync/
sending incremental file list
corosync.conf
sent 526 bytes received 31 bytes 1114.00 bytes/sec
total size is 445 speedup is 0.80
Start the cluster
/etc/init.d/ntpdate start
/etc/init.d/corosync start
Starting Corosync Cluster Engine (corosync): [ OK ]
In the log file of the first node to start (the second node joining can be seen at the end):
May 28 01:11:58 an-node02 corosync[2366]: [MAIN ] Corosync Cluster Engine ('1.2.3'): started and ready to provide service.
May 28 01:11:58 an-node02 corosync[2366]: [MAIN ] Corosync built-in features: nss dbus rdma snmp
May 28 01:11:58 an-node02 corosync[2366]: [MAIN ] Successfully read main configuration file '/etc/corosync/corosync.conf'.
May 28 01:11:58 an-node02 corosync[2366]: [TOTEM ] Initializing transport (UDP/IP Multicast).
May 28 01:11:58 an-node02 corosync[2366]: [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
May 28 01:11:58 an-node02 corosync[2366]: [TOTEM ] The network interface [192.168.3.72] is now up.
May 28 01:11:58 an-node02 corosync[2366]: [pcmk ] info: process_ais_conf: Reading configure
May 28 01:11:58 an-node02 corosync[2366]: [pcmk ] info: config_find_init: Local handle: 2013064636357672963 for logging
May 28 01:11:58 an-node02 corosync[2366]: [pcmk ] info: config_find_next: Processing additional logging options...
May 28 01:11:58 an-node02 corosync[2366]: [pcmk ] info: get_config_opt: Found 'off' for option: debug
May 28 01:11:58 an-node02 corosync[2366]: [pcmk ] info: get_config_opt: Found 'yes' for option: to_logfile
May 28 01:11:58 an-node02 corosync[2366]: [pcmk ] info: get_config_opt: Found '/var/log/cluster/corosync.log' for option: logfile
May 28 01:11:58 an-node02 corosync[2366]: [pcmk ] info: get_config_opt: Found 'yes' for option: to_syslog
May 28 01:11:58 an-node02 corosync[2366]: [pcmk ] info: get_config_opt: Defaulting to 'daemon' for option: syslog_facility
May 28 01:11:58 an-node02 corosync[2366]: [pcmk ] info: config_find_init: Local handle: 4730966301143465988 for quorum
May 28 01:11:58 an-node02 corosync[2366]: [pcmk ] info: config_find_next: No additional configuration supplied for: quorum
May 28 01:11:58 an-node02 corosync[2366]: [pcmk ] info: get_config_opt: No default for option: provider
May 28 01:11:58 an-node02 corosync[2366]: [pcmk ] info: config_find_init: Local handle: 7739444317642555397 for service
May 28 01:11:58 an-node02 corosync[2366]: [pcmk ] info: config_find_next: Processing additional service options...
May 28 01:11:58 an-node02 corosync[2366]: [pcmk ] info: get_config_opt: Found '1' for option: ver
May 28 01:11:58 an-node02 corosync[2366]: [pcmk ] info: process_ais_conf: Enabling MCP mode: Use the Pacemaker init script to complete Pacemaker startup
May 28 01:11:58 an-node02 corosync[2366]: [pcmk ] info: get_config_opt: Defaulting to 'pcmk' for option: clustername
May 28 01:11:58 an-node02 corosync[2366]: [pcmk ] info: get_config_opt: Defaulting to 'no' for option: use_logd
May 28 01:11:58 an-node02 corosync[2366]: [pcmk ] info: get_config_opt: Defaulting to 'no' for option: use_mgmtd
May 28 01:11:58 an-node02 corosync[2366]: [pcmk ] info: pcmk_startup: CRM: Initialized
May 28 01:11:58 an-node02 corosync[2366]: [pcmk ] Logging: Initialized pcmk_startup
May 28 01:11:58 an-node02 corosync[2366]: [pcmk ] info: pcmk_startup: Maximum core file size is: 18446744073709551615
May 28 01:11:58 an-node02 corosync[2366]: [pcmk ] info: pcmk_startup: Service: 10
May 28 01:11:58 an-node02 corosync[2366]: [pcmk ] info: pcmk_startup: Local hostname: an-node02.alteeve.com
May 28 01:11:58 an-node02 corosync[2366]: [pcmk ] info: pcmk_update_nodeid: Local node id: 1208199360
May 28 01:11:58 an-node02 corosync[2366]: [pcmk ] info: update_member: Creating entry for node 1208199360 born on 0
May 28 01:11:58 an-node02 corosync[2366]: [pcmk ] info: update_member: 0x199aa40 Node 1208199360 now known as an-node02.alteeve.com (was: (null))
May 28 01:11:58 an-node02 corosync[2366]: [pcmk ] info: update_member: Node an-node02.alteeve.com now has 1 quorum votes (was 0)
May 28 01:11:58 an-node02 corosync[2366]: [pcmk ] info: update_member: Node 1208199360/an-node02.alteeve.com is now: member
May 28 01:11:58 an-node02 corosync[2366]: [SERV ] Service engine loaded: Pacemaker Cluster Manager 1.1.5
May 28 01:11:58 an-node02 corosync[2366]: [SERV ] Service engine loaded: corosync extended virtual synchrony service
May 28 01:11:58 an-node02 corosync[2366]: [SERV ] Service engine loaded: corosync configuration service
May 28 01:11:58 an-node02 corosync[2366]: [SERV ] Service engine loaded: corosync cluster closed process group service v1.01
May 28 01:11:58 an-node02 corosync[2366]: [SERV ] Service engine loaded: corosync cluster config database access v1.01
May 28 01:11:58 an-node02 corosync[2366]: [SERV ] Service engine loaded: corosync profile loading service
May 28 01:11:58 an-node02 corosync[2366]: [SERV ] Service engine loaded: corosync cluster quorum service v0.1
May 28 01:11:58 an-node02 corosync[2366]: [MAIN ] Compatibility mode set to whitetank. Using V1 and V2 of the synchronization engine.
May 28 01:11:58 an-node02 corosync[2366]: [TOTEM ] Process pause detected for 521 ms, flushing membership messages.
May 28 01:11:58 an-node02 corosync[2366]: [pcmk ] notice: pcmk_peer_update: Transitional membership event on ring 4: memb=0, new=0, lost=0
May 28 01:11:58 an-node02 corosync[2366]: [pcmk ] notice: pcmk_peer_update: Stable membership event on ring 4: memb=1, new=1, lost=0
May 28 01:11:58 an-node02 corosync[2366]: [pcmk ] info: pcmk_peer_update: NEW: an-node02.alteeve.com 1208199360
May 28 01:11:58 an-node02 corosync[2366]: [pcmk ] info: pcmk_peer_update: MEMB: an-node02.alteeve.com 1208199360
May 28 01:11:58 an-node02 corosync[2366]: [TOTEM ] A processor joined or left the membership and a new membership was formed.
May 28 01:11:58 an-node02 corosync[2366]: [CPG ] downlist received left_list: 0
May 28 01:11:58 an-node02 corosync[2366]: [CPG ] chosen downlist from node r(0) ip(192.168.3.72)
May 28 01:11:58 an-node02 corosync[2366]: [MAIN ] Completed service synchronization, ready to provide service.
May 28 01:12:07 an-node02 corosync[2366]: [pcmk ] notice: pcmk_peer_update: Transitional membership event on ring 8: memb=1, new=0, lost=0
May 28 01:12:07 an-node02 corosync[2366]: [pcmk ] info: pcmk_peer_update: memb: an-node02.alteeve.com 1208199360
May 28 01:12:07 an-node02 corosync[2366]: [pcmk ] notice: pcmk_peer_update: Stable membership event on ring 8: memb=2, new=1, lost=0
May 28 01:12:07 an-node02 corosync[2366]: [pcmk ] info: update_member: Creating entry for node 1191422144 born on 8
May 28 01:12:07 an-node02 corosync[2366]: [pcmk ] info: update_member: Node 1191422144/unknown is now: member
May 28 01:12:07 an-node02 corosync[2366]: [pcmk ] info: pcmk_peer_update: NEW: .pending. 1191422144
May 28 01:12:07 an-node02 corosync[2366]: [pcmk ] info: pcmk_peer_update: MEMB: .pending. 1191422144
May 28 01:12:07 an-node02 corosync[2366]: [pcmk ] info: pcmk_peer_update: MEMB: an-node02.alteeve.com 1208199360
May 28 01:12:07 an-node02 corosync[2366]: [pcmk ] info: send_member_notification: Sending membership update 8 to 0 children
May 28 01:12:07 an-node02 corosync[2366]: [pcmk ] info: update_member: 0x199aa40 Node 1208199360 ((null)) born on: 8
May 28 01:12:07 an-node02 corosync[2366]: [TOTEM ] A processor joined or left the membership and a new membership was formed.
May 28 01:12:07 an-node02 corosync[2366]: [pcmk ] info: update_member: 0x19a2c30 Node 1191422144 (an-node01.alteeve.com) born on: 8
May 28 01:12:07 an-node02 corosync[2366]: [pcmk ] info: update_member: 0x19a2c30 Node 1191422144 now known as an-node01.alteeve.com (was: (null))
May 28 01:12:07 an-node02 corosync[2366]: [pcmk ] info: update_member: Node an-node01.alteeve.com now has 1 quorum votes (was 0)
May 28 01:12:07 an-node02 corosync[2366]: [pcmk ] info: send_member_notification: Sending membership update 8 to 0 children
May 28 01:12:07 an-node02 corosync[2366]: [CPG ] downlist received left_list: 0
May 28 01:12:07 an-node02 corosync[2366]: [CPG ] downlist received left_list: 0
May 28 01:12:07 an-node02 corosync[2366]: [CPG ] chosen downlist from node r(0) ip(192.168.3.72)
May 28 01:12:07 an-node02 corosync[2366]: [MAIN ] Completed service synchronization, ready to provide service.
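To pull just the membership changes out of a log like the one above, something along these lines should work. It is only a sketch: stable_membership is a hypothetical helper name, and the two sample lines are copied from the log above.

```shell
#!/bin/sh
# stable_membership is a hypothetical helper. It reads log lines on stdin
# and prints one summary line per "Stable membership event" message.
stable_membership() {
    sed -n 's/.*Stable membership event on ring \([0-9]*\): memb=\([0-9]*\), new=\([0-9]*\), lost=\([0-9]*\).*/ring=\1 members=\2 new=\3 lost=\4/p'
}

# Sample input taken from the first node's log.
stable_membership <<'EOF'
May 28 01:11:58 an-node02 corosync[2366]: [pcmk ] notice: pcmk_peer_update: Stable membership event on ring 4: memb=1, new=1, lost=0
May 28 01:12:07 an-node02 corosync[2366]: [pcmk ] notice: pcmk_peer_update: Stable membership event on ring 8: memb=2, new=1, lost=0
EOF
# prints:
# ring=4 members=1 new=1 lost=0
# ring=8 members=2 new=1 lost=0
```

On a live node the same function could be fed from /var/log/cluster/corosync.log (the logfile path reported in the log above), assuming the lines in that file carry the same message text.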
In the log file of the second node to start (joins the existing cluster):
May 28 01:12:06 an-node01 corosync[2404]: [MAIN ] Corosync Cluster Engine ('1.2.3'): started and ready to provide service.
May 28 01:12:06 an-node01 corosync[2404]: [MAIN ] Corosync built-in features: nss dbus rdma snmp
May 28 01:12:06 an-node01 corosync[2404]: [MAIN ] Successfully read main configuration file '/etc/corosync/corosync.conf'.
May 28 01:12:06 an-node01 corosync[2404]: [TOTEM ] Initializing transport (UDP/IP Multicast).
May 28 01:12:06 an-node01 corosync[2404]: [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
May 28 01:12:06 an-node01 corosync[2404]: [TOTEM ] The network interface [192.168.3.71] is now up.
May 28 01:12:06 an-node01 corosync[2404]: [pcmk ] info: process_ais_conf: Reading configure
May 28 01:12:06 an-node01 corosync[2404]: [pcmk ] info: config_find_init: Local handle: 2013064636357672963 for logging
May 28 01:12:06 an-node01 corosync[2404]: [pcmk ] info: config_find_next: Processing additional logging options...
May 28 01:12:06 an-node01 corosync[2404]: [pcmk ] info: get_config_opt: Found 'off' for option: debug
May 28 01:12:06 an-node01 corosync[2404]: [pcmk ] info: get_config_opt: Found 'yes' for option: to_logfile
May 28 01:12:06 an-node01 corosync[2404]: [pcmk ] info: get_config_opt: Found '/var/log/cluster/corosync.log' for option: logfile
May 28 01:12:07 an-node01 corosync[2404]: [pcmk ] info: get_config_opt: Found 'yes' for option: to_syslog
May 28 01:12:07 an-node01 corosync[2404]: [pcmk ] info: get_config_opt: Defaulting to 'daemon' for option: syslog_facility
May 28 01:12:07 an-node01 corosync[2404]: [pcmk ] info: config_find_init: Local handle: 4730966301143465988 for quorum
May 28 01:12:07 an-node01 corosync[2404]: [pcmk ] info: config_find_next: No additional configuration supplied for: quorum
May 28 01:12:07 an-node01 corosync[2404]: [pcmk ] info: get_config_opt: No default for option: provider
May 28 01:12:07 an-node01 corosync[2404]: [pcmk ] info: config_find_init: Local handle: 7739444317642555397 for service
May 28 01:12:07 an-node01 corosync[2404]: [pcmk ] info: config_find_next: Processing additional service options...
May 28 01:12:07 an-node01 corosync[2404]: [pcmk ] info: get_config_opt: Found '1' for option: ver
May 28 01:12:07 an-node01 corosync[2404]: [pcmk ] info: process_ais_conf: Enabling MCP mode: Use the Pacemaker init script to complete Pacemaker startup
May 28 01:12:07 an-node01 corosync[2404]: [pcmk ] info: get_config_opt: Defaulting to 'pcmk' for option: clustername
May 28 01:12:07 an-node01 corosync[2404]: [pcmk ] info: get_config_opt: Defaulting to 'no' for option: use_logd
May 28 01:12:07 an-node01 corosync[2404]: [pcmk ] info: get_config_opt: Defaulting to 'no' for option: use_mgmtd
May 28 01:12:07 an-node01 corosync[2404]: [pcmk ] info: pcmk_startup: CRM: Initialized
May 28 01:12:07 an-node01 corosync[2404]: [pcmk ] Logging: Initialized pcmk_startup
May 28 01:12:07 an-node01 corosync[2404]: [pcmk ] info: pcmk_startup: Maximum core file size is: 18446744073709551615
May 28 01:12:07 an-node01 corosync[2404]: [pcmk ] info: pcmk_startup: Service: 10
May 28 01:12:07 an-node01 corosync[2404]: [pcmk ] info: pcmk_startup: Local hostname: an-node01.alteeve.com
May 28 01:12:07 an-node01 corosync[2404]: [pcmk ] info: pcmk_update_nodeid: Local node id: 1191422144
May 28 01:12:07 an-node01 corosync[2404]: [pcmk ] info: update_member: Creating entry for node 1191422144 born on 0
May 28 01:12:07 an-node01 corosync[2404]: [pcmk ] info: update_member: 0x101ba30 Node 1191422144 now known as an-node01.alteeve.com (was: (null))
May 28 01:12:07 an-node01 corosync[2404]: [pcmk ] info: update_member: Node an-node01.alteeve.com now has 1 quorum votes (was 0)
May 28 01:12:07 an-node01 corosync[2404]: [pcmk ] info: update_member: Node 1191422144/an-node01.alteeve.com is now: member
May 28 01:12:07 an-node01 corosync[2404]: [SERV ] Service engine loaded: Pacemaker Cluster Manager 1.1.5
May 28 01:12:07 an-node01 corosync[2404]: [SERV ] Service engine loaded: corosync extended virtual synchrony service
May 28 01:12:07 an-node01 corosync[2404]: [SERV ] Service engine loaded: corosync configuration service
May 28 01:12:07 an-node01 corosync[2404]: [SERV ] Service engine loaded: corosync cluster closed process group service v1.01
May 28 01:12:07 an-node01 corosync[2404]: [SERV ] Service engine loaded: corosync cluster config database access v1.01
May 28 01:12:07 an-node01 corosync[2404]: [SERV ] Service engine loaded: corosync profile loading service
May 28 01:12:07 an-node01 corosync[2404]: [SERV ] Service engine loaded: corosync cluster quorum service v0.1
May 28 01:12:07 an-node01 corosync[2404]: [MAIN ] Compatibility mode set to whitetank. Using V1 and V2 of the synchronization engine.
May 28 01:12:07 an-node01 corosync[2404]: [pcmk ] notice: pcmk_peer_update: Transitional membership event on ring 8: memb=0, new=0, lost=0
May 28 01:12:07 an-node01 corosync[2404]: [pcmk ] notice: pcmk_peer_update: Stable membership event on ring 8: memb=2, new=2, lost=0
May 28 01:12:07 an-node01 corosync[2404]: [pcmk ] info: pcmk_peer_update: NEW: an-node01.alteeve.com 1191422144
May 28 01:12:07 an-node01 corosync[2404]: [pcmk ] info: update_member: Creating entry for node 1208199360 born on 8
May 28 01:12:07 an-node01 corosync[2404]: [pcmk ] info: update_member: Node 1208199360/unknown is now: member
May 28 01:12:07 an-node01 corosync[2404]: [pcmk ] info: pcmk_peer_update: NEW: .pending. 1208199360
May 28 01:12:07 an-node01 corosync[2404]: [pcmk ] info: pcmk_peer_update: MEMB: an-node01.alteeve.com 1191422144
May 28 01:12:07 an-node01 corosync[2404]: [pcmk ] info: pcmk_peer_update: MEMB: .pending. 1208199360
May 28 01:12:07 an-node01 corosync[2404]: [pcmk ] info: send_member_notification: Sending membership update 8 to 0 children
May 28 01:12:07 an-node01 corosync[2404]: [pcmk ] info: update_member: 0x101ba30 Node 1191422144 ((null)) born on: 8
May 28 01:12:07 an-node01 corosync[2404]: [TOTEM ] A processor joined or left the membership and a new membership was formed.
May 28 01:12:07 an-node01 corosync[2404]: [pcmk ] info: update_member: 0x1022800 Node 1208199360 (an-node02.alteeve.com) born on: 8
May 28 01:12:07 an-node01 corosync[2404]: [pcmk ] info: update_member: 0x1022800 Node 1208199360 now known as an-node02.alteeve.com (was: (null))
May 28 01:12:07 an-node01 corosync[2404]: [pcmk ] info: update_member: Node an-node02.alteeve.com now has 1 quorum votes (was 0)
May 28 01:12:07 an-node01 corosync[2404]: [pcmk ] info: send_member_notification: Sending membership update 8 to 0 children
May 28 01:12:07 an-node01 corosync[2404]: [CPG ] downlist received left_list: 0
May 28 01:12:07 an-node01 corosync[2404]: [CPG ] downlist received left_list: 0
May 28 01:12:07 an-node01 corosync[2404]: [CPG ] chosen downlist from node r(0) ip(192.168.3.72)
May 28 01:12:07 an-node01 corosync[2404]: [MAIN ] Completed service synchronization, ready to provide service.
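The node IDs in these logs look like they are derived from each node's ring 0 IPv4 address, which is what corosync does when no nodeid is set in corosync.conf. For example, 1208199360 is 0x4803A8C0, and reading those bytes from least significant to most gives 192.168.3.72. A small sketch to decode them follows; nodeid_to_ip is a hypothetical helper, and the byte order is an assumption based only on the IDs seen here.

```shell
#!/bin/sh
# nodeid_to_ip is a hypothetical helper, not a corosync tool.
# It treats the node ID as a 32-bit value and prints its bytes,
# least significant first, as a dotted-quad address.
nodeid_to_ip() {
    id=$1
    echo "$(( id & 255 )).$(( (id >> 8) & 255 )).$(( (id >> 16) & 255 )).$(( (id >> 24) & 255 ))"
}

nodeid_to_ip 1208199360   # an-node02, prints 192.168.3.72
nodeid_to_ip 1191422144   # an-node01, prints 192.168.3.71
```

This makes it easier to match a bare node ID (for instance the '.pending.' entries) to a machine before its hostname has been learned.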
cmirror
cmirror - DOC-55285 - Jonathan Brassow - visegrips - #lvm