Anvil! Tutorial 3 on EL6

|-
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
cat <<-END>/etc/yum.repos.d/an.repo
[an-repo]
name=AN! Repo for Anvil! stuff
gpgcheck=0
protect=1
END
</syntaxhighlight>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
cat <<-END>/etc/yum.repos.d/an.repo
[an-repo]
name=AN! Repo for Anvil! stuff
gpgcheck=0
protect=1
END
</syntaxhighlight>
|}
{|class="wikitable"
!<span class="code">an-a04n01</span>
!<span class="code">an-a04n02</span>
|-
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
yum -y update
yum -y install bridge-utils vim pacemaker corosync cman gfs2-utils \
              ccs pcs ipmitool OpenIPMI lvm2-cluster drbd84-utils \
              drbd84-kmod
chkconfig ipmi on
chkconfig acpid off
chkconfig kdump off
chkconfig drbd off
/etc/init.d/ipmi start
/etc/init.d/acpid stop
/etc/init.d/kdump stop
/etc/init.d/drbd stop
</syntaxhighlight>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
# same as an-a04n01
</syntaxhighlight>
|}
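If you want to confirm the boot-time states after the <span class="code">chkconfig</span> calls above, a quick sanity check (using the same service names) is:

<syntaxhighlight lang="bash">
# 'ipmi' should be on; 'acpid', 'kdump' and 'drbd' should be off at every runlevel.
chkconfig --list | egrep 'ipmi|acpid|kdump|drbd'
</syntaxhighlight>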
<syntaxhighlight lang="bash">
# Internet-Facing Network - Bridge
DEVICE="ifn-bridge1"
TYPE="Bridge"
BOOTPROTO="none"
IPADDR="10.255.40.1"
NETMASK="255.255.0.0"
GATEWAY="10.255.255.254"
DNS1="8.8.8.8"
DNS2="8.8.4.4"
DEFROUTE="yes"
</syntaxhighlight>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
# Internet-Facing Network - Bridge
DEVICE="ifn-bridge1"
TYPE="Bridge"
BOOTPROTO="none"
IPADDR="10.255.40.2"
NETMASK="255.255.0.0"
GATEWAY="10.255.255.254"
DNS1="8.8.8.8"
DNS2="8.8.4.4"
DEFROUTE="yes"
</syntaxhighlight>
|}
{|class="wikitable"
!<span class="code">an-a04n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
vim /etc/sysconfig/network-scripts/ifcfg-ifn-bond1
</syntaxhighlight>
<syntaxhighlight lang="bash">
# Internet-Facing Network - Bond
DEVICE="ifn-bond1"
BRIDGE="ifn-bridge1"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
BONDING_OPTS="mode=1 miimon=100 use_carrier=1 updelay=120000 downdelay=0 primary=ifn-link1"
</syntaxhighlight>
|-
!<span class="code">an-a04n02</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
vim /etc/sysconfig/network-scripts/ifcfg-ifn-bond1
</syntaxhighlight>
<syntaxhighlight lang="bash">
# Internet-Facing Network - Bond
DEVICE="ifn-bond1"
BRIDGE="ifn-bridge1"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
BONDING_OPTS="mode=1 miimon=100 use_carrier=1 updelay=120000 downdelay=0 primary=ifn-link1"
</syntaxhighlight>
|}
<syntaxhighlight lang="bash">
# Internet-Facing Network - Link 1
HWADDR="00:1B:21:81:C3:34"
DEVICE="ifn-link1"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
MASTER="ifn-bond1"
SLAVE="yes"
</syntaxhighlight>
<syntaxhighlight lang="bash">
vim /etc/sysconfig/network-scripts/ifcfg-ifn-link2
</syntaxhighlight>
<syntaxhighlight lang="bash">
# Internet-Facing Network - Link 2
HWADDR="A0:36:9F:02:E0:05"
DEVICE="ifn-link2"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
MASTER="ifn-bond1"
SLAVE="yes"
</syntaxhighlight>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
# Internet-Facing Network - Link 1
HWADDR="00:1B:21:81:C2:EA"
DEVICE="ifn-link1"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
MASTER="ifn-bond1"
SLAVE="yes"
</syntaxhighlight>
<syntaxhighlight lang="bash">
vim /etc/sysconfig/network-scripts/ifcfg-ifn-link2
</syntaxhighlight>
<syntaxhighlight lang="bash">
# Internet-Facing Network - Link 2
HWADDR="A0:36:9F:07:D6:2F"
DEVICE="ifn-link2"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
MASTER="ifn-bond1"
SLAVE="yes"
</syntaxhighlight>
|}
{|class="wikitable"
!<span class="code">an-a04n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
vim /etc/sysconfig/network-scripts/ifcfg-sn-bond1
</syntaxhighlight>
<syntaxhighlight lang="bash">
# Storage Network - Bond
DEVICE="sn-bond1"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
BONDING_OPTS="mode=1 miimon=100 use_carrier=1 updelay=120000 downdelay=0 primary=sn-link1"
IPADDR="10.10.40.1"
NETMASK="255.255.0.0"
</syntaxhighlight>
|-
!<span class="code">an-a04n02</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
vim /etc/sysconfig/network-scripts/ifcfg-sn-bond1
</syntaxhighlight>
<syntaxhighlight lang="bash">
# Storage Network - Bond
DEVICE="sn-bond1"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
BONDING_OPTS="mode=1 miimon=100 use_carrier=1 updelay=120000 downdelay=0 primary=sn-link1"
IPADDR="10.10.40.2"
NETMASK="255.255.0.0"
</syntaxhighlight>
|}


* [[SN]] Links
<syntaxhighlight lang="bash">
# Storage Network - Link 1
HWADDR="00:19:99:9C:9B:9F"
DEVICE="sn-link1"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
MASTER="sn-bond1"
SLAVE="yes"
</syntaxhighlight>
<syntaxhighlight lang="bash">
# Storage Network - Link 2
HWADDR="A0:36:9F:02:E0:04"
DEVICE="sn-link2"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
MASTER="sn-bond1"
SLAVE="yes"
</syntaxhighlight>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
# Storage Network - Link 1
HWADDR="00:19:99:9C:A0:6D"
DEVICE="sn-link1"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
MASTER="sn-bond1"
SLAVE="yes"
</syntaxhighlight>
<syntaxhighlight lang="bash">
# Storage Network - Link 2
HWADDR="A0:36:9F:07:D6:2E"
DEVICE="sn-link2"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
MASTER="sn-bond1"
SLAVE="yes"
</syntaxhighlight>
|}


* [[BCN]] Bond
{|class="wikitable"
!<span class="code">an-a04n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
vim /etc/sysconfig/network-scripts/ifcfg-bcn-bond1
</syntaxhighlight>
<syntaxhighlight lang="bash">
# Back-Channel Network - Bond
DEVICE="bcn-bond1"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
BONDING_OPTS="mode=1 miimon=100 use_carrier=1 updelay=120000 downdelay=0 primary=bcn-link1"
IPADDR="10.20.40.1"
NETMASK="255.255.0.0"
</syntaxhighlight>
|-
!<span class="code">an-a04n02</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
vim /etc/sysconfig/network-scripts/ifcfg-bcn-bond1
</syntaxhighlight>
<syntaxhighlight lang="bash">
# Back-Channel Network - Bond
DEVICE="bcn-bond1"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
BONDING_OPTS="mode=1 miimon=100 use_carrier=1 updelay=120000 downdelay=0 primary=bcn-link1"
IPADDR="10.20.40.2"
NETMASK="255.255.0.0"
</syntaxhighlight>
|}
<syntaxhighlight lang="bash">
# Back-Channel Network - Link 1
HWADDR="00:19:99:9C:9B:9E"
DEVICE="bcn-link1"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
MASTER="bcn-bond1"
SLAVE="yes"
</syntaxhighlight>
<syntaxhighlight lang="bash">
vim /etc/sysconfig/network-scripts/ifcfg-bcn-link2
</syntaxhighlight>
<syntaxhighlight lang="bash">
# Back-Channel Network - Link 2
HWADDR="00:1B:21:81:C3:35"
DEVICE="bcn-link2"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
MASTER="bcn-bond1"
SLAVE="yes"
</syntaxhighlight>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
# Back-Channel Network - Link 1
HWADDR="00:19:99:9C:A0:6C"
DEVICE="bcn-link1"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
MASTER="bcn-bond1"
SLAVE="yes"
</syntaxhighlight>
<syntaxhighlight lang="bash">
vim /etc/sysconfig/network-scripts/ifcfg-bcn-link2
</syntaxhighlight>
<syntaxhighlight lang="bash">
# Back-Channel Network - Link 2
HWADDR="00:1B:21:81:C2:EB"
DEVICE="bcn-link2"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
MASTER="bcn-bond1"
SLAVE="yes"
</syntaxhighlight>
|}
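Once the network has been restarted with these files in place, a quick way to confirm that each bond found its slaves (and that the bridge exists) is to read the kernel's bonding status files. This is just a sanity check, assuming the device names used above:

<syntaxhighlight lang="bash">
# Shows the active slave, link state and failure counts for each bond.
cat /proc/net/bonding/bcn-bond1
cat /proc/net/bonding/sn-bond1
cat /proc/net/bonding/ifn-bond1

# Confirms that ifn-bridge1 exists and that ifn-bond1 is attached to it.
brctl show
</syntaxhighlight>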
Please be aware that this can reduce security. If this is a concern, skip this step.


{|class="wikitable"
!<span class="code">an-a04n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
sed -i.anvil 's/#GSSAPIAuthentication no/GSSAPIAuthentication no/' /etc/ssh/sshd_config
sed -i 's/GSSAPIAuthentication yes/#GSSAPIAuthentication yes/' /etc/ssh/sshd_config
sed -i 's/#UseDNS yes/UseDNS no/' /etc/ssh/sshd_config
/etc/init.d/sshd restart
diff -u /etc/ssh/sshd_config.anvil /etc/ssh/sshd_config
</syntaxhighlight>
<syntaxhighlight lang="diff">
--- /etc/ssh/sshd_config.anvil 2013-09-30 03:08:17.000000000 -0400
+++ /etc/ssh/sshd_config 2014-05-28 00:35:30.954000741 -0400
@@ -77,8 +77,8 @@
 #KerberosUseKuserok yes
 
 # GSSAPI options
-#GSSAPIAuthentication no
-GSSAPIAuthentication yes
+GSSAPIAuthentication no
+#GSSAPIAuthentication yes
 #GSSAPICleanupCredentials yes
 GSSAPICleanupCredentials yes
 #GSSAPIStrictAcceptorCheck yes
@@ -119,7 +119,7 @@
 #ClientAliveInterval 0
 #ClientAliveCountMax 3
 #ShowPatchLevel no
-#UseDNS yes
+UseDNS no
 #PidFile /var/run/sshd.pid
 #MaxStartups 10:30:100
 #PermitTunnel no
</syntaxhighlight>
|-
!<span class="code">an-a04n02</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
sed -i.anvil 's/#GSSAPIAuthentication no/GSSAPIAuthentication no/' /etc/ssh/sshd_config
sed -i 's/GSSAPIAuthentication yes/#GSSAPIAuthentication yes/' /etc/ssh/sshd_config
sed -i 's/#UseDNS yes/UseDNS no/' /etc/ssh/sshd_config
/etc/init.d/sshd restart
diff -u /etc/ssh/sshd_config.anvil /etc/ssh/sshd_config
</syntaxhighlight>
<syntaxhighlight lang="diff">
--- /etc/ssh/sshd_config.anvil 2013-09-30 03:08:17.000000000 -0400
+++ /etc/ssh/sshd_config 2014-05-28 00:35:33.016999110 -0400
@@ -77,8 +77,8 @@
 #KerberosUseKuserok yes
 
 # GSSAPI options
-#GSSAPIAuthentication no
-GSSAPIAuthentication yes
+GSSAPIAuthentication no
+#GSSAPIAuthentication yes
 #GSSAPICleanupCredentials yes
 GSSAPICleanupCredentials yes
 #GSSAPIStrictAcceptorCheck yes
@@ -119,7 +119,7 @@
 #ClientAliveInterval 0
 #ClientAliveCountMax 3
 #ShowPatchLevel no
-#UseDNS yes
+UseDNS no
 #PidFile /var/run/sshd.pid
 #MaxStartups 10:30:100
 #PermitTunnel no
</syntaxhighlight>
|}


Subsequent logins when the net is down should be quick.


== Setting the Hostname ==

TODO
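The EL6 steps for this section haven't been written yet. As a rough sketch (assuming the node names used in the rest of this tutorial), the host name on EL6 lives in <span class="code">/etc/sysconfig/network</span> and can be applied to the running system with the <span class="code">hostname</span> command:

<syntaxhighlight lang="bash">
# Persist the host name across reboots; EL6 reads HOSTNAME from this file.
sed -i.anvil 's/^HOSTNAME=.*/HOSTNAME=an-a04n01.alteeve.ca/' /etc/sysconfig/network

# Apply it immediately without a reboot.
hostname an-a04n01.alteeve.ca
</syntaxhighlight>

Use <span class="code">an-a04n02.alteeve.ca</span> on the second node, of course.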


== Setup The hosts File ==

You can use [[DNS]] if you prefer. For now, let's use <span class="code">/etc/hosts</span> for node name resolution.

{|class="wikitable"
!<span class="code">an-a04n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
vim /etc/hosts
</syntaxhighlight>
<syntaxhighlight lang="bash">
127.0.0.1  localhost localhost.localdomain localhost4 localhost4.localdomain4
::1        localhost localhost.localdomain localhost6 localhost6.localdomain6

### Anvil! systems
# Anvil! 04, Node 01
10.20.40.1 an-a04n01.bcn an-a04n01 an-a04n01.alteeve.ca
10.20.41.1 an-a04n01.ipmi
10.10.40.1 an-a04n01.sn
10.255.40.1 an-a04n01.ifn

# Anvil! 04, Node 02
10.20.40.2 an-a04n02.bcn an-a04n02 an-a04n02.alteeve.ca
10.20.41.2 an-a04n02.ipmi
10.10.40.2 an-a04n02.sn
10.255.40.2 an-a04n02.ifn

### Foundation Pack
# Network Switches
10.20.1.1 an-s01 an-s01.alteeve.ca
10.20.1.2 an-s02 an-s02.alteeve.ca # Only accessible when out of the stack
# Switched PDUs
10.20.2.1 an-p01 an-p01.alteeve.ca
10.20.2.2 an-p02 an-p02.alteeve.ca
# Network-monitored UPSes
10.20.3.1 an-u01 an-u01.alteeve.ca
10.20.3.2 an-u02 an-u02.alteeve.ca
### Monitor Packs
10.20.4.1 an-m01 an-m01.alteeve.ca
10.255.4.1 an-m01.ifn
10.20.4.2 an-m02 an-m02.alteeve.ca
10.255.4.2 an-m02.ifn
</syntaxhighlight>
|-
!<span class="code">an-a04n02</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
vim /etc/hosts
</syntaxhighlight>
<syntaxhighlight lang="bash">
127.0.0.1  localhost localhost.localdomain localhost4 localhost4.localdomain4
::1        localhost localhost.localdomain localhost6 localhost6.localdomain6

### Anvil! systems
# Anvil! 04, Node 01
10.20.40.1 an-a04n01.bcn an-a04n01 an-a04n01.alteeve.ca
10.20.41.1 an-a04n01.ipmi
10.10.40.1 an-a04n01.sn
10.255.40.1 an-a04n01.ifn

# Anvil! 04, Node 02
10.20.40.2 an-a04n02.bcn an-a04n02 an-a04n02.alteeve.ca
10.20.41.2 an-a04n02.ipmi
10.10.40.2 an-a04n02.sn
10.255.40.2 an-a04n02.ifn

### Foundation Pack
# Network Switches
10.20.1.1 an-s01 an-s01.alteeve.ca
10.20.1.2 an-s02 an-s02.alteeve.ca # Only accessible when out of the stack
# Switched PDUs
10.20.2.1 an-p01 an-p01.alteeve.ca
10.20.2.2 an-p02 an-p02.alteeve.ca
# Network-monitored UPSes
10.20.3.1 an-u01 an-u01.alteeve.ca
10.20.3.2 an-u02 an-u02.alteeve.ca
### Monitor Packs
10.20.4.1 an-m01 an-m01.alteeve.ca
10.255.4.1 an-m01.ifn
10.20.4.2 an-m02 an-m02.alteeve.ca
10.255.4.2 an-m02.ifn
</syntaxhighlight>
|}
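With the file saved on both nodes, you can confirm that the names resolve the way you expect. A quick check, using the names defined above, is:

<syntaxhighlight lang="bash">
# Each name should return the matching BCN, SN, IFN or IPMI address from /etc/hosts.
getent hosts an-a04n01 an-a04n01.sn an-a04n01.ifn an-a04n01.ipmi
getent hosts an-a04n02 an-a04n02.sn an-a04n02.ifn an-a04n02.ipmi
</syntaxhighlight>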


== Setup SSH ==

Same as [[AN!Cluster_Tutorial_2#Setting_up_SSH|before]].

== Populating And Pushing ~/.ssh/known_hosts ==
{|class="wikitable"
!<span class="code">an-a04n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
ssh-keygen -t rsa -N "" -b 8191 -f ~/.ssh/id_rsa
</syntaxhighlight>
<syntaxhighlight lang="text">
Generating public/private rsa key pair.
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
f9:41:7e:aa:96:8e:fa:47:79:f5:3a:33:89:c3:9a:4b root@an-a04n01.alteeve.ca
The key's randomart image is:
+--[ RSA 8191]----+
|                 |
|                 |
|           .     |
|         +  .    |
|        S.o...   |
|        o..+  .  |
|       .E+o. o   |
|       o+o+ *    |
|     .oo+*o . +  |
+-----------------+
</syntaxhighlight>
|-
!<span class="code">an-a04n02</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
ssh-keygen -t rsa -N "" -b 8191 -f ~/.ssh/id_rsa
</syntaxhighlight>
<syntaxhighlight lang="text">
Generating public/private rsa key pair.
Created directory '/root/.ssh'.
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
3f:1a:02:17:44:10:5e:6f:2b:98:44:09:e5:e0:ea:4b root@an-a04n02.alteeve.ca
The key's randomart image is:
+--[ RSA 8191]----+
|  oo==+          |
| . =.o .         |
|  . + . o        |
| . . o o .       |
|.  + o S         |
|.    o . .       |
| E    . . o      |
|. .    . o .     |
| .     .         |
+-----------------+
</syntaxhighlight>
|}
Setup authorized_keys:


{|class="wikitable"
!<span class="code">an-a04n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
ssh root@an-a04n02 "cat /root/.ssh/id_rsa.pub" >> ~/.ssh/authorized_keys
</syntaxhighlight>
<syntaxhighlight lang="text">
The authenticity of host 'an-a04n02 (10.20.40.2)' can't be established.
RSA key fingerprint is 22:09:7b:0c:8b:d8:80:08:80:6d:0e:bc:fb:5a:e1:de.
Are you sure you want to continue connecting (yes/no)? yes
</syntaxhighlight>
<syntaxhighlight lang="text">
Warning: Permanently added 'an-a04n02,10.20.40.2' (RSA) to the list of known hosts.
</syntaxhighlight>
<syntaxhighlight lang="text">
root@an-a04n02's password:
</syntaxhighlight>
|}


Populate <span class="code">~/.ssh/known_hosts</span>:


{|class="wikitable"
!<span class="code">an-a04n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
ssh-keyscan an-a04n01.alteeve.ca >> ~/.ssh/known_hosts
</syntaxhighlight>
<syntaxhighlight lang="text">
# an-a04n01.alteeve.ca SSH-2.0-OpenSSH_5.3
</syntaxhighlight>
<syntaxhighlight lang="bash">
ssh-keyscan an-a04n01 >> ~/.ssh/known_hosts
</syntaxhighlight>
<syntaxhighlight lang="text">
# an-a04n01 SSH-2.0-OpenSSH_5.3
</syntaxhighlight>
<syntaxhighlight lang="bash">
ssh-keyscan an-a04n01.bcn >> ~/.ssh/known_hosts
</syntaxhighlight>
<syntaxhighlight lang="text">
# an-a04n01.bcn SSH-2.0-OpenSSH_5.3
</syntaxhighlight>
<syntaxhighlight lang="bash">
ssh-keyscan an-a04n01.sn >> ~/.ssh/known_hosts
</syntaxhighlight>
<syntaxhighlight lang="text">
# an-a04n01.sn SSH-2.0-OpenSSH_5.3
</syntaxhighlight>
<syntaxhighlight lang="bash">
ssh-keyscan an-a04n01.ifn >> ~/.ssh/known_hosts
</syntaxhighlight>
<syntaxhighlight lang="text">
# an-a04n01.ifn SSH-2.0-OpenSSH_5.3
</syntaxhighlight>
<syntaxhighlight lang="bash">
ssh-keyscan an-a04n02.alteeve.ca >> ~/.ssh/known_hosts
</syntaxhighlight>
<syntaxhighlight lang="text">
# an-a04n02.alteeve.ca SSH-2.0-OpenSSH_5.3
</syntaxhighlight>
<syntaxhighlight lang="bash">
ssh-keyscan an-a04n02 >> ~/.ssh/known_hosts
</syntaxhighlight>
<syntaxhighlight lang="text">
# an-a04n02 SSH-2.0-OpenSSH_5.3
</syntaxhighlight>
<syntaxhighlight lang="bash">
ssh-keyscan an-a04n02.bcn >> ~/.ssh/known_hosts
</syntaxhighlight>
<syntaxhighlight lang="text">
# an-a04n02.bcn SSH-2.0-OpenSSH_5.3
</syntaxhighlight>
<syntaxhighlight lang="bash">
ssh-keyscan an-a04n02.sn >> ~/.ssh/known_hosts
</syntaxhighlight>
<syntaxhighlight lang="text">
# an-a04n02.sn SSH-2.0-OpenSSH_5.3
</syntaxhighlight>
<syntaxhighlight lang="bash">
ssh-keyscan an-a04n02.ifn >> ~/.ssh/known_hosts
</syntaxhighlight>
<syntaxhighlight lang="text">
# an-a04n02.ifn SSH-2.0-OpenSSH_5.3
</syntaxhighlight>
|}
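All of those scans can also be done in one pass. A small loop like the one below (using the same host names) is equivalent and easier to extend if you add more networks later:

<syntaxhighlight lang="bash">
# Scan every name both nodes are known by and append the keys in one go.
for host in an-a04n0{1,2}{,.alteeve.ca,.bcn,.sn,.ifn}
do
    ssh-keyscan ${host} >> ~/.ssh/known_hosts
done
</syntaxhighlight>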


Now copy the files to the second node:
{|class="wikitable"
!<span class="code">an-a04n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
rsync -av ~/.ssh/authorized_keys root@an-a04n02:/root/.ssh/
</syntaxhighlight>
<syntaxhighlight lang="text">
root@an-a04n02's password:
</syntaxhighlight>
<syntaxhighlight lang="text">
sending incremental file list
authorized_keys

sent 2937 bytes  received 31 bytes  1187.20 bytes/sec
total size is 2854  speedup is 0.96
</syntaxhighlight>
<syntaxhighlight lang="bash">
rsync -av ~/.ssh/known_hosts root@an-a04n02:/root/.ssh/
</syntaxhighlight>
<syntaxhighlight lang="text">
sending incremental file list
known_hosts

sent 4829 bytes  received 31 bytes  9720.00 bytes/sec
total size is 4750  speedup is 0.98
</syntaxhighlight>
|}
Note that there was no password prompt the second time. Hoozah!

== Configuring the Firewall ==
{|class="wikitable"
!<span class="code">an-a04n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
# cman (corosync's totem)
iptables -I INPUT -m state --state NEW -m multiport -p udp -s 10.20.0.0/16 -d 10.20.0.0/16 --dports 5404,5405 -j ACCEPT
iptables -I INPUT -m addrtype --dst-type MULTICAST -m state --state NEW -m multiport -p udp -s 10.20.0.0/16 --dports 5404,5405 -j ACCEPT

# dlm
iptables -I INPUT -m state --state NEW -p tcp -s 10.20.0.0/16 -d 10.20.0.0/16 --dport 21064 -j ACCEPT

# DRBD resource 0 and 1 - on the SN
iptables -I INPUT -m state --state NEW -p tcp -s 10.10.0.0/16 -d 10.10.0.0/16 --dport 7788 -j ACCEPT
iptables -I INPUT -m state --state NEW -p tcp -s 10.10.0.0/16 -d 10.10.0.0/16 --dport 7789 -j ACCEPT

# Make the new rules persistent.
/etc/init.d/iptables save
</syntaxhighlight>
<syntaxhighlight lang="text">
iptables: Saving firewall rules to /etc/sysconfig/iptables:[  OK  ]
</syntaxhighlight>
|-
!<span class="code">an-a04n02</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
# cman (corosync's totem)
iptables -I INPUT -m state --state NEW -m multiport -p udp -s 10.20.0.0/16 -d 10.20.0.0/16 --dports 5404,5405 -j ACCEPT
iptables -I INPUT -m addrtype --dst-type MULTICAST -m state --state NEW -m multiport -p udp -s 10.20.0.0/16 --dports 5404,5405 -j ACCEPT

# dlm
iptables -I INPUT -m state --state NEW -p tcp -s 10.20.0.0/16 -d 10.20.0.0/16 --dport 21064 -j ACCEPT

# DRBD resource 0 and 1 - on the SN
iptables -I INPUT -m state --state NEW -p tcp -s 10.10.0.0/16 -d 10.10.0.0/16 --dport 7788 -j ACCEPT
iptables -I INPUT -m state --state NEW -p tcp -s 10.10.0.0/16 -d 10.10.0.0/16 --dport 7789 -j ACCEPT

# Make the new rules persistent.
/etc/init.d/iptables save
</syntaxhighlight>
<syntaxhighlight lang="text">
iptables: Saving firewall rules to /etc/sysconfig/iptables:[  OK  ]
</syntaxhighlight>
|}
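If you want to double check that the rules took effect and were saved, something like this will do:

<syntaxhighlight lang="bash">
# The cluster, dlm and DRBD rules should appear at the top of the INPUT chain.
iptables -nL INPUT --line-numbers

# This is the copy that will be restored at boot.
cat /etc/sysconfig/iptables
</syntaxhighlight>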


== Keeping Time in Sync ==

It's not as critical as it used to be to keep the clocks on the nodes in sync, but it's still a good idea.


{|class="wikitable"
!<span class="code">an-a04n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
chkconfig ntpd on
/etc/init.d/ntpd start
</syntaxhighlight>
<syntaxhighlight lang="text">
Starting ntpd:                                             [  OK  ]
</syntaxhighlight>
|-
!<span class="code">an-a04n02</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
chkconfig ntpd on
/etc/init.d/ntpd start
</syntaxhighlight>
<syntaxhighlight lang="text">
Starting ntpd:                                             [  OK  ]
</syntaxhighlight>
|}
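After <span class="code">ntpd</span> has been running for a few minutes, you can confirm that it has picked an upstream time source:

<syntaxhighlight lang="bash">
# An asterisk ('*') in the first column marks the peer ntpd is currently synchronized to.
ntpq -p
</syntaxhighlight>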


= Configuring the Anvil! =

Now we're getting down to business!

For this section, we will be working on <span class="code">an-a04n01</span> and using [[ssh]] to perform tasks on <span class="code">an-a04n02</span>.

{{note|1=TODO: explain what this is and how it works.}}

== Configuring cman ==

With RHEL 6, we do not need to configure corosync directly. We will create a "skeleton" <span class="code">cluster.conf</span> file which will, in turn, handle corosync for us. Once the configuration is in place and has been copied to the peer, we will start pacemaker and its init script will handle starting (and stopping) cman and corosync for us.

We will use <span class="code">ccs</span> to configure the skeleton <span class="code">cluster.conf</span> file.


{|class="wikitable"
!<span class="code">an-a04n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
ccs -f /etc/cluster/cluster.conf --createcluster an-anvil-04
ccs -f /etc/cluster/cluster.conf --setcman two_node="1" expected_votes="1"
ccs -f /etc/cluster/cluster.conf --addnode an-a04n01.alteeve.ca
ccs -f /etc/cluster/cluster.conf --addnode an-a04n02.alteeve.ca
ccs -f /etc/cluster/cluster.conf --addfencedev pcmk agent=fence_pcmk
ccs -f /etc/cluster/cluster.conf --addmethod pcmk-redirect an-a04n01.alteeve.ca
ccs -f /etc/cluster/cluster.conf --addmethod pcmk-redirect an-a04n02.alteeve.ca
ccs -f /etc/cluster/cluster.conf --addfenceinst pcmk an-a04n01.alteeve.ca pcmk-redirect port=an-a04n01.alteeve.ca
ccs -f /etc/cluster/cluster.conf --addfenceinst pcmk an-a04n02.alteeve.ca pcmk-redirect port=an-a04n02.alteeve.ca
ccs -f /etc/cluster/cluster.conf --setfencedaemon post_join_delay="30"
cat /etc/cluster/cluster.conf
</syntaxhighlight>
<syntaxhighlight lang="xml">
<cluster config_version="10" name="an-anvil-04">
  <fence_daemon post_join_delay="30"/>
  <clusternodes>
    <clusternode name="an-a04n01.alteeve.ca" nodeid="1">
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="an-a04n01.alteeve.ca"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="an-a04n02.alteeve.ca" nodeid="2">
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="an-a04n02.alteeve.ca"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <cman expected_votes="1" two_node="1"/>
  <fencedevices>
    <fencedevice agent="fence_pcmk" name="pcmk"/>
  </fencedevices>
  <rm>
    <failoverdomains/>
    <resources/>
  </rm>
</cluster>
</syntaxhighlight>
|}
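Before copying the file to the peer, it doesn't hurt to confirm that the generated configuration passes the cluster schema check. The validation tool ships with the cluster packages installed earlier:

<syntaxhighlight lang="bash">
ccs_config_validate
</syntaxhighlight>
<syntaxhighlight lang="text">
Configuration validates
</syntaxhighlight>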


Copy it to an-a04n02;
{|class="wikitable"
!<span class="code">an-a04n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
rsync -av /etc/cluster/cluster.conf root@an-a04n02:/etc/cluster/
</syntaxhighlight>
<syntaxhighlight lang="text">
sending incremental file list
cluster.conf


sent 838 bytes  received 31 bytes  579.33 bytes/sec
total size is 758  speedup is 0.87
</syntaxhighlight>
|-
!<span class="code">an-a04n02</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
cat /etc/cluster/cluster.conf
</syntaxhighlight>
<syntaxhighlight lang="xml">
<cluster config_version="10" name="an-anvil-04">
  <fence_daemon post_join_delay="30"/>
  <clusternodes>
    <clusternode name="an-a04n01.alteeve.ca" nodeid="1">
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="an-a04n01.alteeve.ca"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="an-a04n02.alteeve.ca" nodeid="2">
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="an-a04n02.alteeve.ca"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <cman expected_votes="1" two_node="1"/>
  <fencedevices>
    <fencedevice agent="fence_pcmk" name="pcmk"/>
  </fencedevices>
  <rm>
    <failoverdomains/>
    <resources/>
  </rm>
</cluster>
</syntaxhighlight>
|}

== Configuring IPMI ==
== Starting Pacemaker ==


F19 specifics based on the [[IPMI]] tutorial.
Now start pacemaker proper.


<syntaxhighlight lang="bash">
{|class="wikitable"
yum -y install ipmitools OpenIPMI
!<span class="code">an-a04n01</span>
systemctl start ipmi.service
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
systemctl enable ipmi.service
/etc/init.d/pacemaker start
</syntaxhighlight>
</syntaxhighlight>
<syntaxhighlight lang="text">
<syntaxhighlight lang="text">
ln -s '/usr/lib/systemd/system/ipmi.service' '/etc/systemd/system/multi-user.target.wants/ipmi.service'
Starting cluster:
  Checking if cluster has been disabled at boot...       [  OK  ]
  Checking Network Manager...                            [  OK  ]
  Global setup...                                        [  OK  ]
  Loading kernel modules...                              [  OK  ]
  Mounting configfs...                                    [  OK  ]
  Starting cman...                                        [  OK  ]
  Waiting for quorum...                                  [  OK  ]
  Starting fenced...                                      [  OK  ]
  Starting dlm_controld...                                [  OK  ]
  Tuning DLM kernel config...                            [  OK  ]
  Starting gfs_controld...                                [  OK  ]
  Unfencing self...                                      [  OK  ]
  Joining fence domain...                                [  OK  ]
Starting Pacemaker Cluster Manager                        [  OK  ]
</syntaxhighlight>
</syntaxhighlight>
|-
!<span class="code">an-a04n02</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
/etc/init.d/pacemaker start
</syntaxhighlight>
<syntaxhighlight lang="text">
Starting cluster:  
Auth Type Support      : NONE MD5 PASSWORD
  Checking if cluster has been disabled at boot...       [  OK  ]
Auth Type Enable       : Callback : NONE MD5 PASSWORD
  Checking Network Manager...                            [  OK  ]
                        : User    : NONE MD5 PASSWORD
  Global setup...                                        [  OK  ]
                        : Operator : NONE MD5 PASSWORD
   Loading kernel modules...                              [  OK  ]
                        : Admin   : NONE MD5 PASSWORD
  Mounting configfs...                                    [  OK  ]
                        : OEM      : NONE MD5 PASSWORD
  Starting cman...                                        [  OK  ]
IP Address Source      : BIOS Assigned Address
  Waiting for quorum...                                   [  OK  ]
IP Address              : 10.20.51.1
  Starting fenced...                                     [  OK  ]
Subnet Mask            : 255.255.0.0
  Starting dlm_controld...                                [  OK  ]
MAC Address            : 00:19:99:9a:d8:e8
  Tuning DLM kernel config...                            [  OK  ]
SNMP Community String  : public
  Starting gfs_controld...                                [  OK  ]
IP Header              : TTL=0x40 Flags=0x40 Precedence=0x00 TOS=0x10
  Unfencing self...                                       [  OK  ]
Default Gateway IP      : 10.20.255.254
  Joining fence domain...                                 [  OK  ]
802.1q VLAN ID          : Disabled
Starting Pacemaker Cluster Manager                         [  OK  ]
802.1q VLAN Priority    : 0
RMCP+ Cipher Suites    : 0,1,2,3,6,7,8,17
Cipher Suite Priv Max  : OOOOOOOOXXXXXXX
                        :    X=Cipher Suite Unused
                         :    c=CALLBACK
                        :    u=USER
                        :    o=OPERATOR
                        :    a=ADMIN
                        :    O=OEM
</syntaxhighlight>
</syntaxhighlight>
|}


Verify pacemaker proper started as expected.


{|class="wikitable"
!<span class="code">an-a04n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
pcs status
</syntaxhighlight>
<syntaxhighlight lang="text">
Cluster name: an-anvil-04
WARNING: no stonith devices and stonith-enabled is not false
Last updated: Wed May 28 20:59:33 2014
Last change: Wed May 28 20:59:18 2014 via crmd on an-a04n01.alteeve.ca
Stack: cman
Current DC: an-a04n01.alteeve.ca - partition with quorum
Version: 1.1.10-14.el6_5.3-368c726
2 Nodes configured
0 Resources configured

Online: [ an-a04n01.alteeve.ca an-a04n02.alteeve.ca ]

Full list of resources:
</syntaxhighlight>
|-
!<span class="code">an-a04n02</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
pcs status
</syntaxhighlight>
<syntaxhighlight lang="text">
WARNING: no stonith devices and stonith-enabled is not false
Last updated: Wed May 28 20:59:29 2014
Last change: Wed May 28 20:59:18 2014 via crmd on an-a04n01.alteeve.ca
Stack: cman
Current DC: an-a04n01.alteeve.ca - partition with quorum
Version: 1.1.10-14.el6_5.3-368c726
2 Nodes configured
0 Resources configured

Online: [ an-a04n01.alteeve.ca an-a04n02.alteeve.ca ]

Full list of resources:
</syntaxhighlight>
|}


Note the error about stonith. We will address that momentarily.
 
== Configure and test stonith (aka fencing) ==

We will use [[IPMI]] and [[PDU]] based fence devices with [http://clusterlabs.org/wiki/STONITH_Levels STONITH levels].

You can see the list of available fence agents below. You will need to find the one for your hardware fence devices.

Note: [https://bugzilla.redhat.com/show_bug.cgi?id=1102444 Ignore the errors].
{|class="wikitable"
!<span class="code">an-a04n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
pcs stonith list
</syntaxhighlight>
<syntaxhighlight lang="text">
fence_apc - Fence agent for APC over telnet/ssh
fence_apc_snmp - Fence agent for APC over SNMP
fence_bladecenter - Fence agent for IBM BladeCenter
fence_bladecenter_snmp - Fence agent for IBM BladeCenter over SNMP
fence_brocade - Fence agent for Brocade over telnet
Error: no metadata for /usr/sbin/fence_check
fence_cisco_mds - Fence agent for Cisco MDS
fence_cisco_ucs - Fence agent for Cisco UCS
fence_drac - fencing agent for Dell Remote Access Card
fence_drac5 - Fence agent for Dell DRAC CMC/5
fence_eaton_snmp - Fence agent for Eaton over SNMP
fence_egenera - I/O Fencing agent for the Egenera BladeFrame
fence_eps - Fence agent for ePowerSwitch
fence_hpblade - Fence agent for HP BladeSystem
fence_ibmblade - Fence agent for IBM BladeCenter over SNMP
fence_idrac - Fence agent for IPMI over LAN
fence_ifmib - Fence agent for IF MIB
fence_ilo - Fence agent for HP iLO
fence_ilo2 - Fence agent for HP iLO
fence_ilo3 - Fence agent for IPMI over LAN
fence_ilo4 - Fence agent for IPMI over LAN
fence_ilo_mp - Fence agent for HP iLO MP
fence_imm - Fence agent for IPMI over LAN
fence_intelmodular - Fence agent for Intel Modular
fence_ipdu - Fence agent for iPDU over SNMP
fence_ipmilan - Fence agent for IPMI over LAN
fence_kdump - Fence agent for use with kdump
Error: no metadata for /usr/sbin/fence_node
fence_rhevm - Fence agent for RHEV-M REST API
fence_rsa - Fence agent for IBM RSA
fence_rsb - I/O Fencing agent for Fujitsu-Siemens RSB
fence_sanbox2 - Fence agent for QLogic SANBox2 FC switches
fence_scsi - fence agent for SCSI-3 persistent reservations
Error: no metadata for /usr/sbin/fence_tool
fence_virsh - Fence agent for virsh
fence_virt - Fence agent for virtual machines
fence_vmware - Fence agent for VMWare
fence_vmware_soap - Fence agent for VMWare over SOAP API
fence_wti - Fence agent for WTI
fence_xvm - Fence agent for virtual machines
</syntaxhighlight>
|}


We will use <span class="code">fence_ipmilan</span> and <span class="code">fence_apc_snmp</span>.


=== Configuring IPMI Fencing ===

Set up our IPMI BMCs (on LAN channel 2 and using user ID 2).


{|class="wikitable"
!<span class="code">an-a04n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
ipmitool lan set 2 ipsrc static
ipmitool lan set 2 ipaddr 10.20.41.1
ipmitool lan set 2 netmask 255.255.0.0
ipmitool lan set 2 defgw ipaddr 10.20.255.254
ipmitool user set password 2 Initial1
</syntaxhighlight>
|-
!<span class="code">an-a04n02</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
ipmitool lan set 2 ipsrc static
ipmitool lan set 2 ipaddr 10.20.41.2
ipmitool lan set 2 netmask 255.255.0.0
ipmitool lan set 2 defgw ipaddr 10.20.255.254
ipmitool user set password 2 Initial1
</syntaxhighlight>
|}


Test the new settings (using the hostnames we set in /etc/hosts):


{|class="wikitable"
!<span class="code">an-a04n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
fence_ipmilan -a an-a04n02.ipmi -l admin -p Initial1 -o status
</syntaxhighlight>
<syntaxhighlight lang="text">
Getting status of IPMI:an-a04n02.ipmi...Chassis power = On
Done
</syntaxhighlight>
 
|-
!<span class="code">an-a04n02</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
fence_ipmilan -a an-a04n01.ipmi -l admin -p Initial1 -o status
 
</syntaxhighlight>
<syntaxhighlight lang="text">
Getting status of IPMI:an-a04n01.ipmi...Chassis power = On
Done
</syntaxhighlight>
|}


Good, now we can configure IPMI fencing.


Every fence agent has a possibly unique subset of options that can be used. You can see a brief description of these options with the <span class="code">pcs stonith describe fence_X</span> command. Let's look at the options available for <span class="code">fence_ipmilan</span>.
{|class="wikitable"
!<span class="code">an-a04n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
pcs stonith describe fence_ipmilan
</syntaxhighlight>
<syntaxhighlight lang="text">
Stonith options for: fence_ipmilan
  auth: IPMI Lan Auth type (md5, password, or none)
  ipaddr: IPMI Lan IP to talk to
  passwd: Password (if required) to control power on IPMI device
  passwd_script: Script to retrieve password (if required)
  lanplus: Use Lanplus to improve security of connection
  login: Username/Login (if required) to control power on IPMI device
  action: Operation to perform. Valid operations: on, off, reboot, status, list, diag, monitor or metadata
  timeout: Timeout (sec) for IPMI operation
  cipher: Ciphersuite to use (same as ipmitool -C parameter)
  method: Method to fence (onoff or cycle)
  power_wait: Wait X seconds after on/off operation
  delay: Wait X seconds before fencing is started
  privlvl: Privilege level on IPMI device
  verbose: Verbose mode
  stonith-timeout: How long to wait for the STONITH action to complete per a stonith device.
  priority: The priority of the stonith resource. Devices are tried in order of highest priority to lowest.
  pcmk_host_map: A mapping of host names to ports numbers for devices that do not support host names.
  pcmk_host_list: A list of machines controlled by this device (Optional unless pcmk_host_check=static-list).
  pcmk_host_check: How to determin which machines are controlled by the device.
</syntaxhighlight>
|}


nodelist {
One of the nice things about pcs is that it allows us to create a test file to prepare all our changes in. Then, when we're happy with the changes, merge them into the running cluster. So let's make a copy called <span class="code">stonith_cfg</span>
  node {
        ring0_addr: an-a03n01.alteeve.ca
        nodeid: 1
      }
  node {
        ring0_addr: an-a03n02.alteeve.ca
        nodeid: 2
      }
}


quorum {
Now add [[IPMI]] fencing.
provider: corosync_votequorum
two_node: 1
}


logging {
{|class="wikitable"
to_syslog: yes
!<span class="code">an-a04n01</span>
}
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
pcs cluster cib stonith_cfg
#  work in our temp file        unique name    fence agent  target node                          device addr            options
pcs -f stonith_cfg stonith create fence_n01_ipmi fence_ipmilan pcmk_host_list="an-a04n01.alteeve.ca" ipaddr="an-a04n01.ipmi" action="reboot" login="admin" passwd="Initial1" delay=15 op monitor interval=10s
pcs -f stonith_cfg stonith create fence_n02_ipmi fence_ipmilan pcmk_host_list="an-a04n02.alteeve.ca" ipaddr="an-a04n02.ipmi" action="reboot" login="admin" passwd="Initial1" op monitor interval=10s
pcs cluster cib-push stonith_cfg
</syntaxhighlight>
</syntaxhighlight>
|}


Note that <span class="code">fence_n01_ipmi</span> has a <span class="code">delay=15</span> set but <span class="code">fence_n02_ipmi</span> does not. If the network connection breaks between the two nodes, they will both try to fence each other at the same time. If <span class="code">acpid</span> is running, the slower node will not die right away. It will continue to run for up to four more seconds, ample time for it to also initiate a fence against the faster node. The end result is that both nodes get fenced. The fifteen-second delay protects against this by causing <span class="code">an-a04n02</span> to pause for <span class="code">15</span> seconds before initiating a fence against <span class="code">an-a04n01</span>. If both nodes are alive, <span class="code">an-a04n02</span> will power off before the 15 seconds pass, so it will never fence <span class="code">an-a04n01</span>. However, if <span class="code">an-a04n01</span> really is dead, after the fifteen seconds have elapsed, fencing will proceed as normal.

NOTE: Get my PDUs back and use them here!
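
If you want to confirm the delay actually made it into the cluster configuration, you can ask pcs to display the stonith resources. This is a quick check of my own, not an original step, and the exact output format varies a little between pcs versions;

<syntaxhighlight lang="bash">
# fence_n01_ipmi should list delay=15 among its attributes; fence_n02_ipmi should not.
pcs stonith show fence_n01_ipmi
pcs stonith show fence_n02_ipmi
</syntaxhighlight>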


We can check the new configuration now;


{|class="wikitable"
!<span class="code">an-a04n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
pcs status
</syntaxhighlight>
<syntaxhighlight lang="text">
Cluster name: an-anvil-04
Last updated: Wed May 28 22:01:14 2014
Last change: Wed May 28 21:55:59 2014 via cibadmin on an-a04n01.alteeve.ca
Stack: cman
Current DC: an-a04n01.alteeve.ca - partition with quorum
Version: 1.1.10-14.el6_5.3-368c726
2 Nodes configured
2 Resources configured


Online: [ an-a04n01.alteeve.ca an-a04n02.alteeve.ca ]

Full list of resources:

 fence_n01_ipmi (stonith:fence_ipmilan): Started an-a04n01.alteeve.ca
 fence_n02_ipmi (stonith:fence_ipmilan): Started an-a04n02.alteeve.ca
</syntaxhighlight>
|}


Tell pacemaker to use fencing;


{|class="wikitable"
!<span class="code">an-a04n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
pcs property set stonith-enabled=true
pcs property set no-quorum-policy=ignore
pcs property
</syntaxhighlight>
<syntaxhighlight lang="text">
Cluster Properties:
 cluster-infrastructure: cman
 dc-version: 1.1.10-14.el6_5.3-368c726
 no-quorum-policy: ignore
 stonith-enabled: true
</syntaxhighlight>
|}


With quorum enabled, a two node cluster would lose quorum the moment either node failed, so we also set <span class="code">no-quorum-policy=ignore</span> to tell pacemaker to keep running without it.

Excellent!


<syntaxhighlight lang="bash">
== Configuring Fence Levels ==
pcs property set no-quorum-policy=ignore
pcs property
</syntaxhighlight>
<syntaxhighlight lang="text">
Cluster Properties:
dc-version: 1.1.9-0.1318.a7966fb.git.fc18-a7966fb
cluster-infrastructure: corosync
no-quorum-policy: ignore
</syntaxhighlight>


== Enabling and Configuring Fencing ==
TODO...


We will use IPMI and PDU based fence devices for redundancy.


You can see the list of available fence agents here. You will need to find the one for your hardware fence devices.
=== Test Fencing ===


<syntaxhighlight lang="bash">
ToDo: Kill each node with <span class="code">echo c > /proc/sysrq-trigger</span> and make sure the other node fences it.
pcs stonith list
 
</syntaxhighlight>
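
A minimal sketch of one test pass (my own suggestion, not part of the original steps), using the commands already shown in this tutorial: crash the node being tested from its own console, then watch from the survivor;

<syntaxhighlight lang="bash">
# On the node being tested; this crashes the kernel immediately!
echo c > /proc/sysrq-trigger

# On the surviving node, watch the fence action fire, then confirm the peer
# rebooted and the cluster recovered.
tail -f /var/log/messages | grep -i -e fence -e stonith
pcs status
fence_ipmilan -a an-a04n02.ipmi -l admin -p Initial1 -o status
</syntaxhighlight>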
= Shared Storage =
 
DRBD -> Clustered LVM -> GFS2
 
== DRBD ==
 
We will use DRBD 8.4.
 
=== Partition Storage ===
 
How you do this will depend a lot on your storage (local disks, md software RAID, hardware RAID, one or multiple arrays, etc). It will also depend on how you plan to divvy up your servers; you need two partitions, one for the servers that will run on node 1 and another for those that will run on node 2. It also depends on how much space you want for the /shared partition.
 
In our case, we're using a single hardware RAID array. We'll set aside 40 GB of space for /shared and divide the remaining free space evenly between the two server pools.
 
{|class="wikitable"
!<span class="code">an-a04n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
parted -a opt /dev/sda "print free"
</syntaxhighlight>
<syntaxhighlight lang="text">
<syntaxhighlight lang="text">
fence_alom - Fence agent for Sun ALOM
Model: LSI RAID 5/6 SAS 6G (scsi)
fence_apc - Fence agent for APC over telnet/ssh
Disk /dev/sda: 898GB
fence_apc_snmp - Fence agent for APC over SNMP
Sector size (logical/physical): 512B/512B
fence_baytech - I/O Fencing agent for Baytech RPC switches in combination with a Cyclades Terminal
Partition Table: msdos
                Server
 
fence_bladecenter - Fence agent for IBM BladeCenter
Number  Start  End    Size    Type    File system    Flags
fence_brocade - Fence agent for Brocade over telnet
        32.3kB  1049kB  1016kB          Free Space
fence_bullpap - I/O Fencing agent for Bull FAME architecture controlled by a PAP management console.
1      1049kB  538MB  537MB  primary  ext4            boot
fence_cisco_mds - Fence agent for Cisco MDS
2      538MB  4833MB  4295MB  primary  linux-swap(v1)
fence_cisco_ucs - Fence agent for Cisco UCS
3      4833MB  26.3GB  21.5GB  primary  ext4
fence_cpint - I/O Fencing agent for GFS on s390 and zSeries VM clusters
        26.3GB  898GB  872GB            Free Space
fence_drac - fencing agent for Dell Remote Access Card
</syntaxhighlight>
fence_drac5 - Fence agent for Dell DRAC CMC/5
|-
fence_eaton_snmp - Fence agent for Eaton over SNMP
!<span class="code">an-a04n01</span>
fence_egenera - I/O Fencing agent for the Egenera BladeFrame
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
fence_eps - Fence agent for ePowerSwitch
# same as an-a04n01
fence_hpblade - Fence agent for HP BladeSystem
</syntaxhighlight>
fence_ibmblade - Fence agent for IBM BladeCenter over SNMP
|}
So we have 872 GB of free space. Less the 40 GB for /shared, that leaves 832 GB for servers; divided evenly in two, it gives us 416 GB per server pool. Our first partition will then be 456 GB (416 GB plus the 40 GB for /shared) and the second will be 416 GB.

The free space starts at 26.3 GB, so our first partition will start at 26.3 GB and end at 492 GB (rounding off the .3). The second partition will then start at 492 GB and end at 898 GB, the end of the disk. Both of these new partitions will be contained in an extended partition.
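
If you want to sanity-check that arithmetic for your own hardware, a quick shell calculation works fine. This is a sketch of my own; the numbers are the ones reported by <span class="code">parted</span> above;

<syntaxhighlight lang="bash">
# Free space, /shared reservation and per-node pool, all in GB.
free_gb=872
shared_gb=40
pool_gb=$(( (free_gb - shared_gb) / 2 ))
echo "Each server pool gets ${pool_gb} GB"                                  # 416
echo "First partition: $(( pool_gb + shared_gb )) GB, second: ${pool_gb} GB"
</syntaxhighlight>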
{{note|1=After each change, we will get an error saying "Warning: WARNING: the kernel failed to re-read the partition table on /dev/sda (Device or resource busy). As a result, it may not reflect all of your changes until after reboot.". We will reboot once done to address this.}}


{|class="wikitable"
!<span class="code">an-a04n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
parted -a opt /dev/sda "mkpart extended 26.3GB 898GB"
parted -a opt /dev/sda "mkpart logical 26.3GB 492GB"
parted -a opt /dev/sda "mkpart logical 492GB 898GB"
parted -a opt /dev/sda "print free"
</syntaxhighlight>
<syntaxhighlight lang="text">
Model: LSI RAID 5/6 SAS 6G (scsi)
Disk /dev/sda: 898GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos

Number  Start   End     Size    Type      File system     Flags
        32.3kB  1049kB  1016kB            Free Space
 1      1049kB  538MB   537MB   primary   ext4            boot
 2      538MB   4833MB  4295MB  primary   linux-swap(v1)
 3      4833MB  26.3GB  21.5GB  primary   ext4
 4      26.3GB  898GB   872GB   extended                  lba
 5      26.3GB  492GB   466GB   logical
 6      492GB   898GB   406GB   logical
</syntaxhighlight>
|-
!<span class="code">an-a04n02</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
# same as an-a04n01
</syntaxhighlight>
|}


Reboot both nodes so that the kernel re-reads the new partition table.


{|class="wikitable"
!<span class="code">an-a04n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
reboot
</syntaxhighlight>
|-
!<span class="code">an-a04n02</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
reboot
</syntaxhighlight>
|}
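
Once the nodes are back up, it doesn't hurt to confirm the new layout is now visible. This is just the same print command again (my own suggestion, not an original step);

<syntaxhighlight lang="bash">
# Partitions 4 (extended), 5 and 6 (logical) should now be listed.
parted -a opt /dev/sda "print free"
</syntaxhighlight>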


<syntaxhighlight lang="bash">
=== Configure DRBD ===
pcs stonith describe fence_ipmilan
 
Configure <span class="code">global_common.conf</span>;

{|class="wikitable"
!<span class="code">an-a04n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
vim /etc/drbd.d/global_common.conf
</syntaxhighlight>
<syntaxhighlight lang="text">
# These are options to set for the DRBD daemon; they set the default values
# for resources.
global {
	# This tells DRBD that you allow it to report this installation to
	# LINBIT for statistical purposes. If you have privacy concerns, set
	# this to 'no'. The default is 'ask' which will prompt you each time
	# DRBD is updated. Set to 'yes' to allow it without being prompted.
	usage-count yes;

	# minor-count dialog-refresh disable-ip-verification
}

common {
	handlers {
		# pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
		# pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
		# local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
		# split-brain "/usr/lib/drbd/notify-split-brain.sh root";
		# out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root";
		# before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh -p 15 -- -c 16k";
		# after-resync-target /usr/lib/drbd/unsnapshot-resync-target-lvm.sh;

		# Hook into Pacemaker's fencing.
		fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
		before-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
	}

	startup {
		# wfc-timeout degr-wfc-timeout outdated-wfc-timeout wait-after-sb
	}

	options {
		# cpu-mask on-no-data-accessible
	}

	disk {
		# size max-bio-bvecs on-io-error fencing disk-barrier disk-flushes
		# disk-drain md-flushes resync-rate resync-after al-extents
		# c-plan-ahead c-delay-target c-fill-target c-max-rate
		# c-min-rate disk-timeout
		fencing resource-and-stonith;
	}

	net {
		# protocol timeout max-epoch-size max-buffers unplug-watermark
		# connect-int ping-int sndbuf-size rcvbuf-size ko-count
		# allow-two-primaries cram-hmac-alg shared-secret after-sb-0pri
		# after-sb-1pri after-sb-2pri always-asbp rr-conflict
		# ping-timeout data-integrity-alg tcp-cork on-congestion
		# congestion-fill congestion-extents csums-alg verify-alg
		# use-rle

		# Protocol "C" tells DRBD not to tell the operating system that
		# the write is complete until the data has reached persistent
		# storage on both nodes. This is the slowest option, but it is
		# also the only one that guarantees consistency between the
		# nodes. It is also required for dual-primary, which we will
		# be using.
		protocol C;

		# Tell DRBD to allow dual-primary. This is needed to enable
		# live-migration of our servers.
		allow-two-primaries yes;

		# This tells DRBD what to do in the case of a split-brain when
		# neither node was primary, when one node was primary and when
		# both nodes are primary. In our case, we'll be running
		# dual-primary, so we can not safely recover automatically. The
		# only safe option is for the nodes to disconnect from one
		# another and let a human decide which node to invalidate.
		after-sb-0pri discard-zero-changes;
		after-sb-1pri discard-secondary;
		after-sb-2pri disconnect;
	}
}
</syntaxhighlight>
|}
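
Before moving on, you may want to confirm that the Pacemaker fence-handler scripts referenced above actually exist on both nodes. This is a small check of my own; the paths come straight from the config;

<syntaxhighlight lang="bash">
# Both scripts ship with the drbd-pacemaker / drbd-utils packages.
ls -l /usr/lib/drbd/crm-fence-peer.sh /usr/lib/drbd/crm-unfence-peer.sh
</syntaxhighlight>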
And now configure the first resource;
 
{|class="wikitable"
!<span class="code">an-a04n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
vim /etc/drbd.d/r0.res
</syntaxhighlight>
<syntaxhighlight lang="bash">
# This is the first DRBD resource. It will store the shared file systems and
# the servers designed to run on node 01.
resource r0 {
	# These options here are common to both nodes. If for some reason you
	# need to set unique values per node, you can move these to the
	# 'on <name> { ... }' section.

	# This sets the device name of this DRBD resource.
	device /dev/drbd0;

	# This tells DRBD what the backing device is for this resource.
	disk /dev/sda5;

	# This controls the location of the metadata. When "internal" is used,
	# as we use here, a little space at the end of the backing devices is
	# set aside (roughly 32 MB per 1 TB of raw storage). External metadata
	# can be used to put the metadata on another partition when converting
	# existing file systems to be DRBD backed, when there is no extra space
	# available for the metadata.
	meta-disk internal;

	# NOTE: this is not required or even recommended with pacemaker. Remove
	# this option as soon as pacemaker is set up.
	startup {
		# This tells DRBD to promote both nodes to 'primary' when this
		# resource starts. However, we will let pacemaker control this
		# so we comment it out, which tells DRBD to leave both nodes
		# as secondary when drbd starts.
		#become-primary-on both;
	}

	# NOTE: Later, make it an option in the dashboard to trigger a manual
	# verify and/or schedule periodic automatic runs
	net {
		# TODO: Test performance differences between sha1 and md5
		# This tells DRBD how to do a block-by-block verification of
		# the data stored on the backing devices. Any verification
		# failures will result in the affected block being marked
		# out-of-sync.
		verify-alg md5;

		# TODO: Test the performance hit of this being enabled.
		# This tells DRBD to generate a checksum for each transmitted
		# packet. If the received data doesn't generate the same
		# sum, a retransmit request is generated. This protects against
		# otherwise-undetected errors in transmission, like
		# bit-flipping. See:
		# http://www.drbd.org/users-guide/s-integrity-check.html
		data-integrity-alg md5;
	}

	# WARNING: Confirm that these are safe when the controller's BBU is
	#          depleted/failed and the controller enters write-through
	#          mode.
	disk {
		# TODO: Test the real-world performance differences gained with
		#       these options.
		# This tells DRBD not to bypass the write-back caching on the
		# RAID controller. Normally, DRBD forces the data to be flushed
		# to disk, rather than allowing the write-back caching to
		# handle it. Normally this is dangerous, but with BBU-backed
		# caching, it is safe. The first option disables disk flushing
		# and the second disables metadata flushes.
		disk-flushes no;
		md-flushes no;
	}

	# This sets up the resource on node 01. The name used below must be the
	# name returned by "uname -n".
	on an-a04n01.alteeve.ca {
		# This is the address and port to use for DRBD traffic on this
		# node. Multiple resources can use the same IP but the ports
		# must differ. By convention, the first resource uses 7788, the
		# second uses 7789 and so on, incrementing by one for each
		# additional resource.
		address 10.10.40.1:7788;
	}
	on an-a04n02.alteeve.ca {
		address 10.10.40.2:7788;
	}
}
</syntaxhighlight>
|}
And the second.


{|class="wikitable"
{|class="wikitable"
!<span class="code">an-a03n01</span>
!<span class="code">an-a04n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
pcs stonith create fence_n01_virsh fence_virsh pcmk_host_list="an-a03n01.alteeve.ca" ipaddr="192.168.122.1" action="reboot" login="root" passwd_script="/root/lemass.pw" port="an-a03n01" delay=15 op monitor interval=60s
vim /etc/drbd.d/r1.res
pcs stonith create fence_n02_virsh fence_virsh pcmk_host_list="an-a03n02.alteeve.ca" ipaddr="192.168.122.1" action="reboot" login="root" passwd_script="/root/lemass.pw" port="an-a03n02" op monitor interval=60s
pcs cluster status
</syntaxhighlight>
</syntaxhighlight>
<syntaxhighlight lang="text">
<syntaxhighlight lang="bash">
Cluster Status:
# This is the first DRBD resource. It will store the servers designed
Last updated: Sun Jan 26 15:45:31 2014
# to run on node 02.
Last change: Sun Jan 26 15:06:14 2014 via crmd on an-a03n01.alteeve.ca
resource r1 {
  Stack: corosync
device /dev/drbd1;
Current DC: an-a03n02.alteeve.ca (2) - partition with quorum
disk /dev/sda6;
  Version: 1.1.10-19.el7-368c726
meta-disk internal;
2 Nodes configured
   
  2 Resources configured
net {
 
verify-alg md5;
PCSD Status:
data-integrity-alg md5;
an-a03n01.alteeve.ca:
}
  an-a03n01.alteeve.ca: Online
   
an-a03n02.alteeve.ca:
disk {
  an-a03n02.alteeve.ca: Online
disk-flushes no;
md-flushes no;
}
   
on an-a04n01.alteeve.ca {
address 10.10.40.1:7789;
}
on an-a04n02.alteeve.ca {
address 10.10.40.2:7789;
}
}
</syntaxhighlight>
</syntaxhighlight>
|}
|}


Test the config;

{|class="wikitable"
!<span class="code">an-a04n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
drbdadm dump
</syntaxhighlight>
<syntaxhighlight lang="text">
# /etc/drbd.conf
common {
}


# resource r0 on an-a04n01.alteeve.ca: not ignored, not stacked
# defined at /etc/drbd.d/r0.res:3
resource r0 {
    on an-a04n01.alteeve.ca {
        volume 0 {
            device      /dev/drbd0 minor 0;
            disk        /dev/sda5;
            meta-disk    internal;
        }
        address          ipv4 10.10.40.1:7788;
    }
    on an-a04n02.alteeve.ca {
        volume 0 {
            device      /dev/drbd0 minor 0;
            disk        /dev/sda5;
            meta-disk    internal;
        }
        address          ipv4 10.10.40.2:7788;
    }
    net {
        verify-alg      md5;
        data-integrity-alg md5;
    }
    disk {
        disk-flushes      no;
        md-flushes        no;
    }
}


# resource r1 on an-a04n01.alteeve.ca: not ignored, not stacked
# defined at /etc/drbd.d/r1.res:3
resource r1 {
    on an-a04n01.alteeve.ca {
        volume 0 {
            device      /dev/drbd1 minor 1;
            disk        /dev/sda6;
            meta-disk    internal;
        }
        address          ipv4 10.10.40.1:7789;
    }
    on an-a04n02.alteeve.ca {
        volume 0 {
            device      /dev/drbd1 minor 1;
            disk        /dev/sda6;
            meta-disk    internal;
        }
        address          ipv4 10.10.40.2:7789;
    }
    net {
        verify-alg      md5;
        data-integrity-alg md5;
    }
    disk {
        disk-flushes      no;
        md-flushes        no;
    }
}
</syntaxhighlight>
|}


Good, copy it to the other node and test it there.


{|class="wikitable"
!<span class="code">an-a04n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
rsync -av /etc/drbd.* root@an-a04n02:/etc/
</syntaxhighlight>
<syntaxhighlight lang="text">
sending incremental file list
drbd.d/
drbd.d/global_common.conf
drbd.d/r0.res
drbd.d/r1.res
sent 5738 bytes  received 73 bytes  11622.00 bytes/sec
total size is 5618  speedup is 0.97
</syntaxhighlight>
|-
!<span class="code">an-a04n02</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
drbdadm dump
</syntaxhighlight>
<syntaxhighlight lang="text">
# /etc/drbd.conf
common {
}
# resource r0 on an-a04n02.alteeve.ca: not ignored, not stacked
# defined at /etc/drbd.d/r0.res:3
resource r0 {
    on an-a04n01.alteeve.ca {
        volume 0 {
            device      /dev/drbd0 minor 0;
            disk        /dev/sda5;
            meta-disk    internal;
        }
        address          ipv4 10.10.40.1:7788;
    }
    on an-a04n02.alteeve.ca {
        volume 0 {
            device      /dev/drbd0 minor 0;
            disk        /dev/sda5;
            meta-disk    internal;
        }
        address          ipv4 10.10.40.2:7788;
    }
    net {
        verify-alg      md5;
        data-integrity-alg md5;
    }
    disk {
        disk-flushes      no;
        md-flushes        no;
    }
}
# resource r1 on an-a04n02.alteeve.ca: not ignored, not stacked
# defined at /etc/drbd.d/r1.res:3
resource r1 {
    on an-a04n01.alteeve.ca {
        volume 0 {
            device      /dev/drbd1 minor 1;
            disk        /dev/sda6;
            meta-disk    internal;
        }
        address          ipv4 10.10.40.1:7789;
    }
    on an-a04n02.alteeve.ca {
        volume 0 {
            device      /dev/drbd1 minor 1;
            disk        /dev/sda6;
            meta-disk    internal;
        }
        address          ipv4 10.10.40.2:7789;
    }
    net {
        verify-alg      md5;
        data-integrity-alg md5;
    }
    disk {
        disk-flushes      no;
        md-flushes        no;
    }
}
</syntaxhighlight>
|}


This isn't a plain dump of your configs; you will notice things have been shifted around. The point is that it dumped the configuration without errors, so we're good to go.
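
If you want to script this check, the exit code is enough. A minimal sketch of my own, not an original step;

<syntaxhighlight lang="bash">
# drbdadm exits non-zero when it can not parse the configuration.
if drbdadm dump >/dev/null; then
    echo "DRBD configuration parses cleanly"
else
    echo "DRBD configuration has errors; fix them before proceeding"
fi
</syntaxhighlight>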


=== Start DRBD for the first time ===

Load the config;


{|class="wikitable"
!<span class="code">an-a04n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
modprobe drbd
lsmod | grep drbd
</syntaxhighlight>
<syntaxhighlight lang="text">
drbd                  333723  0
libcrc32c              1246  1 drbd
</syntaxhighlight>
|-
!<span class="code">an-a04n02</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
modprobe drbd
lsmod | grep drbd
</syntaxhighlight>
<syntaxhighlight lang="text">
drbd                  333723  0
libcrc32c              1246  1 drbd
</syntaxhighlight>
|}


{{note|1=If you have used these partitions before, drbd may see an FS and refuse to create the MD. If that happens, use 'dd' to zero out the partition.}}
Create the metadisk;

{|class="wikitable"
!<span class="code">an-a04n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
drbdadm create-md r{0,1}
</syntaxhighlight>
<syntaxhighlight lang="text">
Writing meta data...
initializing activity log
NOT initializing bitmap
New drbd meta data block successfully created.
success
Writing meta data...
initializing activity log
NOT initializing bitmap
New drbd meta data block successfully created.
success
</syntaxhighlight>
|-
!<span class="code">an-a04n02</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
drbdadm create-md r{0,1}
</syntaxhighlight>
<syntaxhighlight lang="text">
Writing meta data...
initializing activity log
NOT initializing bitmap
New drbd meta data block successfully created.
success
Writing meta data...
initializing activity log
NOT initializing bitmap
New drbd meta data block successfully created.
success
</syntaxhighlight>
|}


Bring up the new resources.


{|class="wikitable"
!<span class="code">an-a04n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
drbdadm up r{0,1}
cat /proc/drbd
</syntaxhighlight>
<syntaxhighlight lang="text">
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 599f286440bd633d15d5ff985204aff4bccffadd build by root@rhel6-builder.alteeve.ca, 2014-07-20 21:29:34
 0: cs:WFConnection ro:Secondary/Unknown ds:Inconsistent/Outdated C r----s
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:454762916
 1: cs:WFConnection ro:Secondary/Unknown ds:Inconsistent/Outdated C r----s
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:396782732
</syntaxhighlight>
|-
!<span class="code">an-a04n02</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
drbdadm up r{0,1}
cat /proc/drbd
</syntaxhighlight>
<syntaxhighlight lang="text">
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 599f286440bd633d15d5ff985204aff4bccffadd build by root@rhel6-builder.alteeve.ca, 2014-07-20 21:29:34
 0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:454762916
 1: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:396782732
</syntaxhighlight>
|}


Neither node has data, so we'll arbitrarily force node 01 to become primary, then normally promote node 02 to primary.


{|class="wikitable"
!<span class="code">an-a04n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
drbdadm primary --force r{0,1}
cat /proc/drbd
</syntaxhighlight>
<syntaxhighlight lang="text">
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 599f286440bd633d15d5ff985204aff4bccffadd build by root@rhel6-builder.alteeve.ca, 2014-07-20 21:29:34
 0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
    ns:2136 nr:0 dw:0 dr:2800 al:0 bm:0 lo:0 pe:3 ua:0 ap:0 ep:1 wo:d oos:454760880
        [>....................] sync'ed:  0.1% (444100/444104)M
        finish: 421:04:29 speed: 252 (252) K/sec
 1: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
    ns:24696 nr:0 dw:0 dr:25360 al:0 bm:1 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:396758036
        [>....................] sync'ed:  0.1% (387456/387480)M
        finish: 35:33:06 speed: 3,084 (3,084) K/sec
</syntaxhighlight>
|-
!<span class="code">an-a04n02</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
drbdadm primary r{0,1}
cat /proc/drbd
</syntaxhighlight>
<syntaxhighlight lang="text">
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 599f286440bd633d15d5ff985204aff4bccffadd build by root@rhel6-builder.alteeve.ca, 2014-07-20 21:29:34
 0: cs:SyncTarget ro:Primary/Primary ds:Inconsistent/UpToDate C r-----
    ns:0 nr:859488 dw:859432 dr:608 al:0 bm:52 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:453903484
        [>....................] sync'ed:  0.2% (443264/444104)M
        finish: 71:24:53 speed: 1,752 (4,428) want: 440 K/sec
 1: cs:SyncTarget ro:Primary/Primary ds:Inconsistent/UpToDate C r-----
    ns:0 nr:1140588 dw:1140532 dr:608 al:0 bm:69 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:395642200
        [>....................] sync'ed:  0.3% (386368/387480)M
        finish: 70:30:41 speed: 1,548 (5,876) want: 4,400 K/sec
</syntaxhighlight>
|}
 
The sync rate starts low, but it will continue to climb; you can keep an eye on it if you wish. DRBD 8.4 is smarter than 8.3 in that it adjusts the sync rate automatically based on load.

We can proceed now; we do not have to wait for the sync to complete.
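
If you want to keep an eye on the resync without re-running the command by hand, one simple (entirely optional) way is to poll <span class="code">/proc/drbd</span>:

<syntaxhighlight lang="bash">
# Refresh the DRBD status every two seconds; press ctrl+c to exit.
watch -n 2 cat /proc/drbd
</syntaxhighlight>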

== Clustered LVM and GFS2 ==

Clustered LVM provides the logical volumes that will back our /shared GFS2 partition and the storage for the HA servers.

=== Configure lvm.conf ===

Configure clustered LVM.

{|class="wikitable"
!<span class="code">an-a04n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
sed -i.anvil 's^filter = \[ "a/\.\*/" \]^filter = \[ "a|/dev/drbd*|", "r/.*/" \]^' /etc/lvm/lvm.conf
sed -i 's/locking_type = 1$/locking_type = 3/' /etc/lvm/lvm.conf
sed -i 's/fallback_to_local_locking = 1$/fallback_to_local_locking = 0/' /etc/lvm/lvm.conf
diff -u /etc/lvm/lvm.conf.anvil /etc/lvm/lvm.conf
</syntaxhighlight>
<syntaxhighlight lang="diff">
--- /etc/lvm/lvm.conf.anvil 2013-10-30 04:10:42.000000000 -0400
+++ /etc/lvm/lvm.conf 2014-06-04 18:38:15.545166869 -0400
@@ -82,7 +82,7 @@
     # By default we accept every block device:
-   filter = [ "a/.*/" ]
+    filter = [ "a|/dev/drbd*|", "r/.*/" ]
     # Exclude the cdrom drive
     # filter = [ "r|/dev/cdrom|" ]
@@ -459,7 +459,7 @@
     # Type 3 uses built-in clustered locking.
     # Type 4 uses read-only locking which forbids any operations that might
     # change metadata.
-   locking_type = 1
+    locking_type = 3
     # Set to 0 to fail when a lock request cannot be satisfied immediately.
     wait_for_locks = 1
@@ -475,7 +475,7 @@
     # to 1 an attempt will be made to use local file-based locking (type 1).
     # If this succeeds, only commands against local volume groups will proceed.
     # Volume Groups marked as clustered will be ignored.
-   fallback_to_local_locking = 1
+    fallback_to_local_locking = 0
     # Local non-LV directory that holds file-based locks while commands are
     # in progress.  A directory like /tmp that may get wiped on reboot is OK.
</syntaxhighlight>
<syntaxhighlight lang="bash">
rsync -av /etc/lvm/lvm.conf* root@an-a04n02:/etc/lvm/
</syntaxhighlight>
<syntaxhighlight lang="text">
sending incremental file list
lvm.conf
lvm.conf.anvil

sent 47499 bytes  received 440 bytes  95878.00 bytes/sec
total size is 89999  speedup is 1.88
</syntaxhighlight>
|-
!<span class="code">an-a04n02</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
diff -u /etc/lvm/lvm.conf.anvil /etc/lvm/lvm.conf
</syntaxhighlight>
<syntaxhighlight lang="diff">
--- /etc/lvm/lvm.conf.anvil 2013-10-30 04:10:42.000000000 -0400
+++ /etc/lvm/lvm.conf 2014-06-04 18:38:15.000000000 -0400
@@ -82,7 +82,7 @@
    # By default we accept every block device:
-    filter = [ "a/.*/" ]
+    filter = [ "a|/dev/drbd*|", "r/.*/" ]
    # Exclude the cdrom drive
    # filter = [ "r|/dev/cdrom|" ]
@@ -459,7 +459,7 @@
    # Type 3 uses built-in clustered locking.
    # Type 4 uses read-only locking which forbids any operations that might
    # change metadata.
-    locking_type = 1
+    locking_type = 3
    # Set to 0 to fail when a lock request cannot be satisfied immediately.
    wait_for_locks = 1
@@ -475,7 +475,7 @@
    # to 1 an attempt will be made to use local file-based locking (type 1).
    # If this succeeds, only commands against local volume groups will proceed.
    # Volume Groups marked as clustered will be ignored.
-   fallback_to_local_locking = 1
+    fallback_to_local_locking = 0
    # Local non-LV directory that holds file-based locks while commands are
    # in progress.  A directory like /tmp that may get wiped on reboot is OK.
</syntaxhighlight>
|}
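
If you want to confirm that the three values are now what we expect (a quick sanity check, not a required step), a grep on either node works fine:

<syntaxhighlight lang="bash">
# Show the active (uncommented) filter, locking and fallback settings.
grep -E '^\s*(filter|locking_type|fallback_to_local_locking)\s*=' /etc/lvm/lvm.conf
</syntaxhighlight>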
 
=== Start clvmd ===

{{note|1=This will be moved to pacemaker shortly. We're enabling it here just long enough to configure pacemaker.}}

Make sure the cluster is up (you could use 'pcs status', 'cman_tool status', etc):

{|class="wikitable"
!<span class="code">an-a04n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
dlm_tool dump | grep node
</syntaxhighlight>
<syntaxhighlight lang="text">
1401921044 cluster node 1 added seq 68
1401921044 set_configfs_node 1 10.20.40.1 local 1
1401921044 cluster node 2 added seq 68
1401921044 set_configfs_node 2 10.20.40.2 local 0
1401921044 run protocol from nodeid 1
</syntaxhighlight>
|}

Make sure DRBD is up as primary on both nodes:

{|class="wikitable"
!<span class="code">an-a04n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
cat /proc/drbd
</syntaxhighlight>
<syntaxhighlight lang="text">
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by root@rhel6-builder.alteeve.ca, 2014-04-20 12:16:31
0: cs:SyncSource ro:Primary/Primary ds:UpToDate/Inconsistent C r-----
    ns:1519672 nr:0 dw:0 dr:1520336 al:0 bm:93 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:448214308
[>....................] sync'ed:  0.4% (437708/439192)M
finish: 6:20:02 speed: 19,652 (15,992) K/sec
1: cs:SyncSource ro:Primary/Primary ds:UpToDate/Inconsistent C r-----
    ns:1896504 nr:0 dw:0 dr:1897168 al:0 bm:115 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:390577164
[>....................] sync'ed:  0.5% (381420/383272)M
finish: 2:33:17 speed: 42,440 (19,960) K/sec
</syntaxhighlight>
|}


Note that we don't have to wait for the sync to finish.

Start clvmd;

{|class="wikitable"
!<span class="code">an-a04n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
/etc/init.d/clvmd start
</syntaxhighlight>
<syntaxhighlight lang="text">
Starting clvmd:
Activating VG(s):   No volume groups found
                                                           [  OK  ]
</syntaxhighlight>
|-
!<span class="code">an-a04n02</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
/etc/init.d/clvmd start
</syntaxhighlight>
<syntaxhighlight lang="text">
Starting clvmd:
Activating VG(s):  No volume groups found
                                                           [  OK  ]
</syntaxhighlight>
|}

{{note|1=If this fails, showing a timeout or simply never returning, make sure that TCP port 21064 is opened in your firewall on both nodes.}}
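
If you do need to open the port, a minimal iptables rule would look something like this (a sketch only; adjust it to fit your existing firewall policy, and run it on both nodes):

<syntaxhighlight lang="bash">
# Allow clustered LVM / DLM traffic on TCP port 21064, then save the rule set.
iptables -I INPUT -p tcp --dport 21064 -j ACCEPT
service iptables save
</syntaxhighlight>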


From here on, pacemaker will start clvmd when pacemaker itself starts, ''if'' clvmd is set to start on boot. So let's set that.

{|class="wikitable"
!<span class="code">an-a04n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
chkconfig clvmd on
chkconfig --list clvmd
</syntaxhighlight>
<syntaxhighlight lang="text">
clvmd          0:off 1:off 2:on 3:on 4:on 5:on 6:off
</syntaxhighlight>
|-
!<span class="code">an-a04n02</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
chkconfig clvmd on
chkconfig --list clvmd
</syntaxhighlight>
<syntaxhighlight lang="text">
clvmd          0:off 1:off 2:on 3:on 4:on 5:on 6:off
</syntaxhighlight>
|}


Once <span class="code">/proc/drbd</span> shows both nodes connected, force one to primary and it will sync over the second.
=== Create Initial PVs, VGs and the /shared LV ===


Create the [[PV]], [[VG]] and the <span class="code">/shared</span> [[LV]];
{|class="wikitable"
!<span class="code">an-a04n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
pvcreate /dev/drbd{0,1}
</syntaxhighlight>
<syntaxhighlight lang="text">
  Physical volume "/dev/drbd0" successfully created
  Physical volume "/dev/drbd1" successfully created
</syntaxhighlight>
<syntaxhighlight lang="bash">
<syntaxhighlight lang="bash">
drbdadm primary --force r0
vgcreate an-a04n01_vg0 /dev/drbd0
</syntaxhighlight>
<syntaxhighlight lang="text">
  Clustered volume group "an-a04n01_vg0" successfully created
</syntaxhighlight>
</syntaxhighlight>
You should see the resource syncing now. Push both nodes to primary;
<syntaxhighlight lang="bash">
<syntaxhighlight lang="bash">
drbdadm primary r0
vgcreate an-a04n02_vg0 /dev/drbd1
</syntaxhighlight>
<syntaxhighlight lang="text">
  Clustered volume group "an-a04n02_vg0" successfully created
</syntaxhighlight>
</syntaxhighlight>
 
<syntaxhighlight lang="bash">
== DLM, Clustered LVM and GFS2 ==
lvcreate -L 40GiB -n shared an-a04n01_vg0
 
{|class="wikitable"
!<span class="code">an-a03n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
sed -i.anvil 's^filter = \[ "a/\.\*/" \]^filter = \[ "a|/dev/drbd*|", "r/.*/" \]^' /etc/lvm/lvm.conf
sed -i 's/locking_type = 1$/locking_type = 3/' /etc/lvm/lvm.conf
sed -i 's/fallback_to_local_locking = 1$/fallback_to_local_locking = 0/' /etc/lvm/lvm.conf
sed -i 's/use_lvmetad = 1$/use_lvmetad = 0/' /etc/lvm/lvm.conf
</syntaxhighlight>
</syntaxhighlight>
<syntaxhighlight lang="diff">
<syntaxhighlight lang="text">
--- /etc/lvm/lvm.conf.anvil 2013-11-27 03:28:08.000000000 -0500
  Logical volume "shared" created
+++ /etc/lvm/lvm.conf 2014-01-26 18:57:41.026928464 -0500
@@ -84,7 +84,7 @@
    # lvmetad is used" comment that is attached to global/use_lvmetad setting.
    # By default we accept every block device:
-    filter = [ "a/.*/" ]
+    filter = [ "a|/dev/drbd*|", "r/.*/" ]
    # Exclude the cdrom drive
    # filter = [ "r|/dev/cdrom|" ]
@@ -451,7 +451,7 @@
    # supported in clustered environment. If use_lvmetad=1 and locking_type=3
    # is set at the same time, LVM always issues a warning message about this
    # and then it automatically disables lvmetad use.
-    locking_type = 1
+    locking_type = 3
    # Set to 0 to fail when a lock request cannot be satisfied immediately.
    wait_for_locks = 1
@@ -467,7 +467,7 @@
    # to 1 an attempt will be made to use local file-based locking (type 1).
    # If this succeeds, only commands against local volume groups will proceed.
    # Volume Groups marked as clustered will be ignored.
-    fallback_to_local_locking = 1
+    fallback_to_local_locking = 0
    # Local non-LV directory that holds file-based locks while commands are
    # in progress.  A directory like /tmp that may get wiped on reboot is OK.
@@ -594,7 +594,7 @@
    # supported in clustered environment. If use_lvmetad=1 and locking_type=3
    # is set at the same time, LVM always issues a warning message about this
    # and then it automatically disables lvmetad use.
-    use_lvmetad = 1
+    use_lvmetad = 0
    # Full path of the utility called to check that a thin metadata device
    # is in a state that allows it to be used.
</syntaxhighlight>
</syntaxhighlight>
<syntaxhighlight lang="bash">
|-
rsync -av /etc/lvm/lvm.conf* root@an-a03n02:/etc/lvm/
!<span class="code">an-a04n02</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
pvdisplay
</syntaxhighlight>
</syntaxhighlight>
<syntaxhighlight lang="text">
<syntaxhighlight lang="text">
sending incremental file list
  --- Physical volume ---
lvm.conf
  PV Name              /dev/drbd1
lvm.conf.anvil
  VG Name              an-a04n02_vg0
  PV Size              378.40 GiB / not usable 3.14 MiB
  Allocatable          yes
  PE Size              4.00 MiB
  Total PE              96870
  Free PE              96870
  Allocated PE          0
  PV UUID              TpEXBC-7822-UGz0-ICz1-AJdg-v5eS-lyB7C5
 
  --- Physical volume ---
  PV Name              /dev/drbd0
  VG Name              an-a04n01_vg0
  PV Size              433.70 GiB / not usable 4.41 MiB
  Allocatable          yes
  PE Size              4.00 MiB
  Total PE              111025
  Free PE              100785
  Allocated PE          10240
  PV UUID              RoHAJQ-qrsO-Ofwz-f8W7-jIXd-2cvG-oPgfFR
</syntaxhighlight>
<syntaxhighlight lang="bash">
vgdisplay
</syntaxhighlight>
<syntaxhighlight lang="text">
  --- Volume group ---
  VG Name              an-a04n02_vg0
  System ID           
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  1
  VG Access            read/write
  VG Status            resizable
  Clustered            yes
  Shared                no
  MAX LV                0
  Cur LV                0
  Open LV              0
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size              378.40 GiB
  PE Size              4.00 MiB
  Total PE              96870
  Alloc PE / Size      0 / 0 
  Free  PE / Size      96870 / 378.40 GiB
  VG UUID              9bTBDu-JSma-kwKR-4oBI-sxi1-YT6i-1uIM4C
 
  --- Volume group ---
  VG Name              an-a04n01_vg0
  System ID           
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  2
  VG Access            read/write
  VG Status            resizable
  Clustered            yes
  Shared                no
  MAX LV                0
  Cur LV                1
  Open LV              0
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size              433.69 GiB
  PE Size              4.00 MiB
  Total PE              111025
  Alloc PE / Size      10240 / 40.00 GiB
  Free  PE / Size      100785 / 393.69 GiB
  VG UUID              hLnvle-EScm-cP1t-xodO-cKyv-5EyC-TyIpj5
</syntaxhighlight>
<syntaxhighlight lang="bash">
lvdisplay
</syntaxhighlight>
<syntaxhighlight lang="text">
  --- Logical volume ---
  LV Path                /dev/an-a04n01_vg0/shared
  LV Name                shared
  VG Name                an-a04n01_vg0
  LV UUID                tvolRF-cb3L-29Dn-Vgqd-e4rf-Qq2e-JFIcbA
  LV Write Access        read/write
  LV Creation host, time an-a04n01.alteeve.ca, 2014-06-07 18:54:41 -0400
  LV Status              available
  # open                0
  LV Size                40.00 GiB
  Current LE            10240
  Segments              1
  Allocation            inherit
  Read ahead sectors    auto
  - currently set to    256
  Block device          253:0
</syntaxhighlight>
|}
 
=== Create the /shared GFS2 filesystem ===
 
Format the <span class="code">/dev/an-a04n01_vg0/shared</span> logical volume as a GFS2 filesystem;
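
The general form is worth noting; the '-j' count is one journal per node that will mount the filesystem, and the '-t' lock table must be "<cluster name>:<filesystem name>", where the cluster name matches the name used in the cluster configuration. As a sketch (the placeholders are not literal values to type):

<syntaxhighlight lang="bash">
# Generic form; replace the placeholders with your own values.
mkfs.gfs2 -j <journal count> -p lock_dlm -t <cluster_name>:<fs_name> <device>
</syntaxhighlight>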


sent 48536 bytes  received 440 bytes  97952.00 bytes/sec
{|class="wikitable"
total size is 90673  speedup is 1.85
!<span class="code">an-a04n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
mkfs.gfs2 -j 2 -p lock_dlm -t an-anvil-04:shared /dev/an-a04n01_vg0/shared
</syntaxhighlight>
</syntaxhighlight>
|-
<syntaxhighlight lang="text">
!<span class="code">an-a03n02</span>
This will destroy any data on /dev/an-a04n01_vg0/shared.
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
It appears to contain: symbolic link to `../dm-0'
diff -u /etc/lvm/lvm.conf.anvil /etc/lvm/lvm.conf
 
Are you sure you want to proceed? [y/n] y
</syntaxhighlight>
</syntaxhighlight>
<syntaxhighlight lang="diff">
<syntaxhighlight lang="text">
--- /etc/lvm/lvm.conf.anvil 2013-11-27 03:28:08.000000000 -0500
Device:                   /dev/an-a04n01_vg0/shared
+++ /etc/lvm/lvm.conf 2014-01-26 18:57:41.000000000 -0500
Blocksize:                 4096
@@ -84,7 +84,7 @@
Device Size                40.00 GB (10485760 blocks)
    # lvmetad is used" comment that is attached to global/use_lvmetad setting.
Filesystem Size:          40.00 GB (10485758 blocks)
Journals:                  2
    # By default we accept every block device:
Resource Groups:          160
-    filter = [ "a/.*/" ]
Locking Protocol:          "lock_dlm"
+    filter = [ "a|/dev/drbd*|", "r/.*/" ]
Lock Table:                "an-anvil-04:shared"
UUID:                      e07d35fe-6860-f790-38cd-af075366c27b
    # Exclude the cdrom drive
</syntaxhighlight>
    # filter = [ "r|/dev/cdrom|" ]
<syntaxhighlight lang="bash">
@@ -451,7 +451,7 @@
mkdir /shared
    # supported in clustered environment. If use_lvmetad=1 and locking_type=3
mount /dev/an-a04n01_vg0/shared /shared
    # is set at the same time, LVM always issues a warning message about this
df -hP
    # and then it automatically disables lvmetad use.
-    locking_type = 1
+    locking_type = 3
    # Set to 0 to fail when a lock request cannot be satisfied immediately.
    wait_for_locks = 1
@@ -467,7 +467,7 @@
    # to 1 an attempt will be made to use local file-based locking (type 1).
    # If this succeeds, only commands against local volume groups will proceed.
    # Volume Groups marked as clustered will be ignored.
-   fallback_to_local_locking = 1
+    fallback_to_local_locking = 0
    # Local non-LV directory that holds file-based locks while commands are
    # in progress.  A directory like /tmp that may get wiped on reboot is OK.
@@ -594,7 +594,7 @@
    # supported in clustered environment. If use_lvmetad=1 and locking_type=3
    # is set at the same time, LVM always issues a warning message about this
    # and then it automatically disables lvmetad use.
-   use_lvmetad = 1
+    use_lvmetad = 0
    # Full path of the utility called to check that a thin metadata device
    # is in a state that allows it to be used.
</syntaxhighlight>
|}
 
Disable <span class="code">lvmetad</span> as it's not cluster-aware.
 
{|class="wikitable"
!<span class="code">an-a03n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
systemctl disable lvm2-lvmetad.service
systemctl disable lvm2-lvmetad.socket
systemctl stop lvm2-lvmetad.service
</syntaxhighlight>
</syntaxhighlight>
<syntaxhighlight lang="text">
<syntaxhighlight lang="text">
rm '/etc/systemd/system/sockets.target.wants/lvm2-lvmetad.socket'
Filesystem                        Size  Used Avail Use% Mounted on
/dev/sda3                          20G  1.5G  18G  8% /
tmpfs                              12G  67M  12G  1% /dev/shm
/dev/sda1                          504M  72M  407M  16% /boot
/dev/mapper/an--a04n01_vg0-shared  40G  259M  40G  1% /shared
</syntaxhighlight>
</syntaxhighlight>
|-
|-
!<span class="code">an-a03n02</span>
!<span class="code">an-a04n02</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
systemctl disable lvm2-lvmetad.service
mkdir /shared
systemctl disable lvm2-lvmetad.socket
mount /dev/an-a04n01_vg0/shared /shared
systemctl stop lvm2-lvmetad.service
df -hP
</syntaxhighlight>
</syntaxhighlight>
<syntaxhighlight lang="text">
<syntaxhighlight lang="text">
rm '/etc/systemd/system/sockets.target.wants/lvm2-lvmetad.socket'
Filesystem                        Size  Used Avail Use% Mounted on
/dev/sda3                          20G  1.5G  18G  8% /
tmpfs                              12G  52M  12G  1% /dev/shm
/dev/sda1                          504M  72M  407M  16% /boot
/dev/mapper/an--a04n01_vg0-shared  40G  259M  40G  1% /shared
</syntaxhighlight>
</syntaxhighlight>
|}
|}
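
We created two journals because this is a two-node cluster. If you ever need another journal later (this tutorial doesn't), gfs2_jadd can add journals to the mounted filesystem; a quick sketch:

<syntaxhighlight lang="bash">
# Add one more journal to the mounted /shared filesystem.
gfs2_jadd -j 1 /shared
</syntaxhighlight>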


{{note|1=This will be moved to pacemaker shortly. We're enabling it here just long enough to configure pacemaker.}}
= Add Storage to Pacemaker =
 
== Configure Dual-Primary DRBD ==
 
Set up DRBD as a dual-primary resource.


Start DLM and clvmd;
Notes:
* Clones allow for a given service to run on multiple nodes.
** master-max is how many copies of the resource can be promoted to master at the same time across the cluster.
** master-node-max is how many copies of the resource can be promoted to master on a given node.
** clone-max is how many copies can run in the cluster; the default is the number of nodes in the cluster.
** clone-node-max is the number of instances of the resource that can run on each node.
** notify controls whether other nodes are notified before and after a resource is started or stopped on a given node. A generic sketch of the command form follows this list.
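
As a generic sketch only (the actual commands used for this cluster are in the table below; the resource and clone names here are placeholders), a master/slave clone with these options is created along these lines:

<syntaxhighlight lang="bash">
# Generic form; <clone_name> and <resource_name> are placeholders.
pcs resource master <clone_name> <resource_name> \
    master-max=2 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
</syntaxhighlight>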


{|class="wikitable"
{|class="wikitable"
!<span class="code">an-a03n01</span>
!<span class="code">an-a04n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
systemctl start dlm.service
pcs cluster cib drbd_cfg
systemctl start clvmd.service
pcs -f drbd_cfg resource create drbd_r0 ocf:linbit:drbd drbd_resource=r0 op monitor interval=10s
pcs -f drbd_cfg resource create drbd_r1 ocf:linbit:drbd drbd_resource=r1 op monitor interval=10s
### Ignore this for now.
#pcs -f drbd_cfg resource create drbd_r0 ocf:linbit:drbd drbd_resource=r0 \
#                op monitor interval=29s role=Master \
#                op monitor interval=31s role=Slave \
#                op promote interval=0 timeout=90s start-delay=2s \
#                op start interval=0 timeout=240s \
#                op stop interval=0 timeout=120s
pcs -f drbd_cfg resource master drbd_r0_Clone drbd_r0 master-max=2 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
pcs -f drbd_cfg resource master drbd_r1_Clone drbd_r1 master-max=2 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
pcs cluster cib-push drbd_cfg
</syntaxhighlight>
</syntaxhighlight>
|-
<syntaxhighlight lang="text">
!<span class="code">an-a03n02</span>
CIB updated
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
systemctl start dlm.service
systemctl start clvmd.service
</syntaxhighlight>
</syntaxhighlight>
|}
|}


Create the [[PV]], [[VG]] and the <span class="code">/shared</span> [[LV]];
Give it a couple of minutes for both nodes to be promoted to <span class="code">Master</span>. Initially, it will appear as <span class="code">Master</span> on one node only.
 
Once updated, you should see this:


{|class="wikitable"
{|class="wikitable"
!<span class="code">an-a03n01</span>
!<span class="code">an-a04n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
pvcreate /dev/drbd0
pcs status
</syntaxhighlight>
</syntaxhighlight>
<syntaxhighlight lang="text">
<syntaxhighlight lang="text">
  Physical volume "/dev/drbd0" successfully created
Cluster name: an-anvil-04
</syntaxhighlight>
Last updated: Sat Jun  7 20:29:09 2014
<syntaxhighlight lang="bash">
Last change: Sat Jun  7 20:28:36 2014 via cibadmin on an-a04n01.alteeve.ca
vgcreate an-a03n01_vg0 /dev/drbd0
Stack: cman
</syntaxhighlight>
Current DC: an-a04n01.alteeve.ca - partition with quorum
<syntaxhighlight lang="text">
Version: 1.1.10-14.el6_5.3-368c726
  /proc/devices: No entry for device-mapper found
2 Nodes configured
  Clustered volume group "an-a03n01_vg0" successfully created
6 Resources configured
</syntaxhighlight>
 
<syntaxhighlight lang="bash">
 
lvcreate -L 10G -n shared an-a03n01_vg0
Online: [ an-a04n01.alteeve.ca an-a04n02.alteeve.ca ]
</syntaxhighlight>
 
<syntaxhighlight lang="text">
Full list of resources:
  Logical volume "shared" created
 
fence_n01_ipmi (stonith:fence_ipmilan): Started an-a04n01.alteeve.ca
fence_n02_ipmi (stonith:fence_ipmilan): Started an-a04n02.alteeve.ca
Master/Slave Set: drbd_r0_Clone [drbd_r0]
    Masters: [ an-a04n01.alteeve.ca an-a04n02.alteeve.ca ]
Master/Slave Set: drbd_r1_Clone [drbd_r1]
    Masters: [ an-a04n01.alteeve.ca an-a04n02.alteeve.ca ]
</syntaxhighlight>
</syntaxhighlight>
|-
|-
!<span class="code">an-a03n02</span>
!<span class="code">an-a04n02</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
pvscan
pcs status
</syntaxhighlight>
</syntaxhighlight>
<syntaxhighlight lang="text">
<syntaxhighlight lang="text">
  PV /dev/drbd0  VG an-a03n01_vg0  lvm2 [20.00 GiB / 20.00 GiB free]
Cluster name: an-anvil-04
  Total: 1 [20.00 GiB] / in use: 1 [20.00 GiB] / in no VG: 0 []
Last updated: Sat Jun  7 20:29:36 2014
</syntaxhighlight>
Last change: Sat Jun  7 20:28:36 2014 via cibadmin on an-a04n01.alteeve.ca
<syntaxhighlight lang="bash">
Stack: cman
vgscan
Current DC: an-a04n01.alteeve.ca - partition with quorum
Version: 1.1.10-14.el6_5.3-368c726
2 Nodes configured
6 Resources configured
 
 
Online: [ an-a04n01.alteeve.ca an-a04n02.alteeve.ca ]
 
Full list of resources:
 
fence_n01_ipmi (stonith:fence_ipmilan): Started an-a04n01.alteeve.ca
fence_n02_ipmi (stonith:fence_ipmilan): Started an-a04n02.alteeve.ca
Master/Slave Set: drbd_r0_Clone [drbd_r0]
    Masters: [ an-a04n01.alteeve.ca an-a04n02.alteeve.ca ]
Master/Slave Set: drbd_r1_Clone [drbd_r1]
    Masters: [ an-a04n01.alteeve.ca an-a04n02.alteeve.ca ]
</syntaxhighlight>
</syntaxhighlight>
<syntaxhighlight lang="text">
|}
  Reading all physical volumes. This may take a while...
 
  Found volume group "an-a03n01_vg0" using metadata type lvm2
=== Configure LVM ===
</syntaxhighlight>
 
<syntaxhighlight lang="bash">
We need to have pacemaker activate our clustered LVM LVs on start, and deactivate them when stopping. We don't start/stop clvmd directly because of stop timing issues that can lead to stray fencing.
lvscan
 
{{note|1=This will throw errors if there are no LVs on a given VG... Do not add a volume group until at least one logical volume has been created.}}
 
{|class="wikitable"
!<span class="code">an-a04n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
pcs cluster cib lvm_cfg
pcs -f lvm_cfg resource create lvm_n01_vg0 ocf:heartbeat:lvm volgrpname=an-a04n01_vg0 op monitor interval=10s
pcs -f lvm_cfg resource master lvm_n01_vg0_Clone lvm_n01_vg0 master-max=2 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
pcs cluster cib-push lvm_cfg
</syntaxhighlight>
</syntaxhighlight>
<syntaxhighlight lang="text">
<syntaxhighlight lang="text">
  ACTIVE            '/dev/an-a03n01_vg0/shared' [10.00 GiB] inherit
CIB updated
</syntaxhighlight>
</syntaxhighlight>
|}
|}


Format the <span class="code">/dev/an-a03n01_vg0/shared</span>;
== Configure LVM to start after the DRBD PV is Primary ==
 
If we stopped here, there is a good chance that on future starts of pacemaker, LVM and DRBD would start in parallel; DRBD would take too long, LVM would error out and STONITH actions would start to fly. To prevent this, we will tell pacemaker not to start the LVM resource until after the DRBD resource backing the volume group has been promoted to primary.


{|class="wikitable"
{|class="wikitable"
!<span class="code">an-a03n01</span>
!<span class="code">an-a04n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
mkfs.gfs2 -j 2 -p lock_dlm -t an-cluster-03:shared /dev/an-a03n01_vg0/shared
pcs cluster cib cst_cfg
pcs -f cst_cfg constraint order promote drbd_r0_Clone then start lvm_n01_vg0_Clone
</syntaxhighlight>
</syntaxhighlight>
<syntaxhighlight lang="text">
<syntaxhighlight lang="text">
/dev/an-a03n01_vg0/shared is a symbolic link to /dev/dm-0
Adding drbd_r0_Clone lvm_n01_vg0_Clone (kind: Mandatory) (Options: first-action=promote then-action=start)
This will destroy any data on /dev/dm-0
</syntaxhighlight>
</syntaxhighlight>
<syntaxhighlight lang="text">
<syntaxhighlight lang="bash">
Are you sure you want to proceed? [y/n]y
pcs cluster cib-push cst_cfg
</syntaxhighlight>
</syntaxhighlight>
<syntaxhighlight lang="text">
<syntaxhighlight lang="text">
Device:                    /dev/an-a03n01_vg0/shared
CIB updated
Block size:                4096
Device size:              10.00 GB (2621440 blocks)
Filesystem size:          10.00 GB (2621438 blocks)
Journals:                  2
Resource groups:          40
Locking protocol:          "lock_dlm"
Lock table:                "an-cluster-03:shared"
UUID:                      20bafdb0-1f86-f424-405b-9bf608c0c486
</syntaxhighlight>
</syntaxhighlight>
<syntaxhighlight lang="bash">
<syntaxhighlight lang="bash">
mkdir /shared
pcs constraint show
mount /dev/an-a03n01_vg0/shared /shared
df -h
</syntaxhighlight>
</syntaxhighlight>
<syntaxhighlight lang="text">
<syntaxhighlight lang="text">
Filesystem                        Size  Used Avail Use% Mounted on
Location Constraints:
/dev/vda3                          18G  5.6G  12G  32% /
Ordering Constraints:
devtmpfs                          932M    0  932M  0% /dev
   promote drbd_r0_Clone then start lvm_n01_vg0_Clone
tmpfs                              937M  61M  877M  7% /dev/shm
Colocation Constraints:
tmpfs                              937M  1.9M  935M  1% /run
tmpfs                              937M    0  937M  0% /sys/fs/cgroup
/dev/loop0                        4.4G  4.4G    0 100% /mnt/dvd
/dev/vda1                          484M  83M  401M  18% /boot
/dev/mapper/an--a03n01_vg0-shared  10G  259M  9.8G   3% /shared
</syntaxhighlight>
|-
!<span class="code">an-a03n02</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
</syntaxhighlight>
<syntaxhighlight lang="text">
Filesystem                        Size  Used Avail Use% Mounted on
/dev/vda3                          18G  5.6G  12G  32% /
devtmpfs                          932M    0  932M  0% /dev
tmpfs                              937M  76M  862M  9% /dev/shm
tmpfs                              937M  2.0M  935M  1% /run
tmpfs                              937M    0  937M  0% /sys/fs/cgroup
/dev/loop0                        4.4G  4.4G    0 100% /mnt/dvd
/dev/vda1                          484M  83M  401M  18% /boot
/dev/mapper/an--a03n01_vg0-shared  10G  259M  9.8G  3% /shared
</syntaxhighlight>
</syntaxhighlight>
|}
|}


Shut down <span class="code">gfs2</span>, <span class="code">clvmd</span> and <span class="code">drbd</span> now.
== Configure the /shared GFS2 Partition ==


{|class="wikitable"
{|class="wikitable"
!<span class="code">an-a03n01</span>
!<span class="code">an-a04n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
umount /shared/
pcs cluster cib fs_cfg
systemctl stop clvmd.service
pcs -f fs_cfg resource create sharedFS Filesystem device="/dev/an-a04n01_vg0/shared" directory="/shared" fstype="gfs2"
drbdadm down r0
pcs -f fs_cfg resource clone sharedFS master-max=2 master-node-max=1 clone-max=2 clone-node-max=1
pcs cluster cib-push fs_cfg
</syntaxhighlight>
</syntaxhighlight>
|-
<syntaxhighlight lang="text">
!<span class="code">an-a03n02</span>
CIB updated
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
</syntaxhighlight>
umount /shared/
<syntaxhighlight lang="bash">
systemctl stop clvmd.service
pcs status
drbdadm down r0
</syntaxhighlight>
</syntaxhighlight>
|}
<syntaxhighlight lang="text">
Cluster name: an-anvil-04
Last updated: Sat Jun  7 21:09:28 2014
Last change: Sat Jun  7 21:08:47 2014 via cibadmin on an-a04n01.alteeve.ca
Stack: cman
Current DC: an-a04n01.alteeve.ca - partition with quorum
Version: 1.1.10-14.el6_5.3-368c726
2 Nodes configured
8 Resources configured


Done.


= Add Storage to Pacemaker =
Online: [ an-a04n01.alteeve.ca an-a04n02.alteeve.ca ]


== Configure Dual-Primary DRBD ==
Full list of resources:


Setup DRBD as a dual-primary resource.
fence_n01_ipmi (stonith:fence_ipmilan): Started an-a04n01.alteeve.ca
 
fence_n02_ipmi (stonith:fence_ipmilan): Started an-a04n02.alteeve.ca
{|class="wikitable"
Master/Slave Set: drbd_r0_Clone [drbd_r0]
!<span class="code">an-a03n01</span>
    Masters: [ an-a04n01.alteeve.ca an-a04n02.alteeve.ca ]
Master/Slave Set: drbd_r1_Clone [drbd_r1]
    Masters: [ an-a04n01.alteeve.ca an-a04n02.alteeve.ca ]
Clone Set: sharedFS-clone [sharedFS]
    Started: [ an-a04n01.alteeve.ca an-a04n02.alteeve.ca ]
</syntaxhighlight>
<syntaxhighlight lang="bash">
df -hP
</syntaxhighlight>
<syntaxhighlight lang="text">
Filesystem                        Size  Used Avail Use% Mounted on
/dev/sda3                          20G  1.5G  18G  8% /
tmpfs                              12G  67M  12G  1% /dev/shm
/dev/sda1                          504M  72M  407M  16% /boot
/dev/mapper/an--a04n01_vg0-shared  40G  259M  40G  1% /shared
</syntaxhighlight>
|-
!<span class="code">an-a04n02</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
pcs cluster cib drbd_cfg
df -h
pcs -f drbd_cfg resource create drbd_r0 ocf:linbit:drbd drbd_resource=r0 op monitor interval=60s
pcs -f drbd_cfg resource master drbd_r0_Clone drbd_r0 master-max=2 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
pcs cluster cib-push drbd_cfg
</syntaxhighlight>
</syntaxhighlight>
<syntaxhighlight lang="text">
<syntaxhighlight lang="text">
CIB updated
Filesystem                        Size  Used Avail Use% Mounted on
/dev/sda3                          20G  1.5G  18G  8% /
tmpfs                              12G  52M  12G  1% /dev/shm
/dev/sda1                          504M  72M  407M  16% /boot
/dev/mapper/an--a04n01_vg0-shared  40G  259M  40G  1% /shared
</syntaxhighlight>
</syntaxhighlight>
|}
|}


Give it a couple minutes to promote both nodes to <span class="code">Master</span> on both nodes. Initially, it will appear as <span class="code">Master</span> on one node only.
== Configure /shared to start after LVM ==


Once updated, you should see this:
As we did before in making sure LVM started after DRBD, this time we will make sure LVM starts before /shared is mounted.


{|class="wikitable"
{|class="wikitable"
!<span class="code">an-a03n01</span>
!<span class="code">an-a04n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
pcs status
pcs cluster cib cst_cfg
</syntaxhighlight>
pcs -f cst_cfg constraint order start lvm_n01_vg0_Clone then start sharedFS
<syntaxhighlight lang="text">
Cluster name: an-cluster-03
Last updated: Sun Jan 26 20:26:33 2014
Last change: Sun Jan 26 20:23:23 2014 via cibadmin on an-a03n01.alteeve.ca
Stack: corosync
Current DC: an-a03n02.alteeve.ca (2) - partition with quorum
Version: 1.1.10-19.el7-368c726
2 Nodes configured
4 Resources configured
 
 
Online: [ an-a03n01.alteeve.ca an-a03n02.alteeve.ca ]
 
Full list of resources:
 
fence_n01_virsh (stonith:fence_virsh): Started an-a03n01.alteeve.ca
fence_n02_virsh (stonith:fence_virsh): Started an-a03n02.alteeve.ca
Master/Slave Set: drbd_r0_Clone [drbd_r0]
    Masters: [ an-a03n01.alteeve.ca an-a03n02.alteeve.ca ]
 
PCSD Status:
an-a03n01.alteeve.ca:
  an-a03n01.alteeve.ca: Online
an-a03n02.alteeve.ca:
  an-a03n02.alteeve.ca: Online
 
Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled
</syntaxhighlight>
|-
!<span class="code">an-a03n02</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
pcs status
</syntaxhighlight>
<syntaxhighlight lang="text">
Cluster name: an-cluster-03
Last updated: Sun Jan 26 20:26:58 2014
Last change: Sun Jan 26 20:23:23 2014 via cibadmin on an-a03n01.alteeve.ca
Stack: corosync
Current DC: an-a03n02.alteeve.ca (2) - partition with quorum
Version: 1.1.10-19.el7-368c726
2 Nodes configured
4 Resources configured
 
 
Online: [ an-a03n01.alteeve.ca an-a03n02.alteeve.ca ]
 
Full list of resources:
 
fence_n01_virsh (stonith:fence_virsh): Started an-a03n01.alteeve.ca
fence_n02_virsh (stonith:fence_virsh): Started an-a03n02.alteeve.ca
Master/Slave Set: drbd_r0_Clone [drbd_r0]
    Masters: [ an-a03n01.alteeve.ca an-a03n02.alteeve.ca ]
 
PCSD Status:
an-a03n01.alteeve.ca:
  an-a03n01.alteeve.ca: Online
an-a03n02.alteeve.ca:
  an-a03n02.alteeve.ca: Online
 
Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled
</syntaxhighlight>
|}
 
== Configure DLM ==
 
{|class="wikitable"
!<span class="code">an-a03n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
pcs cluster cib dlm_cfg
pcs -f dlm_cfg resource create dlm ocf:pacemaker:controld op monitor interval=60s
pcs -f dlm_cfg resource clone dlm clone-max=2 clone-node-max=1
pcs cluster cib-push dlm_cfg
</syntaxhighlight>
<syntaxhighlight lang="text">
CIB updated
</syntaxhighlight>
|-
!<span class="code">an-a03n02</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
pcs status
</syntaxhighlight>
<syntaxhighlight lang="text">
Cluster name: an-cluster-03
Last updated: Sun Jan 26 20:34:36 2014
Last change: Sun Jan 26 20:33:31 2014 via cibadmin on an-a03n01.alteeve.ca
Stack: corosync
Current DC: an-a03n02.alteeve.ca (2) - partition with quorum
Version: 1.1.10-19.el7-368c726
2 Nodes configured
6 Resources configured
 
 
Online: [ an-a03n01.alteeve.ca an-a03n02.alteeve.ca ]
 
Full list of resources:
 
fence_n01_virsh (stonith:fence_virsh): Started an-a03n01.alteeve.ca
fence_n02_virsh (stonith:fence_virsh): Started an-a03n02.alteeve.ca
Master/Slave Set: drbd_r0_Clone [drbd_r0]
    Masters: [ an-a03n01.alteeve.ca an-a03n02.alteeve.ca ]
Clone Set: dlm-clone [dlm]
    Started: [ an-a03n01.alteeve.ca an-a03n02.alteeve.ca ]
 
PCSD Status:
an-a03n01.alteeve.ca:
  an-a03n01.alteeve.ca: Online
an-a03n02.alteeve.ca:
  an-a03n02.alteeve.ca: Online
 
Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled
</syntaxhighlight>
|}
 
== Configure Cluster LVM ==
 
{|class="wikitable"
!<span class="code">an-a03n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
pcs cluster cib clvmd_cfg
pcs -f clvmd_cfg resource create clvmd lsb:clvmd params daemon_timeout=30s op monitor interval=60s
pcs -f clvmd_cfg resource clone clvmd clone-max=2 clone-node-max=1
pcs -f clvmd_cfg constraint colocation add dlm-clone clvmd-clone INFINITY
pcs -f clvmd_cfg constraint order start dlm then start clvmd-clone
pcs cluster cib-push clvmd_cfg</syntaxhighlight>
<syntaxhighlight lang="text">
CIB updated
</syntaxhighlight>
|-
!<span class="code">an-a03n02</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
pcs status
</syntaxhighlight>
<syntaxhighlight lang="text">
Cluster name: an-cluster-03
Last updated: Mon Jan 27 19:00:33 2014
Last change: Mon Jan 27 19:00:19 2014 via crm_resource on an-a03n01.alteeve.ca
Stack: corosync
Current DC: an-a03n01.alteeve.ca (1) - partition with quorum
Version: 1.1.10-19.el7-368c726
2 Nodes configured
8 Resources configured
 
 
Online: [ an-a03n01.alteeve.ca an-a03n02.alteeve.ca ]
 
Full list of resources:
 
fence_n01_virsh        (stonith:fence_virsh):  Started an-a03n01.alteeve.ca
fence_n02_virsh        (stonith:fence_virsh):  Started an-a03n02.alteeve.ca
Master/Slave Set: drbd_r0_Clone [drbd_r0]
    Masters: [ an-a03n01.alteeve.ca an-a03n02.alteeve.ca ]
Clone Set: dlm-clone [dlm]
    Started: [ an-a03n01.alteeve.ca an-a03n02.alteeve.ca ]
Clone Set: clvmd-clone [clvmd]
    Started: [ an-a03n01.alteeve.ca an-a03n02.alteeve.ca ]
 
PCSD Status:
an-a03n01.alteeve.ca:
  an-a03n01.alteeve.ca: Online
an-a03n02.alteeve.ca:
  an-a03n02.alteeve.ca: Online
 
Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled
</syntaxhighlight>
|}
 
== Configure the /shared GFS2 Partition ==
 
{|class="wikitable"
!<span class="code">an-a03n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
pcs cluster cib fs_cfg
pcs -f fs_cfg resource create sharedFS Filesystem device="/dev/an-a03n01_vg0/shared" directory="/shared" fstype="gfs2"
pcs -f fs_cfg resource clone sharedFS
pcs cluster cib-push fs_cfg
</syntaxhighlight>
</syntaxhighlight>
<syntaxhighlight lang="text">
<syntaxhighlight lang="text">
CIB updated
Adding lvm_n01_vg0_Clone sharedFS (kind: Mandatory) (Options: first-action=start then-action=start)
</syntaxhighlight>
</syntaxhighlight>
<syntaxhighlight lang="bash">
<syntaxhighlight lang="bash">
df -h
</syntaxhighlight>
<syntaxhighlight lang="text">
Filesystem                        Size  Used Avail Use% Mounted on
/dev/vda3                          18G  5.6G  12G  32% /
devtmpfs                          932M    0  932M  0% /dev
tmpfs                              937M  61M  877M  7% /dev/shm
tmpfs                              937M  2.2M  935M  1% /run
tmpfs                              937M    0  937M  0% /sys/fs/cgroup
/dev/loop0                        4.4G  4.4G    0 100% /mnt/dvd
/dev/vda1                          484M  83M  401M  18% /boot
/dev/mapper/an--a03n01_vg0-shared  10G  259M  9.8G  3% /shared
</syntaxhighlight>
|-
!<span class="code">an-a03n02</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
df -h
</syntaxhighlight>
<syntaxhighlight lang="text">
Filesystem                        Size  Used Avail Use% Mounted on
/dev/vda3                          18G  5.6G  12G  32% /
devtmpfs                          932M    0  932M  0% /dev
tmpfs                              937M  76M  862M  9% /dev/shm
tmpfs                              937M  2.6M  935M  1% /run
tmpfs                              937M    0  937M  0% /sys/fs/cgroup
/dev/loop0                        4.4G  4.4G    0 100% /mnt/dvd
/dev/vda1                          484M  83M  401M  18% /boot
/dev/mapper/an--a03n01_vg0-shared  10G  259M  9.8G  3% /shared
</syntaxhighlight>
|}
== Configuring Constraints ==
{|class="wikitable"
!<span class="code">an-a03n01</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
pcs cluster cib cst_cfg
pcs -f cst_cfg constraint order start dlm then promote drbd_r0_Clone
pcs -f cst_cfg constraint order promote drbd_r0_Clone then start clvmd-clone
pcs -f cst_cfg constraint order promote clvmd-clone then start sharedFS-clone
pcs cluster cib-push cst_cfg
pcs cluster cib-push cst_cfg
</syntaxhighlight>
</syntaxhighlight>
</syntaxhighlight>
</syntaxhighlight>
<syntaxhighlight lang="bash">
<syntaxhighlight lang="bash">
pcs constraint show
pcs constraint show --full
</syntaxhighlight>
</syntaxhighlight>
<syntaxhighlight lang="text">
<syntaxhighlight lang="text">
Location Constraints:
Location Constraints:
Ordering Constraints:
Ordering Constraints:
  start dlm then promote drbd_r0_Clone
   promote drbd_r0_Clone then start lvm_n01_vg0_Clone (Mandatory) (id:order-drbd_r0_Clone-lvm_n01_vg0_Clone-mandatory)
   promote drbd_r0_Clone then start clvmd-clone
   start lvm_n01_vg0_Clone then start sharedFS-clone (Mandatory) (id:order-lvm_n01_vg0_Clone-sharedFS-clone-mandatory)
   start clvmd-clone then start sharedFS-clone
Colocation Constraints:
</syntaxhighlight>
|-
!<span class="code">an-a03n02</span>
|style="white-space: nowrap;"|<syntaxhighlight lang="bash">
pcs constraint show
</syntaxhighlight>
<syntaxhighlight lang="text">
Location Constraints:
Ordering Constraints:
  start dlm then promote drbd_r0_Clone
  promote drbd_r0_Clone then start clvmd-clone
  start clvmd-clone then start sharedFS-clone
Colocation Constraints:
Colocation Constraints:
</syntaxhighlight>
</syntaxhighlight>
|}
|}


= Odds and Sods =
Note that this time we added '--full'. If you ever need to delete a constraint, you would use 'pcs constraint delete <id>'.
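
For example, using one of the IDs shown in the output above (a sketch; only run this if you actually want to drop that constraint):

<syntaxhighlight lang="bash">
# Remove the DRBD -> LVM ordering constraint by its ID.
pcs constraint delete order-drbd_r0_Clone-lvm_n01_vg0_Clone-mandatory
</syntaxhighlight>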
 
This is a section for random notes. The stuff here will be integrated into the finished tutorial or removed.
 
== Determine multicast Address ==
 
Useful if you need to ensure that your switch has persistent multicast addresses set.
 
<syntaxhighlight lang="bash">
corosync-cmapctl | grep mcastaddr
</syntaxhighlight>
<syntaxhighlight lang="text">
totem.interface.0.mcastaddr (str) = 239.192.122.199
</syntaxhighlight>
 
 


= Notes =

Latest revision as of 12:23, 21 July 2014


Warning: This tutorial is incomplete, flawed and generally sucks at this time. Do not follow this and expect anything to work. In large part, it's a dumping ground for notes and little else. This warning will be removed when the tutorial is completed.

This is the third Anvil! tutorial built on Red Hat's Enterprise Linux 6.5 and newer. It is meant to be a stop-gap / learning cluster before RHEL 7 is released and stabilized.

Before We Begin

This tutorial does not require prior Anvil! experience (or any clustering experience), but it does expect a certain familiarity with Linux and a low-intermediate understanding of networking. Where possible, steps are explained in detail and rationale is provided for why certain decisions are made.

For those with Anvil! experience;

Please be careful not to skip too much. There are some major and some subtle changes from previous tutorials.

OS Setup

Warning: RHEL v6.5 or newer is required.

Post OS Install

Stuff.

If you're using RHEL proper, register your nodes with RHN.

Note: You need to replace $user and $pass with your RHN account details.
an-a04n01
rhnreg_ks --username "$user" --password "$pass" --force --profilename "an-a04n01.alteeve.ca"
rhn-channel --add --user "$user" --password "$pass" --channel=rhel-x86_64-server-rs-6
an-a04n02
rhnreg_ks --username "$user" --password "$pass" --force --profilename "an-a04n02.alteeve.ca"
rhn-channel --add --user "$user" --password "$pass" --channel=rhel-x86_64-server-rs-6

Adding AN! Repo

AN! offers a new repo with a few RPMs not in stock EL 6 distros.

an-a04n01 an-a04n02
cat <<-END>/etc/yum.repos.d/an.repo
[an-repo]
name=AN! Repo for Anvil! stuff
baseurl=https://alteeve.ca/repo/el6/
enabled=1
gpgcheck=0
protect=1
END
yum clean all
cat <<-END>/etc/yum.repos.d/an.repo
[an-repo]
name=AN! Repo for Anvil! stuff
baseurl=https://alteeve.ca/repo/el6/
enabled=1
gpgcheck=0
protect=1
END
yum clean all

Done.

Install

Not all of these are required, but most are used at one point or another in this tutorial.

Note: The fence-agents-virsh package is not available in RHEL 7 beta. Further, it's only needed if you're building your Anvil! using VMs.
an-a04n01 an-a04n02
yum -y update
yum -y install bridge-utils vim pacemaker corosync cman gfs2-utils \
               ccs pcs ipmitool OpenIPMI lvm2-cluster drbd84-utils \
               drbd84-kmod
chkconfig ipmi on
chkconfig acpid off
chkconfig kdump off
chkconfig drbd off
/etc/init.d/ipmi start
/etc/init.d/acpid stop
/etc/init.d/kdump stop
/etc/init.d/drbd stop
# same as an-a04n01

Setup Networking

TODO: Explain this.

Remap all NICs to have purpose-based names.
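
How the remapping is done is not covered here; one common approach on EL6 (a sketch only, using the ifn-link1 MAC address from node 01's config below; substitute your own values) is to map each MAC address to its purpose-based name in /etc/udev/rules.d/70-persistent-net.rules and then reboot:

SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:1b:21:81:c3:34", ATTR{type}=="1", KERNEL=="eth*", NAME="ifn-link1"

One such line is needed per interface, and the NAME= value must match the DEVICE= value used in the matching ifcfg-* file.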

an-a04n01 an-a04n02
vim /etc/sysconfig/network-scripts/ifcfg-ifn-bridge1
# Internet-Facing Network - Bridge
DEVICE="ifn-bridge1"
TYPE="Bridge"
BOOTPROTO="none"
IPADDR="10.255.40.1"
NETMASK="255.255.0.0"
GATEWAY="10.255.255.254"
DNS1="8.8.8.8"
DNS2="8.8.4.4"
DEFROUTE="yes"
vim /etc/sysconfig/network-scripts/ifcfg-ifn-bridge1
# Internet-Facing Network - Bridge
DEVICE="ifn-bridge1"
TYPE="Bridge"
BOOTPROTO="none"
IPADDR="10.255.40.2"
NETMASK="255.255.0.0"
GATEWAY="10.255.255.254"
DNS1="8.8.8.8"
DNS2="8.8.4.4"
DEFROUTE="yes"
an-a04n01
vim /etc/sysconfig/network-scripts/ifcfg-ifn-bond1
# Internet-Facing Network - Bond
DEVICE="ifn-bond1"
BRIDGE="ifn-bridge1"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
BONDING_OPTS="mode=1 miimon=100 use_carrier=1 updelay=120000 downdelay=0 primary=ifn-link1"
an-a04n02
vim /etc/sysconfig/network-scripts/ifcfg-ifn-bond1
# Internet-Facing Network - Bond
DEVICE="ifn-bond1"
BRIDGE="ifn-bridge1"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
BONDING_OPTS="mode=1 miimon=100 use_carrier=1 updelay=120000 downdelay=0 primary=ifn-link1"
an-a04n01 an-a04n02
vim /etc/sysconfig/network-scripts/ifcfg-ifn-link1
# Internet-Facing Network - Link 1
HWADDR="00:1B:21:81:C3:34"
DEVICE="ifn-link1"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
MASTER="ifn-bond1"
SLAVE="yes"
vim /etc/sysconfig/network-scripts/ifcfg-ifn-link2
# Internet-Facing Network - Link 2
HWADDR="A0:36:9F:02:E0:05"
DEVICE="ifn-link2"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
MASTER="ifn-bond1"
SLAVE="yes"
vim /etc/sysconfig/network-scripts/ifcfg-ifn-link1
# Internet-Facing Network - Link 1
HWADDR="00:1B:21:81:C2:EA"
DEVICE="ifn-link1"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
MASTER="ifn-bond1"
SLAVE="yes"
vim /etc/sysconfig/network-scripts/ifcfg-ifn-link2
# Internet-Facing Network - Link 2
HWADDR="A0:36:9F:07:D6:2F"
DEVICE="ifn-link2"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
MASTER="ifn-bond1"
SLAVE="yes"
an-a04n01
vim /etc/sysconfig/network-scripts/ifcfg-sn-bond1
# Storage Network - Bond
DEVICE="sn-bond1"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
BONDING_OPTS="mode=1 miimon=100 use_carrier=1 updelay=120000 downdelay=0 primary=sn-link1"
IPADDR="10.10.40.1"
NETMASK="255.255.0.0"
an-a04n02
vim /etc/sysconfig/network-scripts/ifcfg-sn-bond1
# Storage Network - Bond
DEVICE="sn-bond1"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
BONDING_OPTS="mode=1 miimon=100 use_carrier=1 updelay=120000 downdelay=0 primary=sn-link1"
IPADDR="10.10.40.2"
NETMASK="255.255.0.0"
an-a04n01 an-a04n02
vim /etc/sysconfig/network-scripts/ifcfg-sn-link1
# Storage Network - Link 1
HWADDR="00:19:99:9C:9B:9F"
DEVICE="sn-link1"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
MASTER="sn-bond1"
SLAVE="yes"
vim /etc/sysconfig/network-scripts/ifcfg-sn-link2
# Storage Network - Link 2
HWADDR="A0:36:9F:02:E0:04"
DEVICE="sn-link2"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
MASTER="sn-bond1"
SLAVE="yes"
vim /etc/sysconfig/network-scripts/ifcfg-sn-link1
# Storage Network - Link 1
HWADDR="00:19:99:9C:A0:6D"
DEVICE="sn-link1"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
MASTER="sn-bond1"
SLAVE="yes"
vim /etc/sysconfig/network-scripts/ifcfg-sn-link2
# Storage Network - Link 2
HWADDR="A0:36:9F:07:D6:2E"
DEVICE="sn-link2"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
MASTER="sn-bond1"
SLAVE="yes"
an-a04n01
vim /etc/sysconfig/network-scripts/ifcfg-bcn-bond1
# Back-Channel Network - Bond
DEVICE="bcn-bond1"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
BONDING_OPTS="mode=1 miimon=100 use_carrier=1 updelay=120000 downdelay=0 primary=bcn-link1"
IPADDR="10.20.40.1"
NETMASK="255.255.0.0"
an-a04n02
vim /etc/sysconfig/network-scripts/ifcfg-bcn-bond1
# Back-Channel Network - Bond
DEVICE="bcn-bond1"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
BONDING_OPTS="mode=1 miimon=100 use_carrier=1 updelay=120000 downdelay=0 primary=bcn-link1"
IPADDR="10.20.40.2"
NETMASK="255.255.0.0"
an-a04n01 an-a04n02
vim /etc/sysconfig/network-scripts/ifcfg-bcn-link1
# Back-Channel Network - Link 1
HWADDR="00:19:99:9C:9B:9E"
DEVICE="bcn-link1"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
MASTER="bcn-bond1"
SLAVE="yes"
vim /etc/sysconfig/network-scripts/ifcfg-bcn-link2
# Back-Channel Network - Link 2
HWADDR="00:1B:21:81:C3:35"
DEVICE="bcn-link2"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
MASTER="bcn-bond1"
SLAVE="yes"
vim /etc/sysconfig/network-scripts/ifcfg-bcn-link1
# Back-Channel Network - Link 1
HWADDR="00:19:99:9C:A0:6C"
DEVICE="bcn-link1"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
MASTER="bcn-bond1"
SLAVE="yes"
vim /etc/sysconfig/network-scripts/ifcfg-bcn-link2
# Back-Channel Network - Link 2
HWADDR="00:1B:21:81:C2:EB"
DEVICE="bcn-link2"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
MASTER="bcn-bond1"
SLAVE="yes"

Making ssh faster when the net is down

By default, the nodes will try to resolve the host name of an incoming ssh connection. When the internet connection is down, DNS lookups have to time out, which can make login times quite slow. When something goes wrong, seconds count and waiting for up to a minute for an SSH password prompt can be maddening.

For this reason, we will make two changes to /etc/ssh/sshd_config that disable this login delay.

Please be aware that this can reduce security. If this is a concern, skip this step.


an-a04n01
sed -i.anvil 's/#GSSAPIAuthentication no/GSSAPIAuthentication no/' /etc/ssh/sshd_config
sed -i 's/GSSAPIAuthentication yes/#GSSAPIAuthentication yes/' /etc/ssh/sshd_config
sed -i 's/#UseDNS yes/UseDNS no/' /etc/ssh/sshd_config
/etc/init.d/sshd restart
diff -u /etc/ssh/sshd_config.anvil /etc/ssh/sshd_config
--- /etc/ssh/sshd_config.anvil	2013-09-30 03:08:17.000000000 -0400
+++ /etc/ssh/sshd_config	2014-05-28 00:35:30.954000741 -0400
@@ -77,8 +77,8 @@
 #KerberosUseKuserok yes
 
 # GSSAPI options
-#GSSAPIAuthentication no
-GSSAPIAuthentication yes
+GSSAPIAuthentication no
+#GSSAPIAuthentication yes
 #GSSAPICleanupCredentials yes
 GSSAPICleanupCredentials yes
 #GSSAPIStrictAcceptorCheck yes
@@ -119,7 +119,7 @@
 #ClientAliveInterval 0
 #ClientAliveCountMax 3
 #ShowPatchLevel no
-#UseDNS yes
+UseDNS no
 #PidFile /var/run/sshd.pid
 #MaxStartups 10:30:100
 #PermitTunnel no
an-a04n02
sed -i.anvil 's/#GSSAPIAuthentication no/GSSAPIAuthentication no/' /etc/ssh/sshd_config
sed -i 's/GSSAPIAuthentication yes/#GSSAPIAuthentication yes/' /etc/ssh/sshd_config
sed -i 's/#UseDNS yes/UseDNS no/' /etc/ssh/sshd_config
/etc/init.d/sshd restart
diff -u /etc/ssh/sshd_config.anvil /etc/ssh/sshd_config
--- /etc/ssh/sshd_config.anvil	2013-09-30 03:08:17.000000000 -0400
+++ /etc/ssh/sshd_config	2014-05-28 00:35:33.016999110 -0400
@@ -77,8 +77,8 @@
 #KerberosUseKuserok yes
 
 # GSSAPI options
-#GSSAPIAuthentication no
-GSSAPIAuthentication yes
+GSSAPIAuthentication no
+#GSSAPIAuthentication yes
 #GSSAPICleanupCredentials yes
 GSSAPICleanupCredentials yes
 #GSSAPIStrictAcceptorCheck yes
@@ -119,7 +119,7 @@
 #ClientAliveInterval 0
 #ClientAliveCountMax 3
 #ShowPatchLevel no
-#UseDNS yes
+UseDNS no
 #PidFile /var/run/sshd.pid
 #MaxStartups 10:30:100
 #PermitTunnel no

Subsequent logins when the net is down should be quick.

Setting the Hostname

TODO

Setup The hosts File

You can use DNS if you prefer. For now, let's use /etc/hosts for node name resolution.

an-a04n01
vim /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

### Anvil! systems
# Anvil! 04, Node 01
10.20.40.1	an-a04n01.bcn an-a04n01 an-a04n01.alteeve.ca
10.20.41.1	an-a04n01.ipmi
10.10.40.1	an-a04n01.sn
10.255.40.1	an-a04n01.ifn

# Anvil! 04, Node 02
10.20.40.2	an-a04n02.bcn an-a04n02 an-a04n02.alteeve.ca
10.20.41.2	an-a04n02.ipmi
10.10.40.2	an-a04n02.sn
10.255.40.2	an-a04n02.ifn

### Foundation Pack
# Network Switches
10.20.1.1	an-s01 an-s01.alteeve.ca
10.20.1.2	an-s02 an-s02.alteeve.ca	# Only accessible when out of the stack
 
# Switched PDUs
10.20.2.1	an-p01 an-p01.alteeve.ca
10.20.2.2	an-p02 an-p02.alteeve.ca
 
# Network-monitored UPSes
10.20.3.1	an-u01 an-u01.alteeve.ca
10.20.3.2	an-u02 an-u02.alteeve.ca
 
### Monitor Packs
10.20.4.1	an-m01 an-m01.alteeve.ca
10.255.4.1	an-m01.ifn
10.20.4.2	an-m02 an-m02.alteeve.ca
10.255.4.2	an-m02.ifn
an-a04n02
vim /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

### Anvil! systems
# Anvil! 04, Node 01
10.20.40.1	an-a04n01.bcn an-a04n01 an-a04n01.alteeve.ca
10.20.41.1	an-a04n01.ipmi
10.10.40.1	an-a04n01.sn
10.255.40.1	an-a04n01.ifn

# Anvil! 04, Node 02
10.20.40.2	an-a04n02.bcn an-a04n02 an-a04n02.alteeve.ca
10.20.41.2	an-a04n02.ipmi
10.10.40.2	an-a04n02.sn
10.255.40.2	an-a04n02.ifn

### Foundation Pack
# Network Switches
10.20.1.1	an-s01 an-s01.alteeve.ca
10.20.1.2	an-s02 an-s02.alteeve.ca	# Only accessible when out of the stack
 
# Switched PDUs
10.20.2.1	an-p01 an-p01.alteeve.ca
10.20.2.2	an-p02 an-p02.alteeve.ca
 
# Network-monitored UPSes
10.20.3.1	an-u01 an-u01.alteeve.ca
10.20.3.2	an-u02 an-u02.alteeve.ca
 
### Monitor Packs
10.20.4.1	an-m01 an-m01.alteeve.ca
10.255.4.1	an-m01.ifn
10.20.4.2	an-m02 an-m02.alteeve.ca
10.255.4.2	an-m02.ifn

Setup SSH

Same as before.

Populating And Pushing ~/.ssh/known_hosts

an-a04n01
ssh-keygen -t rsa -N "" -b 8191 -f ~/.ssh/id_rsa
Generating public/private rsa key pair.
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
f9:41:7e:aa:96:8e:fa:47:79:f5:3a:33:89:c3:9a:4b root@an-a04n01.alteeve.ca
The key's randomart image is:
+--[ RSA 8191]----+
|                 |
|                 |
|          .      |
|         +  .    |
|        S.o...   |
|        o..+  .  |
|       .E+o. o   |
|       o+o+ *    |
|    .oo+*o . +   |
+-----------------+
an-a04n02
ssh-keygen -t rsa -N "" -b 8191 -f ~/.ssh/id_rsa
Generating public/private rsa key pair.
Created directory '/root/.ssh'.
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
3f:1a:02:17:44:10:5e:6f:2b:98:44:09:e5:e0:ea:4b root@an-a04n02.alteeve.ca
The key's randomart image is:
+--[ RSA 8191]----+
|  oo==+          |
| . =.o .         |
|  . + . o        |
| . . o o .       |
|.   + o S        |
|.    o . .       |
| E    . . o      |
|. .    . o .     |
| .      .        |
+-----------------+

Set up authorized_keys:

an-a04n01
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
ssh root@an-a04n02 "cat /root/.ssh/id_rsa.pub" >> ~/.ssh/authorized_keys
The authenticity of host 'an-a04n02 (10.20.40.2)' can't be established.
RSA key fingerprint is 22:09:7b:0c:8b:d8:80:08:80:6d:0e:bc:fb:5a:e1:de.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'an-a04n02,10.20.40.2' (RSA) to the list of known hosts.
root@an-a04n02's password:

Populate ~/.ssh/known_hosts:

an-a04n01
ssh-keyscan an-a04n01.alteeve.ca >> ~/.ssh/known_hosts
# an-a04n01.alteeve.ca SSH-2.0-OpenSSH_5.3
ssh-keyscan an-a04n01 >> ~/.ssh/known_hosts
# an-a04n01 SSH-2.0-OpenSSH_5.3
ssh-keyscan an-a04n01.bcn >> ~/.ssh/known_hosts
# an-a04n01.bcn SSH-2.0-OpenSSH_5.3
ssh-keyscan an-a04n01.sn >> ~/.ssh/known_hosts
# an-a04n01.sn SSH-2.0-OpenSSH_5.3
ssh-keyscan an-a04n01.ifn >> ~/.ssh/known_hosts
# an-a04n01.ifn SSH-2.0-OpenSSH_5.3
ssh-keyscan an-a04n02.alteeve.ca >> ~/.ssh/known_hosts
# an-a04n02.alteeve.ca SSH-2.0-OpenSSH_5.3
ssh-keyscan an-a04n02 >> ~/.ssh/known_hosts
# an-a04n02 SSH-2.0-OpenSSH_5.3
ssh-keyscan an-a04n02.bcn >> ~/.ssh/known_hosts
# an-a04n02.bcn SSH-2.0-OpenSSH_5.3
ssh-keyscan an-a04n02.sn >> ~/.ssh/known_hosts
# an-a04n02.sn SSH-2.0-OpenSSH_5.3
ssh-keyscan an-a04n02.ifn >> ~/.ssh/known_hosts
# an-a04n02.ifn SSH-2.0-OpenSSH_5.3
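
If you prefer, the ten ssh-keyscan calls above can be collapsed into a single loop that does the same thing:

for host in an-a04n01 an-a04n01.alteeve.ca an-a04n01.bcn an-a04n01.sn an-a04n01.ifn an-a04n02 an-a04n02.alteeve.ca an-a04n02.bcn an-a04n02.sn an-a04n02.ifn
do
    ssh-keyscan $host >> ~/.ssh/known_hosts
done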

Now copy the files to the second node:

an-a04n01
rsync -av ~/.ssh/authorized_keys root@an-a04n02:/root/.ssh/
root@an-a04n02's password:
sending incremental file list
authorized_keys

sent 2937 bytes  received 31 bytes  1187.20 bytes/sec
total size is 2854  speedup is 0.96
rsync -av ~/.ssh/known_hosts root@an-a04n02:/root/.ssh/
sending incremental file list
known_hosts

sent 4829 bytes  received 31 bytes  9720.00 bytes/sec
total size is 4750  speedup is 0.98

Note that there was no password prompt the second time. Hoozah!

Configuring the Firewall

an-a04n01
# cman (corosync's totem)
iptables -I INPUT -m state --state NEW -m multiport -p udp -s 10.20.0.0/16 -d 10.20.0.0/16 --dports 5404,5405 -j ACCEPT
iptables -I INPUT -m addrtype --dst-type MULTICAST -m state --state NEW -m multiport -p udp -s 10.20.0.0/16 --dports 5404,5405 -j ACCEPT

# dlm
iptables -I INPUT -m state --state NEW -p tcp -s 10.20.0.0/16 -d 10.20.0.0/16 --dport 21064 -j ACCEPT 

# DRBD resource 0 and 1 - on the SN
iptables -I INPUT -m state --state NEW -p tcp -s 10.10.0.0/16 -d 10.10.0.0/16 --dport 7788 -j ACCEPT
iptables -I INPUT -m state --state NEW -p tcp -s 10.10.0.0/16 -d 10.10.0.0/16 --dport 7789 -j ACCEPT

# Make the new rules persistent.
/etc/init.d/iptables save
iptables: Saving firewall rules to /etc/sysconfig/iptables:[  OK  ]
an-a04n02
# cman (corosync's totem)
iptables -I INPUT -m state --state NEW -m multiport -p udp -s 10.20.0.0/16 -d 10.20.0.0/16 --dports 5404,5405 -j ACCEPT
iptables -I INPUT -m addrtype --dst-type MULTICAST -m state --state NEW -m multiport -p udp -s 10.20.0.0/16 --dports 5404,5405 -j ACCEPT

# dlm
iptables -I INPUT -m state --state NEW -p tcp -s 10.20.0.0/16 -d 10.20.0.0/16 --dport 21064 -j ACCEPT 

# DRBD resource 0 and 1 - on the SN
iptables -I INPUT -m state --state NEW -p tcp -s 10.10.0.0/16 -d 10.10.0.0/16 --dport 7788 -j ACCEPT
iptables -I INPUT -m state --state NEW -p tcp -s 10.10.0.0/16 -d 10.10.0.0/16 --dport 7789 -j ACCEPT

# Make the new rules persistent.
/etc/init.d/iptables save
iptables: Saving firewall rules to /etc/sysconfig/iptables:[  OK  ]

Keeping Time in Sync

It's not as critical as it used to be to keep the clocks on the nodes in sync, but it's still a good idea.

an-a04n01
chkconfig ntpd on
/etc/init.d/ntpd start
Starting ntpd:                                             [  OK  ]
an-a04n02
chkconfig ntpd on
/etc/init.d/ntpd start
Starting ntpd:                                             [  OK  ]
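
Once ntpd has been running for a couple of minutes, you can confirm that it is talking to its time sources with:

ntpq -p

Each configured server should show a non-zero 'reach' value once synchronization starts.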

Configuring the Anvil!

Now we're getting down to business!

For this section, we will be working on an-a04n01 and using ssh to perform tasks on an-a04n02.

Note: TODO: explain what this is and how it works.

Configuring cman

With RHEL 6, we do not need to configure corosync directly. We will create a "skeleton" cluster.conf file which will, in turn, handle corosync for us. Once it is configured and the configuration has been copied to the peer, we will start pacemaker and its init script will handle starting (and stopping) cman and corosync for us.

We will use 'ccs' to configure the skeleton cluster.conf file.

an-a04n01
ccs -f /etc/cluster/cluster.conf --createcluster an-anvil-04
ccs -f /etc/cluster/cluster.conf --setcman two_node="1" expected_votes="1"
ccs -f /etc/cluster/cluster.conf --addnode an-a04n01.alteeve.ca
ccs -f /etc/cluster/cluster.conf --addnode an-a04n02.alteeve.ca
ccs -f /etc/cluster/cluster.conf --addfencedev pcmk agent=fence_pcmk 
ccs -f /etc/cluster/cluster.conf --addmethod pcmk-redirect an-a04n01.alteeve.ca
ccs -f /etc/cluster/cluster.conf --addmethod pcmk-redirect an-a04n02.alteeve.ca
ccs -f /etc/cluster/cluster.conf --addfenceinst pcmk an-a04n01.alteeve.ca pcmk-redirect port=an-a04n01.alteeve.ca
ccs -f /etc/cluster/cluster.conf --addfenceinst pcmk an-a04n02.alteeve.ca pcmk-redirect port=an-a04n02.alteeve.ca
ccs -f /etc/cluster/cluster.conf --setfencedaemon post_join_delay="30"
cat /etc/cluster/cluster.conf
<cluster config_version="10" name="an-anvil-04">
  <fence_daemon post_join_delay="30"/>
  <clusternodes>
    <clusternode name="an-a04n01.alteeve.ca" nodeid="1">
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="an-a04n01.alteeve.ca"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="an-a04n02.alteeve.ca" nodeid="2">
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="an-a04n02.alteeve.ca"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <cman expected_votes="1" two_node="1"/>
  <fencedevices>
    <fencedevice agent="fence_pcmk" name="pcmk"/>
  </fencedevices>
  <rm>
    <failoverdomains/>
    <resources/>
  </rm>
</cluster>

Copy it to an-a04n02;

an-a04n01
rsync -av /etc/cluster/cluster.conf root@an-a04n02:/etc/cluster/
sending incremental file list
cluster.conf

sent 838 bytes  received 31 bytes  579.33 bytes/sec
total size is 758  speedup is 0.87
an-a04n02
cat /etc/cluster/cluster.conf
<cluster config_version="10" name="an-anvil-04">
  <fence_daemon post_join_delay="30"/>
  <clusternodes>
    <clusternode name="an-a04n01.alteeve.ca" nodeid="1">
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="an-a04n01.alteeve.ca"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="an-a04n02.alteeve.ca" nodeid="2">
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="an-a04n02.alteeve.ca"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <cman expected_votes="1" two_node="1"/>
  <fencedevices>
    <fencedevice agent="fence_pcmk" name="pcmk"/>
  </fencedevices>
  <rm>
    <failoverdomains/>
    <resources/>
  </rm>
</cluster>

Starting Pacemaker

Now start pacemaker proper.

an-a04n01
/etc/init.d/pacemaker start
Starting cluster: 
   Checking if cluster has been disabled at boot...        [  OK  ]
   Checking Network Manager...                             [  OK  ]
   Global setup...                                         [  OK  ]
   Loading kernel modules...                               [  OK  ]
   Mounting configfs...                                    [  OK  ]
   Starting cman...                                        [  OK  ]
   Waiting for quorum...                                   [  OK  ]
   Starting fenced...                                      [  OK  ]
   Starting dlm_controld...                                [  OK  ]
   Tuning DLM kernel config...                             [  OK  ]
   Starting gfs_controld...                                [  OK  ]
   Unfencing self...                                       [  OK  ]
   Joining fence domain...                                 [  OK  ]
Starting Pacemaker Cluster Manager                         [  OK  ]
an-a04n02
/etc/init.d/pacemaker start
Starting cluster: 
   Checking if cluster has been disabled at boot...        [  OK  ]
   Checking Network Manager...                             [  OK  ]
   Global setup...                                         [  OK  ]
   Loading kernel modules...                               [  OK  ]
   Mounting configfs...                                    [  OK  ]
   Starting cman...                                        [  OK  ]
   Waiting for quorum...                                   [  OK  ]
   Starting fenced...                                      [  OK  ]
   Starting dlm_controld...                                [  OK  ]
   Tuning DLM kernel config...                             [  OK  ]
   Starting gfs_controld...                                [  OK  ]
   Unfencing self...                                       [  OK  ]
   Joining fence domain...                                 [  OK  ]
Starting Pacemaker Cluster Manager                         [  OK  ]

Verify pacemaker proper started as expected.

an-a04n01
pcs status
Cluster name: an-anvil-04
WARNING: no stonith devices and stonith-enabled is not false
Last updated: Wed May 28 20:59:33 2014
Last change: Wed May 28 20:59:18 2014 via crmd on an-a04n01.alteeve.ca
Stack: cman
Current DC: an-a04n01.alteeve.ca - partition with quorum
Version: 1.1.10-14.el6_5.3-368c726
2 Nodes configured
0 Resources configured


Online: [ an-a04n01.alteeve.ca an-a04n02.alteeve.ca ]

Full list of resources:
an-a04n02
pcs status
WARNING: no stonith devices and stonith-enabled is not false
Last updated: Wed May 28 20:59:29 2014
Last change: Wed May 28 20:59:18 2014 via crmd on an-a04n01.alteeve.ca
Stack: cman
Current DC: an-a04n01.alteeve.ca - partition with quorum
Version: 1.1.10-14.el6_5.3-368c726
2 Nodes configured
0 Resources configured


Online: [ an-a04n01.alteeve.ca an-a04n02.alteeve.ca ]

Full list of resources:

Note the warning about stonith. We will address that momentarily.

Configure and test stonith (aka fencing)

We will use IPMI and PDU based fence devices with STONITH levels.

You can see the list of available fence agents with 'pcs stonith list', shown below. You will need to find the ones that match your hardware fence devices.

Note: Ignore the 'no metadata' errors in the output below.

an-a04n01
pcs stonith list
fence_apc - Fence agent for APC over telnet/ssh
fence_apc_snmp - Fence agent for APC over SNMP
fence_bladecenter - Fence agent for IBM BladeCenter
fence_bladecenter_snmp - Fence agent for IBM BladeCenter over SNMP
fence_brocade - Fence agent for Brocade over telnet
Error: no metadata for /usr/sbin/fence_check
fence_cisco_mds - Fence agent for Cisco MDS
fence_cisco_ucs - Fence agent for Cisco UCS
fence_drac - fencing agent for Dell Remote Access Card
fence_drac5 - Fence agent for Dell DRAC CMC/5
fence_eaton_snmp - Fence agent for Eaton over SNMP
fence_egenera - I/O Fencing agent for the Egenera BladeFrame
fence_eps - Fence agent for ePowerSwitch
fence_hpblade - Fence agent for HP BladeSystem
fence_ibmblade - Fence agent for IBM BladeCenter over SNMP
fence_idrac - Fence agent for IPMI over LAN
fence_ifmib - Fence agent for IF MIB
fence_ilo - Fence agent for HP iLO
fence_ilo2 - Fence agent for HP iLO
fence_ilo3 - Fence agent for IPMI over LAN
fence_ilo4 - Fence agent for IPMI over LAN
fence_ilo_mp - Fence agent for HP iLO MP
fence_imm - Fence agent for IPMI over LAN
fence_intelmodular - Fence agent for Intel Modular
fence_ipdu - Fence agent for iPDU over SNMP
fence_ipmilan - Fence agent for IPMI over LAN
fence_kdump - Fence agent for use with kdump
Error: no metadata for /usr/sbin/fence_node
fence_rhevm - Fence agent for RHEV-M REST API
fence_rsa - Fence agent for IBM RSA
fence_rsb - I/O Fencing agent for Fujitsu-Siemens RSB
fence_sanbox2 - Fence agent for QLogic SANBox2 FC switches
fence_scsi - fence agent for SCSI-3 persistent reservations
Error: no metadata for /usr/sbin/fence_tool
fence_virsh - Fence agent for virsh
fence_virt - Fence agent for virtual machines
fence_vmware - Fence agent for VMWare
fence_vmware_soap - Fence agent for VMWare over SOAP API
fence_wti - Fence agent for WTI
fence_xvm - Fence agent for virtual machines

We will use fence_ipmilan and fence_apc_snmp.

Configuring IPMI Fencing

Set up our IPMI BMCs (on LAN channel 2 and using user ID 2).

an-a04n01
ipmitool lan set 2 ipsrc static
ipmitool lan set 2 ipaddr 10.20.41.1
ipmitool lan set 2 netmask 255.255.0.0
ipmitool lan set 2 defgw ipaddr 10.20.255.254
ipmitool user set password 2 Initial1
an-a04n02
ipmitool lan set 2 ipsrc static
ipmitool lan set 2 ipaddr 10.20.41.2
ipmitool lan set 2 netmask 255.255.0.0
ipmitool lan set 2 defgw ipaddr 10.20.255.254
ipmitool user set password 2 Initial1
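
Before testing fencing, you can sanity-check the BMC's LAN settings by printing the channel configuration we just set:

an-a04n01
ipmitool lan print 2
an-a04n02
ipmitool lan print 2

The 'IP Address', 'Subnet Mask' and 'Default Gateway IP' fields should match the values above.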

Test the new settings (using the hostnames we set in /etc/hosts):

an-a04n01
fence_ipmilan -a an-a04n02.ipmi -l admin -p Initial1 -o status
Getting status of IPMI:an-a04n02.ipmi...Chassis power = On
Done
an-a04n02
fence_ipmilan -a an-a04n01.ipmi -l admin -p Initial1 -o status
Getting status of IPMI:an-a04n01.ipmi...Chassis power = On
Done

Good, now we can configure IPMI fencing.

Every fence agent has a possibly unique subset of options that can be used. You can see a brief description of these options with the pcs stonith describe fence_X command. Let's look at the options available for fence_ipmilan.

an-a04n01
pcs stonith describe fence_ipmilan
Stonith options for: fence_ipmilan
  auth: IPMI Lan Auth type (md5, password, or none)
  ipaddr: IPMI Lan IP to talk to
  passwd: Password (if required) to control power on IPMI device
  passwd_script: Script to retrieve password (if required)
  lanplus: Use Lanplus to improve security of connection
  login: Username/Login (if required) to control power on IPMI device
  action: Operation to perform. Valid operations: on, off, reboot, status, list, diag, monitor or metadata
  timeout: Timeout (sec) for IPMI operation
  cipher: Ciphersuite to use (same as ipmitool -C parameter)
  method: Method to fence (onoff or cycle)
  power_wait: Wait X seconds after on/off operation
  delay: Wait X seconds before fencing is started
  privlvl: Privilege level on IPMI device
  verbose: Verbose mode
  stonith-timeout: How long to wait for the STONITH action to complete per a stonith device.
  priority: The priority of the stonith resource. Devices are tried in order of highest priority to lowest.
  pcmk_host_map: A mapping of host names to ports numbers for devices that do not support host names.
  pcmk_host_list: A list of machines controlled by this device (Optional unless pcmk_host_check=static-list).
  pcmk_host_check: How to determin which machines are controlled by the device.

One of the nice things about pcs is that it allows us to create a test file in which to prepare all our changes. Then, when we're happy with the changes, we merge them into the running cluster. So let's make a copy called stonith_cfg.

Now add IPMI fencing.

an-a04n01
pcs cluster cib stonith_cfg
#   work in our temp file         unique name    fence agent   target node                           device addr             options
pcs -f stonith_cfg stonith create fence_n01_ipmi fence_ipmilan pcmk_host_list="an-a04n01.alteeve.ca" ipaddr="an-a04n01.ipmi" action="reboot" login="admin" passwd="Initial1" delay=15 op monitor interval=10s
pcs -f stonith_cfg stonith create fence_n02_ipmi fence_ipmilan pcmk_host_list="an-a04n02.alteeve.ca" ipaddr="an-a04n02.ipmi" action="reboot" login="admin" passwd="Initial1" op monitor interval=10s
pcs cluster cib-push stonith_cfg

Note that fence_n01_ipmi has a delay=15 set but fence_n02_ipmi does not. If the network connection breaks between the two nodes, they will both try to fence each other at the same time. If acpid is running, the slower node will not die right away. It will continue to run for up to four more seconds, ample time for it to also initiate a fence against the faster node. The end result is that both nodes get fenced. The fifteen-second delay protects against this by causing an-a04n02 to pause for 15 seconds before initiating a fence against an-a04n01. If both nodes are alive, an-a04n02 will power off before the 15 seconds pass, so it will never fence an-a04n01. However, if an-a04n01 really is dead, fencing will proceed as normal once the 15 seconds have elapsed.
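
If you ever need to confirm which fence device carries the delay, reviewing the stonith resource's attributes is a quick check (a sketch; adjust if your pcs version uses a different syntax):

pcs stonith show fence_n01_ipmi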

NOTE: Get my PDUs back and use them here!

We can check the new configuration now;

an-a04n01
pcs status
Cluster name: an-anvil-04
Last updated: Wed May 28 22:01:14 2014
Last change: Wed May 28 21:55:59 2014 via cibadmin on an-a04n01.alteeve.ca
Stack: cman
Current DC: an-a04n01.alteeve.ca - partition with quorum
Version: 1.1.10-14.el6_5.3-368c726
2 Nodes configured
2 Resources configured


Online: [ an-a04n01.alteeve.ca an-a04n02.alteeve.ca ]

Full list of resources:

 fence_n01_ipmi	(stonith:fence_ipmilan):	Started an-a04n01.alteeve.ca 
 fence_n02_ipmi	(stonith:fence_ipmilan):	Started an-a04n02.alteeve.ca

Tell pacemaker to use fencing;

an-a04n01
pcs property set stonith-enabled=true
pcs property set no-quorum-policy=ignore
pcs property
Cluster Properties:
 cluster-infrastructure: cman
 dc-version: 1.1.10-14.el6_5.3-368c726
 no-quorum-policy: ignore
 stonith-enabled: true

Excellent!

Configuring Fence Levels

TODO...
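
Until this section is written, here is a rough sketch of how the levels will likely be laid out once the PDUs are back: level 1 tries IPMI first and level 2 falls back to cutting power at the PDUs. The fence_n0X_pdu* names are placeholders for PDU stonith resources that have not been created yet.

# Note: fence_n01_pdu1, fence_n01_pdu2, fence_n02_pdu1 and fence_n02_pdu2 are hypothetical; create them before adding these levels.
pcs stonith level add 1 an-a04n01.alteeve.ca fence_n01_ipmi
pcs stonith level add 2 an-a04n01.alteeve.ca fence_n01_pdu1,fence_n01_pdu2
pcs stonith level add 1 an-a04n02.alteeve.ca fence_n02_ipmi
pcs stonith level add 2 an-a04n02.alteeve.ca fence_n02_pdu1,fence_n02_pdu2
pcs stonith level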


Test Fencing

ToDo: Kill each node with echo c > /proc/sysrq-trigger and make sure the other node fences it.
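
A minimal way to run this test: watch the logs on the node that should survive, crash the other node, then confirm the victim is power-cycled and the survivor reports it offline. For example, to test fencing of node 01:

an-a04n02
tail -f /var/log/messages
an-a04n01
echo c > /proc/sysrq-trigger

Once an-a04n01 has been fenced and rebooted, restart pacemaker on it, check 'pcs status' on both nodes, then repeat the test in the other direction.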

Shared Storage

DRBD -> Clustered LVM -> GFS2

DRBD

We will use DRBD 8.4.

Partition Storage

How you do this will depend a lot on your storage (local disks, md software RAID, hardware RAID, one or multiple arrays, etc). It will also depend on how you plan to divvy up your servers; you need two partitions: one for servers that will run on node 1 and another for servers that will run on node 2. It also depends on how much space you want for the /shared partition.

In our case, we're using a single hardware RAID array, we'll set aside 40 GB of space for /shared and we're going to divide the remaining free space evenly.

an-a04n01
parted -a opt /dev/sda "print free"
Model: LSI RAID 5/6 SAS 6G (scsi)
Disk /dev/sda: 898GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos

Number  Start   End     Size    Type     File system     Flags
        32.3kB  1049kB  1016kB           Free Space
 1      1049kB  538MB   537MB   primary  ext4            boot
 2      538MB   4833MB  4295MB  primary  linux-swap(v1)
 3      4833MB  26.3GB  21.5GB  primary  ext4
        26.3GB  898GB   872GB            Free Space
an-a04n02
# same as an-a04n01

So 872 GB of free space, less 40 for /shared, leaves 832 GB for servers. Divided evenly in two, that gives us 416 GB per server pool. Our first partition will then be roughly 456 GB (416 GB plus the 40 GB for /shared) and the second will be roughly 416 GB.

The free space starts at 26.3 GB, so our first partition will start at 26.3 GB and end at 492 GB (rounding off the .3). The second partition will then start at 492 GB and end at 898 GB, the end of the disk. Both of these new partitions will be contained in an extended partition.

Note: After each change, we will get a warning saying "Warning: WARNING: the kernel failed to re-read the partition table on /dev/sda (Device or resource busy). As a result, it may not reflect all of your changes until after reboot.". We will reboot once we're done to address this.
an-a04n01
parted -a opt /dev/sda "mkpart extended 26.3GB 898GB"
parted -a opt /dev/sda "mkpart logical 26.3GB 492GB"
parted -a opt /dev/sda "mkpart logical 492GB 898GB"
parted -a opt /dev/sda "print free"
Model: LSI RAID 5/6 SAS 6G (scsi)
Disk /dev/sda: 898GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos

Number  Start   End     Size    Type      File system     Flags
        32.3kB  1049kB  1016kB            Free Space
 1      1049kB  538MB   537MB   primary   ext4            boot
 2      538MB   4833MB  4295MB  primary   linux-swap(v1)
 3      4833MB  26.3GB  21.5GB  primary   ext4
 4      26.3GB  898GB   872GB   extended                  lba
 5      26.3GB  492GB   466GB   logical
 6      492GB   898GB   406GB   logical
an-a04n02
# same as an-a04n01

Reboot

an-a04n01
reboot
an-a04n02
reboot

Configure DRBD

Configure global-common.conf;

an-a04n01
vim /etc/drbd.d/global_common.conf
# These are options to set for the DRBD daemon sets the default values for
# resources.
global {
	# This tells DRBD that you allow it to report this installation to 
	# LINBIT for statistical purposes. If you have privacy concerns, set
	# this to 'no'. The default is 'ask' which will prompt you each time
	# DRBD is updated. Set to 'yes' to allow it without being prompted.
	usage-count yes;
 
	# minor-count dialog-refresh disable-ip-verification
}
 
common {
	handlers {
		# pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
		# pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
		# local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
		# split-brain "/usr/lib/drbd/notify-split-brain.sh root";
		# out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root";
		# before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh -p 15 -- -c 16k";
		# after-resync-target /usr/lib/drbd/unsnapshot-resync-target-lvm.sh;
 
		# Hook into Pacemaker's fencing.
		fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
		before-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
	}
 
	startup {
		# wfc-timeout degr-wfc-timeout outdated-wfc-timeout wait-after-sb
	}
 
	options {
		# cpu-mask on-no-data-accessible
	}
 
	disk {
		# size max-bio-bvecs on-io-error fencing disk-barrier disk-flushes
		# disk-drain md-flushes resync-rate resync-after al-extents
                # c-plan-ahead c-delay-target c-fill-target c-max-rate
                # c-min-rate disk-timeout
                fencing resource-and-stonith;
	}
 
	net {
		# protocol timeout max-epoch-size max-buffers unplug-watermark
		# connect-int ping-int sndbuf-size rcvbuf-size ko-count
		# allow-two-primaries cram-hmac-alg shared-secret after-sb-0pri
		# after-sb-1pri after-sb-2pri always-asbp rr-conflict
		# ping-timeout data-integrity-alg tcp-cork on-congestion
		# congestion-fill congestion-extents csums-alg verify-alg
		# use-rle
 
		# Protocol "C" tells DRBD not to tell the operating system that
		# the write is complete until the data has reach persistent
		# storage on both nodes. This is the slowest option, but it is
		# also the only one that guarantees consistency between the
		# nodes. It is also required for dual-primary, which we will 
		# be using.
		protocol C;
 
		# Tell DRBD to allow dual-primary. This is needed to enable 
		# live-migration of our servers.
		allow-two-primaries yes;
 
		# This tells DRBD what to do in the case of a split-brain when
		# neither node was primary, when one node was primary and when
		# both nodes are primary. In our case, we'll be running
		# dual-primary, so we can not safely recover automatically. The
		# only safe option is for the nodes to disconnect from one
		# another and let a human decide which node to invalidate.
		after-sb-0pri discard-zero-changes;
		after-sb-1pri discard-secondary;
		after-sb-2pri disconnect;
	}
}

And now configure the first resource;

an-a04n01
vim /etc/drbd.d/r0.res
# This is the first DRBD resource. It will store the shared file systems and
# the servers designed to run on node 01.
resource r0 {
	# These options here are common to both nodes. If for some reason you
	# need to set unique values per node, you can move these to the
	# 'on <name> { ... }' section.
 
	# This sets the device name of this DRBD resouce.
	device /dev/drbd0;
 
	# This tells DRBD what the backing device is for this resource.
	disk /dev/sda5;
 
	# This controls the location of the metadata. When "internal" is used,
	# as we use here, a little space at the end of the backing devices is
	# set aside (roughly 32 MB per 1 TB of raw storage). External metadata
	# can be used to put the metadata on another partition when converting
	# existing file systems to be DRBD backed, when there is no extra space
	# available for the metadata.
	meta-disk internal;
 
	# NOTE: this is not required or even recommended with pacemaker. remove
	# 	this options as soon as pacemaker is setup.
	startup {
		# This tells DRBD to promote both nodes to 'primary' when this
		# resource starts. However, we will let pacemaker control this
		# so we comment it out, which tells DRBD to leave both nodes
		# as secondary when drbd starts.
		#become-primary-on both;
	}
 
	# NOTE: Later, make it an option in the dashboard to trigger a manual
	# 	verify and/or schedule periodic automatic runs
	net {
		# TODO: Test performance differences between sha1 and md5
		# This tells DRBD how to do a block-by-block verification of
		# the data stored on the backing devices. Any verification
		# failures will result in the effected block being marked
		# out-of-sync.
		verify-alg md5;
 
		# TODO: Test the performance hit of this being enabled.
		# This tells DRBD to generate a checksum for each transmitted
		# packet. If the received data doesn't generate the same
		# sum, a retransmit request is generated. This protects against
		# otherwise-undetected errors in transmission, like 
		# bit-flipping. See:
		# http://www.drbd.org/users-guide/s-integrity-check.html
		data-integrity-alg md5;
	}
 
	# WARNING: Confirm that these are safe when the controller's BBU is
	#          depleted/failed and the controller enters write-through 
	#          mode.
	disk {
		# TODO: Test the real-world performance differences gained with
		#       these options.
		# This tells DRBD not to bypass the write-back caching on the
		# RAID controller. Normally, DRBD forces the data to be flushed
		# to disk, rather than allowing the write-back caching to 
		# handle it. Normally this is dangerous, but with BBU-backed
		# caching, it is safe. The first option disables disk flushing
		# and the second disables metadata flushes.
		disk-flushes no;
		md-flushes no;
	}
 
	# This sets up the resource on node 01. The name used below must be the
	# named returned by "uname -n".
	on an-a04n01.alteeve.ca {
		# This is the address and port to use for DRBD traffic on this
		# node. Multiple resources can use the same IP but the ports
		# must differ. By convention, the first resource uses 7788, the
		# second uses 7789 and so on, incrementing by one for each
		# additional resource. 
		address 10.10.40.1:7788;
	}
	on an-a04n02.alteeve.ca {
		address 10.10.40.2:7788;
	}
}

And the second.

an-a04n01
vim /etc/drbd.d/r1.res
# This is the first DRBD resource. It will store the servers designed 
# to run on node 02.
resource r1 {
	device /dev/drbd1;
	disk /dev/sda6;
	meta-disk internal;
 
	net {
		verify-alg md5;
		data-integrity-alg md5;
	}
 
	disk {
		disk-flushes no;
		md-flushes no;
	}
 
	on an-a04n01.alteeve.ca {
		address 10.10.40.1:7789;
	}
	on an-a04n02.alteeve.ca {
		address 10.10.40.2:7789;
	}
}

Test the config;

an-a04n01
drbdadm dump
# /etc/drbd.conf
common {
}

# resource r0 on an-a04n01.alteeve.ca: not ignored, not stacked
# defined at /etc/drbd.d/r0.res:3
resource r0 {
    on an-a04n01.alteeve.ca {
        volume 0 {
            device       /dev/drbd0 minor 0;
            disk         /dev/sda5;
            meta-disk    internal;
        }
        address          ipv4 10.10.40.1:7788;
    }
    on an-a04n02.alteeve.ca {
        volume 0 {
            device       /dev/drbd0 minor 0;
            disk         /dev/sda5;
            meta-disk    internal;
        }
        address          ipv4 10.10.40.2:7788;
    }
    net {
        verify-alg       md5;
        data-integrity-alg md5;
    }
    disk {
        disk-flushes      no;
        md-flushes        no;
    }
}

# resource r1 on an-a04n01.alteeve.ca: not ignored, not stacked
# defined at /etc/drbd.d/r1.res:3
resource r1 {
    on an-a04n01.alteeve.ca {
        volume 0 {
            device       /dev/drbd1 minor 1;
            disk         /dev/sda6;
            meta-disk    internal;
        }
        address          ipv4 10.10.40.1:7789;
    }
    on an-a04n02.alteeve.ca {
        volume 0 {
            device       /dev/drbd1 minor 1;
            disk         /dev/sda6;
            meta-disk    internal;
        }
        address          ipv4 10.10.40.2:7789;
    }
    net {
        verify-alg       md5;
        data-integrity-alg md5;
    }
    disk {
        disk-flushes      no;
        md-flushes        no;
    }
}

Good, copy it to the other node and test it there.

an-a04n01
rsync -av /etc/drbd.* root@an-a04n02:/etc/
sending incremental file list
drbd.d/
drbd.d/global_common.conf
drbd.d/r0.res
drbd.d/r1.res

sent 5738 bytes  received 73 bytes  11622.00 bytes/sec
total size is 5618  speedup is 0.97
an-a04n02
drbdadm dump
# /etc/drbd.conf
common {
}

# resource r0 on an-a04n02.alteeve.ca: not ignored, not stacked
# defined at /etc/drbd.d/r0.res:3
resource r0 {
    on an-a04n01.alteeve.ca {
        volume 0 {
            device       /dev/drbd0 minor 0;
            disk         /dev/sda5;
            meta-disk    internal;
        }
        address          ipv4 10.10.40.1:7788;
    }
    on an-a04n02.alteeve.ca {
        volume 0 {
            device       /dev/drbd0 minor 0;
            disk         /dev/sda5;
            meta-disk    internal;
        }
        address          ipv4 10.10.40.2:7788;
    }
    net {
        verify-alg       md5;
        data-integrity-alg md5;
    }
    disk {
        disk-flushes      no;
        md-flushes        no;
    }
}

# resource r1 on an-a04n02.alteeve.ca: not ignored, not stacked
# defined at /etc/drbd.d/r1.res:3
resource r1 {
    on an-a04n01.alteeve.ca {
        volume 0 {
            device       /dev/drbd1 minor 1;
            disk         /dev/sda6;
            meta-disk    internal;
        }
        address          ipv4 10.10.40.1:7789;
    }
    on an-a04n02.alteeve.ca {
        volume 0 {
            device       /dev/drbd1 minor 1;
            disk         /dev/sda6;
            meta-disk    internal;
        }
        address          ipv4 10.10.40.2:7789;
    }
    net {
        verify-alg       md5;
        data-integrity-alg md5;
    }
    disk {
        disk-flushes      no;
        md-flushes        no;
    }
}

This isn't a plain dump of your configs; you will notice things have been shifted around. The point is that it dumped the configuration without errors, so we're good to go.

Start DRBD for the first time

Load the config;

an-a04n01
modprobe drbd
lsmod | grep drbd
drbd                  333723  0 
libcrc32c               1246  1 drbd
an-a04n02
modprobe drbd
lsmod | grep drbd
drbd                  333723  0 
libcrc32c               1246  1 drbd
Note: If you have used these partitions before, drbd may see an FS and refuse to create the MD. If that happens, use 'dd' to zero out the partition.
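
For example, zeroing the start of the backing partitions is normally enough to clear any old filesystem signature. This destroys whatever is on /dev/sda5 and /dev/sda6, so only do it if you are certain nothing on them is needed:

dd if=/dev/zero of=/dev/sda5 bs=1M count=500
dd if=/dev/zero of=/dev/sda6 bs=1M count=500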

Create the metadisk;

an-a04n01
drbdadm create-md r{0,1}
Writing meta data...
initializing activity log
NOT initializing bitmap
New drbd meta data block successfully created.
success
Writing meta data...
initializing activity log
NOT initializing bitmap
New drbd meta data block successfully created.
success
an-a04n02
drbdadm create-md r{0,1}
Writing meta data...
initializing activity log
NOT initializing bitmap
New drbd meta data block successfully created.
success
Writing meta data...
initializing activity log
NOT initializing bitmap
New drbd meta data block successfully created.
success

Bring up the new resources.

an-a04n01
drbdadm up r{0,1}
cat /proc/drbd
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 599f286440bd633d15d5ff985204aff4bccffadd build by root@rhel6-builder.alteeve.ca, 2014-07-20 21:29:34
 0: cs:WFConnection ro:Secondary/Unknown ds:Inconsistent/Outdated C r----s
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:454762916
 1: cs:WFConnection ro:Secondary/Unknown ds:Inconsistent/Outdated C r----s
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:396782732
an-a04n02
drbdadm up r{0,1}
cat /proc/drbd
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 599f286440bd633d15d5ff985204aff4bccffadd build by root@rhel6-builder.alteeve.ca, 2014-07-20 21:29:34
 0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:454762916
 1: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:396782732

Neither node has data, so we'll arbitrarily force node 01 to become primary, then normally promote node 02 to primary.

an-a04n01
drbdadm primary --force r{0,1}
cat /proc/drbd
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 599f286440bd633d15d5ff985204aff4bccffadd build by root@rhel6-builder.alteeve.ca, 2014-07-20 21:29:34
 0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
    ns:2136 nr:0 dw:0 dr:2800 al:0 bm:0 lo:0 pe:3 ua:0 ap:0 ep:1 wo:d oos:454760880
        [>....................] sync'ed:  0.1% (444100/444104)M
        finish: 421:04:29 speed: 252 (252) K/sec
 1: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
    ns:24696 nr:0 dw:0 dr:25360 al:0 bm:1 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:396758036
        [>....................] sync'ed:  0.1% (387456/387480)M
        finish: 35:33:06 speed: 3,084 (3,084) K/sec
an-a04n02
drbdadm primary r{0,1}
cat /proc/drbd
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 599f286440bd633d15d5ff985204aff4bccffadd build by root@rhel6-builder.alteeve.ca, 2014-07-20 21:29:34
 0: cs:SyncTarget ro:Primary/Primary ds:Inconsistent/UpToDate C r-----
    ns:0 nr:859488 dw:859432 dr:608 al:0 bm:52 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:453903484
        [>....................] sync'ed:  0.2% (443264/444104)M
        finish: 71:24:53 speed: 1,752 (4,428) want: 440 K/sec
 1: cs:SyncTarget ro:Primary/Primary ds:Inconsistent/UpToDate C r-----
    ns:0 nr:1140588 dw:1140532 dr:608 al:0 bm:69 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:395642200
        [>....................] sync'ed:  0.3% (386368/387480)M
        finish: 70:30:41 speed: 1,548 (5,876) want: 4,400 K/sec

The sync rate starts low, but it will continue to climb; you can keep an eye on it if you wish. DRBD 8.4 is smarter than 8.3 in that it adjusts the sync rate automatically based on load.

We can proceed now; we do not have to wait for the sync to complete.
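
If you do want to keep an eye on the resync, something like this works fine (press ctrl+c to exit):

watch -n 10 cat /proc/drbd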

Clustered LVM and GFS2

Clustered LVM provides the logical volumes that will back our /shared GFS2 partition and the storage for the HA servers.

Configure lvm.conf

Configure clustered LVM.

an-a04n01
sed -i.anvil 's^filter = \[ "a/\.\*/" \]^filter = \[ "a|/dev/drbd*|", "r/.*/" \]^' /etc/lvm/lvm.conf
sed -i 's/locking_type = 1$/locking_type = 3/' /etc/lvm/lvm.conf
sed -i 's/fallback_to_local_locking = 1$/fallback_to_local_locking = 0/' /etc/lvm/lvm.conf 
diff -u /etc/lvm/lvm.conf.anvil /etc/lvm/lvm.conf
--- /etc/lvm/lvm.conf.anvil	2013-10-30 04:10:42.000000000 -0400
+++ /etc/lvm/lvm.conf	2014-06-04 18:38:15.545166869 -0400
@@ -82,7 +82,7 @@
 
 
     # By default we accept every block device:
-    filter = [ "a/.*/" ]
+    filter = [ "a|/dev/drbd*|", "r/.*/" ]
 
     # Exclude the cdrom drive
     # filter = [ "r|/dev/cdrom|" ]
@@ -459,7 +459,7 @@
     # Type 3 uses built-in clustered locking.
     # Type 4 uses read-only locking which forbids any operations that might 
     # change metadata.
-    locking_type = 1
+    locking_type = 3
 
     # Set to 0 to fail when a lock request cannot be satisfied immediately.
     wait_for_locks = 1
@@ -475,7 +475,7 @@
     # to 1 an attempt will be made to use local file-based locking (type 1).
     # If this succeeds, only commands against local volume groups will proceed.
     # Volume Groups marked as clustered will be ignored.
-    fallback_to_local_locking = 1
+    fallback_to_local_locking = 0
 
     # Local non-LV directory that holds file-based locks while commands are
     # in progress.  A directory like /tmp that may get wiped on reboot is OK.
rsync -av /etc/lvm/lvm.conf* root@an-a04n02:/etc/lvm/
sending incremental file list
lvm.conf
lvm.conf.anvil

sent 47499 bytes  received 440 bytes  95878.00 bytes/sec
total size is 89999  speedup is 1.88
an-a04n02
diff -u /etc/lvm/lvm.conf.anvil /etc/lvm/lvm.conf
--- /etc/lvm/lvm.conf.anvil	2013-10-30 04:10:42.000000000 -0400
+++ /etc/lvm/lvm.conf	2014-06-04 18:38:15.000000000 -0400
@@ -82,7 +82,7 @@
 
 
     # By default we accept every block device:
-    filter = [ "a/.*/" ]
+    filter = [ "a|/dev/drbd*|", "r/.*/" ]
 
     # Exclude the cdrom drive
     # filter = [ "r|/dev/cdrom|" ]
@@ -459,7 +459,7 @@
     # Type 3 uses built-in clustered locking.
     # Type 4 uses read-only locking which forbids any operations that might 
     # change metadata.
-    locking_type = 1
+    locking_type = 3
 
     # Set to 0 to fail when a lock request cannot be satisfied immediately.
     wait_for_locks = 1
@@ -475,7 +475,7 @@
     # to 1 an attempt will be made to use local file-based locking (type 1).
     # If this succeeds, only commands against local volume groups will proceed.
     # Volume Groups marked as clustered will be ignored.
-    fallback_to_local_locking = 1
+    fallback_to_local_locking = 0
 
     # Local non-LV directory that holds file-based locks while commands are
     # in progress.  A directory like /tmp that may get wiped on reboot is OK.

Start clvmd

Note: This will be moved to pacemaker shortly. We're enabling it here just long enough to configure pacemaker.

Make sure the cluster is up (you could use 'pcs status', 'cman_tool status', etc):

an-a04n01
dlm_tool dump | grep node
1401921044 cluster node 1 added seq 68
1401921044 set_configfs_node 1 10.20.40.1 local 1
1401921044 cluster node 2 added seq 68
1401921044 set_configfs_node 2 10.20.40.2 local 0
1401921044 run protocol from nodeid 1

Make sure DRBD is up as primary on both nodes:

an-a04n01
cat /proc/drbd
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by root@rhel6-builder.alteeve.ca, 2014-04-20 12:16:31
 0: cs:SyncSource ro:Primary/Primary ds:UpToDate/Inconsistent C r-----
    ns:1519672 nr:0 dw:0 dr:1520336 al:0 bm:93 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:448214308
	[>....................] sync'ed:  0.4% (437708/439192)M
	finish: 6:20:02 speed: 19,652 (15,992) K/sec
 1: cs:SyncSource ro:Primary/Primary ds:UpToDate/Inconsistent C r-----
    ns:1896504 nr:0 dw:0 dr:1897168 al:0 bm:115 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:390577164
	[>....................] sync'ed:  0.5% (381420/383272)M
	finish: 2:33:17 speed: 42,440 (19,960) K/sec

Note that we don't have to wait for the sync to finish.

Start clvmd;

an-a04n01
/etc/init.d/clvmd start
Starting clvmd: 
Activating VG(s):   No volume groups found
                                                           [  OK  ]
an-a04n02
/etc/init.d/clvmd start
Starting clvmd: 
Activating VG(s):   No volume groups found
                                                           [  OK  ]
Note: If this fails, showing a timeout or simply never returning, make sure that TCP port 21064 is opened in your firewall on both nodes.

From here on, pacemaker will start clvmd when pacemaker itself starts, *if* clvmd is set to start on boot. So let's set that.

an-a04n01
chkconfig clvmd on
chkconfig --list clvmd
clvmd          	0:off	1:off	2:on	3:on	4:on	5:on	6:off
an-a04n02
chkconfig clvmd on
chkconfig --list clvmd
clvmd          	0:off	1:off	2:on	3:on	4:on	5:on	6:off

Create Initial PVs, VGs and the /shared LV

Create the PV, VG and the /shared LV;

an-a04n01
pvcreate /dev/drbd{0,1}
  Physical volume "/dev/drbd0" successfully created
  Physical volume "/dev/drbd1" successfully created
vgcreate an-a04n01_vg0 /dev/drbd0
  Clustered volume group "an-a04n01_vg0" successfully created
vgcreate an-a04n02_vg0 /dev/drbd1
  Clustered volume group "an-a04n02_vg0" successfully created
lvcreate -L 40GiB -n shared an-a04n01_vg0
  Logical volume "shared" created
an-a04n02
pvdisplay
  --- Physical volume ---
  PV Name               /dev/drbd1
  VG Name               an-a04n02_vg0
  PV Size               378.40 GiB / not usable 3.14 MiB
  Allocatable           yes 
  PE Size               4.00 MiB
  Total PE              96870
  Free PE               96870
  Allocated PE          0
  PV UUID               TpEXBC-7822-UGz0-ICz1-AJdg-v5eS-lyB7C5
   
  --- Physical volume ---
  PV Name               /dev/drbd0
  VG Name               an-a04n01_vg0
  PV Size               433.70 GiB / not usable 4.41 MiB
  Allocatable           yes 
  PE Size               4.00 MiB
  Total PE              111025
  Free PE               100785
  Allocated PE          10240
  PV UUID               RoHAJQ-qrsO-Ofwz-f8W7-jIXd-2cvG-oPgfFR
vgdisplay
  --- Volume group ---
  VG Name               an-a04n02_vg0
  System ID             
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  1
  VG Access             read/write
  VG Status             resizable
  Clustered             yes
  Shared                no
  MAX LV                0
  Cur LV                0
  Open LV               0
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               378.40 GiB
  PE Size               4.00 MiB
  Total PE              96870
  Alloc PE / Size       0 / 0   
  Free  PE / Size       96870 / 378.40 GiB
  VG UUID               9bTBDu-JSma-kwKR-4oBI-sxi1-YT6i-1uIM4C
   
  --- Volume group ---
  VG Name               an-a04n01_vg0
  System ID             
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  2
  VG Access             read/write
  VG Status             resizable
  Clustered             yes
  Shared                no
  MAX LV                0
  Cur LV                1
  Open LV               0
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               433.69 GiB
  PE Size               4.00 MiB
  Total PE              111025
  Alloc PE / Size       10240 / 40.00 GiB
  Free  PE / Size       100785 / 393.69 GiB
  VG UUID               hLnvle-EScm-cP1t-xodO-cKyv-5EyC-TyIpj5
lvdisplay
  --- Logical volume ---
  LV Path                /dev/an-a04n01_vg0/shared
  LV Name                shared
  VG Name                an-a04n01_vg0
  LV UUID                tvolRF-cb3L-29Dn-Vgqd-e4rf-Qq2e-JFIcbA
  LV Write Access        read/write
  LV Creation host, time an-a04n01.alteeve.ca, 2014-06-07 18:54:41 -0400
  LV Status              available
  # open                 0
  LV Size                40.00 GiB
  Current LE             10240
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:0

Create the /shared GFS2 filesystem

Format the /dev/an-a04n01_vg0/shared logical volume as a GFS2 filesystem;

an-a04n01
mkfs.gfs2 -j 2 -p lock_dlm -t an-anvil-04:shared /dev/an-a04n01_vg0/shared
This will destroy any data on /dev/an-a04n01_vg0/shared.
It appears to contain: symbolic link to `../dm-0'

Are you sure you want to proceed? [y/n] y
Device:                    /dev/an-a04n01_vg0/shared
Blocksize:                 4096
Device Size                40.00 GB (10485760 blocks)
Filesystem Size:           40.00 GB (10485758 blocks)
Journals:                  2
Resource Groups:           160
Locking Protocol:          "lock_dlm"
Lock Table:                "an-anvil-04:shared"
UUID:                      e07d35fe-6860-f790-38cd-af075366c27b
mkdir /shared
mount /dev/an-a04n01_vg0/shared /shared
df -hP
Filesystem                         Size  Used Avail Use% Mounted on
/dev/sda3                           20G  1.5G   18G   8% /
tmpfs                               12G   67M   12G   1% /dev/shm
/dev/sda1                          504M   72M  407M  16% /boot
/dev/mapper/an--a04n01_vg0-shared   40G  259M   40G   1% /shared
an-a04n02
mkdir /shared
mount /dev/an-a04n01_vg0/shared /shared
df -hP
Filesystem                         Size  Used Avail Use% Mounted on
/dev/sda3                           20G  1.5G   18G   8% /
tmpfs                               12G   52M   12G   1% /dev/shm
/dev/sda1                          504M   72M  407M  16% /boot
/dev/mapper/an--a04n01_vg0-shared   40G  259M   40G   1% /shared

Add Storage to Pacemaker

Configure Dual-Primary DRBD

Setup DRBD as a dual-primary resource.

Notes:

  • Clones allow for a given service to run on multiple nodes.
    • master-max is how many copies of the resource can be promoted to master at the same time across the cluster.
    • master-node-max is how many copies of the resource can be promoted to master on a given node.
    • clone-max is how many copies can run in the cluster; the default is the number of nodes in the cluster.
    • clone-node-max is the number of instances of the resource that can run on each node.
    • notify controls whether other nodes are notified before and after a resource is started or stopped on a given node.
an-a04n01
pcs cluster cib drbd_cfg
pcs -f drbd_cfg resource create drbd_r0 ocf:linbit:drbd drbd_resource=r0 op monitor interval=10s
pcs -f drbd_cfg resource create drbd_r1 ocf:linbit:drbd drbd_resource=r1 op monitor interval=10s
### Ignore this for now.
#pcs -f drbd_cfg resource create drbd_r0 ocf:linbit:drbd drbd_resource=r0 \
#                op monitor interval=29s role=Master \
#                op monitor interval=31s role=Slave \
#                op promote interval=0 timeout=90s start-delay=2s \
#                op start interval=0 timeout=240s \
#                op stop interval=0 timeout=120s
pcs -f drbd_cfg resource master drbd_r0_Clone drbd_r0 master-max=2 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
pcs -f drbd_cfg resource master drbd_r1_Clone drbd_r1 master-max=2 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
pcs cluster cib-push drbd_cfg
CIB updated

Give it a couple of minutes to promote the resources to Master on both nodes. Initially, they will appear as Master on one node only.

Once updated, you should see this:

an-a04n01
pcs status
Cluster name: an-anvil-04
Last updated: Sat Jun  7 20:29:09 2014
Last change: Sat Jun  7 20:28:36 2014 via cibadmin on an-a04n01.alteeve.ca
Stack: cman
Current DC: an-a04n01.alteeve.ca - partition with quorum
Version: 1.1.10-14.el6_5.3-368c726
2 Nodes configured
6 Resources configured


Online: [ an-a04n01.alteeve.ca an-a04n02.alteeve.ca ]

Full list of resources:

 fence_n01_ipmi	(stonith:fence_ipmilan):	Started an-a04n01.alteeve.ca 
 fence_n02_ipmi	(stonith:fence_ipmilan):	Started an-a04n02.alteeve.ca 
 Master/Slave Set: drbd_r0_Clone [drbd_r0]
     Masters: [ an-a04n01.alteeve.ca an-a04n02.alteeve.ca ]
 Master/Slave Set: drbd_r1_Clone [drbd_r1]
     Masters: [ an-a04n01.alteeve.ca an-a04n02.alteeve.ca ]
an-a04n02
pcs status
Cluster name: an-anvil-04
Last updated: Sat Jun  7 20:29:36 2014
Last change: Sat Jun  7 20:28:36 2014 via cibadmin on an-a04n01.alteeve.ca
Stack: cman
Current DC: an-a04n01.alteeve.ca - partition with quorum
Version: 1.1.10-14.el6_5.3-368c726
2 Nodes configured
6 Resources configured


Online: [ an-a04n01.alteeve.ca an-a04n02.alteeve.ca ]

Full list of resources:

 fence_n01_ipmi	(stonith:fence_ipmilan):	Started an-a04n01.alteeve.ca 
 fence_n02_ipmi	(stonith:fence_ipmilan):	Started an-a04n02.alteeve.ca 
 Master/Slave Set: drbd_r0_Clone [drbd_r0]
     Masters: [ an-a04n01.alteeve.ca an-a04n02.alteeve.ca ]
 Master/Slave Set: drbd_r1_Clone [drbd_r1]
     Masters: [ an-a04n01.alteeve.ca an-a04n02.alteeve.ca ]

Configure LVM

We need to have pacemaker activate our clustered LVM LVs on start, and deactivate them when stopping. We don't start/stop clvmd directly because of stop timing issues that can lead to stray fencing.

Note: This will throw errors if there are no LVs on a given VG... Do not add a volume group until at least one logical volume has been created.
an-a04n01
pcs cluster cib lvm_cfg
pcs -f lvm_cfg resource create lvm_n01_vg0 ocf:heartbeat:LVM volgrpname=an-a04n01_vg0 op monitor interval=10s
pcs -f lvm_cfg resource master lvm_n01_vg0_Clone lvm_n01_vg0 master-max=2 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
pcs cluster cib-push lvm_cfg
CIB updated

Configure LVM to start after the DRBD PV is Primary

If we stopped here, there is a good chance that on future starts of pacemaker, LVM and DRBD would start in parallel; DRBD would take too long, LVM would error out and stoniths would start to fly. To prevent this, we will tell pacemaker not to start the LVM resource until after the DRBD resource behind the volume group has been promoted to primary.

an-a04n01
pcs cluster cib cst_cfg 
pcs -f cst_cfg constraint order promote drbd_r0_Clone then start lvm_n01_vg0_Clone
Adding drbd_r0_Clone lvm_n01_vg0_Clone (kind: Mandatory) (Options: first-action=promote then-action=start)
pcs cluster cib-push cst_cfg
CIB updated
pcs constraint show
Location Constraints:
Ordering Constraints:
  promote drbd_r0_Clone then start lvm_n01_vg0_Clone
Colocation Constraints:

Configure the /shared GFS2 Partition

an-a04n01
pcs cluster cib fs_cfg
pcs -f fs_cfg resource create sharedFS Filesystem device="/dev/an-a04n01_vg0/shared" directory="/shared" fstype="gfs2"
pcs -f fs_cfg resource clone sharedFS master-max=2 master-node-max=1 clone-max=2 clone-node-max=1
pcs cluster cib-push fs_cfg
CIB updated
pcs status
Cluster name: an-anvil-04
Last updated: Sat Jun  7 21:09:28 2014
Last change: Sat Jun  7 21:08:47 2014 via cibadmin on an-a04n01.alteeve.ca
Stack: cman
Current DC: an-a04n01.alteeve.ca - partition with quorum
Version: 1.1.10-14.el6_5.3-368c726
2 Nodes configured
8 Resources configured


Online: [ an-a04n01.alteeve.ca an-a04n02.alteeve.ca ]

Full list of resources:

 fence_n01_ipmi	(stonith:fence_ipmilan):	Started an-a04n01.alteeve.ca 
 fence_n02_ipmi	(stonith:fence_ipmilan):	Started an-a04n02.alteeve.ca 
 Master/Slave Set: drbd_r0_Clone [drbd_r0]
     Masters: [ an-a04n01.alteeve.ca an-a04n02.alteeve.ca ]
 Master/Slave Set: drbd_r1_Clone [drbd_r1]
     Masters: [ an-a04n01.alteeve.ca an-a04n02.alteeve.ca ]
 Clone Set: sharedFS-clone [sharedFS]
     Started: [ an-a04n01.alteeve.ca an-a04n02.alteeve.ca ]
df -hP
Filesystem                         Size  Used Avail Use% Mounted on
/dev/sda3                           20G  1.5G   18G   8% /
tmpfs                               12G   67M   12G   1% /dev/shm
/dev/sda1                          504M   72M  407M  16% /boot
/dev/mapper/an--a04n01_vg0-shared   40G  259M   40G   1% /shared
an-a04n02
df -h
Filesystem                         Size  Used Avail Use% Mounted on
/dev/sda3                           20G  1.5G   18G   8% /
tmpfs                               12G   52M   12G   1% /dev/shm
/dev/sda1                          504M   72M  407M  16% /boot
/dev/mapper/an--a04n01_vg0-shared   40G  259M   40G   1% /shared

Configure /shared to start after LVM

As we did before in making sure LVM started after DRBD, this time we will make sure LVM starts before /shared is mounted.

an-a04n01
pcs cluster cib cst_cfg
pcs -f cst_cfg constraint order start lvm_n01_vg0_Clone then start sharedFS
Adding lvm_n01_vg0_Clone sharedFS (kind: Mandatory) (Options: first-action=start then-action=start)
pcs cluster cib-push cst_cfg
CIB updated
pcs constraint show --full
Location Constraints:
Ordering Constraints:
  promote drbd_r0_Clone then start lvm_n01_vg0_Clone (Mandatory) (id:order-drbd_r0_Clone-lvm_n01_vg0_Clone-mandatory)
  start lvm_n01_vg0_Clone then start sharedFS-clone (Mandatory) (id:order-lvm_n01_vg0_Clone-sharedFS-clone-mandatory)
Colocation Constraints:

Note that this time we added '--full'. If you ever need to delete a constraint, you would use 'pcs constraint delete <id>'.
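
For example, if you ever wanted to drop the ordering we just added (do not actually run this now), the command would be:

pcs constraint delete order-lvm_n01_vg0_Clone-sharedFS-clone-mandatory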

Notes

Thanks

This list will certainly grow as this tutorial progresses;

 

Any questions, feedback, advice, complaints or meanderings are welcome.