Anvil! Tutorial 3: Difference between revisions

From Alteeve Wiki
No edit summary
Line 1: Line 1:
{{howto_header}}
== Disabling Quorum ==


{{warning|1=This tutorial is incomplete, flawed and generally sucks at this time. Do not follow this and expect anything to work. In large part, it's a dumping ground for notes and little else. This warning will be removed when the tutorial is completed.}}
{{note|1=Show the math.}}


{{warning|1=This tutorial is built on '''''a guess''''' of what [[Red Hat]]'s Enterprise Linux 7 will offer, based on what the author sees happening in [[Fedora]] upstream. [[Red Hat]] never confirms what a future release will contain until it is actually released. As such, this tutorial may turn out to be inappropriate for the final release of [[RHEL]] 7. In such a case, the warning above will remain in place until the tutorial is updated to reflect the final release.}}
With quorum enabled, a two node cluster will lose quorum once either node fails. So we have to disable quorum.


This is the third '''AN!Cluster''' tutorial built on [[Red Hat]]'s Enterprise Linux 7. It improves on the [[Red Hat Cluster Service 2 Tutorial|RHEL 5, RHCS stable 2]] and [[2-Node Red Hat KVM Cluster Tutorial|RHEL 6, RHCS stable 3]] tutorials.
By default, pacemaker uses quorum. You don't see this initially though;
 
As with the previous tutorials, the end goal of this tutorial is a 2-node cluster providing a platform for high-availability virtual servers. Its design attempts to remove all single points of failure from the system. Power and networking are made fully redundant in this version, and the node failures that would lead to service interruption are minimized. This tutorial also covers the [[AN!Utilities]]: [[AN!Cluster Dashboard]], [[AN!Cluster Monitor]] and [[AN!Safe Cluster Shutdown]].
 
As in the previous tutorial, [[KVM]] will be the hypervisor used for running virtual machines. The old <span class="code">[[cman]]</span> and <span class="code">[[rgmanager]]</span> tools are replaced by <span class="code">[[pacemaker]]</span> for resource management.
 
= Before We Begin =
 
This tutorial '''does not''' require prior cluster experience, but it does expect familiarity with Linux and a low-intermediate understanding of networking. Where possible, steps are explained in detail and rationale is provided for why certain decisions are made.
 
'''For those with cluster experience''';
 
Please be careful not to skip too much. There are some major and some subtle changes from previous tutorials.
 
= OS Setup =
 
{{warning|1=I used Fedora 18 at this point. Obviously, things will change, possibly a lot, once RHEL 7 is released.}}
 
== Install ==
 
Not all of these are required, but most are used at one point or another in this tutorial.


<source lang="bash">
<source lang="bash">
yum install bridge-utils corosync gpm man net-tools network ntp pacemaker pcs rsync syslinux vim wget
pcs property
</source>
 
If you want to use your mouse at the node's terminal, run the following;
 
<source lang="bash">
systemctl enable gpm.service
systemctl start gpm.service
</source>
 
== Make the Network Configuration Static ==
 
We don't want [[NetworkManager]] in our cluster as it tries to dynamically manage the network and we need our network to be static.
 
<source lang="bash">
yum remove NetworkManager
</source>
 
{{note|1=This assumes that [[systemd]] will be used in [[RHEL]] 7. This may not be the case come release day.}}
 
Now to ensure the static <span class="code">network</span> service starts on boot.
 
<source lang="bash">
systemctl enable network.service
</source>
 
== Setting the Hostname ==
 
Fedora 18 is '''very''' different from [[EL6]].
 
{{note|1=The '<span class="code">--pretty</span>' line currently doesn't work as there is [https://bugzilla.redhat.com/show_bug.cgi?id=895299 a bug (rhbz#895299)] with single-quotes.}}
{{note|1=The '<span class="code">--static</span>' option is currently needed to prevent the '<span class="code">.</span>' from being removed. See [https://bugzilla.redhat.com/show_bug.cgi?id=896756 this bug (rhbz#896756)].}}
 
Use a format that works for you. For the tutorial, node names are based on the following;
* A two-letter prefix identifying the company/user (<span class="code">an</span>, for "Alteeve's Niche!")
* A sequential cluster ID number in the form of <span class="code">cXX</span> (<span class="code">c01</span> for "Cluster 01", <span class="code">c02</span> for Cluster 02, etc)
* A sequential node ID number in the form of <span class="code">nYY</span>
 
In my case, this is my third cluster and I use the company prefix <span class="code">an</span>, so my two nodes will be;
* <span class="code">an-c03n01</span> - node 1
* <span class="code">an-c03n02</span> - node 2
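
The naming scheme above can be sketched as a small shell helper. This is only an illustration; the function name <span class="code">node_name</span> is hypothetical and is not part of any cluster tool. It assumes the cluster and node IDs are passed as plain decimal numbers without leading zeros.

```shell
# Build a node name in the form <prefix>-cXXnYY.
# node_name is a hypothetical helper used only for illustration.
node_name() {
    prefix=$1 cluster_id=$2 node_id=$3
    # %02d zero-pads the sequence numbers to two digits.
    printf '%s-c%02dn%02d\n' "$prefix" "$cluster_id" "$node_id"
}

# The two nodes of the third cluster, using the "an" prefix.
node_name an 3 1
node_name an 3 2
```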
 
Folks who've read my earlier tutorials will note that this is a departure in naming. I find this method spans and scales much better. Further, it is simply required in order to use the [[AN!CDB|AN! Cluster Dashboard]].
 
<source lang="bash">
hostnamectl set-hostname an-c03n01.alteeve.ca --static
hostnamectl set-hostname --pretty "Alteeve's Niche! - Cluster 01, Node 01"
</source>
 
If you want the new host name to take effect immediately, you can use the traditional <span class="code">hostname</span> command:
 
<source lang="bash">
hostname an-c03n01.alteeve.ca
</source>
 
'''Alternatively'''
 
If you have trouble with those commands, you can directly edit the files that contain the host names.
 
The host name is stored in <span class="code">/etc/hostname</span>:
 
<source lang="bash">
echo an-c03n01.alteeve.ca > /etc/hostname
cat /etc/hostname
</source>
</source>
<source lang="text">
<source lang="text">
an-c03n01.alteeve.ca
Cluster Properties:
dc-version: 1.1.9-0.1318.a7966fb.git.fc18-a7966fb
cluster-infrastructure: corosync
</source>
</source>


The "pretty" host name is stored in <span class="code">/etc/machine-info</span> as the unquoted value of the <span class="code">PRETTY_HOSTNAME</span> variable.
To disable it, we set <span class="code">no-quorum-policy=ignore</span>.


<source lang="bash">
<source lang="bash">
vim /etc/machine-info
pcs property set no-quorum-policy=ignore
pcs property
</source>
</source>
<source lang="text">
<source lang="text">
PRETTY_HOSTNAME=Alteeves Niche! - Cluster 01, Node 01
Cluster Properties:
dc-version: 1.1.9-0.1318.a7966fb.git.fc18-a7966fb
cluster-infrastructure: corosync
no-quorum-policy: ignore
</source>
</source>


If you can't get the <span class="code">hostname</span> command to work for some reason, you can reboot to have the system read the new values.
== Enabling and Configuring Fencing ==
 
== Optional - Video Problems ==


On my servers, [[Fedora]] 18 doesn't detect or use the video card properly. To resolve this, I need to add <span class="code">nomodeset</span> to the kernel line when installing and again after the install is complete.
We will use IPMI and PDU based fence devices for redundancy.


Once installed
You can see the list of available fence agents here. You will need to find the one for your hardware fence devices.
 
Edit the <span class="code">/etc/default/grub</span> and append <span class="code">nomodeset</span> to the end of the <span class="code">GRUB_CMDLINE_LINUX</span> variable.


<source lang="bash">
<source lang="bash">
vim /etc/default/grub
pcs stonith list
</source>
<source lang="bash">
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_CMDLINE_LINUX="nomodeset rd.md=0 rd.lvm=0 rd.dm=0 $([ -x /usr/sbin/rhcrashkernel-param ] && /usr/sbin/rhcrashkernel-param || :) rd.luks=0 vconsole.keymap=us nomodeset"
GRUB_DISABLE_RECOVERY="true"
GRUB_THEME="/boot/grub2/themes/system/theme.txt"
</source>
 
Save that, and then rewrite the [[grub2]] configuration file.
 
<source lang="bash">
grub2-mkconfig -o /boot/grub2/grub.cfg
</source>
 
Next time you reboot, you should get a stock 80x25 character display. It's not much, but it will work on esoteric video cards or weird monitors.
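
If you would rather not edit the file by hand, the append step can be scripted. The following is a rough sketch, assuming GNU <span class="code">sed</span> and a <span class="code">GRUB_CMDLINE_LINUX</span> line that is wrapped in double quotes on a single line. The function name <span class="code">add_nomodeset</span> is hypothetical. Try it on a copy of the file first.

```shell
# Append nomodeset to GRUB_CMDLINE_LINUX in the given file, unless the
# option is already present. add_nomodeset is a hypothetical helper.
add_nomodeset() {
    grub_file=$1
    # Nothing to do if the option is already on the kernel command line.
    if grep -q '^GRUB_CMDLINE_LINUX=.*nomodeset' "$grub_file"; then
        return 0
    fi
    # Insert " nomodeset" just before the closing double quote.
    sed -i 's/^\(GRUB_CMDLINE_LINUX=".*\)"$/\1 nomodeset"/' "$grub_file"
}

# Example use (as root), followed by regenerating the grub2 configuration:
#   add_nomodeset /etc/default/grub
#   grub2-mkconfig -o /boot/grub2/grub.cfg
```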
 
== What Security? ==
 
This section will be re-added at the end. For now;
 
<source lang="bash">
setenforce 0
sed -i 's/SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
systemctl disable firewalld.service
systemctl stop firewalld.service
</source>
 
== Network ==
 
We want static, named network devices. Follow this;
 
* [[Changing Ethernet Device Names in EL7 and Fedora 15+]]
 
Then, use these configuration files;
 
Build the bridge;
 
<source lang="bash">
vim /etc/sysconfig/network-scripts/ifcfg-ifn-vbr1
</source>
<source lang="bash">
# Internet-Facing Network - Bridge
DEVICE="ifn-vbr1"
TYPE="Bridge"
BOOTPROTO="none"
IPADDR="10.255.10.1"
NETMASK="255.255.0.0"
GATEWAY="10.255.255.254"
DNS1="8.8.8.8"
DNS2="8.8.4.4"
DEFROUTE="yes"
</source>
 
Now build the bonds;
 
<source lang="bash">
vim /etc/sysconfig/network-scripts/ifcfg-ifn-bond1
</source>
<source lang="bash">
# Internet-Facing Network - Bond
DEVICE="ifn-bond1"
BRIDGE="ifn-vbr1"
BOOTPROTO="none"
NM_CONTROLLED="no"
ONBOOT="yes"
BONDING_OPTS="mode=1 miimon=100 use_carrier=1 updelay=120000 downdelay=0 primary=ifn1"
</source>
 
<source lang="bash">
vim /etc/sysconfig/network-scripts/ifcfg-sn-bond1
</source>
<source lang="bash">
# Storage Network - Bond
DEVICE="sn-bond1"
BOOTPROTO="none"
NM_CONTROLLED="no"
ONBOOT="yes"
BONDING_OPTS="mode=1 miimon=100 use_carrier=1 updelay=120000 downdelay=0 primary=sn1"
IPADDR="10.10.10.1"
NETMASK="255.255.0.0"
</source>
 
<source lang="bash">
vim /etc/sysconfig/network-scripts/ifcfg-bcn-bond1
</source>
<source lang="bash">
# Back-Channel Network - Bond
DEVICE="bcn-bond1"
BOOTPROTO="none"
NM_CONTROLLED="no"
ONBOOT="yes"
BONDING_OPTS="mode=1 miimon=100 use_carrier=1 updelay=120000 downdelay=0 primary=bcn1"
IPADDR="10.20.10.1"
NETMASK="255.255.0.0"
</source>
 
Now tell the interfaces to be slaves to their bonds;
 
Internet-Facing Network;
 
<source lang="bash">
vim /etc/sysconfig/network-scripts/ifcfg-ifn1
</source>
<source lang="bash">
# Internet-Facing Network - Link 1
DEVICE="ifn1"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
SLAVE="yes"
MASTER="ifn-bond1"
</source>
 
<source lang="bash">
vim /etc/sysconfig/network-scripts/ifcfg-ifn2
</source>
<source lang="bash">
# Internet-Facing Network - Link 2
DEVICE="ifn2"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
SLAVE="yes"
MASTER="ifn-bond1"
</source>
 
Storage Network;
 
<source lang="bash">
vim /etc/sysconfig/network-scripts/ifcfg-sn1
</source>
<source lang="bash">
# Storage Network - Link 1
DEVICE="sn1"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
SLAVE="yes"
MASTER="sn-bond1"
</source>
 
<source lang="bash">
vim /etc/sysconfig/network-scripts/ifcfg-sn2
</source>
<source lang="bash">
# Storage Network - Link 2
DEVICE="sn2"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
SLAVE="yes"
MASTER="sn-bond1"
</source>
 
Back-Channel Network
 
<source lang="bash">
vim /etc/sysconfig/network-scripts/ifcfg-bcn1
</source>
<source lang="bash">
# Back-Channel Network - Link 1
DEVICE="bcn1"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
SLAVE="yes"
MASTER="bcn-bond1"
</source>
 
<source lang="bash">
vim /etc/sysconfig/network-scripts/ifcfg-bcn2
</source>
<source lang="bash">
# Back-Channel Network - Link 2
DEVICE="bcn2"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
SLAVE="yes"
MASTER="bcn-bond1"
</source>
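
The six slave files differ only in the device name, the bond name and the comment. As a sketch, they could be generated from one template. The function name <span class="code">slave_ifcfg</span> is hypothetical. It only prints the file body, so nothing is written until you redirect the output yourself.

```shell
# Print the body of an ifcfg file for one bonded slave interface.
# slave_ifcfg is a hypothetical helper; arguments are the device name,
# the bond it belongs to, and a free-form comment for the first line.
slave_ifcfg() {
    device=$1 bond=$2 label=$3
    cat <<EOF
# $label
DEVICE="$device"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
SLAVE="yes"
MASTER="$bond"
EOF
}

# Preview the second Internet-Facing Network link.
slave_ifcfg ifn2 ifn-bond1 "Internet-Facing Network - Link 2"

# To write a file (as root), redirect the output, for example:
#   slave_ifcfg sn2 sn-bond1 "Storage Network - Link 2" \
#       > /etc/sysconfig/network-scripts/ifcfg-sn2
```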
 
Now restart the network, confirm that the bonds and bridge are up and you are ready to proceed.
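
One way to confirm the bonds is to read the status files that the Linux bonding driver exposes under <span class="code">/proc/net/bonding/</span>. The sketch below assumes the usual layout of those files, which include a <span class="code">Currently Active Slave:</span> line for active-backup bonds. The function name <span class="code">active_slave</span> is hypothetical.

```shell
# Print the currently active slave recorded in a bonding status file.
# active_slave is a hypothetical helper for illustration.
active_slave() {
    awk -F': ' '/^Currently Active Slave:/ { print $2 }' "$1"
}

# Report the active link of each bond used in this tutorial.
for bond in ifn-bond1 sn-bond1 bcn-bond1; do
    status="/proc/net/bonding/$bond"
    if [ -r "$status" ]; then
        echo "$bond: active slave is $(active_slave "$status")"
    else
        echo "$bond: no status file at $status"
    fi
done
```

With <span class="code">primary=ifn1</span> and the like set in <span class="code">BONDING_OPTS</span>, you would expect the first link of each bond to be reported while both links are healthy.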
 
== Setup The hosts File ==
 
You can use [[DNS]] if you prefer. For now, let's use <span class="code">/etc/hosts</span> for node name resolution.
 
<source lang="bash">
vim /etc/hosts
</source>
<source lang="text">
127.0.0.1  localhost localhost.localdomain localhost4 localhost4.localdomain4
::1        localhost localhost.localdomain localhost6 localhost6.localdomain6
 
# AN!Cluster 01, Node 01
10.255.10.1    an-c01n01.ifn
10.10.10.1      an-c01n01.sn
10.20.10.1      an-c01n01.bcn an-c01n01 an-c01n01.alteeve.ca
10.20.11.1      an-c01n01.ipmi
 
# AN!Cluster 01, Node 02
10.255.10.2    an-c01n02.ifn
10.10.10.2      an-c01n02.sn
10.20.10.2      an-c01n02.bcn an-c01n02 an-c01n02.alteeve.ca
10.20.11.2      an-c01n02.ipmi
 
# Foundation Pack
10.20.2.7      an-p03 an-p03.alteeve.ca
</source>
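
Typos in <span class="code">/etc/hosts</span> are easy to miss, so it can help to check which address each name maps to. The following is a minimal sketch that reads a hosts-format file directly instead of going through the system resolver. The function name <span class="code">lookup_host</span> is hypothetical, and the names checked in the loop are examples that should be changed to match the entries in your own file.

```shell
# Print the address that a hosts-format file assigns to a name.
# lookup_host is a hypothetical helper; it prints nothing if the name
# is not listed.
lookup_host() {
    hosts_file=$1 name=$2
    awk -v name="$name" '
        /^[[:space:]]*#/ { next }    # skip comment lines
        {
            # Field 1 is the address, the remaining fields are names.
            for (i = 2; i <= NF; i++) {
                if ($i == name) { print $1; exit }
            }
        }
    ' "$hosts_file"
}

# Example: show what a few names resolve to in the local hosts file.
if [ -r /etc/hosts ]; then
    for name in localhost an-c01n01.bcn an-c01n01.sn an-c01n01.ifn; do
        echo "$name -> $(lookup_host /etc/hosts "$name")"
    done
fi
```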
 
== Setup SSH ==
 
Same as [[2-Node_Red_Hat_KVM_Cluster_Tutorial#Setting_up_SSH|before]].
 
== Populating And Pushing ~/.ssh/known_hosts ==
 
Same as [[2-Node_Red_Hat_KVM_Cluster_Tutorial#Populating_And_Pushing_~/.ssh/known_hosts|before]].
 
<source lang="bash">
ssh root@an-c03n01.alteeve.ca
</source>
<source lang="text">
The authenticity of host 'an-c03n01.alteeve.ca (10.20.30.1)' can't be established.
RSA key fingerprint is 7b:dd:0d:aa:c5:f5:9e:a6:b6:4d:40:69:d6:80:4d:09.
Are you sure you want to continue connecting (yes/no)?
</source>
 
Type <span class="code">yes</span>
 
<source lang="text">
Are you sure you want to continue connecting (yes/no)? yes
</source>
<source lang="text">
Warning: Permanently added 'an-c03n01.alteeve.ca,10.20.30.1' (RSA) to the list of known hosts.
Last login: Thu Feb 14 15:18:33 2013 from 10.20.5.100
</source>
 
You will now be logged into the <span class="code">an-c03n01</span> node, which in this case is the same machine on a new session in the same terminal.
 
<source lang="text">
[root@an-c03n01 ~]#
</source>
 
You can log out by typing <span class="code">exit</span>.
 
<source lang="bash">
exit
</source>
<source lang="text">
logout
Connection to an-c03n01.alteeve.ca closed.
</source>
 
Now we have to repeat the steps for all the other variations on the names of the hosts. This is annoying and tedious, sorry.
 
<source lang="bash">
ssh root@an-c03n01
ssh root@an-c03n01.bcn
ssh root@an-c03n01.sn
ssh root@an-c03n01.ifn
ssh root@an-c03n02.alteeve.ca
ssh root@an-c03n02
ssh root@an-c03n02.bcn
ssh root@an-c03n02.sn
ssh root@an-c03n02.ifn
</source>
 
Your <span class="code">~/.ssh/known_hosts</span> file will now be populated with both nodes' ssh fingerprints. Copy it over to the second node to save all that typing a second time.
 
<source lang="bash">
rsync -av ~/.ssh/known_hosts root@an-c03n02:/root/.ssh/
</source>
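
The list of name variations follows a fixed pattern, so it can be generated. The sketch below prints the same ten names used above; the function name <span class="code">host_variants</span> is hypothetical. The commented <span class="code">ssh-keyscan</span> line shows one possible way to collect the keys without logging in to each name. Be aware that it accepts whatever key is offered, so it skips the manual fingerprint check, and the resulting entries may be formatted slightly differently from those written by <span class="code">ssh</span>.

```shell
# Print every name variant for the given nodes: the fully qualified name,
# the short name, and one name per network (bcn, sn, ifn).
# host_variants is a hypothetical helper for illustration.
host_variants() {
    domain=$1
    shift
    for node in "$@"; do
        echo "$node.$domain"
        echo "$node"
        for net in bcn sn ifn; do
            echo "$node.$net"
        done
    done
}

# List the names for both nodes.
host_variants alteeve.ca an-c03n01 an-c03n02

# Possible non-interactive alternative to the manual logins:
#   ssh-keyscan -t rsa $(host_variants alteeve.ca an-c03n01 an-c03n02) >> ~/.ssh/known_hosts
```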
 
== Keeping Time in Sync ==
 
It's not as critical as it used to be to keep the clocks on the nodes in sync, but it's still a good idea.
 
<source lang="bash">
systemctl start ntpd.service
systemctl enable ntpd.service
</source>
 
= Configuring the Cluster =
 
Now we're getting down to business!
 
For this section, we will be working on <span class="code">an-c03n01</span> and using [[ssh]] to perform tasks on <span class="code">an-c03n02</span>.
 
{{note|1=TODO: explain what this is and how it works.}}
 
== Enable the pcs Daemon ==
 
{{note|1=Most of this section comes more or less verbatim from the main [http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Clusters_from_Scratch/index.html Clusters from Scratch] tutorial.}}
 
We will use [[pcs]], the Pacemaker Configuration System, to configure our cluster.
 
<source lang="bash">
systemctl start pcsd.service
systemctl enable pcsd.service
</source>
<source lang="text">
ln -s '/usr/lib/systemd/system/pcsd.service' '/etc/systemd/system/multi-user.target.wants/pcsd.service'
</source>
 
Now we need to set a password for the <span class="code">hacluster</span> user. This is the account used by <span class="code">pcs</span> on one node to talk to the <span class="code">pcs</span> [[daemon]] on the other node. For this tutorial, we will use the password <span class="code">secret</span>. You will want to use [https://xkcd.com/936/ a stronger password], of course.
 
<source lang="bash">
echo secret | passwd --stdin hacluster
</source>
<source lang="text">
Changing password for user hacluster.
passwd: all authentication tokens updated successfully.
</source>
 
== Initializing the Cluster ==
 
One of the biggest reasons we're using the [[pcs]] tool, over something like [[crm]], is that it has been written to simplify the setup of clusters on [[Red Hat]] style operating systems. It will configure [[corosync]] automatically.
 
First, authenticate against the cluster nodes.
 
<source lang="bash">
pcs cluster auth an-c03n01 an-c03n02
</source>
 
This will ask you for the user name and password. The default user name is <span class="code">hacluster</span> and we set the password to <span class="code">secret</span>.
 
<source lang="text">
Username: hacluster
Password:
an-c03n01: Authorized
an-c03n02: Authorized
</source>
 
'''Do this on one node only''':
 
Now to initialize the cluster's communication and membership layer.
 
<source lang="bash">
pcs cluster setup an-cluster-03 an-c03n01 an-c03n02
</source>
</source>
<source lang="text">
<source lang="text">
an-c03n01: Succeeded
fence_alom - Fence agent for Sun ALOM
an-c03n02: Succeeded
fence_apc - Fence agent for APC over telnet/ssh
</source>
fence_apc_snmp - Fence agent for APC over SNMP
 
fence_baytech - I/O Fencing agent for Baytech RPC switches in combination with a Cyclades Terminal
This will create the corosync configuration file <span class="code">/etc/corosync/corosync.conf</span>;
                Server
 
fence_bladecenter - Fence agent for IBM BladeCenter
<source lang="bash">
fence_brocade - Fence agent for Brocade over telnet
cat /etc/corosync/corosync.conf
fence_bullpap - I/O Fencing agent for Bull FAME architecture controlled by a PAP management console.
fence_cisco_mds - Fence agent for Cisco MDS
fence_cisco_ucs - Fence agent for Cisco UCS
fence_cpint - I/O Fencing agent for GFS on s390 and zSeries VM clusters
fence_drac - fencing agent for Dell Remote Access Card
fence_drac5 - Fence agent for Dell DRAC CMC/5
fence_eaton_snmp - Fence agent for Eaton over SNMP
fence_egenera - I/O Fencing agent for the Egenera BladeFrame
fence_eps - Fence agent for ePowerSwitch
fence_hpblade - Fence agent for HP BladeSystem
fence_ibmblade - Fence agent for IBM BladeCenter over SNMP
fence_idrac - Fence agent for IPMI over LAN
fence_ifmib - Fence agent for IF MIB
fence_ilo - Fence agent for HP iLO
fence_ilo2 - Fence agent for HP iLO
fence_ilo3 - Fence agent for IPMI over LAN
fence_ilo_mp - Fence agent for HP iLO MP
fence_imm - Fence agent for IPMI over LAN
fence_intelmodular - Fence agent for Intel Modular
fence_ipdu - Fence agent for iPDU over SNMP
fence_ipmilan - Fence agent for IPMI over LAN
fence_kdump - Fence agent for use with kdump
fence_ldom - Fence agent for Sun LDOM
fence_lpar - Fence agent for IBM LPAR
fence_mcdata - I/O Fencing agent for McData FC switches
fence_rackswitch - fence_rackswitch - I/O Fencing agent for RackSaver RackSwitch
fence_rhevm - Fence agent for RHEV-M REST API
fence_rsa - Fence agent for IBM RSA
fence_rsb - I/O Fencing agent for Fujitsu-Siemens RSB
fence_sanbox2 - Fence agent for QLogic SANBox2 FC switches
fence_scsi - fence agent for SCSI-3 persistent reservations
fence_virsh - Fence agent for virsh
fence_vixel - I/O Fencing agent for Vixel FC switches
fence_vmware - Fence agent for VMWare
fence_vmware_soap - Fence agent for VMWare over SOAP API
fence_wti - Fence agent for WTI
fence_xcat - I/O Fencing agent for xcat environments
fence_xenapi - XenAPI based fencing for the Citrix XenServer virtual machines.
fence_zvm - I/O Fencing agent for GFS on s390 and zSeries VM clusters
</source>
</source>
<source lang="text">
totem {
version: 2
secauth: off
cluster_name: an-cluster-03
transport: udpu
}
nodelist {
  node {
        ring0_addr: an-c03n01
        nodeid: 1
      }
  node {
        ring0_addr: an-c03n02
        nodeid: 2
      }
}


quorum {
We will use <span class="code">fence_ipmilan</span> and <span class="code">fence_apc_snmp</span>.
provider: corosync_votequorum
}


logging {
=== Configuring IPMI Fencing ===
to_syslog: yes
}
</source>


== Start the Cluster For the First Time ==
Every fence agent has a possibly unique subset of options that can be used. You can see a brief description of these options with the <span class="code">pcs stonith describe fence_X</span> command. Let's look at the options available for <span class="code">fence_ipmilan</span>.
 
This starts the cluster communication and membership layer for the first time.
 
'''On one node only''';


<source lang="bash">
<source lang="bash">
pcs cluster start --all
pcs stonith describe fence_ipmilan
</source>
</source>
<source lang="text">
<source lang="text">
an-c03n01: Starting Cluster...
Stonith options for: fence_ipmilan
an-c03n02: Starting Cluster...
  auth: IPMI Lan Auth type (md5, password, or none)
  ipaddr: IPMI Lan IP to talk to
  passwd: Password (if required) to control power on IPMI device
  passwd_script: Script to retrieve password (if required)
  lanplus: Use Lanplus
  login: Username/Login (if required) to control power on IPMI device
  action: Operation to perform. Valid operations: on, off, reboot, status, list, diag, monitor or metadata
  timeout: Timeout (sec) for IPMI operation
  cipher: Ciphersuite to use (same as ipmitool -C parameter)
  method: Method to fence (onoff or cycle)
  power_wait: Wait X seconds after on/off operation
  delay: Wait X seconds before fencing is started
  privlvl: Privilege level on IPMI device
  verbose: Verbose mode
</source>
</source>


After a few moments, you should be able to check the status;
One of the nice things about pcs is that it allows us to create a test file to prepare all our changes in. Then, when we're happy with the changes, merge them into the running cluster. So let's make a copy called <span class="code">stonith_cfg</span>


<source lang="bash">
<source lang="bash">
pcs status
pcs cluster cib stonith_cfg
</source>
<source lang="text">
Last updated: Fri Feb 15 01:30:43 2013
Last change: Fri Feb 15 01:30:29 2013 via crmd on an-c03n01
Stack: corosync
Current DC: an-c03n01 (1) - partition with quorum
Version: 1.1.9-0.1318.a7966fb.git.fc18-a7966fb
2 Nodes configured, unknown expected votes
0 Resources configured.
 
 
Online: [ an-c03n01 an-c03n02 ]
 
Full list of resources:
</source>
</source>


The other node should show almost the identical output.
Now add fencing.
 
== Disabling Quorum ==
 
{{note|1=Show the math.}}
 
With quorum enabled, a two node cluster will lose quorum once either node fails. So we have to disable quorum.
 
By default, pacemaker uses quorum. You don't see this initially though;


<source lang="bash">
<source lang="bash">
pcs property
#  temp file                    unique name    fence agent  target node                device addr          credentials
</source>
pcs -f stonith_cfg stonith create ipmi-an-c03n01 fence_ipmilan pcmk_host_list="an-c03n01" ipaddr=an-c03n01.ipmi login=admin passwd=admin op monitor interval=60s
<source lang="text">
Cluster Properties:
dc-version: 1.1.9-0.1318.a7966fb.git.fc18-a7966fb
cluster-infrastructure: corosync
</source>
</source>


To disable it, we set <span class="code">no-quorum-policy=ignore</span>.



Revision as of 22:40, 15 February 2013

Disabling Quorum

Note: Show the math.

With quorum enabled, a two node cluster will lose quorum once either node fails. So we have to disable quorum.
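
As a rough sketch of the math, assuming the usual majority rule where quorum requires more than half of all votes and each node holds one vote, the numbers work out as follows. The function names are hypothetical and only illustrate the arithmetic.

```shell
# Votes needed for quorum under a simple majority rule:
# floor(total_votes / 2) + 1. Hypothetical helper for illustration.
quorum_needed() {
    echo $(( $1 / 2 + 1 ))
}

# Succeeds if the surviving votes still reach quorum.
has_quorum() {
    total_votes=$1 surviving_votes=$2
    [ "$surviving_votes" -ge "$(quorum_needed "$total_votes")" ]
}

# Compare what happens after a single node failure in small clusters.
for total in 2 3 5; do
    needed=$(quorum_needed "$total")
    survivors=$(( total - 1 ))
    if has_quorum "$total" "$survivors"; then
        verdict="keeps quorum"
    else
        verdict="loses quorum"
    fi
    echo "$total nodes: need $needed votes, $survivors left after one failure, $verdict"
done
```

With two nodes, two votes are needed, so the single survivor can never be quorate. Clusters of three or more nodes can lose one node and carry on.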

By default, pacemaker uses quorum. You don't see this initially though;

pcs property
Cluster Properties:
 dc-version: 1.1.9-0.1318.a7966fb.git.fc18-a7966fb
 cluster-infrastructure: corosync

To disable it, we set no-quorum-policy=ignore.

pcs property set no-quorum-policy=ignore
pcs property
Cluster Properties:
 dc-version: 1.1.9-0.1318.a7966fb.git.fc18-a7966fb
 cluster-infrastructure: corosync
 no-quorum-policy: ignore
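
If you want to check a single property from a script, the listing can be filtered. This is a small sketch that assumes the "name: value" layout shown above; the function name property_value is hypothetical and is not a pcs command. The sample text is the output shown in this section.

```shell
# Read "pcs property" style output on stdin and print the value of one
# property. property_value is a hypothetical helper for illustration.
property_value() {
    awk -F': ' -v key="$1" '
        { sub(/^[[:space:]]+/, "", $1) }    # drop the leading indent
        $1 == key { print $2 }
    '
}

# Demonstrate against the sample output from this section.
property_value no-quorum-policy <<'EOF'
Cluster Properties:
 dc-version: 1.1.9-0.1318.a7966fb.git.fc18-a7966fb
 cluster-infrastructure: corosync
 no-quorum-policy: ignore
EOF

# On a cluster node, the equivalent would be:
#   pcs property | property_value no-quorum-policy
```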

Enabling and Configuring Fencing

We will use IPMI and PDU based fence devices for redundancy.

You can see the list of available fence agents here. You will need to find the one for your hardware fence devices.

pcs stonith list
fence_alom - Fence agent for Sun ALOM
fence_apc - Fence agent for APC over telnet/ssh
fence_apc_snmp - Fence agent for APC over SNMP
fence_baytech - I/O Fencing agent for Baytech RPC switches in combination with a Cyclades Terminal
                Server
fence_bladecenter - Fence agent for IBM BladeCenter
fence_brocade - Fence agent for Brocade over telnet
fence_bullpap - I/O Fencing agent for Bull FAME architecture controlled by a PAP management console.
fence_cisco_mds - Fence agent for Cisco MDS
fence_cisco_ucs - Fence agent for Cisco UCS
fence_cpint - I/O Fencing agent for GFS on s390 and zSeries VM clusters
fence_drac - fencing agent for Dell Remote Access Card
fence_drac5 - Fence agent for Dell DRAC CMC/5
fence_eaton_snmp - Fence agent for Eaton over SNMP
fence_egenera - I/O Fencing agent for the Egenera BladeFrame
fence_eps - Fence agent for ePowerSwitch
fence_hpblade - Fence agent for HP BladeSystem
fence_ibmblade - Fence agent for IBM BladeCenter over SNMP
fence_idrac - Fence agent for IPMI over LAN
fence_ifmib - Fence agent for IF MIB
fence_ilo - Fence agent for HP iLO
fence_ilo2 - Fence agent for HP iLO
fence_ilo3 - Fence agent for IPMI over LAN
fence_ilo_mp - Fence agent for HP iLO MP
fence_imm - Fence agent for IPMI over LAN
fence_intelmodular - Fence agent for Intel Modular
fence_ipdu - Fence agent for iPDU over SNMP
fence_ipmilan - Fence agent for IPMI over LAN
fence_kdump - Fence agent for use with kdump
fence_ldom - Fence agent for Sun LDOM
fence_lpar - Fence agent for IBM LPAR
fence_mcdata - I/O Fencing agent for McData FC switches
fence_rackswitch - fence_rackswitch - I/O Fencing agent for RackSaver RackSwitch
fence_rhevm - Fence agent for RHEV-M REST API
fence_rsa - Fence agent for IBM RSA
fence_rsb - I/O Fencing agent for Fujitsu-Siemens RSB
fence_sanbox2 - Fence agent for QLogic SANBox2 FC switches
fence_scsi - fence agent for SCSI-3 persistent reservations
fence_virsh - Fence agent for virsh
fence_vixel - I/O Fencing agent for Vixel FC switches
fence_vmware - Fence agent for VMWare
fence_vmware_soap - Fence agent for VMWare over SOAP API
fence_wti - Fence agent for WTI
fence_xcat - I/O Fencing agent for xcat environments
fence_xenapi - XenAPI based fencing for the Citrix XenServer virtual machines.
fence_zvm - I/O Fencing agent for GFS on s390 and zSeries VM clusters

We will use fence_ipmilan and fence_apc_snmp.

Configuring IPMI Fencing

Every fence agent has a possibly unique subset of options that can be used. You can see a brief description of these options with the pcs stonith describe fence_X command. Let's look at the options available for fence_ipmilan.

pcs stonith describe fence_ipmilan
Stonith options for: fence_ipmilan
  auth: IPMI Lan Auth type (md5, password, or none)
  ipaddr: IPMI Lan IP to talk to
  passwd: Password (if required) to control power on IPMI device
  passwd_script: Script to retrieve password (if required)
  lanplus: Use Lanplus
  login: Username/Login (if required) to control power on IPMI device
  action: Operation to perform. Valid operations: on, off, reboot, status, list, diag, monitor or metadata
  timeout: Timeout (sec) for IPMI operation
  cipher: Ciphersuite to use (same as ipmitool -C parameter)
  method: Method to fence (onoff or cycle)
  power_wait: Wait X seconds after on/off operation
  delay: Wait X seconds before fencing is started
  privlvl: Privilege level on IPMI device
  verbose: Verbose mode

One of the nice things about pcs is that it allows us to create a test file to prepare all our changes in. Then, when we're happy with the changes, merge them into the running cluster. So let's make a copy called stonith_cfg

pcs cluster cib stonith_cfg

Now add fencing.

#   temp file                     unique name    fence agent   target node                device addr           credentials
pcs -f stonith_cfg stonith create ipmi-an-c03n01 fence_ipmilan pcmk_host_list="an-c03n01" ipaddr=an-c03n01.ipmi login=admin passwd=admin op monitor interval=60s
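
Each node needs its own IPMI fence device, and the commands differ only in the node name. As a sketch, the command lines could be generated as below. The function name ipmi_stonith_cmd is hypothetical. It assumes the device is named ipmi-<node>, that pcmk_host_list is the node's cluster name, and that the IPMI interface resolves as <node>.ipmi. The commands are only printed, so they can be reviewed before being run.

```shell
# Print the pcs command that would define an IPMI fence device for a node.
# ipmi_stonith_cmd is a hypothetical helper; nothing is executed.
ipmi_stonith_cmd() {
    node=$1 login=$2 passwd=$3
    printf '%s\n' "pcs -f stonith_cfg stonith create ipmi-$node fence_ipmilan pcmk_host_list=\"$node\" ipaddr=$node.ipmi login=$login passwd=$passwd op monitor interval=60s"
}

# One fence device per node, using the example credentials from above.
for node in an-c03n01 an-c03n02; do
    ipmi_stonith_cmd "$node" admin admin
done
```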


 
