Anvil! Tutorial 3
{{howto_header}}

{{warning|1=This tutorial is incomplete, flawed and generally sucks at this time. Do not follow this and expect anything to work. In large part, it's a dumping ground for notes and little else. This warning will be removed when the tutorial is completed.}}

{{warning|1=This tutorial is built on '''''a guess''''' of what [[Red Hat]]'s Enterprise Linux 7 will offer, based on what the author sees happening in [[Fedora]] upstream. [[Red Hat]] never confirms what a future release will contain until it is actually released. As such, this tutorial may turn out to be inappropriate for the final release of [[RHEL]] 7. In such a case, the warning above will remain in place until the tutorial is updated to reflect the final release.}}

This is the third '''AN!Cluster''' tutorial, built on [[Red Hat]]'s Enterprise Linux 7. It improves on the [[Red Hat Cluster Service 2 Tutorial|RHEL 5, RHCS stable 2]] and [[2-Node Red Hat KVM Cluster Tutorial|RHEL 6, RHCS stable 3]] tutorials.

As with the previous tutorials, the end goal of this tutorial is a 2-node cluster providing a platform for high-availability virtual servers. Its design attempts to remove all single points of failure from the system. Power and networking are made fully redundant in this version, and the design minimizes the node failures that would lead to a service interruption. This tutorial also covers the [[AN!Utilities]]; the [[AN!Cluster Dashboard]], [[AN!Cluster Monitor]] and [[AN!Safe Cluster Shutdown]].

As with the previous tutorial, [[KVM]] will be the hypervisor used for hosting virtual machines. The old <span class="code">[[cman]]</span> and <span class="code">[[rgmanager]]</span> tools are dropped in favour of <span class="code">[[pacemaker]]</span> for resource management.
= Before We Begin =

This tutorial '''does not''' require prior cluster experience, but it does expect familiarity with Linux and a low-intermediate understanding of networking. Where possible, steps are explained in detail and rationale is provided for why certain decisions are made.

'''For those with cluster experience''';

Please be careful not to skip too much. There are some major and some subtle changes from previous tutorials.
= OS Setup =

{{warning|1=I am using Fedora 18 at this point; obviously, things will change, possibly a lot, once RHEL 7 is released.}}

== Install ==

Not all of these packages are required, but most are used at one point or another in this tutorial.
<source lang="bash"> | |||
yum install bridge-utils corosync gpm man net-tools network ntp pacemaker pcs rsync syslinux vim wget | |||
</source> | |||
If you want to use your mouse at the node's terminal, run the following;

<source lang="bash">
systemctl enable gpm.service
systemctl start gpm.service
</source>
== Make the Network Configuration Static ==

We don't want [[NetworkManager]] in our cluster, as it tries to dynamically manage the network and we need our network to be static.

<source lang="bash">
yum remove NetworkManager
</source>

{{note|1=This assumes that [[systemd]] will be used in [[RHEL]] 7. This may not be the case come release day.}}

Now ensure that the static <span class="code">network</span> service starts on boot.

<source lang="bash">
systemctl enable network.service
</source>
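
If you like, you can confirm that the service is now enabled. This is a quick sanity check, assuming systemd's <span class="code">systemctl</span> is available, per the note above;

<source lang="bash">
systemctl is-enabled network.service
</source>
<source lang="text">
enabled
</source>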
== Setting the Hostname ==

Fedora 18 is '''very''' different from [[EL6]].

{{note|1=The '<span class="code">--pretty</span>' line currently doesn't work, as there is [https://bugzilla.redhat.com/show_bug.cgi?id=895299 a bug (rhbz#895299)] with single-quotes.}}

{{note|1=The '<span class="code">--static</span>' option is currently needed to prevent the '<span class="code">.</span>' from being removed. See [https://bugzilla.redhat.com/show_bug.cgi?id=896756 this bug (rhbz#896756)].}}

Use a format that works for you. For this tutorial, node names are based on the following;

* A two-letter prefix identifying the company/user (<span class="code">an</span>, for "Alteeve's Niche!")
* A sequential cluster ID number in the form of <span class="code">cXX</span> (<span class="code">c01</span> for "Cluster 01", <span class="code">c02</span> for "Cluster 02", etc.)
* A sequential node ID number in the form of <span class="code">nYY</span>

In my case, this is my third cluster and I use the company prefix <span class="code">an</span>, so my two nodes will be;

* <span class="code">an-c03n01</span> - node 1
* <span class="code">an-c03n02</span> - node 2

Folks who've read my earlier tutorials will note that this is a departure in naming. I find this method spans and scales much better. Further, it is simply required in order to use the [[AN!CDB|AN! Cluster Dashboard]].
<source lang="bash"> | |||
hostnamectl set-hostname an-c03n01.alteeve.ca --static | |||
hostnamectl set-hostname --pretty "Alteeve's Niche! - Cluster 01, Node 01" | |||
</source> | |||
If you want the new host name to take effect immediately, you can use the traditional <span class="code">hostname</span> command:

<source lang="bash">
hostname an-c03n01.alteeve.ca
</source>
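
To confirm the change, you can query <span class="code">hostnamectl</span>; it will print the static, pretty and transient host names, among other details.

<source lang="bash">
hostnamectl status
</source>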
'''Alternatively'''

If you have trouble with those commands, you can directly edit the files that contain the host names.

The host name is stored in <span class="code">/etc/hostname</span>:

<source lang="bash">
echo an-c03n01.alteeve.ca > /etc/hostname
cat /etc/hostname
</source>
<source lang="text">
an-c03n01.alteeve.ca
</source>
The "pretty" host name is stored in <span class="code">/etc/machine-info</span> as the unquoted value for the <span class="code">PRETTY_HOSTNAME</span> value. | |||
<source lang="bash"> | |||
vim /etc/machine-info | |||
</source> | |||
<source lang="text"> | |||
PRETTY_HOSTNAME=Alteeves Niche! - Cluster 01, Node 01 | |||
</source> | |||
If you can't get the <span class="code">hostname</span> command to work for some reason, you can reboot to have the system read the new values. | |||
== Optional - Video Problems ==

On my servers, [[Fedora]] 18 doesn't detect or use the video card properly. To resolve this, I need to add <span class="code">nomodeset</span> to the kernel line when installing, and again after the install is complete.

Once installed, edit <span class="code">/etc/default/grub</span> and append <span class="code">nomodeset</span> to the end of the <span class="code">GRUB_CMDLINE_LINUX</span> variable.

<source lang="bash">
vim /etc/default/grub
</source>
<source lang="bash"> | |||
GRUB_TIMEOUT=5 | |||
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)" | |||
GRUB_DEFAULT=saved | |||
GRUB_CMDLINE_LINUX="nomodeset rd.md=0 rd.lvm=0 rd.dm=0 $([ -x /usr/sbin/rhcrashkernel-param ] && /usr/sbin/rhcrashkernel-param || :) rd.luks=0 vconsole.keymap=us nomodeset" | |||
GRUB_DISABLE_RECOVERY="true" | |||
GRUB_THEME="/boot/grub2/themes/system/theme.txt" | |||
</source> | |||
Save that, and then rewrite the [[grub2]] configuration file.

<source lang="bash">
grub2-mkconfig -o /boot/grub2/grub.cfg
</source>
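
If you want to confirm that the argument made it into the rebuilt configuration, a quick <span class="code">grep</span> will do it; you should see <span class="code">nomodeset</span> on the kernel boot line(s).

<source lang="bash">
grep nomodeset /boot/grub2/grub.cfg
</source>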
Next time you reboot, you should get a stock 80x25 character display. It's not much, but it will work on esoteric video cards or weird monitors.
== What Security? ==

This section will be re-added at the end. For now;

<source lang="bash">
setenforce 0
sed -i 's/SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
systemctl disable firewalld.service
systemctl stop firewalld.service
</source>
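
To verify; <span class="code">getenforce</span> should report <span class="code">Permissive</span> (it will only report <span class="code">Disabled</span> after a reboot), and <span class="code">firewalld</span> should show as inactive.

<source lang="bash">
getenforce
systemctl status firewalld.service
</source>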
== Network ==

We want static, named network devices. Follow this;

* [[Changing Ethernet Device Names in EL7 and Fedora 15+]]

Then, use these configuration files;

Build the bridge;

<source lang="bash">
vim /etc/sysconfig/network-scripts/ifcfg-ifn-vbr1
</source>
<source lang="bash">
# Internet-Facing Network - Bridge
DEVICE="ifn-vbr1"
TYPE="Bridge"
BOOTPROTO="none"
IPADDR="10.255.10.1"
NETMASK="255.255.0.0"
GATEWAY="10.255.255.254"
DNS1="8.8.8.8"
DNS2="8.8.4.4"
DEFROUTE="yes"
</source>
Now build the bonds;

<source lang="bash">
vim /etc/sysconfig/network-scripts/ifcfg-ifn-bond1
</source>
<source lang="bash">
# Internet-Facing Network - Bond
DEVICE="ifn-bond1"
BRIDGE="ifn-vbr1"
BOOTPROTO="none"
NM_CONTROLLED="no"
ONBOOT="yes"
BONDING_OPTS="mode=1 miimon=100 use_carrier=1 updelay=120000 downdelay=0 primary=ifn1"
</source>
<source lang="bash"> | |||
vim /etc/sysconfig/network-scripts/ifcfg-sn-bond1 | |||
</source> | |||
<source lang="bash"> | |||
# Storage Network - Bond | |||
DEVICE="sn-bond1" | |||
BOOTPROTO="none" | |||
NM_CONTROLLED="no" | |||
ONBOOT="yes" | |||
BONDING_OPTS="mode=1 miimon=100 use_carrier=1 updelay=120000 downdelay=0 primary=sn1" | |||
IPADDR="10.10.10.1" | |||
NETMASK="255.255.0.0" | |||
</source> | |||
<source lang="bash"> | |||
vim /etc/sysconfig/network-scripts/ifcfg-bcn-bond1 | |||
</source> | |||
<source lang="bash"> | |||
# Back-Channel Network - Bond | |||
DEVICE="bcn-bond1" | |||
BOOTPROTO="none" | |||
NM_CONTROLLED="no" | |||
ONBOOT="yes" | |||
BONDING_OPTS="mode=1 miimon=100 use_carrier=1 updelay=120000 downdelay=0 primary=bcn1" | |||
IPADDR="10.20.10.1" | |||
NETMASK="255.255.0.0" | |||
</source> | |||
Now tell the interfaces to be slaves to their bonds;

Internet-Facing Network;

<source lang="bash">
vim /etc/sysconfig/network-scripts/ifcfg-ifn1
</source>
<source lang="bash">
# Internet-Facing Network - Link 1
DEVICE="ifn1"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
SLAVE="yes"
MASTER="ifn-bond1"
</source>
<source lang="bash"> | |||
vim /etc/sysconfig/network-scripts/ifcfg-ifn2 | |||
</source> | |||
<source lang="bash"> | |||
# Back-Channel Network - Link 2 | |||
DEVICE="ifn2" | |||
NM_CONTROLLED="no" | |||
BOOTPROTO="none" | |||
ONBOOT="yes" | |||
SLAVE="yes" | |||
MASTER="ifn-bond1" | |||
</source> | |||
Storage Network;

<source lang="bash">
vim /etc/sysconfig/network-scripts/ifcfg-sn1
</source>
<source lang="bash">
# Storage Network - Link 1
DEVICE="sn1"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
SLAVE="yes"
MASTER="sn-bond1"
</source>
<source lang="bash"> | |||
vim /etc/sysconfig/network-scripts/ifcfg-sn2 | |||
</source> | |||
<source lang="bash"> | |||
# Storage Network - Link 1 | |||
DEVICE="sn2" | |||
NM_CONTROLLED="no" | |||
BOOTPROTO="none" | |||
ONBOOT="yes" | |||
SLAVE="yes" | |||
MASTER="sn-bond1" | |||
</source> | |||
Back-Channel Network;

<source lang="bash">
vim /etc/sysconfig/network-scripts/ifcfg-bcn1
</source>
<source lang="bash">
# Back-Channel Network - Link 1
DEVICE="bcn1"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
SLAVE="yes"
MASTER="bcn-bond1"
</source>
<source lang="bash"> | |||
vim /etc/sysconfig/network-scripts/ifcfg-bcn2 | |||
</source> | |||
<source lang="bash"> | |||
# Storage Network - Link 1 | |||
DEVICE="bcn2" | |||
NM_CONTROLLED="no" | |||
BOOTPROTO="none" | |||
ONBOOT="yes" | |||
SLAVE="yes" | |||
MASTER="bcn-bond1" | |||
</source> | |||
Now restart the network, confirm that the bonds and bridge are up, and you are ready to proceed.
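
As a hedged sketch of that check; <span class="code">/proc/net/bonding/</span> exposes the state of each bond, and <span class="code">brctl</span> (from the <span class="code">bridge-utils</span> package installed earlier) shows the bridge and its attached interfaces.

<source lang="bash">
systemctl restart network.service
cat /proc/net/bonding/ifn-bond1
brctl show ifn-vbr1
</source>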
== Setup The hosts File ==

You can use [[DNS]] if you prefer. For now, let's use <span class="code">/etc/hosts</span> for node name resolution.

<source lang="bash">
vim /etc/hosts
</source>
<source lang="text"> | |||
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4 | |||
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6 | |||
# AN!Cluster 01, Node 01 | |||
10.255.10.1 an-c01n01.ifn | |||
10.10.10.1 an-c01n01.sn | |||
10.20.10.1 an-c01n01.bcn an-c01n01 an-c01n01.alteeve.ca | |||
10.20.11.1 an-c01n01.ipmi | |||
# AN!Cluster 01, Node 02 | |||
10.255.10.2 an-c01n02.ifn | |||
10.10.10.2 an-c01n02.sn | |||
10.20.10.2 an-c01n02.bcn an-c01n02 an-c01n02.alteeve.ca | |||
10.20.11.2 an-c01n02.ipmi | |||
# Foundation Pack | |||
10.20.2.7 an-p03 an-p03.alteeve.ca | |||
</source> | |||
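
The <span class="code">syslinux</span> package installed earlier provides <span class="code">gethostip</span>, which is a handy way of confirming that a given name resolves to the address you expect;

<source lang="bash">
gethostip -d an-c03n01.bcn
</source>
<source lang="text">
10.20.10.1
</source>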
== Setup SSH ==

Same as [[2-Node_Red_Hat_KVM_Cluster_Tutorial#Setting_up_SSH|before]].
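
As a reminder-only sketch (the linked section is the authoritative walk-through); generate a key pair on each node, then copy each node's public key into the other node's <span class="code">authorized_keys</span>. The key size below is illustrative, and <span class="code">ssh-copy-id</span> is simply a convenient alternative to appending the key by hand.

<source lang="bash">
ssh-keygen -t rsa -N "" -b 4096 -f ~/.ssh/id_rsa
ssh-copy-id -i ~/.ssh/id_rsa.pub root@an-c03n02
</source>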
== Populating And Pushing ~/.ssh/known_hosts ==

Same as [[2-Node_Red_Hat_KVM_Cluster_Tutorial#Populating_And_Pushing_~/.ssh/known_hosts|before]].

<source lang="bash">
ssh root@an-c03n01.alteeve.ca
</source>
<source lang="text">
The authenticity of host 'an-c03n01.alteeve.ca (10.20.30.1)' can't be established.
RSA key fingerprint is 7b:dd:0d:aa:c5:f5:9e:a6:b6:4d:40:69:d6:80:4d:09.
Are you sure you want to continue connecting (yes/no)?
</source>
Type <span class="code">yes</span>.

<source lang="text">
Are you sure you want to continue connecting (yes/no)? yes
</source>
<source lang="text">
Warning: Permanently added 'an-c03n01.alteeve.ca,10.20.30.1' (RSA) to the list of known hosts.
Last login: Thu Feb 14 15:18:33 2013 from 10.20.5.100
</source>
You will now be logged into the <span class="code">an-c03n01</span> node which, in this case, is the same machine on a new session in the same terminal.

<source lang="text">
[root@an-c03n01 ~]#
</source>

You can log out by typing <span class="code">exit</span>.

<source lang="bash">
exit
</source>
<source lang="text">
logout
Connection to an-c03n01.alteeve.ca closed.
</source>
Now we have to repeat the steps for all of the other variations on the host names. This is annoying and tedious; sorry.

<source lang="bash">
ssh root@an-c03n01
ssh root@an-c03n01.bcn
ssh root@an-c03n01.sn
ssh root@an-c03n01.ifn
ssh root@an-c03n02.alteeve.ca
ssh root@an-c03n02
ssh root@an-c03n02.bcn
ssh root@an-c03n02.sn
ssh root@an-c03n02.ifn
</source>
Your <span class="code">~/.ssh/known_hosts</span> file will now be populated with both nodes' ssh fingerprints. Copy it over to the second node to save all that typing a second time.

<source lang="bash">
rsync -av ~/.ssh/known_hosts root@an-c03n02:/root/.ssh/
</source>
== Keeping Time in Sync ==

It's not as critical as it used to be to keep the clocks on the nodes in sync, but it's still a good idea.

<source lang="bash">
systemctl start ntpd.service
systemctl enable ntpd.service
</source>
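
Once <span class="code">ntpd</span> has been running for a few minutes, you can confirm it is actually talking to its time servers; an asterisk beside an entry marks the peer currently selected for synchronization.

<source lang="bash">
ntpq -p
</source>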
= Configuring the Cluster =

Now we're getting down to business!

For this section, we will be working on <span class="code">an-c03n01</span> and using [[ssh]] to perform tasks on <span class="code">an-c03n02</span>.

{{note|1=TODO: explain what this is and how it works.}}
== Enable the pcs Daemon ==

{{note|1=Most of this section comes more or less verbatim from the main [http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Clusters_from_Scratch/index.html Clusters from Scratch] tutorial.}}

We will use [[pcs]], the Pacemaker Configuration System, to configure our cluster.

<source lang="bash">
systemctl start pcsd.service
systemctl enable pcsd.service
</source>
<source lang="text"> | |||
ln -s '/usr/lib/systemd/system/pcsd.service' '/etc/systemd/system/multi-user.target.wants/pcsd.service' | |||
</source> | |||
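
Remember that <span class="code">pcsd</span> needs to be running on both nodes. If you are working from <span class="code">an-c03n01</span>, a quick way to handle the peer, assuming the ssh setup above, is;

<source lang="bash">
ssh root@an-c03n02 "systemctl start pcsd.service && systemctl enable pcsd.service"
</source>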
Now we need to set a password for the <span class="code">hacluster</span> user. This is the account used by <span class="code">pcs</span> on one node to talk to the <span class="code">pcs</span> [[daemon]] on the other node. For this tutorial, we will use the password <span class="code">secret</span>. You will want to use [https://xkcd.com/936/ a stronger password], of course.

<source lang="bash">
echo secret | passwd --stdin hacluster
</source>
<source lang="text">
Changing password for user hacluster.
passwd: all authentication tokens updated successfully.
</source>
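
The same password needs to be set on the other node as well; again, ssh saves a trip to the other terminal;

<source lang="bash">
ssh root@an-c03n02 "echo secret | passwd --stdin hacluster"
</source>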
== Initializing the Cluster ==

One of the biggest reasons we're using the [[pcs]] tool, over something like [[crm]], is that it has been written to simplify the setup of clusters on [[Red Hat]]-style operating systems. It will configure [[corosync]] automatically.

First, authenticate against the cluster nodes.

<source lang="bash">
pcs cluster auth an-c03n01 an-c03n02
</source>
This will ask you for the user name and password. The default user name is <span class="code">hacluster</span> and we set the password to <span class="code">secret</span>.

<source lang="text">
Username: hacluster
Password: 
an-c03n01: Authorized
an-c03n02: Authorized
</source>
'''Do this on one node only''':

Now, initialize the cluster's communication and membership layer.

<source lang="bash">
pcs cluster setup an-cluster-03 an-c03n01 an-c03n02
</source>
<source lang="text">
an-c03n01: Succeeded
an-c03n02: Succeeded
</source>
This will create the corosync configuration file <span class="code">/etc/corosync/corosync.conf</span>;

<source lang="bash">
cat /etc/corosync/corosync.conf
</source>
<source lang="text"> | <source lang="text"> | ||
totem { | |||
version: 2 | |||
secauth: off | |||
cluster_name: an-cluster-03 | |||
transport: udpu | |||
} | |||
nodelist { | |||
node { | |||
ring0_addr: an-c03n01 | |||
nodeid: 1 | |||
} | |||
node { | |||
ring0_addr: an-c03n02 | |||
nodeid: 2 | |||
} | |||
} | |||
quorum { | |||
provider: corosync_votequorum | |||
} | |||
logging { | |||
to_syslog: yes | |||
} | |||
</source> | </source> | ||
== Start the Cluster For the First Time ==

This starts the cluster communication and membership layer for the first time.

'''On one node only''';

<source lang="bash">
pcs cluster start --all
</source>
<source lang="text"> | <source lang="text"> | ||
an-c03n01: Starting Cluster... | |||
an-c03n02: Starting Cluster... | |||
</source> | </source> | ||
After a few moments, you should be able to check the status;

<source lang="bash">
pcs status
</source>
<source lang="text"> | |||
Last updated: Fri Feb 15 01:30:43 2013 | |||
Last change: Fri Feb 15 01:30:29 2013 via crmd on an-c03n01 | |||
Stack: corosync | |||
Current DC: an-c03n01 (1) - partition with quorum | |||
Version: 1.1.9-0.1318.a7966fb.git.fc18-a7966fb | |||
2 Nodes configured, unknown expected votes | |||
0 Resources configured. | |||
Online: [ an-c03n01 an-c03n02 ] | |||
Full list of resources: | |||
</source> | </source> | ||
The other node should show nearly identical output.
== Disabling Quorum ==

{{note|1=The math, in brief; quorum requires a strict majority of votes, <span class="code">floor(n/2) + 1</span>. With <span class="code">n = 2</span> nodes, that is <span class="code">2</span> votes, and a lone surviving node only ever has <span class="code">1</span>.}}

With quorum enabled, a two-node cluster will lose quorum once either node fails. So we have to disable quorum.

By default, pacemaker uses quorum. You don't see this initially, though;
<source lang="bash"> | <source lang="bash"> | ||
pcs property | |||
</source> | |||
<source lang="text"> | |||
Cluster Properties: | |||
dc-version: 1.1.9-0.1318.a7966fb.git.fc18-a7966fb | |||
cluster-infrastructure: corosync | |||
</source> | </source> | ||
To disable it, we set <span class="code">no-quorum-policy=ignore</span>.
<span class="code"></span> | <span class="code"></span> |