Anvil! Tutorial 3
{{howto_header}}

{{warning|1=This tutorial is incomplete, flawed and generally sucks at this time. Do not follow this and expect anything to work. In large part, it's a dumping ground for notes and little else. This warning will be removed when the tutorial is completed.}}

{{warning|1=This tutorial is built on '''''a guess''''' of what [[Red Hat]]'s Enterprise Linux 7 will offer, based on what the author sees happening in [[Fedora]] upstream. [[Red Hat]] never confirms what a future release will contain until it is actually released. As such, this tutorial may turn out to be inappropriate for the final release of [[RHEL]] 7. In such a case, the warning above will remain in place until the tutorial is updated to reflect the final release.}}

This is the third '''AN!Cluster''' tutorial, built on [[Red Hat]]'s Enterprise Linux 7. It improves on the [[Red Hat Cluster Service 2 Tutorial|RHEL 5, RHCS stable 2]] and [[2-Node Red Hat KVM Cluster Tutorial|RHEL 6, RHCS stable 3]] tutorials.

As with the previous tutorials, the end goal of this tutorial is a 2-node cluster providing a platform for high-availability virtual servers. Its design attempts to remove all single points of failure from the system. Power and networking are made fully redundant in this version, and the design minimizes the node failures that would lead to a service interruption. This tutorial also covers the [[AN!Utilities]]; the [[AN!Cluster Dashboard]], [[AN!Cluster Monitor]] and [[AN!Safe Cluster Shutdown]].

As with the previous tutorial, [[KVM]] will be the hypervisor used for hosting virtual machines. The old <span class="code">[[cman]]</span> and <span class="code">[[rgmanager]]</span> tools are dropped in favour of <span class="code">[[pacemaker]]</span> for resource management.
= Before We Begin =

This tutorial '''does not''' require prior cluster experience, but it does expect familiarity with Linux and a low-intermediate understanding of networking. Where possible, steps are explained in detail and rationale is provided for why certain decisions are made.

'''For those with cluster experience''';

Please be careful not to skip too much. There are some major and some subtle changes from previous tutorials.
= OS Setup =

{{warning|1=I am using Fedora 18 at this point; obviously, things will change, possibly a lot, once RHEL 7 is released.}}

== Install ==

Not all of these packages are required, but most are used at one point or another in this tutorial.
<source lang="bash"> | |||
yum install bridge-utils corosync gpm man net-tools network ntp pacemaker pcs rsync syslinux vim wget | |||
</source> | |||
If you want to use your mouse at the node's terminal, run the following;

<source lang="bash">
systemctl enable gpm.service
systemctl start gpm.service
</source>
== Make the Network Configuration Static ==

We don't want [[NetworkManager]] in our cluster, as it tries to dynamically manage the network and we need our network to be static.

<source lang="bash">
yum remove NetworkManager
</source>

{{note|1=This assumes that [[systemd]] will be used in [[RHEL]] 7. This may not be the case come release day.}}

Now ensure that the static <span class="code">network</span> service starts on boot.

<source lang="bash">
systemctl enable network.service
</source>
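
If you like, you can confirm that the service is now enabled. This is a quick sanity check, assuming systemd's <span class="code">systemctl</span> is available, per the note above;

<source lang="bash">
systemctl is-enabled network.service
</source>
<source lang="text">
enabled
</source>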
== Setting the Hostname ==

Fedora 18 is '''very''' different from [[EL6]].

{{note|1=The '<span class="code">--pretty</span>' line currently doesn't work, as there is [https://bugzilla.redhat.com/show_bug.cgi?id=895299 a bug (rhbz#895299)] with single-quotes.}}

{{note|1=The '<span class="code">--static</span>' option is currently needed to prevent the '<span class="code">.</span>' from being removed. See [https://bugzilla.redhat.com/show_bug.cgi?id=896756 this bug (rhbz#896756)].}}

Use a format that works for you. For this tutorial, node names are based on the following;

* A two-letter prefix identifying the company/user (<span class="code">an</span>, for "Alteeve's Niche!")
* A sequential cluster ID number in the form of <span class="code">cXX</span> (<span class="code">c01</span> for "Cluster 01", <span class="code">c02</span> for "Cluster 02", etc.)
* A sequential node ID number in the form of <span class="code">nYY</span>

In my case, this is my third cluster and I use the company prefix <span class="code">an</span>, so my two nodes will be;

* <span class="code">an-c03n01</span> - node 1
* <span class="code">an-c03n02</span> - node 2

Folks who've read my earlier tutorials will note that this is a departure in naming. I find this method spans and scales much better. Further, it is simply required in order to use the [[AN!CDB|AN! Cluster Dashboard]].
<source lang="bash"> | |||
hostnamectl set-hostname an-c03n01.alteeve.ca --static | |||
hostnamectl set-hostname --pretty "Alteeve's Niche! - Cluster 01, Node 01" | |||
</source> | |||
If you want the new host name to take effect immediately, you can use the traditional <span class="code">hostname</span> command:

<source lang="bash">
hostname an-c03n01.alteeve.ca
</source>
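
To confirm the change, you can query <span class="code">hostnamectl</span>; it will print the static, pretty and transient host names, among other details.

<source lang="bash">
hostnamectl status
</source>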
'''Alternatively'''

If you have trouble with those commands, you can directly edit the files that contain the host names.

The host name is stored in <span class="code">/etc/hostname</span>:

<source lang="bash">
echo an-c03n01.alteeve.ca > /etc/hostname
cat /etc/hostname
</source>
<source lang="text">
an-c03n01.alteeve.ca
</source>
The "pretty" host name is stored in <span class="code">/etc/machine-info</span> as the unquoted value for the <span class="code">PRETTY_HOSTNAME</span> value. | |||
<source lang="bash"> | |||
vim /etc/machine-info | |||
</source> | |||
<source lang="text"> | |||
PRETTY_HOSTNAME=Alteeves Niche! - Cluster 01, Node 01 | |||
</source> | |||
If you can't get the <span class="code">hostname</span> command to work for some reason, you can reboot to have the system read the new values. | |||
== Optional - Video Problems ==

On my servers, [[Fedora]] 18 doesn't detect or use the video card properly. To resolve this, I need to add <span class="code">nomodeset</span> to the kernel line when installing, and again after the install is complete.

Once installed, edit <span class="code">/etc/default/grub</span> and append <span class="code">nomodeset</span> to the end of the <span class="code">GRUB_CMDLINE_LINUX</span> variable.

<source lang="bash">
vim /etc/default/grub
</source>
<source lang="bash"> | |||
GRUB_TIMEOUT=5 | |||
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)" | |||
GRUB_DEFAULT=saved | |||
GRUB_CMDLINE_LINUX="nomodeset rd.md=0 rd.lvm=0 rd.dm=0 $([ -x /usr/sbin/rhcrashkernel-param ] && /usr/sbin/rhcrashkernel-param || :) rd.luks=0 vconsole.keymap=us nomodeset" | |||
GRUB_DISABLE_RECOVERY="true" | |||
GRUB_THEME="/boot/grub2/themes/system/theme.txt" | |||
</source> | |||
Save that, and then rewrite the [[grub2]] configuration file.

<source lang="bash">
grub2-mkconfig -o /boot/grub2/grub.cfg
</source>
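
If you want to confirm that the argument made it into the rebuilt configuration, a quick <span class="code">grep</span> will do it; you should see <span class="code">nomodeset</span> on the kernel boot line(s).

<source lang="bash">
grep nomodeset /boot/grub2/grub.cfg
</source>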
Next time you reboot, you should get a stock 80x25 character display. It's not much, but it will work on esoteric video cards or weird monitors.
== What Security? ==

This section will be re-added at the end. For now;

<source lang="bash">
setenforce 0
sed -i 's/SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
systemctl disable firewalld.service
systemctl stop firewalld.service
</source>
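
To verify; <span class="code">getenforce</span> should report <span class="code">Permissive</span> (it will only report <span class="code">Disabled</span> after a reboot), and <span class="code">firewalld</span> should show as inactive.

<source lang="bash">
getenforce
systemctl status firewalld.service
</source>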
== Network ==

We want static, named network devices. Follow this;

* [[Changing Ethernet Device Names in EL7 and Fedora 15+]]

Then, use these configuration files;

Build the bridge;

<source lang="bash">
vim /etc/sysconfig/network-scripts/ifcfg-ifn-vbr1
</source>
<source lang="bash">
# Internet-Facing Network - Bridge
DEVICE="ifn-vbr1"
TYPE="Bridge"
BOOTPROTO="none"
IPADDR="10.255.10.1"
NETMASK="255.255.0.0"
GATEWAY="10.255.255.254"
DNS1="8.8.8.8"
DNS2="8.8.4.4"
DEFROUTE="yes"
</source>
Now build the bonds;

<source lang="bash">
vim /etc/sysconfig/network-scripts/ifcfg-ifn-bond1
</source>
<source lang="bash">
# Internet-Facing Network - Bond
DEVICE="ifn-bond1"
BRIDGE="ifn-vbr1"
BOOTPROTO="none"
NM_CONTROLLED="no"
ONBOOT="yes"
BONDING_OPTS="mode=1 miimon=100 use_carrier=1 updelay=120000 downdelay=0 primary=ifn1"
</source>
<source lang="bash"> | |||
vim /etc/sysconfig/network-scripts/ifcfg-sn-bond1 | |||
</source> | |||
<source lang="bash"> | |||
# Storage Network - Bond | |||
DEVICE="sn-bond1" | |||
BOOTPROTO="none" | |||
NM_CONTROLLED="no" | |||
ONBOOT="yes" | |||
BONDING_OPTS="mode=1 miimon=100 use_carrier=1 updelay=120000 downdelay=0 primary=sn1" | |||
IPADDR="10.10.10.1" | |||
NETMASK="255.255.0.0" | |||
</source> | |||
<source lang="bash"> | |||
vim /etc/sysconfig/network-scripts/ifcfg-bcn-bond1 | |||
</source> | |||
<source lang="bash"> | |||
# Back-Channel Network - Bond | |||
DEVICE="bcn-bond1" | |||
BOOTPROTO="none" | |||
NM_CONTROLLED="no" | |||
ONBOOT="yes" | |||
BONDING_OPTS="mode=1 miimon=100 use_carrier=1 updelay=120000 downdelay=0 primary=bcn1" | |||
IPADDR="10.20.10.1" | |||
NETMASK="255.255.0.0" | |||
</source> | |||
Now tell the interfaces to be slaves to their bonds;

Internet-Facing Network;

<source lang="bash">
vim /etc/sysconfig/network-scripts/ifcfg-ifn1
</source>
<source lang="bash">
# Internet-Facing Network - Link 1
DEVICE="ifn1"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
SLAVE="yes"
MASTER="ifn-bond1"
</source>
<source lang="bash"> | |||
vim /etc/sysconfig/network-scripts/ifcfg-ifn2 | |||
</source> | |||
<source lang="bash"> | |||
# Back-Channel Network - Link 2 | |||
DEVICE="ifn2" | |||
NM_CONTROLLED="no" | |||
BOOTPROTO="none" | |||
ONBOOT="yes" | |||
SLAVE="yes" | |||
MASTER="ifn-bond1" | |||
</source> | |||
Storage Network;

<source lang="bash">
vim /etc/sysconfig/network-scripts/ifcfg-sn1
</source>
<source lang="bash">
# Storage Network - Link 1
DEVICE="sn1"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
SLAVE="yes"
MASTER="sn-bond1"
</source>
<source lang="bash"> | |||
vim /etc/sysconfig/network-scripts/ifcfg-sn2 | |||
</source> | |||
<source lang="bash"> | |||
# Storage Network - Link 1 | |||
DEVICE="sn2" | |||
NM_CONTROLLED="no" | |||
BOOTPROTO="none" | |||
ONBOOT="yes" | |||
SLAVE="yes" | |||
MASTER="sn-bond1" | |||
</source> | |||
Back-Channel Network;

<source lang="bash">
vim /etc/sysconfig/network-scripts/ifcfg-bcn1
</source>
<source lang="bash">
# Back-Channel Network - Link 1
DEVICE="bcn1"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
SLAVE="yes"
MASTER="bcn-bond1"
</source>
<source lang="bash"> | |||
vim /etc/sysconfig/network-scripts/ifcfg-bcn2 | |||
</source> | |||
<source lang="bash"> | |||
# Storage Network - Link 1 | |||
DEVICE="bcn2" | |||
NM_CONTROLLED="no" | |||
BOOTPROTO="none" | |||
ONBOOT="yes" | |||
SLAVE="yes" | |||
MASTER="bcn-bond1" | |||
</source> | |||
Now restart the network, confirm that the bonds and bridge are up, and you are ready to proceed.
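
As a hedged sketch of that check; <span class="code">/proc/net/bonding/</span> exposes the state of each bond, and <span class="code">brctl</span> (from the <span class="code">bridge-utils</span> package installed earlier) shows the bridge and its attached interfaces.

<source lang="bash">
systemctl restart network.service
cat /proc/net/bonding/ifn-bond1
brctl show ifn-vbr1
</source>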
== Setup The hosts File ==

You can use [[DNS]] if you prefer. For now, let's use <span class="code">/etc/hosts</span> for node name resolution.

<source lang="bash">
vim /etc/hosts
</source>
<source lang="text"> | |||
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4 | |||
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6 | |||
# AN!Cluster 01, Node 01 | |||
10.255.10.1 an-c01n01.ifn | |||
10.10.10.1 an-c01n01.sn | |||
10.20.10.1 an-c01n01.bcn an-c01n01 an-c01n01.alteeve.ca | |||
10.20.11.1 an-c01n01.ipmi | |||
# AN!Cluster 01, Node 02 | |||
10.255.10.2 an-c01n02.ifn | |||
10.10.10.2 an-c01n02.sn | |||
10.20.10.2 an-c01n02.bcn an-c01n02 an-c01n02.alteeve.ca | |||
10.20.11.2 an-c01n02.ipmi | |||
# Foundation Pack | |||
10.20.2.7 an-p03 an-p03.alteeve.ca | |||
</source> | |||
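
The <span class="code">syslinux</span> package installed earlier provides <span class="code">gethostip</span>, which is a handy way of confirming that a given name resolves to the address you expect;

<source lang="bash">
gethostip -d an-c03n01.bcn
</source>
<source lang="text">
10.20.10.1
</source>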
== Setup SSH ==

Same as [[2-Node_Red_Hat_KVM_Cluster_Tutorial#Setting_up_SSH|before]].
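
As a reminder-only sketch (the linked section is the authoritative walk-through); generate a key pair on each node, then copy each node's public key into the other node's <span class="code">authorized_keys</span>. The key size below is illustrative, and <span class="code">ssh-copy-id</span> is simply a convenient alternative to appending the key by hand.

<source lang="bash">
ssh-keygen -t rsa -N "" -b 4096 -f ~/.ssh/id_rsa
ssh-copy-id -i ~/.ssh/id_rsa.pub root@an-c03n02
</source>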
== Populating And Pushing ~/.ssh/known_hosts ==

Same as [[2-Node_Red_Hat_KVM_Cluster_Tutorial#Populating_And_Pushing_~/.ssh/known_hosts|before]].

<source lang="bash">
ssh root@an-c03n01.alteeve.ca
</source>
<source lang="text">
The authenticity of host 'an-c03n01.alteeve.ca (10.20.30.1)' can't be established.
RSA key fingerprint is 7b:dd:0d:aa:c5:f5:9e:a6:b6:4d:40:69:d6:80:4d:09.
Are you sure you want to continue connecting (yes/no)?
</source>
Type <span class="code">yes</span>.

<source lang="text">
Are you sure you want to continue connecting (yes/no)? yes
</source>
<source lang="text">
Warning: Permanently added 'an-c03n01.alteeve.ca,10.20.30.1' (RSA) to the list of known hosts.
Last login: Thu Feb 14 15:18:33 2013 from 10.20.5.100
</source>
You will now be logged into the <span class="code">an-c03n01</span> node which, in this case, is the same machine on a new session in the same terminal.

<source lang="text">
[root@an-c03n01 ~]#
</source>

You can log out by typing <span class="code">exit</span>.

<source lang="bash">
exit
</source>
<source lang="text">
logout
Connection to an-c03n01.alteeve.ca closed.
</source>
Now we have to repeat the steps for all of the other variations on the host names. This is annoying and tedious; sorry.

<source lang="bash">
ssh root@an-c03n01
ssh root@an-c03n01.bcn
ssh root@an-c03n01.sn
ssh root@an-c03n01.ifn
ssh root@an-c03n02.alteeve.ca
ssh root@an-c03n02
ssh root@an-c03n02.bcn
ssh root@an-c03n02.sn
ssh root@an-c03n02.ifn
</source>
Your <span class="code">~/.ssh/known_hosts</span> file will now be populated with both nodes' ssh fingerprints. Copy it over to the second node to save all that typing a second time.

<source lang="bash">
rsync -av ~/.ssh/known_hosts root@an-c03n02:/root/.ssh/
</source>
== Keeping Time in Sync ==

It's not as critical as it used to be to keep the clocks on the nodes in sync, but it's still a good idea.

<source lang="bash">
systemctl start ntpd.service
systemctl enable ntpd.service
</source>
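
Once <span class="code">ntpd</span> has been running for a few minutes, you can confirm it is actually talking to its time servers; an asterisk beside an entry marks the peer currently selected for synchronization.

<source lang="bash">
ntpq -p
</source>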
= Configuring the Cluster =

Now we're getting down to business!

For this section, we will be working on <span class="code">an-c03n01</span> and using [[ssh]] to perform tasks on <span class="code">an-c03n02</span>.

{{note|1=TODO: explain what this is and how it works.}}
== Enable the pcs Daemon ==

{{note|1=Most of this section comes more or less verbatim from the main [http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Clusters_from_Scratch/index.html Clusters from Scratch] tutorial.}}

We will use [[pcs]], the Pacemaker Configuration System, to configure our cluster.

<source lang="bash">
systemctl start pcsd.service
systemctl enable pcsd.service
</source>
<source lang="text"> | |||
ln -s '/usr/lib/systemd/system/pcsd.service' '/etc/systemd/system/multi-user.target.wants/pcsd.service' | |||
</source> | |||
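
Remember that <span class="code">pcsd</span> needs to be running on both nodes. If you are working from <span class="code">an-c03n01</span>, a quick way to handle the peer, assuming the ssh setup above, is;

<source lang="bash">
ssh root@an-c03n02 "systemctl start pcsd.service && systemctl enable pcsd.service"
</source>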
Now we need to set a password for the <span class="code">hacluster</span> user. This is the account used by <span class="code">pcs</span> on one node to talk to the <span class="code">pcs</span> [[daemon]] on the other node. For this tutorial, we will use the password <span class="code">secret</span>. You will want to use [https://xkcd.com/936/ a stronger password], of course.

<source lang="bash">
echo secret | passwd --stdin hacluster
</source>
<source lang="text">
Changing password for user hacluster.
passwd: all authentication tokens updated successfully.
</source>
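
The same password needs to be set on the other node as well; again, ssh saves a trip to the other terminal;

<source lang="bash">
ssh root@an-c03n02 "echo secret | passwd --stdin hacluster"
</source>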
== Initializing the Cluster ==

One of the biggest reasons we're using the [[pcs]] tool, over something like [[crm]], is that it has been written to simplify the setup of clusters on [[Red Hat]]-style operating systems. It will configure [[corosync]] automatically.

First, authenticate against the cluster nodes.

<source lang="bash">
pcs cluster auth an-c03n01 an-c03n02
</source>
This will ask you for the user name and password. The default user name is <span class="code">hacluster</span> and we set the password to <span class="code">secret</span>.

<source lang="text">
Username: hacluster
Password: 
an-c03n01: Authorized
an-c03n02: Authorized
</source>
'''Do this on one node only''':

Now, initialize the cluster's communication and membership layer.

<source lang="bash">
pcs cluster setup an-cluster-03 an-c03n01 an-c03n02
</source>
<source lang="text">
an-c03n01: Succeeded
an-c03n02: Succeeded
</source>
This will create the corosync configuration file <span class="code">/etc/corosync/corosync.conf</span>;

<source lang="bash">
cat /etc/corosync/corosync.conf
</source>
<source lang="text"> | <source lang="text"> | ||
totem { | |||
version: 2 | |||
secauth: off | |||
cluster_name: an-cluster-03 | |||
transport: udpu | |||
} | |||
nodelist { | |||
node { | |||
ring0_addr: an-c03n01 | |||
nodeid: 1 | |||
} | |||
node { | |||
ring0_addr: an-c03n02 | |||
nodeid: 2 | |||
} | |||
} | |||
quorum { | |||
provider: corosync_votequorum | |||
} | |||
logging { | |||
to_syslog: yes | |||
} | |||
</source> | </source> | ||
== Start the Cluster For the First Time ==

This starts the cluster communication and membership layer for the first time.

'''On one node only''';

<source lang="bash">
pcs cluster start --all
</source>
<source lang="text"> | <source lang="text"> | ||
an-c03n01: Starting Cluster... | |||
an-c03n02: Starting Cluster... | |||
</source> | </source> | ||
After a few moments, you should be able to check the status;

<source lang="bash">
pcs status
</source>
<source lang="text"> | |||
Last updated: Fri Feb 15 01:30:43 2013 | |||
Last change: Fri Feb 15 01:30:29 2013 via crmd on an-c03n01 | |||
Stack: corosync | |||
Current DC: an-c03n01 (1) - partition with quorum | |||
Version: 1.1.9-0.1318.a7966fb.git.fc18-a7966fb | |||
2 Nodes configured, unknown expected votes | |||
0 Resources configured. | |||
Online: [ an-c03n01 an-c03n02 ] | |||
Full list of resources: | |||
</source> | </source> | ||
The other node should show nearly identical output.
== Disabling Quorum ==

{{note|1=The math, in brief; quorum requires a strict majority of votes, <span class="code">floor(n/2) + 1</span>. With <span class="code">n = 2</span> nodes, that is <span class="code">2</span> votes, and a lone surviving node only ever has <span class="code">1</span>.}}

With quorum enabled, a two-node cluster will lose quorum once either node fails. So we have to disable quorum.

By default, pacemaker uses quorum. You don't see this initially, though;
<source lang="bash"> | <source lang="bash"> | ||
pcs property | |||
</source> | |||
<source lang="text"> | |||
Cluster Properties: | |||
dc-version: 1.1.9-0.1318.a7966fb.git.fc18-a7966fb | |||
cluster-infrastructure: corosync | |||
</source> | </source> | ||
To disable it, we set <span class="code">no-quorum-policy=ignore</span>.
<span class="code"></span> | <span class="code"></span> |