Red Hat Cluster Service 2 Tutorial - Archive


Overview

This paper has one goal:

  • Creating a 2-node, high-availability cluster hosting Xen virtual machines, using RHCS "stable 2" with DRBD for synchronized storage.

Technologies We Will Use

  • Enterprise Linux 5; specifically, we will be using CentOS v5.5.
  • Red Hat Cluster Services "Stable" version 2, which includes the following core components:
    • OpenAIS: provides cluster communications using the totem protocol.
    • Cluster Manager (cman): manages the starting and stopping of the cluster and its membership.
    • Resource Manager (rgmanager): manages cluster resources and services, and handles service recovery during failures.
    • Cluster Logical Volume Manager (clvm): cluster-aware (disk) volume manager. Backs the GFS2 file systems and Xen virtual machines.
    • Global File System version 2 (gfs2): cluster-aware, concurrently mountable file system.
  • Distributed Replicated Block Device (DRBD): keeps shared data synchronized across the cluster nodes.
  • Xen: hypervisor that controls and supports the virtual machines.

A Note on Patience

There is nothing inherently hard about clustering. However, there are many components that you need to understand before you can begin. The result is that clustering has an inherently steep learning curve.

You must have patience. Lots of it.

Many technologies can be learned by creating a very simple base and then building on it. The classic "Hello, World!" script created when first learning a programming language is an example of this. Unfortunately, there is no real analog to this in clustering. Even the most basic cluster requires several pieces be in place and working together. If you try to rush by ignoring pieces you think are not important, you will almost certainly waste time. A good example is setting aside fencing, thinking that your test cluster's data isn't important. The cluster software has no concept of "test". It treats everything as critical all the time and will shut down if anything goes wrong.

Take your time, work through these steps, and you will have the foundation cluster sooner than you realize. Clustering is fun because it is a challenge.

Prerequisites

It is assumed that you are familiar with Linux systems administration, specifically Red Hat Enterprise Linux and its derivatives. You will need somewhat advanced networking experience as well. You should be comfortable working in a terminal (directly or over ssh). Familiarity with XML will help, but is not strictly required, as its use here is fairly self-evident.

If you feel a little out of depth at times, don't hesitate to set this tutorial aside. Branch over to the components you feel the need to study more, then return and continue on. Finally, and perhaps most importantly, you must have patience! If you have a manager asking you to "go live" with a cluster in a month, tell him or her that it simply won't happen. If you rush, you will skip important points and you will fail. Patience is vastly more important than any pre-existing skill.

Focus and Goal

There is a different cluster for every problem. Generally speaking, though, there are two main problems that clusters try to resolve: performance and high availability. Performance clusters are generally tailored to the application requiring the performance increase. There are some general tools for performance clustering, like Red Hat's LVS (Linux Virtual Server) for load-balancing common applications such as the Apache web server.

This tutorial will focus on High Availability clustering, often shortened to simply HA, and not to be confused with the Linux-HA "heartbeat" cluster suite, which we will not be using here. The cluster will provide shared file systems and high availability for Xen-based virtual servers. The goal will be to have the virtual servers live-migrate during planned node outages and automatically restart on a surviving node when the original host node fails.

A very brief overview:

High Availability clusters like ours have two main parts: cluster management and resource management.

The cluster itself is responsible for maintaining the cluster nodes in a group. This group is part of a "Closed Process Group", or CPG. When a node fails, the cluster manager must detect the failure, reliably eject the node from the cluster and re-form the CPG. Each time the cluster changes, or "re-forms", the resource manager is called. The resource manager checks to see how the cluster changed, consults its configuration and determines what to do, if anything.

The details of all this will be discussed a little later on. For now, it is sufficient to keep these two major roles in mind and to understand that they are somewhat independent entities.

Platform

This tutorial was written using CentOS version 5.5, x86_64. No attempt was made to test it on i686 hardware or on other EL5 derivatives. That said, there is no reason to believe it will not apply to any variant. As much as possible, the language will be distro-agnostic. For reasons of memory constraints, it is advised that you use an x86_64 (64-bit) platform if at all possible.

Do note that as of EL5.4, significant changes were made to how RHCS is supported. It is strongly advised that you use version 5.4 or newer while working with this tutorial.

Base Setup

Before we can look at the cluster, we must first build two cluster nodes and then install the operating system.

Hardware Requirements

The bare minimum requirements are:

  • All hardware must be supported by EL5. It is strongly recommended that you check compatibility before making any purchases.
  • A dual-core CPU with hardware virtualization support.
  • Three network cards; at least one should be gigabit or faster.
  • One hard drive.
  • 2 GiB of RAM.
  • A fence device. This can be an IPMI-enabled server, a Node Assassin, a switched PDU or similar.

This tutorial was written using the following hardware:

This is not an endorsement of the above hardware. I put a heavy emphasis on minimizing power consumption and bought what was within my budget. This hardware was never meant to be put into production, but instead was chosen to serve the purpose of my own study and for creating this tutorial. What you ultimately choose to use, provided it meets the minimum requirements, is entirely up to you and your requirements.

Note: I use three physical NICs, but you can get away with two by merging the storage and back-channel networks, which we will discuss shortly. If you are really in a pinch, you could create three aliases on one interface and isolate them using VLANs. If you go this route, please ensure that your VLANs are configured and working before beginning this tutorial. Pay close attention to multicast traffic.
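If you do go the VLAN route, EL5 handles tagged traffic through per-VLAN sub-interface files. The following is only a sketch; the VLAN ID (2), the physical interface name and the addressing are placeholder values that will depend on your own switch and network plan:

# /etc/sysconfig/network-scripts/ifcfg-eth0.2
# Example: a tagged sub-interface carrying the Storage Network on VLAN 2.
VLAN=yes
DEVICE=eth0.2
BOOTPROTO=none
ONBOOT=yes
IPADDR=192.168.2.71
NETMASK=255.255.255.0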

Pre-Assembly

Before you assemble your nodes, take a moment to record the MAC address of each network interface and note where each interface is physically installed. This will help you later when configuring the networks. I generally create a simple text file listing the MAC addresses, the interface name I intend to assign to each, and where each card is physically located.

-=] an-node01
48:5B:39:3C:53:15   # eth0 - onboard interface
00:1B:21:72:96:E8   # eth1 - right-most PCIe interface
00:1B:21:72:9B:56   # eth2 - left-most PCI interface

-=] an-node02
48:5B:39:3C:53:14   # eth0 - onboard interface
00:1B:21:72:9B:5A   # eth1 - right-most PCIe interface
00:1B:21:72:96:EA   # eth2 - left-most PCI interface
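On EL5, one common way to put these recorded MAC addresses to use is to pin each interface name with the HWADDR directive in its ifcfg file, so that the kernel's probe order cannot shuffle your NICs. A minimal sketch for an-node01's onboard interface; the IP configuration itself comes later in the networking section:

# /etc/sysconfig/network-scripts/ifcfg-eth0 on an-node01
# HWADDR ties the eth0 name to the onboard NIC recorded above.
DEVICE=eth0
HWADDR=48:5B:39:3C:53:15
BOOTPROTO=none
ONBOOT=yes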

OS Install

Later steps will include the packages to install, so the initial OS install can be minimal. I like to change the default run-level to 3, remove rhgb quiet from the grub menu, disable the firewall and disable SELinux; one way to make these changes by hand is shown after the notes below. In a production cluster you will want to use the firewall and SELinux, but until you finish studying, leave them off to keep things simple.

  • Note: Before EL5.4, you could not use SELinux. It is now possible to use it, and it is recommended that you do so in any production cluster.
  • Note: Ports and protocols to open in a firewall will be discussed later in the networking section.
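The following commands are one way to make those post-install changes by hand on EL5; treat them as a sketch and review each change before applying it to your own systems:

# Boot to run-level 3 by default (no graphical login).
sed -i 's/^id:5:initdefault:/id:3:initdefault:/' /etc/inittab

# Show boot messages by removing "rhgb quiet" from the kernel lines.
sed -i 's/ rhgb quiet//' /boot/grub/grub.conf

# Disable the firewall for now; re-enable it with the proper ports in production.
chkconfig iptables off
service iptables stop

# Disable SELinux; setenforce 0 switches to permissive immediately,
# while the config change takes full effect on the next reboot.
sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
setenforce 0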

I like to minimize and automate my installs as much as possible. To that end, I run a little PXE server on my network and use a kickstart script to automate the install. Here is a simple one for use on a single-drive node:
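The kickstart file itself is not reproduced in this archive, so here is a minimal sketch of the sort of kickstart that would do the job, assuming a single disk at /dev/sda and a hypothetical HTTP install tree; adjust the url, password, time zone, partitioning and package list to suit your environment:

# Minimal example kickstart for a single-drive CentOS 5.5 node (sketch only).
install
url --url http://192.168.1.254/centos/5.5/os/x86_64/
lang en_US.UTF-8
keyboard us
network --device eth0 --bootproto dhcp
rootpw changeme
firewall --disabled
selinux --disabled
authconfig --enableshadow --enablemd5
timezone --utc America/Toronto
bootloader --location=mbr
clearpart --all --drives=sda --initlabel
part /boot --fstype ext3 --size=256 --ondisk=sda --asprimary
part swap --size=2048 --ondisk=sda --asprimary
part / --fstype ext3 --size=1 --grow --ondisk=sda --asprimary
reboot

%packages
@core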

If you decide to manually install EL5 on your nodes, please try to keep the installation as small as possible. The fewer packages installed, the fewer sources of problems and vectors for attack.

Post Install OS Changes

This section discusses changes that I recommend, but that are not required.

Network Configuration

The most important recommended change is to get your nodes into a consistent network configuration. This will prove very handy when trying to keep track of your networks and where they are physically connected, and it only becomes more helpful as your cluster grows.

The first step is to understand the three networks we will be creating. Once you understand their role, you will need to decide which interface on the nodes will be used for each network.

Cluster Networks

The three networks are:

Network                   Acronym   Use
Back-Channel Network      BCN       Private cluster communications, virtual machine migrations, fence devices.
Storage Network           SN        Used exclusively for storage communications. Possible to use as totem's redundant ring.
Internet-Facing Network   IFN       Internet-polluted network. No cluster or storage communication or devices.

Things To Consider

When planning which interfaces to connect to each network, consider the following, in order of importance:

  • If your nodes have an IPMI interface that shares a physical RJ-45 connector with a regular network interface, that port must be on the Back-Channel Network. The reasoning is that having your fence device accessible on the Internet-Facing Network poses a major security risk, while having the IPMI interface on the Storage Network can cause problems if a fence is fired while the network is saturated with storage traffic.
  • The lowest-latency network interface should be used for the Back-Channel Network. The cluster is maintained by multicast messaging between the nodes using something called the totem protocol. Any delay in the delivery of these messages risks causing a failure and the ejection of affected nodes when no actual failure existed. This will be discussed in greater detail later.
  • The network with the most raw bandwidth should be used for the Storage Network. All disk writes must be sent across the network and committed to the remote nodes before the write is declared complete. This makes the network the disk I/O bottleneck. Using a network with jumbo frames and high raw throughput will help minimize this bottleneck.
  • During the live migration of virtual machines, the VM's RAM is copied to the other node using the BCN. For this reason, the second-fastest network should be used for back-channel communication. However, these copies can saturate the network, so care must be taken to ensure that cluster communications get higher priority. This can be done using a managed switch. If you cannot ensure priority for totem multicast, then be sure to configure Xen to use the storage network for migrations, as sketched after this list.
  • The remaining, slowest interface should be used for the IFN.
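To illustrate that last point about Xen migrations: in EL5, xend's live-migration (relocation) behaviour is set in /etc/xen/xend-config.sxp. The fragment below is a sketch that binds the relocation listener to an-node01's Storage Network address and accepts only the peer node; the addresses follow the plan in the next section, and the full Xen configuration is covered later.

# /etc/xen/xend-config.sxp (fragment, an-node01)
(xend-relocation-server yes)
(xend-relocation-port 8002)
# Listen for incoming migrations only on the Storage Network address.
(xend-relocation-address '192.168.2.71')
# Accept relocation requests only from an-node02's Storage Network address.
(xend-relocation-hosts-allow '^192\\.168\\.2\\.72$')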

Planning the Networks

This paper will use the following setup. Feel free to alter the interface-to-network mapping and the IP subnets used to best suit your needs. For reasons entirely my own, I like to start the final octet of my cluster IPs at 71 for node 1 and increment up from there. This is entirely arbitrary, so please use whatever makes sense to you. The remainder of this tutorial will follow the convention below:

Network   Interface   Subnet
IFN       eth0        192.168.1.0/24
SN        eth1        192.168.2.0/24
BCN       eth2        192.168.3.0/24

This translates to the following per-node configuration:

an-node01:
  Network   IP Address     Host Name(s)
  IFN       192.168.1.71   an-node01.ifn
  SN        192.168.2.71   an-node01.sn
  BCN       192.168.3.71   an-node01, an-node01.alteeve.com, an-node01.bcn

an-node02:
  Network   IP Address     Host Name(s)
  IFN       192.168.1.72   an-node02.ifn
  SN        192.168.2.72   an-node02.sn
  BCN       192.168.3.72   an-node02, an-node02.alteeve.com, an-node02.bcn
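One straightforward way to realize the host names above is a common /etc/hosts file copied to both nodes. The sketch below is built directly from the table; adjust the domain and names to your own environment:

# /etc/hosts - identical copy kept on both nodes
127.0.0.1       localhost localhost.localdomain

# an-node01
192.168.1.71    an-node01.ifn
192.168.2.71    an-node01.sn
192.168.3.71    an-node01 an-node01.alteeve.com an-node01.bcn

# an-node02
192.168.1.72    an-node02.ifn
192.168.2.72    an-node02.sn
192.168.3.72    an-node02 an-node02.alteeve.com an-node02.bcn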


 
