Abandoned - Two Node Fedora 13 Cluster

Overview

This paper has one goal:

  1. How to assemble the simplest cluster possible, a 2-Node Cluster, which you can then expand on for your own needs.

With this completed, you can then jump into "step 2" papers that will show various uses of a two-node cluster:

  1. How to create a "floating" virtual machine that can move between the two nodes in the event of a node failure, maximizing uptime.
  2. How to create simple resources that can move between nodes. Examples will be a simple PostgreSQL database, DHCP, DNS and web servers.

Prerequisites

It is expected that you are already comfortable with the Linux command line, specifically bash, and that you are familiar with general administrative tasks in Red Hat based distributions, specifically Fedora. You will also need to be comfortable using editors like vim, nano or similar. This paper uses vim in examples. Simply substitute your favourite editor in its place.

You are also expected to be comfortable with networking concepts: TCP/IP, multicast, broadcast, subnets and netmasks, routing and other relatively basic topics. Please take the time to become familiar with these concepts before proceeding.

This said, as much detail as possible will be provided where feasible. For example, all configuration file locations will be shown and functioning sample files will be provided.

Platform

This paper will implement the Red Hat Cluster Suite using the Fedora 13 distribution. This paper uses the x86_64 repositories; however, if you are on an i386 (32-bit) system, you should be able to follow along fine. Simply replace x86_64 with i386 or i686 in package names.

You can either download the stock Fedora 13 DVD ISO, or you can try out the alpha AN!Cluster Install DVD (4.3 GB ISO). If you use the latter, please test it out on a development or test cluster. If you have any problems with the AN!Cluster variant of the Fedora distro, please contact me and let me know what your trouble was.

Why Fedora 13?

Generally speaking, I much prefer to use a server-oriented Linux distribution like CentOS, Debian or similar. However, there have been many recent changes in the Linux clustering world that have made all of the currently available server-class distributions obsolete. With luck, this will change once Red Hat Enterprise Linux and CentOS version 6 are released.

Until then, Fedora 13 provides the most up-to-date binary releases of the new implementation of the clustering stack. For this reason, it is the best choice for clustering if you want to be current. To mitigate some of the issues introduced by using a workstation distribution, many packages will be stripped out of the default install.

Focus

Clusters can serve to solve three problems: reliability, performance and scalability.

The focus of the cluster described in this paper is primarily reliability. Second to that comes scalability, leaving performance to be addressed only where it does not impact the first two criteria. This is not to say that performance is not a valid priority; it simply isn't the priority of this paper.

Goal

At the end of this paper, you should have a fully functioning two-node cluster capable of hosting "floating" resources; that is, resources that exist on one node and can be moved to the other node with minimal effort and downtime. This should leave you with a solid foundation for adding more virtual servers, up to the limit of your cluster's resources.

This paper should also serve to show how to build the foundation of any other cluster configuration. Its core focus is introducing the main issues that come with clustering, and it hopes to serve as a foundation for cluster configurations outside its own scope.

Begin

Let's begin!

Hardware

We will need two physical servers each with the following hardware:

  • One or more multi-core CPUs with virtualization support (a quick check is shown below this list).
  • Three network cards; at least one should be gigabit or faster.
  • One or more hard drives.
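
If you are not sure whether your CPUs support hardware virtualization, the check below should work on most systems; 'vmx' is Intel's flag and 'svm' is AMD's. If it prints nothing, virtualization is either unsupported or disabled in the BIOS.

grep -E 'vmx|svm' /proc/cpuinfo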

This paper uses the following hardware:

  • ASUS M4A78L-M
  • AMD Athlon II x2 250
  • 2GB Kingston DDR2 KVR800D2N6K2/4G (4GB kit split between the two nodes)
  • 1x Intel 82540 PCI NIC
  • 1x D-Link DGE-560T

This is not an endorsement of the above hardware. I bought what was within my budget that would serve the purposes of creating this document. What you purchase shouldn't matter, so long as the minimum requirements are met.

OS Install

Start with a stock CentOS 5.x install. This How-To uses CentOS 5.4 x86_64; however, it should be fairly easy to adapt to other CentOS 5.x releases, RHEL 5 or other RHEL 5-based distributions.

These are the sample kickstart scripts used by this paper. Be sure to set your own password string and network settings.

Warning! These kickstart scripts will erase your hard drive! Adapt them, don't blindly use them.

Generic cluster node kickstart scripts.
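
To give a rough idea of what these contain, below is a minimal sketch of a node kickstart file. It is illustrative only; the password, IP addresses and partitioning are placeholders and must be replaced with your own values.

install
text
lang en_US.UTF-8
keyboard us
timezone --utc America/Toronto
# Placeholder password; use your own (ideally an encrypted string via 'rootpw --iscrypted').
rootpw Initial1
# Placeholder network settings; adjust the device, IP and gateway for your node.
network --device eth0 --bootproto static --ip 192.168.1.71 --netmask 255.255.255.0 --gateway 192.168.1.1 --nameserver 192.168.1.1
firewall --disabled
selinux --permissive
bootloader --location=mbr
# Warning: this wipes the target drive.
clearpart --all --initlabel
autopart
reboot

%packages
@base
@core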

AN!Cluster Install

If you are feeling brave, below is a link to a custom install DVD that contains the kickstart scripts needed to set up nodes, plus an an-cluster directory with all of the configuration files.

  • Download the custom AN!Cluster v0.1.006 Install DVD. (4.5GiB iso). (Currently disabled - Reworking for F13)

Post OS Install

Once the OS is installed, we need to do some ground work.

  1. Setup networking.
  2. Limit dom0's memory.
  3. Change the default run-level.
  4. Change when xend starts.

Post-Install Network Configuration

This cluster uses Xen, which fairly dramatically impacts networking. Terms you need to be familiar with are:

  • dom0
    • This is the "first" virtual machine with special access to the underlying hardware. This looks like the host operating system but is in fact just another virtual server running under Xen. This is also the virtual machine that can directly see the Xen networking infrastructure.
  • domU
    • These are the virtual servers setup in and managed by the dom0 virtual machine. These are what most people think of when talking about "virtual servers" under Xen.
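
Once Xen is running, you can see both in practice with the xm tool. Domain-0 is dom0; anything else listed is a domU. The domU name and numbers below are only illustrative:

xm list
Name                         ID   Mem(MiB) VCPUs      State   Time(s)
Domain-0                      0        512     2      r-----    123.4
vm01-example                  1        256     1      -b----     45.6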

Ethernet Devices and Subnets

The most important thing to do after the install is to identify which ethX device matches which network card; a quick way to check this is shown below. This matters in two cases:

  • The fastest network card should be allocated to the DRBD partition.
  • If you have IPMI piggy-backed on a physical network card, it should be allocated to the back-channel subnet.

This paper has the following configuration:

  • eth0; Internet-polluted subnet.
  • eth1; DRBD subnet.
  • eth2; Back-channel subnet.
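
If you need to confirm which physical card currently holds a given ethX name, the commands below are usually enough; 'ethtool -p' blinks the port's LED so you can find it on the back of the machine, where the driver supports it.

ip link show                                    # or 'ifconfig -a'; note each device's MAC address
ethtool -p eth0 10                              # blink eth0's LED for 10 seconds
cat /etc/sysconfig/network-scripts/ifcfg-eth0   # the HWADDR line ties the name to a MAC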

To change which ethX device maps to which ethernet card, please see:

If you are unfamiliar with how networking works in Xen, please read this article:

Choosing your Subnets

There will be three subnets in our two-node cluster:

  • Internet-polluted subnet; 192.168.1.0/24
    • This subnet will ultimately be directly accessible only by the firewall virtual server. All other virtual machines and the nodes' dom0s will access the internet via the firewall for security reasons. During setup, though, the dom0 servers will access this subnet directly.
  • DRBD subnet; 10.0.0.0/24
    • Only the two dom0 servers will have access to this subnet. It is used for DRBD communication and as a backup for the totem ring protocol.
  • Back-channel; 10.0.1.0/24
    • This is the private subnet used for communication between the dom0 and domU virtual servers. This subnet will have no direct access to the internet.

I like to assign the same last octet to all of a given node's addresses. This helps me keep track of which node I am working with at any given time. Here is how I set up my two nodes (a sample ifcfg file is sketched after this list):

  • an-node01
    • eth0: 192.168.1.71
    • eth1: 10.0.0.71
    • eth2: 10.0.1.71
  • an-node02
    • eth0: 192.168.1.72
    • eth1: 10.0.0.72
    • eth2: 10.0.1.72
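
As a sketch only, an-node01's back-channel interface could be configured with a static ifcfg file like the one below; the HWADDR value is a placeholder, and the DRBD and internet-facing interfaces follow the same pattern with their own addresses.

vim /etc/sysconfig/network-scripts/ifcfg-eth2
# Back-channel interface on an-node01.
DEVICE=eth2
BOOTPROTO=static
ONBOOT=yes
IPADDR=10.0.1.71
NETMASK=255.255.255.0
# Placeholder MAC; set this to pin the ethX name to the right card.
#HWADDR=00:11:22:33:44:55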

/etc/hosts

Some applications expect to be able to call nodes by their name. To accommodate this, and to ensure that inter-node communication takes place on the back-channel subnet, we add the following to the /etc/hosts file:

vim /etc/hosts
# Back-channel IP to name mapping.
10.0.1.71	an-node01 an-node01.alteeve.com
10.0.1.72	an-node02 an-node02.alteeve.com

Note: Delete any pre-existing entries matching the name returned by uname -n. There is a good chance there will be an entry that resolves to 127.0.0.1 which would cause problems later.

Obviously, adapt the names and IPs to match your nodes and subnets. The only critical thing is to make sure that the name returned by uname -n is resolvable to the back-channel subnet. I like to add a short-form name for convenience.
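
A quick sanity check is to make sure the node's own name resolves to its back-channel address and that the other node answers on that subnet:

getent hosts $(uname -n)     # should return the 10.0.1.x back-channel address
ping -c 2 an-node02          # run from an-node01; swap the name on the other node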

iptables

Be sure to flush netfilter tables and disable iptables and ip6tables from starting on your nodes. This is because the 'dom0' servers will not be connected directly to the Internet and we want to minimize the chance of an errant iptables rule messing up our configuration. If, before launch, you wish to implement a firewall, feel free to do so but be sure to thoroughly test your cluster to ensure no problems were introduced.

chkconfig --level 2345 iptables off
/etc/init.d/iptables stop
chkconfig --level 2345 ip6tables off
/etc/init.d/ip6tables stop
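
If any rules were already loaded, you can flush them and confirm that the tables are now empty; note that this removes any rules you may still have wanted:

iptables -F        # flush all rules
iptables -X        # delete any user-defined chains
iptables -L -n     # verify; each chain should show policy ACCEPT and no rules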

Limit dom0's Memory

Normally, 'dom0' will claim and use memory not allocated to virtual machines. This can cause trouble if, for example, you've moved a VM off of a node and then want to move it or another VM back. For a period of time, the node will claim that there is not enough free memory for the migration. By setting a hard limit on dom0's memory usage, this scenario won't happen and you will not need to delay migrations.

To do this, add dom0_mem=512M to the Xen hypervisor's kernel line in grub (the line that loads xen.gz, not the module line that loads the Linux kernel). For example, you should have a stanza like this in your grub configuration file:

vim /boot/grub/menu.lst
title CentOS (2.6.18-164.15.1.el5xen)
	root (hd0,0)
	kernel /xen.gz-2.6.18-164.15.1.el5 dom0_mem=512M
	module /vmlinuz-2.6.18-164.15.1.el5xen ro root=/dev/an-lvm01/lv01 rhgb quiet
	module /initrd-2.6.18-164.15.1.el5xen.img

You can replace '512M' with the amount of RAM you want to allocate to dom0. Note that if you used the AN!Cluster install DVD or the AN!Cluster kickstart files, this should already be set for you.

REMEMBER!

If you update your kernel, be sure to re-add this argument to the new kernel's argument list.
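
After rebooting, you can confirm that dom0 is holding to its limit; the exact numbers will vary with your hardware:

xm list                  # Domain-0 should show roughly 512 under Mem(MiB)
xm info | grep -i mem    # total and free memory as the hypervisor sees it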

Change the Default Run-Level

If you don't plan to work on your nodes directly, it makes sense to switch the default run level from 5 to 3. This prevents Gnome from starting at boot, thus freeing up a lot of memory and system resources and reducing the possible attack vectors.

To do this, edit /etc/inittab, change the id:5:initdefault: line to id:3:initdefault: and then switch to run level 3:

vim /etc/inittab
id:3:initdefault:
init 3
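
You can confirm the change with the runlevel command, which prints the previous and current run levels; after the switch it should report something like '5 3':

runlevel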

Change when xend starts

Normally, xend starts at priority 98 in /etc/rc.X/. This can cause problems with other packages that expect the network to be stable, because xend takes all of the networks down when it starts. To prevent these problems, we will move the xend init script to start priority 11. We'll also change the stop priority to 89, though this is less critical.

First, edit the actual initialization script and change the line '# chkconfig: 2345 98 01' to '# chkconfig: 2345 11 89'.

vim /etc/init.d/xend
# chkconfig: 2345 11 89

Now, use chkconfig to apply the changes:

chkconfig --del xend
chkconfig --add xend

You should now see the symlinks /etc/rc3.d/S11xend and /etc/rc3.d/K89xend.
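
You can also confirm the change with chkconfig rather than listing the rc directories by hand; the output shown is typical:

chkconfig --list xend
xend           	0:off	1:off	2:on	3:on	4:on	5:on	6:off
ls -l /etc/rc3.d/ | grep xend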


 

Any questions, feedback, advice, complaints or meanderings are welcome.