RHCS Release Manager


Warning: These are my notes to help me with my duties as the Red Hat cluster release manager. They are not meant to be a general tutorial and no effort is made to make these notes useful in any general way. Unless you are taking over my duties, this page is probably useless to you.

Test Results

These tests are run by copying over the tarball created on the build machine, untar'ing it and running ./configure && make.

Testing v3.1.9x

Per-release test results.

Distro Build Tests

Distro          Arch    Date tested (YYYY-MM-DD)  Results  Notes
Fedora Rawhide  x86_64  2011-12-24                Pass
Fedora Rawhide  i386    2011-12-24                Pass
Fedora 16       x86_64  2011-12-24                Pass
Fedora 16       i386    2011-12-24                Pass
Fedora 15       x86_64  2011-12-24                Pass
Fedora 15       i386    2011-12-24                Pass
Ubuntu 11.10    amd64   2011-12-24                Pass
Ubuntu 11.10    i386    2011-12-24                Pass

Cluster Tests

Host Nodes are Fedora 16, x86_64.

Test                                            Result   Notes
Install via make install                        Pass
Start One Node, No quorum or fence              Problem  The systemctl start returns "job failed". The cluster itself is fine, and starting the second node joins it. Calling cman_tool status returns what you would expect. The biggest concern is that systemctl status continues to think it has failed, even after a second node joins and it gains quorum. Restarting the cman daemon restores the status to a good state, but of course that is a disruptive fix. Having the daemons start, even without quorum, should be enough to have systemctl exit with success.
Start Second Node, Gain quorum and fence third  Pass
Full Cluster Start                              Pass
Withdraw One Node, Retain Quorum                Pass
Withdraw Second Node, Drop Quorum               Pass
Push out updated cluster.conf                   Pass
Start the service via rgmanager                 Pass     Same bash-completion issue as when starting the cman service. Nothing completes until the command has been run once; after that, stop still doesn't complete, but the rest do.
Manual relocate the service                     Pass
Fence node/recover service                      Pass
Release Ready?                                  Yes      Bash-completion issues should not be blockers.
General Notes                                            Tab-completion of systemctl {stop,start,status} cman.service doesn't happen until the command has been typed in full once. Bash completion of stop doesn't work at all.

Cluster configuration used

<?xml version="1.0"?>
<cluster config_version="1" name="rm-cluster">
	<totem rrp_mode="none" secauth="off" />
	<clusternodes>
		<clusternode name="an-node03.alteeve.com" nodeid="3">
			<fence>
				<method name="pdu">
					<device action="reboot" name="pdu2" port="3" />
				</method>
			</fence>
		</clusternode>
		<clusternode name="an-node04.alteeve.com" nodeid="4">
			<fence>
				<method name="pdu">
					<device action="reboot" name="pdu2" port="4" />
				</method>
			</fence>
		</clusternode>
		<clusternode name="an-node05.alteeve.com" nodeid="5">
			<fence>
				<method name="pdu">
					<device action="reboot" name="pdu2" port="5" />
				</method>
			</fence>
		</clusternode>
	</clusternodes>
	<fencedevices>
		<fencedevice agent="fence_apc_snmp" ipaddr="pdu2.alteeve.com" name="pdu2" />
	</fencedevices>
	<rm log_level="5">
		<resources>
			<ip address="10.10.1.1" />
		</resources>
		<failoverdomains>
			<failoverdomain name="virt-ip" nofailback="0" ordered="1" restricted="0">
				<failoverdomainnode name="an-node03.alteeve.com" priority="1" />
				<failoverdomainnode name="an-node04.alteeve.com" priority="2" />
				<failoverdomainnode name="an-node05.alteeve.com" priority="3" />
			</failoverdomain>
		</failoverdomains>
		<service autostart="1" domain="virt-ip" name="float_ip" recovery="relocate">
			<ip ref="10.10.1.1" />
		</service>
	</rm>
</cluster>
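
The "Push out updated cluster.conf" test needs a file whose config_version is higher than the one currently running. Below is a minimal sketch of bumping that attribute with sed. The function name bump_config_version is hypothetical, and how the new file then gets distributed (for example with cman_tool version -r) is an assumption that these notes do not record.

```shell
# Hypothetical helper: raise config_version in a cluster.conf by one.
# Prints the new version number on success.
bump_config_version() {
    conf="$1"
    # Read the current value from the opening <cluster ...> tag only.
    current=$(sed -n 's/.*<cluster [^>]*config_version="\([0-9][0-9]*\)".*/\1/p' "$conf" | head -n 1)
    if [ -z "$current" ]; then
        echo "no config_version found in $conf" >&2
        return 1
    fi
    next=$((current + 1))
    # Write to a temporary file first so a failed sed cannot truncate the original.
    tmp=$(mktemp) || return 1
    if sed "s/\(<cluster [^>]*\)config_version=\"$current\"/\1config_version=\"$next\"/" "$conf" > "$tmp"; then
        cat "$tmp" > "$conf"
    fi
    rm -f "$tmp"
    echo "$next"
}
```

Trying it on a copy first is the safer route, e.g. cp /etc/cluster/cluster.conf /tmp/ and then bump_config_version /tmp/cluster.conf.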

Old Tests

Build Environment

This is how to set up a machine for building and releasing new versions of RHCS. This requires a proper FAS account.

yum -y groupinstall "Development Libraries" "Development Tools" "Fedora Packager"
yum -y install vim wget gnupg

Remove any old builds.

cd ~/projects/RedHat/release-manager/
rm -rf ./cluster*
Warning: This must be done on the machine with the Fedora Account System - Fedora Project key.

Check out the latest cluster source.

git clone ssh://git.fedorahosted.org/git/cluster.git
cd cluster/
git branch stable31 --track origin/STABLE31
git checkout stable31

Change this to reflect the appropriate versions.

Note: You will need to enter the proper passphrase to sign this TC.
make -f make/release.mk version=3.1.99 oldversion=3.1.8
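
Since both version numbers change with every release, a small wrapper can catch typos before make runs. This is only a sketch, assuming versions are always three dot-separated numbers. The function name release_cmd is hypothetical, and it prints the command rather than running it.

```shell
# Hypothetical helper: sanity-check the two version strings, then print
# the release command from the notes above without executing it.
release_cmd() {
    version="$1"
    oldversion="$2"
    for v in "$version" "$oldversion"; do
        # Assumes versions look like 3.1.99 (three numeric fields).
        if ! printf '%s\n' "$v" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+$'; then
            echo "bad version string: '$v'" >&2
            return 1
        fi
    done
    if [ "$version" = "$oldversion" ]; then
        echo "version and oldversion are identical" >&2
        return 1
    fi
    echo "make -f make/release.mk version=$version oldversion=$oldversion"
}
```

Once the printed line looks right, it can be pasted into the shell as-is.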

Test Release Tarball Creation

Push the test candidate to each VM to be tested. Change the version number as appropriate.

cd ../cluster-3.1.99-release-candidate/
ls -lah
total 6.3M
drwxrwxr-x 2 digimer digimer 4.0K Dec 24 13:48 .
drwxrwxr-x 6 digimer digimer 4.0K Dec 24 13:48 ..
-rw-rw-r-- 1 digimer digimer    0 Dec 24 13:48 Changelog-3.1.99
-rw-rw-r-- 1 digimer digimer  619 Dec 24 13:48 cluster-3.1.99.sha256
-rw-rw-r-- 1 digimer digimer  836 Dec 24 13:48 cluster-3.1.99.sha256.asc
-rw-rw-r-- 1 digimer digimer 3.1M Dec 24 13:48 cluster-3.1.99.tar
-rw-rw-r-- 1 digimer digimer 513K Dec 24 13:48 cluster-3.1.99.tar.bz2
-rw-rw-r-- 1 digimer digimer 641K Dec 24 13:48 cluster-3.1.99.tar.gz
-rw-rw-r-- 1 digimer digimer 474K Dec 24 13:48 cluster-3.1.99.tar.xz
-rw-rw-r-- 1 digimer digimer 1.1M Dec 24 13:48 rgmanager-3.1.99.tar
-rw-rw-r-- 1 digimer digimer 174K Dec 24 13:48 rgmanager-3.1.99.tar.bz2
-rw-rw-r-- 1 digimer digimer 217K Dec 24 13:48 rgmanager-3.1.99.tar.gz
-rw-rw-r-- 1 digimer digimer 173K Dec 24 13:48 rgmanager-3.1.99.tar.xz
-rw-rw-r-- 1 digimer digimer    0 Dec 24 13:48 tag-3.1.99
rsync -av cluster-3.1.99.tar root@f15-make-x86-64:/root/
rsync -av cluster-3.1.99.tar root@f15-make-i386:/root/
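
The candidate directory contains a cluster-3.1.99.sha256 file, so the tarballs can be checked before they are pushed anywhere. The sketch below assumes that file is in the format written by sha256sum (these notes do not show its contents). The function name verify_tarballs is hypothetical. Checking the .asc signature with gpg would be a separate step and is not shown.

```shell
# Hypothetical helper: verify every file listed in a checksum file.
# Assumes the file uses sha256sum's "<hash>  <filename>" format.
verify_tarballs() {
    dir="$1"
    sumfile="$2"
    # Run in a subshell so the caller's working directory is unchanged.
    (
        cd "$dir" || exit 1
        sha256sum -c "$sumfile" >/dev/null 2>&1
    )
}
```

For example: verify_tarballs ../cluster-3.1.99-release-candidate cluster-3.1.99.sha256 && echo "checksums OK". Note that it fails if any file named in the checksum file is missing, not only if one is corrupt.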

Per-Node Compile Test

For distro testing, only ./configure && make are done.

cd ~
rm -rf cluster*
tar -xvf cluster-3.1.99.tar 
cd cluster-3.1.99
./configure
make
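
Each build result ends up as a row in the Distro Build Tests table, so it can help to have the node print that row itself. This is a sketch only: the function name build_row is hypothetical, and the distro and arch labels are typed by hand rather than detected.

```shell
# Hypothetical helper: run a build command and print one result row in
# the same order as the table columns: distro, arch, date, result.
build_row() {
    distro="$1"
    arch="$2"
    shift 2
    # Everything after the first two arguments is the command to run.
    # Its output goes to build.log in the current directory.
    if "$@" >build.log 2>&1; then
        result="Pass"
    else
        result="Fail"
    fi
    printf '%s\t%s\t%s\t%s\n' "$distro" "$arch" "$(date +%Y-%m-%d)" "$result"
}
```

From inside the unpacked source directory this could be called as: build_row "Fedora 16" x86_64 sh -c './configure && make'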

Cluster Test

This is done last, on the 3-node test cluster.

Copy the test tarball to the test nodes.

rsync -av projects/RedHat/release-manager/cluster-3.1.99-release-candidate/cluster-3.1.99.tar root@an-node03:/root/
rsync -av projects/RedHat/release-manager/cluster-3.1.99-release-candidate/cluster-3.1.99.tar root@an-node04:/root/
rsync -av projects/RedHat/release-manager/cluster-3.1.99-release-candidate/cluster-3.1.99.tar root@an-node05:/root/
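
The three rsync lines differ only in the host name, so a loop keeps the version number in one place. Here is a sketch of that; the function name push_candidate and the DRY_RUN variable are hypothetical.

```shell
# Hypothetical helper: copy the candidate tarball to each test node.
# With DRY_RUN=1 the rsync commands are printed instead of executed.
push_candidate() {
    version="$1"
    shift
    tarball="projects/RedHat/release-manager/cluster-${version}-release-candidate/cluster-${version}.tar"
    for node in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "rsync -av ${tarball} root@${node}:/root/"
        else
            rsync -av "${tarball}" "root@${node}:/root/" || return 1
        fi
    done
}
```

Run from the home directory, for example: push_candidate 3.1.99 an-node03 an-node04 an-node05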

Then, on each node, run the following commands (replace the version with the appropriate one, of course.)

cd ~
tar -xvf cluster-3.1.99.tar
cd cluster-3.1.99
yum -y update
yum -y install vim wget corosynclib-devel openaislib-devel fence-agents resource-agents modcluster ricci
./configure 
make
make install

Pushing Changes to git

Initial Setup of git

Don't have notes, this is from bash's history. Sort this out later.

# Misc commands
man git-send-email
git clone http://git.fedorahosted.org/git/cluster.git
man git
git branch http://git.fedorahosted.org/git/cluster.git
man git
git show-branch http://git.fedorahosted.org/git/cluster.git
git clone http://git.fedorahosted.org/git/cluster.git
git show-branch
git branch

Pushing To origin

Confirming the changes before committing.

git diff

Pushing changes;

Commit locally with signature.

git commit -a -s

(opens editor)

[stable31 9be00f8] Changes the kernel version check to handle 3.x.y kernels. Now if the 'x' version of the running kernel is higher than the 'x' version of the minimum kernel, the test passes. Also changed the method of checking that version numbers were gathered so that a version number of '0' would be seen as valid.
 Committer: Digital Mermaid <digimer@lework.alteeve.com>
Your name and email address were configured automatically based
on your username and hostname. Please check that they are accurate.
You can suppress this message by setting them explicitly:

    git config --global user.name "Your Name"
    git config --global user.email you@example.com

After doing this, you may fix the identity used for this commit with:

    git commit --amend --reset-author

 1 files changed, 3 insertions(+), 2 deletions(-)

Check the current branch name.

git branch
  master
* stable31

This shows that stable31 is active.
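
Because the local branch (stable31) and the remote branch (STABLE31) differ only in case, it is easy to push from the wrong branch. The following is a sketch of a guard that reads the git branch output shown above. The function names active_branch and on_branch are hypothetical.

```shell
# Hypothetical helper: print the active branch name, given the output
# of `git branch` on stdin (the active branch is marked with "* ").
active_branch() {
    sed -n 's/^\* //p'
}

# Hypothetical helper: succeed only if the active branch, read from
# `git branch` output on stdin, matches the expected name.
on_branch() {
    [ "$(active_branch)" = "$1" ]
}
```

A possible use: git branch | on_branch stable31 && git push origin stable31:STABLE31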

Now push up to the main git repo.

git push origin stable31:STABLE31
Counting objects: 5, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 543 bytes, done.
Total 3 (delta 2), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/cluster.git
   991bfb0..9be00f8  stable31 -> STABLE31

An email should have been automatically sent to the appropriate mailing lists.

Distro Testing

Note: Always update the OS before running tests!

Fedora 15, 16 and Rawhide

Packages to install;

yum -y groupinstall "Development Libraries" "Development Tools" "Fedora Packager"
yum -y install vim wget corosynclib-devel openaislib-devel fence-agents resource-agents modcluster ricci

32-bit versions also need;

yum -y install kernel-PAE-devel

Now make sure that ricci is running and that selinux, iptables and ip6tables are off.

sed -e "s/SELINUX=enforcing/SELINUX=disabled/" -i /etc/selinux/config
systemctl disable ip6tables.service
systemctl disable iptables.service
systemctl enable modclusterd.service
systemctl enable ricci.service
systemctl stop ip6tables.service
systemctl stop iptables.service
systemctl start modclusterd.service
systemctl restart ricci.service

Reboot if selinux was updated.
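
Whether a reboot is needed depends on whether the sed line actually changed anything. The sketch below reports this by comparing the file before and after. The function name disable_selinux is hypothetical, and the file path is a parameter so it can be tried on a copy first.

```shell
# Hypothetical helper: apply the same substitution as the sed line above
# and report whether the file changed (and so whether to reboot).
disable_selinux() {
    conf="${1:-/etc/selinux/config}"
    before=$(cksum < "$conf") || return 1
    sed -e "s/SELINUX=enforcing/SELINUX=disabled/" -i "$conf" || return 1
    after=$(cksum < "$conf")
    if [ "$before" != "$after" ]; then
        echo "changed: reboot needed"
    else
        echo "unchanged: no reboot needed"
    fi
}
```

Like the original sed line, this only touches a line that reads SELINUX=enforcing; a file set to permissive is left alone and reported as unchanged.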

Debian 6

Debian 6 does not support RHEL's Cluster 3.1+ as its included version of corosync is too old. No further compatibility testing will be run for this version of Debian.

Ubuntu 11.10

Packages to install;

apt-get update
apt-get -y dist-upgrade
apt-get -y install linux-headers-$(uname -r) libxml2-dev libcorosync-dev libldap2-dev zlib1g-dev libopenais-dev libdbus-1-dev \
 libslang2-dev libncurses5-dev

Pushing The Release

 
