Node Assassin Fence Agent v1.1.4

From Alteeve Wiki

Revision as of 14:49, 16 April 2010

This is the fenced fence agent for Node Assassin.

Files

The Node Assassin fence agent v1.1.4 is split up into three files:

  • Source: fence_na - Download
    • This is the core fence agent that exists in /sbin/.
  • Source: fence_na.lib - Download
    • This is the fence agent's function library that exists in /etc/na/.
  • Source: fence_na.conf - Download
    • This is the common Node Assassin configuration file that exists in /etc/na/.

The reason for the three files is that, later, there will be a fourth executable that will program the Node Assassin devices. When this program is created, it will consult the common configuration file and will use some of the functions in the library.

Configuration File

Be sure to review and edit /etc/na/fence_na.conf! It is heavily documented and explains what each option is and how it needs to be set for your Node Assassin(s).

The Cluster 'cluster.conf' File

Here is an example of the cluster-related entries you will need in cluster.conf in order to use the Node Assassin.

<cluster name="an_cluster" config_version="1">
        <clusternodes>
                <clusternode name="an_node01.alteeve.com" nodeid="1">
                        <fence>
                                <method name="node_assassin">
                                        <device name="motoko" port="01" action="off"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="an_node02.alteeve.com" nodeid="2">
                        <fence>
                                <method name="node_assassin">
                                        <device name="motoko" port="02" action="off"/>
                                </method>
                        </fence>
                </clusternode>
        </clusternodes>
        <fencedevices>
                <fencedevice name="motoko" agent="fence_na" ipaddr="motoko.alteeve.com" passwd="secret"></fencedevice>
        </fencedevices>
</cluster>

XML Validation Support

Until Node Assassin is natively supported by Red Hat, the cluster.ng validation file for cluster.conf needs to be updated in order for the Node Assassin arguments to successfully validate.

The installer takes care of this for you, so you do not need to manually edit the /usr/share/system-config-cluster/misc/cluster.ng file.

Support is provided by adding two sections to the cluster.ng file. Here is the diff between the edited cluster.ng and a backup of the original cluster.ng provided in CentOS 5.4.

diff /usr/share/system-config-cluster/misc/cluster.ng /root/backups/cluster.ng
47,62d46
<        <!-- Node Assassin -->
<        <group>
<         <attribute name="ipaddr"/>
<         <optional>
<         <attribute name="login"/>
<         </optional>
<         <optional>
<         <attribute name="passwd"/>
<         </optional>
<         <optional>
<         <attribute name="passwd_script"/>
<         </optional>
<         <optional>
<          <attribute name="quiet"/>
<         </optional>
<        </group>
1062,1066d1045
<         <!-- Node Assassin -->
<         <group>
<          <attribute name="port"/>
<          <attribute name="action"/>
<         </group>

Step by step of a Fence Action

When fenced is asked to fence a node, it will:

  1. Call /sbin/fence_na because of the fencedevices -> agent value.
  2. Pass the following arguments to the fence agent, one pair per line:
    agent=fence_na              # From 'fencedevices' -> 'agent'
    name=motoko                 # From 'fencedevices' -> 'name'
    ipaddr=motoko.alteeve.com   # From 'fencedevices' -> 'ipaddr'
    passwd=secret               # From 'fencedevices' -> 'passwd'
    port=01                     # From 'clusternode' -> 'an_node01.alteeve.com'
                                # -> 'port'
    action=off                  # From 'clusternode' -> 'an_node01.alteeve.com'
                                # -> 'action'. This must be 'on', 'off',
                                # 'reboot', 'status' or 'monitor'. See below
                                # for how these terms are interpreted by this
                                # agent. In most cases, you will want to use
                                # 'off'.
                                # NOTE: If 'option' is passed, its value will
                                # be stored in 'action'. That is, 'action' and
                                # 'option' are synonymous but 'option' is
                                # deprecated.
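
The argument list above can be read with a few lines of shell. The following is only a minimal sketch of the "one key=value pair per line" convention, not the code used by fence_na itself; the function name parse_fence_args is hypothetical.

```shell
#!/bin/sh
# Sketch only: read "key=value" pairs from standard input, one per line.
# parse_fence_args is a hypothetical helper, not part of fence_na.
parse_fence_args() {
    action=""; ipaddr=""; name=""; passwd=""; port=""
    while IFS= read -r line || [ -n "$line" ]; do
        case "$line" in
            ''|'#'*) continue ;;              # skip blank lines and comments
        esac
        key=${line%%=*}                       # text before the first '='
        value=${line#*=}                      # text after the first '='
        case "$key" in
            ipaddr) ipaddr=$value ;;
            name)   name=$value ;;
            passwd) passwd=$value ;;
            port)   port=$value ;;
            action|option) action=$value ;;   # 'option' is the deprecated synonym
            *) : ;;                           # keys this sketch does not use (e.g. 'agent')
        esac
    done
}
```

Call it with a redirect, for example 'parse_fence_args < args.txt', rather than at the end of a pipe, so that the variables are set in the current shell.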

Node Assassin's implementation of 'action's

Here is what the fence agent tells Node Assassin to do, internally, for each of the action types.

off

This sets the node to state 1; Fenced. Internally, it will hit the reset switch for one second to immediately disable the node. Then it will release the reset switch for another second before pressing and holding the power switch. After five seconds, Node Assassin will check the node's power feed. If it is still on, it will wait another 25 seconds and check again. If the node is still on, an error will be generated. If the node turns off successfully, the fence is declared a success.
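
The check, wait, check-again logic described above can be sketched in shell. This is not Node Assassin's own code; it only illustrates the timing. The names confirm_off and node_is_on are hypothetical, and node_is_on is a placeholder that a real script would replace with an actual probe of the power feed.

```shell
#!/bin/sh
# Hypothetical placeholder: return 0 if the node's power feed is on,
# non-zero if it is off. Here it always reports "off".
node_is_on() { return 1; }

# Sketch of the verification step: check after the first wait and, if
# the node is still on, wait again and check once more.
# $1 = first wait in seconds (default 5), $2 = second wait (default 25).
confirm_off() {
    first_wait=${1:-5}
    second_wait=${2:-25}
    sleep "$first_wait"
    node_is_on || return 0        # node is off; fence succeeded
    sleep "$second_wait"
    node_is_on || return 0        # node went off on the second check
    echo "node is still on after $((first_wait + second_wait)) seconds" >&2
    return 1
}
```

The waits are arguments so that the logic can be exercised quickly, for example with 'confirm_off 0 0'.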

on

This sets the node to state 0; Unfenced. Both the power and reset switches are opened. The Node Assassin then pauses for one second before closing the power switch for one second to boot the node (that is, the node is set to state 2).

reboot

This essentially just calls an off and then an on. However, the fence agent will return a success (exit 0) even if only the off stage succeeded. As per the FenceAgentAPI, a reboot does not need to successfully boot the node to be considered a success.
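
The rule above, where only the off stage decides the result, might look like the following in shell. This is a sketch under assumptions: fence_off, fence_on and fence_reboot are hypothetical names, and the first two are placeholders standing in for the real calls to the Node Assassin.

```shell
#!/bin/sh
# Hypothetical placeholders: a real agent would talk to the Node Assassin here.
fence_off() { return 0; }
fence_on()  { return 0; }

# Sketch of the reboot rule: fail only if the 'off' stage fails.
# $1 = port number.
fence_reboot() {
    port=$1
    if ! fence_off "$port"; then
        echo "reboot: failed to turn off port $port" >&2
        return 1
    fi
    if ! fence_on "$port"; then
        # The node is fenced, which is what matters, so only warn.
        echo "reboot: port $port is off but did not boot again" >&2
    fi
    return 0
}
```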

status

This checks the power feed for the requested node. If the node is on, the agent will exit with code 0. If the node is off (or disconnected), it will exit with code 1. If an error occurred calling the Node Assassin, this will exit with code 2.
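
A calling script can turn these exit codes into readable text. The helper below is a small sketch; the name describe_status is hypothetical and is not part of the fence agent.

```shell
#!/bin/sh
# Sketch only: translate the exit code of a 'status' call into text.
# $1 = the exit code returned by the agent.
describe_status() {
    case "$1" in
        0) echo "on" ;;
        1) echo "off or disconnected" ;;
        2) echo "error calling the Node Assassin" ;;
        *) echo "unexpected exit code $1" ;;
    esac
}
```

For example, after running the agent with 'action=status', call 'describe_status $?' on the next line.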

monitor

Being a multi-port fence device, this simply calls 'list'.

list

This returns a CSV of the ports on your Node Assassin. Each node will be on a new line in the format 'node,alias' where the alias is read from the config file.
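
The 'node,alias' lines are easy to split in shell. This is a minimal sketch; the function name print_ports and the output wording are made up for illustration.

```shell
#!/bin/sh
# Sketch only: read 'node,alias' lines from standard input and print
# one readable line per port.
print_ports() {
    while IFS=, read -r node label || [ -n "$node" ]; do
        [ -n "$node" ] || continue            # skip empty lines
        printf 'port %s -> %s\n' "$node" "$label"
    done
}
```

Feed it the output of a 'list' call through a pipe. The sample input '01,an_node01' used to try it out is an assumption about what your aliases might look like, not real output.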

Command Line Arguments

Any command line arguments used by this fence agent are not dictated by the Fence Agent API. The following command line options are used to match the precedent set by existing fence agents for other devices.

Where it says that a command line argument "maps" to a given variable, it is referencing the cluster.conf file's arguments for Node Assassin.

-a <ip>

Maps the value to 'ipaddr'.

-h

Prints the help message and then exits.

-l <name>

Maps the value to 'name'.

-n <num>

Maps the value to 'port'.

-o <string>

Maps the value to 'action'.

-p <string>

Maps the value to 'passwd'.

-S <path>

Maps the value to 'passwd_script'.

NOTE: This is not used by Node Assassin (yet) and is simply ignored.

-q

Sets quiet mode. Only errors will be printed. Logging proceeds as normal.

-V

Prints the 'fence_na' version and the version of any attached Node Assassin(s) and exits.
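
The switch-to-variable mapping listed above can be sketched with the standard 'getopts' shell builtin. This is an illustration only, not how fence_na parses its arguments; the function name parse_cli and the 'mode' variable are hypothetical.

```shell
#!/bin/sh
# Sketch only: map the documented switches onto the cluster.conf names.
parse_cli() {
    OPTIND=1                                  # restart option parsing
    ipaddr=""; name=""; port=""; action=""
    passwd=""; passwd_script=""; quiet=0; mode="run"
    while getopts "a:hl:n:o:p:S:qV" opt; do
        case "$opt" in
            a) ipaddr=$OPTARG ;;
            l) name=$OPTARG ;;
            n) port=$OPTARG ;;
            o) action=$OPTARG ;;
            p) passwd=$OPTARG ;;
            S) passwd_script=$OPTARG ;;       # accepted, but not used (yet)
            q) quiet=1 ;;
            h) mode="help" ;;
            V) mode="version" ;;
            *) return 1 ;;                    # unknown switch
        esac
    done
    return 0
}
```

For example, 'parse_cli -a motoko.alteeve.com -l motoko -n 01 -o off -p secret' sets the same values that the example cluster.conf passes for an_node01.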

Notes

All verification of actions is done by checking the state of the node's "Power LED". For this reason, it is critical that you connect this feed.

The power and reset buttons are polarized. That is, you *MUST* connect the positive terminals from your mainboard's power and reset switches to the positive wires going to the Node Assassin.

IMPORTANT!

If you connect the power or reset buttons backwards, the circuit will be closed (that is, you will have pressed the button). This is by design!

ALWAYS TEST YOUR Node Assassin!

Specifically, after connecting a new node, be sure to manually send the on -> off -> on actions to make sure that the node is properly set up. This sequence will boot, fence and reboot the node and will require all functions to be working properly to succeed.

Agent Testing

To test the agent in a manner similar to how fenced calls it, copy the following into a file (e.g. args.txt):

# Test file used as input for the NA fence agent.
ipaddr=motoko.alteeve.com
port=1
login=motoko
passwd=secret
action=on

And cat it into the fence agent via a pipe:

clear; cat args.txt | ./fence_na
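
The same approach can drive the on -> off -> on sequence recommended in the notes. The following is a sketch under assumptions: the names FENCE_AGENT, run_action and test_port are hypothetical, and the ipaddr, login and passwd values are the example values from args.txt. Be aware that running it against a real agent will power-cycle the node.

```shell
#!/bin/sh
# Path to the fence agent; override by setting FENCE_AGENT beforehand.
FENCE_AGENT=${FENCE_AGENT:-./fence_na}

# Send one action to one port, using the example values from args.txt.
# $1 = port, $2 = action.
run_action() {
    printf 'ipaddr=%s\nport=%s\nlogin=%s\npasswd=%s\naction=%s\n' \
        "motoko.alteeve.com" "$1" "motoko" "secret" "$2" | "$FENCE_AGENT"
}

# Run on -> off -> on against a port and stop at the first failure.
# $1 = port.
test_port() {
    for act in on off on; do
        if ! run_action "$1" "$act"; then
            echo "action '$act' failed on port $1" >&2
            return 1
        fi
    done
    echo "port $1 passed on -> off -> on"
}
```

For example, 'test_port 1' exercises the node on port 1.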

 

Input, advice, complaints and meanderings all welcome!
Digimer digimer@alteeve.ca https://alteeve.ca/w legal stuff:  
All info is provided "As-Is". Do not use anything here unless you are willing and able to take responsibility for your own actions. © 1997-2013
Naming credits go to Christopher Olah!
In memory of Kettle, Tonia, Josh, Leah and Harvey. In special memory of Hannah, Jack and Riley.