AN!Cluster Tutorial 2


A typical Anvil! build-out

This paper has one goal:

  • Create an easy to use, fully redundant platform for virtual servers.

Oh, and do have fun!

What's New?

In the last two years, we've learned a lot about how to make an even more solid high-availability platform. We've created tools to make monitoring and management of the virtual servers and nodes trivially easy. This updated release of our tutorial brings these advances to you!

  • Many refinements to the cluster stack that protect against corner cases seen over the last two years.
  • Configuration naming convention changes to support the new Striker dashboard.
  • Addition of the AN!CM monitoring and alert system.
  • Security improved; SELinux and iptables are now enabled and used.
Note: Changes made on Apr. 3, 2015
  • New network interface, bond and bridge naming convention used.
  • New Anvil! and node naming convention.
    • ie: an-anvil-05 -> an-anvil-05, cn-a05n01 -> an-a05n01.
  • References to 'AN!CM' now point to 'Striker'.
  • Foundation pack host names have been expanded to be more verbose.
    • ie: an-s01 -> an-switch01, an-m01 -> an-striker01.

A Note on Terminology

In this tutorial, we will use the following terms:

  • Anvil!: This is our name for the HA platform as a whole.
  • Nodes: The physical hardware servers used as members in the cluster and which host the virtual servers.
  • Servers: The virtual servers themselves.
  • Compute Pack: This describes a pair of nodes that work together to power highly-available servers.
  • Foundation Pack: This describes the switches, PDUs and UPSes used to power and connect the nodes.
  • Striker Dashboard: This describes the equipment used for the Striker management dashboard.

Why Should I Follow This (Lengthy) Tutorial?

Following this tutorial is not the lightest undertaking. It is designed to teach you all the inner details of building an HA platform for virtual servers. When finished, you will have a detailed and deep understanding of what it takes to build a fully redundant, mostly fault-tolerant high-availability platform. Though lengthy, it is very worthwhile if you want to understand high-availability.

In any case, when finished, you will have the following benefits:

  • Totally open source. Everything. This guide and all software used is open!
  • You can host servers running almost any operating system.
  • The HA platform requires no access to the servers and no special software needs to be installed. Your users may well never know that they're on a virtual machine.
  • Your servers will operate just like servers installed on bare-iron machines. No special configuration is required. The high-availability components will be hidden behind the scenes.
  • The worst failures of core components, such as a mainboard failure in a node, will cause an outage of roughly 30 to 90 seconds.
  • Storage is synchronously replicated, guaranteeing that the total destruction of a node will cause no more data loss than a traditional server losing power.
  • Storage is replicated without the need for a SAN, reducing cost and providing total storage redundancy.
  • Live-migration of servers enables upgrading and node maintenance without downtime. No more weekend maintenance!
  • AN!CM, the "AN! Cluster Monitor", continually watches the HA stack and sends alerts for many events, from predictive hardware failure to simple live migrations, all from a single application.
  • Most failures are fault-tolerant and will cause no interruption in services at all.

Ask your local VMware or Microsoft Hyper-V sales person what they'd charge for all this. :)

High-Level Explanation of How HA Clustering Works

Note: This section is an adaptation of this post to the Linux-HA mailing list. If you find this section hard to follow, please don't worry. Each component is explained in the "Concepts" section below.

Before digging into the details, it might help to start with a high-level explanation of how HA clustering works.

Corosync uses the totem protocol for "heartbeat"-like monitoring of the other node's health. A token is passed around to each node, the node does some work (like acknowledging old messages and sending new ones), and then it passes the token on to the next node. This goes around and around all the time. Should a node not pass its token on after a short time-out period, the token is declared lost, an error count goes up and a new token is sent. If too many tokens are lost in a row, the node is declared lost.

Once the node is declared lost, the remaining nodes reform a new cluster. If enough nodes are left to form quorum (simple majority), then the new cluster will continue to provide services. In two-node clusters, like the ones we're building here, quorum is disabled so each node can work on its own.

Corosync itself only cares about who is a cluster member and making sure all members get all messages. What happens after the cluster reforms is up to the cluster manager, cman, and the resource group manager, rgmanager.

The first thing cman does after being notified that a node was lost is initiate a fence against the lost node. This is a process where the lost node is powered off by the healthy node (power fencing), or cut off from the network/storage (fabric fencing). In either case, the idea is to make sure that the lost node is in a known state. If this is skipped, the node could recover later and try to provide cluster services, not having realized that it was removed from the cluster. This could cause problems from confusing switches to corrupting data.

When rgmanager is told that membership has changed because a node died, it looks to see what services might have been lost. Once it knows what was lost, it looks at the rules it's been given and decides what to do. These rules are defined in the cluster.conf's <rm> element. We'll go into detail on this later.

In two-node clusters, there is also a chance of a "split-brain". Quorum has to be disabled, so it is possible for both nodes to think the other node is dead and both try to provide the same cluster services. By using fencing, after the nodes break from one another (which could happen with a network failure, for example), neither node will offer services until one of them has fenced the other. The faster node will win and the slower node will shut down (or be isolated). The survivor can then run services safely without risking a split-brain.

Once the dead/slower node has been fenced, rgmanager then decides what to do with the services that had been running on the lost node. Generally, this means restarting the services locally that had been running on the dead node. The details of this are decided by you when you configure the resources in rgmanager. As we will see with each node's local storage service, the service is not recovered but instead left stopped.

The Task Ahead

Before we start, let's take a few minutes to discuss clustering and its complexities.

A Note on Patience

When someone wants to become a pilot, they can't jump into a plane and try to take off. It's not that flying is inherently hard, but it requires a foundation of understanding. Clustering is the same in this regard; there are many different pieces that have to work together just to get off the ground.

You must have patience.

Like a pilot on their first flight, seeing a cluster come to life is a fantastic experience. Don't rush it! Do your homework and you'll be on your way before you know it.

Coming back to earth:

Many technologies can be learned by creating a very simple base and then building on it. The classic "Hello, World!" script created when first learning a programming language is an example of this. Unfortunately, there is no real analogue to this in clustering. Even the most basic cluster requires several pieces be in place and working well together. If you try to rush, by ignoring pieces you think are not important, you will almost certainly waste time. A good example is setting aside fencing, thinking that your test cluster's data isn't important. The cluster software has no concept of "test". It treats everything as critical all the time and will shut down if anything goes wrong.

Take your time, work through these steps, and you will have the foundation cluster sooner than you realize. Clustering is fun because it is a challenge.

Technologies We Will Use

  • Red Hat Enterprise Linux 6 (EL6); You can use a derivative like CentOS v6. Specifically, we're using 6.5.
  • Red Hat Cluster Services "Stable" version 3. This describes the following core components:
    • Corosync; Provides cluster communications using the totem protocol.
    • Cluster Manager (cman); Manages the starting, stopping and managing of the cluster.
    • Resource Manager (rgmanager); Manages cluster resources and services. Handles service recovery during failures.
    • Clustered Logical Volume Manager (clvm); Cluster-aware (disk) volume manager. Backs GFS2 filesystems and KVM virtual machines.
    • Global File System version 2 (gfs2); Cluster-aware, concurrently mountable file system.
  • Distributed Replicated Block Device (DRBD); Keeps shared data synchronized across cluster nodes.
  • KVM; Hypervisor that controls and supports virtual machines.
  • Alteeve's Niche! Cluster Dashboard and Cluster Monitor

A Note on Hardware

RX300 S7

Another new change is that Alteeve's Niche!, after years of experimenting with various hardware, has partnered with Fujitsu. We chose them because of the unparalleled quality of their equipment.

This tutorial can be used on any manufacturer's hardware, provided it meets the minimum requirements listed below. That said, we strongly recommend readers give Fujitsu's RX-line of servers a close look. We do not get a discount for this recommendation; we genuinely love the quality of their gear. The only technical argument for using Fujitsu hardware is that we do all our cluster stack monitoring software development on Fujitsu RX200 and RX300 servers, so we can say with confidence that the AN! software components will work well on their kit.

If you use any other hardware vendor and run into any trouble, please don't hesitate to contact us. We want to make sure that our HA stack works on as many systems as possible and will be happy to help out. Of course, all Alteeve code is open source, so contributions are always welcome, too!

System Requirements

The goal of this tutorial is to help you build an HA platform with zero single points of failure. In order to do this, certain minimum technical requirements must be met.

Bare minimum requirements:

  • Two servers (nodes), with requirements discussed in detail in the following sections;
  • Two switched PDUs; APC-brand is recommended, but any brand with a supported fence agent is fine
  • Two network switches

Recommended Hardware; A Little More Detail

The previous section covered the bare-minimum system requirements for following this tutorial. If you are looking to build an Anvil! for production, we need to discuss important considerations for selecting hardware.

The Most Important Consideration - Storage

There is probably no single consideration more important than choosing the storage you will use.

In our years of building Anvil! HA platforms, we've found no single issue more important than storage latency. This is true for all virtualized environments, in fact.

The problem is this:

Multiple servers on shared storage can cause particularly random storage access. Traditional hard drives have spinning platters and mechanical read/write heads on the ends of arms that sweep back and forth across the platter surfaces. These platters are broken up into "tracks" and each track is itself cut up into "sectors". When a server needs to read or write data, the hard drive needs to sweep the arm over the track it wants and then wait there for the sector it wants to pass underneath.

This time taken to get the read/write head onto the track and then wait for the sector to pass underneath is called "seek latency". How long this latency actually is depends on a few things:

  • How fast are the platters rotating? The faster the platter speed, the less time it takes for a sector to pass under the read/write head.
  • How fast can the read/write arms move, and how far do they have to travel between tracks? Highly random read/write requests can cause a lot of head travel and increase seek time.
  • How many read/write requests (IOPS) can your storage handle? If your storage can not process the incoming read/write requests fast enough, your storage can slow down or stall entirely.

When many people think about hard drives, they generally worry about maximum write speeds. For environments with many virtual servers, this is actually far less important than it might seem. Reducing latency to ensure that read/write requests don't back up is far more important. This is measured as the storage's IOPS performance. If too many requests back up in the cache, storage performance can collapse or stall out entirely.
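
If you want to get a feel for how a given storage configuration handles random I/O before committing to it, a tool like fio can help. This is only an illustrative sketch; the test file path, size and job parameters are arbitrary and should be tuned to your environment.

    # Measure random 4 KiB read/write IOPS and latency on the storage
    # backing the test file (adjust --filename and --size to point at
    # the storage you actually want to test).
    fio --name=random-rw --filename=/tmp/fio-test --size=1G \
        --rw=randrw --bs=4k --direct=1 --ioengine=libaio \
        --iodepth=32 --numjobs=4 --runtime=60 --time_based \
        --group_reporting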

This is particularly problematic when multiple servers try to boot at the same time. If, for example, a node with multiple servers dies, the surviving node will try to start the lost servers at nearly the same time. This causes a sudden dramatic rise in read requests and can cause all servers to hang entirely, a condition called a "boot storm".

Thankfully, this latency problem can be easily dealt with in one of three ways;

  1. Use solid-state drives. These have no moving parts, so there is less penalty for highly random read/write requests.
  2. Use fast platter drives and proper RAID controllers with write-back caching.
  3. Isolate each server onto dedicated platter drives.

Each of these solutions has benefits and downsides;

  • Fast drives + write-back caching
    • Pro: 15,000rpm SAS drives are extremely reliable and the high rotation speeds minimize latency caused by waiting for sectors to pass under the read/write heads. Using multiple drives in RAID level 5 or level 6 breaks up reads and writes into smaller pieces, allowing requests to be serviced quickly and helping keep the read/write buffer empty. Write-back caching allows RAM-like write speeds and the ability to re-order disk access to minimize head movement.
    • Con: The main drawback is the number of disks needed to get effective performance gains from striping. Alteeve always uses a minimum of six disks, but many entry-level servers support a maximum of four drives. You need to account for the number of disks you plan to use when selecting your hardware.
  • SSDs
    • Pro: They have no moving parts, so read and write requests do not have to wait for mechanical movements to happen, drastically reducing latency. The minimum number of drives for an SSD-based configuration is two.
    • Con: Solid state drives use NAND flash, which can only be written to a finite number of times. All drives in our Anvil! will be written to roughly the same amount, so hitting this write limit could mean that all drives in both nodes fail at nearly the same time. Avoiding this requires careful monitoring of the drives and replacing them before their write limits are hit.
    • Note: Enterprise-grade SSDs are designed to handle highly random, multi-threaded workloads and come at a significant cost. Consumer-grade SSDs are designed principally for single-threaded, large accesses and do not offer the same benefits.
  • Isolated storage
    • Pro: Dedicating hard drives to virtual servers avoids the highly random read/write issues found when multiple servers share the same storage. This allows for the safe use of inexpensive hard drives, and it means that dedicated hardware RAID controllers with battery-backed cache are not needed, making it possible to save a good amount of money in the hardware design.
    • Con: The obvious downside to isolated storage is that you significantly limit the number of servers you can host on your Anvil!. If you only need to support one or two servers, this should not be an issue.

The last piece to consider is the interface of the drives used, be they SSDs or traditional HDDs. The two common interface types are SATA and SAS.

  • SATA HDDs generally have a platter speed of 7,200rpm. The SATA interface has a limited instruction set and provides minimal health reporting. These are "consumer" grade devices that are far less expensive, and far less reliable, than SAS drives.
  • SAS drives are generally aimed at the enterprise environment and are built to much higher quality standards. SAS HDDs have rotational speeds of up to 15,000rpm and can handle far more read/write operations per second. Enterprise SSDs using the SAS interface are also much more reliable than their consumer-grade counterparts. The main downside to SAS drives is their cost.

In all production environments, we strongly, strongly recommend SAS-connected drives. For non-production environments, SATA drives are fine.

Extra Security - LSI SafeStore

If security is a particular concern of yours, then you can look at using self-encrypting hard drives along with LSI's SafeStore option. An example hard drive, which we've tested and validated, would be the Seagate ST1800MM0038 drives. In general, if the drive advertises "SED" support, it should work fine.

This option provides the ability to:

  • Encrypt all data with AES-256 grade encryption without a performance hit.
  • Require a pass phrase on boot to decrypt the server's data.
  • Protect the contents of the drives while "at rest" (ie: while being shipped somewhere).
  • Execute a self-destruct sequence.

Obviously, most users won't need this, but it might be useful to some users in sensitive environments like embassies in less than friendly host countries.

RAM - Preparing for Degradation

RAM is a far simpler topic than storage, thankfully. Here, all you need to do is add up how much RAM you plan to assign to servers, add at least 2 GiB for the host (we recommend 4), and then install that much memory in both of your nodes.
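
For example, if you plan to host four servers with 8 GiB of RAM each, you would want at least (4 x 8 GiB) + 4 GiB = 36 GiB per node; in practice, that means installing the next practical DIMM configuration above that, such as 48 GiB, in each node.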

In production, there are two technologies you will want to consider;

  • ECC (error-correcting code) RAM provides the ability to recover from single-bit errors. If you are familiar with how parity in RAID arrays works, ECC in RAM is the same idea. This is often included in server-class hardware by default. It is highly recommended.
  • Memory Mirroring is, continuing our storage comparison, RAID level 1 for RAM. All writes to memory go to two different chips. Should one fail, the contents of the RAM can still be read from the surviving module.

Never Over-Provision!

"Over-provisioning", also called "thin provisioning" is a concept made popular in many "cloud" technologies. It is a concept that has almost no place in HA environments.

A common example is creating virtual disks of a given apparent size, but which only pull space from real storage as needed. So if you created a "thin" virtual disk that was 80 GiB large, but only 20 GiB worth of data was used, only 20 GiB from the real storage would be used.

In essence; Over-provisioning is where you allocate more resources to servers than the nodes can actually provide, banking on the hope that most servers will not use all of the resources allocated to them. The danger here, and the reason it has almost no place in HA, is that if the servers collectively use more resources than the nodes can provide, something is going to crash.

CPU Cores - Possibly Acceptable Over-Provisioning

Over-provisioning of RAM and storage is never acceptable in an HA environment, as mentioned. Over-allocating CPU cores, though, can be acceptable.

When selecting which CPUs to use in your nodes, the number of cores and the speed of the cores will determine how much computational horse-power you have to allocate to your servers. The main considerations are:

  • Core speed; Any given "thread" can be processed by a single CPU core at a time. The faster the given core is, the faster it can process any given request. Many applications do not support multithreading, meaning that the only way to improve performance is to use faster cores, not more cores.
  • Core count; Some applications support breaking up jobs into many threads, and passing them to multiple CPU cores at the same time for simultaneous processing. This way, the application feels faster to users because each CPU has to do less work to get a job done. Another benefit of multiple cores is that if one application consumes the processing power of a single core, other cores remain available for other applications, preventing processor congestion.

In processing, each CPU "core" can handle one program "thread" at a time. Since the earliest days of multitasking, operating systems have been able to handle threads waiting for a CPU resource to free up. So the risk of over-provisioning CPUs is restricted to performance issues only.

If you're building an Anvil! to support multiple servers and it's important that, no matter how busy the other servers are, the performance of each server can not degrade, then you need to be sure you have as many real CPU cores as you plan to assign to servers.

So for example, if you plan to have three servers and you plan to allocate each server four virtual CPU cores, you need a minimum of 13 real CPU cores (3 servers x 4 cores each plus at least one core for the node). In this scenario, you will want to choose servers with dual 8-core CPUs, for a total of 16 available real CPU cores. You may choose to buy two 6-core CPUs, for a total of 12 real cores, but you risk congestion still. If all three servers fully utilize their four cores at the same time, the host OS will be left with no available core for its software, which manages the HA stack.

In many cases, however, risking a performance loss under periods of high CPU load is acceptable. In these cases, allocating more virtual cores than you have real cores is fine. Should the load of the servers climb to a point where all real cores are under 100% utilization, then some applications will slow down as they wait for their turn in the CPU.

In the end, the decision whether to over-provision CPU cores or not, and if so by how much, is up to you, the reader. Remember to consider balancing out faster cores with the number of cores. If your expected load will be short bursts of computationally intense jobs, then few-but-faster cores may be the best solution.

A Note on Hyper-Threading

Intel's hyper-threading technology can make a CPU appear to the OS to have twice as many cores as it actually has. For example, a CPU listed as "4c/8t" (four cores, eight threads) will appear to the node as an 8-core CPU. In fact, you only have four real cores; the additional four are logical cores that try to make more efficient use of each physical core.

Simply put, the idea behind this technology is to "slip in" a second thread when the CPU would otherwise be idle. For example, if the core has to wait for memory to be fetched for the currently active thread, instead of sitting idle, it works on a thread from the second logical core.

How much benefit this gives you in the real world is debatable and highly dependent on your applications. For the purposes of HA, it's recommended not to count the "HT cores" as real cores. That is to say, when calculating load, treat a "4c/8t" CPU as a 4-core CPU.
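
If you're not sure how many physical cores a node actually has, lscpu reports physical and logical counts separately. A quick sketch of the fields to look at (your numbers will differ):

    # Show physical cores versus logical (hyper-threaded) CPUs.
    # Physical cores = Socket(s) x Core(s) per socket.
    lscpu | grep -E '^(Socket|Core|Thread|CPU)\(s\)'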

Six Network Interfaces, Seriously?

Yes, seriously.

Obviously, you could put everything on a single network card and the HA software would still work, but it is not advised.

We will go into the network configuration at length later on. For now, here's an overview:

  • Each network needs two links in order to be fault-tolerant. One link will go to the first network switch and the second link will go to the second network switch. This way, the failure of a network cable, port or switch will not interrupt traffic.
  • There are three main networks in an Anvil!;
    • Back-Channel Network; This is used by the cluster stack and is sensitive to latency. Delaying traffic on this network can cause the nodes to "partition", breaking the cluster stack.
    • Storage Network; All disk writes will travel over this network. As such, it is easy to saturate this network. Sharing this traffic with other services would mean that it's very possible to significantly impact network performance under high disk write loads. For this reason, it is isolated.
    • Internet-Facing Network; This network carries traffic to and from your servers. By isolating this network, users of your servers will never experience performance loss during storage or cluster high loads. Likewise, if your users place a high load on this network, it will not impact the ability of the Anvil! to function properly. It also isolates untrusted network traffic.

So, three networks, each using two links for redundancy, means that we need six network interfaces. It is strongly recommended that you use three separate dual-port network cards. Using a single network card, as we will discuss in detail later, leaves you vulnerable to losing entire networks should the controller fail.
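
As a preview of what this looks like once configured (the bond name below is only an example; the actual naming convention is covered later), the Linux bonding driver reports which physical link is currently active and whether its backup is healthy:

    # Show the state of a bonded pair and its two member links.
    cat /proc/net/bonding/bond0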

A Note on Dedicated IPMI Interfaces

Some server manufacturers provide access to IPMI using the same physical interface as one of the on-board network cards. Usually these companies provide optional upgrades to break the IPMI connection out to a dedicated network connector.

Whenever possible, it is recommended that you go with a dedicated IPMI connection.

We've found that it is rarely, if ever, possible for a node to talk to its own IPMI interface when using a shared physical port. This is not strictly a problem, but testing and diagnostics are certainly easier when the node can ping and query its own IPMI interface over the network.
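
A quick sanity check, once a node is racked, is to confirm the BMC's network settings from the host and then try to reach it over the LAN. The channel number, IP address and credentials below are examples only:

    # Show the BMC's network configuration from the host itself.
    ipmitool lan print 1

    # Confirm the BMC answers on the network (example address and
    # credentials; run from the peer node, or from the node itself
    # when using a dedicated IPMI port).
    ping -c 3 10.20.51.1
    ipmitool -I lanplus -H 10.20.51.1 -U admin -P secret chassis status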

Network Switches

The ideal switches for an HA cluster are a pair of stackable, managed switches. At the very least, a pair of switches that support VLANs is recommended. None of this is strictly required, but here are the reasons these features are recommended:

  • VLANs allow for totally isolating the BCN, SN and IFN traffic. This adds security and reduces broadcast traffic.
  • Managed switches provide a unified interface for configuring both switches at the same time. This drastically simplifies complex configurations, like setting up VLANs that span the physical switches.
  • Stacking provides a link between the two switches that effectively makes them work like one. Generally, the bandwidth available in the stack cable is much higher than the bandwidth of individual ports. This provides a high-speed link for all three VLANs in one cable and it allows for multiple links to fail without risking performance degradation. We'll talk more about this later.

Beyond these suggested features, there are a few other things to consider when choosing switches:

  • MTU size:
    1. The default packet size on a network is 1500 bytes. If you build your VLANs in software, you need to account for the extra size needed by the VLAN header. If your switch supports "jumbo frames", there should be no problem. However, some cheap switches do not support jumbo frames, requiring you to reduce the MTU value on your nodes' interfaces.
    2. If you have particularly large chunks of data to transmit, you may want to enable the largest MTU possible. This maximum value is determined by the smallest MTU in your network equipment. If you have nice network cards that support the traditional 9 KiB jumbo MTU, but you have a cheap switch that supports only a smaller jumbo frame, say 4 KiB, your effective MTU is 4 KiB. (See the verification example after this list.)
  • Packets per second: This is a measure of how many packets can be routed per second, and is generally a reflection of the switch's processing power and memory. Cheaper switches will not have the ability to route a high number of packets at the same time, potentially causing congestion.
  • Multicast groups: Some fancy switches, like some Cisco hardware, don't maintain multicast groups persistently. The cluster software uses multicast for communication, so if your switch drops a multicast group, it will cause your cluster to partition. If you have a managed switch, ensure that persistent multicast groups are enabled. We'll talk more about this later.
  • Port speed and count versus internal fabric bandwidth: A switch with, say, 48 gigabit ports may not be able to route 48 Gbps of traffic. This is a problem similar to the over-provisioning we discussed above. If an inexpensive 48-port switch has an internal switch fabric of only 20 Gbps, then it can handle only up to 20 saturated ports at a time. Be sure to review the internal fabric capacity and make sure it is high enough to handle all connected interfaces running at full speed. Note, of course, that only one link in a given bond will be active at a time.
  • Uplink speed: If you have a gigabit switch and you simply link the ports between the two switches, the link speed will be limited to 1 Gbps. Normally, all traffic will be kept on one switch, so this is fine. If a single link fails over to the backup switch, then its traffic will bounce over the uplink cable to the main switch at full speed. However, if a second link fails, both will be sharing the single gigabit uplink, so there is a risk of congestion on the link. If you can't get stacked switches, whose stacking links generally run at 10 Gbps or higher, then look for switches with dedicated 10 Gbps uplink ports and use those for the uplinks.
  • Uplinks and VLANs: When using normal ports for uplinks with VLANs defined in the switch, each uplink port will be restricted to the VLAN it is a member of. In this case, you will need one uplink cable per VLAN.
  • Port trunking: If your existing network supports it, choosing a switch with port trunking provides a backup link from the foundation pack switches to the main network. This extends the network redundancy out to the rest of your network.
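
If you do raise the MTU, it is worth verifying end-to-end that jumbo frames actually pass through your switches. A sketch, with an example interface name and peer address:

    # Raise the MTU on an interface (example name and size).
    ip link set dev eth0 mtu 9000

    # Send a non-fragmentable packet sized for a 9000-byte MTU
    # (9000 - 28 bytes of IP/ICMP headers = 8972). If the switch
    # path can't carry jumbo frames, the ping will fail.
    ping -M do -s 8972 -c 3 10.10.1.2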

There are numerous other valid considerations when choosing network switches for your Anvil!. These are the most pertinent ones, though.

Why Switched PDUs?

We will discuss this in detail later on, but in short, when a node stops responding, we can not simply assume that it is dead. To do so would be to risk a "split-brain" condition which can lead to data divergence, data loss and data corruption.

To deal with this, we need a mechanism for putting a node that is in an unknown state into a known state, a process called "fencing". Many people who build HA platforms use the IPMI interface for this purpose, as will we. The idea here is that, when a node stops responding, the surviving node connects to the lost node's IPMI interface and forces the machine to power off. The IPMI BMC is, effectively, a little computer inside the main computer, so it will work regardless of what state the node itself is in.
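
The same fence agents that the cluster calls can be run by hand, which is a good way to prove that IPMI fencing actually works before you need it. The address and credentials below are examples only:

    # Ask the peer's BMC for its power state via the fence agent.
    fence_ipmilan -a 10.20.51.2 -l admin -p secret -o status

    # The cluster would use '-o reboot' (or '-o off') when actually
    # fencing; only run that against a node you're prepared to kill.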

Once the node has been confirmed to be off, the services that had been running on it can be restarted on the remaining good node, safe in knowing that the lost peer is not also hosting these services. In our case, these "services" are the shared storage and the virtual servers.

There is a problem with this though. Actually, two.

  1. The IPMI BMC draws its power from the same power source as the server itself. If the host node loses power entirely, IPMI goes down with the host.
  2. The IPMI BMC has a single network interface and it is a single device.

If we relied on IPMI-based fencing alone, we'd have a single point of failure. If the surviving node can not put the lost node into a known state, it will intentionally hang. The logic being that a hung cluster is better than risking corruption or a split-brain. This means that, with IPMI-based fencing alone, the loss of power to a single node would not be automatically recoverable.

That just will not do!

To make fencing redundant, we will use switched PDUs. Think of these as network-connected power bars.

Imagine now that one of the nodes blew itself up. The surviving node would try to connect to its IPMI interface and, of course, get no response. Then it would log into both PDUs (one behind either side of the redundant power supplies) and cut the power going to the node. By doing this, we now have a way of putting a lost node into a known state.
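
Switched PDUs are driven the same way, through their own fence agent. Below is a sketch using the APC agent; the address, credentials and outlet number are examples only:

    # Check the state of outlet 1 on the first PDU.
    fence_apc -a 10.20.2.1 -l apc -p apc -n 1 -o status

    # To fence, the cluster would call '-o off' against the outlets
    # feeding both of the lost node's power supplies, on both PDUs.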

So now, no matter how badly things go wrong, we can always recover!

Network Managed UPSes Are Worth It

We have found that a surprising number of issues that affect service availability are power related. A network-connected smart UPS allows you to monitor the power coming from the building mains. Thanks to this, we've been able to detect far more than simple "lost power" events: failing transformers and regulators, over- and under-voltage events and so on. Catching these events ahead of time helps avoid full power outages. It also helps protect the rest of your gear that isn't behind a UPS.

So strictly speaking, you don't need network managed UPSes. However, we have found them to be worth their weight in gold. We will, of course, be using them in this tutorial.

Dashboard Servers

The Anvil! will be managed by Striker - Cluster Dashboard, running on a small dedicated machine. This can be a virtual machine on a laptop or desktop, or a little dedicated server. All that matters is that it can run RHEL or CentOS version 6 with a minimal desktop.

Normally, we set up a couple of ASUS EeeBox machines (for redundancy, of course) hanging off the back of a monitor. Users can connect to the dashboard using a browser from any device and easily control the servers and nodes from it. It also provides KVM-like access to the servers on the Anvil!, allowing users to work on the servers when they can't connect over the network. For this reason, you will probably want to pair the dashboard machines with a monitor that offers a decent resolution, to make it easy to see the desktops of the hosted servers.

What You Should Know Before Beginning

It is assumed that you are familiar with Linux systems administration, specifically Red Hat Enterprise Linux and its derivatives. You will need somewhat advanced networking experience as well. You should be comfortable working in a terminal (directly or over ssh). Familiarity with XML will help, but is not strictly required, as its use here is fairly self-evident.

If you feel a little out of depth at times, don't hesitate to set this tutorial aside. Browse over to the components you feel the need to study more, then return and continue on. Finally, and perhaps most importantly, you must have patience! If you have a manager asking you to "go live" with a cluster in a month, tell him or her that it simply won't happen. If you rush, you will skip important points and you will fail.

Patience is vastly more important than any pre-existing skill.

A Word on Complexity

Introducing the Fabimer principle:

Clustering is not inherently hard, but it is inherently complex. Consider:

  • Any given program has N bugs.
    • RHCS uses; cman, corosync, dlm, fenced, rgmanager, and many more smaller apps.
    • We will be adding DRBD, GFS2, clvmd, libvirtd and KVM.
    • Right there, we have N * 10 possible bugs. We'll call this A.
  • A cluster has Y nodes.
    • In our case, 2 nodes, each with 3 networks across 6 interfaces bonded into pairs.
    • The network infrastructure (Switches, routers, etc). We will be using two managed switches, adding another layer of complexity.
    • This gives us another Y * (2*(3*2))+2, the +2 for managed switches. We'll call this B.
  • Let's add the human factor. Let's say that a person needs roughly 5 years of cluster experience to be considered proficient. For each year less than this, add an "oops" factor: with Z years of experience, the factor is (5-Z) * 2. We'll call this C.
  • So, finally, add up the complexity, using this tutorial's layout, 0-years of experience and managed switches.
    • (N * 10) * (Y * (2*(3*2))+2) * ((5-0) * 2) == (A * B * C) == an-unknown-but-big-number.

This isn't meant to scare you away, but it is meant to be a sobering statement. Obviously, those numbers are somewhat artificial, but the point remains.

Any one piece is easy to understand, thus, clustering is inherently easy. However, given the large number of variables, you must really understand all the pieces and how they work together. DO NOT think that you will have this mastered and working in a month. Certainly don't try to sell clusters as a service without a lot of internal testing.

Clustering is kind of like chess. The rules are pretty straight forward, but the complexity can take some time to master.

Overview of Components

When looking at a cluster, there is a tendency to want to dive right into the configuration file. That is not very useful in clustering.

  • When you look at the configuration file, it is quite short.

Clustering isn't like most applications or technologies. Most of us learn by taking something such as a configuration file, and tweaking it to see what happens. I tried that with clustering and learned only what it was like to bang my head against the wall.

  • Understanding the parts and how they work together is critical.

You will find that the discussion on the components of clustering, and how those components and concepts interact, will be much longer than the initial configuration. It is true that we could talk very briefly about the actual syntax, but it would be a disservice. Please don't rush through the next section, or worse, skip it and go right to the configuration. You will waste far more time than you will save.

  • Clustering is easy, but it has a complex web of inter-connectivity. You must grasp this network if you want to be an effective cluster administrator!

Component; Cman

The cman portion of the cluster is the cluster manager. In the 3.0 series used in EL6, cman acts mainly as a quorum provider. That is, it adds up the votes from the cluster members and decides if there is a simple majority. If there is, the cluster is "quorate" and is allowed to provide cluster services.

The cman service will be used to start and stop all of the components needed to make the cluster operate.
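
In practice, this means cman is the service you start and stop, and cman_tool is how you inspect membership. A quick sketch:

    # Start the cluster manager (and everything it pulls in).
    /etc/init.d/cman start

    # List the current cluster members and their join status.
    cman_tool nodes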

Component; Corosync

Corosync is the heart of the cluster. Almost all other cluster components operate through it.

In Red Hat clusters, corosync is configured via the central cluster.conf file. In other cluster stacks, like pacemaker, it can be configured directly in corosync.conf, but given that we will be building an RHCS cluster, this is not used. We will only use cluster.conf. That said, almost all corosync.conf options are available in cluster.conf. This is important to note as you will see references to both configuration files when searching the Internet.

Corosync sends messages using multicast messaging by default. Recently, unicast support has been added, but due to network latency, it is only recommended for use with small clusters of two to four nodes. We will be using multicast in this tutorial.
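
Even though corosync is configured through cluster.conf in an RHCS cluster, its own tools still work for inspecting it. For example:

    # Show the status of the totem ring(s) corosync is using.
    corosync-cfgtool -s

    # Dump the configuration corosync actually loaded (useful for
    # confirming what cman generated from cluster.conf).
    corosync-objctl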

A Little History

Please see this article for a better discussion of the history of HA.

There were significant changes between the old RHCS version 2 and version 3, available on EL6, which we are using.

In the RHCS version 2, there was a component called openais which provided totem. The OpenAIS project was designed to be the heart of the cluster and was based around the Service Availability Forum's Application Interface Specification. AIS is an open API designed to provide inter-operable high availability services.

In 2008, it was decided that the AIS specification was overkill for most clustered applications being developed in the open source community. At that point, OpenAIS was split into two projects: Corosync and OpenAIS. The former, Corosync, provides totem, cluster membership, messaging, and basic APIs for use by clustered applications, while the OpenAIS project became an optional add-on to corosync for users who want the full AIS API.

You will see a lot of references to OpenAIS while searching the web for information on clustering. Understanding its evolution will hopefully help you avoid confusion.

The Future of Corosync

In EL6, corosync is version 1.4. Upstream, however, it has passed version 2. One of the major changes in version 2+ is that corosync becomes a quorum provider, helping to remove the need for cman. If you experiment with clustering on Fedora, for example, you will find that cman is gone entirely.

Concept; Quorum

Quorum is defined as the minimum set of hosts required in order to provide clustered services and is used to prevent split-brain situations.

The quorum algorithm used by the RHCS cluster is called "simple majority quorum", which means that more than half of the hosts must be online and communicating in order to provide service. While simple majority quorum is a very common quorum algorithm, other quorum algorithms exist (grid quorum, YKD Dynamic Linear Voting, etc.).

The idea behind quorum is that, when a cluster splits into two or more partitions, whichever group of machines has quorum can safely start clustered services, knowing that no other lost nodes will try to do the same.

Take this scenario:

  • You have a cluster of four nodes, each with one vote.
    • The cluster's expected_votes is 4. A clear majority, in this case, is 3, because floor(4/2) + 1 = 3.
    • Now imagine that there is a failure in the network equipment and one of the nodes disconnects from the rest of the cluster.
    • You now have two partitions; One partition contains three machines and the other partition has one.
    • The three machines will have quorum, and the other machine will lose quorum.
    • The partition with quorum will reconfigure and continue to provide cluster services.
    • The partition without quorum will withdraw from the cluster and shut down all cluster services.

When the cluster reconfigures, the partition that wins quorum will fence the node(s) in the partition without quorum. Once the fencing has been confirmed successful, the partition with quorum will begin accessing clustered resources, like shared filesystems.

This also helps explain why an even 50% is not enough to have quorum, a common question for people new to clustering. Using the above scenario, imagine if the split were 2 nodes and 2 nodes. Because neither partition can be sure what the other will do, neither can safely proceed. If we allowed an even 50% to have quorum, both partitions might try to take over the clustered services and disaster would soon follow.

There is one, and only one, exception to this rule.

In the case of a two-node cluster, as we will be building here, any failure results in a 50/50 split. If we enforced quorum in a two-node cluster, there would never be high availability, because any failure would cause both nodes to withdraw. The risk with this exception is that we now place the entire safety of the cluster on fencing, a concept we will cover in a moment. Fencing is a second line of defense and something we are loath to rely on alone.

Even in a two-node cluster though, proper quorum can be maintained by using a quorum disk, called a qdisk. Unfortunately, qdisk on a DRBD resource comes with its own problems, so we will not be able to use it here.
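
Once the cluster is running, the quorum math above is visible directly in cman_tool's output. A sketch of the fields to look for:

    # Show vote counts and whether the cluster is currently quorate.
    cman_tool status | grep -i -e quorum -e votes -e nodes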

Concept; Virtual Synchrony

Many cluster operations, like distributed locking and so on, have to occur in the same order across all nodes. This concept is called "virtual synchrony".

This is provided by corosync using "closed process groups", CPG. A closed process group is simply a private group of processes in a cluster. Within this closed group, all messages between members are ordered. Delivery, however, is not guaranteed. If a member misses messages, it is up to the member's application to decide what action to take.

Let's look at two scenarios showing how locks are handled using CPG:

  • The cluster starts up cleanly with two members.
  • Both members are able to start service:foo.
  • Both want to start it, but need a lock from DLM to do so.
    • The an-a05n01 member has its totem token, and sends its request for the lock.
    • DLM issues a lock for that service to an-a05n01.
    • The an-a05n02 member requests a lock for the same service.
    • DLM rejects the lock request.
  • The an-a05n01 member successfully starts service:foo and announces this to the CPG members.
  • The an-a05n02 member sees that service:foo is now running on an-a05n01 and no longer tries to start the service.
  • The two members want to write to a common area of the /shared GFS2 partition.
    • The an-a05n02 member sends a request for a DLM lock against the filesystem and gets it.
    • The an-a05n01 member sends a request for the same lock, but DLM sees that a lock is pending and rejects the request.
    • The an-a05n02 member finishes altering the file system, announces the changes over CPG and releases the lock.
    • The an-a05n01 member updates its view of the filesystem, requests a lock, receives it and proceeds to update the filesystem.
    • It completes the changes, announces the changes over CPG and releases the lock.

Messages can only be sent to the members of the CPG while the node has a totem token from corosync.

Concept; Fencing

Warning: DO NOT BUILD A CLUSTER WITHOUT PROPER, WORKING AND TESTED FENCING.
Laugh, but this is a weekly conversation.

Fencing is an absolutely critical part of clustering. Without fully working fence devices, your cluster will fail.

Sorry, I promise that this will be the only time that I speak so strongly. Fencing really is critical, and explaining the need for fencing is nearly a weekly event.

So then, let's discuss fencing.

When a node stops responding, an internal timeout and counter start ticking away. During this time, no DLM locks are allowed to be issued. Anything using DLM, including rgmanager, clvmd and gfs2, is effectively hung. The hung node is detected using a totem token timeout. That is, if a token is not received from a node within a period of time, it is considered lost and a new token is sent. After a certain number of lost tokens, the cluster declares the node dead. The remaining nodes reconfigure into a new cluster and, if they have quorum (or if quorum is ignored), a fence call against the silent node is made.

The fence daemon will look at the cluster configuration and get the fence devices configured for the dead node. Then, one at a time and in the order that they appear in the configuration, the fence daemon will call those fence devices, via their fence agents, passing to the fence agent any configured arguments like username, password, port number and so on. If the first fence agent returns a failure, the next fence agent will be called. If the second fails, the third will be called, then the fourth and so on. Once the last (or perhaps only) fence device fails, the fence daemon will retry, starting back at the top of the list. It will do this indefinitely until one of the fence devices succeeds.

Here's the flow, in point form:

  • The totem token moves around the cluster members. As each member gets the token, it sends sequenced messages to the CPG members.
  • The token is passed from one node to the next, in order and continuously during normal operation.
  • Suddenly, one node stops responding.
    • A timeout starts (~238ms by default) and, each time the timeout is hit, an error counter increments and a replacement token is created.
    • The silent node responds before the failure counter reaches the limit.
      • The failure counter is reset to 0
      • The cluster operates normally again.
  • Again, one node stops responding.
    • Again, the timeout begins. As each totem token times out, a new packet is sent and the error count increments.
    • The error counts exceed the limit (4 errors is the default); Roughly one second has passed (238ms * 4 plus some overhead).
    • The node is declared dead.
    • The cluster checks which members it still has, and if that provides enough votes for quorum.
      • If there are too few votes for quorum, the cluster software freezes and the node(s) withdraw from the cluster.
      • If there are enough votes for quorum, the silent node is declared dead.
        • corosync calls fenced, telling it to fence the node.
        • The fenced daemon notifies DLM and locks are blocked.
        • Which fence device(s) to use, that is, what fence_agent to call and what arguments to pass, is gathered.
        • For each configured fence device:
          • The agent is called and fenced waits for the fence_agent to exit.
          • The fence_agent's exit code is examined. If it's a success, recovery starts. If it failed, the next configured fence agent is called.
        • If all (or the only) configured fence devices fail, fenced will start over.
        • fenced will wait and loop forever until a fence agent succeeds. During this time, the cluster is effectively hung.
      • Once a fence_agent succeeds, fenced notifies DLM and lost locks are recovered.
        • GFS2 partitions recover using their journal.
        • Lost cluster resources are recovered as per rgmanager's configuration (including file system recovery as needed).
  • Normal cluster operation is restored, minus the lost node.

This skipped a few key things, but the general flow of logic should be there.

This is why fencing is so important. Without a properly configured and tested fence device or devices, the cluster will never successfully fence and the cluster will remain hung until a human can intervene.
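
Once fencing is configured, test it deliberately rather than waiting for a real failure. The cluster's own fence path can be exercised with fence_node (the node name below follows this tutorial's naming convention):

    # From the healthy node, fence the peer exactly as the cluster
    # would. Expect the target to power off or reboot!
    fence_node an-a05n01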

Is "Fencing" the same as STONITH?

Yes.

In the old days, there were two distinct open-source HA clustering stacks. The Linux-HA project used the term "STONITH", an acronym for "Shoot The Other Node In The Head", for fencing. Red Hat's cluster stack used the term "fencing" for the same concept.

We prefer the term "fencing" because the fundamental goal is to put the target node into a state where it can not affect cluster resources or provide clustered services. This can be accomplished by powering it off, called "power fencing", or by disconnecting it from the SAN storage and/or network, a process called "fabric fencing".

The term "STONITH", based on its acronym, implies power fencing. This is not a big deal, but it is the reason this tutorial sticks with the term "fencing".

Component; Totem

The totem protocol defines message passing within the cluster and it is used by corosync. A token is passed around all the nodes in the cluster, and nodes can only send messages while they have the token. A node will keep its messages in memory until it gets the token back with no "not ack" messages. This way, if a node missed a message, it can request it be resent when it gets its token. If a node isn't up, it will simply miss the messages.

The totem protocol supports something called RRP, the Redundant Ring Protocol. Through RRP, you can add a second, backup ring on a separate network to take over in the event of a failure in the first ring. In RHCS, these rings are known as "ring 0" and "ring 1". RRP is being re-introduced in RHCS version 3. It is still experimental and should only be used after plenty of testing.

Component; Rgmanager

When the cluster membership changes, corosync tells rgmanager that it needs to recheck its services. It will examine what changed and then start, stop, migrate or recover cluster resources as needed.

Within rgmanager, one or more resources are brought together as a service. This service is then optionally assigned to a failover domain, a subset of nodes that can have preferential ordering.

The rgmanager daemon runs separately from the cluster manager, cman. This means that, to fully start the cluster, we need to start both cman and then rgmanager.
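
Day to day, rgmanager is driven with two small tools: clustat to see the state of services and clusvcadm to start, stop or relocate them. The service name below matches the service:foo example used earlier:

    # Show cluster membership and the state of managed services.
    clustat

    # Enable a service on a specific member, then disable it.
    clusvcadm -e service:foo -m an-a05n02
    clusvcadm -d service:foo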

What about Pacemaker?

Pacemaker is also a resource manager, like rgmanager. You can not use both in the same cluster.

Back prior to 2008, there were two distinct open-source cluster projects:

  • Red Hat's Cluster Service
  • Linux-HA's Heartbeat

Pacemaker was born out of the Linux-HA project as an advanced resource manager that could use either heartbeat or openais for cluster membership and communication. Unlike RHCS and heartbeat, its sole focus was resource management.

In 2008, plans were made to begin the slow process of merging the two independent stacks into one. As mentioned in the corosync overview, it replaced openais and became the default cluster membership and communication layer for both RHCS and Pacemaker. Development of heartbeat was ended, though Linbit continues to maintain the heartbeat code to this day.

The fence and resource agents, software that acts as a glue between the cluster and the devices and resource they manage, were merged next. You can now use the same set of agents on both pacemaker and RHCS.

Red Hat introduced pacemaker as "Tech Preview" in RHEL 6.0. It has been available beside RHCS ever since, though support is not offered yet*.

Note: Pacemaker entered full support with the release of RHEL 6.5. It is also the only available HA stack on RHEL 7 beta. This is a strong indication that, indeed, corosync and pacemaker will be the future HA stack on RHEL.

Red Hat has a strict policy of not saying what will happen in the future. That said, the speculation is that Pacemaker will become supported soon and will replace rgmanager entirely in RHEL 7, given that cman and rgmanager no longer exist upstream in Fedora.

So why don't we use pacemaker here?

We believe that, no matter how promising software looks, stability is king. Pacemaker on other distributions has been stable and supported for a long time. However, on RHEL, it's a recent addition and the developers have been doing a tremendous amount of work on pacemaker and associated tools. For this reason, we feel that on RHEL 6, pacemaker is too much of a moving target at this time. That said, we do intend to switch to pacemaker some time in the next year or two, depending on how the Red Hat stack evolves.

Component; Qdisk

Note: qdisk does not work reliably on a DRBD resource, so we will not be using it in this tutorial.

A quorum disk, known as a qdisk, is a small partition on SAN storage used to enhance quorum. It generally carries enough votes to allow even a single node to take quorum during a cluster partition. It does this by using configured heuristics, that is, custom tests, to decide which node or partition is best suited to provide clustered services during a cluster reconfiguration. These heuristics can be simple, like testing which partition has access to a given router, or they can be as complex as the administrator wishes, using custom scripts.

Though we won't be using it here, it is well worth knowing about when you move to a cluster with SAN storage.

Component; DRBD

DRBD, the Distributed Replicated Block Device, is a technology that takes raw storage from two nodes and keeps their data synchronized in real time. It is sometimes described as "network RAID level 1", and that is conceptually accurate. In this tutorial's cluster, DRBD will be used to provide the back-end storage as a cost-effective alternative to a traditional SAN device.

DRBD is, fundamentally, a raw block device. If you've ever used mdadm to create a software RAID array, then you will be familiar with this.

Think of it this way;

With traditional software raid, you would take:

  • /dev/sda5 + /dev/sdb5 -> /dev/md0

With DRBD, you have this:

  • node1:/dev/sda5 + node2:/dev/sda5 -> both:/dev/drbd0

In both cases, as soon as you create the new md0 or drbd0 device, you pretend like the member devices no longer exist. You format a filesystem onto /dev/md0, use /dev/drbd0 as an LVM physical volume, and so on.

The main difference with DRBD is that /dev/drbd0 will always be the same on both nodes. If you write something to it on node 1, it's instantly available on node 2, and vice versa. Of course, this means that whatever you put on top of DRBD has to be "cluster aware". That is to say, the program or file system using the new /dev/drbd0 device has to understand that the contents of the disk might change because of another node.
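
On a running node, the state of every DRBD resource, including its connection state and whether the peers are in sync, can be read from /proc/drbd:

    # Show DRBD resource, connection and disk states on this node.
    cat /proc/drbd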

Component; DLM

One of the major roles of a cluster is to provide distributed locking for clustered storage and resource management.

Whenever a resource, GFS2 filesystem or clustered LVM LV needs a lock, it sends a request to dlm_controld, which runs in userspace. This communicates with DLM in the kernel. If the lockspace does not yet exist, DLM will create it and then give the lock to the requester. Should a subsequent lock request come in for the same lockspace, it will be rejected. Once the application using the lock is finished with it, it will release the lock. After this, another node may request and receive a lock for the lockspace.

If a node fails, fenced will alert dlm_controld that a fence is pending and new lock requests will block. After a successful fence, fenced will alert DLM that the node is gone and any locks the victim node held are released. At this time, other nodes may request a lock on the lockspaces the lost node held and can perform recovery, like replaying a GFS2 filesystem journal, prior to resuming normal operation.

Note that DLM locks are not used for actually locking the file system. That job is still handled by plock() calls (POSIX locks).
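
Once the cluster and clustered storage are running, you can see the lockspaces for yourself; the dlm_tool utility, installed below as part of the cluster stack, will list them (you should expect to see, for example, one lockspace for clvmd and one per mounted GFS2 filesystem). This is simply a handy sanity check, not a required step:

dlm_tool ls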

Component; Clustered LVM

With DRBD providing the raw storage for the cluster, we must next consider partitions. This is where Clustered LVM, known as CLVM, comes into play.

CLVM is ideal in that it uses DLM, the distributed lock manager, so it won't allow access to cluster members outside of corosync's closed process group, which, in turn, requires quorum.

It is ideal because it can take one or more raw devices, known as "physical volumes", or simply as PVs, and combine their raw space into one or more "volume groups", known as VGs. These volume groups then act just like a typical hard drive and can be "partitioned" into one or more "logical volumes", known as LVs. These LVs are where KVM's virtual machine guests will exist and where we will create our GFS2 clustered file system.

LVM is particularly attractive because of how flexible it is. We can easily add new physical volumes later, and then grow an existing volume group to use the new space. This new space can then be given to existing logical volumes, or entirely new logical volumes can be created. This can all be done while the cluster is online, offering an upgrade path with no downtime.
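
As a rough sketch of the kind of commands involved (the actual storage build happens later in this tutorial, and the size used here is purely illustrative), turning a DRBD device into a clustered volume group and carving out a logical volume looks like this. The -c y switch is what marks the volume group as clustered, and because it is clustered, the commands only need to be run on one node; the result is immediately visible on the peer.

an-a05n01
pvcreate /dev/drbd0
vgcreate -c y an-a05n01_vg0 /dev/drbd0
lvcreate -L 40G -n shared an-a05n01_vg0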

Component; GFS2

With DRBD providing the cluster's raw storage space, and Clustered LVM providing the logical partitions, we can now look at the clustered file system. This is the role of the Global File System version 2, known simply as GFS2.

It works much like a standard filesystem, with user-land tools like mkfs.gfs2, fsck.gfs2 and so on. The major difference is that it and clvmd use the cluster's distributed locking mechanism provided by the dlm_controld daemon. Once formatted, the GFS2 partition can be mounted and used by any node in the cluster's closed process group. All nodes can then safely read from and write to the data on the partition simultaneously.

Note: GFS2 is only supported when run on top of Clustered LVM LVs. This is because, in certain failure states, gfs2_controld will call dmsetup to disconnect the GFS2 partition from its storage.
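
As a preview (the real formatting step comes later in this tutorial), creating a GFS2 filesystem ties it to the cluster name and to the DLM. Assuming our cluster is named an-anvil-05 and we're formatting the "shared" logical volume shown in the map below, the command looks roughly like this, where -p selects the lock manager, -j 2 creates one journal per node and -t sets the lock table name:

mkfs.gfs2 -p lock_dlm -j 2 -t an-anvil-05:shared /dev/an-a05n01_vg0/shared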

Component; KVM

Two of the most popular open-source virtualization platforms available in the Linux world today are Xen and KVM. The former is maintained by Citrix and the latter by Red Hat. It would be difficult to say which is "better", as they're both very good. Xen can be argued to be more mature, while KVM is the "official" solution supported by Red Hat in EL6.

We will be using the KVM hypervisor, within which our highly-available virtual machine guests will reside. With KVM, the host operating system runs directly on the bare hardware and its Linux kernel acts as the hypervisor itself. Contrast this with Xen, where the hypervisor runs on the bare hardware and even the installed host OS (dom0) is itself just another virtual machine.

Node Installation

We need a baseline, a minimum system requirement of sorts. I will refer fairly frequently to the specific setup I used. Please don't take this as "the ideal setup" though; every cluster is different, so plan and purchase for your particular needs.

Node Host Names

Before we begin, we need to decide what naming convention and IP ranges to use for our nodes and their networks.

The IP addresses and subnets you decide to use are completely up to you. The host names, though, need to follow a certain standard if you wish to use the Striker dashboard, as we do here. Specifically, the host names of your nodes must end in n01 for node #1 and n02 for node #2. The reason for this will be discussed later.

The node host name convention that we've created is this:

  • xx-aYYn0{1,2}
    • xx is a two or three letter prefix used to denote the company, group or person who owns the Anvil!
    • aYY is a simple zero-padded Anvil! sequence number.
    • n0{1,2} indicates the node in the cluster.

In this tutorial, the Anvil! is owned and operated by "Alteeve's Niche!", so the prefix "an" is used. This is the fifth Anvil! we've got, so the Anvil! name is an-anvil-05 and the host name's sequence number is a05. Thus, node #1 is named an-a05n01 and node #2 is named an-a05n02.

As we have three distinct networks, we have three network-specific suffixes we apply to these host names which we will map to subnets in /etc/hosts later.

  • <hostname>.bcn; Back-Channel Network host name.
  • <hostname>.sn; Storage Network host name.
  • <hostname>.ifn; Internet-Facing Network host name.

Again, what you use is entirely up to you. Just remember that the nodes' host names must end in n01 and n02 for Striker to work.
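
To illustrate how these suffixes map onto the three subnets (the complete /etc/hosts file is built later, and the addresses follow the subnet plan laid out in the Network section below), node #1's entries will look something like this, with matching entries for node #2 and the foundation pack devices:

10.20.50.1     an-a05n01.bcn
10.10.50.1     an-a05n01.sn
10.255.50.1    an-a05n01.ifn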

Foundation Pack Host Names

The foundation pack devices, switches, PDUs and UPSes, can support multiple Anvil! platforms. Likewise, the dashboard servers support multiple Anvil!s as well. For this reason, the aYY portion of the node host names does not make sense when choosing host names for these devices.

As always, you are free to choose host names that make sense to you. For this tutorial, the following host names are used;

Network Switches; host name xx-switchYY
  • Switch #1; an-switch01
  • Switch #2; an-switch02
The xx prefix is the owner's prefix and YY is a simple sequence number.

Switched PDUs; host name xx-pduYY
  • PDU #1; an-pdu01
  • PDU #2; an-pdu02
The xx prefix is the owner's prefix and YY is a simple sequence number.

Network Managed UPSes; host name xx-upsYY
  • UPS #1; an-ups01
  • UPS #2; an-ups02
The xx prefix is the owner's prefix and YY is a simple sequence number.

Dashboard Servers; host name xx-strikerYY
  • Dashboard #1; an-striker01
  • Dashboard #2; an-striker02
The xx prefix is the owner's prefix and YY is a simple sequence number. These hosts were previously named xx-mYY, the "m" coming from the days when the dashboards were called "monitoring servers"; the host names have since been expanded to the more verbose "striker" form. Note also that the dashboards will connect to both the BCN and IFN, so like the nodes, host names with the .bcn and .ifn suffixes will be used.

OS Installation

Warning: EL6.1 shipped with a version of corosync that had a token retransmit bug. On slower systems, a form of race condition could cause totem tokens to be retransmitted, causing significant performance problems. This has been resolved in EL6.2, so please be sure to upgrade.

Beyond being based on RHEL 6, there are no requirements for how the operating system is installed. This tutorial is written using "minimal" installs, and as such, installation instructions will be provided that will install all needed packages if they aren't already installed on your nodes.

Network Security Considerations

When building production clusters, you will want to consider two options with regard to network security.

First, the interfaces connected to an untrusted network, like the Internet, should not have an IP address, though the interfaces themselves will need to be up so that virtual machines can route through them to the outside world. Alternatively, anything inbound from the virtual machines or inbound from the untrusted network should be DROPed by the firewall.

Second, if you can not run the cluster communications or storage traffic on dedicated network connections over isolated subnets, you will need to configure the firewall to block everything except the ports needed by storage and cluster traffic.
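
As a very rough sketch of the firewall side of the first consideration above (the interface name here is the IFN bridge used later in this tutorial, and real rule sets will be more involved), dropping everything addressed to the node itself that arrives on the Internet-facing bridge can be as simple as:

iptables -A INPUT -i ifn_bridge1 -j DROP
service iptables save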

Note: As of EL6.2, you can now use unicast for totem communication instead of multicast. This is not advised, and should only be used for clusters of two or three nodes on networks where unresolvable multicast issues exist. If using GFS2, as we do here, using unicast for totem is strongly discouraged.

SELinux Considerations

There are two important changes needed to make our Anvil! work with SELinux. Both are presented in this tutorial when they're first needed. If you do not plan to follow this tutorial linearly, please be sure to read:

Network

Before we begin, let's take a look at a block diagram of what we're going to build. This will help when trying to see what we'll be talking about.

A Map!

  Nodes                                                                                        \_/                                                                                           
  ____________________________________________________________________________             _____|____              ____________________________________________________________________________ 
 | an-a05n01.alteeve.ca                                                       |  /--------{_Internet_}---------\  |                                                       an-a05n02.alteeve.ca |
 |                                 Network:                                   |  |                             |  |                                   Network:                                 |
 |                                 _________________     _____________________|  |  _________________________  |  |_____________________     _________________                                 |
 |      Servers:                  |   ifn_bridge1   |---| ifn_bond1           |  | | an-switch01    Switch 1 | |  |           ifn_bond1 |---|   ifn_bridge1   |                  Servers:      |
 |      _______________________   |   10.255.50.1   |   | ____________________|  | |____ Internet-Facing ____| |  |____________________ |   |   10.255.50.2   |  .........................     |
 |     | [ vm01-win2008 ]      |  |_________________|   || ifn_link1          =----=_01_]    Network    [_02_=----=          ifn_link1 ||   |_________________|  :      [ vm01-win2008 ] :     |
 |     |   ____________________|    | : | | : : | |     || 00:1B:21:81:C3:34 ||  | |____________________[_24_=-/  || 00:1B:21:81:C2:EA ||     : : | | : : | :    :____________________   :     |
 |     |  | NIC 1              =----/ : | | : : | |     ||___________________||  | | an-switch02    Switch 2 |    ||___________________||     : : | | : : | :----=              NIC 1 |  :     |
 |     |  | 10.255.1.1        ||      : | | : : | |     | ____________________|  | |____                 ____|    |____________________ |     : : | | : : |      :|        10.255.1.1 |  :     |
 |     |  | ..:..:..:..:..:.. ||      : | | : : | |     || ifn_link2          =----=_01_]  VLAN ID 300  [_02_=----=          ifn_link2 ||     : : | | : : |      :| ..:..:..:..:..:.. |  :     |
 |     |  |___________________||      : | | : : | |     || A0:36:9F:02:E0:05 ||  | |____________________[_24_=-\  || A0:36:9F:07:D6:2F ||     : : | | : : |      :|___________________|  :     |
 |     |   ____                |      : | | : : | |     ||___________________||  |                             |  ||___________________||     : : | | : : |      :                ____   :     |
 |  /--=--[_c:_]               |      : | | : : | |     |_____________________|  \-----------------------------/  |_____________________|     : : | | : : |      :               [_c:_]--=--\  |
 |  |  |_______________________|      : | | : : | |      _____________________|                                   |_____________________      : : | | : : |      :.......................:  |  |
 |  |                                 : | | : : | |     | sn_bond1            |     _________________________     |            sn_bond1 |     : : | | : : |                                 |  |
 |  |     .........................   : | | : : | |     | 10.10.50.1          |    | an-switch01    Switch 1 |    |          10.10.50.2 |     : : | | : : |    _______________________      |  |
 |  |     : [ vm02-win2012 ]      :   : | | : : | |     | ____________________|    |____     Storage     ____|    |____________________ |     : : | | : : |   |      [ vm02-win2012 ] |     |  |
 |  |     :   ____________________:   : | | : : | |     || sn_link1           =----=_09_]    Network    [_10_=----=           sn_link1 ||     : : | | : : |   |____________________   |     |  |
 |  |     :  | NIC 1              =---: | | : : | |     || 00:19:99:9C:9B:9F ||    |_________________________|    || 00:19:99:9C:A0:6D ||     : : | | : : \---=              NIC 1 |  |     |  |
 |  |     :  | 10.255.1.2        |:     | | : : | |     ||___________________||    | an-switch02    Switch 2 |    ||___________________||     : : | | : :     ||        10.255.1.2 |  |     |  |
 |  |     :  | ..:..:..:..:..:.. |:     | | : : | |     | ____________________|    |____                 ____|    |____________________ |     : : | | : :     || ..:..:..:..:..:.. |  |     |  |
 |  |     :  |___________________|:     | | : : | |     || sn_link2           =----=_09_]  VLAN ID 200  [_10_=----=           sn_link2 ||     : : | | : :     ||___________________|  |     |  |
 |  |     :   ____                :     | | : : | |     || A0:36:9F:02:E0:04 ||    |_________________________|    || A0:36:9F:07:D6:2E ||     : : | | : :     |                ____   |     |  |
 |  |  /--=--[_c:_]               :     | | : : | |     ||___________________||                                   ||___________________||     : : | | : :     |               [_c:_]--=--\  |  |
 |  |  |  :.......................:     | | : : | |  /--|_____________________|                                   |_____________________|--\  : : | | : :     |_______________________|  |  |  |
 |  |  |                                | | : : | |  |   _____________________|                                   |_____________________   |  : : | | : :                                |  |  |
 |  |  |   _______________________      | | : : | |  |  | bcn_bond1           |     _________________________     |           bcn_bond1 |  |  : : | | : :     .........................  |  |  |
 |  |  |  | [ vm03-win7 ]         |     | | : : | |  |  | 10.20.50.1          |    | an-switch01    Switch 1 |    |          10.20.50.2 |  |  : : | | : :     :      [ vm02-win2012 ] :  |  |  |
 |  |  |  |   ____________________|     | | : : | |  |  | ____________________|    |____  Back-Channel   ____|    |____________________ |  |  : : | | : :     :____________________   :  |  |  |
 |  |  |  |  | NIC 1              =-----/ | : : | |  |  || bcn_link1          =----=_13_]    Network    [_14_=----=          bcn_link1 ||  |  : : | | : :-----=              NIC 1 |  :  |  |  |
 |  |  |  |  | 10.255.1.3        ||       | : : | |  |  || 00:19:99:9C:9B:9E ||    |_________________________|    || 00:19:99:9C:A0:6C ||  |  : : | | :       :|        10.255.1.3 |  :  |  |  |
 |  |  |  |  | ..:..:..:..:..:.. ||       | : : | |  |  ||___________________||    | an-switch02    Switch 2 |    ||___________________||  |  : : | | :       :| ..:..:..:..:..:.. |  :  |  |  |
 |  |  |  |  |___________________||       | : : | |  |  || bcn_link2          =----=_13_]  VLAN ID 100  [_14_=----=          bcn_link2 ||  |  : : | | :       :|___________________|  :  |  |  |
 |  |  |  |   ____                |       | : : | |  |  || 00:1B:21:81:C3:35 ||    |_________________________|    || 00:1B:21:81:C2:EB ||  |  : : | | :       :                ____   :  |  |  |
 |  +--|-=--[_c:_]                |       | : : | |  |  ||___________________||                                   ||___________________||  |  : : | | :       :               [_c:_]--=--|--+  |
 |  |  |  |_______________________|       | : : | |  |  |_____________________|                                   |_____________________|  |  : : | | :       :.......................:  |  |  |
 |  |  |                                  | : : | |  |                        |                                   |                        |  : : | | :                                  |  |  |
 |  |  |   _______________________        | : : | |  |                        |                                   |                        |  : : | | :       .........................  |  |  |
 |  |  |  | [ vm04-win8 ]         |       | : : | |  \                        |                                   |                       /   : : | | :       :         [ vm04-win8 ] :  |  |  |
 |  |  |  |   ____________________|       | : : | |   \                       |                                   |                      /    : : | | :       :____________________   :  |  |  |
 |  |  |  |  | NIC 1              =-------/ : : | |    |                      |                                   |                      |    : : | | :-------=              NIC 1 |  :  |  |  |
 |  |  |  |  | 10.255.1.4        ||         : : | |    |                      |                                   |                      |    : : | |         :|        10.255.1.4 |  :  |  |  |
 |  |  |  |  | ..:..:..:..:..:.. ||         : : | |    |                      |                                   |                      |    : : | |         :| ..:..:..:..:..:.. |  :  |  |  |
 |  |  |  |  |___________________||         : : | |    |                      |                                   |                      |    : : | |         :|___________________|  :  |  |  |
 |  |  |  |   ____                |         : : | |    |                      |                                   |                      |    : : | |         :                ____   :  |  |  |
 |  +--|-=--[_c:_]                |         : : | |    |                      |                                   |                      |    : : | |         :               [_c:_]--=--|--+  |
 |  |  |  |_______________________|         : : | |    |                      |                                   |                      |    : : | |         :.......................:  |  |  |
 |  |  |                                    : : | |    |                      |                                   |                      |    : : | |                                    |  |  |
 |  |  |  .........................         : : | |    |                      |                                   |                      |    : : | |          _______________________   |  |  |
 |  |  |  : [ vm05-freebsd9 ]     :         : : | |    |                      |                                   |                      |    : : | |         |     [ vm05-freebsd9 ] |  |  |  |
 |  |  |  :   ____________________:         : : | |    |                      |                                   |                      |    : : | |         |____________________   |  |  |  |
 |  |  |  :  | em0                =---------: : | |    |                      |                                   |                      |    : : | \---------=                em0 |  |  |  |  |
 |  |  |  :  | 10.255.1.5        |:           : | |    |                      |                                   |                      |    : : |           ||        10.255.1.5 |  |  |  |  |
 |  |  |  :  | ..:..:..:..:..:.. |:           : | |    |                      |                                   |                      |    : : |           || ..:..:..:..:..:.. |  |  |  |  |
 |  |  |  :  |___________________|:           : | |    |                      |                                   |                      |    : : |           ||___________________|  |  |  |  |
 |  |  |  :   ______              :           : | |    |                      |                                   |                      |    : : |           |              ______   |  |  |  |
 |  |  +--=--[_ada0_]             :           : | |    |                      |                                   |                      |    : : |           |             [_ada0_]--=--+  |  |
 |  |  |  :.......................:           : | |    |                      |                                   |                      |    : : |           |_______________________|  |  |  |
 |  |  |                                      : | |    |                      |                                   |                      |    : : |                                      |  |  |
 |  |  |  .........................           : | |    |                      |                                   |                      |    : : |            _______________________   |  |  |
 |  |  |  : [ vm06-solaris11 ]    :           : | |    |                      |                                   |                      |    : : |           |    [ vm06-solaris11 ] |  |  |  |
 |  |  |  :   ____________________:           : | |    |                      |                                   |                      |    : : |           |____________________   |  |  |  |
 |  |  |  :  | net0               =-----------: | |    |                      |                                   |                      |    : : \-----------=               net0 |  |  |  |  |
 |  |  |  :  | 10.255.1.6        |:             | |    |                      |                                   |                      |    : :             ||        10.255.1.6 |  |  |  |  |
 |  |  |  :  | ..:..:..:..:..:.. |:             | |    |                      |                                   |                      |    : :             || ..:..:..:..:..:.. |  |  |  |  |
 |  |  |  :  |___________________|:             | |    |                      |                                   |                      |    : :             ||___________________|  |  |  |  |
 |  |  |  :   ______              :             | |    |                      |                                   |                      |    : :             |              ______   |  |  |  |
 |  |  +--=--[_c3d0_]             :             | |    |                      |                                   |                      |    : :             |             [_c3d0_]--=--+  |  |
 |  |  |  :.......................:             | |    |                      |                                   |                      |    : :             |_______________________|  |  |  |
 |  |  |                                        | |    |                      |                                   |                      |    : :                                        |  |  |
 |  |  |   _______________________              | |    |                      |                                   |                      |    : :             .........................  |  |  |
 |  |  |  | [ vm07-rhel6 ]        |             | |    |                      |                                   |                      |    : :             :        [ vm07-rhel6 ] :  |  |  |
 |  |  |  |   ____________________|             | |    |                      |                                   |                      |    : :             :____________________   :  |  |  |
 |  |  |  |  | eth0               =-------------/ |    |                      |                                   |                      |    : :-------------=               eth0 |  :  |  |  |
 |  |  |  |  | 10.255.1.7        ||               |    |                      |                                   |                      |    :               :|        10.255.1.7 |  :  |  |  |
 |  |  |  |  | ..:..:..:..:..:.. ||               |    |                      |                                   |                      |    :               :| ..:..:..:..:..:.. |  :  |  |  |
 |  |  |  |  |___________________||               |    |                      |                                   |                      |    :               :|___________________|  :  |  |  |
 |  |  |  |   _____               |               |    |                      |                                   |                      |    :               :               _____   :  |  |  |
 |  +--|--=--[_vda_]              |               |    |                      |                                   |                      |    :               :              [_vda_]--=--|--+  |
 |  |  |  |_______________________|               |    |                      |                                   |                      |    :               :.......................:  |  |  |
 |  |  |                                          |    |                      |                                   |                      |    :                                          |  |  |
 |  |  |   _______________________                |    |                      |                                   |                      |    :               .........................  |  |  |
 |  |  |  | [ vm08-sles11 ]       |               |    |                      |                                   |                      |    :               :       [ vm08-sles11 ] :  |  |  |
 |  |  |  |   ____________________|               |    |                      |                                   |                      |    :               :____________________   :  |  |  |
 |  |  |  |  | eth0               =---------------/    |                      |                                   |                      |    :---------------=               eth0 |  :  |  |  |
 |  |  |  |  | 10.255.1.8        ||                    |                      |                                   |                      |                    :|        10.255.1.8 |  :  |  |  |
 |  |  |  |  | ..:..:..:..:..:.. ||                    |                      |                                   |                      |                    :| ..:..:..:..:..:.. |  :  |  |  |
 |  |  |  |  |___________________||                    |                      |                                   |                      |                    :|___________________|  :  |  |  |
 |  |  |  |   _____               |                    |                      |                                   |                      |                    :               _____   :  |  |  |
 |  +--|--=--[_vda_]              |                    |                      |                                   |                      |                    :              [_vda_]--=--|--+  |
 |  |  |  |_______________________|                    |                      |                                   |                      |                    :.......................:  |  |  |
 |  |  |                                               |                      |                                   |                      |                                               |  |  |
 |  |  |                                               |                      |                                   |                      |                                               |  |  |
 |  |  |                                               |                      |                                   |                      |                                               |  |  |
 |  |  |    Storage:                                   |                      |                                   |                      |                                   Storage:    |  |  |
 |  |  |    __________                                 |                      |                                   |                      |                                 __________    |  |  |
 |  |  |   [_/dev/sda_]                                |                      |                                   |                      |                                [_/dev/sda_]   |  |  |
 |  |  |     |   ___________    _______                |                      |                                   |                      |                _______    ___________   |     |  |  |
 |  |  |     +--[_/dev/sda1_]--[_/boot_]               |                      |                                   |                      |               [_/boot_]--[_/dev/sda1_]--+     |  |  |
 |  |  |     |   ___________    ________               |                      |                                   |                      |               ________    ___________   |     |  |  |
 |  |  |     +--[_/dev/sda2_]--[_<swap>_]              |                      |                                   |                      |              [_<swap>_]--[_/dev/sda2_]--+     |  |  |
 |  |  |     |   ___________    ___                    |                      |                                   |                      |                    ___    ___________   |     |  |  |
 |  |  |     +--[_/dev/sda3_]--[_/_]                   |                      |                                   |                      |                   [_/_]--[_/dev/sda3_]--+     |  |  |
 |  |  |     |   ___________    ____    ____________   |                      |                                   |                      |   ____________    ____    ___________   |     |  |  |
 |  |  |     +--[_/dev/sda5_]--[_r0_]--[_/dev/drbd0_]--+                      |                                   |                      +--[_/dev/drbd0_]--[_r0_]--[_/dev/sda5_]--+     |  |  |
 |  |  |     |                                    |    |                      |                                   |                      |    |                                    |     |  |  |
 |  |  |     |                                    \----|--\                   |                                   |                   /--|----/                                    |     |  |  |
 |  |  |     |   ___________    ____    ____________   |  |                   |                                   |                   |  |   ____________    ____    ___________   |     |  |  |
 |  |  |     \--[_/dev/sda6_]--[_r1_]--[_/dev/drbd1_]--/  |                   |                                   |                   |  \--[_/dev/drbd1_]--[_r1_]--[_/dev/sda6_]--/     |  |  |
 |  |  |                                          |       |                   |                                   |                   |       |                                          |  |  |
 |  |  |   Clustered LVM:                         |       |                   |                                   |                   |       |                      Clustered LVM:      |  |  |
 |  |  |   _________________________________      |       |                   |                                   |                   |       |   _________________________________      |  |  |
 |  |  +--[_/dev/an-a05n01_vg0/vm02-win2012_]-----+       |                   |                                   |                   |       +--[_/dev/an-a05n01_vg0/vm02-win2012_]-----+  |  |
 |  |  |   __________________________________     |       |                   |                                   |                   |       |   __________________________________     |  |  |
 |  |  +--[_/dev/an-a05n01_vg0/vm05-freebsd9_]----+       |                   |                                   |                   |       +--[_/dev/an-a05n01_vg0/vm05-freebsd9_]----+  |  |
 |  |  |   ___________________________________    |       |                   |                                   |                   |       |   ___________________________________    |  |  |
 |  |  \--[_/dev/an-a05n01_vg0/vm06-solaris11_]---/       |                   |                                   |                   |       \--[_/dev/an-a05n01_vg0/vm06-solaris11_]---/  |  |
 |  |                                                     |                   |                                   |                   |                                                     |  |
 |  |      _________________________________              |                   |                                   |                   |           _________________________________         |  |
 |  +-----[_/dev/an-a05n02_vg0/vm01-win2008_]-------------+                   |                                   |                   +----------[_/dev/an-a05n02_vg0/vm01-win2008_]--------+  |
 |  |      ______________________________                 |                   |                                   |                   |           ______________________________            |  |
 |  +-----[_/dev/an-a05n02_vg0/vm03-win7_]----------------+                   |                                   |                   +----------[_/dev/an-a05n02_vg0/vm03-win7_]-----------+  |
 |  |      ______________________________                 |                   |                                   |                   |           ______________________________            |  |
 |  +-----[_/dev/an-a05n02_vg0/vm04-win8_]----------------+                   |                                   |                   +----------[_/dev/an-a05n02_vg0/vm04-win8_]-----------+  |
 |  |      _______________________________                |                   |                                   |                   |           _______________________________           |  |
 |  +-----[_/dev/an-a05n02_vg0/vm07-rhel6_]---------------+                   |                                   |                   +----------[_/dev/an-a05n02_vg0/vm07-rhel6_]----------+  |
 |  |      ________________________________               |                   |                                   |                   |           ________________________________          |  |
 |  \-----[_/dev/an-a05n02_vg0/vm08-sles11_]--------------+                   |                                   |                   +----------[_/dev/an-a05n02_vg0/vm08-sles11_]---------/  |
 |         ___________________________                    |                   |                                   |                   |           ___________________________                  |
 |     /--[_/dev/an-a05n01_vg0/shared_]-------------------/                   |                                   |                   \----------[_/dev/an-a05n01_vg0/shared_]--\              |
 |     |   _________                                                          |     _________________________     |                                                  ________   |              |
 |     \--[_/shared_]                                                         |    | an-switch01    Switch 1 |    |                                                 [_shared_]--/              |
 |                                                        ____________________|    |____  Back-Channel   ____|    |____________________                                                        |
 |                                                       | IPMI               =----=_03_]    Network    [_04_=----=               IPMI |                                                       |
 |                                                       | 10.20.51.1        ||    |_________________________|    ||        10.20.51.2 |                                                       |
 |                                  _________    _____   | 00:19:99:9A:D8:E8 ||    | an-switch02    Switch 2 |    || 00:19:99:9A:B1:78 |   _____    _________                                  |
 |                                 {_sensors_}--[_BMC_]--|___________________||    |                         |    ||___________________|--[_BMC_]--{_sensors_}                                 |
 |                                                             ______ ______  |    |       VLAN ID 100       |    |  ______ ______                                                             |
 |                                                            | PSU1 | PSU2 | |    |____   ____   ____   ____|    | | PSU1 | PSU2 |                                                            |
 |____________________________________________________________|______|______|_|    |_03_]_[_07_]_[_08_]_[_04_|    |_|______|______|____________________________________________________________|
                                                                   || ||             |      |      |       |             || ||                                                                  
                                       /---------------------------||-||-------------|------/      \-------|-------------||-||---------------------------\                                      
                                       |                           || ||             |                     |             || ||                           |                                      
                        _______________|___                        || ||   __________|________     ________|__________   || ||                        ___|_______________                       
                       |             UPS 1 |                       || ||  |             PDU 1 |   |             PDU 2 |  || ||                       |             UPS 2 |                      
                       | an-ups01          |                       || ||  | an-pdu01          |   | an-pdu02          |  || ||                       | an-ups02          |                      
             _______   | 10.20.3.1         |                       || ||  | 10.20.2.1         |   | 10.20.2.2         |  || ||                       | 10.20.3.1         |   _______            
            {_Mains_}==| 00:C0:B7:58:3A:5A |=======================||=||==| 00:C0:B7:56:2D:AC |   | 00:C0:B7:59:55:7C |==||=||=======================| 00:C0:B7:C8:1C:B4 |=={_Mains_}           
                       |___________________|                       || ||  |___________________|   |___________________|  || ||                       |___________________|                      
                                                                   || ||                 || ||     || ||                 || ||                                                                  
                                                                   || \\===[ Port 1 ]====// ||     || \\====[ Port 2 ]===// ||                                                                  
                                                                   \\======[ Port 1 ]=======||=====//                       ||                                                                  
                                                                                            \\==============[ Port 2 ]======//


Subnets

The cluster will use three separate /16 (255.255.0.0) networks;

Note: There are situations where it is not possible to add additional network cards, blades being a prime example. In these cases it will be up to the admin to decide how to proceed. If there is sufficient bandwidth, you can merge all networks, but it is advised in such cases to isolate IFN traffic from the SN/BCN traffic using VLANs.

If you plan to have two or more Anvil! platforms on the same network, then it is recommended that you use the third octet of the IP addresses to identify the cluster. We've found the following works well:

  • Third octet is the cluster ID times 10
  • Fourth octet is the node ID.

In our case, we're building our fifth cluster, so node #1 will always have the final part of its IP be x.y.50.1 and node #2 will always have the final part of its IP be x.y.50.2.

Purpose Subnet Notes
Internet-Facing Network (IFN) 10.255.50.0/16
  • Each node will use 10.255.50.x where x matches the node ID.
  • Servers hosted by the Anvil! will use 10.255.1.x where x is the server's sequence number.
  • Dashboard servers will use 10.255.4.x where x is the dashboard's sequence number.
Storage Network (SN) 10.10.50.0/16
  • Each node will use 10.10.50.x where x matches the node ID.
Back-Channel Network (BCN) 10.20.50.0/16
  • Each node will use 10.20.50.x where x matches the node ID.
  • Node-specific IPMI or other out-of-band management devices will use 10.20.51.x where x matches the node ID.
  • Network switches will use the IP addresses 10.20.1.x, where x is the switch's sequence number.
  • Switched PDUs, which we will use as backup fence devices, will use 10.20.2.x where x is the PDU's sequence number.
  • Network-managed UPSes will use 10.20.3.x where x is the UPS's sequence number.
  • Dashboard servers will use 10.20.4.x where x is the dashboard's sequence number.

We will be using six interfaces, bonded into three pairs of two NICs in Active/Passive (mode=1) configuration. Each link of each bond will be on alternate switches. We will also configure affinity by specifying interfaces bcn_link1, sn_link1 and ifn_link1 as primary for the bcn_bond1, sn_bond1 and ifn_bond1 interfaces, respectively. This way, when everything is working fine, all traffic is routed through the same switch for maximum performance.

Note: As of RHEL 6.4, Red Hat also supports bonding modes 0 and 2. We do not recommend these modes, as we've found that only mode 1 survives switch failure and recovery reliably and consistently. If you wish to use a different bonding mode, please be sure to test various failure modes extensively!

If you can not install six interfaces in your server, then four interfaces will do with the SN and BCN networks merged.

Warning: If you wish to merge the SN and BCN onto one interface, test to ensure that the storage traffic will not block cluster communication. Test by forming your cluster and then pushing your storage to maximum read and write performance for an extended period of time (minimum of several seconds). If the cluster partitions, you will need to do some advanced quality-of-service or other network configuration to ensure reliable delivery of cluster network traffic.

In this tutorial, we will use two Brocade ICX6610 switches, stacked.

We will be using three VLANs to isolate the three networks:

  • BCN will have VLAN ID of 100.
  • SN will have VLAN ID number 200.
  • IFN will have VLAN ID number 300.
  • All other unassigned ports will be in the default VLAN ID of 1, effectively disabling those ports.

The actual mapping of interfaces to bonds to networks will be:

Subnet  Cable Colour  VLAN ID  Link 1     Link 2     Bond       IP
BCN     White         100      bcn_link1  bcn_link2  bcn_bond1  10.20.x.y/16
SN      Green         200      sn_link1   sn_link2   sn_bond1   10.10.x.y/16
IFN     Black         300      ifn_link1  ifn_link2  ifn_bond1  10.255.x.y/16
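
As a preview of how this mapping turns into configuration (the full network setup is walked through step by step below, and the bonding options shown here are illustrative rather than final), the back-channel bond on node #1 would look something like this in /etc/sysconfig/network-scripts/ifcfg-bcn_bond1:

DEVICE="bcn_bond1"
BOOTPROTO="none"
ONBOOT="yes"
BONDING_OPTS="mode=1 miimon=100 primary=bcn_link1"
IPADDR="10.20.50.1"
NETMASK="255.255.0.0"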

A Note on STP

Spanning Tree Protocol, STP, is a protocol used for detecting and protecting against switch loops. Without it, if both ends of the same cable are plugged into the same switch or VLAN, or if two cables run between the same pair of switches, a broadcast storm could cause the switches to hang and traffic to stop routing.

The problem with STP in HA clusters though is that the attempt to detect loops requires blocking all other traffic for a short time. Though this is short, it is usually long enough to cause corosync to think that the peer node has failed, triggering a fence action.

For this reason, we need to disable STP, either globally or at least on the ports used by corosync and drbd. How you actually do this will depend on the make and model of switch you have.

With STP disabled, at least partially, the onus does fall on you to ensure that no one causes a switch loop. Please be sure to inform anyone who might plug things into the cluster's switches about this issue. Ensure that people are careful about what they plug into the switches and that new connections will not trigger a loop.

Setting Up the Network

Warning: The following steps can easily get confusing, given how many files we need to edit. Losing access to your server's network is a very real possibility! Do not continue without direct access to your servers! If you have out-of-band access via iKVM, console redirection or similar, be sure to test that it is working before proceeding.

Planning The Use of Physical Interfaces

In production clusters, I intentionally use three separate dual-port controllers (the two on-board interfaces plus two separate dual-port PCIe cards). I then ensure that no bond uses two interfaces on the same physical controller. Thus, should a card or its bus interface fail, none of the bonds will fail completely.

Let's take a look at an example layout;

 _________________________                            
| [ an-a05n01 ]           |                           
|         ________________|            ___________              
|        |     ___________|           | bcn_bond1 |             
|        | O  | bcn_link1 =-----------=---.-------=------{
|        | n  |__________||  /--------=--/        |             
|        | b              |  |        |___________|             
|        | o   ___________|  |         ___________        
|        | a  |  sn_link1 =--|--\     |  sn_bond1 |      
|        | r  |__________||  |   \----=--.--------=------{
|        | d              |  |  /-----=--/        |       
|        |________________|  |  |     |___________|       
|         ________________|  |  |      ___________        
|        |     ___________|  |  |     | ifn_bond1 |       
|        | P  | ifn_link1 =--|--|-----=---.-------=------{
|        | C  |__________||  |  |  /--=--/        |       
|        | I              |  |  |  |  |___________|       
|        | e   ___________|  |  |  |                  
|        |    | bcn_link2 =--/  |  |                  
|        | 1  |__________||     |  |                  
|        |________________|     |  |                  
|         ________________|     |  |                  
|        |     ___________|     |  |                  
|        | P  |  sn_link2 =-----/  |                  
|        | C  |__________||        |                  
|        | I              |        |                  
|        | e   ___________|        |                  
|        |    | ifn_link2 =--------/                  
|        | 2  |__________||                           
|        |________________|                           
|_________________________|

Consider the possible failure scenarios:

  • The on-board controller fails;
    • bcn_bond1 falls back onto bcn_link2 on the PCIe 1 controller.
    • sn_bond1 falls back onto sn_link2 on the PCIe 2 controller.
    • ifn_bond1 is unaffected.
  • The PCIe #1 controller fails
    • bcn_bond1 remains on the bcn_link1 interface but loses its redundancy, as bcn_link2 is down.
    • sn_bond1 is unaffected.
    • ifn_bond1 falls back onto ifn_link2 on the PCIe 2 controller.
  • The PCIe #2 controller fails
    • bcn_bond1 is unaffected.
    • sn_bond1 remains on the sn_link1 interface but loses its redundancy, as sn_link2 is down.
    • ifn_bond1 remains on the ifn_link1 interface but loses its redundancy, as ifn_link2 is down.

In all three failure scenarios, no network interruption occurs, making for the most robust configuration possible.
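
When testing these failure scenarios, you can check which link a bond is actually using at any time by reading its status file under /proc; the "Currently Active Slave" line shows which interface is carrying the traffic:

cat /proc/net/bonding/bcn_bond1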

Connecting Fence Devices

As we will see soon, each node can be fenced either by calling its IPMI interface or by calling the PDU and cutting the node's power. Each of these methods is inherently a single point of failure, as each has only one network connection. To work around this concern, we will connect all IPMI interfaces to one switch and the PDUs to the secondary switch. This way, should a switch fail, only one of the two fence devices will be lost and fencing will still be possible via the alternate fence device.

By convention, we always connect the IPMI interfaces to the primary switch and the PDUs to the second switch.
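
Later, once the fence devices are configured, each fence path can be tested from either node by calling the fence agents directly. The example below is only a sketch; it assumes the IPMI and PDU addresses from the map above, placeholder credentials, and APC-style switched PDUs managed by the fence_apc_snmp agent, so substitute the agent and options that match your hardware:

fence_ipmilan -a 10.20.51.2 -l admin -p secret -o status
fence_apc_snmp -a 10.20.2.1 -n 2 -o status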

Let's Build!

We're going to need to install a bunch of programs, and one of those programs is needed before we can reconfigure the network. The bridge-utils package has to be installed right away, so now is a good time to just install everything we need.

Why so Much Duplication of Commands?

Most, but not all, commands will be issued equally on both nodes, at least up until we start configuring the cluster. To make it clear what to run on which node, all commands are shown either beside or under the name of the node on which to run them.

This does lead to a lot of duplication, but it's important to make sure it is clear when some command runs only on one node or the other. So please be careful, particularly later on, that you don't accidentally run a command on the wrong node.

Red Hat Enterprise Linux Specific Steps

Red Hat's Enterprise Linux is a commercial operating system that includes access to their repositories. This requires purchasing entitlements and then registering machines with their Red Hat Network.

This tutorial uses GFS2, which is provided by their Resilient Storage Add-On. This includes the High-Availability Add-On, which provides the rest of the HA cluster stack.

Once you've finished your install, you can quickly register your node with RHN and add the Resilient Storage Add-On channels with the following commands.

Note: You need to replace $user and $pass with your RHN account details.
an-a05n01
rhnreg_ks --username "$user" --password "$pass" --force --profilename "an-a05n01.alteeve.ca"
rhn-channel --add --user "$user" --password "$pass" --channel=rhel-x86_64-server-rs-6
rhn-channel --add --user "$user" --password "$pass" --channel=rhel-x86_64-server-optional-6
an-a05n02
rhnreg_ks --username "$user" --password "$pass" --force --profilename "an-a05n02.alteeve.ca"
rhn-channel --add --user "$user" --password "$pass" --channel=rhel-x86_64-server-rs-6
rhn-channel --add --user "$user" --password "$pass" --channel=rhel-x86_64-server-optional-6

If you get any errors from the above commands, please contact your support representative. They will be able to help sort out any account or entitlement issues.

Add the Alteeve's Niche! Repo

We've created a repository with additional RPMs needed by some of the Anvil! tools. If you want to maintain complete Red Hat compatibility, you can skip this step.

Note: If you skip this step, the Anvil! itself will operate perfectly fine, but the Striker dashboard and some additional tools provided by Alteeve will not work.

Download the yum repository configuration file and the GPG key.

an-a05n01
curl https://alteeve.ca/an-repo/el6/an-el6.repo > /etc/yum.repos.d/an-el6.repo
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
124   249  124   249    0     0   1249      0 --:--:-- --:--:-- --:--:-- 17785
curl https://alteeve.ca/an-repo/el6/Alteeves_Niche_Inc-GPG-KEY > /etc/pki/rpm-gpg/Alteeves_Niche_Inc-GPG-KEY
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  3117  100  3117    0     0  12926      0 --:--:-- --:--:-- --:--:--  179k
an-a05n02
curl https://alteeve.ca/an-repo/el6/an-el6.repo > /etc/yum.repos.d/an-el6.repo
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
124   249  124   249    0     0    822      0 --:--:-- --:--:-- --:--:-- 16600
curl https://alteeve.ca/an-repo/el6/Alteeves_Niche_Inc-GPG-KEY > /etc/pki/rpm-gpg/Alteeves_Niche_Inc-GPG-KEY
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  3117  100  3117    0     0  12505      0 --:--:-- --:--:-- --:--:--  202k

Verify both downloaded properly:

an-a05n01
cat /etc/yum.repos.d/an-el6.repo
[an-el6-repo]
name=Alteeve's Niche!, Inc. Repository of Enterprise Linux 6 packages used by Anvil! and Striker systems.
baseurl=https://alteeve.ca/an-repo/el6/
enabled=1
gpgcheck=1
protect=1
gpgkey=file:///etc/pki/rpm-gpg/Alteeves_Niche_Inc-GPG-KEY
cat /etc/pki/rpm-gpg/Alteeves_Niche_Inc-GPG-KEY
-----BEGIN PGP PUBLIC KEY BLOCK-----
Version: GnuPG v2.0.14 (GNU/Linux)

mQINBFTBa6kBEAC36WAc8HLciAAx/FmfirLpW8t1AkS39Lc38LyBeKvBTYvSkCXp
anK+QFsko4IkfcWR/eb2EzbmjLfz37QvaT2niYTOIReQP/VW5QwqtWgxMY8H3ja0
GA4kQzMLjHR4MHs/k6SbUqopueHrXKk16Ok1RUgjZz85t/46OtwtjwDlrFKhSE77
aUy6sCM4DCqiB99BdHtLsZMcS/ENRTgsXzxNPr629fBo1nqd1OqWr/u5oX9OoOKN
YeSy3YXDtmGk5CUIeJ+i9pNzURDPWhTJgUCdnuqNIfFjo2HPqyWj/my/unK3oM2a
DU3ZIrgz2uaUcG/uPGcsGQNWONLJcEWDhtCf0YoatksGybTVvO09d3Y2Vp+Glmgl
xkiZSHXXe/b7UlD7xnycO6EKTWJpWwrS6pfgAm59SUDCIfkjokBhHlSVwjxyz/v5
+lg2fpcNgdR3Q08ZtVEgn4lcI0A5XTwO1GYuOZ8icUW9NYM3iJLFuad4ltbCvrdZ
CE5+gW4myiFhY66MDY9SdaVLcJDlQgWU9ZM8hZ1DNyDTQPLVbX2sNXO+Q9tW33HB
+73dJM+9XPXsbDnWtUbnUSdtbJ9q9bT1uC1tZXMDnyFHiZkroJ+kjRRgriRzgmYK
AKNbQSxqkBRJ/VacsL3tMEMOGeRPaBrc5VjPZp0KxTUGdEeOZrOIhVCVqQARAQAB
tCpBbHRlZXZlJ3MgTmljaGUhIEluYy4gPHN1cHBvcnRAYWx0ZWV2ZS5jYT6JAjgE
EwECACIFAlTBa6kCGwMGCwkIBwMCBhUIAgkKCwQWAgMBAh4BAheAAAoJEJrEPxrG
2apbQ6YP/2qyNRRH96A5dJaBJAMg4VolK7qHuZttT2b9YMJMijYF4Mj6hdRvtVwP
tZzyne9hPorQWrOFpqewsrH8TCUp8tc1VWcqJWtd33/9ZOsCmy/4QSM02M3PzzTy
x6Aj8owAx5mTuumgvhrr/gn5kkn35fpnNvZVOJOBOXVN65o2gSoRuyBbU9cxjQRD
4w+r6nJxJWEFocCsMkxRHDT9T/0oXbpPQlmNfyeKSx0FJDwtD4qiIYp+82OJBg+E
5lmfU8DmBx6TuCuabsJxVOV68PQXzmtApZSNif56dGVx+D2kHSaddTpZdV6bMUr6
BxyZN1vCGJKeEFX+qgcWfgwkqVhs2zm0fLRMMVchRMwAcI5fN9mMzZhi+PQlN7XK
h6nS7kPxn0ajnFzi36GlDF50LssAzJq9+SMT2aTSDhIbNZO6KGW3QSMzP1CGf841
Busfb45Ar4oWQ3sFsGgJlfEb/NklSUmWDnz8Bt4zydmBmB0WJnxI8bE2bGICvS/D
mJsl41hF/a9nVjX1fGzERyLUb+PPgwDBGcLsyHfxMK7ZtNmO+Wjw8F65DYPDQInI
EVyOEWAW3hGXR0r1I6ubbdzZLzs97hz61XYrDrm7pXyv56N9ytP7AtucUNyfYoT5
KzrZDOU0EYCa5bT/67ckZsgTlZuwKOj8fAeNBsTN+thg/4grqQfxuQINBFTBa6kB
EADcdNtzMIVbojYmpXZ+3w0rFNwPYZJkV03LVxVCIQ5mUjYB5kiHjeSiUfcP3LLc
UXzomOXiUz/wSSkp6Q42L8CnUtwIwZoXnvhWNYAbR7wWz5HGBXUMxmbUSOutKFYT
6tK13xV4pWoxvBJyxPwjGSm+zAJzTC0fT63vt26xQtVLJrhpRtJD2kEGtEGj19Sy
ATz1nbR+UqZUryoqzteyGygQXYOoFqX9d6/t2pf/9cDuOhRayUJ2Xjonu1DMQ4T/
ZwJrXDTIsUFPtnR/mQsNaZdskA4+GmXbweFVyvdloWo0Wgw0lZzQJQ+cGUGAw2RC
HDU9shbMcpbaXwoH8UG5Hml1T1I5XZlpUk2R/kDMHnR0LQkRRSjUTPo1GzpSp+v2
tiOJurYVBZwp5bryYdZYbRZgYh1oW7WxiKrnQQ5FAT58YBXSzFd575ENBp+LX804
EMh4po3Wknrvpeh7orkX+Wmbggs/IoBvxTme+RLLnCb0WrCl88dsC8Adn7DP88dm
+JpjMpSyXDvvrChSzWhy6aJ1s/MhkbZS3g+GoeianDPmu6vRGbW7vqGmww1gXyBk
vos90/bAuxjewUMa3UCCkswz99U1TvAT1QJZYH8evFznAx92J6zvKr/ttaG8brTV
OqIdcmK6HmFJjwAAKauFkOLe77GwhtQWKU//C3lXC8KWfwARAQABiQIfBBgBAgAJ
BQJUwWupAhsMAAoJEJrEPxrG2apb7T0P/iXCHX7xmLgvjGRYBhyUfTw00va8Dq8J
oRVBsPZjHj0Yx39UWa9q9P1ME9wxOi5U8xLSTREpKFCM4fcMG8xvFF9SGBNPsvMb
ILvr6ylHtfxreUUUemMpTGPrLj7SDfGRi3CaAikcH5+ve1JH0QVIfdoD3O0OZvVT
9VEq9aZW0Falur155PP3e5oSe0mgCvule3Jb8XL9DhsgQw2Eo2vKyA1kXx7p2405
YVD8SeWCRfv9b2Bq22rbYDOrE4xM+geTqcl0vhYKKfamXUtmJ/zltuYadE/4ZLFJ
fy2neYdj2sGcVBZALq9OPhkeVMktfRmbL64bT9Cgwrl4mNHwqN2WI8YGmhwGTknN
IqHF0ueyrLM0VzTWjJvi48Nt9Co9VUl8ncnmiqvIs0ZpHF3ZqrTwl9Z0IElXuhx6
YniJ9ntZk3SaEM/Uvl16nk9vz8uFND1B0MwwlLENaEn0Gy3cWaKH85EzEkoiOTXw
j4uQ0h80FuwxO9K+GffVw/VlcKzOTz4LyId6QYpXio+EWrfF5vYQEloqRLCi6ADS
8IdlSGVwGUD9rCagVpVTh/CPcZ3PX830L0LyOZk28/qqdQ4Whu/yb9NpsoF2UfKE
JL2A7GUrmNZFxBbAtAknFbId/ecJYKefPlp3RpiJ1SeZhuaHYsXaOTm6kyLy770A
bZ03smi2aDRO
=5Uwn
-----END PGP PUBLIC KEY BLOCK-----
an-a05n02
cat /etc/yum.repos.d/an-el6.repo
[an-el6-repo]
name=Alteeve's Niche!, Inc. Repository of Enterprise Linux 6 packages used by Anvil! and Striker systems.
baseurl=https://alteeve.ca/an-repo/el6/
enabled=1
gpgcheck=1
protect=1
gpgkey=file:///etc/pki/rpm-gpg/Alteeves_Niche_Inc-GPG-KEY
cat /etc/pki/rpm-gpg/Alteeves_Niche_Inc-GPG-KEY
-----BEGIN PGP PUBLIC KEY BLOCK-----
Version: GnuPG v2.0.14 (GNU/Linux)

mQINBFTBa6kBEAC36WAc8HLciAAx/FmfirLpW8t1AkS39Lc38LyBeKvBTYvSkCXp
anK+QFsko4IkfcWR/eb2EzbmjLfz37QvaT2niYTOIReQP/VW5QwqtWgxMY8H3ja0
GA4kQzMLjHR4MHs/k6SbUqopueHrXKk16Ok1RUgjZz85t/46OtwtjwDlrFKhSE77
aUy6sCM4DCqiB99BdHtLsZMcS/ENRTgsXzxNPr629fBo1nqd1OqWr/u5oX9OoOKN
YeSy3YXDtmGk5CUIeJ+i9pNzURDPWhTJgUCdnuqNIfFjo2HPqyWj/my/unK3oM2a
DU3ZIrgz2uaUcG/uPGcsGQNWONLJcEWDhtCf0YoatksGybTVvO09d3Y2Vp+Glmgl
xkiZSHXXe/b7UlD7xnycO6EKTWJpWwrS6pfgAm59SUDCIfkjokBhHlSVwjxyz/v5
+lg2fpcNgdR3Q08ZtVEgn4lcI0A5XTwO1GYuOZ8icUW9NYM3iJLFuad4ltbCvrdZ
CE5+gW4myiFhY66MDY9SdaVLcJDlQgWU9ZM8hZ1DNyDTQPLVbX2sNXO+Q9tW33HB
+73dJM+9XPXsbDnWtUbnUSdtbJ9q9bT1uC1tZXMDnyFHiZkroJ+kjRRgriRzgmYK
AKNbQSxqkBRJ/VacsL3tMEMOGeRPaBrc5VjPZp0KxTUGdEeOZrOIhVCVqQARAQAB
tCpBbHRlZXZlJ3MgTmljaGUhIEluYy4gPHN1cHBvcnRAYWx0ZWV2ZS5jYT6JAjgE
EwECACIFAlTBa6kCGwMGCwkIBwMCBhUIAgkKCwQWAgMBAh4BAheAAAoJEJrEPxrG
2apbQ6YP/2qyNRRH96A5dJaBJAMg4VolK7qHuZttT2b9YMJMijYF4Mj6hdRvtVwP
tZzyne9hPorQWrOFpqewsrH8TCUp8tc1VWcqJWtd33/9ZOsCmy/4QSM02M3PzzTy
x6Aj8owAx5mTuumgvhrr/gn5kkn35fpnNvZVOJOBOXVN65o2gSoRuyBbU9cxjQRD
4w+r6nJxJWEFocCsMkxRHDT9T/0oXbpPQlmNfyeKSx0FJDwtD4qiIYp+82OJBg+E
5lmfU8DmBx6TuCuabsJxVOV68PQXzmtApZSNif56dGVx+D2kHSaddTpZdV6bMUr6
BxyZN1vCGJKeEFX+qgcWfgwkqVhs2zm0fLRMMVchRMwAcI5fN9mMzZhi+PQlN7XK
h6nS7kPxn0ajnFzi36GlDF50LssAzJq9+SMT2aTSDhIbNZO6KGW3QSMzP1CGf841
Busfb45Ar4oWQ3sFsGgJlfEb/NklSUmWDnz8Bt4zydmBmB0WJnxI8bE2bGICvS/D
mJsl41hF/a9nVjX1fGzERyLUb+PPgwDBGcLsyHfxMK7ZtNmO+Wjw8F65DYPDQInI
EVyOEWAW3hGXR0r1I6ubbdzZLzs97hz61XYrDrm7pXyv56N9ytP7AtucUNyfYoT5
KzrZDOU0EYCa5bT/67ckZsgTlZuwKOj8fAeNBsTN+thg/4grqQfxuQINBFTBa6kB
EADcdNtzMIVbojYmpXZ+3w0rFNwPYZJkV03LVxVCIQ5mUjYB5kiHjeSiUfcP3LLc
UXzomOXiUz/wSSkp6Q42L8CnUtwIwZoXnvhWNYAbR7wWz5HGBXUMxmbUSOutKFYT
6tK13xV4pWoxvBJyxPwjGSm+zAJzTC0fT63vt26xQtVLJrhpRtJD2kEGtEGj19Sy
ATz1nbR+UqZUryoqzteyGygQXYOoFqX9d6/t2pf/9cDuOhRayUJ2Xjonu1DMQ4T/
ZwJrXDTIsUFPtnR/mQsNaZdskA4+GmXbweFVyvdloWo0Wgw0lZzQJQ+cGUGAw2RC
HDU9shbMcpbaXwoH8UG5Hml1T1I5XZlpUk2R/kDMHnR0LQkRRSjUTPo1GzpSp+v2
tiOJurYVBZwp5bryYdZYbRZgYh1oW7WxiKrnQQ5FAT58YBXSzFd575ENBp+LX804
EMh4po3Wknrvpeh7orkX+Wmbggs/IoBvxTme+RLLnCb0WrCl88dsC8Adn7DP88dm
+JpjMpSyXDvvrChSzWhy6aJ1s/MhkbZS3g+GoeianDPmu6vRGbW7vqGmww1gXyBk
vos90/bAuxjewUMa3UCCkswz99U1TvAT1QJZYH8evFznAx92J6zvKr/ttaG8brTV
OqIdcmK6HmFJjwAAKauFkOLe77GwhtQWKU//C3lXC8KWfwARAQABiQIfBBgBAgAJ
BQJUwWupAhsMAAoJEJrEPxrG2apb7T0P/iXCHX7xmLgvjGRYBhyUfTw00va8Dq8J
oRVBsPZjHj0Yx39UWa9q9P1ME9wxOi5U8xLSTREpKFCM4fcMG8xvFF9SGBNPsvMb
ILvr6ylHtfxreUUUemMpTGPrLj7SDfGRi3CaAikcH5+ve1JH0QVIfdoD3O0OZvVT
9VEq9aZW0Falur155PP3e5oSe0mgCvule3Jb8XL9DhsgQw2Eo2vKyA1kXx7p2405
YVD8SeWCRfv9b2Bq22rbYDOrE4xM+geTqcl0vhYKKfamXUtmJ/zltuYadE/4ZLFJ
fy2neYdj2sGcVBZALq9OPhkeVMktfRmbL64bT9Cgwrl4mNHwqN2WI8YGmhwGTknN
IqHF0ueyrLM0VzTWjJvi48Nt9Co9VUl8ncnmiqvIs0ZpHF3ZqrTwl9Z0IElXuhx6
YniJ9ntZk3SaEM/Uvl16nk9vz8uFND1B0MwwlLENaEn0Gy3cWaKH85EzEkoiOTXw
j4uQ0h80FuwxO9K+GffVw/VlcKzOTz4LyId6QYpXio+EWrfF5vYQEloqRLCi6ADS
8IdlSGVwGUD9rCagVpVTh/CPcZ3PX830L0LyOZk28/qqdQ4Whu/yb9NpsoF2UfKE
JL2A7GUrmNZFxBbAtAknFbId/ecJYKefPlp3RpiJ1SeZhuaHYsXaOTm6kyLy770A
bZ03smi2aDRO
=5Uwn
-----END PGP PUBLIC KEY BLOCK-----

Excellent! Now clean the yum repository cache.

an-a05n01
yum clean all
Loaded plugins: product-id, rhnplugin, security, subscription-manager
Cleaning repos: an-el6-repo rhel-x86_64-server-6
Cleaning up Everything
an-a05n02
yum clean all
Loaded plugins: product-id, rhnplugin, security, subscription-manager
Cleaning repos: an-el6-repo rhel-x86_64-server-6
Cleaning up Everything

Excellent! Now we can proceed.

Update the OS

Before we begin at all, let's update our OS.

an-a05n01 an-a05n02
yum update
<lots of yum output>
yum update
<lots of yum output>

Installing Required Programs

This will install all the software needed to run the Anvil! and configure IPMI for use as a fence device. This won't cover DRBD or apcupsd which will be covered in dedicated sections below.

Note: If you plan to install DRBD either from the official, supported LINBIT repository, or if you prefer to install it from source, remove drbd83-utils and kmod-drbd83 from the list of packages below.
an-a05n01
yum install acpid bridge-utils ccs cman compat-libstdc++-33.i686 corosync \
            cyrus-sasl cyrus-sasl-plain dmidecode drbd83-utils expect \
            fence-agents freeipmi freeipmi-bmc-watchdog freeipmi-ipmidetectd \
            gcc gcc-c++ gd gfs2-utils gpm ipmitool kernel-headers \
            kernel-devel kmod-drbd83 libstdc++.i686 libstdc++-devel.i686 \
            libvirt lvm2-cluster mailx man mlocate ntp OpenIPMI OpenIPMI-libs \
            openssh-clients openssl-devel qemu-kvm qemu-kvm-tools parted \
            pciutils perl perl-DBD-Pg perl-Digest-SHA perl-TermReadKey \
            perl-Test-Simple perl-Time-HiRes perl-Net-SSH2 perl-XML-Simple \
            perl-YAML policycoreutils-python postgresql postfix \
            python-virtinst rgmanager ricci rsync Scanner screen syslinux \
            sysstat vim-enhanced virt-viewer wget
<lots of yum output>
an-a05n02
yum install acpid bridge-utils ccs cman compat-libstdc++-33.i686 corosync \
            cyrus-sasl cyrus-sasl-plain dmidecode drbd83-utils expect \
            fence-agents freeipmi freeipmi-bmc-watchdog freeipmi-ipmidetectd \
            gcc gcc-c++ gd gfs2-utils gpm ipmitool kernel-headers \
            kernel-devel kmod-drbd83 libstdc++.i686 libstdc++-devel.i686 \
            libvirt lvm2-cluster mailx man mlocate ntp OpenIPMI OpenIPMI-libs \
            openssh-clients openssl-devel qemu-kvm qemu-kvm-tools parted \
            pciutils perl perl-DBD-Pg perl-Digest-SHA perl-TermReadKey \
            perl-Test-Simple perl-Time-HiRes perl-Net-SSH2 perl-XML-Simple \
            perl-YAML policycoreutils-python postgresql postfix \
            python-virtinst rgmanager ricci rsync Scanner screen syslinux \
            sysstat vim-enhanced virt-viewer wget
<lots of yum output>

Before we go any further, we'll want to destroy the default libvirtd bridge. We're going to be creating our own bridge that gives our servers direct access to the outside network.

  • If virbr0 does not exist:
an-a05n01
cat /dev/null >/etc/libvirt/qemu/networks/default.xml
an-a05n02
cat /dev/null >/etc/libvirt/qemu/networks/default.xml

If you already see virbr0 when you run ifconfig, then the libvirtd bridge has already started. You can stop and disable it with the following commands;

  • If virbr0 does exist:
an-a05n01
virsh net-destroy default
virsh net-autostart default --disable
virsh net-undefine default
/etc/init.d/iptables stop
an-a05n02
virsh net-destroy default
virsh net-autostart default --disable
virsh net-undefine default
/etc/init.d/iptables stop

Now virbr0 should be gone and it won't return.

Switch Network Daemons

The new NetworkManager daemon is much more flexible and is perfect for machines like laptops which move around networks a lot. However, it does this by making a lot of decisions for you and changing the network as it sees fit. As good as this is for laptops and the like, it's not appropriate for servers. We will want to use the traditional network service.

an-a05n01 an-a05n02
yum remove NetworkManager
yum remove NetworkManager

Now enable network to start with the system.

an-a05n01 an-a05n02
chkconfig network on
chkconfig --list network
network        	0:off	1:off	2:on	3:on	4:on	5:on	6:off
chkconfig network on
chkconfig --list network
network        	0:off	1:off	2:on	3:on	4:on	5:on	6:off

Altering Which Daemons Start on Boot

Several of the applications we installed above include daemons that start on boot when we don't want them to, or that stay off on boot when we do want them. Likewise, some daemons remain stopped right after they're installed, and we want to start them now.

As we work on each component, we'll discuss in more detail why we want each to either start or stop on boot. For now, let's just make the changes.

We'll use the chkconfig command to make sure the daemons we want to start on boot do so.

an-a05n01 an-a05n02
chkconfig network on
chkconfig ntpd on
chkconfig ricci on
chkconfig modclusterd on
chkconfig ipmi on
chkconfig iptables on
chkconfig network on
chkconfig ntpd on
chkconfig ricci on
chkconfig modclusterd on
chkconfig ipmi on
chkconfig iptables on

Next, we'll tell the system what daemons to leave off on boot.

an-a05n01 an-a05n02
chkconfig acpid off
chkconfig ip6tables off
chkconfig clvmd off
chkconfig gfs2 off
chkconfig libvirtd off
chkconfig cman off
chkconfig rgmanager off
chkconfig acpid off
chkconfig ip6tables off
chkconfig clvmd off
chkconfig gfs2 off
chkconfig libvirtd off
chkconfig cman off
chkconfig rgmanager off

Now start the daemons we've installed and want running.

an-a05n01 an-a05n02
/etc/init.d/ntpd start
/etc/init.d/ricci start
/etc/init.d/modclusterd start
/etc/init.d/ipmi start
/etc/init.d/iptables start
/etc/init.d/ntpd start
/etc/init.d/ricci start
/etc/init.d/modclusterd start
/etc/init.d/ipmi start
/etc/init.d/iptables start

Lastly, stop the daemons we don't want running.

an-a05n01 an-a05n02
/etc/init.d/libvirtd stop
/etc/init.d/acpid stop
/etc/init.d/ip6tables stop
/etc/init.d/libvirtd stop
/etc/init.d/acpid stop
/etc/init.d/ip6tables stop

You can verify that the services you want to start will and the ones you don't want to won't using chkconfig.

an-a05n01 an-a05n02
chkconfig --list
abrt-ccpp      	0:off	1:off	2:off	3:on	4:off	5:on	6:off
abrtd          	0:off	1:off	2:off	3:on	4:off	5:on	6:off
acpid          	0:off	1:off	2:off	3:off	4:off	5:off	6:off
atd            	0:off	1:off	2:off	3:on	4:on	5:on	6:off
auditd         	0:off	1:off	2:on	3:on	4:on	5:on	6:off
blk-availability	0:off	1:on	2:on	3:on	4:on	5:on	6:off
bmc-watchdog   	0:off	1:off	2:off	3:on	4:off	5:on	6:off
cgconfig       	0:off	1:off	2:on	3:on	4:on	5:on	6:off
cgred          	0:off	1:off	2:off	3:off	4:off	5:off	6:off
clvmd          	0:off	1:off	2:off	3:off	4:off	5:off	6:off
cman           	0:off	1:off	2:off	3:off	4:off	5:off	6:off
corosync       	0:off	1:off	2:off	3:off	4:off	5:off	6:off
cpglockd       	0:off	1:off	2:off	3:off	4:off	5:off	6:off
cpuspeed       	0:off	1:on	2:on	3:on	4:on	5:on	6:off
crond          	0:off	1:off	2:on	3:on	4:on	5:on	6:off
dnsmasq        	0:off	1:off	2:off	3:off	4:off	5:off	6:off
drbd           	0:off	1:off	2:off	3:off	4:off	5:off	6:off
ebtables       	0:off	1:off	2:off	3:off	4:off	5:off	6:off
gfs2           	0:off	1:off	2:off	3:off	4:off	5:off	6:off
gpm            	0:off	1:off	2:on	3:on	4:on	5:on	6:off
haldaemon      	0:off	1:off	2:off	3:on	4:on	5:on	6:off
ip6tables      	0:off	1:off	2:off	3:off	4:off	5:off	6:off
ipmi           	0:off	1:off	2:on	3:on	4:on	5:on	6:off
ipmidetectd    	0:off	1:off	2:off	3:on	4:off	5:on	6:off
ipmievd        	0:off	1:off	2:off	3:off	4:off	5:off	6:off
iptables       	0:off	1:off	2:on	3:on	4:on	5:on	6:off
irqbalance     	0:off	1:off	2:off	3:on	4:on	5:on	6:off
iscsi          	0:off	1:off	2:off	3:on	4:on	5:on	6:off
iscsid         	0:off	1:off	2:off	3:on	4:on	5:on	6:off
kdump          	0:off	1:off	2:off	3:off	4:off	5:off	6:off
ksm            	0:off	1:off	2:off	3:on	4:on	5:on	6:off
ksmtuned       	0:off	1:off	2:off	3:on	4:on	5:on	6:off
libvirt-guests 	0:off	1:off	2:on	3:on	4:on	5:on	6:off
libvirtd       	0:off	1:off	2:off	3:off	4:off	5:off	6:off
lvm2-monitor   	0:off	1:on	2:on	3:on	4:on	5:on	6:off
mdmonitor      	0:off	1:off	2:on	3:on	4:on	5:on	6:off
messagebus     	0:off	1:off	2:on	3:on	4:on	5:on	6:off
modclusterd    	0:off	1:off	2:on	3:on	4:on	5:on	6:off
netconsole     	0:off	1:off	2:off	3:off	4:off	5:off	6:off
netfs          	0:off	1:off	2:off	3:on	4:on	5:on	6:off
network        	0:off	1:off	2:on	3:on	4:on	5:on	6:off
nfs            	0:off	1:off	2:off	3:off	4:off	5:off	6:off
nfslock        	0:off	1:off	2:off	3:on	4:on	5:on	6:off
ntpd           	0:off	1:off	2:on	3:on	4:on	5:on	6:off
ntpdate        	0:off	1:off	2:off	3:off	4:off	5:off	6:off
numad          	0:off	1:off	2:off	3:off	4:off	5:off	6:off
oddjobd        	0:off	1:off	2:off	3:off	4:off	5:off	6:off
postfix        	0:off	1:off	2:on	3:on	4:on	5:on	6:off
psacct         	0:off	1:off	2:off	3:off	4:off	5:off	6:off
quota_nld      	0:off	1:off	2:off	3:off	4:off	5:off	6:off
radvd          	0:off	1:off	2:off	3:off	4:off	5:off	6:off
rdisc          	0:off	1:off	2:off	3:off	4:off	5:off	6:off
restorecond    	0:off	1:off	2:off	3:off	4:off	5:off	6:off
rgmanager      	0:off	1:off	2:off	3:off	4:off	5:off	6:off
rhnsd          	0:off	1:off	2:on	3:on	4:on	5:on	6:off
rhsmcertd      	0:off	1:off	2:off	3:on	4:on	5:on	6:off
ricci          	0:off	1:off	2:on	3:on	4:on	5:on	6:off
rngd           	0:off	1:off	2:off	3:off	4:off	5:off	6:off
rpcbind        	0:off	1:off	2:on	3:on	4:on	5:on	6:off
rpcgssd        	0:off	1:off	2:off	3:on	4:on	5:on	6:off
rpcsvcgssd     	0:off	1:off	2:off	3:off	4:off	5:off	6:off
rsyslog        	0:off	1:off	2:on	3:on	4:on	5:on	6:off
saslauthd      	0:off	1:off	2:off	3:off	4:off	5:off	6:off
smartd         	0:off	1:off	2:off	3:off	4:off	5:off	6:off
sshd           	0:off	1:off	2:on	3:on	4:on	5:on	6:off
svnserve       	0:off	1:off	2:off	3:off	4:off	5:off	6:off
sysstat        	0:off	1:on	2:on	3:on	4:on	5:on	6:off
udev-post      	0:off	1:on	2:on	3:on	4:on	5:on	6:off
winbind        	0:off	1:off	2:off	3:off	4:off	5:off	6:off
chkconfig --list
abrt-ccpp      	0:off	1:off	2:off	3:on	4:off	5:on	6:off
abrtd          	0:off	1:off	2:off	3:on	4:off	5:on	6:off
acpid          	0:off	1:off	2:off	3:off	4:off	5:off	6:off
atd            	0:off	1:off	2:off	3:on	4:on	5:on	6:off
auditd         	0:off	1:off	2:on	3:on	4:on	5:on	6:off
blk-availability	0:off	1:on	2:on	3:on	4:on	5:on	6:off
bmc-watchdog   	0:off	1:off	2:off	3:on	4:off	5:on	6:off
cgconfig       	0:off	1:off	2:on	3:on	4:on	5:on	6:off
cgred          	0:off	1:off	2:off	3:off	4:off	5:off	6:off
clvmd          	0:off	1:off	2:off	3:off	4:off	5:off	6:off
cman           	0:off	1:off	2:off	3:off	4:off	5:off	6:off
corosync       	0:off	1:off	2:off	3:off	4:off	5:off	6:off
cpglockd       	0:off	1:off	2:off	3:off	4:off	5:off	6:off
cpuspeed       	0:off	1:on	2:on	3:on	4:on	5:on	6:off
crond          	0:off	1:off	2:on	3:on	4:on	5:on	6:off
dnsmasq        	0:off	1:off	2:off	3:off	4:off	5:off	6:off
drbd           	0:off	1:off	2:off	3:off	4:off	5:off	6:off
ebtables       	0:off	1:off	2:off	3:off	4:off	5:off	6:off
gfs2           	0:off	1:off	2:off	3:off	4:off	5:off	6:off
gpm            	0:off	1:off	2:on	3:on	4:on	5:on	6:off
haldaemon      	0:off	1:off	2:off	3:on	4:on	5:on	6:off
ip6tables      	0:off	1:off	2:off	3:off	4:off	5:off	6:off
ipmi           	0:off	1:off	2:on	3:on	4:on	5:on	6:off
ipmidetectd    	0:off	1:off	2:off	3:on	4:off	5:on	6:off
ipmievd        	0:off	1:off	2:off	3:off	4:off	5:off	6:off
iptables       	0:off	1:off	2:on	3:on	4:on	5:on	6:off
irqbalance     	0:off	1:off	2:off	3:on	4:on	5:on	6:off
iscsi          	0:off	1:off	2:off	3:on	4:on	5:on	6:off
iscsid         	0:off	1:off	2:off	3:on	4:on	5:on	6:off
kdump          	0:off	1:off	2:off	3:off	4:off	5:off	6:off
ksm            	0:off	1:off	2:off	3:on	4:on	5:on	6:off
ksmtuned       	0:off	1:off	2:off	3:on	4:on	5:on	6:off
libvirt-guests 	0:off	1:off	2:on	3:on	4:on	5:on	6:off
libvirtd       	0:off	1:off	2:off	3:off	4:off	5:off	6:off
lvm2-monitor   	0:off	1:on	2:on	3:on	4:on	5:on	6:off
mdmonitor      	0:off	1:off	2:on	3:on	4:on	5:on	6:off
messagebus     	0:off	1:off	2:on	3:on	4:on	5:on	6:off
modclusterd    	0:off	1:off	2:on	3:on	4:on	5:on	6:off
netconsole     	0:off	1:off	2:off	3:off	4:off	5:off	6:off
netfs          	0:off	1:off	2:off	3:on	4:on	5:on	6:off
network        	0:off	1:off	2:on	3:on	4:on	5:on	6:off
nfs            	0:off	1:off	2:off	3:off	4:off	5:off	6:off
nfslock        	0:off	1:off	2:off	3:on	4:on	5:on	6:off
ntpd           	0:off	1:off	2:on	3:on	4:on	5:on	6:off
ntpdate        	0:off	1:off	2:off	3:off	4:off	5:off	6:off
numad          	0:off	1:off	2:off	3:off	4:off	5:off	6:off
oddjobd        	0:off	1:off	2:off	3:off	4:off	5:off	6:off
postfix        	0:off	1:off	2:on	3:on	4:on	5:on	6:off
psacct         	0:off	1:off	2:off	3:off	4:off	5:off	6:off
quota_nld      	0:off	1:off	2:off	3:off	4:off	5:off	6:off
radvd          	0:off	1:off	2:off	3:off	4:off	5:off	6:off
rdisc          	0:off	1:off	2:off	3:off	4:off	5:off	6:off
restorecond    	0:off	1:off	2:off	3:off	4:off	5:off	6:off
rgmanager      	0:off	1:off	2:off	3:off	4:off	5:off	6:off
rhnsd          	0:off	1:off	2:on	3:on	4:on	5:on	6:off
rhsmcertd      	0:off	1:off	2:off	3:on	4:on	5:on	6:off
ricci          	0:off	1:off	2:on	3:on	4:on	5:on	6:off
rngd           	0:off	1:off	2:off	3:off	4:off	5:off	6:off
rpcbind        	0:off	1:off	2:on	3:on	4:on	5:on	6:off
rpcgssd        	0:off	1:off	2:off	3:on	4:on	5:on	6:off
rpcsvcgssd     	0:off	1:off	2:off	3:off	4:off	5:off	6:off
rsyslog        	0:off	1:off	2:on	3:on	4:on	5:on	6:off
saslauthd      	0:off	1:off	2:off	3:off	4:off	5:off	6:off
smartd         	0:off	1:off	2:off	3:off	4:off	5:off	6:off
sshd           	0:off	1:off	2:on	3:on	4:on	5:on	6:off
svnserve       	0:off	1:off	2:off	3:off	4:off	5:off	6:off
sysstat        	0:off	1:on	2:on	3:on	4:on	5:on	6:off
udev-post      	0:off	1:on	2:on	3:on	4:on	5:on	6:off
winbind        	0:off	1:off	2:off	3:off	4:off	5:off	6:off

If you did a minimal OS install, or any install without a graphical interface, you will be booting into run-level 3. If you did install a graphical interface, which is not wise, then your default run-level will likely be 5. You can determine which by looking in /etc/inittab.

Once you know the run-level you're using, look for the daemon you are interested in and see if it's set to x:on or x:off. That will tell you whether the associated daemon is set to start on boot or not, respectively.
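
For example, you can check the default run-level and the boot setting of a single daemon (ntpd is just an arbitrary example here) like this:

# Show the default run-level line from /etc/inittab
grep initdefault /etc/inittab

# Show the per-run-level boot setting for one daemon
chkconfig --list ntpd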

Network Security

The interfaces connected to the IFN are usually connected to an untrusted network, like the Internet. If you do not need access to the IFN from the nodes themselves, you can increase security by not assigning an IP address to the ifn_bridge1 interface which we will configure shortly. The ifn_bridge1 bridge device will need to be up so that virtual machines can route through it to the outside world, of course.

If you do decide to assign an IP to the nodes' ifn_bridge1, you will want to restrict inbound access as much as possible. A good policy is to DROP all traffic inbound from the hosted servers, unless you trust them specifically.
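
As a rough sketch, if your hosted servers all lived in a known subnet (10.255.100.0/24 is a made-up example; substitute whatever range your servers will actually use), a rule like the following would drop their inbound traffic to the node:

# Example only; adjust the source range to match your servers
iptables -I INPUT -s 10.255.100.0/24 -j DROP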

We're going to open ports for both Red Hat's high-availability add-on components and LINBIT's DRBD software.

Specifically, we'll be ACCEPTing the ports listed below on both nodes.

Component Protocol Port Note
dlm TCP 21064
drbd TCP 7788+ Each DRBD resource will use an additional port, generally counting up (ie: r0 will use 7788, r1 will use 7789, r2 will use 7790 and so on).
luci TCP 8084 Optional web-based configuration tool, not used in this tutorial but documented for reference.
modclusterd TCP 16851
ricci TCP 11111
totem UDP/multicast 5404, 5405 Uses a multicast group for cluster communications

Configuring iptables

Note: Configuring iptables is an entire topic on its own. There are many good tutorials on the Internet discussing it, including an older introduction to iptables tutorial hosted here. If you are unfamiliar with iptables, it is well worth taking a break from this tutorial and getting familiar with it, in concept if nothing else.
Note: This opens up enough ports for 100 virtual servers. This is an entirely arbitrary range, which you probably want to reduce (or possibly increase). This also allows incoming connections from both the BCN and IFN, which you may want to change. Please look below for the 'remote desktop' rules comment.

The first thing we want to do is see what the current firewall policy is. We can do this with iptables-save, a tool designed to back up iptables rules but also very useful for seeing what configuration is currently in memory.

an-a05n01
iptables-save
# Generated by iptables-save v1.4.7 on Wed Nov 13 15:49:17 2013
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [440:262242]
-A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT 
-A INPUT -p icmp -j ACCEPT 
-A INPUT -i lo -j ACCEPT 
-A INPUT -p tcp -m state --state NEW -m tcp --dport 22 -j ACCEPT 
-A INPUT -j REJECT --reject-with icmp-host-prohibited 
-A FORWARD -j REJECT --reject-with icmp-host-prohibited 
COMMIT
# Completed on Wed Nov 13 15:49:17 2013
an-a05n02
iptables-save
# Generated by iptables-save v1.4.7 on Wed Nov 13 15:49:51 2013
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [336:129880]
-A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT 
-A INPUT -p icmp -j ACCEPT 
-A INPUT -i lo -j ACCEPT 
-A INPUT -p tcp -m state --state NEW -m tcp --dport 22 -j ACCEPT 
-A INPUT -j REJECT --reject-with icmp-host-prohibited 
-A FORWARD -j REJECT --reject-with icmp-host-prohibited 
COMMIT
# Completed on Wed Nov 13 15:49:51 2013
Note: This tutorial will create two DRBD resources. Each resource will use a different TCP port. By convention, they start at port 7788 and increment up per resource. So we will be opening ports 7788 and 7789.
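
If you later add more resources, each new one needs its own rule. For example, a hypothetical third resource (r2) would use port 7790 by this convention, and could be opened with the same pattern used below:

# DRBD resource 2 - on the SN (only needed if you add a third resource)
iptables -I INPUT -m state --state NEW -p tcp -s 10.10.0.0/16 -d 10.10.0.0/16 --dport 7790 -j ACCEPT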

Open ports;

an-a05n01
# cman (corosync's totem)
iptables -I INPUT -m state --state NEW -m multiport -p udp -s 10.20.0.0/16 -d 10.20.0.0/16 --dports 5404,5405 -j ACCEPT
iptables -I INPUT -m addrtype --dst-type MULTICAST -m state --state NEW -m multiport -p udp -s 10.20.0.0/16 --dports 5404,5405 -j ACCEPT

# dlm
iptables -I INPUT -m state --state NEW -p tcp -s 10.20.0.0/16 -d 10.20.0.0/16 --dport 21064 -j ACCEPT 

# ricci
iptables -I INPUT -m state --state NEW -p tcp -s 10.20.0.0/16 -d 10.20.0.0/16 --dport 11111 -j ACCEPT

# modclusterd
iptables -I INPUT -m state --state NEW -p tcp -s 10.20.0.0/16 -d 10.20.0.0/16 --dport 16851 -j ACCEPT

# multicast (igmp; Internet group management protocol)
iptables -I INPUT -p igmp -j ACCEPT

# DRBD resource 0 and 1 - on the SN
iptables -I INPUT -m state --state NEW -p tcp -s 10.10.0.0/16 -d 10.10.0.0/16 --dport 7788 -j ACCEPT
iptables -I INPUT -m state --state NEW -p tcp -s 10.10.0.0/16 -d 10.10.0.0/16 --dport 7789 -j ACCEPT

# KVM live-migration ports on BCN
iptables -I INPUT -p tcp -m tcp -s 10.20.0.0/16 -d 10.20.0.0/16 --dport 49152:49216 -j ACCEPT

# Allow remote desktop access to servers on both the IFN and BCN. This opens 100 ports. If you want
# to change this range, put the range '5900:(5900+VM count)'.
iptables -I INPUT -m state --state NEW -p tcp -s 10.20.0.0/16 -d 10.20.0.0/16 --dport 5900:5999 -j ACCEPT 
iptables -I INPUT -m state --state NEW -p tcp -s 10.255.0.0/16 -d 10.255.0.0/16 --dport 5900:5999 -j ACCEPT 

# See the new configuration
iptables-save
# Generated by iptables-save v1.4.7 on Tue Mar 25 13:55:54 2014
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [52:8454]
-A INPUT -s 10.255.0.0/16 -d 10.255.0.0/16 -p tcp -m state --state NEW -m tcp --dport 5900:5999 -j ACCEPT 
-A INPUT -s 10.20.0.0/16 -d 10.20.0.0/16 -p tcp -m state --state NEW -m tcp --dport 5900:5999 -j ACCEPT 
-A INPUT -s 10.20.0.0/16 -d 10.20.0.0/16 -p tcp -m tcp --dport 49152:49216 -j ACCEPT 
-A INPUT -s 10.10.0.0/16 -d 10.10.0.0/16 -p tcp -m state --state NEW -m tcp --dport 7789 -j ACCEPT 
-A INPUT -s 10.10.0.0/16 -d 10.10.0.0/16 -p tcp -m state --state NEW -m tcp --dport 7788 -j ACCEPT 
-A INPUT -p igmp -j ACCEPT 
-A INPUT -s 10.20.0.0/16 -d 10.20.0.0/16 -p tcp -m state --state NEW -m tcp --dport 16851 -j ACCEPT 
-A INPUT -s 10.20.0.0/16 -d 10.20.0.0/16 -p tcp -m state --state NEW -m tcp --dport 11111 -j ACCEPT 
-A INPUT -s 10.20.0.0/16 -d 10.20.0.0/16 -p tcp -m state --state NEW -m tcp --dport 21064 -j ACCEPT 
-A INPUT -s 10.20.0.0/16 -p udp -m addrtype --dst-type MULTICAST -m state --state NEW -m multiport --dports 5404,5405 -j ACCEPT 
-A INPUT -s 10.20.0.0/16 -d 10.20.0.0/16 -p udp -m state --state NEW -m multiport --dports 5404,5405 -j ACCEPT 
-A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT 
-A INPUT -p icmp -j ACCEPT 
-A INPUT -i lo -j ACCEPT 
-A INPUT -p tcp -m state --state NEW -m tcp --dport 22 -j ACCEPT 
-A INPUT -j REJECT --reject-with icmp-host-prohibited 
-A FORWARD -j REJECT --reject-with icmp-host-prohibited 
COMMIT
# Completed on Tue Mar 25 13:55:54 2014
an-a05n02
# cman (corosync's totem)
iptables -I INPUT -m state --state NEW -m multiport -p udp -s 10.20.0.0/16 -d 10.20.0.0/16 --dports 5404,5405 -j ACCEPT
iptables -I INPUT -m addrtype --dst-type MULTICAST -m state --state NEW -m multiport -p udp -s 10.20.0.0/16 --dports 5404,5405 -j ACCEPT

# dlm
iptables -I INPUT -m state --state NEW -p tcp -s 10.20.0.0/16 -d 10.20.0.0/16 --dport 21064 -j ACCEPT 

# ricci
iptables -I INPUT -m state --state NEW -p tcp -s 10.20.0.0/16 -d 10.20.0.0/16 --dport 11111 -j ACCEPT

# modclusterd
iptables -I INPUT -m state --state NEW -p tcp -s 10.20.0.0/16 -d 10.20.0.0/16 --dport 16851 -j ACCEPT

# multicast (igmp; Internet group management protocol)
iptables -I INPUT -p igmp -j ACCEPT

# DRBD resource 0 and 1 - on the SN
iptables -I INPUT -m state --state NEW -p tcp -s 10.10.0.0/16 -d 10.10.0.0/16 --dport 7788 -j ACCEPT
iptables -I INPUT -m state --state NEW -p tcp -s 10.10.0.0/16 -d 10.10.0.0/16 --dport 7789 -j ACCEPT

# KVM live-migration ports on BCN
iptables -I INPUT -p tcp -m tcp -s 10.20.0.0/16 -d 10.20.0.0/16 --dport 49152:49216 -j ACCEPT

# Allow remote desktop access to servers on both the IFN and BCN. This opens 100 ports. If you want
# to change this range, put the range '5900:(5900+VM count)'.
iptables -I INPUT -m state --state NEW -p tcp -s 10.20.0.0/16 -d 10.20.0.0/16 --dport 5900:5999 -j ACCEPT 
iptables -I INPUT -m state --state NEW -p tcp -s 10.255.0.0/16 -d 10.255.0.0/16 --dport 5900:5999 -j ACCEPT 

# See the new configuration
iptables-save
# Generated by iptables-save v1.4.7 on Tue Mar 25 13:55:54 2014
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [16:5452]
-A INPUT -s 10.255.0.0/16 -d 10.255.0.0/16 -p tcp -m state --state NEW -m tcp --dport 5900:5999 -j ACCEPT 
-A INPUT -s 10.20.0.0/16 -d 10.20.0.0/16 -p tcp -m state --state NEW -m tcp --dport 5900:5999 -j ACCEPT 
-A INPUT -s 10.20.0.0/16 -d 10.20.0.0/16 -p tcp -m tcp --dport 49152:49216 -j ACCEPT 
-A INPUT -s 10.10.0.0/16 -d 10.10.0.0/16 -p tcp -m state --state NEW -m tcp --dport 7789 -j ACCEPT 
-A INPUT -s 10.10.0.0/16 -d 10.10.0.0/16 -p tcp -m state --state NEW -m tcp --dport 7788 -j ACCEPT 
-A INPUT -p igmp -j ACCEPT 
-A INPUT -s 10.20.0.0/16 -d 10.20.0.0/16 -p tcp -m state --state NEW -m tcp --dport 16851 -j ACCEPT 
-A INPUT -s 10.20.0.0/16 -d 10.20.0.0/16 -p tcp -m state --state NEW -m tcp --dport 11111 -j ACCEPT 
-A INPUT -s 10.20.0.0/16 -d 10.20.0.0/16 -p tcp -m state --state NEW -m tcp --dport 21064 -j ACCEPT 
-A INPUT -s 10.20.0.0/16 -p udp -m addrtype --dst-type MULTICAST -m state --state NEW -m multiport --dports 5404,5405 -j ACCEPT 
-A INPUT -s 10.20.0.0/16 -d 10.20.0.0/16 -p udp -m state --state NEW -m multiport --dports 5404,5405 -j ACCEPT 
-A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT 
-A INPUT -p icmp -j ACCEPT 
-A INPUT -i lo -j ACCEPT 
-A INPUT -p tcp -m state --state NEW -m tcp --dport 22 -j ACCEPT 
-A INPUT -j REJECT --reject-with icmp-host-prohibited 
-A FORWARD -j REJECT --reject-with icmp-host-prohibited 
COMMIT
# Completed on Tue Mar 25 13:55:54 2014

At this point, the cluster stack should work, but we're not done yet. The changes we made above altered packet filtering in memory, but the configuration has not been saved to disk. This configuration is saved in /etc/sysconfig/iptables. You could pipe the output of iptables-save to it, but the iptables initialization script provides a facility to save the configuration, so we will use it instead.
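
If you did want to write it by hand, piping the current rules into that file would look like this; we'll use the init script's save function below instead.

iptables-save > /etc/sysconfig/iptables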

an-a05n01
/etc/init.d/iptables save
iptables: Saving firewall rules to /etc/sysconfig/iptables:[  OK  ]
an-a05n02
/etc/init.d/iptables save
iptables: Saving firewall rules to /etc/sysconfig/iptables:[  OK  ]

Now we'll restart iptables and check that the changes stuck.

an-a05n01
/etc/init.d/iptables restart
iptables: Flushing firewall rules:                         [  OK  ]
iptables: Setting chains to policy ACCEPT: filter          [  OK  ]
iptables: Unloading modules:                               [  OK  ]
iptables: Applying firewall rules:                         [  OK  ]
iptables-save
# Generated by iptables-save v1.4.7 on Tue Mar 25 14:06:43 2014
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [41947:617170766]
-A INPUT -s 10.255.0.0/16 -d 10.255.0.0/16 -p tcp -m state --state NEW -m tcp --dport 5900:5999 -j ACCEPT 
-A INPUT -s 10.20.0.0/16 -d 10.20.0.0/16 -p tcp -m state --state NEW -m tcp --dport 5900:5999 -j ACCEPT 
-A INPUT -s 10.20.0.0/16 -d 10.20.0.0/16 -p tcp -m tcp --dport 49152:49216 -j ACCEPT 
-A INPUT -s 10.10.0.0/16 -d 10.10.0.0/16 -p tcp -m state --state NEW -m tcp --dport 7789 -j ACCEPT 
-A INPUT -s 10.10.0.0/16 -d 10.10.0.0/16 -p tcp -m state --state NEW -m tcp --dport 7788 -j ACCEPT 
-A INPUT -p igmp -j ACCEPT 
-A INPUT -s 10.20.0.0/16 -d 10.20.0.0/16 -p tcp -m state --state NEW -m tcp --dport 16851 -j ACCEPT 
-A INPUT -s 10.20.0.0/16 -d 10.20.0.0/16 -p tcp -m state --state NEW -m tcp --dport 11111 -j ACCEPT 
-A INPUT -s 10.20.0.0/16 -d 10.20.0.0/16 -p tcp -m state --state NEW -m tcp --dport 21064 -j ACCEPT 
-A INPUT -s 10.20.0.0/16 -p udp -m addrtype --dst-type MULTICAST -m state --state NEW -m multiport --dports 5404,5405 -j ACCEPT 
-A INPUT -s 10.20.0.0/16 -d 10.20.0.0/16 -p udp -m state --state NEW -m multiport --dports 5404,5405 -j ACCEPT 
-A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT 
-A INPUT -p icmp -j ACCEPT 
-A INPUT -i lo -j ACCEPT 
-A INPUT -p tcp -m state --state NEW -m tcp --dport 22 -j ACCEPT 
-A INPUT -j REJECT --reject-with icmp-host-prohibited 
-A FORWARD -j REJECT --reject-with icmp-host-prohibited 
COMMIT
# Completed on Tue Mar 25 14:06:43 2014
an-a05n02
/etc/init.d/iptables restart
iptables: Flushing firewall rules:                         [  OK  ]
iptables: Setting chains to policy ACCEPT: filter          [  OK  ]
iptables: Unloading modules:                               [  OK  ]
iptables: Applying firewall rules:                         [  OK  ]
iptables-save
# Generated by iptables-save v1.4.7 on Tue Mar 25 14:07:00 2014
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [41570:54856696]
-A INPUT -s 10.255.0.0/16 -d 10.255.0.0/16 -p tcp -m state --state NEW -m tcp --dport 5900:5999 -j ACCEPT 
-A INPUT -s 10.20.0.0/16 -d 10.20.0.0/16 -p tcp -m state --state NEW -m tcp --dport 5900:5999 -j ACCEPT 
-A INPUT -s 10.20.0.0/16 -d 10.20.0.0/16 -p tcp -m tcp --dport 49152:49216 -j ACCEPT 
-A INPUT -s 10.10.0.0/16 -d 10.10.0.0/16 -p tcp -m state --state NEW -m tcp --dport 7789 -j ACCEPT 
-A INPUT -s 10.10.0.0/16 -d 10.10.0.0/16 -p tcp -m state --state NEW -m tcp --dport 7788 -j ACCEPT 
-A INPUT -p igmp -j ACCEPT 
-A INPUT -s 10.20.0.0/16 -d 10.20.0.0/16 -p tcp -m state --state NEW -m tcp --dport 16851 -j ACCEPT 
-A INPUT -s 10.20.0.0/16 -d 10.20.0.0/16 -p tcp -m state --state NEW -m tcp --dport 11111 -j ACCEPT 
-A INPUT -s 10.20.0.0/16 -d 10.20.0.0/16 -p tcp -m state --state NEW -m tcp --dport 21064 -j ACCEPT 
-A INPUT -s 10.20.0.0/16 -p udp -m addrtype --dst-type MULTICAST -m state --state NEW -m multiport --dports 5404,5405 -j ACCEPT 
-A INPUT -s 10.20.0.0/16 -d 10.20.0.0/16 -p udp -m state --state NEW -m multiport --dports 5404,5405 -j ACCEPT 
-A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT 
-A INPUT -p icmp -j ACCEPT 
-A INPUT -i lo -j ACCEPT 
-A INPUT -p tcp -m state --state NEW -m tcp --dport 22 -j ACCEPT 
-A INPUT -j REJECT --reject-with icmp-host-prohibited 
-A FORWARD -j REJECT --reject-with icmp-host-prohibited 
COMMIT
# Completed on Tue Mar 25 14:07:00 2014

Perfect!

If you want to enable any other kind of access or otherwise modify the firewall on each node, please do so now. This way, as you proceed with building the Anvil!, you'll hit firewall problems as soon as they arise.

Mapping Physical Network Interfaces to ethX Device Names

Note: This process is a little lengthy and it would add a fair amount of length to document the process on both nodes. So for this section, only an-a05n01 will be shown. Please repeat this process for both nodes.
[Image: an-a05n01 rear panel showing the labelled network interfaces. Caption: "Awesome quality picture of labelled interfaces."]

Consistency is the mother of stability.

When you install RHEL, it somewhat randomly assigns an ethX device name to each physical network interface. Purely technically speaking, this is fine. So long as you know which interface has which device name, you can set up the node's networking.

However!

Consistently assigning the same device names to physical interfaces makes supporting and maintaining nodes a lot easier!

We've got six physical network interfaces, which we will name bcn_link1, bcn_link2, sn_link1, sn_link2, ifn_link1 and ifn_link2. As you recall from earlier, we want to make sure that each pair of interfaces for each network spans two physical network cards.

Most servers have at least two on-board network cards labelled "1" and "2". These tend to correspond to lights on the front of the server, so we will start by naming these interfaces bcn_link1 and sn_link1, respectively. After that, you are largely free to assign names to interfaces however you see fit.

What matters most of all is that, whatever order you choose, it's consistent across your Anvil! nodes.

Before we touch anything, let's make a backup of what we have. This way, we have an easy out in case we "oops" a file.

mkdir -p /root/backups/
rsync -av /etc/sysconfig/network-scripts /root/backups/
sending incremental file list
created directory /root/backups
network-scripts/
network-scripts/ifcfg-eth0
network-scripts/ifcfg-eth1
network-scripts/ifcfg-eth2
network-scripts/ifcfg-eth3
network-scripts/ifcfg-eth4
network-scripts/ifcfg-eth5
network-scripts/ifcfg-lo
network-scripts/ifdown -> ../../../sbin/ifdown
network-scripts/ifdown-bnep
network-scripts/ifdown-eth
network-scripts/ifdown-ippp
network-scripts/ifdown-ipv6
network-scripts/ifdown-isdn -> ifdown-ippp
network-scripts/ifdown-post
network-scripts/ifdown-ppp
network-scripts/ifdown-routes
network-scripts/ifdown-sit
network-scripts/ifdown-tunnel
network-scripts/ifup -> ../../../sbin/ifup
network-scripts/ifup-aliases
network-scripts/ifup-bnep
network-scripts/ifup-eth
network-scripts/ifup-ippp
network-scripts/ifup-ipv6
network-scripts/ifup-isdn -> ifup-ippp
network-scripts/ifup-plip
network-scripts/ifup-plusb
network-scripts/ifup-post
network-scripts/ifup-ppp
network-scripts/ifup-routes
network-scripts/ifup-sit
network-scripts/ifup-tunnel
network-scripts/ifup-wireless
network-scripts/init.ipv6-global
network-scripts/net.hotplug
network-scripts/network-functions
network-scripts/network-functions-ipv6

sent 134870 bytes  received 655 bytes  271050.00 bytes/sec
total size is 132706  speedup is 0.98

Making Sure All Network Interfaces are Started

What we're going to do is watch /var/log/messages, unplug each cable and see which interface shows a lost link. This will tell us what current name is given to a particular physical interface. We'll write the current name down beside the name of the interface we want. Once we've done this to all interfaces, we'll know how we have to move the names around.

Before we can pull cables though, we have to tell the system to start all of the interfaces. By default, all but one or two interfaces will be disabled on boot.

Run this to see which interfaces are up;

ifconfig
eth4      Link encap:Ethernet  HWaddr 00:19:99:9C:9B:9E  
          inet addr:10.255.0.33  Bcast:10.255.255.255  Mask:255.255.0.0
          inet6 addr: fe80::219:99ff:fe9c:9b9e/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:303118 errors:0 dropped:0 overruns:0 frame:0
          TX packets:152952 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:344900765 (328.9 MiB)  TX bytes:14424290 (13.7 MiB)
          Memory:ce660000-ce680000 

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:3540 errors:0 dropped:0 overruns:0 frame:0
          TX packets:3540 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:2652436 (2.5 MiB)  TX bytes:2652436 (2.5 MiB)

In this case, only the interface currently named eth4 was started. We'll need to edit the other interface configuration files to tell them to start when the network starts. To do this, we edit the /etc/sysconfig/network-scripts/ifcfg-ethX files and change the ONBOOT variable to ONBOOT="yes".

By default, most interfaces will be set to try and acquire an IP address from a DHCP server. We can see that eth4 already has an IP address, so to save time, we're going to tell the other interfaces to start without an IP address at all. If we didn't do this, restarting the network would take a long time waiting for DHCP requests to time out.

Note: We skip ifcfg-eth4 in the next step because it's already up.

Now we can use sed to edit the files. This is a lot faster and easier than editing each file by hand.

# Change eth0 to start on boot with no IP address.
sed -i 's/ONBOOT=.*/ONBOOT="yes"/'        /etc/sysconfig/network-scripts/ifcfg-eth0
sed -i 's/BOOTPROTO=.*/BOOTPROTO="none"/' /etc/sysconfig/network-scripts/ifcfg-eth0

# Change eth1 to start on boot with no IP address.
sed -i 's/ONBOOT=.*/ONBOOT="yes"/'        /etc/sysconfig/network-scripts/ifcfg-eth1
sed -i 's/BOOTPROTO=.*/BOOTPROTO="none"/' /etc/sysconfig/network-scripts/ifcfg-eth1

# Change eth2 to start on boot with no IP address.
sed -i 's/ONBOOT=.*/ONBOOT="yes"/'        /etc/sysconfig/network-scripts/ifcfg-eth2
sed -i 's/BOOTPROTO=.*/BOOTPROTO="none"/' /etc/sysconfig/network-scripts/ifcfg-eth2

# Change eth3 to start on boot with no IP address.
sed -i 's/ONBOOT=.*/ONBOOT="yes"/'        /etc/sysconfig/network-scripts/ifcfg-eth3
sed -i 's/BOOTPROTO=.*/BOOTPROTO="none"/' /etc/sysconfig/network-scripts/ifcfg-eth3

# Change eth5 to start on boot with no IP address.
sed -i 's/ONBOOT=.*/ONBOOT="yes"/'        /etc/sysconfig/network-scripts/ifcfg-eth5
sed -i 's/BOOTPROTO=.*/BOOTPROTO="none"/' /etc/sysconfig/network-scripts/ifcfg-eth5

You can see how the file was changed by using diff to compare the backed up version against the edited one. Let's look at ifcfg-eth0 to see this;

diff -U0 /root/backups/network-scripts/ifcfg-eth0 /etc/sysconfig/network-scripts/ifcfg-eth0
--- /root/backups/network-scripts/eth0	2013-10-28 12:30:07.000000000 -0400
+++ /etc/sysconfig/network-scripts/eth0_link1	2013-10-28 17:20:38.978458128 -0400
@@ -2 +2 @@
-BOOTPROTO="dhcp"
+BOOTPROTO="none"
@@ -5 +5 @@
-ONBOOT="no"
+ONBOOT="yes"

Excellent. You can check the other files to confirm that they were edited too, if you wish. Once you are happy with the changes, restart the network initialization script.
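
If you'd like to check them all in one pass, a small loop over the edited files works; we skip eth4, which was left untouched.

for i in 0 1 2 3 5; do diff -U0 /root/backups/network-scripts/ifcfg-eth$i /etc/sysconfig/network-scripts/ifcfg-eth$i; done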

Note: You may see [FAILED] while stopping some interfaces, this is not a concern.
/etc/init.d/network restart
Shutting down interface eth0:                              [  OK  ]
Shutting down interface eth1:                              [  OK  ]
Shutting down interface eth2:                              [  OK  ]
Shutting down interface eth3:                              [  OK  ]
Shutting down interface eth4:                              [  OK  ]
Shutting down interface eth5:                              [  OK  ]
Shutting down loopback interface:                          [  OK  ]
Bringing up loopback interface:                            [  OK  ]
Bringing up interface eth0:                                [  OK  ]
Bringing up interface eth1:                                [  OK  ]
Bringing up interface eth2:                                [  OK  ]
Bringing up interface eth3:                                [  OK  ]
Determining IP information for eth4... done.
                                                           [  OK  ]
Bringing up interface eth5:                                [  OK  ]

Now if we look at ifconfig again, we'll see all six interfaces have been started!

ifconfig
eth0      Link encap:Ethernet  HWaddr 00:1B:21:81:C3:34  
          inet6 addr: fe80::21b:21ff:fe81:c334/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:2433 errors:0 dropped:0 overruns:0 frame:0
          TX packets:31 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:150042 (146.5 KiB)  TX bytes:3066 (2.9 KiB)
          Interrupt:24 Memory:ce240000-ce260000 

eth1      Link encap:Ethernet  HWaddr 00:1B:21:81:C3:35  
          inet6 addr: fe80::21b:21ff:fe81:c335/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:2416 errors:0 dropped:0 overruns:0 frame:0
          TX packets:31 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:148176 (144.7 KiB)  TX bytes:3066 (2.9 KiB)
          Interrupt:34 Memory:ce2a0000-ce2c0000 

eth2      Link encap:Ethernet  HWaddr A0:36:9F:02:E0:04  
          inet6 addr: fe80::a236:9fff:fe02:e004/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:3 errors:0 dropped:0 overruns:0 frame:0
          TX packets:36 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:1026 (1.0 KiB)  TX bytes:5976 (5.8 KiB)
          Memory:ce400000-ce500000 

eth3      Link encap:Ethernet  HWaddr A0:36:9F:02:E0:05  
          inet6 addr: fe80::a236:9fff:fe02:e005/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1606 errors:0 dropped:0 overruns:0 frame:0
          TX packets:21 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:98242 (95.9 KiB)  TX bytes:2102 (2.0 KiB)
          Memory:ce500000-ce600000 

eth4      Link encap:Ethernet  HWaddr 00:19:99:9C:9B:9E  
          inet addr:10.255.0.33  Bcast:10.255.255.255  Mask:255.255.0.0
          inet6 addr: fe80::219:99ff:fe9c:9b9e/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:308572 errors:0 dropped:0 overruns:0 frame:0
          TX packets:153402 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:345254511 (329.2 MiB)  TX bytes:14520378 (13.8 MiB)
          Memory:ce660000-ce680000 

eth5      Link encap:Ethernet  HWaddr 00:19:99:9C:9B:9F  
          inet6 addr: fe80::219:99ff:fe9c:9b9f/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:6 errors:0 dropped:0 overruns:0 frame:0
          TX packets:23 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:2052 (2.0 KiB)  TX bytes:3114 (3.0 KiB)
          Memory:ce6c0000-ce6e0000 

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:3540 errors:0 dropped:0 overruns:0 frame:0
          TX packets:3540 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:2652436 (2.5 MiB)  TX bytes:2652436 (2.5 MiB)

Excellent! Now we can start creating the list of what physical interfaces have what current names.

Finding Current Names for Physical Interfaces

Once you know how you want your interfaces, create a little table like this:

Have Want
bcn_link1
sn_link1
ifn_link1
bcn_link2
sn_link2
ifn_link2

Now we want to use a program called tail to watch the system log file /var/log/messages and print to screen messages as they're written to the log. To do this, run;

tail -f -n 0 /var/log/messages

When you run this, the cursor will just sit there and nothing will be printed to screen at first. This is fine; it tells us that tail is waiting for new records. We're now going to methodically unplug each network cable, wait a moment and then plug it back in. Each time we do this, we'll write down the interface name that was reported as going down and then coming back up.

The first cable we're going to unplug is the one in the physical interface we want to make bcn_link1.

Oct 28 17:36:06 an-a05n01 kernel: igb: eth4 NIC Link is Down
Oct 28 17:36:19 an-a05n01 kernel: igb: eth4 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX

Here we see that the physical interface that we want to be bcn_link1 is currently called eth4. So we'll add that to our chart.

Have Want
eth4 bcn_link1
sn_link1
ifn_link1
bcn_link2
sn_link2
ifn_link2

Now we'll unplug the cable we want to make sn_link1:

Oct 28 17:38:01 an-a05n01 kernel: igb: eth5 NIC Link is Down
Oct 28 17:38:04 an-a05n01 kernel: igb: eth5 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX

It's currently called eth5, so we'll write that in beside the "Want" column's sn_link1 entry.

Have Want
eth4 bcn_link1
eth5 sn_link1
ifn_link1
bcn_link2
sn_link2
ifn_link2

Keep doing this for the other four cables.

Oct 28 17:39:28 an-a05n01 kernel: e1000e: eth0 NIC Link is Down
Oct 28 17:39:30 an-a05n01 kernel: e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Oct 28 17:39:35 an-a05n01 kernel: e1000e: eth1 NIC Link is Down
Oct 28 17:39:37 an-a05n01 kernel: e1000e: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Oct 28 17:39:40 an-a05n01 kernel: igb: eth2 NIC Link is Down
Oct 28 17:39:43 an-a05n01 kernel: igb: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
Oct 28 17:39:47 an-a05n01 kernel: igb: eth3 NIC Link is Down
Oct 28 17:39:51 an-a05n01 kernel: igb: eth3 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX

The finished table is this;


Have Want
eth4 bcn_link1
eth5 sn_link1
eth0 ifn_link1
eth1 bcn_link2
eth2 sn_link2
eth3 ifn_link2

Now we know how we want to move the names around!

Building the MAC Address List

Note: This section was written before the conversion from the ethX names to the {bcn,sn,ifn}_link{1,2} naming. Please rename the ifcfg-ethX files and update the DEVICE="ethX" entries to reflect the new names here.

Every network interface has a unique MAC address assigned to it when it is built. Think of this sort of like a globally unique serial number. Because it's guaranteed to be unique, it's a convenient way for the operating system to create a persistent map between real interfaces and device names. If we didn't use these, the names could get shuffled each time you rebooted the node. Not very good.

RHEL uses two files for creating this map:

  • /etc/udev/rules.d/70-persistent-net.rules
  • /etc/sysconfig/network-scripts/ifcfg-eth*

The 70-persistent-net.rules file can be rebuilt by running a command, so we're not going to worry about it. We'll just delete it in a little bit and then recreate it.

The files we care about are the six ifcfg-ethX files. Inside each of these is a variable named HWADDR. The value set here will tell the OS what physical network interface the given file is configuring. We know from the list we created how we want to move the files around.
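
For illustration only, here is roughly what the key lines of ifcfg-bcn_link1 will look like once the renaming and MAC remapping below are done (the values come from the tables in this section; more settings will be added when we configure bonding later):

DEVICE="bcn_link1"
HWADDR="00:19:99:9C:9B:9E"
BOOTPROTO="none"
ONBOOT="yes"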

To recap:

  • The HWADDR MAC address in eth4 will be moved to bcn_link1.
  • The HWADDR MAC address in eth5 will be moved to sn_link1.
  • The HWADDR MAC address in eth0 will be moved to ifn_link1.
  • The HWADDR MAC address in eth1 will be moved to bcn_link2.
  • The HWADDR MAC address in eth2 will be moved to sn_link2.
  • The HWADDR MAC address in eth3 will be moved to ifn_link2.

So let's create a new table. This one we will use to write down the MAC addresses we want to set for each device.

Device New MAC address
bcn_link1
sn_link1
ifn_link1
bcn_link2
sn_link2
ifn_link2

So we know that the MAC address currently assigned to eth4 is the one we want to move to bcn_link1. We can use ifconfig to show the information for the eth4 interface only.

ifconfig eth4
eth4      Link encap:Ethernet  HWaddr 00:19:99:9C:9B:9E  
          inet addr:10.255.0.33  Bcast:10.255.255.255  Mask:255.255.0.0
          inet6 addr: fe80::219:99ff:fe9c:9b9e/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:315979 errors:0 dropped:0 overruns:0 frame:0
          TX packets:153610 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:345711965 (329.6 MiB)  TX bytes:14555290 (13.8 MiB)
          Memory:ce660000-ce680000

We want the HWaddr value, 00:19:99:9C:9B:9E. This will be moved to bcn_link1, so let's write that down.

Device New MAC address
bcn_link1 00:19:99:9C:9B:9E
sn_link1
ifn_link1
bcn_link2
sn_link2
ifn_link2

Next up, we want to move eth5 to be the new sn_link1. We can use ifconfig again, but this time we'll do a little bash-fu to reduce the output to just the MAC address.

ifconfig eth5 | grep HWaddr | awk '{print $5}'
00:19:99:9C:9B:9F

This simply reduced the output to just the line containing HWaddr, then split the line on spaces and printed the fifth value, which is the MAC address currently assigned to eth5. We'll write this down beside sn_link1.

Device New MAC address
bcn_link1 00:19:99:9C:9B:9E
sn_link1 00:19:99:9C:9B:9F
ifn_link1
bcn_link2
sn_link2
ifn_link2

Next up, we want to move the current eth0 over to ifn_link1. So let's get the current eth0 MAC address and add it to the list as well.

ifconfig eth0 | grep HWaddr | awk '{print $5}'
00:1B:21:81:C3:34

Now we want to move eth1 to bcn_link2;

ifconfig eth1 | grep HWaddr | awk '{print $5}'
00:1B:21:81:C3:35

Second to last one is eth2, which will move to sn_link2;

ifconfig eth2 | grep HWaddr | awk '{print $5}'
A0:36:9F:02:E0:04

Finally, eth3 moves to ifn_link2;

ifconfig eth3 | grep HWaddr | awk '{print $5}'
A0:36:9F:02:E0:05
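
If you prefer, you can grab all six MAC addresses in one go with a small shell loop built on the same grep and awk trick:

for i in eth0 eth1 eth2 eth3 eth4 eth5; do echo -n "$i: "; ifconfig $i | grep HWaddr | awk '{print $5}'; done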

Our complete list of new MAC addresses is;

Device New MAC address
bcn_link1 00:19:99:9C:9B:9E
sn_link1 00:19:99:9C:9B:9F
ifn_link1 00:1B:21:81:C3:34
bcn_link2 00:1B:21:81:C3:35
sn_link2 A0:36:9F:02:E0:04
ifn_link2 A0:36:9F:02:E0:05

Excellent! Now we're ready.

Changing the Interface Device Names

Warning: This step is best done when you have direct access to the node. The reason is that the following changes require the network to be totally stopped in order to work without a reboot. If you can't get physical access, then when we get to the start_udev step, reboot the node instead.

We're about to change which physical interfaces have which device names. If we don't stop the network first, we won't be able to restart it cleanly later; the kernel would see a conflict between what it thinks the MAC-to-name mapping should be and what it sees in the configuration files. The only way around that is a reboot, which is kind of a waste. So by stopping the network now, we clear the kernel's view of the network and avoid the problem entirely.

So, stop the network.

an-a05n01
/etc/init.d/network stop
Shutting down interface eth0:                              [  OK  ]
Shutting down interface eth1:                              [  OK  ]
Shutting down interface eth2:                              [  OK  ]
Shutting down interface eth3:                              [  OK  ]
Shutting down interface eth4:                              [  OK  ]
Shutting down interface eth5:                              [  OK  ]
Shutting down loopback interface:                          [  OK  ]

We can confirm that it's stopped by running ifconfig. It should return nothing at all.

an-a05n01
ifconfig
<No output>

Good. Next, delete the /etc/udev/rules.d/70-persistent-net.rules file. We'll regenerate it after we're done.

an-a05n01
rm /etc/udev/rules.d/70-persistent-net.rules
rm: remove regular file `/etc/udev/rules.d/70-persistent-net.rules'? y

Note: Please rename the ifcfg-ethX files to be called ifcfg-{bcn,sn,ifn}_link{1,2} here!

Now we need to edit each of the renamed ifcfg files and change the HWADDR value to the new addresses we wrote down in our list. Let's start with ifcfg-bcn_link1.

an-a05n01
vim /etc/sysconfig/network-scripts/ifcfg-bcn_link1

Change the line:

HWADDR="00:1B:21:81:C3:34"

To the new value from our list;

HWADDR="00:19:99:9C:9B:9E"

Save the file and then move on to ifcfg-sn_link1.

an-a05n01
vim /etc/sysconfig/network-scripts/ifcfg-sn_link1

Change the current HWADDR="00:1B:21:81:C3:35" entry to the new MAC address;

HWADDR="00:19:99:9C:9B:9F"

Continue editing the other four ifcfg files in the same manner.

Once all the files have been edited, we will regenerate the 70-persistent-net.rules.

an-a05n01
start_udev
Starting udev:                                             [  OK  ]

Test the New Network Name Mapping

It's time to start networking again and see if the remapping worked!

an-a05n01
/etc/init.d/network start
Bringing up loopback interface:                            [  OK  ]
Bringing up interface bcn_link1:                           [  OK  ]
Bringing up interface sn_link1:                            [  OK  ]
Bringing up interface ifn_link1:                           [  OK  ]
Bringing up interface bcn_link2:                           [  OK  ]
Bringing up interface sn_link2:
Determining IP information for sn_link2...PING 10.255.255.254 (10.255.255.254) from 10.255.0.33 sn_link2: 56(84) bytes of data.

--- 10.255.255.254 ping statistics ---
4 packets transmitted, 0 received, +3 errors, 100% packet loss, time 3000ms
pipe 3
 failed.
                                                           [FAILED]
Bringing up interface ifn_link2:                           [  OK  ]

What happened!?

If you recall, the configuration that still had DHCP enabled (originally eth4's) ended up on sn_link2 after the rename. The new sn_link2 is not plugged into a network with access to our DHCP server, so it failed to get an IP address. To fix this, we'll disable DHCP on the new sn_link2 and enable it on the new ifn_link1, which is connected to the network with the DHCP server.

an-a05n01
sed -i 's/BOOTPROTO.*/BOOTPROTO="none"/' /etc/sysconfig/network-scripts/ifcfg-sn_link2
sed -i 's/BOOTPROTO.*/BOOTPROTO="dhcp"/' /etc/sysconfig/network-scripts/ifcfg-ifn_link1

Now we'll restart the network and this time we should be good.

an-a05n01
/etc/init.d/network restart
Shutting down interface bcn_link1:                         [  OK  ]
Shutting down interface sn_link1:                          [  OK  ]
Shutting down interface ifn_link1:                         [  OK  ]
Shutting down interface bcn_link2:                         [  OK  ]
Shutting down interface sn_link2:                          [  OK  ]
Shutting down interface ifn_link2:                         [  OK  ]
Shutting down loopback interface:                          [  OK  ]
Bringing up loopback interface:                            [  OK  ]
Bringing up interface bcn_link1:
Determining IP information for bcn_link1... done.
                                                           [  OK  ]
Bringing up interface sn_link1:                            [  OK  ]
Bringing up interface ifn_link1:                           [  OK  ]
Bringing up interface bcn_link2:                           [  OK  ]
Bringing up interface sn_link2:                            [  OK  ]
Bringing up interface ifn_link2:                           [  OK  ]

The last step is to tail the system log again and then unplug and plug back in each cable. If everything went well, they should now be in the right order.

an-a05n01
tail -f -n 0 /var/log/messages
Oct 28 18:44:24 an-a05n01 kernel: igb: bcn_link1 NIC Link is Down
Oct 28 18:44:27 an-a05n01 kernel: igb: bcn_link1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
Oct 28 18:44:31 an-a05n01 kernel: igb: sn_link1 NIC Link is Down
Oct 28 18:44:34 an-a05n01 kernel: igb: sn_link1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
Oct 28 18:44:35 an-a05n01 kernel: e1000e: ifn_link1 NIC Link is Down
Oct 28 18:44:38 an-a05n01 kernel: e1000e: ifn_link1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Oct 28 18:44:39 an-a05n01 kernel: e1000e: bcn_link2 NIC Link is Down
Oct 28 18:44:42 an-a05n01 kernel: e1000e: bcn_link2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Oct 28 18:44:45 an-a05n01 kernel: igb: sn_link2 NIC Link is Down
Oct 28 18:44:49 an-a05n01 kernel: igb: sn_link2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
Oct 28 18:44:50 an-a05n01 kernel: igb: ifn_link2 NIC Link is Down
Oct 28 18:44:54 an-a05n01 kernel: igb: ifn_link2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX

Woohoo! Done!

At this point, I like to refresh the backup. We're going to be making more changes later and it would be nice to not have to redo this step, should something go wrong.

an-a05n01
rsync -av /etc/sysconfig/network-scripts /root/backups/
sending incremental file list
network-scripts/
network-scripts/ifcfg-bcn_link1
network-scripts/ifcfg-sn_link1
network-scripts/ifcfg-ifn_link1
network-scripts/ifcfg-bcn_link2
network-scripts/ifcfg-sn_link2
network-scripts/ifcfg-ifn_link2

sent 1955 bytes  received 130 bytes  4170.00 bytes/sec
total size is 132711  speedup is 63.65

Repeat this process for the other node. Once both nodes have the matching physical interface to device names, we'll be ready to move on to the next step!

Configuring our Bridge, Bonds and Interfaces

To set up our network, we will need to edit the ifcfg-{bcn,sn,ifn}_link{1,2}, ifcfg-{bcn,sn,ifn}_bond1 and ifcfg-ifn_bridge1 scripts.

The ifn_bridge1 device is a bridge, like a virtual network switch, which will be used to route network connections between the virtual machines and the outside world via the IFN. If you look at the network map, you will see that the ifn_bridge1 virtual interface connects to ifn_bond1, which links to the outside world, and it connects to all servers, just like a normal switch does. You will also note that the bridge will have the IP address, not the bonded interface ifn_bond1; the bond will instead be slaved to the ifn_bridge1 bridge.
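
Once the bridge is up (after the configuration below is in place and the network has been restarted), you can sanity-check that ifn_bond1 is attached to it using the brctl tool from the bridge-utils package we installed earlier:

brctl show ifn_bridge1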

The {bcn,sn,ifn}_bond1 virtual devices work a lot like the network version of RAID level 1 arrays. They take two real links and turn them into one redundant link. In our case, each link in the bond will go to a different switch, protecting our links against interface, cable, port or entire switch failures. Should any of these fail, the bond will switch to the backup link so quickly that the applications on the nodes will not notice anything happened.
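
Likewise, once a bond exists, the bonding driver reports its state, including which slave link is currently active, through a file under /proc. For example, for the BCN bond:

cat /proc/net/bonding/bcn_bond1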

We're going to be editing a lot of files. It's best to lay out what we'll be doing in a chart. So our setup will be:

Node BCN IP and Device SN IP and Device IFN IP and Device
an-a05n01 10.20.50.1 on bcn_bond1 10.10.50.1 on sn_bond1 10.255.50.1 on ifn_bridge1 (ifn_bond1 slaved)
an-a05n02 10.20.50.2 on bcn_bond1 10.10.50.2 on sn_bond1 10.255.50.2 on ifn_bridge1 (ifn_bond1 slaved)

Creating New Network Configuration Files

The new bond and bridge devices we want to create do not exist at all yet. So we will start by touching the configuration files we will need.

an-a05n01
touch /etc/sysconfig/network-scripts/ifcfg-{bcn,sn,ifn}_bond1
touch /etc/sysconfig/network-scripts/ifcfg-ifn_bridge1
an-a05n02
touch /etc/sysconfig/network-scripts/ifcfg-{bcn,sn,ifn}_bond1
touch /etc/sysconfig/network-scripts/ifcfg-ifn_bridge1

Configuring the Bridge

We'll start in reverse order, crafting the bridge's script first.

an-a05n01 IFN Bridge: an-a05n02 IFN Bridge:
vim /etc/sysconfig/network-scripts/ifcfg-ifn_bridge1
# Internet-Facing Network - Bridge
DEVICE="ifn_bridge1"
TYPE="Bridge"
NM_CONTROLLED="no"
BOOTPROTO="none"
IPADDR="10.255.50.1"
NETMASK="255.255.0.0"
GATEWAY="10.255.255.254"
DNS1="8.8.8.8"
DNS2="8.8.4.4"
DEFROUTE="yes"
vim /etc/sysconfig/network-scripts/ifcfg-ifn_bridge1
# Internet-Facing Network - Bridge
DEVICE="ifn_bridge1"
TYPE="Bridge"
NM_CONTROLLED="no"
BOOTPROTO="none"
IPADDR="10.255.50.2"
NETMASK="255.255.0.0"
GATEWAY="10.255.255.254"
DNS1="8.8.8.8"
DNS2="8.8.4.4"
DEFROUTE="yes"

If you have a Red Hat account, you can read up on what the options above mean, and on the specifics of bridge devices. In case you don't though, here is a summary:

Variable Description
DEVICE This is the actual name given to this device. Generally it matches the file name; in this case, the DEVICE is ifn_bridge1 and the file name is ifcfg-ifn_bridge1. This matching of file name to device name is by convention and not strictly required.
TYPE This is either Ethernet, the default, or Bridge, as we use here. Note that these values are case-sensitive! By setting this here, we're telling the OS that we're creating a bridge device.
NM_CONTROLLED This can be yes, which is the default, or no, as we set here. This tells Network Manager that it is not allowed to manage this device. We've removed the NetworkManager package, so this is not strictly needed, but we'll add it just in case it gets installed in the future.
BOOTPROTO This can be either none, which we're using here, or dhcp or bootp if you want the interface to get an IP from a DHCP or BOOTP server, respectively. We're assigning a static IP address, so we set this to none.
IPADDR This is the dotted-decimal IP address we're assigning to this interface.
NETMASK This is the dotted-decimal subnet mask for this interface.
GATEWAY This is the IP address the node will contact when it needs to send traffic to other networks, like the Internet.
DNS1 This is the IP address of the primary domain name server to use when the node needs to translate a host or domain name into an IP address which wasn't found in the /etc/hosts file.
DNS2 This is the IP address of the backup domain name server, should the primary DNS server specified above fail.
DEFROUTE This can be set to yes, as we've set it here, or no. If two or more interfaces define a gateway, the interface with DEFROUTE set to yes will provide the default route.
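One small aside; because NetworkManager is out of the picture, the DNS1 and DNS2 values above are written into /etc/resolv.conf by the network scripts when the interface comes up. Once we restart the network later, a quick optional check is:

an-a05n01
cat /etc/resolv.conf

The two nameserver entries should show up there (assuming PEERDNS hasn't been set to no anywhere).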

Creating the Bonded Interfaces

Next up, we'll create the three bonding configuration files. This is where two physical network interfaces are tied together to work like a single, highly available network interface. You can think of a bonded interface as being akin to RAID level 1; a new virtual device is created out of two real devices.

We're going to see a long line called "BONDING_OPTS". Let's look at the meaning of these options before we look at the configuration;

Variable Description
mode This tells the Linux kernel what kind of bond we're creating here. There are seven modes available, each represented by a numeric value. We're going to use the "Active/Passive" mode, known as mode 1 (active-backup). As of RHEL 6.4, mode 0 (balance-rr) and mode 2 (balance-xor) are also supported for use with corosync. Given how reliably mode 1 has survived our numerous failure and recovery tests, though, AN! still strongly recommends mode 1.
miimon This tells the kernel how often, in milliseconds, to check for unreported link failures. We're using 100 which tells the bonding driver to check if the network cable has been unplugged or plugged in every 100 milliseconds. Most modern drivers will report link state via their driver, so this option is not strictly required, but it is recommended for extra safety.
use_carrier Setting this to 1 tells the bonding driver to rely on the network driver's reported carrier state to determine the link state. Some drivers don't support this properly. If you run into trouble where the link shows as up when it's actually down, get a new network card or try changing this to 0.
updelay Setting this to 120000 tells the driver to delay switching back to the primary interface for 120,000 milliseconds (120 seconds / 2 minutes) after its link comes up. This is designed to give the switch connected to the primary interface time to finish booting. Setting this too low may cause the bonding driver to switch back before the network switch is ready to actually move data. Some switches will not provide a link until they are fully booted, so please experiment.
downdelay Setting this to 0 tells the driver not to wait before changing the state of an interface when the link goes down. That is, when the driver detects a fault, it will switch to the backup interface immediately. This is the default behaviour, but setting it here ensures that it is reset when the interface is reset, should the delay have somehow been set elsewhere.
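Once the bonds are up (we'll restart the network later in this section), the kernel exposes these same options under sysfs. As an optional check that the driver really picked up your BONDING_OPTS, you can read them back; for example, for bcn_bond1:

an-a05n01
cat /sys/class/net/bcn_bond1/bonding/mode
cat /sys/class/net/bcn_bond1/bonding/miimon
cat /sys/class/net/bcn_bond1/bonding/updelay

These should report active-backup, 100 and 120000, respectively.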

The first bond we'll configure is for the Back-Channel Network.

an-a05n01 BCN Bond
vim /etc/sysconfig/network-scripts/ifcfg-bcn_bond1
# Back-Channel Network - Bond
DEVICE="bcn_bond1"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
BONDING_OPTS="mode=1 miimon=100 use_carrier=1 updelay=120000 downdelay=0 primary=bcn_link1"
IPADDR="10.20.50.1"
NETMASK="255.255.0.0"
an-a05n02 BCN Bond
vim /etc/sysconfig/network-scripts/ifcfg-bcn_bond1
# Back-Channel Network - Bond
DEVICE="bcn_bond1"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
BONDING_OPTS="mode=1 miimon=100 use_carrier=1 updelay=120000 downdelay=0 primary=bcn_link1"
IPADDR="10.20.50.2"
NETMASK="255.255.0.0"

Next up is the bond for the Storage Network;

an-a05n01 SN Bond:
vim /etc/sysconfig/network-scripts/ifcfg-sn_bond1
# Storage Network - Bond
DEVICE="sn_bond1"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
BONDING_OPTS="mode=1 miimon=100 use_carrier=1 updelay=120000 downdelay=0 primary=sn_link1"
IPADDR="10.10.50.1"
NETMASK="255.255.0.0"
an-a05n02 SN Bond:
vim /etc/sysconfig/network-scripts/ifcfg-sn_bond1
# Storage Network - Bond
DEVICE="sn_bond1"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
BONDING_OPTS="mode=1 miimon=100 use_carrier=1 updelay=120000 downdelay=0 primary=sn_link1"
IPADDR="10.10.50.2"
NETMASK="255.255.0.0"

Finally, we setup the bond for the Internet-Facing Network.

Here we see a new option:

  • BRIDGE="ifn_bridge1"; This tells the system that this bond is to be connected to the ifn_bridge1 bridge when it is started.
an-a05n01 IFN Bond:
vim /etc/sysconfig/network-scripts/ifcfg-ifn_bond1
# Internet-Facing Network - Bond
DEVICE="ifn_bond1"
BRIDGE="ifn_bridge1"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
BONDING_OPTS="mode=1 miimon=100 use_carrier=1 updelay=120000 downdelay=0 primary=ifn_link1"
an-a05n02 IFN Bond:
vim /etc/sysconfig/network-scripts/ifcfg-ifn_bond1
# Internet-Facing Network - Bond
DEVICE="ifn_bond1"
BRIDGE="ifn_bridge1"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
BONDING_OPTS="mode=1 miimon=100 use_carrier=1 updelay=120000 downdelay=0 primary=ifn_link1"

Done with the bonds!

Alter the Interface Configurations

With the bridge and bonds in place, we can now alter the interface configurations.

We've already edited these back when we were remapping the physical interface to device names. This time, we're going to clean them up, add a comment and slave them to their parent bonds. Note that the only difference between each node's given config file will be the HWADDR variable's value.

  • BCN bcn_bond1, Link 1;
an-a05n01's bcn_link1
vim /etc/sysconfig/network-scripts/ifcfg-bcn_link1
# Back-Channel Network - Link 1
HWADDR="00:19:99:9C:9B:9E"
DEVICE="bcn_link1"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
MASTER="bcn_bond1"
SLAVE="yes"
an-a05n02's bcn_link1
vim /etc/sysconfig/network-scripts/ifcfg-bcn_link1
# Back-Channel Network - Link 1
HWADDR="00:19:99:9C:A0:6C"
DEVICE="bcn_link1"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
MASTER="bcn_bond1"
SLAVE="yes"
  • SN sn_bond1, Link 1:
an-a05n01's sn_link1
vim /etc/sysconfig/network-scripts/ifcfg-sn_link1
# Storage Network - Link 1
DEVICE="sn_link1"
HWADDR="00:19:99:9C:9B:9F"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
MASTER="sn_bond1"
SLAVE="yes"
an-a05n02's sn_link1
vim /etc/sysconfig/network-scripts/ifcfg-sn_link1
# Storage Network - Link 1
DEVICE="sn_link1"
HWADDR="00:19:99:9C:A0:6D"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
MASTER="sn_bond1"
SLAVE="yes"
  • IFN ifn_bond1, Link 1:
an-a05n01's ifn_link1
vim /etc/sysconfig/network-scripts/ifcfg-ifn_link1
# Internet-Facing Network - Link 1
HWADDR="00:1B:21:81:C3:34"
DEVICE="ifn_link1"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
MASTER="ifn_bond1"
SLAVE="yes"
an-a05n02's ifn_link1
vim /etc/sysconfig/network-scripts/ifcfg-ifn_link1
# Internet-Facing Network - Link 1
HWADDR="00:1B:21:81:C2:EA"
DEVICE="ifn_link1"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
MASTER="ifn_bond1"
SLAVE="yes"
  • BCN bcn_bond1, Link 2:
an-a05n01's bcn_link2
vim /etc/sysconfig/network-scripts/ifcfg-bcn_link2
# Back-Channel Network - Link 2
HWADDR="00:1B:21:81:C3:35"
DEVICE="bcn_link2"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
MASTER="bcn_bond1"
SLAVE="yes"
an-a05n02's bcn_link2
vim /etc/sysconfig/network-scripts/ifcfg-bcn_link2
# Back-Channel Network - Link 2
HWADDR="00:1B:21:81:C2:EB"
DEVICE="bcn_link2"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
MASTER="bcn_bond1"
SLAVE="yes"
  • SN sn_bond1, Link 2:
an-a05n01's sn_link2
vim /etc/sysconfig/network-scripts/ifcfg-sn_link2
# Storage Network - Link 2
HWADDR="A0:36:9F:02:E0:04"
DEVICE="sn_link2"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
MASTER="sn_bond1"
SLAVE="yes"
an-a05n02's sn_link2
vim /etc/sysconfig/network-scripts/ifcfg-sn_link2
# Storage Network - Link 2
HWADDR="A0:36:9F:07:D6:2E"
DEVICE="sn_link2"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
MASTER="sn_bond1"
SLAVE="yes"
  • IFN ifn_bond1, Link 2:
an-a05n01's ifn_link2
vim /etc/sysconfig/network-scripts/ifcfg-ifn_link2
# Internet-Facing Network - Link 2
HWADDR="A0:36:9F:02:E0:05"
DEVICE="ifn_link2"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
MASTER="ifn_bond1"
SLAVE="yes"
an-a05n02's ifn_link2
vim /etc/sysconfig/network-scripts/ifcfg-ifn_link2
# Internet-Facing Network - Link 2
HWADDR="A0:36:9F:07:D6:2F"
DEVICE="ifn_link2"
NM_CONTROLLED="no"
BOOTPROTO="none"
ONBOOT="yes"
MASTER="ifn_bond1"
SLAVE="yes"

The order of the variables is not really important from a technical perspective. However, we've found that keeping the order as consistent as possible between configs and nodes goes a long way to simplifying support and problem solving. It certainly helps reduce human error as well.

If we compare the newly updated configs with one of the backups, we'll see a couple of interesting things;

an-a05n01's bcn_link1
diff -U0 /root/backups/network-scripts/ifcfg-eth4 /etc/sysconfig/network-scripts/ifcfg-bcn_link1
--- /root/backups/network-scripts/ifcfg-eth4		2013-10-28 18:39:59.000000000 -0400
+++ /etc/sysconfig/network-scripts/ifcfg-bcn_link1	2013-10-29 13:25:03.443343494 -0400
@@ -1,2 +1 @@
-DEVICE="eth4"
-BOOTPROTO="dhcp"
+# Back-Channel Network - Link 1
@@ -4 +3,3 @@
-NM_CONTROLLED="yes"
+DEVICE="bcn_link1"
+NM_CONTROLLED="no"
+BOOTPROTO="none"
@@ -6,2 +7,2 @@
-TYPE="Ethernet"
-UUID="ea03dc97-019c-4acc-b4d6-bc42d30d9e36"
+MASTER="bcn_bond1"
+SLAVE="yes"

The notable part is that TYPE and UUID were removed. These are not required, so we generally remove them. If you prefer to keep them, that is fine, too.

Loading the New Network Configuration

Warning: If you're connected to the nodes over the network and if the current IP was assigned by DHCP (or is otherwise different from the IP set in ifn_bridge1), your network connection will break. You will need to reconnect with the IP address you set.

Simply restart the network service.

an-a05n01
/etc/init.d/network restart
Shutting down interface bcn_link1:  /etc/sysconfig/network-scripts/ifdown-eth: line 116: /sys/class/net/bcn_bond1/bonding/slaves: No such file or directory
                                                           [  OK  ]
Shutting down interface sn_link1:  /etc/sysconfig/network-scripts/ifdown-eth: line 116: /sys/class/net/sn_bond1/bonding/slaves: No such file or directory
                                                           [  OK  ]
Shutting down interface ifn_link1:  /etc/sysconfig/network-scripts/ifdown-eth: line 116: /sys/class/net/ifn_bond1/bonding/slaves: No such file or directory
                                                           [  OK  ]
Shutting down interface bcn_link2:  /etc/sysconfig/network-scripts/ifdown-eth: line 116: /sys/class/net/bcn_bond1/bonding/slaves: No such file or directory
                                                           [  OK  ]
Shutting down interface sn_link2:  /etc/sysconfig/network-scripts/ifdown-eth: line 116: /sys/class/net/sn_bond1/bonding/slaves: No such file or directory
                                                           [  OK  ]
Shutting down interface ifn_link2:  /etc/sysconfig/network-scripts/ifdown-eth: line 116: /sys/class/net/ifn_bond1/bonding/slaves: No such file or directory
                                                           [  OK  ]
Shutting down loopback interface:                          [  OK  ]
Bringing up loopback interface:                            [  OK  ]
Bringing up interface bcn_bond1:                           [  OK  ]
Bringing up interface sn_bond1:                            [  OK  ]
Bringing up interface ifn_bond1:                           [  OK  ]
Bringing up interface ifn_bridge1:                         [  OK  ]

These errors are normal. They're caused because the slaved interfaces' configuration files now reference bonded interfaces that, at the time we restarted the network, did not yet exist. If you restart the network again, you will see that the errors no longer appear.
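If you want to confirm that, restart the network a second time and check that each bond now lists both of its slaves; this reads the same sysfs file the error messages above complained about:

an-a05n01
/etc/init.d/network restart
cat /sys/class/net/bcn_bond1/bonding/slaves

You should see bcn_link1 and bcn_link2 listed. The sn_bond1 and ifn_bond1 files can be checked the same way.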

Verifying the New Network Config

The first check to make sure everything works is to simply run ifconfig and make sure everything we expect to be there is, in fact, there.

an-a05n01 an-a05n02
ifconfig
bcn_bond1 Link encap:Ethernet  HWaddr 00:19:99:9C:9B:9E  
          inet addr:10.20.50.1  Bcast:10.20.255.255  Mask:255.255.0.0
          inet6 addr: fe80::219:99ff:fe9c:9b9e/64 Scope:Link
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:821080 errors:0 dropped:0 overruns:0 frame:0
          TX packets:160713 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:392278922 (374.1 MiB)  TX bytes:15344030 (14.6 MiB)

sn_bond1  Link encap:Ethernet  HWaddr 00:19:99:9C:9B:9F  
          inet addr:10.10.50.1  Bcast:10.10.255.255  Mask:255.255.0.0
          inet6 addr: fe80::219:99ff:fe9c:9b9f/64 Scope:Link
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:29 errors:0 dropped:0 overruns:0 frame:0
          TX packets:100 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:6030 (5.8 KiB)  TX bytes:13752 (13.4 KiB)

ifn_bond1 Link encap:Ethernet  HWaddr 00:1B:21:81:C3:34  
          inet6 addr: fe80::21b:21ff:fe81:c334/64 Scope:Link
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:512206 errors:0 dropped:0 overruns:0 frame:0
          TX packets:222 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:34650974 (33.0 MiB)  TX bytes:25375 (24.7 KiB)

bcn_link1 Link encap:Ethernet  HWaddr 00:19:99:9C:9B:9E  
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:570073 errors:0 dropped:0 overruns:0 frame:0
          TX packets:160669 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:377010981 (359.5 MiB)  TX bytes:15339986 (14.6 MiB)
          Memory:ce660000-ce680000 

sn_link1  Link encap:Ethernet  HWaddr 00:19:99:9C:9B:9F  
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:20 errors:0 dropped:0 overruns:0 frame:0
          TX packets:43 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:4644 (4.5 KiB)  TX bytes:4602 (4.4 KiB)
          Memory:ce6c0000-ce6e0000 

ifn_link1 Link encap:Ethernet  HWaddr 00:1B:21:81:C3:34  
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:262105 errors:0 dropped:0 overruns:0 frame:0
          TX packets:188 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:19438941 (18.5 MiB)  TX bytes:22295 (21.7 KiB)
          Interrupt:24 Memory:ce240000-ce260000 

bcn_link2 Link encap:Ethernet  HWaddr 00:19:99:9C:9B:9E  
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:251007 errors:0 dropped:0 overruns:0 frame:0
          TX packets:44 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:15267941 (14.5 MiB)  TX bytes:4044 (3.9 KiB)
          Interrupt:34 Memory:ce2a0000-ce2c0000 

sn_link2  Link encap:Ethernet  HWaddr 00:19:99:9C:9B:9F  
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:9 errors:0 dropped:0 overruns:0 frame:0
          TX packets:57 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:1386 (1.3 KiB)  TX bytes:9150 (8.9 KiB)
          Memory:ce400000-ce500000 

ifn_link2 Link encap:Ethernet  HWaddr 00:1B:21:81:C3:34  
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:250101 errors:0 dropped:0 overruns:0 frame:0
          TX packets:34 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:15212033 (14.5 MiB)  TX bytes:3080 (3.0 KiB)
          Memory:ce500000-ce600000 

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:3543 errors:0 dropped:0 overruns:0 frame:0
          TX packets:3543 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:2652772 (2.5 MiB)  TX bytes:2652772 (2.5 MiB)

ifn_bridge1 Link encap:Ethernet  HWaddr 00:1B:21:81:C3:34  
          inet addr:10.255.50.1  Bcast:10.255.255.255  Mask:255.255.0.0
          inet6 addr: fe80::21b:21ff:fe81:c334/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:4425 errors:0 dropped:0 overruns:0 frame:0
          TX packets:127 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:225580 (220.2 KiB)  TX bytes:17449 (17.0 KiB)
ifconfig
bcn_bond1 Link encap:Ethernet  HWaddr 00:19:99:9C:A0:6C  
          inet addr:10.20.50.2  Bcast:10.20.255.255  Mask:255.255.0.0
          inet6 addr: fe80::219:99ff:fe9c:a06c/64 Scope:Link
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:485064 errors:0 dropped:0 overruns:0 frame:0
          TX packets:42 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:29542689 (28.1 MiB)  TX bytes:3060 (2.9 KiB)

sn_bond1  Link encap:Ethernet  HWaddr 00:19:99:9C:A0:6D  
          inet addr:10.10.50.2  Bcast:10.10.255.255  Mask:255.255.0.0
          inet6 addr: fe80::219:99ff:fe9c:a06d/64 Scope:Link
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:7 errors:0 dropped:0 overruns:0 frame:0
          TX packets:41 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:420 (420.0 b)  TX bytes:3018 (2.9 KiB)

ifn_bond1 Link encap:Ethernet  HWaddr 00:1B:21:81:C2:EA  
          inet6 addr: fe80::21b:21ff:fe81:c2ea/64 Scope:Link
          UP BROADCAST RUNNING PROMISC MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:884093 errors:0 dropped:0 overruns:0 frame:0
          TX packets:161539 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:414267432 (395.0 MiB)  TX bytes:15355495 (14.6 MiB)

bcn_link1 Link encap:Ethernet  HWaddr 00:19:99:9C:A0:6C  
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:242549 errors:0 dropped:0 overruns:0 frame:0
          TX packets:29 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:14772701 (14.0 MiB)  TX bytes:2082 (2.0 KiB)
          Memory:ce660000-ce680000 

sn_link1  Link encap:Ethernet  HWaddr 00:19:99:9C:A0:6D  
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:3 errors:0 dropped:0 overruns:0 frame:0
          TX packets:28 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:180 (180.0 b)  TX bytes:2040 (1.9 KiB)
          Memory:ce6c0000-ce6e0000 

ifn_link1 Link encap:Ethernet  HWaddr 00:1B:21:81:C2:EA  
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:641600 errors:0 dropped:0 overruns:0 frame:0
          TX packets:161526 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:399497547 (380.9 MiB)  TX bytes:15354517 (14.6 MiB)
          Interrupt:24 Memory:ce240000-ce260000 

bcn_link2 Link encap:Ethernet  HWaddr 00:19:99:9C:A0:6C  
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:242515 errors:0 dropped:0 overruns:0 frame:0
          TX packets:13 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:14769988 (14.0 MiB)  TX bytes:978 (978.0 b)
          Interrupt:34 Memory:ce2a0000-ce2c0000 

sn_link2  Link encap:Ethernet  HWaddr 00:19:99:9C:A0:6D  
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:4 errors:0 dropped:0 overruns:0 frame:0
          TX packets:13 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:240 (240.0 b)  TX bytes:978 (978.0 b)
          Memory:ce400000-ce500000 

ifn_link2 Link encap:Ethernet  HWaddr 00:1B:21:81:C2:EA  
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:242493 errors:0 dropped:0 overruns:0 frame:0
          TX packets:13 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:14769885 (14.0 MiB)  TX bytes:978 (978.0 b)
          Memory:ce500000-ce600000 

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:3545 errors:0 dropped:0 overruns:0 frame:0
          TX packets:3545 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:2658626 (2.5 MiB)  TX bytes:2658626 (2.5 MiB)

ifn_bridge1 Link encap:Ethernet  HWaddr 00:1B:21:81:C2:EA  
          inet addr:10.255.50.2  Bcast:10.255.255.255  Mask:255.255.0.0
          inet6 addr: fe80::21b:21ff:fe81:c2ea/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:16091 errors:0 dropped:0 overruns:0 frame:0
          TX packets:48 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:777873 (759.6 KiB)  TX bytes:20304 (19.8 KiB)

Excellent, everything is there!
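If you prefer the newer ip tool over ifconfig, the same information is available per device. For example, to check just the bridge:

an-a05n01
ip addr show ifn_bridge1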

Next up is to verify the bonds. To do this, we can examine special files in the /proc virtual file system. These expose the kernel's view of things as if they were traditional files. So by reading these files, we can see how the bonded interfaces are operating in real time.

There are three, one for each bond. Let's start by looking at bcn_bond1's /proc/net/bonding/bcn_bond1 "file", then we'll look at the other two.

an-a05n01 an-a05n02
cat /proc/net/bonding/bcn_bond1
Ethernet Channel Bonding Driver: v3.6.0 (September 26, 2009)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: bcn_link1 (primary_reselect always)
Currently Active Slave: bcn_link1
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 120000
Down Delay (ms): 0

Slave Interface: bcn_link1
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:19:99:9c:9b:9e
Slave queue ID: 0

Slave Interface: bcn_link2
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:1b:21:81:c3:35
Slave queue ID: 0
cat /proc/net/bonding/bcn_bond1
Ethernet Channel Bonding Driver: v3.6.0 (September 26, 2009)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: bcn_link1 (primary_reselect always)
Currently Active Slave: bcn_link1
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 120000
Down Delay (ms): 0

Slave Interface: bcn_link1
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:19:99:9c:a0:6c
Slave queue ID: 0

Slave Interface: bcn_link2
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:1b:21:81:c2:eb
Slave queue ID: 0

Let's look at the variables and values we see for an-a05n01 above:

  • Bond variables;
Variable Description
Bonding Mode This tells us which bonding mode is currently active. Here we see fault-tolerance (active-backup), which is exactly what we wanted when we set mode=1 in the bond's configuration file.
Primary Slave This tells us that the bond will always use bcn_link1 if it is available. Recall that we set a primary interface to ensure that, when everything is working properly, all network traffic goes through the same switch to avoid congestion on the stack/uplink cable.
Currently Active Slave This tells us which interface is being used at this time. If this shows the secondary interface, then either the primary has failed, or the primary has recovered but the updelay timer hasn't yet expired.
MII Status This shows the effective link state of the bond. If either one of the slaved interfaces is active, this will be up.
MII Polling Interval (ms) If you recall, this was set to 100, which tells the bond driver to verify the link state of the slaved interfaces every 100 milliseconds.
Up Delay (ms) This tells us that the bond will wait for two minutes (120,000 ms) after a slaved interface comes up before it will consider it ready for use.
Down Delay (ms) This tells us how long the bond driver will wait before switching to the backup interface after a link fails. We want immediate fail-over, so we have this set to 0.
  • Slaved interface variables:
Variable bcn_link1 bcn_link2 Description
Slave Interface bcn_link1 bcn_link2 This is the name of the slaved device. The values below this reflect that named interface's state.
MII Status up up This shows the current link state of the interface. Values you will see are: up, down and going back. The first two are obvious; the third is the link state between when the link comes up and when the updelay timer expires.
Speed 1000 Mbps 1000 Mbps This tells you the link speed that the current interface is operating at. If it's ever lower than you expect, look in the switch configuration for statically set speeds. If that's not it, try another network cable.
Duplex full full This tells you whether the given interface can send and receive network traffic at the same time, full, or not, half. All modern devices should support full duplex, so if you see half, examine your switch and cables.
Link Failure Count 0 0 When the bond driver starts, this is set to 0. Each time the link "fails", which includes an intentional unplugging of the cable, this counter increments. There is no harm in this increasing if the "failures" were intentional or known. It can be useful in detecting flaky connections though, should you find this number to be higher than expected.
Permanent HW addr 00:19:99:9c:9b:9e 00:1b:21:81:c3:35 This is the real MAC address of the slaved interface. Those who are particularly observant will have noticed that, in the ifconfig output above, both bcn_link1 and bcn_link2 showed the same MAC address. This is partly how active-passive bonding is able to fail over so extremely quickly. The MAC address of whichever interface is active will appear in ifconfig as the HWaddr of both bond members.
Slave queue ID 0 0 In other bonding modes, this can be used to help direct certain traffic down certain slaved interface links. We won't use this, so it should always be 0.
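When you get to testing failure and recovery later (pulling cables, powering off a switch), it can be handy to watch this file update in real time. A simple way to do that, using bcn_bond1 as the example, is:

an-a05n01
watch -n 1 cat /proc/net/bonding/bcn_bond1

Press ctrl + c to exit watch when you're done.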

Now let's look at sn_bond1;

an-a05n01 an-a05n02
cat /proc/net/bonding/sn_bond1
Ethernet Channel Bonding Driver: v3.6.0 (September 26, 2009)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: sn_link1 (primary_reselect always)
Currently Active Slave: sn_link1
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 120000
Down Delay (ms): 0

Slave Interface: sn_link1
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:19:99:9c:9b:9f
Slave queue ID: 0

Slave Interface: sn_link2
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: a0:36:9f:02:e0:04
Slave queue ID: 0
cat /proc/net/bonding/sn_bond1
Ethernet Channel Bonding Driver: v3.6.0 (September 26, 2009)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: sn_link1 (primary_reselect always)
Currently Active Slave: sn_link1
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 120000
Down Delay (ms): 0

Slave Interface: sn_link1
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:19:99:9c:a0:6d
Slave queue ID: 0

Slave Interface: sn_link2
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: a0:36:9f:07:d6:2e
Slave queue ID: 0

The last bond is ifn_bond1;

an-a05n01 an-a05n02
cat /proc/net/bonding/ifn_bond1
Ethernet Channel Bonding Driver: v3.6.0 (September 26, 2009)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: ifn_link1 (primary_reselect always)
Currently Active Slave: ifn_link1
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 120000
Down Delay (ms): 0

Slave Interface: ifn_link1
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:1b:21:81:c3:34
Slave queue ID: 0

Slave Interface: ifn_link2
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: a0:36:9f:02:e0:05
Slave queue ID: 0
cat /proc/net/bonding/ifn_bond1
Ethernet Channel Bonding Driver: v3.6.0 (September 26, 2009)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: ifn_link1 (primary_reselect always)
Currently Active Slave: ifn_link1
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 120000
Down Delay (ms): 0

Slave Interface: ifn_link1
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:1b:21:81:c2:ea
Slave queue ID: 0

Slave Interface: ifn_link2
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: a0:36:9f:07:d6:2f
Slave queue ID: 0

That covers the bonds! The last thing to look at is the bridge. We can check it using the brctl (bridge control) tool;

an-a05n01 an-a05n02
brctl show
bridge name	bridge id		STP enabled	interfaces
ifn_bridge1	8000.001b2181c334	no		ifn_bond1
brctl show
bridge name     bridge id               STP enabled     interfaces
ifn_bridge1     8000.001b2181c2ea       no              ifn_bond1

There are four variables; Let's take a look at them.

Variable an-a05n01 an-a05n02 Description
bridge name ifn_bridge1 ifn_bridge1 This is the device name we set when we created the ifcfg-ifn_bridge1 configuration file.
bridge id 8000.001b2181c334 8000.001b2181c2ea This is an automatically created, unique ID for the given bridge.
STP enabled no no This tells us whether spanning tree protocol is enabled or not. The default is for it to be disabled, which is fine. If you enable it, it will help protect against loops that can cause broadcast storms and flood your network. Given how difficult it is to accidentally "plug both ends of a cable into the same switch", it's generally safe to leave it off.
interfaces ifn_bond1 ifn_bond1 This tells us which network interfaces are "plugged into" the bridge. We don't have any servers yet, so only ifn_bond1 is plugged in, which is the link that provides a route out to the real world. Later, when we create our servers, a vnetX file will be created for each server's interface. These are the virtual "network cables" providing a link between the servers and the bridge.
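Later, once servers are running and their vnetX interfaces show up here, you can also ask the bridge which MAC addresses it has learned on each port. This is purely optional, but handy when tracing traffic:

an-a05n01
brctl showmacs ifn_bridge1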

All done!

Adding Everything to /etc/hosts

If you recall from the AN!Cluster Tutorial 2#Network section, we've got two nodes, each with three networks and an IPMI interface, two network switches, two switched PDUs and two UPSes. We're also going to create two dashboard servers, each of which will have a connection to the BCN and the IFN.

All of these have IP addresses. We want to be able to address them by names, which we can do by adding them to each node's /etc/hosts file. If you prefer to have this centralized, you can always use internal DNS servers instead, but that is outside the scope of this tutorial.

The format of /etc/hosts is <ip_address> <name>[ <name2> <name...> <nameN>]. We want the short host name and the full domain name to resolve to the BCN IP address on the 10.20.0.0/16 network. For this, we'll have multiple names on the BCN entry and then a single name for the SN and IFN entries.

an-a05n01
vim /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

### Nodes 
# an-a05n01
10.20.50.1	an-a05n01.bcn an-a05n01 an-a05n01.alteeve.ca
10.20.51.1	an-a05n01.ipmi
10.10.50.1	an-a05n01.sn
10.255.50.1	an-a05n01.ifn

# an-a05n02
10.20.50.2	an-a05n02.bcn an-a05n02 an-a05n02.alteeve.ca
10.20.51.2	an-a05n02.ipmi
10.10.50.2	an-a05n02.sn
10.255.50.2	an-a05n02.ifn

### Foundation Pack
# Network Switches
10.20.1.1	an-switch01 an-switch01.alteeve.ca
10.20.1.2	an-switch02 an-switch02.alteeve.ca	# Only accessible when out of the stack

# Switched PDUs
10.20.2.1	an-pdu01 an-pdu01.alteeve.ca
10.20.2.2	an-pdu02 an-pdu02.alteeve.ca

# Network-monitored UPSes
10.20.3.1	an-ups01 an-ups01.alteeve.ca
10.20.3.2	an-ups02 an-ups02.alteeve.ca

### Striker Dashboards
10.20.4.1	an-striker01 an-striker01.alteeve.ca
10.255.4.1	an-striker01.ifn
10.20.4.2	an-striker02 an-striker02.alteeve.ca
10.255.4.2	an-striker02.ifn
an-a05n02
vim /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

### Nodes 
# an-a05n01
10.20.50.1	an-a05n01.bcn an-a05n01 an-a05n01.alteeve.ca
10.20.51.1	an-a05n01.ipmi
10.10.50.1	an-a05n01.sn
10.255.50.1	an-a05n01.ifn

# an-a05n02
10.20.50.2	an-a05n02.bcn an-a05n02 an-a05n02.alteeve.ca
10.20.51.2	an-a05n02.ipmi
10.10.50.2	an-a05n02.sn
10.255.50.2	an-a05n02.ifn

### Foundation Pack
# Network Switches
10.20.1.1	an-switch01 an-switch01.alteeve.ca
10.20.1.2	an-switch02 an-switch02.alteeve.ca	# Only accessible when out of the stack

# Switched PDUs
10.20.2.1	an-pdu01 an-pdu01.alteeve.ca
10.20.2.2	an-pdu02 an-pdu02.alteeve.ca

# Network-monitored UPSes
10.20.3.1	an-ups01 an-ups01.alteeve.ca
10.20.3.2	an-ups02 an-ups02.alteeve.ca

### Striker Dashboards
10.20.4.1	an-striker01 an-striker01.alteeve.ca
10.255.4.1	an-striker01.ifn
10.20.4.2	an-striker02 an-striker02.alteeve.ca
10.255.4.2	an-striker02.ifn

Save this to both nodes and then you can test that the names resolve properly using gethostip -d $name. Let's look at the names we gave to the nodes and verify they resolve to the desired IP addresses.

an-a05n01 an-a05n02
gethostip -d an-a05n01.alteeve.ca
10.20.50.1
gethostip -d an-a05n01
10.20.50.1
gethostip -d an-a05n01.bcn
10.20.50.1
gethostip -d an-a05n01.sn
10.10.50.1
gethostip -d an-a05n01.ifn
10.255.50.1
gethostip -d an-a05n01.ipmi
10.20.51.1
gethostip -d an-a05n02.alteeve.ca
10.20.50.2
gethostip -d an-a05n02
10.20.50.2
gethostip -d an-a05n02.bcn
10.20.50.2
gethostip -d an-a05n02.sn
10.10.50.2
gethostip -d an-a05n02.ifn
10.255.50.2
gethostip -d an-a05n02.ipmi
10.20.51.2
gethostip -d an-a05n01.alteeve.ca
10.20.50.1
gethostip -d an-a05n01
10.20.50.1
gethostip -d an-a05n01.bcn
10.20.50.1
gethostip -d an-a05n01.sn
10.10.50.1
gethostip -d an-a05n01.ifn
10.255.50.1
gethostip -d an-a05n01.ipmi
10.20.51.1
gethostip -d an-a05n02.alteeve.ca
10.20.50.2
gethostip -d an-a05n02
10.20.50.2
gethostip -d an-a05n02.bcn
10.20.50.2
gethostip -d an-a05n02.sn
10.10.50.2
gethostip -d an-a05n02.ifn
10.255.50.2
gethostip -d an-a05n02.ipmi
10.20.51.2

Excellent! Test resolution of the foundation pack devices and the monitor packs as well. If they all resolve properly, we're ready to move on.
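Rather than typing each gethostip call by hand, you can loop over the foundation pack and Striker names we just added; a small convenience sketch using the names from the /etc/hosts file above:

an-a05n01
for name in an-switch01 an-switch02 an-pdu01 an-pdu02 an-ups01 an-ups02 an-striker01 an-striker02
do
    echo -n "$name: "
    gethostip -d $name
done

Each name should print with its 10.20.x.y address beside it.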

What is IPMI

IPMI, short for "Intelligent Platform Management Interface", is a standardized, network-attached device built in to many servers. It is a stand-alone device which allows external people and devices to log in and check the state of the host server. It can read the various sensor values, press the power and reset switches, report whether the host node is powered on or not, and so forth.

Many companies build on the basic IPMI standard by adding advanced features like remote console access over the network, the ability to monitor devices plugged into the server, like the RAID controller and its hard drives, and so on. Each vendor generally has a name for their implementation of IPMI;

  • Fujitsu calls theirs iRMC
  • HP calls theirs iLO
  • Dell calls theirs DRAC
  • IBM calls theirs RSA

Various other vendors will have different names as well. In most cases though, they will all support the generic IPMI interface and Linux tools. We're going to use these tools to configure each node's IPMI "BMC", Baseboard Management Controller, for use as a fence device.

The idea here is this;

If a node stops responding, the remaining surviving node can't simply assume the peer is off. We'll go into the details of "why not?" later in the fencing section. The remaining node will log into the peer's IPMI BMC and ask it to power off the host. Once off, the surviving node will verify that the power is off, confirming that the peer is certainly no longer alive and offering clustered services. With this known, recovery can safely begin.

We need to assign an IP address to each IPMI BMC and then configure the user name and password to use later when connecting.

We will also use the sensor values reported by the IPMI BMC in our monitoring and alert system. If, for example, a temperature climbs too high or too fast, the alert system will be able to see this and fire off an alert.

Reading IPMI Data

Note: This section walks through configuring IPMI on an-a05n01 only. Please repeat for an-a05n02.

We installed the needed IPMI tools earlier and we set ipmi to start on boot. Verify that it's running now;

an-a05n01
/etc/init.d/ipmi status
ipmi_msghandler module loaded.
ipmi_si module loaded.
ipmi_devintf module loaded.
/dev/ipmi0 exists.

This tells us that the ipmi daemon is running and it was able to talk to the BMC. If this failed, /dev/ipmi0 would not exist. If this is the case for you, please find what make and model of IPMI BMC is used in your server and look for known issues with that chip.

The first thing we'll check is that we can query IPMI's chassis data:

an-a05n01
ipmitool chassis status
System Power         : on
Power Overload       : false
Power Interlock      : inactive
Main Power Fault     : false
Power Control Fault  : false
Power Restore Policy : previous
Last Power Event     : 
Chassis Intrusion    : inactive
Front-Panel Lockout  : inactive
Drive Fault          : false
Cooling/Fan Fault    : false
Sleep Button Disable : not allowed
Diag Button Disable  : allowed
Reset Button Disable : allowed
Power Button Disable : allowed
Sleep Button Disabled: false
Diag Button Disabled : false
Reset Button Disabled: false
Power Button Disabled: false

Excellent! If you get something like this, you're past 90% of the potential problems.

We can check more information on the hosts using mc to query the management controller.

an-a05n01
ipmitool mc info
Device ID                 : 2
Device Revision           : 2
Firmware Revision         : 1.1
IPMI Version              : 2.0
Manufacturer ID           : 10368
Manufacturer Name         : Fujitsu Siemens
Product ID                : 611 (0x0263)
Product Name              : Unknown (0x263)
Device Available          : yes
Provides Device SDRs      : no
Additional Device Support :
    Sensor Device
    SDR Repository Device
    SEL Device
    FRU Inventory Device
    IPMB Event Receiver
    Bridge
    Chassis Device
Aux Firmware Rev Info     : 
    0x05
    0x08
    0x00
    0x41

Some servers will report the details of "field replaceable units"; components that can be swapped out as needed. Every server will report different data here, but you can see what our RX300 S6 returns below.

an-a05n01
ipmitool fru print
FRU Device Description : Builtin FRU Device (ID 0)
 Device not present (Requested sensor, data, or record not found)

FRU Device Description : Chassis (ID 2)
 Chassis Type			 : Rack Mount Chassis
 Chassis Extra			 : RX300S6R1
 Product Manufacturer  : FUJITSU
 Product Name          : PRIMERGY RX300 S6
 Product Part Number   : ABN:K1344-V101-2204
 Product Version       : GS01
 Product Serial        : xxxxxxxxxx
 Product Asset Tag     : 15
 Product Extra         : 25a978
 Product Extra         : 0263

FRU Device Description : MainBoard (ID 3)
 Board Mfg Date        : Wed Dec 22 07:36:00 2010
 Board Mfg             : FUJITSU
 Board Product         : D2619
 Board Serial          : xxxxxxxx
 Board Part Number     : S26361-D2619-N15
 Board Extra           : WGS10 GS02
 Board Extra           : 02

FRU Device Description : PSU1 (ID 7)
 Unknown FRU header version 0x02

FRU Device Description : PSU2 (ID 8)
 Unknown FRU header version 0x02

We can check all the sensor values using ipmitool as well. This is actually what the cluster monitor we'll install later does.

an-a05n01
ipmitool sdr list
Ambient          | 27.50 degrees C   | ok
Systemboard      | 43 degrees C      | ok
CPU1             | 34 degrees C      | ok
CPU2             | 37 degrees C      | ok
DIMM-1A          | 29 degrees C      | ok
DIMM-2A          | disabled          | ns
DIMM-3A          | disabled          | ns
DIMM-1B          | 29 degrees C      | ok
DIMM-2B          | disabled          | ns
DIMM-3B          | disabled          | ns
DIMM-1C          | 29 degrees C      | ok
DIMM-2C          | disabled          | ns
DIMM-3C          | disabled          | ns
DIMM-1D          | 33 degrees C      | ok
DIMM-2D          | disabled          | ns
DIMM-3D          | disabled          | ns
DIMM-1E          | 33 degrees C      | ok
DIMM-2E          | disabled          | ns
DIMM-3E          | disabled          | ns
DIMM-1F          | 33 degrees C      | ok
DIMM-2F          | disabled          | ns
DIMM-3F          | disabled          | ns
BATT 3.0V        | 3.13 Volts        | ok
STBY 3.3V        | 3.35 Volts        | ok
iRMC 1.2V STBY   | 1.19 Volts        | ok
iRMC 1.8V STBY   | 1.80 Volts        | ok
LAN 1.0V STBY    | 1.01 Volts        | ok
LAN 1.8V STBY    | 1.81 Volts        | ok
MAIN 12V         | 12 Volts          | ok
MAIN 5.15V       | 5.18 Volts        | ok
MAIN 3.3V        | 3.37 Volts        | ok
IOH 1.1V         | 1.10 Volts        | ok
IOH 1.8V         | 1.80 Volts        | ok
ICH 1.5V         | 1.50 Volts        | ok
IOH 1.1V AUX     | 1.09 Volts        | ok
CPU1 1.8V        | 1.80 Volts        | ok
CPU2 1.8V        | 1.80 Volts        | ok
Total Power      | 190 Watts         | ok
PSU1 Power       | 100 Watts         | ok
PSU2 Power       | 80 Watts          | ok
CPU1 Power       | 5.50 Watts        | ok
CPU2 Power       | 4.40 Watts        | ok
Fan Power        | 15.84 Watts       | ok
Memory Power     | 8 Watts           | ok
HDD Power        | 45 Watts          | ok
FAN1 SYS         | 5340 RPM          | ok
FAN2 SYS         | 5160 RPM          | ok
FAN3 SYS         | 4920 RPM          | ok
FAN4 SYS         | 5160 RPM          | ok
FAN5 SYS         | 5100 RPM          | ok
FAN1 PSU1        | 6360 RPM          | ok
FAN2 PSU1        | 6480 RPM          | ok
FAN1 PSU2        | 6480 RPM          | ok
FAN2 PSU2        | 6240 RPM          | ok
I2C1 error ratio | 0 unspecified     | ok
I2C2 error ratio | 0 unspecified     | ok
I2C3 error ratio | 0 unspecified     | ok
I2C4 error ratio | 0 unspecified     | ok
I2C5 error ratio | 0 unspecified     | ok
I2C6 error ratio | 0 unspecified     | ok
SEL Level        | 0 unspecified     | ok
Ambient          | 0x02              | ok
CPU1             | 0x80              | ok
CPU2             | 0x80              | ok
Power Unit       | 0x01              | ok
PSU              | Not Readable      | ns
PSU1             | 0x02              | ok
PSU2             | 0x02              | ok
Fanboard Row 2   | 0x00              | ok
FAN1 SYS         | 0x01              | ok
FAN2 SYS         | 0x01              | ok
FAN3 SYS         | 0x01              | ok
FAN4 SYS         | 0x01              | ok
FAN5 SYS         | 0x01              | ok
FAN1 PSU1        | 0x01              | ok
FAN2 PSU1        | 0x01              | ok
FAN1 PSU2        | 0x01              | ok
FAN2 PSU2        | 0x01              | ok
FanBoard         | 0x02              | ok
DIMM-1A          | 0x02              | ok
DIMM-1A          | 0x01              | ok
DIMM-2A          | 0x01              | ok
DIMM-2A          | 0x01              | ok
DIMM-3A          | 0x01              | ok
DIMM-3A          | 0x01              | ok
DIMM-1B          | 0x02              | ok
DIMM-1B          | 0x01              | ok
DIMM-2B          | 0x01              | ok
DIMM-2B          | 0x01              | ok
DIMM-3B          | 0x01              | ok
DIMM-3B          | 0x01              | ok
DIMM-1C          | 0x02              | ok
DIMM-1C          | 0x01              | ok
DIMM-2C          | 0x01              | ok
DIMM-2C          | 0x01              | ok
DIMM-3C          | 0x01              | ok
DIMM-3C          | 0x01              | ok
DIMM-1D          | 0x02              | ok
DIMM-1D          | 0x01              | ok
DIMM-2D          | 0x01              | ok
DIMM-2D          | 0x01              | ok
DIMM-3D          | 0x01              | ok
DIMM-3D          | 0x01              | ok
DIMM-1E          | 0x02              | ok
DIMM-1E          | 0x01              | ok
DIMM-2E          | 0x01              | ok
DIMM-2E          | 0x01              | ok
DIMM-3E          | 0x01              | ok
DIMM-3E          | 0x01              | ok
DIMM-1F          | 0x02              | ok
DIMM-1F          | 0x01              | ok
DIMM-2F          | 0x01              | ok
DIMM-2F          | 0x01              | ok
DIMM-3F          | 0x01              | ok
DIMM-3F          | 0x01              | ok
DIMM-3A          | 0x01              | ok
DIMM-3B          | 0x01              | ok
DIMM-3C          | 0x01              | ok
DIMM-3D          | 0x01              | ok
DIMM-3E          | 0x01              | ok
DIMM-3F          | 0x01              | ok
Watchdog         | 0x00              | ok
iRMC request     | 0x00              | ok
I2C1             | 0x02              | ok
I2C2             | 0x02              | ok
I2C3             | 0x02              | ok
I2C4             | 0x02              | ok
I2C5             | 0x02              | ok
I2C6             | 0x02              | ok
Config backup    | 0x00              | ok
Total Power      | 0x01              | ok
PSU1 Power       | 0x01              | ok
PSU2 Power       | 0x01              | ok
CPU1 Power       | 0x01              | ok
CPU2 Power       | 0x01              | ok
Memory Power     | 0x01              | ok
Fan Power        | 0x01              | ok
HDD Power        | 0x01              | ok
Power Level      | 0x01              | ok
Power Level      | 0x08              | ok
CPU detection    | 0x00              | ok
System Mgmt SW   | Not Readable      | ns
NMI              | 0x00              | ok
Local Monitor    | 0x02              | ok
Pwr Btn override | 0x00              | ok
System BIOS      | Not Readable      | ns
iRMC             | Not Readable      | ns

You can narrow that call down to just see temperature, power consumption and what not. That's beyond the scope of this tutorial though. The man page for ipmitool is great for seeing all the neat stuff you can do.
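For example, to pull out just the temperature or fan sensors, ipmitool can filter by sensor type. The exact type names can vary a bit between vendors, so treat these as a sketch:

an-a05n01
ipmitool sdr type Temperature
ipmitool sdr type Fan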

Finding our IPMI LAN Channel

Before we can configure it though, we need to find our "LAN channel". Different manufacturers will use different channels, so we need to be able to find the one we're using.

To find it, simply call ipmitool lan print X. Increment X, starting at 1, until you get a response.

So first, let's query LAN channel 1.

an-a05n01
ipmitool lan print 1
Channel 1 is not a LAN channel

No luck; Let's try channel 2.

an-a05n01
ipmitool lan print 2
Set in Progress         : Set Complete
Auth Type Support       : NONE MD5 PASSWORD 
Auth Type Enable        : Callback : NONE MD5 PASSWORD 
                        : User     : NONE MD5 PASSWORD 
                        : Operator : NONE MD5 PASSWORD 
                        : Admin    : NONE MD5 PASSWORD 
                        : OEM      : NONE MD5 PASSWORD 
IP Address Source       : Static Address
IP Address              : 10.20.51.1
Subnet Mask             : 255.255.0.0
MAC Address             : 00:19:99:9a:d8:e8
SNMP Community String   : public
IP Header               : TTL=0x40 Flags=0x40 Precedence=0x00 TOS=0x10
Default Gateway IP      : 10.20.255.254
802.1q VLAN ID          : Disabled
802.1q VLAN Priority    : 0
RMCP+ Cipher Suites     : 0,1,2,3,6,7,8,17
Cipher Suite Priv Max   : OOOOOOOOXXXXXXX
                        :     X=Cipher Suite Unused
                        :     c=CALLBACK
                        :     u=USER
                        :     o=OPERATOR
                        :     a=ADMIN
                        :     O=OEM

Found it! So we know that this server uses LAN channel 2. We'll need to use this for the next steps.
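On other hardware the LAN channel may well be different, so rather than guessing one channel at a time, you can walk through the first several in a simple loop; a convenience sketch, assuming the channel is somewhere in 1 through 6:

an-a05n01
for channel in 1 2 3 4 5 6
do
    echo "=== LAN channel $channel ==="
    ipmitool lan print $channel
done

Only the real LAN channel will print the full configuration; the others will return an error like the one we saw for channel 1.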

Setting the IPMI Network Info

Now that we can read our IPMI data, it's time to set some values.

We know that we want to set an-a05n01's IPMI interface to have the IP 10.20.51.1/16. We also need to setup a user on the IPMI BMC so that we can log in from other nodes.

First up, let's set the IP address. Remember to use the LAN channel you found on your server. We don't actually have a gateway on the 10.20.0.0/16 network, but some devices insist on a default gateway being set. For this reason, we'll always set 10.20.255.254 as the gateway address. You will want to adjust this (or not use it at all) for your network.

This requires four calls;

  1. Tell the interface to use a static IP address.
  2. Set the IP address
  3. Set the subnet mask
  4. (optional) Set the default gateway
an-a05n01
ipmitool lan set 2 ipsrc static
ipmitool lan set 2 ipaddr 10.20.51.1
Setting LAN IP Address to 10.20.51.1
ipmitool lan set 2 netmask 255.255.0.0
Setting LAN Subnet Mask to 255.255.0.0
ipmitool lan set 2 defgw ipaddr 10.20.255.254
Setting LAN Default Gateway IP to 10.20.255.254

Now we'll again print the LAN channel information and we should see that the IP address has been set.

an-a05n01
ipmitool lan print 2
Set in Progress         : Set Complete
Auth Type Support       : NONE MD5 PASSWORD 
Auth Type Enable        : Callback : NONE MD5 PASSWORD 
                        : User     : NONE MD5 PASSWORD 
                        : Operator : NONE MD5 PASSWORD 
                        : Admin    : NONE MD5 PASSWORD 
                        : OEM      : NONE MD5 PASSWORD 
IP Address Source       : Static Address
IP Address              : 10.20.51.1
Subnet Mask             : 255.255.0.0
MAC Address             : 00:19:99:9a:d8:e8
SNMP Community String   : public
IP Header               : TTL=0x40 Flags=0x40 Precedence=0x00 TOS=0x10
Default Gateway IP      : 10.20.255.254
802.1q VLAN ID          : Disabled
802.1q VLAN Priority    : 0
RMCP+ Cipher Suites     : 0,1,2,3,6,7,8,17
Cipher Suite Priv Max   : OOOOOOOOXXXXXXX
                        :     X=Cipher Suite Unused
                        :     c=CALLBACK
                        :     u=USER
                        :     o=OPERATOR
                        :     a=ADMIN
                        :     O=OEM

Excellent!

Find the IPMI User ID

Next up is to find the IPMI administrative user name and user ID. We'll record the name for later use in the cluster setup. We'll use the ID to update the user's password.

To see the list of users, run the following.

an-a05n01
ipmitool user list 2
ID  Name	     Callin  Link Auth	IPMI Msg   Channel Priv Limit
1                    true    true       true       Unknown (0x00)
2   admin            true    true       true       OEM
Note: If you see an error like "Get User Access command failed (channel 2, user 3): Unknown (0x32)", it is safe to ignore.

Normally you should see OEM or ADMINISTRATOR under the Channel Priv Limit column. Above we see that the user named admin with ID 2 is OEM, so that is the user we will use.

Note: The 2 in the next argument corresponds to the user ID, not the LAN channel!

To set the password to secret, run the following command and then enter the word secret twice.

an-a05n01
ipmitool user set password 2
Password for user 2: 
Password for user 2:

Done!
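If you'd rather set the password non-interactively, ipmitool also accepts it as a trailing argument. Just be aware that the password will then show up in your shell history:

an-a05n01
ipmitool user set password 2 secret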

Testing the IPMI Connection From the Peer

At this point, we've set each node's IPMI BMC network address and admin user's password. Now it's time to make sure it works.

In the example above, we walked through setting up an-a05n01's IPMI BMC. So here, we will log into an-a05n02 and try to connect to an-a05n01.ipmi to make sure everything works.

  • From an-a05n02
an-a05n02
ipmitool -I lanplus -U admin -P secret -H an-a05n01.ipmi chassis power status
Chassis Power is on

Excellent! Now let's test from an-a05n01 connecting to an-a05n02.ipmi.

an-a05n01
ipmitool -I lanplus -U admin -P secret -H an-a05n02.ipmi chassis power status
Chassis Power is on

Woohoo!

Setting up SSH

Setting up SSH shared keys will allow your nodes to pass files between one another and execute commands remotely without needing to enter a password. This will be needed later when we want to enable applications like libvirtd and its tools, like virt-manager.

SSH is, on its own, a very big topic. If you are not familiar with SSH, please take some time to learn about it before proceeding. A great first step is the Wikipedia entry on SSH, as well as the SSH man page; man ssh.

SSH can be a bit confusing when it comes to keeping connections straight in your head. When you connect to a remote machine, you start the connection on your machine as the user you are logged in as. This is the source user. When you call the remote machine, you tell it what user you want to log in as. This is the remote user.
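For example, when root on an-a05n01 runs the command below, the source user is root on an-a05n01 and the remote user is root on an-a05n02:

an-a05n01
ssh root@an-a05n02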

Create the RSA Keys

Note: This section covers setting up SSH for an-a05n01. Please be sure to follow these steps for both nodes.

You will need to create an SSH key for the root user on each node. Once created, we will need to copy the "public key" into a special file on both nodes to enable connecting to either node without a password.

Let's start with an-a05n01.

an-a05n01
# The '4095' is just to screw with brute-forcers a bit. :)
ssh-keygen -t rsa -N "" -b 4095 -f ~/.ssh/id_rsa
Generating public/private rsa key pair.
Created directory '/root/.ssh'.
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
1a:cf:8b:69:5e:9b:92:c2:51:0d:49:7f:ce:98:0f:40 root@an-a05n01.alteeve.ca
The key's randomart image is:
+--[ RSA 4095]----+
|     .E.         |
|     .o.         |
|      .o. .      |
|      ...*       |
|     .. S o      |
|    .  = o       |
|   . ...+ .      |
|    o ++ +       |
|     ++.+        |
+-----------------+

This will create two files: the private key called ~/.ssh/id_rsa and the public key called ~/.ssh/id_rsa.pub. The private key must never be group or world readable! That is, it should be set to mode 0600.
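ssh-keygen sets these permissions itself, but it never hurts to confirm and, if needed, correct them:

an-a05n01
chmod 600 ~/.ssh/id_rsa
ls -l ~/.ssh/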

If you look closely when you create the SSH key, the node's fingerprint is shown (1a:cf:8b:69:5e:9b:92:c2:51:0d:49:7f:ce:98:0f:40 for an-a05n01 above). Make a note of the fingerprint for each machine, and then compare it to the one presented to you when you ssh to a machine for the first time. If you are presented with a fingerprint that doesn't match, you could be facing a "man in the middle" attack.

To look up a fingerprint in the future, you can run the following;

an-a05n01
ssh-keygen -l -f ~/.ssh/id_rsa
4095 1a:cf:8b:69:5e:9b:92:c2:51:0d:49:7f:ce:98:0f:40 /root/.ssh/id_rsa.pub (RSA)

The two newly generated files should look like;

Private key:

an-a05n01
cat ~/.ssh/id_rsa
-----BEGIN RSA PRIVATE KEY-----
MIIJIwIBAAKCAgBk3o54tw1f0BJ0UOp/OWpLa5VaKDIKKmwe7Um6kcmDVBO8Itbg
7FxXHxX6Xi/CqoqjPEwvpjSgBVSGF5IkSAcAdyKEmqJ0pM3A4Hg+g1JehQLx3k2v
DPfIcTvsIGEkS63XZiOs6t1sPubgjKw9encpYHq4s2Z26Ux/w85FbIMCR3oNroG2
scU4OJnICosoibsEXheaDzUl8fIpEkIHGVK4iOy2Y2CoxEKw5bE1yBv0KlRKrN9i
jFvoq2eAUG+NtjOxaG9DK3IgITQVd1PDgoBqEvEJK/kdfckGQu47cKGJS8bzgWLD
vXprg9OsXBu/MZSVK1AjvL3pfZEOT/k1B6gWu2ww7hGWVZj2IXnFcRv4TMs+DXg2
xZm7pWTkPLNxFzqtAZH60jXZmbPAFNDNS7M3Qs6oBCFlvUL00vFNu3uoM2NARG0V
bvLT0zb8dhQDpV2KoGsKUFGsDo773rH7AtBBPEzODgxjTk7rH+0Rt38JLN8T5XeO
RUitX9MS5abjis6DZ5agm8Swd3cpAK7g5yeKdxmUA774i+BlkkH1VdsdBT9RImvc
/OfVly208jpNRisCQgP4FTlEFG9YOeQ416euJ6xX5oP+I6z9f0rMzQEprh0WgT5r
/oIKfjwF3v109rquUZLxrLYb8qkomwWnxPD4VL7GPUU0hzgr+h+xRWI0nQIBIwKC
AgBfGvtb38rIDVM6eC2N5a1dDaoTLTZ+nQbbVMHby0j4KrOFf+8r14pDg7Wi6xcW
oMvbvIJYz+h5nqAmqIJ5+sTF7KuEV0i3HwsjkdB1dIDcxo2/edQ3VV6nC62G3LNc
vGIUO7s8ou4G+XqZNC1eiWkJwV3EFtzzxgZMlAugiuHsNMOJPiKHru0mYUCJaQbd
FCVb46/aZhwrF1IJd51XJoExpav8bFPSUqVHs/7a79/XlZ/uov6BfQYzJURUaRi4
0Fyf9MCtC7S/NT+8d9KiZRn9nNSiP2c5EDKQ4AUwuqbvKjCccq2T+8syK9Y0y9+l
o8abRhhcNZ0d+gxslIvhiuBOtTTV7Fy6zYyhSkAOzF33kl+jDDm2nNvxjxFU3Lo1
qSP7n2yedz5QKOvwykmwN/uzn5FWSmKc5GdL/t+yu94zf0eR9pDhkg0u9dXFkim0
Hq8RsW1vH4aD0BBMiBn34EbnaQaotX7lAUxfTjG0iZ9z8T48NIqPf/66evqUk3bx
VoFS79GkW8yWrXQX6B3oUAtm10aeP9Htz+AQIPdatO9pREIzE6UbEnc2kSrzFcJh
4hmarrQgJq7qzFjgRLBgjiOsdEo5SGLTFh17UIh5k/deeTxLsGSFuBbpz5+jr4tt
0s4wcmamTR8ruURGh+4i/Px6F9QsechnIMKGNthWVxhEawKCAQEA2kCH/FL/A7Ib
fCt0PFvCKWeF1V+PhdzEdkIRvS3OusWP9Z+py6agh3kAFWjOZT16WgYPeftMKYaE
3Wiixfx+99ta0eQiKqozYgB3pg5UWdxsXv30jrTyRuhhEBId2lGV6/eHgGYs48s1
oCCrljsVmWd+p4uSAplIBewCv7YPsxl3DZJTV6DFRD9mnuqjrqozSM+UsoMPRTPZ
7AyaDxeb63LiWTq6T/gLHptmu8K0SLvDkzA5LeBWKUNFcMHpWODpzjPj5J4Mtulr
R8oLtEy/2ZyWi7n8JuOt+swTsZDN0Qzcpzw9MU1RWs0sqGvTO91bMjc+FYew7wuZ
CEZxX4VxSQKCAQB2ULaKc4Oersq7Z3fQXIynLNT8lZ/AKQaAH/SdLL7IGKWRZ9eA
VOQNnZnThnKMDbDS8GPOpjzfjPDP8L7Y6NOVgc6ETGEdvoXomZv+sqpwx3BWszNK
18FfV0HhLv0MFHAPfMIqPqhhYUDnDAt/yWFViujIIrllmXjH9JGZDdPgzsupPToZ
FKC5UAYeAZwpaX2AROrfACscn99kNsTE7F8HtMQ//iT+M0rHVTzhVBnm1/e3eY1J
9L6WUbCPzBeiNFNC+y9+0nZk0tkgJk+qUPYdnaQL44TtlZMT1iWKg3C6dgrjbbaG
tFZmwh2/hf0Aovycpn/Fm2PKwxved64FnDy1AoIBABK1Evhe4qiLm/SzRHozwC9v
RfxYpebnCYZ6sRA3IFkm4HQjoNbxBnIDDqK/1y0/yKihbwp0oCDRBBL6VxhI167Y
SZz2TBJJGljbd/hKXwBjWb7/0yIsxE84fVkmH9Dia++ngKSbCyl30WV/JKZ6F8tS
A4q0MRYqZUJWDt07fbBEAuPn+IPalJDSO/7+K0l8TYnl6CyO5A0+9WwBFITzZSLP
VTrZJemY6wKfmxdoddpZPKY3VVu0JKRzevsJToP2BWlyKXn+6yWe+pEf8l/pUkXa
OMol4mm7vnSVJkJrf1sPuyRG/e5IdLAC9TMB7YjJ1J3nelmd6pglkMYx7HXm3dMC
ggEAUSFnOl3WmLJfIWuFW60tP28y9lf4g8RcOpmRytzal9Zi510mDtsgCVYgVogU
CEPm9ws9H/z2iqnJsyi9YYm1qFkCo9yaXYn1bEwTMk6gwlzfUWTv+M51+DvVZzYp
3GXJLzD6K5it+aHGGsZuSP8eLAd7DOScYuzlG2XgLm/hvrmwOYkR5U/5Lp1GBfJ5
tf8xfIcHdFfjDFBeqx49yNyY71dh//66R+ioTivR+ZjBTdXrsQLkinvwZxNxwbCF
PAaffmMZQQVYf6aGQe5ig2q3ZMPeNAm6PIPSkUJi4qNF/DOvseTU7qeLtC1WOi/9
8c7ZGvXT9TdaXya0BkNwA9jZKwKCAQBUDqjJ7Q/nlxLifyOInW1RbwbUFzh7mdfC
w6362II2gIz0JRg7HQHMwfbY5t+ELi9Rsdn90wlPQ08cK42goKW46Nt30g+AoQ/N
0maLzbrn5BffAtI7XM0a4i3dZ/yjS0/NW39km0YnTe49W6CBBf91fChIfm+jvYna
ihA9x/SgyuBUvQ1bCrMzMM024TxhCkvvKI2MDmJNJHOeqovYFAXiHFGPmftunu1K
oDRUPb6j5gTBhxAV1ZPHKCee7EIFwi/jJ/31oMLEJp5RnAdrW+FitPjQ7hcoRStm
VZAoapBJb37xa1kq/7hHYf2bPVdrcO8AeStpjEh6GbtYmy2pWlFy
-----END RSA PRIVATE KEY-----
Note: The private key is naturally split across multiple lines like this; that is how the PEM-formatted file actually looks on disk.

Public key (single line, but wrapped here to make it more readable):

an-a05n01
cat ~/.ssh/id_rsa.pub
ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAgBKYiBxI06RGiar5rt121+tO1crpa9MwL+K5qtlx0IrL7QUDxi+hvdXg3sTS6+R/mnLDE8eS
ulgRX4fHweNbM96wnl2N9mOnODLJftWPbPUHFpTc/0bDRcXq4rB+V+NvXG1i74W1si8Fp/R5wnPmF7yo/ZjN2zXLhwesOVY3Cnmur+O19
80O4lT7Zl5Q0mALNkriouhD+FzQZnMky8X2MM4dmnYqctCI54jbgD0vN09uUu8KyGycV9BFW7ScfGBEvow4/+8YW+my4bG0SBjJki7eOK
W3fvr58cybXO+UBqLFO7yMe5jf0fClyz6MFn+PRPR37QQy4GIC+4MCaYaiCx2P/K+K/ZxH621Q8nBE9TdNCw6iVqlt5Si3x2UzxOlrYLZ
nvB1BfzY92Rd/RNP5bz17PapaOMLjkx6iIAEDbp2lL5vzGp+1S30SX956sX/4CYWVTg+MAwok9mUcyj60VU+ldlPDuN7UYUi8Wmoa6Jsu
ozstUNBCsUcKzt5FEBy4vOwOMtyu3cD4rQrn3eGXfZ1a4QpLnR2H9y7EnM4nfGdQ/OVjMecAtHUxx3FDltHgiSkQDEF9R4s3z6NLZ2mda
TU9A5zm+1rMW1ZLhGkfna/h2KV9o8ZNx79WyKMheajL4lgi495D7c6fF4GBgX7u7qrdZyCj2cXgrgT4nGwM2Z81Q== root@an-a05n01.alteeve.ca

Now do the same thing on an-a05n02 to generate its key.

an-a05n02
ssh-keygen -t rsa -N "" -b 4095 -f ~/.ssh/id_rsa
Generating public/private rsa key pair.
Created directory '/root/.ssh'.
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
68:71:fb:87:88:e3:c2:89:49:ad:d0:55:7d:1c:05:b6 root@an-a05n02.alteeve.ca
The key's randomart image is:
+--[ RSA 4095]----+
|       . .++.    |
|      . ..o.     |
|     .. ..E      |
|    .  + .       |
| . o  o S        |
|. o .. . o .     |
| o = .o . o .    |
|  + +. .   .     |
|     ..          |
+-----------------+

Populate known_hosts

Normally, the first time you try to ssh into a computer, you will be asked to verify that the fingerprint reported by the target server is valid. We just created our nodes, so we can trust that we're connecting to the actual target machine we think we are.

Seeing as we're comfortable with this, we can use a nifty program called ssh-keyscan to read the fingerprint of the target machine and copy the resulting key to the ~/.ssh/known_hosts file. We'll need to do this for all variations of the host names for each node. This alone means that we need to add ten fingerprints, five for the five names of each node.

This is somewhat tedious, so we'll do this once on an-a05n01 and then copy the populated ~/.ssh/known_hosts file over to an-a05n02 later.

If you recall from the /etc/hosts section, we've got five possible host names per node. We'll call all of them now.

an-a05n01
ssh-keyscan an-a05n01.alteeve.ca >> ~/.ssh/known_hosts
# an-a05n01.alteeve.ca SSH-2.0-OpenSSH_5.3

If you are not familiar with bash redirections, the >> ~/.ssh/known_hosts part of the command tells the shell, "Take the returned text that would have been printed to the screen and instead append it to ~/.ssh/known_hosts". In our case, known_hosts didn't exist yet, so it was created.
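
If you want to see the difference between the two redirection operators for yourself, here is a harmless little demonstration using a scratch file (the file name is just an example):

# A single '>' truncates the file and writes fresh contents; '>>' appends to it.
echo "first line"  >  /tmp/redirect-test
echo "second line" >> /tmp/redirect-test
cat /tmp/redirect-test
first line
second line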

Now we'll repeat this for the rest of an-a05n01's host names.

an-a05n01
ssh-keyscan an-a05n01 >> ~/.ssh/known_hosts
# an-a05n01 SSH-2.0-OpenSSH_5.3
ssh-keyscan an-a05n01.bcn >> ~/.ssh/known_hosts
# an-a05n01.bcn SSH-2.0-OpenSSH_5.3
ssh-keyscan an-a05n01.sn >> ~/.ssh/known_hosts
# an-a05n01.sn SSH-2.0-OpenSSH_5.3
ssh-keyscan an-a05n01.ifn >> ~/.ssh/known_hosts
# an-a05n01.ifn SSH-2.0-OpenSSH_5.3

That's all the host names for an-a05n01. Now we'll repeat the steps for an-a05n02.

an-a05n01
ssh-keyscan an-a05n02.alteeve.ca >> ~/.ssh/known_hosts
# an-a05n02.alteeve.ca SSH-2.0-OpenSSH_5.3
ssh-keyscan an-a05n02 >> ~/.ssh/known_hosts
# an-a05n02 SSH-2.0-OpenSSH_5.3
ssh-keyscan an-a05n02.bcn >> ~/.ssh/known_hosts
# an-a05n02.bcn SSH-2.0-OpenSSH_5.3
ssh-keyscan an-a05n02.sn >> ~/.ssh/known_hosts
# an-a05n02.sn SSH-2.0-OpenSSH_5.3
ssh-keyscan an-a05n02.ifn >> ~/.ssh/known_hosts
# an-a05n02.ifn SSH-2.0-OpenSSH_5.3

Done!
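
If you would rather not type out each call by hand, the same result can be had with a small shell loop. This is simply a compact alternative to the individual ssh-keyscan calls above:

an-a05n01
# Scan all five names for both nodes in one pass.
for node in an-a05n01 an-a05n02
do
    for name in ${node}.alteeve.ca ${node} ${node}.bcn ${node}.sn ${node}.ifn
    do
        ssh-keyscan ${name} >> ~/.ssh/known_hosts
    done
done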

Now we won't get asked to verify the target machine's RSA fingerprint when we try to connect later. More importantly, if the fingerprint ever changes, it will generate a very noisy alert telling us that something nasty, like a fake target having replaced our peer, might have happened.
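
If a fingerprint changes for a legitimate reason, say you rebuilt a node and generated new keys, you can clear the stale entry and record the new one rather than editing known_hosts by hand. The host name below is just an example:

an-a05n01
# Remove the old fingerprint for the rebuilt host, then record the new one.
ssh-keygen -R an-a05n02
ssh-keyscan an-a05n02 >> ~/.ssh/known_hosts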

The last step is to copy this known_hosts file over to an-a05n02, saving us the hassle of running all those commands a second time.

an-a05n01
rsync -av ~/.ssh/known_hosts root@an-a05n02:/root/.ssh/
Warning: Permanently added the RSA host key for IP address '10.20.50.2' to the list of known hosts.

Don't worry about that warning; it's a one-time thing. Enter the password for the root user on an-a05n02 to continue.

an-a05n01
root@an-a05n02's password:
sending incremental file list
known_hosts

sent 4817 bytes  received 31 bytes  1077.33 bytes/sec
total size is 4738  speedup is 0.98

Done!

Copy Public Keys to Enable SSH Without a Password

Note: This only disables the need for passwords when connecting from one node's root user to the other node's root user. It does not remove the need for passwords for any other machines or users!

In order to enable password-less login, we need to create a file called ~/.ssh/authorized_keys and put both nodes' public keys in it. We will create the authorized_keys file on an-a05n01 and then copy it over to an-a05n02.

First, we'll copy the local id_rsa.pub file. This will create the authorized_keys file and add the local public RSA in one step.

On an-a05n01

an-a05n01
cp ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys

Now we'll use ssh to print the contents of an-a05n02's public key to screen, but redirect the key to the new authorized_keys file.

an-a05n01
ssh root@an-a05n02 "cat /root/.ssh/id_rsa.pub" >> ~/.ssh/authorized_keys

Enter the password for the root user on an-a05n02.

an-a05n01
root@an-a05n02's password:

Done. Now we can verify that both keys have been added to the authorized_keys file.

an-a05n01
cat ~/.ssh/authorized_keys

I'm truncating the output below to make it more readable.

an-a05n01
ssh-rsa <key snipped> root@an-a05n01.alteeve.ca
ssh-rsa <key snipped> root@an-a05n02.alteeve.ca

Excellent! Now we can copy this to an-a05n02 and, with luck, enter the password one last time.

an-a05n01
rsync -av ~/.ssh/authorized_keys root@an-a05n02:/root/.ssh/
root@an-a05n02's password:
sending incremental file list
authorized_keys

sent 1577 bytes  received 31 bytes  643.20 bytes/sec
total size is 1494  speedup is 0.93

The last step is to test connecting from an-a05n01 to an-a05n02. We should not get any password prompt at all.

an-a05n01
ssh root@an-a05n02
Last login: Tue Oct 29 14:02:19 2013 from ...cable.user.start.ca
[root@an-a05n02 ~]#

Very nice! Just type exit to return to an-a05n01.

an-a05n01
exit
logout
Connection to an-a05n02 closed.
[root@an-a05n01 ~]#

You should now be able to use ssh from either node to connect to the other node using any of the host names we set! Note that the physical network used for the connection will depend on the host name you use. When you used an-a05n02 above, you connected over the BCN. Had you instead used an-a05n02.sn, you would have connected over the SN.
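
For example, running a quick command against both names shows the same host being reached, just over different networks. This is only a sanity check and is entirely optional:

an-a05n01
# Over the BCN, via the short host name ...
ssh root@an-a05n02 "hostname"
# ... and over the SN, via the .sn name.
ssh root@an-a05n02.sn "hostname"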

Setting Up UPS Monitoring

Note: This section assumes that you are using APC brand UPSes with AP9630 network management cards. If you use another make or model, please be sure that it uses a network connection, not USB or serial, and that it is supported by apcupsd.

We always recommend that you have two network-managed UPSes, one backing each switched PDU. This protects your Anvil! against power outages, of course, but UPSes also protect against distorted input power, under- and over-voltage events and other power anomalies.

The reason we recommend network-managed UPSes, instead of passive UPSes, is that they allow for monitoring incoming power and alerting on notable events. We have found that power events are the most common issues in production. Being alerted to power events allows you to deal with issues that might otherwise affect other equipment in your facility that isn't, or can't be, protected by UPSes.

Installing apcupsd

The apcupsd program is not available in the normal RHEL or CentOS repositories, so you can either build it yourself or install a version pre-built by us. In production, it certainly makes sense to build your own, as that is the most secure option. If you wish, you could also install it from ELRepo.

For the purpose of this tutorial, we'll download the version from the alteeve.ca servers as it's the simplest option.

an-a05n01 an-a05n02
rpm -Uvh https://alteeve.ca/files/apcupsd/apcupsd-latest.el6.x86_64.rpm
Retrieving https://alteeve.ca/files/apcupsd/apcupsd-latest.el6.x86_64.rpm
Preparing...                ########################################### [100%]
   1:apcupsd                ########################################### [100%]
rpm -Uvh https://alteeve.ca/files/apcupsd/apcupsd-latest.el6.x86_64.rpm
Retrieving https://alteeve.ca/files/apcupsd/apcupsd-latest.el6.x86_64.rpm
Preparing...                ########################################### [100%]
   1:apcupsd                ########################################### [100%]

Configuring Apcupsd For Two UPSes

Note: Much of the credit for this section belongs to apcupsd's project documentation on the topic. It's been edited somewhat to better suit our needs.

By default, apcupsd only supports one UPS. The practical side effect of this is that apcupsd will initiate a shut down as soon as the first UPS is low on batteries. This makes no sense if the second UPS is still full or running on AC.

So we're going to make two main changes here;

  1. Disable the ability for apcupsd to initiate a shut down of the node.
  2. Configure apcupsd to support two (or more) UPSes.

Before we begin, we will make a backup of the default apcupsd.conf file. Then we're going to rename it and configure it for the first UPS. Once it's configured, we will copy it for the second UPS and change just the variable values that differ.

Note: We're going to work on an-a05n01. Once it's configured and working, we'll copy our new configuration to an-a05n02

We decided earlier to name our UPSes an-ups01 and an-ups02. We're going to use these names in the configuration and log file names used for each UPS. So let's back up the original configuration file and then rename it to match our first UPS.

an-a05n01
cp /etc/apcupsd/apcupsd.conf /etc/apcupsd/apcupsd.conf.anvil
mv /etc/apcupsd/apcupsd.conf /etc/apcupsd/apcupsd.an-ups01.conf
ls -lah /etc/apcupsd/
total 108K
drwxr-xr-x.  3 root root 4.0K Nov 26 17:34 .
drwxr-xr-x. 90 root root  12K Nov 25 17:28 ..
-rwxr--r--.  1 root root 3.9K Mar  4  2013 apccontrol
-rw-r--r--.  1 root root  13K Mar  4  2013 apcupsd.an-ups01.conf
-rw-r--r--.  1 root root  13K Nov 26 15:49 apcupsd.conf.anvil
-rw-r--r--.  1 root root  607 Mar  4  2013 apcupsd.css
-rwxr--r--.  1 root root  460 Mar  4  2013 changeme
-rwxr--r--.  1 root root  487 Mar  4  2013 commfailure
-rwxr--r--.  1 root root  488 Mar  4  2013 commok
-rwxr-xr-x.  1 root root  17K Mar  4  2013 hid-ups
-rw-r--r--.  1 root root  662 Mar  4  2013 hosts.conf
-rwxr-xr-x.  1 root root  626 May 28  2002 make-hiddev
-rw-r--r--.  1 root root 2.3K Mar  4  2013 multimon.conf
-rwxr--r--.  1 root root  455 Mar  4  2013 offbattery
-rwxr--r--.  1 root root  420 Mar  4  2013 onbattery

Next up, we're going to create a new directory called /etc/apcupsd/null. We'll copy some of the existing scripts into it and then create a new script that will disable automatic shut down of the node. We're doing this so that future updates to apcupsd won't replace our scripts. We'll see how we use this shortly.

Once the directory is created, we'll copy the scripts we want. Next, we'll create a new script called doshutdown which will do nothing except exit with return code 99. This return code tells apcupsd that the shut down action has been disabled.

an-a05n01
mkdir /etc/apcupsd/null
cp /etc/apcupsd/apccontrol /etc/apcupsd/null/
cp /etc/apcupsd/c* /etc/apcupsd/null/
cp /etc/apcupsd/o* /etc/apcupsd/null/
echo "exit 99" > /etc/apcupsd/null/doshutdown
chown root:root /etc/apcupsd/null/doshutdown
chmod 744 /etc/apcupsd/null/doshutdown
cat /etc/apcupsd/null/doshutdown
exit 99
ls -lah /etc/apcupsd/null/
total 36K
drwxr-xr-x. 2 root root 4.0K Nov 26 17:39 .
drwxr-xr-x. 3 root root 4.0K Nov 26 17:34 ..
-rwxr--r--. 1 root root 3.9K Nov 26 17:35 apccontrol
-rwxr--r--. 1 root root  460 Nov 26 17:36 changeme
-rwxr--r--. 1 root root  487 Nov 26 17:36 commfailure
-rwxr--r--. 1 root root  488 Nov 26 17:36 commok
-rwxr--r--. 1 root root    8 Nov 26 17:39 doshutdown
-rwxr--r--. 1 root root  455 Nov 26 17:36 offbattery
-rwxr--r--. 1 root root  420 Nov 26 17:36 onbattery

Good. Now it's time to change the variables in the configuration file. Before we do though, let's look at the variables we're going to edit, the value we will set for each in an-ups01's configuration and what they do. We'll look at the specific variables we need to change in an-ups02's configuration file later.

UPSNAME: an-ups01
    This is the name to use for this UPS when writing log entries or reporting status information. It should be less than eight characters long. We're going to use the short host name for the UPS.
UPSTYPE: snmp
    This tells apcupsd that we will communicate with this UPS using SNMP to talk to the network management card in the UPS.
DEVICE: an-ups01.alteeve.ca:161:APC_NOTRAP:private
    This is the connection string needed for establishing the SNMP connection to the UPS. It's broken into four sections, each separated by colons. The first value is the host name or IP address of the UPS. The second is the SNMP port to connect to, which is 161 on APC brand UPSes. The third and fourth sections are the vendor name and SNMP community, respectively. We're using the vendor name APC_NOTRAP in order to disable SNMP traps. The community should usually be private, unless you changed it in the network management card itself.
POLLTIME: 30
    This tells apcupsd how often, in seconds, to query the UPS status. The default is once per minute, but we want twice per minute in order to match the scan frequency of the monitoring and alert system we will use later.
SCRIPTDIR: /etc/apcupsd/null
    This tells apcupsd to use the scripts in our new null directory instead of the default ones.
PWRFAILDIR: /etc/apcupsd/null
    Some UPSes need to be powered off themselves when the batteries are about to run out. This is controlled by a file written to this directory, which apcupsd's shut down script looks for. We've disabled shut down, but to be safe and thorough, we will disable this as well by pointing it at our null directory.
BATTERYLEVEL: 0
    This tells apcupsd to initiate a shut down once the UPS reports this percentage left in the batteries. We've disabled automatic shut down, but just the same, we'll set this to 0.
MINUTES: 0
    This tells apcupsd to initiate a shut down once the UPS reports this many minutes of run time left in the batteries. We've disabled automatic shut down, but just the same, we'll set this to 0.
NISPORT: 3551
    The default value here is fine for an-ups01, but it is important to highlight. We will use apcaccess to query apcupsd's data over the network, even though it's on the same machine. Each UPS we monitor will have an apcupsd daemon running and listening on a dedicated TCP port. The first UPS, an-ups01, will listen on the default port. The port we specify when using apcaccess later will determine which UPS's status information is returned.
ANNOY: 0
    Normally, apcupsd will start "annoying" the users of the system to save their work and log out five minutes (300 seconds) before calling the shut down of the server. We're disabling automatic shut down, so this needs to be disabled as well.
EVENTSFILE: /var/log/apcupsd.an-ups01.events
    This is where events related to this UPS are recorded.

With this in mind, we'll use sed to edit the file. If you are more comfortable with a text editor, please use that instead. You can refer to the diff at the end of this section to see exactly what changed.

an-a05n01
# Set the name of the UPS and domain once.
ups="an-ups01"
domain="alteeve.ca"

# Configure the UPS name. Note the odd syntax; There are two 'UPSNAME' entries
# in the config and we only want to change the first instance.
sed -i "0,/#UPSNAME/s/^#UPSNAME/UPSNAME/" /etc/apcupsd/apcupsd.${ups}.conf
sed -i "s/^UPSNAME.*/UPSNAME ${ups}/"     /etc/apcupsd/apcupsd.${ups}.conf

# Configure the UPS access
sed -i "s/^UPSTYPE.*/UPSTYPE snmp/"                                  /etc/apcupsd/apcupsd.${ups}.conf
sed -i "s/^DEVICE.*/DEVICE ${ups}.${domain}:161:APC_NOTRAP:private/" /etc/apcupsd/apcupsd.${ups}.conf

# Change the poll time.
sed -i "s/^#POLLTIME/POLLTIME/"     /etc/apcupsd/apcupsd.${ups}.conf
sed -i "s/^POLLTIME.*/POLLTIME 30/" /etc/apcupsd/apcupsd.${ups}.conf

# Update the script directories
sed -i "s/^SCRIPTDIR.*/SCRIPTDIR \/etc\/apcupsd\/null/"   /etc/apcupsd/apcupsd.${ups}.conf
sed -i "s/^PWRFAILDIR.*/PWRFAILDIR \/etc\/apcupsd\/null/" /etc/apcupsd/apcupsd.${ups}.conf

# Change the shut down thresholds and disable the shut down annoy message
sed -i "s/^BATTERYLEVEL .*/BATTERYLEVEL 0/" /etc/apcupsd/apcupsd.${ups}.conf
sed -i "s/^MINUTES .*/MINUTES 0/"           /etc/apcupsd/apcupsd.${ups}.conf
sed -i "s/^ANNOY .*/ANNOY 0/"               /etc/apcupsd/apcupsd.${ups}.conf

# The NIS port isn't changing, but this makes sure it really is what we want.
sed -i "s/^NISPORT.*/NISPORT 3551/" /etc/apcupsd/apcupsd.${ups}.conf

# Finally, update the event log file name.
sed -i "s/^EVENTSFILE .*/EVENTSFILE \/var\/log\/apcupsd.${ups}.events/" /etc/apcupsd/apcupsd.${ups}.conf

# End with a 'diff' of the updated configuration against the backup we made.
diff -u /etc/apcupsd/apcupsd.conf.anvil /etc/apcupsd/apcupsd.${ups}.conf
--- /etc/apcupsd/apcupsd.conf.anvil	2013-11-26 15:49:47.852153374 -0500
+++ /etc/apcupsd/apcupsd.an-ups01.conf	2013-11-26 19:58:17.810278390 -0500
@@ -12,7 +12,7 @@
 #   Use this to give your UPS a name in log files and such. This
 #   is particulary useful if you have multiple UPSes. This does not
 #   set the EEPROM. It should be 8 characters or less.
-#UPSNAME
+UPSNAME an-ups01
 
 # UPSCABLE <cable>
 #   Defines the type of cable connecting the UPS to your computer.
@@ -76,8 +76,8 @@
 #                            3052. If this parameter is empty or missing, the 
 #                            default of 3052 will be used.
 #
-UPSTYPE apcsmart
-DEVICE /dev/ttyS0
+UPSTYPE snmp
+DEVICE an-ups01.alteeve.ca:161:APC_NOTRAP:private
 
 # POLLTIME <int>
 #   Interval (in seconds) at which apcupsd polls the UPS for status. This
@@ -86,7 +86,7 @@
 #   will improve apcupsd's responsiveness to certain events at the cost of
 #   higher CPU utilization. The default of 60 is appropriate for most
 #   situations.
-#POLLTIME 60
+POLLTIME 30
 
 # LOCKFILE <path to lockfile>
 #   Path for device lock file. Not used on Win32.
@@ -94,14 +94,14 @@
 
 # SCRIPTDIR <path to script directory>
 #   Directory in which apccontrol and event scripts are located.
-SCRIPTDIR /etc/apcupsd
+SCRIPTDIR /etc/apcupsd/null
 
 # PWRFAILDIR <path to powerfail directory>
 #   Directory in which to write the powerfail flag file. This file
 #   is created when apcupsd initiates a system shutdown and is
 #   checked in the OS halt scripts to determine if a killpower
 #   (turning off UPS output power) is required.
-PWRFAILDIR /etc/apcupsd
+PWRFAILDIR /etc/apcupsd/null
 
 # NOLOGINDIR <path to nologin directory>
 #   Directory in which to write the nologin file. The existence
@@ -132,12 +132,12 @@
 # If during a power failure, the remaining battery percentage
 # (as reported by the UPS) is below or equal to BATTERYLEVEL, 
 # apcupsd will initiate a system shutdown.
-BATTERYLEVEL 5
+BATTERYLEVEL 0
 
 # If during a power failure, the remaining runtime in minutes 
 # (as calculated internally by the UPS) is below or equal to MINUTES,
 # apcupsd, will initiate a system shutdown.
-MINUTES 3
+MINUTES 0
 
 # If during a power failure, the UPS has run on batteries for TIMEOUT
 # many seconds or longer, apcupsd will initiate a system shutdown.
@@ -155,7 +155,7 @@
 
 #  Time in seconds between annoying users to signoff prior to
 #  system shutdown. 0 disables.
-ANNOY 300
+ANNOY 0
 
 # Initial delay after power failure before warning users to get
 # off the system.
@@ -203,7 +203,7 @@
 
 # If you want the last few EVENTS to be available over the network
 # by the network information server, you must define an EVENTSFILE.
-EVENTSFILE /var/log/apcupsd.events
+EVENTSFILE /var/log/apcupsd.an-ups01.events
 
 # EVENTSFILEMAX <kilobytes>
 #  By default, the size of the EVENTSFILE will be not be allowed to exceed

Now we will copy the an-ups01 config file over to the one we'll use for an-ups02.

We're going to change the following variables:

UPSNAME: an-ups02
DEVICE: an-ups02.alteeve.ca:161:APC_NOTRAP:private
NISPORT: 3552
EVENTSFILE: /var/log/apcupsd.an-ups02.events

We're going to copy the configuration file and then use sed again to make these changes. We'll finish with another diff showing the differences between the two configuration files.

an-a05n01
# Set the name of this UPS. The 'domain' variable should still be set.
ups2="an-ups02"

# Make a copy of the configuration file.
cp /etc/apcupsd/apcupsd.${ups}.conf /etc/apcupsd/apcupsd.${ups2}.conf

# Change the variables 
sed -i "s/^UPSNAME.*/UPSNAME ${ups2}/"                                   /etc/apcupsd/apcupsd.${ups2}.conf
sed -i "s/^DEVICE.*/DEVICE ${ups2}.${domain}:161:APC_NOTRAP:private/"    /etc/apcupsd/apcupsd.${ups2}.conf
sed -i "s/^NISPORT.*/NISPORT 3552/"                                      /etc/apcupsd/apcupsd.${ups2}.conf
sed -i "s/^EVENTSFILE .*/EVENTSFILE \/var\/log\/apcupsd.${ups2}.events/" /etc/apcupsd/apcupsd.${ups2}.conf
diff -u /etc/apcupsd/apcupsd.${ups2}.conf /etc/apcupsd/apcupsd.${ups}.conf
--- /etc/apcupsd/apcupsd.an-ups02.conf	2013-11-26 20:09:18.884783551 -0500
+++ /etc/apcupsd/apcupsd.an-ups01.conf	2013-11-26 20:13:20.273346652 -0500
@@ -12,7 +12,7 @@
 #   Use this to give your UPS a name in log files and such. This
 #   is particulary useful if you have multiple UPSes. This does not
 #   set the EEPROM. It should be 8 characters or less.
-UPSNAME an-ups01
+UPSNAME an-ups02
 
 # UPSCABLE <cable>
 #   Defines the type of cable connecting the UPS to your computer.
@@ -77,7 +77,7 @@
 #                            default of 3052 will be used.
 #
 UPSTYPE snmp
-DEVICE an-ups01.alteeve.ca:161:APC_NOTRAP:private
+DEVICE an-ups02.alteeve.ca:161:APC_NOTRAP:private
 
 # POLLTIME <int>
 #   Interval (in seconds) at which apcupsd polls the UPS for status. This
@@ -199,11 +199,11 @@
 #  It is not used unless NETSERVER is on. If you change this port,
 #  you will need to change the corresponding value in the cgi directory
 #  and rebuild the cgi programs.
-NISPORT 3551
+NISPORT 3552
 
 # If you want the last few EVENTS to be available over the network
 # by the network information server, you must define an EVENTSFILE.
-EVENTSFILE /var/log/apcupsd.an-ups01.events
+EVENTSFILE /var/log/apcupsd.an-ups02.events
 
 # EVENTSFILEMAX <kilobytes>
 #  By default, the size of the EVENTSFILE will be not be allowed to exceed

The last change that is needed is to update the apcupsd initialization script. We're going to copy a pre-edited one from the alteeve.ca server and then look at the differences. We could edit the file ourselves, but it would be a little more complex. So instead, let's look at the differences and then talk about what changed.

an-a05n01
mv /etc/init.d/apcupsd /root/apcupsd.init.d.anvil
wget https://alteeve.ca/files/apcupsd/apcupsd -O /etc/init.d/apcupsd
--2013-11-26 20:59:42--  https://alteeve.ca/files/apcupsd/apcupsd
Resolving alteeve.ca... 65.39.153.64
Connecting to alteeve.ca|65.39.153.64|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1759 (1.7K) [text/plain]
Saving to: `/etc/init.d/apcupsd'

100%[=========================================================================>] 1,759       --.-K/s   in 0s      

2013-11-26 20:59:42 (5.10 MB/s) - `/etc/init.d/apcupsd' saved [1759/1759]
chmod 755 /etc/init.d/apcupsd
ls -lah /etc/init.d/apcupsd
-rwxr-xr-x. 1 root root 1.8K Aug 19  2012 /etc/init.d/apcupsd
diff -u /root/apcupsd.init.d.anvil /etc/init.d/apcupsd
--- /root/apcupsd.init.d.anvil	2013-03-04 23:32:43.000000000 -0500
+++ /etc/init.d/apcupsd	2012-08-19 18:36:33.000000000 -0400
@@ -1,7 +1,7 @@
 #! /bin/sh
 #
 # apcupsd      This shell script takes care of starting and stopping
-#	       the apcupsd UPS monitoring daemon.
+#	       the apcupsd UPS monitoring daemon. Multi-UPS version.
 #
 # chkconfig: 2345 60 99
 # description: apcupsd monitors power and takes action if necessary
@@ -15,18 +15,24 @@
     start)
        rm -f /etc/apcupsd/powerfail
        rm -f /etc/nologin
-       echo -n "Starting UPS monitoring:" 
-       daemon /sbin/apcupsd -f /etc/apcupsd/apcupsd.conf
-       RETVAL=$?
-       echo
-       [ $RETVAL -eq 0 ] && touch /var/lock/subsys/apcupsd
+       for conf in /etc/apcupsd/apcupsd.*.conf ; do
+          inst=`basename $conf`
+          echo -n "Starting UPS monitoring ($inst):"
+          daemon /sbin/apcupsd -f $conf -P /var/run/apcupsd-$inst.pid
+          RETVAL=$?
+          echo
+          [ $RETVAL -eq 0 ] && touch /var/lock/subsys/apcupsd-$inst
+       done
        ;;
     stop)
-       echo -n "Shutting down UPS monitoring:"
-       killproc apcupsd
-       echo
-       rm -f $APCPID
-       rm -f /var/lock/subsys/apcupsd
+       for conf in /etc/apcupsd/apcupsd.*.conf ; do
+          inst=`basename $conf`
+          echo -n "Shutting down UPS monitoring ($inst):"
+          killproc -p /var/run/apcupsd-$inst.pid apcupsd
+          echo
+          rm -f /var/run/apcupsd-$inst.pid
+          rm -f /var/lock/subsys/apcupsd-$inst
+       done
        ;;
     restart|force-reload)
        $0 stop
@@ -38,14 +44,16 @@
        exit 3
        ;;
     status)
-       status apcupsd
-       RETVAL=$?
-       if [ $RETVAL -eq 0 ]
-       then
-          /sbin/apcaccess status
-       else
-          exit $RETVAL
-       fi
+       for conf in /etc/apcupsd/apcupsd.*.conf ; do
+          inst=`basename $conf`
+          status -p /var/run/apcupsd-$inst.pid apcupsd-$inst
+          RETVAL=$?
+          if [ $RETVAL -eq 0 ]
+          then
+             NISPORT=`grep ^NISPORT < $conf | sed -e "s/NISPORT *\([0-9]\)/\1/"`
+             /sbin/apcaccess status localhost:$NISPORT | egrep "(STATUS)|(UPSNAME)"
+          fi
+       done
        ;;
     *)
        echo "Usage: $0 {start|stop|restart|status}"

The main change here is that, for each of the start, stop and status calls, we tell the init.d script to loop once for each apcupsd.*.conf file it finds. The original script expected just one configuration file but was otherwise perfect for what we needed, so we shifted the existing calls into our loop.

So all this new script does is repeat what the original did already, once for each configuration file.

Let's copy all of this over to an-a05n02 now!

an-a05n01
rsync -av /etc/init.d/apcupsd root@an-a05n02:/etc/init.d/
sending incremental file list
apcupsd

sent 1834 bytes  received 43 bytes  3754.00 bytes/sec
total size is 1759  speedup is 0.94
rsync -av /etc/apcupsd root@an-a05n02:/etc/
sending incremental file list
apcupsd/
apcupsd/apcupsd.an-ups01.conf
apcupsd/apcupsd.an-ups02.conf
apcupsd/apcupsd.conf.anvil
apcupsd/null/
apcupsd/null/apccontrol
apcupsd/null/changeme
apcupsd/null/commfailure
apcupsd/null/commok
apcupsd/null/doshutdown
apcupsd/null/offbattery
apcupsd/null/onbattery

sent 44729 bytes  received 210 bytes  29959.33 bytes/sec
total size is 70943  speedup is 1.58
rsync -av /root/apcupsd.init.d.anvil root@an-a05n02:/root/
sending incremental file list
apcupsd.init.d.anvil

sent 1276 bytes  received 31 bytes  871.33 bytes/sec
total size is 1188  speedup is 0.91

SELinux and apcupsd


We've got two SELinux issues to address:

  • Allow the second apcupsd daemon to use TCP and UDP ports 3552.
  • Allow both daemons to write to the non-standard log files.

You can see what ports selinux allows various applications to use with semanage port -l. This generates a lot of data, so we're interested just in seeing what ports apcupsd is already allowed to use. So we'll pipe it through grep.

an-a05n01
semanage port -l |grep apcups
apcupsd_port_t                 tcp      3551
apcupsd_port_t                 udp      3551
an-a05n02
semanage port -l |grep apcups
apcupsd_port_t                 tcp      3551
apcupsd_port_t                 udp      3551

We see that the apcupsd_port_t context is used for both tcp and udp. With this, we can simply add port 3552.

Note: These commands can take a while to run. Please be patient.
an-a05n01
semanage port -a -t apcupsd_port_t -p tcp 3552
semanage port -a -t apcupsd_port_t -p udp 3552
semanage port -l |grep apcups
apcupsd_port_t                 tcp      3552, 3551
apcupsd_port_t                 udp      3552, 3551
an-a05n02
semanage port -a -t apcupsd_port_t -p tcp 3552
semanage port -a -t apcupsd_port_t -p udp 3552
semanage port -l |grep apcups
apcupsd_port_t                 tcp      3552, 3551
apcupsd_port_t                 udp      3552, 3551

Next up, enabling the context for the /var/log/apcupsd.an-ups01.events and /var/log/apcupsd.an-ups02.events log files.

These files don't exist until the daemon starts for the first time. We've not started it yet, so the first task is to use touch to create these log files.

an-a05n01
touch /var/log/apcupsd.an-ups01.events
touch /var/log/apcupsd.an-ups02.events
an-a05n02
touch /var/log/apcupsd.an-ups01.events
touch /var/log/apcupsd.an-ups02.events

We don't have the default log file to check to see what context to use for our log files, but the apcupsd_selinux manual tells us that we need to set the apcupsd_log_t context.

an-a05n01
ls -lahZ /var/log/apcupsd.an-ups0*
-rw-r--r--. root root system_u:object_r:var_log_t:s0   /var/log/apcupsd.an-ups01.events
-rw-r--r--. root root system_u:object_r:var_log_t:s0   /var/log/apcupsd.an-ups02.events
semanage fcontext -a -t apcupsd_log_t /var/log/apcupsd.an-ups01.events 
semanage fcontext -a -t apcupsd_log_t /var/log/apcupsd.an-ups02.events 
restorecon /var/log/apcupsd.an-ups01.events 
restorecon /var/log/apcupsd.an-ups02.events 
ls -lahZ /var/log/apcupsd.an-ups0*
-rw-r--r--. root root system_u:object_r:apcupsd_log_t:s0 /var/log/apcupsd.an-ups01.events
-rw-r--r--. root root system_u:object_r:apcupsd_log_t:s0 /var/log/apcupsd.an-ups02.events
an-a05n02
ls -lahZ /var/log/apcupsd.an-ups0*
-rw-r--r--. root root system_u:object_r:var_log_t:s0   /var/log/apcupsd.an-ups01.events
-rw-r--r--. root root system_u:object_r:var_log_t:s0   /var/log/apcupsd.an-ups02.events
semanage fcontext -a -t apcupsd_log_t /var/log/apcupsd.an-ups01.events 
semanage fcontext -a -t apcupsd_log_t /var/log/apcupsd.an-ups02.events 
restorecon /var/log/apcupsd.an-ups01.events 
restorecon /var/log/apcupsd.an-ups02.events 
ls -lahZ /var/log/apcupsd.an-ups0*
-rw-r--r--. root root system_u:object_r:apcupsd_log_t:s0 /var/log/apcupsd.an-ups01.events
-rw-r--r--. root root system_u:object_r:apcupsd_log_t:s0 /var/log/apcupsd.an-ups02.events

Ok, ready to test!

Testing the Multi-UPS apcupsd

If our edits above worked properly, we should now be able to start apcupsd and query our UPSes.

an-a05n01
/etc/init.d/apcupsd start
Starting UPS monitoring (apcupsd.an-ups01.conf):             [  OK  ]
Starting UPS monitoring (apcupsd.an-ups02.conf):             [  OK  ]
an-a05n02
/etc/init.d/apcupsd start
Starting UPS monitoring (apcupsd.an-ups01.conf):             [  OK  ]
Starting UPS monitoring (apcupsd.an-ups02.conf):             [  OK  ]

That looks good. Now the real test; Query the status of each UPS!

This generates a fair bit of output, so let's just look at an-a05n01 first.

an-a05n01
apcaccess status localhost:3551
APC      : 001,049,1198
DATE     : 2013-11-26 21:21:20 -0500  
HOSTNAME : an-a05n01.alteeve.ca
VERSION  : 3.14.10 (13 September 2011) redhat
UPSNAME  : an-ups01
CABLE    : Ethernet Link
DRIVER   : SNMP UPS Driver
UPSMODE  : Stand Alone
STARTTIME: 2013-11-26 21:18:16 -0500  
MODEL    : Smart-UPS 1500
STATUS   : ONLINE 
LINEV    : 123.0 Volts
LOADPCT  :  23.0 Percent Load Capacity
BCHARGE  : 100.0 Percent
TIMELEFT :  57.0 Minutes
MBATTCHG : 0 Percent
MINTIMEL : 0 Minutes
MAXTIME  : 0 Seconds
MAXLINEV : 123.0 Volts
MINLINEV : 121.0 Volts
OUTPUTV  : 123.0 Volts
SENSE    : Medium
DWAKE    : 1000 Seconds
DSHUTD   : 020 Seconds
DLOWBATT : 02 Minutes
LOTRANS  : 103.0 Volts
HITRANS  : 130.0 Volts
RETPCT   : 000.0 Percent
ITEMP    : 31.0 C Internal
ALARMDEL : 5 seconds
BATTV    : 27.0 Volts
LINEFREQ : 60.0 Hz
LASTXFER : Automatic or explicit self test
NUMXFERS : 0
TONBATT  : 0 seconds
CUMONBATT: 0 seconds
XOFFBATT : N/A
SELFTEST : OK
STESTI   : OFF
STATFLAG : 0x07000008 Status Flag
MANDATE  : 09/18/2010
SERIALNO : AS1038232403
BATTDATE : 09/01/2011
NOMOUTV  : 120 Volts
HUMIDITY : 6519592.0 Percent
AMBTEMP  : 6519592.0 C
EXTBATTS : 0
BADBATTS : 0
FIRMWARE : UPS 05.0 / COM 02.1
END APC  : 2013-11-26 21:21:29 -0500
apcaccess status localhost:3552
APC      : 001,050,1242
DATE     : 2013-11-26 21:21:48 -0500  
HOSTNAME : an-a05n01.alteeve.ca
VERSION  : 3.14.10 (13 September 2011) redhat
UPSNAME  : APCUPS
CABLE    : Ethernet Link
DRIVER   : SNMP UPS Driver
UPSMODE  : Stand Alone
STARTTIME: 2013-11-26 21:18:16 -0500  
MODEL    : Smart-UPS 1500
STATUS   : ONLINE 
LINEV    : 123.0 Volts
LOADPCT  :  22.0 Percent Load Capacity
BCHARGE  : 100.0 Percent
TIMELEFT :  58.0 Minutes
MBATTCHG : 0 Percent
MINTIMEL : 0 Minutes
MAXTIME  : 0 Seconds
MAXLINEV : 123.0 Volts
MINLINEV : 122.0 Volts
OUTPUTV  : 122.0 Volts
SENSE    : High
DWAKE    : 000 Seconds
DSHUTD   : 000 Seconds
DLOWBATT : 02 Minutes
LOTRANS  : 106.0 Volts
HITRANS  : 127.0 Volts
RETPCT   : 31817744.0 Percent
ITEMP    : 30.0 C Internal
ALARMDEL : 30 seconds
BATTV    : 27.0 Volts
LINEFREQ : 60.0 Hz
LASTXFER : Automatic or explicit self test
NUMXFERS : 0
TONBATT  : 0 seconds
CUMONBATT: 0 seconds
XOFFBATT : N/A
SELFTEST : OK
STESTI   : OFF
STATFLAG : 0x07000008 Status Flag
MANDATE  : 06/14/2012
SERIALNO : AS1224213144
BATTDATE : 10/15/2012
NOMOUTV  : 120 Volts
NOMBATTV : 31817744.0 Volts
HUMIDITY : 6519592.0 Percent
AMBTEMP  : 6519592.0 C
EXTBATTS : 31817744
BADBATTS : 6519592
FIRMWARE : UPS 08.3 / MCU 14.0
END APC  : 2013-11-26 21:21:57 -0500

Looking at the serial numbers, we can see that they differ from each other and match the ones we have on record. This confirms that we're talking to both UPSes!

Before we look at an-a05n02, the keen observer will have noted that some of the sensor values are clearly unrealistic (a humidity of 6519592.0 percent, for example). Some UPSes optionally support environmental sensors and, without them, the reported values are meaningless. Those can be safely ignored and are not used by the monitoring and alert system.

So, let's confirm that the same calls from an-a05n02 result in the same values!

an-a05n02
apcaccess status localhost:3551
APC      : 001,049,1198
DATE     : 2013-11-26 22:14:12 -0500  
HOSTNAME : an-a05n02.alteeve.ca
VERSION  : 3.14.10 (13 September 2011) redhat
UPSNAME  : an-ups01
CABLE    : Ethernet Link
DRIVER   : SNMP UPS Driver
UPSMODE  : Stand Alone
STARTTIME: 2013-11-26 21:19:30 -0500  
MODEL    : Smart-UPS 1500
STATUS   : ONLINE 
LINEV    : 122.0 Volts
LOADPCT  :  23.0 Percent Load Capacity
BCHARGE  : 100.0 Percent
TIMELEFT :  57.0 Minutes
MBATTCHG : 0 Percent
MINTIMEL : 0 Minutes
MAXTIME  : 0 Seconds
MAXLINEV : 123.0 Volts
MINLINEV : 122.0 Volts
OUTPUTV  : 122.0 Volts
SENSE    : Medium
DWAKE    : 1000 Seconds
DSHUTD   : 020 Seconds
DLOWBATT : 02 Minutes
LOTRANS  : 103.0 Volts
HITRANS  : 130.0 Volts
RETPCT   : 000.0 Percent
ITEMP    : 31.0 C Internal
ALARMDEL : 5 seconds
BATTV    : 27.0 Volts
LINEFREQ : 60.0 Hz
LASTXFER : Automatic or explicit self test
NUMXFERS : 0
TONBATT  : 0 seconds
CUMONBATT: 0 seconds
XOFFBATT : N/A
SELFTEST : OK
STESTI   : OFF
STATFLAG : 0x07000008 Status Flag
MANDATE  : 09/18/2010
SERIALNO : AS1038232403
BATTDATE : 09/01/2011
NOMOUTV  : 120 Volts
HUMIDITY : 6519592.0 Percent
AMBTEMP  : 6519592.0 C
EXTBATTS : 0
BADBATTS : 0
FIRMWARE : UPS 05.0 / COM 02.1
END APC  : 2013-11-26 22:14:22 -0500
apcaccess status localhost:3552
APC      : 001,050,1242
DATE     : 2013-11-26 22:14:11 -0500  
HOSTNAME : an-a05n02.alteeve.ca
VERSION  : 3.14.10 (13 September 2011) redhat
UPSNAME  : APCUPS
CABLE    : Ethernet Link
DRIVER   : SNMP UPS Driver
UPSMODE  : Stand Alone
STARTTIME: 2013-11-26 21:19:30 -0500  
MODEL    : Smart-UPS 1500
STATUS   : ONLINE 
LINEV    : 123.0 Volts
LOADPCT  :  22.0 Percent Load Capacity
BCHARGE  : 100.0 Percent
TIMELEFT :  58.0 Minutes
MBATTCHG : 0 Percent
MINTIMEL : 0 Minutes
MAXTIME  : 0 Seconds
MAXLINEV : 123.0 Volts
MINLINEV : 122.0 Volts
OUTPUTV  : 123.0 Volts
SENSE    : High
DWAKE    : 000 Seconds
DSHUTD   : 000 Seconds
DLOWBATT : 02 Minutes
LOTRANS  : 106.0 Volts
HITRANS  : 127.0 Volts
RETPCT   : 19898384.0 Percent
ITEMP    : 30.0 C Internal
ALARMDEL : 30 seconds
BATTV    : 27.0 Volts
LINEFREQ : 60.0 Hz
LASTXFER : Automatic or explicit self test
NUMXFERS : 0
TONBATT  : 0 seconds
CUMONBATT: 0 seconds
XOFFBATT : N/A
SELFTEST : OK
STESTI   : OFF
STATFLAG : 0x07000008 Status Flag
MANDATE  : 06/14/2012
SERIALNO : AS1224213144
BATTDATE : 10/15/2012
NOMOUTV  : 120 Volts
NOMBATTV : 19898384.0 Volts
HUMIDITY : 6519592.0 Percent
AMBTEMP  : 6519592.0 C
EXTBATTS : 19898384
BADBATTS : 6519592
FIRMWARE : UPS 08.3 / MCU 14.0
END APC  : 2013-11-26 22:14:38 -0500

Exactly what we wanted!

Later, when we set up the monitoring and alert system, we'll take a closer look at some of these variables and their possible values.
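
In the meantime, if you just want a quick summary instead of the full status dump, apcaccess output pipes nicely through grep. The handful of variables picked out below is simply an example of ones worth watching:

an-a05n01
# Pull a few key values from the first UPS; use port 3552 for the second.
apcaccess status localhost:3551 | grep -E "UPSNAME|STATUS|LINEV|BCHARGE|TIMELEFT"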

Monitoring Storage

At this time, this section covers monitoring LSI-based RAID controllers. If you have a different RAID controller and wish to contribute, we'd love to hear from you.

Monitoring LSI-Based RAID Controllers with MegaCli

Many tier-1 hardware vendors as well as many mid-tier and in-house brand servers use controllers built by or based on LSI RAID controller cards.

Installing MegaCli

In this section, we'll install LSI's MegaCli64 command-line tool for monitoring our storage. This is a commercial tool, so you must download it directly from LSI's website and agree to their license agreement.

At the time of writing, you can download it using this link. Click on the orange "+" to the right of "Management Software and Tools" in the search results page. Click on the "Download" icon and save the file to disk. Extract the MegaCli_Linux.zip file and switch to the /MegaCli_Linux directory.

Note: The version of the file name shown below may have changed.

Copy the MegaCli-8.07.08-1.noarch.rpm file to your nodes.

rsync -av MegaCli-8.07.08-1.noarch.rpm root@an-a05n01:/root/
sending incremental file list
MegaCli-8.07.08-1.noarch.rpm

sent 1552828 bytes  received 31 bytes  345079.78 bytes/sec
total size is 1552525  speedup is 1.00
rsync -av MegaCli-8.07.08-1.noarch.rpm root@an-a05n02:/root/
sending incremental file list
MegaCli-8.07.08-1.noarch.rpm

sent 1552828 bytes  received 31 bytes  345079.78 bytes/sec
total size is 1552525  speedup is 1.00

Now we can install the program on our nodes.

an-a05n01
rpm -Uvh MegaCli-8.07.08-1.noarch.rpm
Preparing...                ########################################### [100%]
   1:MegaCli                ########################################### [100%]
an-a05n02
rpm -Uvh MegaCli-8.07.08-1.noarch.rpm
Preparing...                ########################################### [100%]
   1:MegaCli                ########################################### [100%]

By default, the MegaCli64 binary is saved in /opt/MegaRAID/MegaCli/MegaCli64. This isn't in RHEL's default PATH, so we will want to make a symlink to /sbin. This way, we can simply type 'MegaCli64' instead of the full path.


an-a05n01
ln -s /opt/MegaRAID/MegaCli/MegaCli64 /sbin/
ls -lah /sbin/MegaCli64
lrwxrwxrwx. 1 root root 31 Nov 28 19:28 /sbin/MegaCli64 -> /opt/MegaRAID/MegaCli/MegaCli64
an-a05n02
ln -s /opt/MegaRAID/MegaCli/MegaCli64 /sbin/
ls -lah /sbin/MegaCli64
lrwxrwxrwx. 1 root root 31 Nov 28 19:28 /sbin/MegaCli64 -> /opt/MegaRAID/MegaCli/MegaCli64

Excellent.

Checking Storage Health with MegaCli64

Warning: This tutorial was written using a development server and, as such, has only four drives in each array. All production servers should have a minimum of six drives to help ensure good storage response time under highly random reads and writes seen in virtualized environments.

LSI RAID controllers are designed to work alone or in conjunction with other LSI controllers at the same time. For this reason, MegaCli64 supports multiple controllers, virtual disks, physical disks and so on. We're going to be using aAll a lot. This simply tells MegaCli64 to show whatever we're asking for from all found adapters.

The program itself is extremely powerful. Trying to cover all the ways that it can be used would require a long tutorial in and of itself. So we're going to just look at some core tasks that we're interested in. If you want to experiment, there is a great cheat-sheet here.
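
As a gentle first query, the adapter's own summary is a safe, read-only place to start. The grep filter below is just an assumption about which fields are interesting; adjust or drop it to see the full output:

an-a05n01
# Show a summary of the adapter; filter for a few fields we usually care about.
MegaCli64 AdpAllInfo aAll | grep -iE "product name|fw version|memory size"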

Let's start by looking at the logical drive.

an-a05n01 an-a05n02
MegaCli64 LDInfo Lall aAll
Adapter 0 -- Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Name                :
RAID Level          : Primary-5, Secondary-0, RAID Level Qualifier-3
Size                : 836.625 GB
Sector Size         : 512
Parity Size         : 278.875 GB
State               : Optimal
Strip Size          : 64 KB
Number Of Drives    : 4
Span Depth          : 1
Default Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy   : Disabled
Encryption Type     : None
Bad Blocks Exist: No
Is VD Cached: No

Exit Code: 0x00
MegaCli64 LDInfo Lall aAll
Adapter 0 -- Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Name                :
RAID Level          : Primary-5, Secondary-0, RAID Level Qualifier-3
Size                : 836.625 GB
Sector Size         : 512
Parity Size         : 278.875 GB
State               : Optimal
Strip Size          : 64 KB
Number Of Drives    : 4
Span Depth          : 1
Default Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy   : Disabled
Encryption Type     : None
Bad Blocks Exist: No
Is VD Cached: No

Exit Code: 0x00

Here we can see that the virtual disk has four real disks in RAID level 5, it is 836.625 GB in size and it's in WriteBack caching mode. This is pretty typical, save for the number of disks.
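
If all you want is a quick health check rather than the full report, the same command filters down nicely; anything other than "Optimal" in the output below deserves a closer look.

an-a05n01 an-a05n02
MegaCli64 LDInfo Lall aAll | grep State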

Let's look now at the health of the RAID controller's battery.

an-a05n01 an-a05n02
MegaCli64 AdpBbuCmd aAll
BBU status for Adapter: 0

BatteryType: iBBU
Voltage: 4083 mV
Current: 0 mA
Temperature: 28 C
Battery State: Optimal
BBU Firmware Status:

  Charging Status              : None
  Voltage                                 : OK
  Temperature                             : OK
  Learn Cycle Requested	                  : No
  Learn Cycle Active                      : No
  Learn Cycle Status                      : OK
  Learn Cycle Timeout                     : No
  I2c Errors Detected                     : No
  Battery Pack Missing                    : No
  Battery Replacement required            : No
  Remaining Capacity Low                  : No
  Periodic Learn Required                 : No
  Transparent Learn                       : No
  No space to cache offload               : No
  Pack is about to fail & should be replaced : No
  Cache Offload premium feature required  : No
  Module microcode update required        : No


GasGuageStatus:
  Fully Discharged        : No
  Fully Charged           : Yes
  Discharging             : Yes
  Initialized             : Yes
  Remaining Time Alarm    : No
  Discharge Terminated    : No
  Over Temperature        : No
  Charging Terminated     : No
  Over Charged            : No
  Relative State of Charge: 100 %
  Charger System State: 49168
  Charger System Ctrl: 0
  Charging current: 0 mA
  Absolute state of charge: 74 %
  Max Error: 2 %
  Battery backup charge time : 0 hours

BBU Capacity Info for Adapter: 0

  Relative State of Charge: 100 %
  Absolute State of charge: 74 %
  Remaining Capacity: 902 mAh
  Full Charge Capacity: 906 mAh
  Run time to empty: Battery is not being charged.  
  Average time to empty: Battery is not being charged.  
  Estimated Time to full recharge: Battery is not being charged.  
  Cycle Count: 35
Max Error = 2 %
Remaining Capacity Alarm = 120 mAh
Remining Time Alarm = 10 Min

BBU Design Info for Adapter: 0

  Date of Manufacture: 10/22, 2010
  Design Capacity: 1215 mAh
  Design Voltage: 3700 mV
  Specification Info: 33
  Serial Number: 15686
  Pack Stat Configuration: 0x6490
  Manufacture Name: LS1121001A
  Firmware Version   : 
  Device Name: 3150301
  Device Chemistry: LION
  Battery FRU: N/A
  Transparent Learn = 0
  App Data = 0

BBU Properties for Adapter: 0

  Auto Learn Period: 30 Days
  Next Learn time: Wed Dec 18 16:47:41 2013
  Learn Delay Interval:0 Hours
  Auto-Learn Mode: Enabled

Exit Code: 0x00
MegaCli64 AdpBbuCmd aAll
BBU status for Adapter: 0

BatteryType: iBBU
Voltage: 4048 mV
Current: 0 mA
Temperature: 27 C
Battery State: Optimal
BBU Firmware Status:

  Charging Status              : None
  Voltage                                 : OK
  Temperature                             : OK
  Learn Cycle Requested	                  : No
  Learn Cycle Active                      : No
  Learn Cycle Status                      : OK
  Learn Cycle Timeout                     : No
  I2c Errors Detected                     : No
  Battery Pack Missing                    : No
  Battery Replacement required            : No
  Remaining Capacity Low                  : No
  Periodic Learn Required                 : No
  Transparent Learn                       : No
  No space to cache offload               : No
  Pack is about to fail & should be replaced : No
  Cache Offload premium feature required  : No
  Module microcode update required        : No


GasGuageStatus:
  Fully Discharged        : No
  Fully Charged           : Yes
  Discharging             : Yes
  Initialized             : Yes
  Remaining Time Alarm    : No
  Discharge Terminated    : No
  Over Temperature        : No
  Charging Terminated     : No
  Over Charged            : No
  Relative State of Charge: 98 %
  Charger System State: 49168
  Charger System Ctrl: 0
  Charging current: 0 mA
  Absolute state of charge: 68 %
  Max Error: 2 %
  Battery backup charge time : 0 hours

BBU Capacity Info for Adapter: 0

  Relative State of Charge: 98 %
  Absolute State of charge: 68 %
  Remaining Capacity: 821 mAh
  Full Charge Capacity: 841 mAh
  Run time to empty: Battery is not being charged.  
  Average time to empty: Battery is not being charged.  
  Estimated Time to full recharge: Battery is not being charged.  
  Cycle Count: 31
Max Error = 2 %
Remaining Capacity Alarm = 120 mAh
Remining Time Alarm = 10 Min

BBU Design Info for Adapter: 0

  Date of Manufacture: 10/23, 2010
  Design Capacity: 1215 mAh
  Design Voltage: 3700 mV
  Specification Info: 33
  Serial Number: 18704
  Pack Stat Configuration: 0x64b0
  Manufacture Name: LS1121001A
  Firmware Version   : 
  Device Name: 3150301
  Device Chemistry: LION
  Battery FRU: N/A
  Transparent Learn = 0
  App Data = 0

BBU Properties for Adapter: 0

  Auto Learn Period: 30 Days
  Next Learn time: Mon Dec 23 05:29:33 2013
  Learn Delay Interval:0 Hours
  Auto-Learn Mode: Enabled

Exit Code: 0x00

Now this gives us quite a bit of data.

The battery's principal job is to protect the data stored in the RAM module used to buffer writes (and a certain amount of reads) that have not yet been flushed to the physical disks. This is critical because, if this data were lost, the contents of the disk could be corrupted.

This battery is generally used when the node loses power. Depending on whether your node has a battery-backed write-cache (BBU) or a flash-backed write-cache (FBWC), the battery will either keep the data in RAM until power is restored (BBU) or power the cache just long enough to copy its contents to persistent solid-state storage built into the battery or RAID controller (FBWC).

If your server uses a BBU, then watch the "hold up time". The controller above doesn't report this because it is a flash-backed controller. If yours is a battery-backed controller, you will see a variable like:

  Battery backup charge time : 48 hours +

This tells you that the node can protect the contents of the cache for greater than 48 hours. This means that, so long as power is restored to the server within two days, your data will be protected. Generally, if the hold up time falls below 24 hours, the BBU should be replaced. This happens because, as batteries age, they lose capacity. This is simple chemistry.

Note that periodically, usually once per month, the controller intentionally drains and recharges the battery. This is called a "relearn cycle" (or simply a "learn cycle"). It is a way for the controller to verify the health of the battery. Should a battery fail to recharge, it will be declared dead and will need to be replaced.

Note that it is normal for the cache policy to switch from "write-back" to "write-through" once the battery is sufficiently drained. The controller should return to "write-back" mode once the learn cycle completes and the battery has charged enough. During this time, write performance will be reduced because all writes have to go straight to the physical disks instead of just to the cache, which is slower.
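
To keep an eye on the battery between full reports, the same BBU command can be filtered down to a few key lines. The fields below all appear in the output above; pick whichever ones matter for your controller:

an-a05n01 an-a05n02
# Quick battery health check.
MegaCli64 AdpBbuCmd aAll | grep -E "Battery State|Relative State of Charge|Battery Replacement required|Next Learn time"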

Lastly, let's look at the individual drives.

an-a05n01 an-a05n02
MegaCli64 PDList aAll
Adapter #0

Enclosure Device ID: 252
Slot Number: 0
Drive's position: DiskGroup: 0, Span: 0, Arm: 1
Enclosure position: N/A
Device Id: 7
WWN: 5000C50043EE29E0
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SAS

Raw Size: 279.396 GB [0x22ecb25c Sectors]
Non Coerced Size: 278.896 GB [0x22dcb25c Sectors]
Coerced Size: 278.875 GB [0x22dc0000 Sectors]
Sector Size:  0
Firmware state: Online, Spun Up
Device Firmware Level: 1703
Shield Counter: 0
Successful diagnostics completion on :  N/A
SAS Address(0): 0x5000c50043ee29e1
SAS Address(1): 0x0
Connected Port Number: 3(path0) 
Inquiry Data: SEAGATE ST3300657SS     17036SJ3T7X6    @#87980 
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None 
Device Speed: 6.0Gb/s 
Link Speed: 6.0Gb/s 
Media Type: Hard Disk Device
Drive:  Not Certified
Drive Temperature :39C (102.20 F)
PI Eligibility:  No 
Drive is formatted for PI information:  No
PI: No PI
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s 
Port-1 :
Port status: Active
Port's Linkspeed: Unknown 
Drive has flagged a S.M.A.R.T alert : No



Enclosure Device ID: 252
Slot Number: 1
Drive's position: DiskGroup: 0, Span: 0, Arm: 2
Enclosure position: N/A
Device Id: 6
WWN: 5000C5004310F4B4
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SAS

Raw Size: 279.396 GB [0x22ecb25c Sectors]
Non Coerced Size: 278.896 GB [0x22dcb25c Sectors]
Coerced Size: 278.875 GB [0x22dc0000 Sectors]
Sector Size:  0
Firmware state: Online, Spun Up
Device Firmware Level: 1703
Shield Counter: 0
Successful diagnostics completion on :  N/A
SAS Address(0): 0x5000c5004310f4b5
SAS Address(1): 0x0
Connected Port Number: 2(path0) 
Inquiry Data: SEAGATE ST3300657SS     17036SJ3CMMC    @#87980 
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None 
Device Speed: 6.0Gb/s 
Link Speed: 6.0Gb/s 
Media Type: Hard Disk Device
Drive:  Not Certified
Drive Temperature :42C (107.60 F)
PI Eligibility:  No 
Drive is formatted for PI information:  No
PI: No PI
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s 
Port-1 :
Port status: Active
Port's Linkspeed: Unknown 
Drive has flagged a S.M.A.R.T alert : No



Enclosure Device ID: 252
Slot Number: 2
Drive's position: DiskGroup: 0, Span: 0, Arm: 0
Enclosure position: N/A
Device Id: 5
WWN: 5000C500430189E4
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SAS

Raw Size: 279.396 GB [0x22ecb25c Sectors]
Non Coerced Size: 278.896 GB [0x22dcb25c Sectors]
Coerced Size: 278.875 GB [0x22dc0000 Sectors]
Sector Size:  0
Firmware state: Online, Spun Up
Device Firmware Level: 1703
Shield Counter: 0
Successful diagnostics completion on :  N/A
SAS Address(0): 0x5000c500430189e5
SAS Address(1): 0x0
Connected Port Number: 0(path0) 
Inquiry Data: SEAGATE ST3300657SS     17036SJ3CD2Z    @#87980 
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None 
Device Speed: 6.0Gb/s 
Link Speed: 6.0Gb/s 
Media Type: Hard Disk Device
Drive:  Not Certified
Drive Temperature :39C (102.20 F)
PI Eligibility:  No 
Drive is formatted for PI information:  No
PI: No PI
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s 
Port-1 :
Port status: Active
Port's Linkspeed: Unknown 
Drive has flagged a S.M.A.R.T alert : No



Enclosure Device ID: 252
Slot Number: 6
Drive's position: DiskGroup: 0, Span: 0, Arm: 3
Enclosure position: N/A
Device Id: 11
WWN: 5000CCA00FAEC0BF
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SAS

Raw Size: 419.186 GB [0x3465f870 Sectors]
Non Coerced Size: 418.686 GB [0x3455f870 Sectors]
Coerced Size: 418.656 GB [0x34550000 Sectors]
Sector Size:  0
Firmware state: Online, Spun Up
Device Firmware Level: A42B
Shield Counter: 0
Successful diagnostics completion on :  N/A
SAS Address(0): 0x5000cca00faec0bd
SAS Address(1): 0x0
Connected Port Number: 1(path0) 
Inquiry Data: HITACHI HUS156045VLS600 A42BJVY33ARM            
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None 
Device Speed: 6.0Gb/s 
Link Speed: 6.0Gb/s 
Media Type: Hard Disk Device
Drive:  Not Certified
Drive Temperature :37C (98.60 F)
PI Eligibility:  No 
Drive is formatted for PI information:  No
PI: No PI
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s 
Port-1 :
Port status: Active
Port's Linkspeed: Unknown 
Drive has flagged a S.M.A.R.T alert : No

Exit Code: 0x00
MegaCli64 PDList aAll
Adapter #0

Enclosure Device ID: 252
Slot Number: 0
Drive's position: DiskGroup: 0, Span: 0, Arm: 0
Enclosure position: N/A
Device Id: 10
WWN: 5000C50043112280
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SAS

Raw Size: 279.396 GB [0x22ecb25c Sectors]
Non Coerced Size: 278.896 GB [0x22dcb25c Sectors]
Coerced Size: 278.875 GB [0x22dc0000 Sectors]
Sector Size:  0
Firmware state: Online, Spun Up
Device Firmware Level: 1703
Shield Counter: 0
Successful diagnostics completion on :  N/A
SAS Address(0): 0x5000c50043112281
SAS Address(1): 0x0
Connected Port Number: 3(path0) 
Inquiry Data: SEAGATE ST3300657SS     17036SJ3DE9Z    @#87980 
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None 
Device Speed: 6.0Gb/s 
Link Speed: 6.0Gb/s 
Media Type: Hard Disk Device
Drive:  Not Certified
Drive Temperature :39C (102.20 F)
PI Eligibility:  No 
Drive is formatted for PI information:  No
PI: No PI
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s 
Port-1 :
Port status: Active
Port's Linkspeed: Unknown 
Drive has flagged a S.M.A.R.T alert : No



Enclosure Device ID: 252
Slot Number: 1
Drive's position: DiskGroup: 0, Span: 0, Arm: 1
Enclosure position: N/A
Device Id: 9
WWN: 5000C5004312760C
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SAS

Raw Size: 279.396 GB [0x22ecb25c Sectors]
Non Coerced Size: 278.896 GB [0x22dcb25c Sectors]
Coerced Size: 278.875 GB [0x22dc0000 Sectors]
Sector Size:  0
Firmware state: Online, Spun Up
Device Firmware Level: 1703
Shield Counter: 0
Successful diagnostics completion on :  N/A
SAS Address(0): 0x5000c5004312760d
SAS Address(1): 0x0
Connected Port Number: 2(path0) 
Inquiry Data: SEAGATE ST3300657SS     17036SJ3DNG7    @#87980 
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None 
Device Speed: 6.0Gb/s 
Link Speed: 6.0Gb/s 
Media Type: Hard Disk Device
Drive:  Not Certified
Drive Temperature :40C (104.00 F)
PI Eligibility:  No 
Drive is formatted for PI information:  No
PI: No PI
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s 
Port-1 :
Port status: Active
Port's Linkspeed: Unknown 
Drive has flagged a S.M.A.R.T alert : No



Enclosure Device ID: 252
Slot Number: 2
Drive's position: DiskGroup: 0, Span: 0, Arm: 2
Enclosure position: N/A
Device Id: 8
WWN: 5000C50043126B4C
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SAS

Raw Size: 279.396 GB [0x22ecb25c Sectors]
Non Coerced Size: 278.896 GB [0x22dcb25c Sectors]
Coerced Size: 278.875 GB [0x22dc0000 Sectors]
Sector Size:  0
Firmware state: Online, Spun Up
Device Firmware Level: 1703
Shield Counter: 0
Successful diagnostics completion on :  N/A
SAS Address(0): 0x5000c50043126b4d
SAS Address(1): 0x0
Connected Port Number: 0(path0) 
Inquiry Data: SEAGATE ST3300657SS     17036SJ3E01G    @#87980 
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None 
Device Speed: 6.0Gb/s 
Link Speed: 6.0Gb/s 
Media Type: Hard Disk Device
Drive:  Not Certified
Drive Temperature :37C (98.60 F)
PI Eligibility:  No 
Drive is formatted for PI information:  No
PI: No PI
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s 
Port-1 :
Port status: Active
Port's Linkspeed: Unknown 
Drive has flagged a S.M.A.R.T alert : No



Enclosure Device ID: 252
Slot Number: 6
Drive's position: DiskGroup: 0, Span: 0, Arm: 3
Enclosure position: N/A
Device Id: 5
WWN: 5000CCA00F5CA29F
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SAS

Raw Size: 419.186 GB [0x3465f870 Sectors]
Non Coerced Size: 418.686 GB [0x3455f870 Sectors]
Coerced Size: 418.656 GB [0x34550000 Sectors]
Sector Size:  0
Firmware state: Online, Spun Up
Device Firmware Level: A42B
Shield Counter: 0
Successful diagnostics completion on :  N/A
SAS Address(0): 0x5000cca00f5ca29d
SAS Address(1): 0x0
Connected Port Number: 1(path0) 
Inquiry Data: HITACHI HUS156045VLS600 A42BJVWMYA6L            
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None 
Device Speed: 6.0Gb/s 
Link Speed: 6.0Gb/s 
Media Type: Hard Disk Device
Drive:  Not Certified
Drive Temperature :34C (93.20 F)
PI Eligibility:  No 
Drive is formatted for PI information:  No
PI: No PI
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s 
Port-1 :
Port status: Active
Port's Linkspeed: Unknown 
Drive has flagged a S.M.A.R.T alert : No

Exit Code: 0x00

This shows us a lot of information about each hard drive in the array. The main pieces to watch are:

Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Drive Temperature :34C (93.20 F)
Drive has flagged a S.M.A.R.T alert : No
Note: It is normal for Other Error Count to increment by 1 periodically. If it jumps by more than 1, or if it jumps multiple times within a few days, consult your system provider and inquire about replacing the drive.

These values show us the overall health of the drive. For most hard drives, the temperature should stay below 55C at all times, and any temperature over 45C should be investigated. The error and failure counts should stay at 0, save for the exception mentioned in the note above.
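If you just want a quick summary of these fields without reading the full report, a simple grep will pull them out. This is only a convenience one-liner, not one of the tutorial's tools:

MegaCli64 PDList aAll | grep -E 'Slot Number|Media Error Count|Other Error Count|Predictive Failure Count|Drive Temperature|S.M.A.R.T alert'

This prints the slot number followed by the health fields listed above for each drive, which is usually enough for a quick spot check.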

As mentioned, there are many, many other ways to use MegaCli64. If a drive ever fails, you can use it to prepare the drive for removal while the system is running. You can use it to adjust when the learn cycle runs, adjust cache policy and do many other things. It is well worth learning in more depth. However, that is outside the scope of this section.

Managing MegaSAS.log

Each time MegaCli64 runs, it writes to the /root/MegaSAS.log file. Later, we're going to set up a monitoring and alert system that checks the health of each node every 30 seconds. This program calls MegaCli64 three times per pass, so the MegaSAS.log file can grow to a decent size.
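Conceptually, the archive script we're about to download simply rotates this log so that it can not grow without bound. As a rough sketch only (the real archive_megasas.log.sh may differ in its details), the logic looks something like this:

#!/bin/bash
# Hypothetical sketch of a MegaSAS.log rotation; see the downloaded
# archive_megasas.log.sh for the authoritative version.
LOG=/root/MegaSAS.log
if [ -s "$LOG" ]; then
	# Move the current log aside with a date stamp, then compress it.
	STAMP=$(date +%Y-%m-%d)
	mv "$LOG" "${LOG}.${STAMP}"
	gzip -f "${LOG}.${STAMP}"
fi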

Let's download /root/archive_megasas.log.sh and make it executable.

an-a05n01
cd ~
wget -c https://raw.github.com/digimer/an-cdb/master/tools/archive_megasas.log.sh
--2014-02-24 19:37:58--  https://raw.github.com/digimer/an-cdb/master/tools/archive_megasas.log.sh
Resolving raw.github.com... 199.27.73.133
Connecting to raw.github.com|199.27.73.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 814 [text/plain]
Saving to: `archive_megasas.log.sh'

100%[====================================================================>] 814         --.-K/s   in 0s      

2014-02-24 19:37:59 (27.1 MB/s) - `archive_megasas.log.sh' saved [814/814]
chmod 755 archive_megasas.log.sh
ls -lah archive_megasas.log.sh
-rwxr-xr-x. 1 root root 814 Feb 24 19:37 archive_megasas.log.sh
an-a05n02
cd ~
wget -c https://raw.github.com/digimer/an-cdb/master/tools/archive_megasas.log.sh
--2014-02-24 19:37:59--  https://raw.github.com/digimer/an-cdb/master/tools/archive_megasas.log.sh
Resolving raw.github.com... 199.27.73.133
Connecting to raw.github.com|199.27.73.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 814 [text/plain]
Saving to: `archive_megasas.log.sh'

100%[====================================================================>] 814         --.-K/s   in 0s      

2014-02-24 19:37:59 (27.3 MB/s) - `archive_megasas.log.sh' saved [814/814]
chmod 755 archive_megasas.log.sh
ls -lah archive_megasas.log.sh
-rwxr-xr-x. 1 root root 814 Feb 24 19:37 archive_megasas.log.sh

We'll call crontab -e to edit the cron table and add three entries for these programs. If you already added the archive_an-cm.log.sh entry, then simply append the other two.

an-a05n01
crontab -e
*/5 * * * * /root/an-cm >> /var/log/an-cm.log
0 1 * * *  /root/archive_megasas.log.sh > /dev/null
0 0 1 * *  /root/archive_an-cm.log.sh > /dev/null
an-a05n02
crontab -e
*/5 * * * * /root/an-cm >> /var/log/an-cm.log
0 1 * * *  /root/archive_megasas.log.sh > /dev/null
0 0 1 * *  /root/archive_an-cm.log.sh > /dev/null

Save and quit. Within five minutes, you should see an email telling you that the monitoring system has started up again.
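If you want to double-check that the entries were saved, crontab -l prints root's current cron table on each node:

crontab -l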

We're done!

Configuring The Cluster Foundation

We need to configure the cluster in two stages. This is because we have something of a chicken-and-egg problem:

  • We need clustered storage for our virtual machines.
  • Our clustered storage needs the cluster for fencing.

Conveniently, clustering has two logical parts:

  • Cluster communication and membership.
  • Cluster resource management.

The first, communication and membership, covers which nodes are part of the cluster and is responsible for ejecting faulty nodes from the cluster, among other tasks. This is managed by cman. The second part, resource management, is provided by a second tool called rgmanager. It's this second part that we will set aside for later. In short though, it makes sure that the clustered services (storage and the virtual servers) are always running whenever possible.

Keeping Time in Sync

Note: This section is only relevant to networks that block access to external time sources, called "NTP servers".

It is very important that time on both nodes be kept in sync. The way to do this is to set up NTP, the network time protocol.

Earlier on, we set up ntpd to start on boot. For most people, that is enough and you can skip to the next section.

However, some particularly restrictive networks will block access to external time servers. If you're on one of these networks, ask your admin (if you don't know already) what name or IP to use as a time source. Once you have this, you can enter the following command to add it to the NTP configuration. We'll use the example time source ntp.example.ca.

First, add the time server to the NTP configuration file by appending the following lines to the end of it.

an-a05n01
echo server ntp.example.ca$'\n'restrict ntp.example.ca mask 255.255.255.255 nomodify notrap noquery >> /etc/ntp.conf
an-a05n02
echo server ntp.example.ca$'\n'restrict ntp.example.ca mask 255.255.255.255 nomodify notrap noquery >> /etc/ntp.conf

Restart the ntpd daemon and your nodes should shortly update their times.

an-a05n01
/etc/init.d/ntpd restart
Shutting down ntpd:                                        [  OK  ]
Starting ntpd:                                             [  OK  ]
an-a05n02
/etc/init.d/ntpd restart
Shutting down ntpd:                                        [  OK  ]
Starting ntpd:                                             [  OK  ]

Use the date command on both nodes to ensure the times match. If they don't, give it a few minutes; the ntpd daemon syncs every few minutes.
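You can also ask ntpd directly which time sources it is using. The ntpq tool ships with the ntp package, and a peer marked with a leading '*' is the server currently selected for synchronization:

ntpq -p

If your internal time server never earns the '*' marker after several minutes, double-check the server and restrict lines added above.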

Alternate Configuration Methods

In Red Hat Cluster Services, the heart of the cluster is found in the /etc/cluster/cluster.conf XML configuration file.

There are three main ways of editing this file. Two are already well documented, so I won't bother discussing them, beyond introducing them. The third way is by directly hand-crafting the cluster.conf file. We've found that directly editing configuration files is the best way to learn clustering at a deep level. For this reason, it is the method we'll use here.

The two graphical tools are:

  • system-config-cluster, an older GUI tool run directly from one of the cluster nodes.
  • Conga, comprised of the ricci node-side client and the luci web-based server (can be run on machines outside the cluster).

After you've gotten comfortable with HA clustering, you may want to go back and play with these tools. They can certainly be time-savers.

The First cluster.conf Foundation Configuration

The very first stage of building the cluster is to create a configuration file that is as minimal as possible. We're going to do this on an-a05n01 and, when we're done, copy it over to an-a05n02.

Name the Cluster and Set the Configuration Version

The cluster tag is the parent tag for the entire cluster configuration file.

an-a05n01
vim /etc/cluster/cluster.conf
<?xml version="1.0"?>
<cluster name="an-anvil-05" config_version="1">
</cluster>

The cluster element has two attributes that we need to set:

  • name=""
  • config_version=""

The name="" attribute defines the name of the cluster. It must be unique amongst the clusters on your network. It should be descriptive, but you will not want to make it too long, either. You will see this name in the various cluster tools and you will enter in, for example, when creating a GFS2 partition later on. This tutorial uses the cluster name an-anvil-05.

The config_version="" attribute is an integer indicating the version of the configuration file. Whenever you make a change to the cluster.conf file, you will need to increment. If you don't increment this number, then the cluster tools will not know that the file needs to be reloaded. As this is the first version of this configuration file, it will start with 1. Note that this tutorial will increment the version after every change, regardless of whether it is explicitly pushed out to the other nodes and reloaded. The reason is to help get into the habit of always increasing this value.

Configuring cman Options

We are setting up a special kind of cluster, called a 2-Node cluster.

This is a special case because traditional quorum will not be useful. With only two nodes, each having a vote of 1, the total votes is 2. Quorum needs 50% + 1, which means that a single node failure would shut down the cluster, as the remaining node's vote is exactly 50%. That kind of defeats the purpose of having a cluster at all.

So to account for this special case, there is a special attribute called two_node="1". This tells the cluster manager to continue operating with only one vote. This option requires that the expected_votes="" attribute be set to 1. Normally, expected_votes is set automatically to the total sum of the defined cluster nodes' votes (which itself is a default of 1). This is the other half of the "trick": with expected_votes forced to 1, a single node's vote is always enough to provide quorum.

In short; this disables quorum.

an-a05n01
<?xml version="1.0"?>
<cluster name="an-anvil-05" config_version="2">
	<cman expected_votes="1" two_node="1" />
</cluster>

Take note of the self-closing <... /> tag. This is an XML syntax that tells the parser not to look for any child tags or a separate closing tag.

Defining Cluster Nodes

This example is a little artificial; please don't load it into your cluster yet, as we will need to add a few child tags. One thing at a time, though.

This introduces two tags, the latter a child tag of the former:

  • clusternodes
    • clusternode

The first is the parent clusternodes tag, which takes no attributes of its own. Its sole purpose is to contain the clusternode child tags, of which there will be one per node.

an-a05n01
<?xml version="1.0"?>
<cluster name="an-anvil-05" config_version="3">
	<cman expected_votes="1" two_node="1" />
	<clusternodes>
		<clusternode name="an-a05n01.alteeve.ca" nodeid="1" />
		<clusternode name="an-a05n02.alteeve.ca" nodeid="2" />
	</clusternodes>
</cluster>

The clusternode tag defines each cluster node. There are many attributes available, but we will look at just the two required ones.

The first is the name="" attribute. The value should match the fully qualified domain name, which you can check by running uname -n on each node. This isn't strictly required, mind you, but for simplicity's sake, this is the name we will use.

The cluster decides which network to use for cluster communication by resolving the name="..." value. It will take the returned IP address and try to match it to one of the IPs on the system. Once it finds a match, that becomes the network the cluster will use. In our case, an-a05n01.alteeve.ca resolves to 10.20.50.1, which is used by bcn_bond1.

We can use gethostip (from the syslinux package) with a little bash magic to verify which interface is going to be used for cluster communication;

an-a05n01
ifconfig |grep -B 1 $(gethostip -d $(uname -n)) | grep HWaddr | awk '{ print $1 }'
bcn_bond1

Exactly what we wanted!

Please see the clusternode's name attribute document for details on how name-to-interface mapping is resolved.

The second attribute is nodeid="". This must be a unique integer amongst the <clusternode ...> elements in the cluster. It is what the cluster itself uses to identify the node.

Defining Fence Devices

Fencing devices are used to forcibly eject a node from a cluster if it stops responding. Said another way, fence devices put a node into a known state.

There are many, many devices out there that can be used for fencing. We're going to be using two specific devices:

  • IPMI to press and hold the node's power button until the server powers down.
  • Switched PDUs to cut the power feeding the node, if the IPMI device fails or can not be contacted.

In the end, any device that can power off or isolate a lost node will do fine for fencing. The setup we will be using here uses very common components and it provides full redundancy, ensuring the ability to fence regardless of what might fail.

In this tutorial, our nodes support IPMI, which we will use as the primary fence device. We also have an APC brand switched PDU which will act as a backup fence device.

Note: Not all brands of switched PDUs are supported as fence devices. Before you purchase a fence device, confirm that it is supported.

All fence devices are contained within the parent fencedevices tag, which has no attributes of its own. Within this parent tag are one or more fencedevice child tags.

an-a05n01
<?xml version="1.0"?>
<cluster name="an-anvil-05" config_version="4">
	<cman expected_votes="1" two_node="1" />
	<clusternodes>
		<clusternode name="an-a05n01.alteeve.ca" nodeid="1" />
		<clusternode name="an-a05n02.alteeve.ca" nodeid="2" />
	</clusternodes>
	<fencedevices>
		<fencedevice name="ipmi_n01" agent="fence_ipmilan" ipaddr="an-a05n01.ipmi" login="admin" passwd="secret" />
		<fencedevice name="ipmi_n02" agent="fence_ipmilan" ipaddr="an-a05n02.ipmi" login="admin" passwd="secret" />
		<fencedevice agent="fence_apc_snmp" ipaddr="an-pdu01.alteeve.ca" name="pdu1" />
		<fencedevice agent="fence_apc_snmp" ipaddr="an-pdu02.alteeve.ca" name="pdu2" />
	</fencedevices>
</cluster>

In our cluster, each fence device used will have its own fencedevice tag. If you are using IPMI, this means you will have a fencedevice entry for each node, as each physical IPMI BMC is a unique fence device.

Our nodes have two power supplies each. Each power supply is plugged into a different switched PDU, which in turn is plugged into a dedicated UPS. So we have two physical PDUs, requiring two more <fencedevice... /> entries.

All fencedevice tags share two basic attributes; name="" and agent="":

  • The name attribute must be unique among all the fence devices in your cluster. As we will see in the next step, this name will be used within the <clusternode...> tag.
  • The agent attribute tells the cluster which fence agent to use when the fenced daemon needs to communicate with the physical fence device. A fence agent is simply a script that acts as a go-between for the fenced daemon and the fence hardware. This agent takes the arguments from the daemon, like what port to act on and what action to take, and performs the requested action against the target node. The agent is responsible for ensuring that the execution succeeded and returning an appropriate success or failure exit code.

For those curious, the full details are described in the FenceAgentAPI. If you have two or more of the same fence device, like IPMI, then you will use the same fence agent value a corresponding number of times.

Beyond these two attributes, each fence agent will have its own set of attributes, the full scope of which is outside this tutorial, though we will see examples for IPMI and a switched PDU. All fence agents have a corresponding man page that shows what attributes they accept and how they are used. The two fence agents we will see here have their attributes defined in the following man pages:

  • man fence_ipmilan - IPMI fence agent.
  • man fence_apc_snmp - APC-brand switched PDU using SNMP.

The example above is what this tutorial will use.
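Before relying on these devices, it's worth confirming by hand that each node can reach its peer's fence device. Here is a hedged example reusing the host name and credentials from the configuration above; the status action only queries the power state and does not fence anything, and man fence_ipmilan will confirm the switches if your version differs:

fence_ipmilan -a an-a05n02.ipmi -l admin -p secret -o status

If this returns the peer's power status, the IPMI side of the fence configuration is reachable. We will do proper, full fence testing later in the "Testing Fencing" section.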

Using the Fence Devices

Now we have nodes and fence devices defined, we will go back and tie them together. This is done by:

  • Defining a fence tag containing all fence methods and devices.
    • Defining one or more method tag(s) containing the device call(s) needed for each fence attempt.
      • Defining one or more device tag(s) containing attributes describing how to call the fence device to kill this node.

Here is how we implement IPMI as the primary fence device with the dual APC switched PDUs as the backup method.

an-a05n01
<?xml version="1.0"?>
<cluster name="an-anvil-05" config_version="5">
	<cman expected_votes="1" two_node="1" />
	<clusternodes>
		<clusternode name="an-a05n01.alteeve.ca" nodeid="1">
			<fence>
				<method name="ipmi">
					<device name="ipmi_n01" action="reboot" delay="15" />
				</method>
				<method name="pdu">
					<device name="pdu1" port="1" action="reboot" />
					<device name="pdu2" port="1" action="reboot" />
				</method>
			</fence>
		</clusternode>
		<clusternode name="an-a05n02.alteeve.ca" nodeid="2">
			<fence>
				<method name="ipmi">
					<device name="ipmi_n02" action="reboot" />
				</method>
				<method name="pdu">
					<device name="pdu1" port="2" action="reboot" />
					<device name="pdu2" port="2" action="reboot" />
				</method>
			</fence>
		</clusternode>
	</clusternodes>
	<fencedevices>
		<fencedevice name="ipmi_n01" agent="fence_ipmilan" ipaddr="an-a05n01.ipmi" login="admin" passwd="secret" />
		<fencedevice name="ipmi_n02" agent="fence_ipmilan" ipaddr="an-a05n02.ipmi" login="admin" passwd="secret" />
		<fencedevice agent="fence_apc_snmp" ipaddr="an-pdu01.alteeve.ca" name="pdu1" />
		<fencedevice agent="fence_apc_snmp" ipaddr="an-pdu02.alteeve.ca" name="pdu2" />
	</fencedevices>
</cluster>

First, notice that the fence tag has no attributes. It's merely a parent for the method(s) child elements.

Warning: The next few paragraphs are very important! Please read them carefully!

The second thing you will notice is that one method, an-a05n01's ipmi method, has a device with an extra argument. The delay="15" is needed because this is a 2-node cluster, so quorum is not available. What this means is that, if the network breaks and both nodes are alive, both nodes will try to fence the other at nearly the same time. With IPMI devices being unique per node, this can conceivably mean that both nodes initiate a power down before either dies. This condition is called a "dual-fence" and leaves your cluster entirely powered down.

There are two ways of dealing with this. The first is to make sure that acpid is turned off. When the power button is pressed while acpid is running, the system will begin a graceful shutdown. The IPMI BMC will continue to hold down the power button and, after four seconds, the node should power off. However, those are four seconds during which the node's fence daemon can still initiate a fence against its peer. By disabling the acpid daemon, the system powers off nearly instantly when the power button is pressed, drastically reducing the window between the button being pressed and the node actually shutting off.

The second way to deal with this is to give one of the nodes a head start. That is what the delay="15" does. When an-a05n01 goes to fence an-a05n02, it will not see a delay and will initiate the fence action immediately. Meanwhile, an-a05n02 will gather up the information on fencing an-a05n01, see the 15 second delay and wait. After 15 seconds, it will proceed with the fence action as it normally would.

The idea here is that an-a05n01 will have a 15 second head start in fencing its peer. These configuration changes should help ensure that one node always survives a fence call.

Back to the main fence config!

There are two method elements per node, one for each fence device, named ipmi and pdu. These names are merely descriptive and can be whatever you feel is most appropriate.

Within each method element is one or more device tags. For a given method to succeed, all defined device elements must themselves succeed. This is very useful for grouping calls to separate PDUs when dealing with nodes having redundant power supplies, as we have here.

The actual fence device configuration is the final piece of the puzzle. It is here that you specify per-node configuration options and link these attributes to a given fencedevice. Here, we see the link to the fencedevice via the name, ipmi_n01 in this example.

Note that the PDU definitions need a port="" attribute where the IPMI fence devices do not. These are the sorts of differences you will find, varying depending on how the fence device agent works. IPMI devices only work on their host, so when you ask an IPMI device to "reboot", it's obvious what the target is. With devices like PDUs, SAN switches and other multi-port devices, this is not the case. Our PDUs have eight ports each, so we need to tell the fence agent which ports we want acted on. In our case, an-a05n01's power supplies are plugged into port #1 on both PDUs. For an-a05n02, they're plugged into each PDU's port #2.

When a fence call is needed, the fence devices will be called in the order they are found here. If both devices fail, the cluster will go back to the start and try again, looping indefinitely until one device succeeds.

Note: It's important to understand why we use IPMI as the primary fence device. The FenceAgentAPI specification suggests, but does not require, that a fence device confirm that the node is off. IPMI can do this; the switched PDU can not. Thus, IPMI won't return a success unless the node is truly off. The PDU, however, will return a success once the power is cut to the requested port. The risk is that a misconfigured node with redundant PSUs may in fact still be running if one of its cords was moved to a different port and the configuration wasn't updated, leading to disastrous consequences.

Let's step through an example fence call to help show how the per-cluster and fence device attributes are combined during a fence call:

  • The cluster manager decides that a node needs to be fenced. Let's say that the victim is an-a05n02.
  • The fence section under an-a05n02 is consulted. Within it there are two method entries, named ipmi and pdu. Beyond its name, the IPMI method's device has one attribute while the PDU method's devices have two attributes;
    • port; only found in the PDU method, this tells the cluster that an-a05n02 is connected to switched PDU's outlet number 2.
    • action; Found on both devices, this tells the cluster that the fence action to take is reboot. How this action is actually interpreted depends on the fence device in use, though the name certainly implies that the node will be forced off and then restarted.
  • The cluster searches in fencedevices for a fencedevice matching the name ipmi_n02. This fence device has four attributes;
    • agent; This tells the cluster to call the fence_ipmilan fence agent script, as we discussed earlier.
    • ipaddr; This tells the fence agent where on the network to find this particular IPMI BMC. This is how multiple fence devices of the same type can be used in the cluster.
    • login; This is the login user name to use when authenticating against the fence device.
    • passwd; This is the password to supply along with the login name when authenticating against the fence device.
  • Should the IPMI fence call fail for some reason, the cluster will move on to the second pdu method, repeating the steps above but using the PDU values.

When the cluster calls the fence agent, it does so by initially calling the fence agent script with no arguments.

/usr/sbin/fence_ipmilan

Then it will pass to that agent the following arguments:

ipaddr=an-a05n02.ipmi
login=admin
passwd=secret
action=reboot

As you can see then, the first three arguments are from the fencedevice attributes and the last one is from the device attributes under an-a05n02's clusternode's fence tag.
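These key=value pairs are handed to the agent on its standard input, one per line, which is why the script is launched without command-line arguments. That also means you can simulate a call by hand. A sketch, using action=status so that nothing actually gets fenced (older agent versions may expect option= instead of action=):

echo -e "ipaddr=an-a05n02.ipmi\nlogin=admin\npasswd=secret\naction=status" | fence_ipmilan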

If the IPMI method fails, then the PDU will be called in a very similar way, but with an extra argument from the device attributes.

/usr/sbin/fence_apc_snmp

Then it will pass to that agent the following arguments:

ipaddr=an-pdu02.alteeve.ca
port=2
action=reboot

Should this fail, the cluster will go back and try the IPMI interface again. It will loop through the fence device methods forever until one of the methods succeeds. Below are snippets from other clusters using different fence device configurations which might help you build your cluster.

Giving Nodes More Time to Start and Avoiding "Fence Loops"

Note: This section also explains why we don't allow cman to start on boot. If we did, we'd risk a "fence loop", where a fenced node boots, tries to contact its peer, times out and fences it. The peer then boots, starts cman, times out waiting and fences the first node again. Not good.

Clusters with three or more nodes have to gain quorum before they can fence other nodes. As we discussed earlier though, this is not the case when using the two_node="1" attribute in the cman element. What this means in practice is that if you start the cluster on one node and then wait too long to start the cluster on the second node, the first will fence the second.

The logic behind this is; When the cluster starts, it will try to talk to its fellow node and then fail. With the special two_node="1" attribute set, the cluster knows that it is allowed to start clustered services, but it has no way to say for sure what state the other node is in. It could well be online and hosting services for all it knows. So it has to proceed on the assumption that the other node is alive and using shared resources. Given that, and given that it can not talk to the other node, its only safe option is to fence the other node. Only then can it be confident that it is safe to start providing clustered services.

an-a05n01
<?xml version="1.0"?>
<cluster name="an-anvil-05" config_version="6">
	<cman expected_votes="1" two_node="1" />
	<clusternodes>
		<clusternode name="an-a05n01.alteeve.ca" nodeid="1">
			<fence>
				<method name="ipmi">
					<device name="ipmi_n01" action="reboot" delay="15" />
				</method>
				<method name="pdu">
					<device name="pdu1" port="1" action="reboot" />
					<device name="pdu2" port="1" action="reboot" />
				</method>
			</fence>
		</clusternode>
		<clusternode name="an-a05n02.alteeve.ca" nodeid="2">
			<fence>
				<method name="ipmi">
					<device name="ipmi_n02" action="reboot" />
				</method>
				<method name="pdu">
					<device name="pdu1" port="2" action="reboot" />
					<device name="pdu2" port="2" action="reboot" />
				</method>
			</fence>
		</clusternode>
	</clusternodes>
	<fencedevices>
		<fencedevice name="ipmi_n01" agent="fence_ipmilan" ipaddr="an-a05n01.ipmi" login="admin" passwd="secret" />
		<fencedevice name="ipmi_n02" agent="fence_ipmilan" ipaddr="an-a05n02.ipmi" login="admin" passwd="secret" />
		<fencedevice agent="fence_apc_snmp" ipaddr="an-pdu01.alteeve.ca" name="pdu1" />
		<fencedevice agent="fence_apc_snmp" ipaddr="an-pdu02.alteeve.ca" name="pdu2" />
	</fencedevices>
	<fence_daemon post_join_delay="30" />
</cluster>

The new tag is fence_daemon, seen near the bottom of the file above. The change is made using the post_join_delay="30" attribute. By default, the cluster will declare the other node dead after just 6 seconds; the reason for the short default is that the larger this value, the slower the start-up of the cluster services will be. During testing and development though, I found the default to be far too short, and it frequently led to unnecessary fencing. Once your cluster is set up and working, it's not a bad idea to reduce this value to the lowest setting you are comfortable with.

Configuring Totem

There are many attributes for the totem element. For now though, we're only going to set two of them. We know that cluster communication will be travelling over our private, secured BCN network, so for the sake of simplicity, we're going to disable encryption. We already provide network redundancy using the bonding drivers, so we're also going to disable totem's redundant ring protocol.

an-a05n01
<?xml version="1.0"?>
<cluster name="an-anvil-05" config_version="7">
	<cman expected_votes="1" two_node="1" />
	<clusternodes>
		<clusternode name="an-a05n01.alteeve.ca" nodeid="1">
			<fence>
				<method name="ipmi">
					<device name="ipmi_n01" action="reboot" delay="15" />
				</method>
				<method name="pdu">
					<device name="pdu1" port="1" action="reboot" />
					<device name="pdu2" port="1" action="reboot" />
				</method>
			</fence>
		</clusternode>
		<clusternode name="an-a05n02.alteeve.ca" nodeid="2">
			<fence>
				<method name="ipmi">
					<device name="ipmi_n02" action="reboot" />
				</method>
				<method name="pdu">
					<device name="pdu1" port="2" action="reboot" />
					<device name="pdu2" port="2" action="reboot" />
				</method>
			</fence>
		</clusternode>
	</clusternodes>
	<fencedevices>
		<fencedevice name="ipmi_n01" agent="fence_ipmilan" ipaddr="an-a05n01.ipmi" login="admin" passwd="secret" />
		<fencedevice name="ipmi_n02" agent="fence_ipmilan" ipaddr="an-a05n02.ipmi" login="admin" passwd="secret" />
		<fencedevice agent="fence_apc_snmp" ipaddr="an-pdu01.alteeve.ca" name="pdu1" />
		<fencedevice agent="fence_apc_snmp" ipaddr="an-pdu02.alteeve.ca" name="pdu2" />
	</fencedevices>
	<fence_daemon post_join_delay="30" />
	<totem rrp_mode="none" secauth="off"/>
</cluster>

Corosync uses a concept called "token rings" for cluster communication. This is not to be confused with the old token ring network protocol, but the basic concept is the same. A token is passed from node to node, around and around the ring. A node can't send new messages or acknowledge old messages except when it has the token. By default, corosync uses a single "ring". This means that, without network-level fault-tolerance, this ring becomes a single point of failure.

We've got bonded network connections backing our cluster communications, so we inherently have fault-tolerance built in to our network.

For some though, bonded interfaces are not feasible, so starting in RHEL 6.3, "Redundant Ring Protocol" was made available as a supported option. This allows you to set up a second network to use as a backup in case the primary ring fails. We don't need this, so we set rrp_mode="none". If you want to use it, you can now, but it's outside the scope of this tutorial.

If you wish to explore it further, please take a look at the clusternode child element called <altname...>. When altname is used, the rrp_mode attribute will need to be changed to either active or passive (the details of which are outside the scope of this tutorial).

The second option we're looking at here is the secauth="off" attribute. This controls whether the cluster communications are encrypted or not. We can safely disable this because we're working on a known-private network, which yields two benefits: it's simpler to set up and it's a lot faster. If you must encrypt the cluster communications, then you can do so here. The details of that are also outside the scope of this tutorial though.

Validating and Pushing the /etc/cluster/cluster.conf File

One of the most noticeable changes in RHCS cluster stable 3 is that we no longer have to make a long, cryptic xmllint call to validate our cluster configuration. Now we can simply call ccs_config_validate.

an-a05n01
ccs_config_validate
Configuration validates

If there was a problem, you need to go back and fix it. DO NOT proceed until your configuration validates. Once it does, we're ready to move on!

With it validated, we need to push it to the other node. As the cluster is not running yet, we will push it out using rsync.

an-a05n01
rsync -av /etc/cluster/cluster.conf root@an-a05n02:/etc/cluster/
sending incremental file list
cluster.conf

sent 1393 bytes  received 43 bytes  2872.00 bytes/sec
total size is 1313  speedup is 0.91

This is the first and only time that we'll need to push the configuration file over manually.

Setting up ricci

Once the cluster is running, we can take advantage of the ricci and modclusterd daemons to push all future updates out automatically. This is why we enabled these two daemons to start on boot earlier on.

This requires setting a password for each node's ricci user first. Setting the password is exactly the same as setting the password on any other system user.

On both nodes, run:

an-a05n01 an-a05n02
passwd ricci
Changing password for user ricci.
New password: 
Retype new password: 
passwd: all authentication tokens updated successfully.
passwd ricci
Changing password for user ricci.
New password: 
Retype new password: 
passwd: all authentication tokens updated successfully.

Later, when we make the next change to the cluster.conf file, we'll push the changes out using the cman_tool program. The first time this is used on each node, you will need to enter the local and the peer's ricci password. Once entered though, we'll not need to enter the password again.

Note: The dashboard we will install later expects the ricci password to be the same on both nodes. If you plan to use the dashboard, be sure to set the same password and then make note of it for later!
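To preview the update cycle we'll use from here on: edit /etc/cluster/cluster.conf, increment config_version, validate the file, then ask the running cluster to push out and activate the new version. Roughly, and do not run this yet as the cluster isn't started (we'll walk through it properly later):

ccs_config_validate
cman_tool version -r

The -r switch tells cman to distribute the updated configuration via ricci and modclusterd and then activate it. You can also pass the new version number explicitly, as in cman_tool version -r 8.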

Starting the Cluster for the First Time

It's a good idea to open a second terminal on each node and tail the /var/log/messages syslog file. All cluster messages will be recorded here and it will help to debug problems if you can watch the logs. To do this, run the following in the new terminal windows;

an-a05n01 an-a05n02
clear; tail -f -n 0 /var/log/messages
clear; tail -f -n 0 /var/log/messages

This will clear the screen and start watching for new lines to be written to syslog. When you are done watching syslog, press the <ctrl> + c key combination.

How you lay out your terminal windows is, obviously, up to your own preferences. Below is a configuration I have found very useful.

Terminal window layout for watching 2 nodes. Left windows are used for entering commands and the right windows are used for tailing syslog.

With the terminals set up, let's start the cluster!

Warning: If you don't start cman on both nodes within 30 seconds, the slower node will be fenced.

On both nodes, run:

an-a05n01 an-a05n02
/etc/init.d/cman start
Starting cluster: 
   Checking if cluster has been disabled at boot...        [  OK  ]
   Checking Network Manager...                             [  OK  ]
   Global setup...                                         [  OK  ]
   Loading kernel modules...                               [  OK  ]
   Mounting configfs...                                    [  OK  ]
   Starting cman...                                        [  OK  ]
   Waiting for quorum...                                   [  OK  ]
   Starting fenced...                                      [  OK  ]
   Starting dlm_controld...                                [  OK  ]
   Tuning DLM kernel config...                             [  OK  ]
   Starting gfs_controld...                                [  OK  ]
   Unfencing self...                                       [  OK  ]
   Joining fence domain...                                 [  OK  ]
/etc/init.d/cman start
Starting cluster: 
   Checking if cluster has been disabled at boot...        [  OK  ]
   Checking Network Manager...                             [  OK  ]
   Global setup...                                         [  OK  ]
   Loading kernel modules...                               [  OK  ]
   Mounting configfs...                                    [  OK  ]
   Starting cman...                                        [  OK  ]
   Waiting for quorum...                                   [  OK  ]
   Starting fenced...                                      [  OK  ]
   Starting dlm_controld...                                [  OK  ]
   Tuning DLM kernel config...                             [  OK  ]
   Starting gfs_controld...                                [  OK  ]
   Unfencing self...                                       [  OK  ]
   Joining fence domain...                                 [  OK  ]

Here is what you should see in syslog (this was taken from an-a05n01):

an-a05n01
Oct 30 10:46:07 an-a05n01 kernel: DLM (built Sep 14 2013 05:33:35) installed
Oct 30 10:46:07 an-a05n01 corosync[2845]:   [MAIN  ] Corosync Cluster Engine ('1.4.1'): started and ready to provide service.
Oct 30 10:46:07 an-a05n01 corosync[2845]:   [MAIN  ] Corosync built-in features: nss dbus rdma snmp
Oct 30 10:46:07 an-a05n01 corosync[2845]:   [MAIN  ] Successfully read config from /etc/cluster/cluster.conf
Oct 30 10:46:07 an-a05n01 corosync[2845]:   [MAIN  ] Successfully parsed cman config
Oct 30 10:46:07 an-a05n01 corosync[2845]:   [TOTEM ] Initializing transport (UDP/IP Multicast).
Oct 30 10:46:07 an-a05n01 corosync[2845]:   [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
Oct 30 10:46:07 an-a05n01 corosync[2845]:   [TOTEM ] The network interface [10.20.50.1] is now up.
Oct 30 10:46:08 an-a05n01 corosync[2845]:   [QUORUM] Using quorum provider quorum_cman
Oct 30 10:46:08 an-a05n01 corosync[2845]:   [SERV  ] Service engine loaded: corosync cluster quorum service v0.1
Oct 30 10:46:08 an-a05n01 corosync[2845]:   [CMAN  ] CMAN 3.0.12.1 (built Aug 29 2013 07:27:01) started
Oct 30 10:46:08 an-a05n01 corosync[2845]:   [SERV  ] Service engine loaded: corosync CMAN membership service 2.90
Oct 30 10:46:08 an-a05n01 corosync[2845]:   [SERV  ] Service engine loaded: openais checkpoint service B.01.01
Oct 30 10:46:08 an-a05n01 corosync[2845]:   [SERV  ] Service engine loaded: corosync extended virtual synchrony service
Oct 30 10:46:08 an-a05n01 corosync[2845]:   [SERV  ] Service engine loaded: corosync configuration service
Oct 30 10:46:08 an-a05n01 corosync[2845]:   [SERV  ] Service engine loaded: corosync cluster closed process group service v1.01
Oct 30 10:46:08 an-a05n01 corosync[2845]:   [SERV  ] Service engine loaded: corosync cluster config database access v1.01
Oct 30 10:46:08 an-a05n01 corosync[2845]:   [SERV  ] Service engine loaded: corosync profile loading service
Oct 30 10:46:08 an-a05n01 corosync[2845]:   [QUORUM] Using quorum provider quorum_cman
Oct 30 10:46:08 an-a05n01 corosync[2845]:   [SERV  ] Service engine loaded: corosync cluster quorum service v0.1
Oct 30 10:46:08 an-a05n01 corosync[2845]:   [MAIN  ] Compatibility mode set to whitetank.  Using V1 and V2 of the synchronization engine.
Oct 30 10:46:08 an-a05n01 corosync[2845]:   [TOTEM ] A processor joined or left the membership and a new membership was formed.
Oct 30 10:46:08 an-a05n01 corosync[2845]:   [CMAN  ] quorum regained, resuming activity
Oct 30 10:46:08 an-a05n01 corosync[2845]:   [QUORUM] This node is within the primary component and will provide service.
Oct 30 10:46:08 an-a05n01 corosync[2845]:   [QUORUM] Members[1]: 1
Oct 30 10:46:08 an-a05n01 corosync[2845]:   [QUORUM] Members[1]: 1
Oct 30 10:46:08 an-a05n01 corosync[2845]:   [CPG   ] chosen downlist: sender r(0) ip(10.20.50.1) ; members(old:0 left:0)
Oct 30 10:46:08 an-a05n01 corosync[2845]:   [MAIN  ] Completed service synchronization, ready to provide service.
Oct 30 10:46:08 an-a05n01 corosync[2845]:   [TOTEM ] A processor joined or left the membership and a new membership was formed.
Oct 30 10:46:08 an-a05n01 corosync[2845]:   [QUORUM] Members[2]: 1 2
Oct 30 10:46:08 an-a05n01 corosync[2845]:   [QUORUM] Members[2]: 1 2
Oct 30 10:46:08 an-a05n01 corosync[2845]:   [CPG   ] chosen downlist: sender r(0) ip(10.20.50.1) ; members(old:1 left:0)
Oct 30 10:46:08 an-a05n01 corosync[2845]:   [MAIN  ] Completed service synchronization, ready to provide service.
Oct 30 10:46:12 an-a05n01 fenced[2902]: fenced 3.0.12.1 started
Oct 30 10:46:12 an-a05n01 dlm_controld[2927]: dlm_controld 3.0.12.1 started
Oct 30 10:46:13 an-a05n01 gfs_controld[2977]: gfs_controld 3.0.12.1 started

Now to confirm that the cluster is operating properly, we can use cman_tool.

an-a05n01
cman_tool status
Version: 6.2.0
Config Version: 7
Cluster Name: an-anvil-05
Cluster Id: 42881
Cluster Member: Yes
Cluster Generation: 20
Membership state: Cluster-Member
Nodes: 2
Expected votes: 1
Total votes: 2
Node votes: 1
Quorum: 1  
Active subsystems: 7
Flags: 2node 
Ports Bound: 0  
Node name: an-a05n01.alteeve.ca
Node ID: 1
Multicast addresses: 239.192.167.41 
Node addresses: 10.20.50.1

We can see that both nodes are talking because of the Nodes: 2 entry.

Note: If you have a managed switch that needs persistent multicast groups set, log into your switches now. We can see above that this cluster is using the multicast group 239.192.167.41, so find it in your switch config and ensure it's persistent.
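Another quick check is cman_tool nodes, which lists each node with its node ID, the time it joined and its membership status ('M' for member, 'X' for dead):

cman_tool nodes

Both nodes should show a status of 'M' at this point.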

If you ever want to see the nitty-gritty configuration, you can run corosync-objctl.

an-a05n01
corosync-objctl
cluster.name=an-anvil-05
cluster.config_version=7
cluster.cman.expected_votes=1
cluster.cman.two_node=1
cluster.cman.nodename=an-a05n01.alteeve.ca
cluster.cman.cluster_id=42881
cluster.clusternodes.clusternode.name=an-a05n01.alteeve.ca
cluster.clusternodes.clusternode.nodeid=1
cluster.clusternodes.clusternode.fence.method.name=ipmi
cluster.clusternodes.clusternode.fence.method.device.name=ipmi_n01
cluster.clusternodes.clusternode.fence.method.device.action=reboot
cluster.clusternodes.clusternode.fence.method.device.delay=15
cluster.clusternodes.clusternode.fence.method.name=pdu
cluster.clusternodes.clusternode.fence.method.device.name=pdu1
cluster.clusternodes.clusternode.fence.method.device.port=1
cluster.clusternodes.clusternode.fence.method.device.action=reboot
cluster.clusternodes.clusternode.fence.method.device.name=pdu2
cluster.clusternodes.clusternode.fence.method.device.port=1
cluster.clusternodes.clusternode.fence.method.device.action=reboot
cluster.clusternodes.clusternode.name=an-a05n02.alteeve.ca
cluster.clusternodes.clusternode.nodeid=2
cluster.clusternodes.clusternode.fence.method.name=ipmi
cluster.clusternodes.clusternode.fence.method.device.name=ipmi_n02
cluster.clusternodes.clusternode.fence.method.device.action=reboot
cluster.clusternodes.clusternode.fence.method.name=pdu
cluster.clusternodes.clusternode.fence.method.device.name=pdu1
cluster.clusternodes.clusternode.fence.method.device.port=2
cluster.clusternodes.clusternode.fence.method.device.action=reboot
cluster.clusternodes.clusternode.fence.method.device.name=pdu2
cluster.clusternodes.clusternode.fence.method.device.port=2
cluster.clusternodes.clusternode.fence.method.device.action=reboot
cluster.fencedevices.fencedevice.name=ipmi_n01
cluster.fencedevices.fencedevice.agent=fence_ipmilan
cluster.fencedevices.fencedevice.ipaddr=an-a05n01.ipmi
cluster.fencedevices.fencedevice.login=admin
cluster.fencedevices.fencedevice.passwd=secret
cluster.fencedevices.fencedevice.name=ipmi_n02
cluster.fencedevices.fencedevice.agent=fence_ipmilan
cluster.fencedevices.fencedevice.ipaddr=an-a05n02.ipmi
cluster.fencedevices.fencedevice.login=admin
cluster.fencedevices.fencedevice.passwd=secret
cluster.fencedevices.fencedevice.agent=fence_apc_snmp
cluster.fencedevices.fencedevice.ipaddr=an-pdu01.alteeve.ca
cluster.fencedevices.fencedevice.name=pdu1
cluster.fencedevices.fencedevice.agent=fence_apc_snmp
cluster.fencedevices.fencedevice.ipaddr=an-pdu02.alteeve.ca
cluster.fencedevices.fencedevice.name=pdu2
cluster.fence_daemon.post_join_delay=30
cluster.totem.rrp_mode=none
cluster.totem.secauth=off
totem.rrp_mode=none
totem.secauth=off
totem.transport=udp
totem.version=2
totem.nodeid=1
totem.vsftype=none
totem.token=10000
totem.join=60
totem.fail_recv_const=2500
totem.consensus=2000
totem.key=an-anvil-05
totem.interface.ringnumber=0
totem.interface.bindnetaddr=10.20.50.1
totem.interface.mcastaddr=239.192.167.41
totem.interface.mcastport=5405
libccs.next_handle=7
libccs.connection.ccs_handle=3
libccs.connection.config_version=7
libccs.connection.fullxpath=0
libccs.connection.ccs_handle=4
libccs.connection.config_version=7
libccs.connection.fullxpath=0
libccs.connection.ccs_handle=5
libccs.connection.config_version=7
libccs.connection.fullxpath=0
logging.timestamp=on
logging.to_logfile=yes
logging.logfile=/var/log/cluster/corosync.log
logging.logfile_priority=info
logging.to_syslog=yes
logging.syslog_facility=local4
logging.syslog_priority=info
aisexec.user=ais
aisexec.group=ais
service.name=corosync_quorum
service.ver=0
service.name=corosync_cman
service.ver=0
quorum.provider=quorum_cman
service.name=openais_ckpt
service.ver=0
runtime.services.quorum.service_id=12
runtime.services.cman.service_id=9
runtime.services.ckpt.service_id=3
runtime.services.ckpt.0.tx=0
runtime.services.ckpt.0.rx=0
runtime.services.ckpt.1.tx=0
runtime.services.ckpt.1.rx=0
runtime.services.ckpt.2.tx=0
runtime.services.ckpt.2.rx=0
runtime.services.ckpt.3.tx=0
runtime.services.ckpt.3.rx=0
runtime.services.ckpt.4.tx=0
runtime.services.ckpt.4.rx=0
runtime.services.ckpt.5.tx=0
runtime.services.ckpt.5.rx=0
runtime.services.ckpt.6.tx=0
runtime.services.ckpt.6.rx=0
runtime.services.ckpt.7.tx=0
runtime.services.ckpt.7.rx=0
runtime.services.ckpt.8.tx=0
runtime.services.ckpt.8.rx=0
runtime.services.ckpt.9.tx=0
runtime.services.ckpt.9.rx=0
runtime.services.ckpt.10.tx=0
runtime.services.ckpt.10.rx=0
runtime.services.ckpt.11.tx=2
runtime.services.ckpt.11.rx=3
runtime.services.ckpt.12.tx=0
runtime.services.ckpt.12.rx=0
runtime.services.ckpt.13.tx=0
runtime.services.ckpt.13.rx=0
runtime.services.evs.service_id=0
runtime.services.evs.0.tx=0
runtime.services.evs.0.rx=0
runtime.services.cfg.service_id=7
runtime.services.cfg.0.tx=0
runtime.services.cfg.0.rx=0
runtime.services.cfg.1.tx=0
runtime.services.cfg.1.rx=0
runtime.services.cfg.2.tx=0
runtime.services.cfg.2.rx=0
runtime.services.cfg.3.tx=0
runtime.services.cfg.3.rx=0
runtime.services.cpg.service_id=8
runtime.services.cpg.0.tx=4
runtime.services.cpg.0.rx=8
runtime.services.cpg.1.tx=0
runtime.services.cpg.1.rx=0
runtime.services.cpg.2.tx=0
runtime.services.cpg.2.rx=0
runtime.services.cpg.3.tx=16
runtime.services.cpg.3.rx=23
runtime.services.cpg.4.tx=0
runtime.services.cpg.4.rx=0
runtime.services.cpg.5.tx=2
runtime.services.cpg.5.rx=3
runtime.services.confdb.service_id=11
runtime.services.pload.service_id=13
runtime.services.pload.0.tx=0
runtime.services.pload.0.rx=0
runtime.services.pload.1.tx=0
runtime.services.pload.1.rx=0
runtime.services.quorum.service_id=12
runtime.connections.active=6
runtime.connections.closed=111
runtime.connections.fenced:CPG:2902:21.service_id=8
runtime.connections.fenced:CPG:2902:21.client_pid=2902
runtime.connections.fenced:CPG:2902:21.responses=5
runtime.connections.fenced:CPG:2902:21.dispatched=9
runtime.connections.fenced:CPG:2902:21.requests=5
runtime.connections.fenced:CPG:2902:21.sem_retry_count=0
runtime.connections.fenced:CPG:2902:21.send_retry_count=0
runtime.connections.fenced:CPG:2902:21.recv_retry_count=0
runtime.connections.fenced:CPG:2902:21.flow_control=0
runtime.connections.fenced:CPG:2902:21.flow_control_count=0
runtime.connections.fenced:CPG:2902:21.queue_size=0
runtime.connections.fenced:CPG:2902:21.invalid_request=0
runtime.connections.fenced:CPG:2902:21.overload=0
runtime.connections.dlm_controld:CPG:2927:24.service_id=8
runtime.connections.dlm_controld:CPG:2927:24.client_pid=2927
runtime.connections.dlm_controld:CPG:2927:24.responses=5
runtime.connections.dlm_controld:CPG:2927:24.dispatched=8
runtime.connections.dlm_controld:CPG:2927:24.requests=5
runtime.connections.dlm_controld:CPG:2927:24.sem_retry_count=0
runtime.connections.dlm_controld:CPG:2927:24.send_retry_count=0
runtime.connections.dlm_controld:CPG:2927:24.recv_retry_count=0
runtime.connections.dlm_controld:CPG:2927:24.flow_control=0
runtime.connections.dlm_controld:CPG:2927:24.flow_control_count=0
runtime.connections.dlm_controld:CPG:2927:24.queue_size=0
runtime.connections.dlm_controld:CPG:2927:24.invalid_request=0
runtime.connections.dlm_controld:CPG:2927:24.overload=0
runtime.connections.dlm_controld:CKPT:2927:25.service_id=3
runtime.connections.dlm_controld:CKPT:2927:25.client_pid=2927
runtime.connections.dlm_controld:CKPT:2927:25.responses=0
runtime.connections.dlm_controld:CKPT:2927:25.dispatched=0
runtime.connections.dlm_controld:CKPT:2927:25.requests=0
runtime.connections.dlm_controld:CKPT:2927:25.sem_retry_count=0
runtime.connections.dlm_controld:CKPT:2927:25.send_retry_count=0
runtime.connections.dlm_controld:CKPT:2927:25.recv_retry_count=0
runtime.connections.dlm_controld:CKPT:2927:25.flow_control=0
runtime.connections.dlm_controld:CKPT:2927:25.flow_control_count=0
runtime.connections.dlm_controld:CKPT:2927:25.queue_size=0
runtime.connections.dlm_controld:CKPT:2927:25.invalid_request=0
runtime.connections.dlm_controld:CKPT:2927:25.overload=0
runtime.connections.gfs_controld:CPG:2977:28.service_id=8
runtime.connections.gfs_controld:CPG:2977:28.client_pid=2977
runtime.connections.gfs_controld:CPG:2977:28.responses=5
runtime.connections.gfs_controld:CPG:2977:28.dispatched=8
runtime.connections.gfs_controld:CPG:2977:28.requests=5
runtime.connections.gfs_controld:CPG:2977:28.sem_retry_count=0
runtime.connections.gfs_controld:CPG:2977:28.send_retry_count=0
runtime.connections.gfs_controld:CPG:2977:28.recv_retry_count=0
runtime.connections.gfs_controld:CPG:2977:28.flow_control=0
runtime.connections.gfs_controld:CPG:2977:28.flow_control_count=0
runtime.connections.gfs_controld:CPG:2977:28.queue_size=0
runtime.connections.gfs_controld:CPG:2977:28.invalid_request=0
runtime.connections.gfs_controld:CPG:2977:28.overload=0
runtime.connections.fenced:CPG:2902:29.service_id=8
runtime.connections.fenced:CPG:2902:29.client_pid=2902
runtime.connections.fenced:CPG:2902:29.responses=5
runtime.connections.fenced:CPG:2902:29.dispatched=8
runtime.connections.fenced:CPG:2902:29.requests=5
runtime.connections.fenced:CPG:2902:29.sem_retry_count=0
runtime.connections.fenced:CPG:2902:29.send_retry_count=0
runtime.connections.fenced:CPG:2902:29.recv_retry_count=0
runtime.connections.fenced:CPG:2902:29.flow_control=0
runtime.connections.fenced:CPG:2902:29.flow_control_count=0
runtime.connections.fenced:CPG:2902:29.queue_size=0
runtime.connections.fenced:CPG:2902:29.invalid_request=0
runtime.connections.fenced:CPG:2902:29.overload=0
runtime.connections.corosync-objctl:CONFDB:3083:30.service_id=11
runtime.connections.corosync-objctl:CONFDB:3083:30.client_pid=3083
runtime.connections.corosync-objctl:CONFDB:3083:30.responses=463
runtime.connections.corosync-objctl:CONFDB:3083:30.dispatched=0
runtime.connections.corosync-objctl:CONFDB:3083:30.requests=466
runtime.connections.corosync-objctl:CONFDB:3083:30.sem_retry_count=0
runtime.connections.corosync-objctl:CONFDB:3083:30.send_retry_count=0
runtime.connections.corosync-objctl:CONFDB:3083:30.recv_retry_count=0
runtime.connections.corosync-objctl:CONFDB:3083:30.flow_control=0
runtime.connections.corosync-objctl:CONFDB:3083:30.flow_control_count=0
runtime.connections.corosync-objctl:CONFDB:3083:30.queue_size=0
runtime.connections.corosync-objctl:CONFDB:3083:30.invalid_request=0
runtime.connections.corosync-objctl:CONFDB:3083:30.overload=0
runtime.totem.pg.msg_reserved=1
runtime.totem.pg.msg_queue_avail=761
runtime.totem.pg.mrp.srp.orf_token_tx=2
runtime.totem.pg.mrp.srp.orf_token_rx=437
runtime.totem.pg.mrp.srp.memb_merge_detect_tx=47
runtime.totem.pg.mrp.srp.memb_merge_detect_rx=47
runtime.totem.pg.mrp.srp.memb_join_tx=3
runtime.totem.pg.mrp.srp.memb_join_rx=5
runtime.totem.pg.mrp.srp.mcast_tx=46
runtime.totem.pg.mrp.srp.mcast_retx=0
runtime.totem.pg.mrp.srp.mcast_rx=57
runtime.totem.pg.mrp.srp.memb_commit_token_tx=4
runtime.totem.pg.mrp.srp.memb_commit_token_rx=4
runtime.totem.pg.mrp.srp.token_hold_cancel_tx=4
runtime.totem.pg.mrp.srp.token_hold_cancel_rx=8
runtime.totem.pg.mrp.srp.operational_entered=2
runtime.totem.pg.mrp.srp.operational_token_lost=0
runtime.totem.pg.mrp.srp.gather_entered=2
runtime.totem.pg.mrp.srp.gather_token_lost=0
runtime.totem.pg.mrp.srp.commit_entered=2
runtime.totem.pg.mrp.srp.commit_token_lost=0
runtime.totem.pg.mrp.srp.recovery_entered=2
runtime.totem.pg.mrp.srp.recovery_token_lost=0
runtime.totem.pg.mrp.srp.consensus_timeouts=0
runtime.totem.pg.mrp.srp.mtt_rx_token=835
runtime.totem.pg.mrp.srp.avg_token_workload=0
runtime.totem.pg.mrp.srp.avg_backlog_calc=0
runtime.totem.pg.mrp.srp.rx_msg_dropped=0
runtime.totem.pg.mrp.srp.continuous_gather=0
runtime.totem.pg.mrp.srp.continuous_sendmsg_failures=0
runtime.totem.pg.mrp.srp.firewall_enabled_or_nic_failure=0
runtime.totem.pg.mrp.srp.members.1.ip=r(0) ip(10.20.50.1) 
runtime.totem.pg.mrp.srp.members.1.join_count=1
runtime.totem.pg.mrp.srp.members.1.status=joined
runtime.totem.pg.mrp.srp.members.2.ip=r(0) ip(10.20.50.2) 
runtime.totem.pg.mrp.srp.members.2.join_count=1
runtime.totem.pg.mrp.srp.members.2.status=joined
runtime.blackbox.dump_flight_data=no
runtime.blackbox.dump_state=no
cman_private.COROSYNC_DEFAULT_CONFIG_IFACE=xmlconfig:cmanpreconfig

If you want to check what DLM lockspaces exist, you can use dlm_tool ls to list them. Given that we're not yet running any resources or clustered file systems, there won't be any at this time. We'll look at this again later.
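For reference, at this stage the check is uneventful; a minimal sketch (with no clustered LVM or GFS2 using DLM yet, there are simply no lockspaces to list):

an-a05n01 an-a05n02
dlm_tool ls
<no output>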

Testing Fencing

We need to thoroughly test our fence configuration and devices before we proceed. If the cluster calls a fence and that fence call fails, the cluster will hang until a fence finally succeeds. This is by design; as bad as a hung cluster might be, it is better than risking data corruption.

So if we have problems, we need to find them now.

We need to run two tests from each node against the other node for a total of four tests.

  1. The first test will verify that fence_ipmilan is working. To do this, we will hang the victim node by sending c to the kernel's "magic SysRq" key. We do this by running echo c > /proc/sysrq-trigger, which immediately and completely hangs the kernel. This does not affect the IPMI BMC, so if we've configured everything properly, the surviving node should be able to use fence_ipmilan to reboot the crashed node.
  2. Secondly, we will pull the power on the target node. This removes all power from the node, causing the IPMI BMC to also fail. You should see the other node try to fence the target using fence_ipmilan, see it fail and then try again using the second method, the switched PDUs via fence_apc_snmp. If you watch and listen to the PDUs, you should see the power indicator LED light up and hear the mechanical relays close the circuit when the fence completes.

For the second test, you could just physically unplug the cables from the PDUs. We're going to cheat though and use the actual fence_apc_snmp fence handler to manually turn off the target ports. This helps show that the fence agents are really just stand-alone scripts. Used on their own, they do not talk to the cluster in any way. So even though we use them to cut the power, the cluster will not know what state the lost node is in and will still require a fence call.

Test                                                  Victim     Pass?
echo c > /proc/sysrq-trigger                          an-a05n01  Yes / No
fence_apc_snmp -a an-pdu01.alteeve.ca -n 1 -o off
fence_apc_snmp -a an-pdu02.alteeve.ca -n 1 -o off     an-a05n01  Yes / No
echo c > /proc/sysrq-trigger                          an-a05n02  Yes / No
fence_apc_snmp -a an-pdu01.alteeve.ca -n 2 -o off
fence_apc_snmp -a an-pdu02.alteeve.ca -n 2 -o off     an-a05n02  Yes / No
Note: After the target node powers back up after each test, be sure to restart cman!
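As a reminder, bringing the cluster stack back up on the rebooted node is a single call; a minimal sketch (cman_tool status simply confirms the node has rejoined its peer):

an-a05n01 an-a05n02
/etc/init.d/cman start
cman_tool status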

Using Fence_check to Verify our Fencing Config

In RHEL 6.4, a new tool called fence_check was added to the cluster toolbox. When cman is running, we can call it and it will gather up the data from cluster.conf and then call each defined fence device with the action "status". If everything is configured properly, all fence devices should exit with a return code of 0 (device/port is on) or 2 (device/port is off).

If any fence device's agent exits with any other code, something has gone wrong and we need to fix it before proceeding.
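You can also spot-check a single device by hand and inspect the agent's exit code yourself. A minimal sketch using one of the PDUs configured earlier (the 'status' action and these return codes are exactly what fence_check relies on; the output text may vary by agent version):

an-a05n01
fence_apc_snmp -a an-pdu01.alteeve.ca -n 1 -o status; echo "rc=$?"
# expect rc=0 (port on) or rc=2 (port off); anything else needs investigating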

We're going to run this tool from both nodes, so let's start with an-a05n01.

an-a05n01
fence_check
fence_check run at Wed Oct 30 10:56:07 EDT 2013 pid: 3236
Testing an-a05n01.alteeve.ca method 1: success
Testing an-a05n01.alteeve.ca method 2: success
Testing an-a05n02.alteeve.ca method 1: success
Testing an-a05n02.alteeve.ca method 2: success

That is very promising! Now let's run it again on an-a05n02. We want to do this because, for example, if the /etc/hosts file on the second node were bad, a fence might work from the first node but not from this one.

an-a05n02
fence_check
fence_check run at Wed Oct 30 10:57:27 EDT 2013 pid: 28127
Unable to perform fence_check: node is not fence master

Well then, that's not what we expected.

Actually, it is. When a cluster starts, one of the nodes in the cluster will be chosen to be the node which performs actual fence calls. This node (the one with the lowest node ID) is the only one that, by default, can run fence_check.

If we look at fence_check's man page, we see that we can use the -f switch to override this behaviour, but there is an important note:

an-a05n02
man fence_check
       -f     Override checks and force execution. DO NOT USE ON PRODUCTION CLUSTERS!

The reason is that, while fence_check is running, a real fence call against a failed node will be blocked until fence_check finishes. In production, this can make post-failure recovery take longer than it otherwise would.

Good thing we're testing now, before the cluster is in production!

So let's try again, this time forcing the issue.

an-a05n02
fence_check -f
fence_check run at Wed Oct 30 11:02:35 EDT 2013 pid: 28222
Testing an-a05n01.alteeve.ca method 1: success
Testing an-a05n01.alteeve.ca method 2: success
Testing an-a05n02.alteeve.ca method 1: success
Testing an-a05n02.alteeve.ca method 2: success

Very nice.

Crashing an-a05n01 for the First Time

Warning: This step will totally crash an-a05n01! If fencing fails for some reason, you may need physical access to the node to recover it.

Be sure to tail the /var/log/messages system logs on an-a05n02. Go to an-a05n01's first terminal and run the following command.

On an-a05n01 run:

an-a05n01
echo c > /proc/sysrq-trigger

On an-a05n02's syslog terminal, you should see the following entries in the log.

an-a05n02
Oct 30 11:05:46 an-a05n02 corosync[27783]:   [TOTEM ] A processor failed, forming new configuration.
Oct 30 11:05:48 an-a05n02 corosync[27783]:   [QUORUM] Members[1]: 2
Oct 30 11:05:48 an-a05n02 corosync[27783]:   [TOTEM ] A processor joined or left the membership and a new membership was formed.
Oct 30 11:05:48 an-a05n02 corosync[27783]:   [CPG   ] chosen downlist: sender r(0) ip(10.20.50.2) ; members(old:2 left:1)
Oct 30 11:05:48 an-a05n02 corosync[27783]:   [MAIN  ] Completed service synchronization, ready to provide service.
Oct 30 11:05:48 an-a05n02 kernel: dlm: closing connection to node 1
Oct 30 11:05:48 an-a05n02 fenced[27840]: fencing node an-a05n01.alteeve.ca
Oct 30 11:06:21 an-a05n02 fenced[27840]: fence an-a05n01.alteeve.ca success

Excellent! The IPMI-based fencing worked!

But why did it take 33 seconds?

The current fence_ipmilan version handles a reboot action this way:

  1. Check the current power status.
  2. Call ipmitool ... chassis power off.
  3. Check the status again, repeating until it reports off.
  4. Call ipmitool ... chassis power on.
  5. Check the status one last time to confirm the node is powering back on.

If you tried doing these steps directly, you would find that it takes roughly 18 seconds to run. Add this to the delay="15" we set against an-a05n01 when using the IPMI fence device and you have the 33 seconds we see here.
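If you want to watch this happen, the same sequence can be reproduced by hand with ipmitool. This is only a sketch; the BMC host name, user and password below are placeholders, not values from this tutorial.

# Hypothetical BMC address and credentials; substitute your node's real IPMI details.
ipmitool -H an-a05n01.ipmi -U admin -P secret chassis power status
ipmitool -H an-a05n01.ipmi -U admin -P secret chassis power off
ipmitool -H an-a05n01.ipmi -U admin -P secret chassis power status   # repeat until it reports 'off'
ipmitool -H an-a05n01.ipmi -U admin -P secret chassis power on
ipmitool -H an-a05n01.ipmi -U admin -P secret chassis power status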

If you are watching an-a05n01's display, you should now see it starting to boot back up.

Cutting the Power to an-a05n01

Note: Remember to start cman once the node boots back up before running this test.

As was discussed earlier, IPMI and other out-of-band management interfaces have a fatal flaw as a fence device. Their BMC draws its power from the same power supply as the node itself. Thus, when the power supply itself fails (for example, if an internal wire shorted against the chassis), fencing via IPMI will fail as well. This makes the power supply a single point of failure, which is what the PDU protects us against.

In case you're wondering how likely failing a redundant PSU is...

(Photos: "Cable short 1", "Cable short 2" and "Cable short 3".)
Thanks to my very talented fellow admin, Lisa Seelye, for this object lesson.

So to simulate a failed power supply, we're going to use an-a05n02's fence_apc_snmp fence agent to turn off the power to an-a05n01. Given that the node has two power supplies, one plugged in to each PDU, we'll need to make two calls to cut the power.

Alternatively, you could simply unplug the power cables from the PDUs and the fence would still succeed; once fence_apc_snmp confirms that the requested ports are off, the fence action is considered successful. Whether the node restarts afterwards is not a factor.

From an-a05n02, pull the power on an-a05n01 with the following two chained calls;

an-a05n02
fence_apc_snmp -a an-pdu01.alteeve.ca -n 1 -o off && fence_apc_snmp -a an-pdu02.alteeve.ca -n 1 -o off
Success: Powered OFF
Success: Powered OFF
Warning: Verify directly that an-a05n01 lost power! If the power cables are in the wrong port, an-a05n01 will still be powered on, despite the success message!
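One way to double-check from the command line is to query the same outlets with the 'status' action; a minimal sketch (the exact output text may vary by agent version):

an-a05n02
fence_apc_snmp -a an-pdu01.alteeve.ca -n 1 -o status
fence_apc_snmp -a an-pdu02.alteeve.ca -n 1 -o status

Both should report the outlet as off. If either reports it as on, the node's power cables are almost certainly in the wrong PDU ports.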

Back on an-a05n02's syslog, we should see the following entries;

an-a05n02
Oct 30 13:31:49 an-a05n02 corosync[27783]:   [TOTEM ] A processor failed, forming new configuration.
Oct 30 13:31:51 an-a05n02 corosync[27783]:   [QUORUM] Members[1]: 2
Oct 30 13:31:51 an-a05n02 corosync[27783]:   [TOTEM ] A processor joined or left the membership and a new membership was formed.
Oct 30 13:31:51 an-a05n02 corosync[27783]:   [CPG   ] chosen downlist: sender r(0) ip(10.20.50.2) ; members(old:2 left:1)
Oct 30 13:31:51 an-a05n02 corosync[27783]:   [MAIN  ] Completed service synchronization, ready to provide service.
Oct 30 13:31:51 an-a05n02 kernel: dlm: closing connection to node 1
Oct 30 13:31:51 an-a05n02 fenced[27840]: fencing node an-a05n01.alteeve.ca
Oct 30 13:32:26 an-a05n02 fenced[27840]: fence an-a05n01.alteeve.ca dev 0.0 agent fence_ipmilan result: error from agent
Oct 30 13:32:26 an-a05n02 fenced[27840]: fence an-a05n01.alteeve.ca success

Hoozah!

Notice the error from fence_ipmilan? That is exactly what we expected, because the IPMI BMC lost power and couldn't respond. You will also notice the large delay, even though no delay="15" is set on the PDU fence devices for an-a05n01. That delay came from the initial attempt to fence via IPMI, which is why we don't need to specify a delay on the PDUs as well.

So now we know that an-a05n01 can be fenced successfully from both fence devices. Now we need to run the same tests against an-a05n02!

Hanging an-a05n02

Warning: DO NOT ASSUME THAT an-a05n02 WILL FENCE PROPERLY JUST BECAUSE an-a05n01 PASSED! There are many ways that a fence could fail: a bad password, a misconfigured device, a cable plugged into the wrong port on the PDU and so on. Always test all nodes using all methods!
Note: Remember to start cman once the node boots back up before running this test.

Be sure to be tailing /var/log/messages on an-a05n01. Go to an-a05n02's first terminal and run the following command.

Note: This command will not return and you will lose all ability to talk to this node until it is rebooted.

On an-a05n02 run:

an-a05n02
echo c > /proc/sysrq-trigger

On an-a05n01's syslog terminal, you should see the following entries in the log.

an-a05n01
Oct 30 13:40:29 an-a05n01 corosync[2800]:   [TOTEM ] A processor failed, forming new configuration.
Oct 30 13:40:31 an-a05n01 corosync[2800]:   [QUORUM] Members[1]: 1
Oct 30 13:40:31 an-a05n01 corosync[2800]:   [TOTEM ] A processor joined or left the membership and a new membership was formed.
Oct 30 13:40:31 an-a05n01 corosync[2800]:   [CPG   ] chosen downlist: sender r(0) ip(10.20.50.1) ; members(old:2 left:1)
Oct 30 13:40:31 an-a05n01 corosync[2800]:   [MAIN  ] Completed service synchronization, ready to provide service.
Oct 30 13:40:31 an-a05n01 kernel: dlm: closing connection to node 2
Oct 30 13:40:31 an-a05n01 fenced[2857]: fencing node an-a05n02.alteeve.ca
Oct 30 13:40:48 an-a05n01 fenced[2857]: fence an-a05n02.alteeve.ca success

Again, perfect!

Notice this time that the fence action took 17 seconds, much less than it took to fence an-a05n01. This is because, as you probably guessed, there is no delay set against an-a05n02. So when an-a05n01 went to fence it, it proceeded immediately. This tells us that if both nodes try to fence each other at the same time, an-a05n01 should win.

Cutting the Power to an-a05n02

Note: Remember to start cman once the node boots back up before running this test.

Last fence test! Time to yank the power on an-a05n02 and make sure its power fencing works.

From an-a05n01, pull the power on an-a05n02 with the following call;

an-a05n01
fence_apc_snmp -a an-pdu01.alteeve.ca -n 2 -o off && fence_apc_snmp -a an-pdu02.alteeve.ca -n 2 -o off
Success: Powered OFF
Success: Powered OFF
Warning: Verify directly that an-a05n02 lost power! If the power cables are in the wrong port, an-a05n02 will still be powered on, despite the success message!

On an-a05n01's syslog, we should see the following entries;

an-a05n01
Oct 30 13:44:41 an-a05n01 corosync[2800]:   [TOTEM ] A processor failed, forming new configuration.
Oct 30 13:44:43 an-a05n01 corosync[2800]:   [QUORUM] Members[1]: 1
Oct 30 13:44:43 an-a05n01 corosync[2800]:   [TOTEM ] A processor joined or left the membership and a new membership was formed.
Oct 30 13:44:43 an-a05n01 corosync[2800]:   [CPG   ] chosen downlist: sender r(0) ip(10.20.50.1) ; members(old:2 left:1)
Oct 30 13:44:43 an-a05n01 corosync[2800]:   [MAIN  ] Completed service synchronization, ready to provide service.
Oct 30 13:44:43 an-a05n01 kernel: dlm: closing connection to node 2
Oct 30 13:44:43 an-a05n01 fenced[2857]: fencing node an-a05n02.alteeve.ca
Oct 30 13:44:47 an-a05n01 ntpd[2298]: synchronized to 66.96.30.35, stratum 2
Oct 30 13:45:03 an-a05n01 fenced[2857]: fence an-a05n02.alteeve.ca dev 0.0 agent fence_ipmilan result: error from agent
Oct 30 13:45:03 an-a05n01 fenced[2857]: fence an-a05n02.alteeve.ca success

Woot!

Only now can we safely say that our fencing is set up and working properly.

Installing DRBD

DRBD is an open-source application for real-time, block-level disk replication created and maintained by Linbit. We will use this to keep the data on our cluster consistent between the two nodes.

To install it, we have three choices;

  1. Purchase a Red Hat blessed, fully supported copy from Linbit.
  2. Install from the freely available, community maintained ELRepo repository.
  3. Install from source files.

We will be using the 8.3.x version of DRBD. This tracks the Red Hat and Linbit supported versions, giving us the most-tested combination and a painless path to a fully supported version, should you decide to purchase support down the road.

Option 1 - Fully Supported by Red Hat and Linbit

Note: This shows how to install on an-a05n01. Please do this again for an-a05n02.

Red Hat decided not to directly support DRBD in EL6 in order to narrow the set of applications they ship and focus on improving those components. Given the popularity of DRBD, however, Red Hat struck a deal with Linbit, the authors and maintainers of DRBD: you have the option of purchasing a fully supported version of DRBD that is blessed by Red Hat for use under Red Hat Enterprise Linux 6.

If you are building a fully supported cluster, please contact Linbit to purchase DRBD. Once done, you will get an email with your login information and, most importantly here, the URL hash needed to access the official repositories.

First you will need to add an entry in /etc/yum.repos.d/ for DRBD. This needs to be hand-crafted, as you must include the URL hash given to you in the email as part of the repository configuration.

  • Log into the Linbit portal.
  • Click on Account.
  • Under Your account details, click on the hash string to the right of URL hash:.
  • Click on RHEL 6 (even if you are using CentOS or another EL6 distro).

This will take you to a new page called Instructions for using the DRBD package repository. The detailed installation instructions are found there.

Let's use the imaginary URL hash of abcdefghijklmnopqrstuvwxyz0123456789ABCD and assume we are using the x86_64 architecture. Given this, we would create the following repository configuration file.

an-a05n01
vim /etc/yum.repos.d/linbit.repo
[drbd-8]
name=DRBD 8
baseurl=http://packages.linbit.com/abcdefghijklmnopqrstuvwxyz0123456789ABCD/rhel6/x86_64
gpgcheck=0

Once this is saved, you can install DRBD using yum;

an-a05n01
yum install drbd kmod-drbd

Make sure DRBD doesn't start on boot, as we'll have rgmanager handle it.

an-a05n01
chkconfig drbd off

Done!

Option 2 - Install From AN!Repo

Note: This is the method used for this tutorial.

If you didn't remove drbd83-utils and kmod-drbd83 from the package list back in the initial package installation step, then DRBD is already installed.
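If you want to confirm that they are present, a quick query will do; a minimal sketch, assuming those package names:

an-a05n01 an-a05n02
rpm -q drbd83-utils kmod-drbd83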

Option 3 - Install From Source

If you do not wish to pay for access to the official DRBD repository and do not feel comfortable adding a public repository, your last option is to install from Linbit's source code. The benefit of this is that you can vet the source before installing it, making it a more secure option. The downside is that you will need to manually install updates and security fixes as they are made available.

On Both nodes run:

an-a05n01 an-a05n02
yum install flex gcc make kernel-devel
wget -c http://oss.linbit.com/drbd/8.3/drbd-8.3.16.tar.gz
tar -xvzf drbd-8.3.16.tar.gz
cd drbd-8.3.16
./configure \
   --prefix=/usr \
   --localstatedir=/var \
   --sysconfdir=/etc \
   --with-utils \
   --with-km \
   --with-udev \
   --with-pacemaker \
   --with-rgmanager \
   --with-bashcompletion
make
make install
chkconfig --add drbd
chkconfig drbd off
<significant amount of output>
yum install flex gcc make kernel-devel
wget -c http://oss.linbit.com/drbd/8.3/drbd-8.3.16.tar.gz
tar -xvzf drbd-8.3.16.tar.gz
cd drbd-8.3.16
./configure \
   --prefix=/usr \
   --localstatedir=/var \
   --sysconfdir=/etc \
   --with-utils \
   --with-km \
   --with-udev \
   --with-pacemaker \
   --with-rgmanager \
   --with-bashcompletion
make
make install
chkconfig --add drbd
chkconfig drbd off
<significant amount of output, it's really quite impressive>

Hooking DRBD into the Cluster's Fencing

Note: In older DRBD 8.3 releases, prior to 8.3.16, we needed to download rhcs_fence from github as the shipped version had a bug. With the release of 8.3.16, this is no longer the case.

DRBD is, effectively, a stand-alone application. You can use it on its own without any other software. For this reason, DRBD has its own fencing mechanism to avoid split-brains if the DRBD nodes lose contact with each other.

It would be a duplication of effort to set up actual fencing devices in DRBD, so instead we will use a "hook" script called rhcs_fence. When DRBD loses contact with its peer, it blocks and calls this script. In turn, the script asks cman to fence the peer, then waits for cman to report success or failure.

If the fence succeeds, DRBD resumes normal operation, confident that the peer is no longer running.

If the fence fails, DRBD will continue to block and keep trying to fence the peer indefinitely, so all disk reads and writes will hang. This is by design; it is better to hang than to risk a split-brain, which can lead to data loss and corruption.

By using this script, if the fence configuration ever changes, you only need to update cluster.conf; DRBD needs no separate changes.

The "Why" of our Layout - More Safety!

We will be creating two separate DRBD resources. The reason for this is to minimize the chance of data loss in a split-brain event. We've gone to fairly great lengths to ensure that a split-brain never occurs, but it is still possible, so we want a "last line of defence", just in case.

Consider this scenario:

  • You have a two-node cluster running two VMs. One is a mirror for a project and the other is an accounting application. Node 1 hosts the mirror, Node 2 hosts the accounting application.
  • A partition occurs and both nodes try to fence the other.
  • Network access is lost, so both nodes fall back to fencing using PDUs.
  • Both nodes have redundant power supplies, and at some point in time, the power cables on the second PDU got reversed.
  • The fence_apc_snmp agent succeeds, because the requested outlets were shut off. However, due to the cabling mistake, neither node actually shut down.
  • Both nodes proceed to run independently, thinking they are the only node left.
  • During this split-brain, the mirror VM downloads over a gigabyte of updates. Meanwhile, an hour earlier, the accountant updates the books, totalling less than one megabyte of changes.

At this point, you will need to discard the changes on one of the nodes. So now you have to choose:

  • Is the node with the most changes more valid?
  • Is the node with the most recent changes more valid?

Neither criterion holds up: the node with the older and smaller set of changes holds the accounting data, which is significantly more valuable.

Now imagine that both VMs have equally valuable data. What then? Which side do you discard?

The approach we will use is to create two separate DRBD resources. Then we will assign our servers into two groups;

  1. VMs normally designed to run on an-a05n01.
  2. VMs normally designed to run on an-a05n02.

Each of these "pools" of servers will have a dedicated DRBD resource behind it. These pools will be managed by clustered LVM, which gives us a very flexible way to manage DRBD's raw space.

Now imagine the above scenario, except this time imagine that the servers running on an-a05n01 are on one DRBD resource and the servers running on an-a05n02 are on a different resource. Now we can recover from the split brain safely!

  • The DRBD resource hosting an-a05n01's servers can invalidate any changes on an-a05n02.
  • The DRBD resource hosting an-a05n02's servers can invalidate any changes on an-a05n01.

This ability to invalidate in either direction allows us to recover without risking data loss, provided each server was running on its designated node at the time of the split-brain event.

To summarize, we're going to create the following two resources:

  • We'll create a resource called "r0". This resource will back the VMs designed to primarily run on an-a05n01.
  • We'll create a second resource called "r1". This resource will back the VMs designed to primarily run on an-a05n02.

Creating The Partitions For DRBD

It is possible to use LVM on the hosts, and simply create LVs to back our DRBD resources. However, this causes confusion as LVM will see the PV signatures on both the DRBD backing devices and the DRBD device itself. Getting around this requires editing LVM's filter option, which is somewhat complicated and is outside the scope of this tutorial. We're going to use raw partitions and we recommend you do the same.

On our nodes, we created three primary disk partitions:

  • /dev/sda1; The /boot partition.
  • /dev/sda2; The swap partition.
  • /dev/sda3; The root / partition.

We will create a new extended partition. Then within it we will create two new partitions:

  • /dev/sda5; a partition big enough to host the VMs that will normally run on an-a05n01 and the /shared clustered file system.
  • /dev/sda6; a partition big enough to host the VMs that will normally run on an-a05n02.

Block Alignment

We're going to use a program called parted instead of fdisk. With fdisk, we would have to manually ensure that our partitions fell on 64 KiB boundaries. With parted, we can use the -a opt switch to tell it to use optimal alignment, saving us a lot of work. This is important for decent performance in our servers, and it is true for both traditional platter and modern solid-state drives.

For performance reasons, we want to ensure that the file systems created within a VM match the block alignment of the underlying storage stack, clear down to the base partitions on /dev/sda (or whatever your lowest-level block device is).

For those who are curious though, this is why falling on 64 KiB boundaries is important.

Imagine this misaligned scenario;

Note: Not to scale
                 ________________________________________________________________
VM File system  |~~~~~|_______|_______|_______|_______|_______|_______|_______|__
                |~~~~~|==========================================================
DRBD Partition  |~~~~~|_______|_______|_______|_______|_______|_______|_______|__
64 KiB block    |_______|_______|_______|_______|_______|_______|_______|_______|
512byte sectors |_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|

Now, when the guest wants to write one block worth of data, two underlying blocks actually have to be written, creating avoidable disk I/O. That effectively doubles the number of IOPS needed, a huge waste of disk resources.

Note: Not to scale
                 ________________________________________________________________
VM File system  |~~~~~~~|_______|_______|_______|_______|_______|_______|_______|
                |~~~~~~~|========================================================
DRBD Partition  |~~~~~~~|_______|_______|_______|_______|_______|_______|_______|
64 KiB block    |_______|_______|_______|_______|_______|_______|_______|_______|
512byte sectors |_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|

By changing the start cylinder of our partitions to always fall on 64 KiB boundaries, we keep the guest OS's file system in line with the DRBD backing device's blocks. Thus, all reads and writes in the guest OS touch a matching number of real blocks, maximizing disk I/O efficiency.
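If you ever want to verify the alignment math by hand, a minimal sketch (assuming 512-byte sectors as reported above, so 64 KiB = 128 sectors) is to print the partition table in sectors; parted's align-check command, used later in this tutorial, automates the same check:

parted /dev/sda unit s print
# a partition whose 'Start' value is a multiple of 128 sectors begins on a 64 KiB boundary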

Note: You will want to do this with SSD drives, too. It's true that the performance will remain about the same, but SSD drives have a limited number of write cycles, and aligning the blocks will minimize block writes.

Special thanks to Pasi Kärkkäinen for his patience in explaining to me the importance of disk alignment. He created two images which I used as templates for the ASCII art images above.

Determining Storage Pool Sizes

Before we can create the DRBD partitions, we first need to know how much space we want to allocate to each node's storage pool.

Before we start though, we need to know how much available storage space we have to play with. Both nodes should have identical storage, but we'll double check now. If they differ, we'll be limited to the size of the smaller one.

an-a05n01 an-a05n02
parted -a opt /dev/sda "print free"
Model: LSI RAID 5/6 SAS 6G (scsi)
Disk /dev/sda: 898GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos

Number  Start   End     Size    Type     File system     Flags
        32.3kB  1049kB  1016kB           Free Space
 1      1049kB  525MB   524MB   primary  ext4            boot
 2      525MB   43.5GB  42.9GB  primary  ext4
 3      43.5GB  47.8GB  4295MB  primary  linux-swap(v1)
        47.8GB  898GB   851GB            Free Space
parted -a opt /dev/sda "print free"
Model: LSI RAID 5/6 SAS 6G (scsi)
Disk /dev/sda: 898GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos

Number  Start   End     Size    Type     File system     Flags
        32.3kB  1049kB  1016kB           Free Space
 1      1049kB  525MB   524MB   primary  ext4            boot
 2      525MB   43.5GB  42.9GB  primary  ext4
 3      43.5GB  47.8GB  4295MB  primary  linux-swap(v1)
        47.8GB  898GB   851GB            Free Space

Excellent! Both nodes show the same amount of free space, 851 GB (note, not GiB).

We need to carve this up into three chunks of space:

  1. Space for the /shared partition. Install ISOs, server definition files and the like will be kept here.
  2. Space for servers designed to run on an-a05n01.
  3. Space for servers designed to run on an-a05n02.

We're going to install eight different operating systems, so we'll need enough space for at least eight install ISO images. We'll allocate 40 GB for this, which leaves 811 GB for servers.

Choosing which node will host which servers is largely a question of distributing CPU load. Of course, each node has to be capable of running all of our servers at the same time. With a little planning though, we can split up servers with expected high CPU load and, when both nodes are up, gain a little performance.

So let's create a table showing the servers we plan to build. We'll put them into two columns, one for servers designed to run on an-a05n01 and the others designed to run on an-a05n02. We'll note how much disk space each server will need. Remember, we're trying to split up our servers with the highest expected CPU loads. This, being a tutorial, is going to be a fairly artificial division. You will need to decide for yourself how you want to split up your servers and how much space each needs.

an-a05n01               an-a05n02
vm01-win2008 (150 GB)
                        vm02-win2012 (150 GB)
vm03-win7 (100 GB)
vm04-win8 (100 GB)
                        vm05-freebsd9 (50 GB)
                        vm06-solaris11 (100 GB)
vm07-rhel6 (50 GB)
vm08-sles11 (100 GB)
Total: 500 GB           Total: 300 GB

The reason we put /shared on the same DRBD resource (and thus, the same storage pool) as the one hosting an-a05n01's servers is that it changes relatively rarely. So in the already unlikely event of a split-brain, the chances of something important changing in /shared before the split-brain is resolved are extremely low; so low that the overhead of a third resource is not justified.

So then:

  • The first DRBD resource, called r0, will need to have 540 GB of space.
  • The second DRBD resource, called r1, will need to have 300 GB of space.

This is a total of 840 GB, leaving about 11 GB unused. What you do with the remaining free space is entirely up to you. You can assign it to one of the servers, leave it as free space in one (or partially on both) storage pools, etc.

It's actually a very common setup to build Anvil! systems with more storage than is needed. This free space can then be used later for new servers, growing or adding space to existing servers and so on. In our case, we'll give the leftover space to the second storage pool and leave it there unassigned.

Now we're ready to create the partitions on each node that will back our DRBD resources!

Creating the DRBD Partitions

Here I will show you the values I entered to create the three partitions I needed on my nodes.

Note: All of the following commands need to be run on both nodes. It's very important that both nodes have identical partitions when you finish!

On both nodes, start the parted shell.

an-a05n01 an-a05n02
parted -a optimal /dev/sda
GNU Parted 2.1
Using /dev/sda
Welcome to GNU Parted! Type 'help' to view a list of commands.
parted -a optimal /dev/sda
GNU Parted 2.1
Using /dev/sda
Welcome to GNU Parted! Type 'help' to view a list of commands.

We're now in the parted console. Before we start, let's take another look at the current disk configuration along with the amount of free space available.

an-a05n01 an-a05n02
print free
Model: LSI RAID 5/6 SAS 6G (scsi)
Disk /dev/sda: 898GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos

Number  Start   End     Size    Type     File system     Flags
        32.3kB  1049kB  1016kB           Free Space
 1      1049kB  525MB   524MB   primary  ext4            boot
 2      525MB   43.5GB  42.9GB  primary  ext4
 3      43.5GB  47.8GB  4295MB  primary  linux-swap(v1)
        47.8GB  898GB   851GB            Free Space
print free
Model: LSI RAID 5/6 SAS 6G (scsi)
Disk /dev/sda: 898GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos

Number  Start   End     Size    Type     File system     Flags
        32.3kB  1049kB  1016kB           Free Space
 1      1049kB  525MB   524MB   primary  ext4            boot
 2      525MB   43.5GB  42.9GB  primary  ext4
 3      43.5GB  47.8GB  4295MB  primary  linux-swap(v1)
        47.8GB  898GB   851GB            Free Space

Before we can create the two DRBD partitions, we first need to create an extended partition within which we will create the two logical partitions. From the output above, we can see that the free space starts at 47.8GB, and that the drive ends at 898GB. Knowing this, we can now create the extended partition.

an-a05n01
mkpart extended 47.8G 898G
Warning: WARNING: the kernel failed to re-read the partition table on /dev/sda (Device or resource busy).
As a result, it may not reflect all of your changes until after reboot.
an-a05n02
mkpart extended 47.8G 898G
Warning: WARNING: the kernel failed to re-read the partition table on /dev/sda (Device or resource busy).
As a result, it may not reflect all of your changes until after reboot.

Don't worry about that message, we will reboot when we finish.

So now we can confirm that the new extended partition was created by again printing the partition table and the free space.

an-a05n01 an-a05n02
print free
Model: LSI RAID 5/6 SAS 6G (scsi)
Disk /dev/sda: 898GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos

Number  Start   End     Size    Type      File system     Flags
        32.3kB  1049kB  1016kB            Free Space
 1      1049kB  525MB   524MB   primary   ext4            boot
 2      525MB   43.5GB  42.9GB  primary   ext4
 3      43.5GB  47.8GB  4295MB  primary   linux-swap(v1)
 4      47.8GB  898GB   851GB   extended                  lba
        47.8GB  898GB   851GB             Free Space
print free
Model: LSI RAID 5/6 SAS 6G (scsi)
Disk /dev/sda: 898GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos

Number  Start   End     Size    Type      File system     Flags
        32.3kB  1049kB  1016kB            Free Space
 1      1049kB  525MB   524MB   primary   ext4            boot
 2      525MB   43.5GB  42.9GB  primary   ext4
 3      43.5GB  47.8GB  4295MB  primary   linux-swap(v1)
 4      47.8GB  898GB   851GB   extended                  lba
        47.8GB  898GB   851GB             Free Space

Perfect. So now we're going to create our two logical partitions. We're going to use the same start position as last time, but the end position will be 540 GB further in, rounded up to an even ten gigabytes. You can be more precise, if you wish, but we've got a little wiggle room.

If you recall from the section above, this is how much space we determined we would need for the /shared partition and the five servers that will live on an-a05n01. This means that we're going to create a new logical partition that starts at 47.8G and ends at 590G, for a partition that is roughly 540 GB in size.

an-a05n01
mkpart logical 47.8G 590G
Warning: WARNING: the kernel failed to re-read the partition table on /dev/sda
(Device or resource busy).  As a result, it may not reflect all of your changes
until after reboot.
an-a05n02
mkpart logical 47.8G 590G
Warning: WARNING: the kernel failed to re-read the partition table on /dev/sda (Device or resource busy).
As a result, it may not reflect all of your changes until after reboot.

We'll check again to see the new partition layout.

an-a05n01 an-a05n02
print free
Model: LSI RAID 5/6 SAS 6G (scsi)
Disk /dev/sda: 898GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos

Number  Start   End     Size    Type      File system     Flags
        32.3kB  1049kB  1016kB            Free Space
 1      1049kB  525MB   524MB   primary   ext4            boot
 2      525MB   43.5GB  42.9GB  primary   ext4
 3      43.5GB  47.8GB  4295MB  primary   linux-swap(v1)
 4      47.8GB  898GB   851GB   extended                  lba
 5      47.8GB  590GB   542GB   logical
        590GB   898GB   308GB             Free Space
print free
Model: LSI RAID 5/6 SAS 6G (scsi)
Disk /dev/sda: 898GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos

Number  Start   End     Size    Type      File system     Flags
        32.3kB  1049kB  1016kB            Free Space
 1      1049kB  525MB   524MB   primary   ext4            boot
 2      525MB   43.5GB  42.9GB  primary   ext4
 3      43.5GB  47.8GB  4295MB  primary   linux-swap(v1)
 4      47.8GB  898GB   851GB   extended                  lba
 5      47.8GB  590GB   542GB   logical
        590GB   898GB   308GB             Free Space

Again, perfect. Now we have a total of 308 GB left free. We need 300 GB, so this is enough, as expected. Let's allocate it all to our final partition.

an-a05n01
mkpart logical 590G 898G
Warning: WARNING: the kernel failed to re-read the partition table on /dev/sda (Device or resource busy).
As a result, it may not reflect all of your changes until after reboot.
an-a05n02
mkpart logical 590G 898G
Warning: WARNING: the kernel failed to re-read the partition table on /dev/sda (Device or resource busy).
As a result, it may not reflect all of your changes until after reboot.

Once again, let's look at the new partition table.

an-a05n01 an-a05n02
print free
Model: LSI RAID 5/6 SAS 6G (scsi)
Disk /dev/sda: 898GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos

Number  Start   End     Size    Type      File system     Flags
        32.3kB  1049kB  1016kB            Free Space
 1      1049kB  525MB   524MB   primary   ext4            boot
 2      525MB   43.5GB  42.9GB  primary   ext4
 3      43.5GB  47.8GB  4295MB  primary   linux-swap(v1)
 4      47.8GB  898GB   851GB   extended                  lba
 5      47.8GB  590GB   542GB   logical
 6      590GB   898GB   308GB   logical
print free
Model: LSI RAID 5/6 SAS 6G (scsi)
Disk /dev/sda: 898GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos

Number  Start   End     Size    Type      File system     Flags
        32.3kB  1049kB  1016kB            Free Space
 1      1049kB  525MB   524MB   primary   ext4            boot
 2      525MB   43.5GB  42.9GB  primary   ext4
 3      43.5GB  47.8GB  4295MB  primary   linux-swap(v1)
 4      47.8GB  898GB   851GB   extended                  lba
 5      47.8GB  590GB   542GB   logical
 6      590GB   898GB   308GB   logical

Just as we asked for!

Before we finish though, let's be extra careful and do a manual check of our new partitions to ensure that they are, in fact, aligned optimally. There will be no output from the following commands if the partitions are aligned.

an-a05n01 an-a05n02
align-check opt 5
align-check opt 6
<no output>
align-check opt 5
align-check opt 6
<no output>

Excellent, we're done!

an-a05n01 an-a05n02
quit
Information: You may need to update /etc/fstab.
quit
Information: You may need to update /etc/fstab.

Now we need to reboot to make the kernel see the new partition table. If cman is running, stop it before rebooting.

an-a05n01 an-a05n02
/etc/init.d/cman stop
Stopping cluster: 
   Leaving fence domain...                                 [  OK  ]
   Stopping gfs_controld...                                [  OK  ]
   Stopping dlm_controld...                                [  OK  ]
   Stopping fenced...                                      [  OK  ]
   Stopping cman...                                        [  OK  ]
   Waiting for corosync to shutdown:                       [  OK  ]
   Unloading kernel modules...                             [  OK  ]
   Unmounting configfs...                                  [  OK  ]
reboot
/etc/init.d/cman stop
Stopping cluster: 
   Leaving fence domain...                                 [  OK  ]
   Stopping gfs_controld...                                [  OK  ]
   Stopping dlm_controld...                                [  OK  ]
   Stopping fenced...                                      [  OK  ]
   Stopping cman...                                        [  OK  ]
   Waiting for corosync to shutdown:                       [  OK  ]
   Unloading kernel modules...                             [  OK  ]
   Unmounting configfs...                                  [  OK  ]
reboot

Once the nodes are back online, remember to start cman again.
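A quick way to do both (a minimal sketch; /proc/partitions confirms the kernel now sees sda5 and sda6):

an-a05n01 an-a05n02
cat /proc/partitions
/etc/init.d/cman start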

Configuring DRBD

DRBD is configured in two parts:

  • Global and common configuration options
  • Resource configurations

We will be creating two separate DRBD resources, so we will create two separate resource configuration files. More on that in a moment.

Configuring DRBD Global and Common Options

As always, we're going to start by making backups. Then we're going to work on an-a05n01. After we finish, we'll copy everything over to an-a05n02.

an-a05n01
rsync -av /etc/drbd.d /root/backups/
sending incremental file list
drbd.d/
drbd.d/global_common.conf

sent 1722 bytes  received 35 bytes  3514.00 bytes/sec
total size is 1604  speedup is 0.91
an-a05n02
rsync -av /etc/drbd.d /root/backups/
sending incremental file list
drbd.d/
drbd.d/global_common.conf

sent 1722 bytes  received 35 bytes  3514.00 bytes/sec
total size is 1604  speedup is 0.91

Now we can begin.

The first file to edit is /etc/drbd.d/global_common.conf. In this file, we will set global configuration options and set default resource configuration options.

We'll talk about the values we're setting here as well as put the explanation of each option in the configuration file itself, as it will be useful to have them should you need to alter the files sometime in the future.

The first addition is in the handlers { } directive. We're going to add the fence-peer option and configure it to use the rhcs_fence script we spoke about in the section above.

an-a05n01
vim /etc/drbd.d/global_common.conf
	handlers {
		# This script is a wrapper for RHCS's 'fence_node' command line
		# tool. It will call a fence against the other node and return
		# the appropriate exit code to DRBD.
		fence-peer		"/usr/lib/drbd/rhcs_fence";
	}

We're going to add a few options to the startup { } directive: we're going to tell DRBD to promote both nodes to "primary" on start, to wait up to five minutes on start for its peer to connect and, if the peer was degraded or outdated the last time it was seen, to wait only two minutes.

an-a05n01
	startup {
		# This tells DRBD to promote both nodes to Primary on start.
		become-primary-on	both;

		# This tells DRBD to wait five minutes for the other node to
		# connect. This should be longer than it takes for cman to
		# timeout and fence the other node *plus* the amount of time it
		# takes the other node to reboot. If you set this too short,
		# you could corrupt your data. If you want to be extra safe, do
		# not use this at all and DRBD will wait for the other node
		# forever.
		wfc-timeout		300;

		# This tells DRBD to wait for the other node for two minutes
		# if the other node was degraded the last time it was seen by
		# this node. This is a way to speed up the boot process when
		# the other node is out of commission for an extended duration.
		degr-wfc-timeout	120;
		
		# Same as above, except this time-out is used if the peer was
		# 'Outdated'.
		outdated-wfc-timeout    120;
	}

For the disk { } directive, we're going to configure DRBD's behaviour when a split-brain is detected. By setting fencing to resource-and-stonith, we're telling DRBD to stop all disk access and call a fence against its peer node rather than proceeding.

an-a05n01
	disk {
		# This tells DRBD to block IO and fence the remote node (using
		# the 'fence-peer' helper) when connection with the other node
		# is unexpectedly lost. This is what helps prevent split-brain
		# condition and it is incredibly important in dual-primary
		# setups!
		fencing			resource-and-stonith;
	}

In the net { } directive, we're going to tell DRBD that it is allowed to run in dual-primary mode and we're going to configure how it behaves if a split-brain has occurred despite our best efforts. The recovery (or lack thereof) is controlled by three options: what to do when neither node had been primary (after-sb-0pri), what to do if only one node had been primary (after-sb-1pri) and, finally, what to do if both nodes had been primary (after-sb-2pri), as will most likely be the case for us. This last case is configured to tell DRBD to simply drop the connection, which will require human intervention to correct.

At this point, you might be wondering why we don't simply run Primary/Secondary. The reason is live migration: when we push a VM across to the other node, there is a short period of time where both nodes need to be writable.

an-a05n01
	net {
		# This tells DRBD to allow two nodes to be Primary at the same
		# time. It is needed when 'become-primary-on both' is set.
		allow-two-primaries;

		# The following three commands tell DRBD how to react should
		# our best efforts fail and a split brain occurs. You can learn
		# more about these options by reading the drbd.conf man page.
		# NOTE! It is not possible to safely recover from a split brain
		# where both nodes were primary. This case requires human
		# intervention, so 'disconnect' is the only safe policy.
		after-sb-0pri		discard-zero-changes;
		after-sb-1pri		discard-secondary;
		after-sb-2pri		disconnect;
	}

For the syncer { } directive, we're going to configure how much bandwidth DRBD is allowed to take away from normal replication for use with background synchronization of out-of-sync blocks.

an-a05n01
	syncer {
		# This tells DRBD how fast to synchronize out-of-sync blocks.
		# The higher this number, the faster an Inconsistent resource
		# will get back to UpToDate state. However, the faster this is,
		# the more of an impact normal application use of the DRBD
		# resource will suffer. We'll set this to 30 MB/sec.
		rate			30M;
	}

Save the changes and exit the text editor. Now let's use diff to see the changes we made.

an-a05n01
diff -U0 /root/backups/drbd.d/global_common.conf /etc/drbd.d/global_common.conf
--- /root/backups/drbd.d/global_common.conf	2013-09-27 16:38:33.000000000 -0400
+++ /etc/drbd.d/global_common.conf	2013-10-31 01:08:13.733823523 -0400
@@ -22,0 +23,5 @@
+
+		# This script is a wrapper for RHCS's 'fence_node' command line
+		# tool. It will call a fence against the other node and return
+		# the appropriate exit code to DRBD.
+		fence-peer		"/usr/lib/drbd/rhcs_fence";
@@ -26,0 +32,22 @@
+
+		# This tells DRBD to promote both nodes to Primary on start.
+		become-primary-on	both;
+
+		# This tells DRBD to wait five minutes for the other node to
+		# connect. This should be longer than it takes for cman to
+		# timeout and fence the other node *plus* the amount of time it
+		# takes the other node to reboot. If you set this too short,
+		# you could corrupt your data. If you want to be extra safe, do
+		# not use this at all and DRBD will wait for the other node
+		# forever.
+		wfc-timeout		300;
+
+		# This tells DRBD to wait for the other node for two minutes
+		# if the other node was degraded the last time it was seen by
+		# this node. This is a way to speed up the boot process when
+		# the other node is out of commission for an extended duration.
+		degr-wfc-timeout	120;
+
+		# Same as above, except this time-out is used if the peer was
+		# 'Outdated'.
+		outdated-wfc-timeout	120;
@@ -31,0 +59,7 @@
+
+		# This tells DRBD to block IO and fence the remote node (using
+		# the 'fence-peer' helper) when connection with the other node
+		# is unexpectedly lost. This is what helps prevent split-brain
+		# condition and it is incredibly important in dual-primary
+		# setups!
+		fencing			resource-and-stonith;
@@ -37,0 +72,14 @@
+
+		# This tells DRBD to allow two nodes to be Primary at the same
+		# time. It is needed when 'become-primary-on both' is set.
+		allow-two-primaries;
+
+		# The following three commands tell DRBD how to react should
+		# our best efforts fail and a split brain occurs. You can learn
+		# more about these options by reading the drbd.conf man page.
+		# NOTE! It is not possible to safely recover from a split brain
+		# where both nodes were primary. This case requires human
+		# intervention, so 'disconnect' is the only safe policy.
+		after-sb-0pri		discard-zero-changes;
+		after-sb-1pri		discard-secondary;
+		after-sb-2pri		disconnect;
@@ -41,0 +90,7 @@
+
+		# This tells DRBD how fast to synchronize out-of-sync blocks.
+		# The higher this number, the faster an Inconsistent resource
+		# will get back to UpToDate state. However, the faster this is,
+		# the more of an impact normal application use of the DRBD
+		# resource will suffer. We'll set this to 30 MB/sec.
+		rate			30M;

Done with this file.

Configuring the DRBD Resources

As mentioned earlier, we are going to create two DRBD resources:

  • Resource r0, which will create the device /dev/drbd0 and be backed by each node's /dev/sda5 partition. It will provide disk space for VMs that will normally run on an-a05n01 and provide space for the /shared GFS2 partition.
  • Resource r1, which will create the device /dev/drbd1 and be backed by each node's /dev/sda6 partition. It will provide disk space for VMs that will normally run on an-a05n02.

Each resource configuration will be in its own file saved as /etc/drbd.d/rX.res. The two of them will be pretty much the same. So let's take a look at the first resource, r0.res, then we'll just look at the changes for r1.res. These files won't exist initially so we start by creating them.

an-a05n01
vim /etc/drbd.d/r0.res
# This is the resource used for the shared GFS2 partition and host VMs designed 
# to run on an-a05n01.
resource r0 {
	# This is the block device path.
	device		/dev/drbd0;

	# We'll use the normal internal meta-disk. This is where DRBD stores
	# its state information about the resource. It takes about 32 MB per
	# 1 TB of raw space.
	meta-disk	internal;

	# This is the `uname -n` of the first node
	on an-a05n01.alteeve.ca {
		# The 'address' has to be the IP, not a host name. This is the
		# node's SN (sn_bond1) IP. The port number must be unique among
		# resources.
		address		10.10.50.1:7788;

		# This is the block device backing this resource on this node.
		disk		/dev/sda5;
	}
	# Now the same information again for the second node.
	on an-a05n02.alteeve.ca {
		address		10.10.50.2:7788;
		disk		/dev/sda5;
	}
}

Now copy this to r1.res and edit it for the an-a05n02 VM resource. The main differences are the resource name, r1, the block device, /dev/drbd1, the port, 7789, and the backing block device, /dev/sda6.

an-a05n01
cp /etc/drbd.d/r0.res /etc/drbd.d/r1.res
vim /etc/drbd.d/r1.res
# This is the resource used for the VMs designed to run on an-a05n02.
resource r1 {
	# This is the block device path.
	device          /dev/drbd1;

	# We'll use the normal internal meta-disk. This is where DRBD stores
	# its state information about the resource. It takes about 32 MB per
	# 1 TB of raw space.
	meta-disk       internal;

	# This is the `uname -n` of the first node
	on an-a05n01.alteeve.ca {
		# The 'address' has to be the IP, not a host name. This is the
		# node's SN (sn_bond1) IP. The port number must be unique among
		# resources.
		address         10.10.50.1:7789;

		# This is the block device backing this resource on this node.
		disk            /dev/sda6;
	}
	# Now the same information again for the second node.
	on an-a05n02.alteeve.ca {
		address         10.10.50.2:7789;
		disk            /dev/sda6;
	}
}

It's easiest to see what changed between r0.res and r1.res if we diff them.

an-a05n01
diff -U0 /etc/drbd.d/r0.res /etc/drbd.d/r1.res
--- /etc/drbd.d/r0.res	2013-10-30 21:26:31.936680235 -0400
+++ /etc/drbd.d/r1.res	2013-10-30 21:27:42.625006337 -0400
@@ -1,3 +1,2 @@
-# This is the resource used for the shared GFS2 partition and host VMs designed
-# to run on an-a05n01.
-resource r0 {
+# This is the resource used for the VMs designed to run on an-a05n02.
+resource r1 {
@@ -5 +4 @@
-	device		/dev/drbd0;
+	device		/dev/drbd1;
@@ -17 +16 @@
-		address		10.10.50.1:7788;
+		address		10.10.50.1:7789;
@@ -20 +19 @@
-		disk		/dev/sda5;
+		disk		/dev/sda6;
@@ -24,2 +23,2 @@
-		address		10.10.50.2:7788;
-		disk		/dev/sda5;
+		address		10.10.50.2:7789;
+		disk		/dev/sda6;

We can see easily that the resource name, device name and backing partitions changed. We can also see that the IP address used for each resource stayed the same. We split up the network traffic by using different TCP ports instead.
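Later, once both resources are up and connected, you can confirm that each has its own TCP connection on the storage network; a minimal sketch (assuming net-tools is installed; DRBD's connections appear as ordinary kernel-owned sockets):

netstat -nt | grep -e ':7788' -e ':7789'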

Now we will do an initial validation of the configuration. This is done by running the following command;

an-a05n01
drbdadm dump
# /etc/drbd.conf
common {
    protocol               C;
    net {
        allow-two-primaries;
        after-sb-0pri    discard-zero-changes;
        after-sb-1pri    discard-secondary;
        after-sb-2pri    disconnect;
    }
    disk {
        fencing          resource-and-stonith;
    }
    syncer {
        rate             30M;
    }
    startup {
        wfc-timeout      300;
        degr-wfc-timeout 120;
        outdated-wfc-timeout 120;
        become-primary-on both;
    }
    handlers {
        fence-peer       /usr/lib/drbd/rhcs_fence;
    }
}

# resource r0 on an-a05n01.alteeve.ca: not ignored, not stacked
resource r0 {
    on an-a05n01.alteeve.ca {
        device           /dev/drbd0 minor 0;
        disk             /dev/sda5;
        address          ipv4 10.10.50.1:7788;
        meta-disk        internal;
    }
    on an-a05n02.alteeve.ca {
        device           /dev/drbd0 minor 0;
        disk             /dev/sda5;
        address          ipv4 10.10.50.2:7788;
        meta-disk        internal;
    }
}

# resource r1 on an-a05n01.alteeve.ca: not ignored, not stacked
resource r1 {
    on an-a05n01.alteeve.ca {
        device           /dev/drbd1 minor 1;
        disk             /dev/sda6;
        address          ipv4 10.10.50.1:7789;
        meta-disk        internal;
    }
    on an-a05n02.alteeve.ca {
        device           /dev/drbd1 minor 1;
        disk             /dev/sda6;
        address          ipv4 10.10.50.2:7789;
        meta-disk        internal;
    }
}

You'll note that the output is formatted differently from the configuration files we created, but the values themselves are the same. If there had been errors, you would have seen them printed. Fix any problems before proceeding. Once you get a clean dump, copy the configuration over to the other node.
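As an aside, while chasing down a configuration problem you can dump a single resource rather than the whole configuration; a minimal sketch:

an-a05n01
drbdadm dump r0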

an-a05n01
rsync -av /etc/drbd.d root@an-a05n02:/etc/
sending incremental file list
drbd.d/
drbd.d/global_common.conf
drbd.d/r0.res
drbd.d/r1.res

sent 5015 bytes  received 91 bytes  10212.00 bytes/sec
total size is 5479  speedup is 1.07

Done!

Initializing the DRBD Resources

Now that we have DRBD configured, we need to initialize the DRBD backing devices and then bring up the resources for the first time.

Note: To save a bit of time and typing, the following sections will use a little bash magic. When commands need to be run on both resources, rather than running the same command twice with the different resource names, we will use the short-hand form r{0,1}.
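If this bash shorthand is new to you, the shell expands the braces before the command runs, so one line acts on both resources. A quick illustration; the echo simply prints the expanded command:

echo drbdadm create-md r{0,1}
drbdadm create-md r0 r1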

On both nodes, create the new metadata on the backing devices.

Two notes:

  • You may need to type yes to confirm the action if any data is seen.
  • If DRBD sees an actual file system, it will error and insist that you clear the partition. You can do this by running dd if=/dev/zero of=/dev/sdaX bs=4M count=1000, where X is the partition you want to clear. This is called "zeroing out" a partition. The dd program does not print its progress by default; to make it report progress, open a new terminal on the node and run 'kill -USR1 $(pidof dd)'.

Let's create the meta-data!

an-a05n01 an-a05n02
drbdadm create-md r{0,1}
Writing meta data...
initializing activity log
NOT initialized bitmap
New drbd meta data block successfully created.
success
Writing meta data...
initializing activity log
NOT initialized bitmap
New drbd meta data block successfully created.
success
drbdadm create-md r{0,1}
Writing meta data...
initializing activity log
NOT initialized bitmap
New drbd meta data block successfully created.
success
Writing meta data...
initializing activity log
NOT initialized bitmap
New drbd meta data block successfully created.
success

If you get an error like this;

pvs stderr:  Skipping volume group an-a05n01-vg0
pvs stderr:        Freeing VG (null) at 0x16efd20.
pvs stderr:      Unlocking /var/lock/lvm/P_global
pvs stderr:        _undo_flock /var/lock/lvm/P_global

md_offset 542229131264
al_offset 542229098496
bm_offset 542212550656

Found LVM2 physical volume signature
   529504444 kB left usable by current configuration
Could not determine the size of the actually used data area.

Device size would be truncated, which
would corrupt data and result in
'access beyond end of device' errors.
If you want me to do this, you need to zero out the first part
of the device (destroy the content).
You should be very sure that you mean it.
Operation refused.

Command 'drbdmeta 0 v08 /dev/sda5 internal create-md' terminated with exit code 40
drbdadm create-md r0: exited with code 40
Warning: The next two commands will irrevocably destroy the data on /dev/sda5 and /dev/sda6!

Use dd on the backing device to destroy all existing data.

dd if=/dev/zero of=/dev/sda5 bs=4M count=1000
1000+0 records in
1000+0 records out
4194304000 bytes (4.2 GB) copied, 9.04352 s, 464 MB/s
dd if=/dev/zero of=/dev/sda6 bs=4M count=1000
1000+0 records in
1000+0 records out
4194304000 bytes (4.2 GB) copied, 9.83831 s, 426 MB/s

Try running the create-md commands again; they should work this time.

Loading the drbd Kernel Module

Before we can go any further, we'll need to load the drbd kernel module. Normally you won't need to do this yourself, because the /etc/init.d/drbd initialization script handles it for us. We can't use that script yet though, because the DRBD resources we defined are not yet set up.

So to load the drbd kernel module, run;

an-a05n01
modprobe drbd

Log messages:

Oct 30 22:45:45 an-a05n01 kernel: drbd: initialized. Version: 8.3.16 (api:88/proto:86-97)
Oct 30 22:45:45 an-a05n01 kernel: drbd: GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2013-09-27 16:00:43
Oct 30 22:45:45 an-a05n01 kernel: drbd: registered as block device major 147
Oct 30 22:45:45 an-a05n01 kernel: drbd: minor_table @ 0xffff8803374420c0
an-a05n02
modprobe drbd

Log messages:

Oct 30 22:45:51 an-a05n02 kernel: drbd: initialized. Version: 8.3.16 (api:88/proto:86-97)
Oct 30 22:45:51 an-a05n02 kernel: drbd: GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2013-09-27 16:00:43
Oct 30 22:45:51 an-a05n02 kernel: drbd: registered as block device major 147
Oct 30 22:45:51 an-a05n02 kernel: drbd: minor_table @ 0xffff8803387a9ec0

Now go back to the terminal windows we were using to watch the cluster start. Kill the tail, if it's still running. We're going to watch the output of cat /proc/drbd so we can keep tabs on the current state of the DRBD resources. We'll do this by using the watch program, which will refresh the output of the cat call every couple of seconds.

an-a05n01
watch cat /proc/drbd
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2013-09-27 16:00:43
an-a05n02
watch cat /proc/drbd
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2013-09-27 16:00:43

Back in the first terminal, we now need to attach each resource's backing device, /dev/sda{5,6}, to its respective DRBD resource, r{0,1}. After running the following command, you will see no output on the first terminal, but the second terminal's /proc/drbd should change.

an-a05n01
drbdadm attach r{0,1}

Output from /proc/drbd

version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2013-09-27 16:00:43
 0: cs:StandAlone ro:Secondary/Unknown ds:Inconsistent/DUnknown   r----s
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:529504444
 1: cs:StandAlone ro:Secondary/Unknown ds:Inconsistent/DUnknown   r----s
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:301082612
an-a05n02
drbdadm attach r{0,1}

Output from /proc/drbd

version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2013-09-27 16:00:43
 0: cs:StandAlone ro:Secondary/Unknown ds:Inconsistent/DUnknown   r----s
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:529504444
 1: cs:StandAlone ro:Secondary/Unknown ds:Inconsistent/DUnknown   r----s
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:301082612

Take note of the connection state, cs:StandAlone, the current role, ro:Secondary/Unknown and the disk state, ds:Inconsistent/DUnknown. This tells us that our resources are not talking to one another, are not usable because they are in the Secondary state (you can't even read the /dev/drbdX device) and that the backing device does not have an up to date view of the data.

This all makes sense of course, as the resources are brand new.
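
If you would rather query these three fields directly than pick them out of /proc/drbd, drbdadm can report each one on its own. A quick sketch; the values printed will mirror what /proc/drbd shows:

drbdadm cstate r{0,1}   # connection state, ie: StandAlone, WFConnection or Connected
drbdadm role r{0,1}     # local/peer roles, ie: Secondary/Unknown
drbdadm dstate r{0,1}   # disk states, ie: Inconsistent/DUnknown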

So the next step is to connect the two nodes together. As before, we won't see any output from the first terminal, but the second terminal will change.

Note: After running the following command on the first node, its connection state will become cs:WFConnection which means that it is waiting for a connection from the other node.
an-a05n01
drbdadm connect r{0,1}

Output from /proc/drbd

version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2013-09-27 16:00:43
 0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:529504444
 1: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:301082612
an-a05n02
drbdadm connect r{0,1}

Output from /proc/drbd

version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2013-09-27 16:00:43
 0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:529504444
 1: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:301082612

We can now see that the two nodes are talking to one another properly as the connection state has changed to cs:Connected. They can see that their peer node is in the same state as they are; Secondary/Inconsistent.

Next step is to synchronize the two nodes. Neither node has any real data, so it's entirely arbitrary which node we choose to use here. We'll use an-a05n01 because, well, why not.

an-a05n01
drbdadm -- --overwrite-data-of-peer primary r{0,1}

Output from /proc/drbd

version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2013-09-27 16:00:43
 0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
    ns:11467520 nr:0 dw:0 dr:11468516 al:0 bm:699 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:518036924
        [>....................] sync'ed:  2.2% (505892/517092)M
        finish: 7:03:30 speed: 20,372 (13,916) K/sec
 1: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
    ns:10833792 nr:0 dw:0 dr:10834788 al:0 bm:661 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:290248820
        [>....................] sync'ed:  3.6% (283444/294024)M
        finish: 7:31:03 speed: 10,720 (13,144) K/sec
an-a05n02
# don't run anything here.

Output from /proc/drbd

version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2013-09-27 16:00:43
 0: cs:SyncTarget ro:Secondary/Primary ds:Inconsistent/UpToDate C r-----
    ns:0 nr:11467520 dw:11467520 dr:0 al:0 bm:699 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:518036924
        [>....................] sync'ed:  2.2% (505892/517092)M
        finish: 8:42:19 speed: 16,516 (13,796) want: 30,720 K/sec
 1: cs:SyncTarget ro:Secondary/Primary ds:Inconsistent/UpToDate C r-----
    ns:0 nr:11061120 dw:11061120 dr:0 al:0 bm:675 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:290021492
        [>....................] sync'ed:  3.7% (283224/294024)M
        finish: 7:06:46 speed: 11,316 (13,308) want: 30,720 K/sec

Excellent! This tells us that the data, as garbage as it is, is being sync'ed over to an-a05n02. DRBD doesn't know about data structures; all it cares about is that whatever is on the first node is identical to what is on the second node. This initial synchronization ensures exactly that.

A few notes:

  • There is a trick to short-circuit this initial sync which we used in the old tutorial, but we no longer recommend it. If you ever run an online verification of the resource, all of the blocks skipped by that trick will have to synchronize anyway, so it's better to let the full sync run now, before the cluster is in production.
  • If you notice that the sync speed is sitting at 250 K/sec, then DRBD isn't honouring the syncer { rate xxM; } value. Run drbdadm adjust all on one node and the sync speed should start to climb (see the sketch after this list for manually overriding the rate).
  • Sync speed is NOT replication speed! - This is a very common misunderstanding for new DRBD users. The sync speed we see here takes away from the bandwidth available to applications writing to the DRBD resource. The slower the sync speed, the faster your applications can write to DRBD. Conversely, the higher the sync speed, the slower your applications writing to disk will be. So keep this reasonably low. Generally, a good number is about 30% of the slower of your storage speed and replication network speed. If in doubt, 30M is a safe starting value.
  • If you manually adjust the syncer speed, it will not immediately change in /proc/drbd. It takes a while to change; be patient.
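
If you do decide to adjust the rate by hand, for example to let the initial sync finish sooner, DRBD 8.3 lets you override the configured rate on the fly and then fall back to the value in the configuration files. A sketch, assuming you are willing to give the initial sync roughly 80 MB/sec:

# Temporarily raise the resync rate for both resources (this is not persistent)
drbdsetup /dev/drbd0 syncer -r 80M
drbdsetup /dev/drbd1 syncer -r 80M

# Once the initial sync is done, return to the rate set in the configuration files
drbdadm adjust r{0,1}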

The good thing about DRBD is that we do not have to wait for the resources to be synchronized. So long as one of the resources is UpToDate, both nodes will work. If the Inconsistent node needs to read data, it will simply read it from its peer.

It is worth noting though; if the UpToDate node disconnects or disappears, the Inconsistent node will immediately demote to Secondary, making it unusable. This is the biggest reason for wanting the initial synchronization to complete reasonably quickly; the cluster can not be considered redundant until both nodes are UpToDate.

So with this understood, let's get back to work. The resources can synchronize in the background.

In order for a DRBD resource to be usable, it has to be "promoted". By default, DRBD resources start in the Secondary state. This means that they will receive changes from the peer, but no local changes can be made. You can't even look at the contents of a Secondary resource. Why this is the case requires more discussion than we can go into here.

So the next step is to promote both resources on both nodes.

an-a05n01
drbdadm primary r{0,1}

Output from /proc/drbd

version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2013-09-27 16:00:43
 0: cs:SyncSource ro:Primary/Primary ds:UpToDate/Inconsistent C r-----
    ns:20010808 nr:0 dw:0 dr:20011804 al:0 bm:1221 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:509493692
        [>....................] sync'ed:  3.8% (497552/517092)M
        finish: 9:01:50 speed: 15,660 (14,680) K/sec
 1: cs:SyncSource ro:Primary/Primary ds:UpToDate/Inconsistent C r-----
    ns:18860984 nr:0 dw:0 dr:18861980 al:0 bm:1151 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:282221684
        [>...................] sync'ed:  6.3% (275604/294024)M
        finish: 2:31:28 speed: 31,036 (13,836) K/sec
an-a05n02
drbdadm primary r{0,1}

Output from /proc/drbd

version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2013-09-27 16:00:43
 0: cs:SyncTarget ro:Primary/Primary ds:Inconsistent/UpToDate C r-----
    ns:0 nr:20010808 dw:20010752 dr:608 al:0 bm:1221 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:509493692
        [>....................] sync'ed:  3.8% (497552/517092)M
        finish: 11:06:52 speed: 12,724 (14,584) want: 30,720 K/sec
 1: cs:SyncTarget ro:Primary/Primary ds:Inconsistent/UpToDate C r-----
    ns:0 nr:19152824 dw:19152768 dr:608 al:0 bm:1168 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:281929844
        [>...................] sync'ed:  6.4% (275320/294024)M
        finish: 2:27:30 speed: 31,844 (13,956) want: 30,720 K/sec

Notice how the roles have changed to ro:Primary/Primary? That tells us that DRBD is now ready to be used on both nodes!

At this point, we're done setting up DRBD!

Note: Stopping DRBD while a synchronization is running is fine. When DRBD starts back up, it will pick up where it left off.

Eventually, the next day in the case of our cluster, the synchronization will complete. This is what it looks like once it's finished. From this point on, application writes to the DRBD resources will get the full performance your storage and network have to offer.

an-a05n01
cat /proc/drbd
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2013-09-27 16:00:43
 0: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r-----
    ns:413259760 nr:0 dw:20 dr:413261652 al:1 bm:25224 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
 1: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r-----
    ns:188464424 nr:0 dw:20 dr:188465928 al:1 bm:11504 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
an-a05n02
cat /proc/drbd
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2013-09-27 16:00:43
 0: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r-----
    ns:0 nr:413259760 dw:413259600 dr:944 al:0 bm:25224 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
 1: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r-----
    ns:0 nr:188464424 dw:188464264 dr:876 al:0 bm:11504 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

In the next section, we're going to start working on clvmd. You will want to stop watch'ing cat /proc/drbd and go back to tail'ing /var/log/messages now.
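
If you closed that terminal earlier, restarting the log watch is a one-liner; the -n 0 simply skips the existing log history so that only new messages are shown:

tail -f -n 0 /var/log/messages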

Initializing Clustered Storage

Before we can provision the first virtual machine, we must create the storage that will back our servers. This will take a few steps:

  • Configuring LVM's clustered locking and creating the PVs, VGs and LVs
  • Formatting and configuring the shared GFS2 partition.
  • Adding storage to the cluster's resource management.

Clustered Logical Volume Management

We will assign both DRBD resources to be managed by clustered LVM. This isn't strictly needed for the GFS2 partition, as it uses DLM directly. However, the flexibility of LVM is very appealing, and will make later growth of the GFS2 partition quite trivial, should the need arise.

The real reason for clustered LVM in our cluster is to provide DLM-backed locking to the partitions, or logical volumes in LVM, that will be used to back our VMs. Of course, the flexibility of LVM managed storage is enough of a win to justify using LVM for our VMs in itself, and shouldn't be ignored here.

Configuring Clustered LVM Locking

Note: We're going to edit the configuration on an-a05n01. When we're done, we'll copy the configuration files to an-a05n02.

Before we create the clustered LVM, we need to first make three changes to the LVM configuration:

  • We need to filter out the DRBD backing devices so that LVM doesn't see the same signature a second time on the DRBD resource's backing device.
  • Switch from local locking to clustered locking.
  • Prevent fall-back to local locking when the cluster is not available.

Start by making a backup of lvm.conf and then begin editing it.

an-a05n01 an-a05n02
rsync -av /etc/lvm /root/backups/
sending incremental file list
lvm/
lvm/lvm.conf
lvm/archive/
lvm/backup/
lvm/cache/

sent 37728 bytes  received 47 bytes  75550.00 bytes/sec
total size is 37554  speedup is 0.99
rsync -av /etc/lvm /root/backups/
sending incremental file list
lvm/
lvm/lvm.conf
lvm/archive/
lvm/backup/
lvm/cache/

sent 37728 bytes  received 47 bytes  75550.00 bytes/sec
total size is 37554  speedup is 0.99

Now we're ready to edit lvm.conf.

an-a05n01
vim /etc/lvm/lvm.conf

The configuration option to filter out the DRBD backing device is, surprisingly, filter = [ ... ]. By default, it is set to allow everything via the "a/.*/" regular expression. We're only using DRBD in our LVM, so we're going to flip that to reject everything except DRBD by changing the regex to "a|/dev/drbd*|", "r/.*/".

an-a05n01
    # We're only using LVM on DRBD resource.
    filter = [ "a|/dev/drbd*|", "r/.*/" ]

For the locking, we're going to change the locking_type from 1 (local locking) to 3 (clustered locking). This is what tells LVM to use DLM and gives us the "clustered" in clvm.

an-a05n01
    locking_type = 3

Lastly, we're also going to disallow fall-back to local locking. Normally, LVM would try to access a clustered LVM VG using local locking if DLM is not available. We want to prevent any access to the clustered LVM volumes except when the DLM is itself running. This is done by changing fallback_to_local_locking to 0.

an-a05n01
    fallback_to_local_locking = 0

Save the changes, then let's run a diff against our backup to see a summary of the changes.

an-a05n01
diff -U0 /root/backups/lvm/lvm.conf /etc/lvm/lvm.conf
--- /root/backups/lvm/lvm.conf	2013-10-10 09:40:04.000000000 -0400
+++ /etc/lvm/lvm.conf	2013-10-31 00:21:36.196228144 -0400
@@ -67,2 +67,2 @@
-    # By default we accept every block device:
-    filter = [ "a/.*/" ]
+    # We're only using LVM on DRBD resource.
+    filter = [ "a|/dev/drbd*|", "r/.*/" ]
@@ -408 +408 @@
-    locking_type = 1
+    locking_type = 3
@@ -424 +424 @@
-    fallback_to_local_locking = 1
+    fallback_to_local_locking = 0

Perfect! Now copy the modified lvm.conf file to the other node.

an-a05n01
rsync -av /etc/lvm/lvm.conf root@an-a05n02:/etc/lvm/
sending incremental file list
lvm.conf

sent 2399 bytes  received 355 bytes  5508.00 bytes/sec
total size is 37569  speedup is 13.64
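
If you prefer to script these three edits on future builds rather than open vim, the same changes can be made with sed. This is only a sketch; it assumes the stock EL6 default lines shown in the diff above. Diff the result and rsync it to the peer, just as we did above:

sed -i \
    -e 's#^\([[:space:]]*\)filter = \[ "a/\.\*/" \]#\1filter = [ "a|/dev/drbd*|", "r/.*/" ]#' \
    -e 's#^\([[:space:]]*\)locking_type = 1#\1locking_type = 3#' \
    -e 's#^\([[:space:]]*\)fallback_to_local_locking = 1#\1fallback_to_local_locking = 0#' \
    /etc/lvm/lvm.conf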

Testing the clvmd Daemon

A little later on, we're going to put clustered LVM under the control of rgmanager. Before we can do that though, we need to start it manually so that we can use it to create the LV that will back the GFS2 /shared partition. We will also be adding this partition to rgmanager, once it has been created.

Before we start the clvmd daemon, we'll want to ensure that the cluster is running.

an-a05n01
cman_tool nodes
Node  Sts   Inc   Joined               Name
   1   M     64   2013-10-30 22:40:07  an-a05n01.alteeve.ca
   2   M     64   2013-10-30 22:40:07  an-a05n02.alteeve.ca

It is, and both nodes are members. We can start the clvmd daemon now.

an-a05n01 an-a05n02
/etc/init.d/clvmd start
Starting clvmd: 
Activating VG(s):   No volume groups found
                                                           [  OK  ]
/etc/init.d/clvmd start
Starting clvmd: 
Activating VG(s):   No volume groups found
                                                           [  OK  ]

We've not created any volume groups yet, so that complaint about not finding any is expected.

We can now use dlm_tool to verify that a DLM lock space has been created for clvmd. If it has, we're good to go.

an-a05n01 an-a05n02
dlm_tool ls
dlm lockspaces
name          clvmd
id            0x4104eefa
flags         0x00000000 
change        member 2 joined 1 remove 0 failed 0 seq 2,2
members       1 2
dlm_tool ls
dlm lockspaces
name          clvmd
id            0x4104eefa
flags         0x00000000 
change        member 2 joined 1 remove 0 failed 0 seq 1,1
members       1 2

Looking good!

Initialize our DRBD Resources for use as LVM PVs

This is the first time we're actually going to use DRBD and clustered LVM, so we need to make sure that both are started.

First, check drbd.

an-a05n01
/etc/init.d/drbd status
drbd driver loaded OK; device status:
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2013-09-27 16:00:43
m:res  cs          ro               ds                     p  mounted  fstype
...    sync'ed:    19.4%            (416880/517092)M
...    sync'ed:    32.4%            (198972/294024)M
0:r0   SyncSource  Primary/Primary  UpToDate/Inconsistent  C
1:r1   SyncSource  Primary/Primary  UpToDate/Inconsistent  C
an-a05n02
/etc/init.d/drbd status
drbd driver loaded OK; device status:
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2013-09-27 16:00:43
m:res  cs          ro               ds                     p  mounted  fstype
...    sync'ed:    19.4%            (416880/517092)M
...    sync'ed:    32.4%            (198956/294024)M
0:r0   SyncTarget  Primary/Primary  Inconsistent/UpToDate  C
1:r1   SyncTarget  Primary/Primary  Inconsistent/UpToDate  C

It's up and both resources are Primary/Primary, so we're ready.

Now to check on clvmd.

an-a05n01
/etc/init.d/clvmd status
clvmd (pid  13936) is running...
Clustered Volume Groups: (none)
Active clustered Logical Volumes: (none)
an-a05n02
/etc/init.d/clvmd status
clvmd (pid  13894) is running...
Clustered Volume Groups: (none)
Active clustered Logical Volumes: (none)

It's up and running. As we did earlier, we can also verify with dlm_tool ls if we wish.

Before we can use LVM, clustered or otherwise, we need to initialize one or more raw storage devices called "Physical Volumes". This is done using the pvcreate command. We're going to do this on an-a05n01, then run pvscan on an-a05n02. We should see the newly initialized DRBD resources appear.

First, let's verify that, indeed, we have no existing PVs. We'll do this with pvscan, a tool that looks at block devices for physical volumes it may not yet have seen.

Running pvscan first, we'll see that no PVs have been created.

an-a05n01
pvscan
  No matching physical volumes found
an-a05n02
pvscan
  No matching physical volumes found

Now we'll run pvcreate on an-a05n01 against both DRBD devices. This will "sign" the devices and tell LVM that it can use them in the VGs we'll soon create. On the other node, we'll run pvdisplay. If the "clustered" part of clvmd is working, an-a05n02 should immediately know about the new PVs without needing another pvscan.

an-a05n01
pvcreate /dev/drbd{0,1}
  Physical volume "/dev/drbd0" successfully created
  Physical volume "/dev/drbd1" successfully created
an-a05n02
pvdisplay
  "/dev/drbd0" is a new physical volume of "504.97 GiB"
  --- NEW Physical volume ---
  PV Name               /dev/drbd0
  VG Name               
  PV Size               504.97 GiB
  Allocatable           NO
  PE Size               0   
  Total PE              0
  Free PE               0
  Allocated PE          0
  PV UUID               w2mbVu-7R3P-6j6t-Jpyd-M3SA-tzZt-kRj6uY
   
  "/dev/drbd1" is a new physical volume of "287.13 GiB"
  --- NEW Physical volume ---
  PV Name               /dev/drbd1
  VG Name               
  PV Size               287.13 GiB
  Allocatable           NO
  PE Size               0   
  Total PE              0
  Free PE               0
  Allocated PE          0
  PV UUID               ELfiwP-ZqPT-OMSy-SD26-Jmt0-CTB3-z3CTmP

If this was normal LVM, an-a05n02 would not have seen the new PVs. Because DRBD replicated the changes and clustered LVM alerted the peer though, it immediately knew about the changes.

Pretty neat!

Creating Cluster Volume Groups

As with initializing the DRBD resource above, we will create our volume groups, called VGs, on an-a05n01 only. As with the PVs, we will again be able to see them on both nodes immediately.

Let's verify that no previously-unseen VGs exist using the vgscan command.

an-a05n01
vgscan
  Reading all physical volumes.  This may take a while...
  No volume groups found
an-a05n02
vgscan
  Reading all physical volumes.  This may take a while...
  No volume groups found

Now to create the VGs, we'll use the vgcreate command with the -c y switch, which tells LVM to make the VG a clustered VG. Note that when the clvmd daemon is running, -c y is implied. However, it's best to get into the habit of being extra careful and thorough. If there is a problem, like clvmd not running for example, the explicit switch will trigger an error and we avoid hassles later.

Note: If you plan to use the cluster dashboard, it is important that the volume group names match those below. If you do not do this, you may have trouble provisioning new servers via the dashboard's user interface.

We're going to use the volume group naming convention of:

  • <node>_vgX
    • The <node> matches the node that will become home to the servers using this storage pool.
    • The vgX is a simple sequence, starting at 0. If you ever need to add space to an existing storage pool, you can create a new DRBD resource, sign it as a PV and either assign it directly to the existing volume group or increment this number and create a second storage pool for the associated node (a sketch of the first option follows this list).
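
To make that "assign it directly to the existing volume group" option concrete, the commands would look something like the following. This is purely illustrative; there is no /dev/drbd2 in this build:

# Hypothetical: a new replicated device, /dev/drbd2, exists and is Primary on both nodes
pvcreate /dev/drbd2
vgextend an-a05n01_vg0 /dev/drbd2
vgdisplay an-a05n01_vg0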

Earlier, while planning our partition sizes, we decided that /dev/drbd0 would back the servers designed to run on an-a05n01. So we'll create a volume group called an-a05n01_vg0 that uses the /dev/drbd0 physical volume.

Likewise, we decided that /dev/drbd1 would be used for the servers designed to run on an-a05n02. So we'll create a volume group called an-a05n02_vg0.

On an-a05n01, create both of our new VGs!

an-a05n01
vgcreate -c y an-a05n01_vg0 /dev/drbd0
vgcreate -c y an-a05n02_vg0 /dev/drbd1
  Clustered volume group "an-a05n01_vg0" successfully created
  Clustered volume group "an-a05n02_vg0" successfully created
an-a05n02
vgdisplay
  --- Volume group ---
  VG Name               an-a05n02_vg0
  System ID             
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  1
  VG Access             read/write
  VG Status             resizable
  Clustered             yes
  Shared                no
  MAX LV                0
  Cur LV                0
  Open LV               0
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               287.13 GiB
  PE Size               4.00 MiB
  Total PE              73506
  Alloc PE / Size       0 / 0   
  Free  PE / Size       73506 / 287.13 GiB
  VG UUID               1h5Gzk-6UX6-xvUo-GWVH-ZMFM-YLop-dYiC7L
   
  --- Volume group ---
  VG Name               an-a05n01_vg0
  System ID             
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  1
  VG Access             read/write
  VG Status             resizable
  Clustered             yes
  Shared                no
  MAX LV                0
  Cur LV                0
  Open LV               0
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               504.97 GiB
  PE Size               4.00 MiB
  Total PE              129273
  Alloc PE / Size       0 / 0   
  Free  PE / Size       129273 / 504.97 GiB
  VG UUID               TzKBFn-xBVB-e9AP-iL1l-AvQi-mZiV-86KnSF

Good! Now as a point of note, let's look again at pvdisplay on an-a05n01 (we know it will be the same on an-a05n02).

an-a05n01
pvdisplay
  --- Physical volume ---
  PV Name               /dev/drbd1
  VG Name               an-a05n02_vg0
  PV Size               287.13 GiB / not usable 1.99 MiB
  Allocatable           yes 
  PE Size               4.00 MiB
  Total PE              73506
  Free PE               73506
  Allocated PE          0
  PV UUID               ELfiwP-ZqPT-OMSy-SD26-Jmt0-CTB3-z3CTmP
   
  --- Physical volume ---
  PV Name               /dev/drbd0
  VG Name               an-a05n01_vg0
  PV Size               504.97 GiB / not usable 2.18 MiB
  Allocatable           yes 
  PE Size               4.00 MiB
  Total PE              129273
  Free PE               129273
  Allocated PE          0
  PV UUID               w2mbVu-7R3P-6j6t-Jpyd-M3SA-tzZt-kRj6uY

Notice that VG Name now has a value where it didn't before? This shows us that each PV has been allocated to a volume group.

That's it for the volume groups!

Creating a Logical Volume

The last LVM step, for now, is to create a "logical volume" carved from the an-a05n01_vg0 volume group. This will be used in the next step as the volume for our /shared GFS2 partition.

For thoroughness, let's scan for any previously unseen logical volumes using lvscan.

an-a05n01
lvscan
# nothing printed
an-a05n02
lvscan
# nothing printed

None found, as expected. So let's create our 40 GB logical volume for our /shared GFS2 partition. We'll do this by specifying how large we want the new logical volume to be, what name we want to give it and what volume group to carve the space out of. The resulting logical volume will then be /dev/<vg>/<lv>. Here, we're taking space from an-a05n01_vg0 and we'll call this LV shared, so the resulting volume will be /dev/an-a05n01_vg0/shared.

an-a05n01
lvcreate -L 40G -n shared an-a05n01_vg0
  Logical volume "shared" created
an-a05n02
lvdisplay
  --- Logical volume ---
  LV Path                /dev/an-a05n01_vg0/shared
  LV Name                shared
  VG Name                an-a05n01_vg0
  LV UUID                f0w1J0-6aTz-0Bz0-SX57-pstr-g5qu-SAGGSS
  LV Write Access        read/write
  LV Creation host, time an-a05n01.alteeve.ca, 2013-10-31 17:07:50 -0400
  LV Status              available
  # open                 0
  LV Size                40.00 GiB
  Current LE             10240
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:0

Perfect. We can now create our GFS2 partition!

Creating the Shared GFS2 Partition

Warning: Red Hat does NOT support using SELinux and GFS2 together. The principal reason for this is the performance degradation caused by the additional storage overhead required for SELinux to operate. We decided to enable SELinux in the Anvil! anyway because of how infrequently the partition is changed. In our case, performance is not a concern. However, if you need to be 100% in compliance with what Red Hat supports, you will need to disable SELinux.
Note: This section assumes that cman, drbd and clvmd are running.

The GFS2-formatted /dev/an-a05n01_vg0/shared partition will be mounted at /shared on both nodes and it will be used for four main purposes:

  • /shared/files; Storing files like ISO images needed when installing server operating systems and mounting "DVDs" into the virtual DVD-ROM drives.
  • /shared/provision; Storing short scripts used to call virt-install which handles the creation of new servers.
  • /shared/definitions; This is where the XML definition files which define the virtual hardware backing our servers will be kept. This is the most important directory as the cluster and dashboard will look here when starting, migrating and recovering servers.
  • /shared/archive; This is used to store old copies of the XML definition files and provision scripts.

Formatting the logical volume is much like formatting a traditional file system on a traditional partition. There are a few extra arguments needed though. Let's look at them first.

The following switches will be used with our mkfs.gfs2 call:

  • -p lock_dlm; This tells GFS2 to use DLM for its clustered locking.
  • -j 2; This tells GFS2 to create two journals. This must match the number of nodes that will try to mount this partition at any one time.
  • -t an-anvil-05:shared; This is the lock space name, which must be in the format <cluster_name>:<file-system_name>. The <cluster_name> must match the one in cluster.conf. The <file-system_name> has to be unique in the cluster, which is easy for us because we'll only have the one gfs2 file system.

Once we've formatted the partition, we'll use a program called gfs2_tool on an-a05n02 to query the new partition's superblock. We're going to use it shortly in some bash magic to pull out the UUID and feed it into a string formatted for /etc/fstab. More importantly here, it shows us that the second node sees the new file system.

Note: Depending on the size of the new partition, this call could take a while to complete. Please be patient.
an-a05n01
mkfs.gfs2 -p lock_dlm -j 2 -t an-anvil-05:shared /dev/an-a05n01_vg0/shared
This will destroy any data on /dev/an-a05n01_vg0/shared.
It appears to contain: symbolic link to `../dm-0'
Are you sure you want to proceed? [y/n] y
Device:                    /dev/an-a05n01_vg0/shared
Blocksize:                 4096
Device Size                40.00 GB (10485760 blocks)
Filesystem Size:           40.00 GB (10485758 blocks)
Journals:                  2
Resource Groups:           160
Locking Protocol:          "lock_dlm"
Lock Table:                "an-anvil-05:shared"
UUID:                      774883e8-d0fe-a068-3969-4bb7dc679960
an-a05n02
gfs2_tool sb /dev/an-a05n01_vg0/shared all
  mh_magic = 0x01161970
  mh_type = 1
  mh_format = 100
  sb_fs_format = 1801
  sb_multihost_format = 1900
  sb_bsize = 4096
  sb_bsize_shift = 12
  no_formal_ino = 2
  no_addr = 23
  no_formal_ino = 1
  no_addr = 22
  sb_lockproto = lock_dlm
  sb_locktable = an-anvil-05:shared
  uuid = 774883e8-d0fe-a068-3969-4bb7dc679960

Very nice.

Now we need to create a mount point for the new file system and then mount it on both nodes.

an-a05n01
mkdir /shared
mount /dev/an-a05n01_vg0/shared /shared/
df -hP
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda2              40G  1.7G   36G   5% /
tmpfs                  12G   29M   12G   1% /dev/shm
/dev/sda1             485M   51M  409M  12% /boot
/dev/mapper/an--a05n01_vg0-shared   40G  259M   40G   1% /shared
an-a05n02
mkdir /shared
mount /dev/an-a05n01_vg0/shared /shared/
df -hP
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda2              40G  1.7G   36G   5% /
tmpfs                  12G   26M   12G   1% /dev/shm
/dev/sda1             485M   51M  409M  12% /boot
/dev/mapper/an--a05n01_vg0-shared   40G  259M   40G   1% /shared

Note that the path under Filesystem is different from what we used when creating the GFS2 partition. This is an effect of Device Mapper, which is used by LVM to create symlinks to the actual block device paths. If we look at our /dev/an-a05n01_vg0/shared device and the device from df, /dev/mapper/an--a05n01_vg0-shared, we'll see that they both point to the same actual block device.

an-a05n01
ls -lah /dev/an-a05n01_vg0/shared /dev/mapper/an--a05n01_vg0-shared
lrwxrwxrwx. 1 root root 7 Oct 31 17:07 /dev/an-a05n01_vg0/shared -> ../dm-0
lrwxrwxrwx. 1 root root 7 Oct 31 17:07 /dev/mapper/an--a05n01_vg0-shared -> ../dm-0

Note the l at the beginning of the files' mode? That tells us that these are links. The -> ../dm-0 shows where they point to. If we look at /dev/dm-0, we see its mode line begins with a b, telling us that it is an actual block device.

an-a05n01
ls -lah /dev/dm-0
brw-rw----. 1 root disk 253, 0 Oct 31 17:27 /dev/dm-0

If you're curious, you can use dmsetup to gather more information about the device mapper devices. Let's take a look.

an-a05n01
dmsetup info
Name:              an--a05n01_vg0-shared
State:             ACTIVE
Read Ahead:        256
Tables present:    LIVE
Open count:        1
Event number:      0
Major, minor:      253, 0
Number of targets: 1
UUID: LVM-TzKBFnxBVBe9APiL1lAvQimZiV86KnSFf0w1J06aTz0Bz0SX57pstrg5quSAGGSS

Here we see the link back to the LV.

Adding /shared to /etc/fstab

Warning: We're going to edit /etc/fstab. Breaking this file may leave your system unbootable! As always, practice on unimportant nodes until you are comfortable with this process.

In order for the /etc/init.d/gfs2 initialization script to work, it must be able to find the GFS2 partition in the file system table, /etc/fstab. The operating system reads this file when it is booting, looking for file systems to mount. As such, this is a critical system file and breaking it can leave a node either unable to boot, or booting into the single user recovery console.

So please proceed carefully.

First up, let's backup /etc/fstab.

an-a05n01
rsync -av /etc/fstab /root/backups/
sending incremental file list
fstab

sent 878 bytes  received 31 bytes  1818.00 bytes/sec
total size is 805  speedup is 0.89
an-a05n02
rsync -av /etc/fstab /root/backups/
sending incremental file list
fstab

sent 878 bytes  received 31 bytes  1818.00 bytes/sec
total size is 805  speedup is 0.89

Adding a new entry to the fstab requires a carefully crafted line. You can read about this in detail by typing man fstab. In short though, each line is made up of six space-separated values;

  1. This is the device (by path or by UUID). We will be using the partition's UUID here.
  2. This is the mount point for the file system. For this entry, that will be /shared.
  3. This tells the OS what file system this partition is. For us, we'll set gfs2.
  4. These are the mount options. Usually this is defaults, which implies a standard set of options. We're going to add a couple of other options to modify this, which we'll discuss shortly.
  5. This tells the dump program whether to back this file system up or not. It's not usually used except with ext2 or ext3 file systems. Even then, it's rarely used any more. We will set this to 0 which disables this.
  6. This last field sets the order in which boot-time fsck (file system checks) run. This file system is never available at boot, so the only sensible value here is 0.

With all this, we can now build our fstab entry.

First, we need to query the file system's UUID.

an-a05n01
gfs2_tool sb /dev/an-a05n01_vg0/shared uuid
current uuid = 774883e8-d0fe-a068-3969-4bb7dc679960

We only need the UUID, so let's filter out the parts we don't want by using awk, which splits a line up on spaces.

an-a05n01
gfs2_tool sb /dev/an-a05n01_vg0/shared uuid | awk '{ print $4; }'
774883e8-d0fe-a068-3969-4bb7dc679960

We need to make sure that the UUID is lower-case. It is already, but we can make sure it's always lower case by using sed.

an-a05n01
gfs2_tool sb /dev/an-a05n01_vg0/shared uuid | awk '{ print $4; }' | sed -e "s/\(.*\)/\L\1\E/"
774883e8-d0fe-a068-3969-4bb7dc679960

When specifying a device in /etc/fstab by UUID instead of using a device path, we need to prefix the entry with UUID=. We can expand on our sed call to do this.

an-a05n01
gfs2_tool sb /dev/an-a05n01_vg0/shared uuid | awk '{ print $4; }' | sed -e "s/\(.*\)/UUID=\L\1\E/"
UUID=774883e8-d0fe-a068-3969-4bb7dc679960

Generally, all but the last two values are separated by tabs. We know that the second field is the mount point for this file system, which is /shared in this case. Let's expand the sed call to add a tab followed by the mount point.

an-a05n01
gfs2_tool sb /dev/an-a05n01_vg0/shared uuid | awk '{ print $4; }' | sed -e "s/\(.*\)/UUID=\L\1\E\t\/shared/"
UUID=774883e8-d0fe-a068-3969-4bb7dc679960	/shared

The third entry is the file system type, gfs2 in our case. Let's add another tab and the gfs2 word.

an-a05n01
gfs2_tool sb /dev/an-a05n01_vg0/shared uuid | awk '{ print $4; }' | sed -e "s/\(.*\)/UUID=\L\1\E\t\/shared\tgfs2/"
UUID=774883e8-d0fe-a068-3969-4bb7dc679960	/shared	gfs2

Next up are the file system options. GFS2, being a clustered file system, requires cluster locking. Cluster locks are, relative to non-clustered internal locks, fairly slow, so we want to reduce the number of writes that hit the partition. Normally, every time you look at a file or directory, a field called "access time", or "atime" for short, gets updated. This is actually a write, which would in turn require a DLM lock. Few people care about access times, so we're going to disable those updates for both files and directories. We're going to append a couple of options to do this, giving us the option string defaults,noatime,nodiratime. Let's add them to our growing sed call.

an-a05n01
gfs2_tool sb /dev/an-a05n01_vg0/shared uuid | awk '{ print $4; }' | sed -e "s/\(.*\)/UUID=\L\1\E\t\/shared\tgfs2\tdefaults,noatime,nodiratime/"
UUID=774883e8-d0fe-a068-3969-4bb7dc679960	/shared	gfs2	defaults,noatime,nodiratime

All that is left now are the last two values. We're going to separate these with a single space. Let's finish off the fstab entry with one last addition to our sed.

an-a05n01
gfs2_tool sb /dev/an-a05n01_vg0/shared uuid | awk '{ print $4; }' | sed -e "s/\(.*\)/UUID=\L\1\E\t\/shared\tgfs2\tdefaults,noatime,nodiratime\t0 0/"
UUID=774883e8-d0fe-a068-3969-4bb7dc679960	/shared	gfs2	defaults,noatime,nodiratime	0 0

That's it!

Now, we can add it by simply copying and pasting this line into the file directly. Another bash trick though, as we saw in the SSH section, is using bash redirection to append the output of one program onto the end of a file. We'll do a diff immediately after to see that the line was appended properly.
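
If the difference between the two redirection operators is new to you, here is a harmless way to see it using a throw-away file instead of /etc/fstab:

echo "first line"   > /tmp/redirect-test    # '>' truncates (or creates) the file, then writes
echo "second line" >> /tmp/redirect-test    # '>>' appends to the end of the file
cat /tmp/redirect-test

The cat will show both lines. Had the second echo used a single '>', the first line would have been lost.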

Note: Be sure to use two >> brackets! A single ">" bracket says "overwrite". Two ">>" brackets says "append".
an-a05n01
gfs2_tool sb /dev/an-a05n01_vg0/shared uuid | awk '/uuid =/ { print $4; }' | sed -e "s/\(.*\)/UUID=\L\1\E \/shared\t\tgfs2\tdefaults,noatime,nodiratime\t0 0/" >> /etc/fstab
diff -u /root/backups/fstab /etc/fstab
--- /root/backups/fstab	2013-10-28 12:30:07.000000000 -0400
+++ /etc/fstab	2013-11-01 01:17:33.865210115 -0400
@@ -13,3 +13,4 @@
 devpts                  /dev/pts                devpts  gid=5,mode=620  0 0
 sysfs                   /sys                    sysfs   defaults        0 0
 proc                    /proc                   proc    defaults        0 0
+UUID=774883e8-d0fe-a068-3969-4bb7dc679960 /shared		gfs2	defaults,noatime,nodiratime	0 0
an-a05n02
gfs2_tool sb /dev/an-a05n01_vg0/shared uuid | awk '/uuid =/ { print $4; }' | sed -e "s/\(.*\)/UUID=\L\1\E \/shared\t\tgfs2\tdefaults,noatime,nodiratime\t0 0/" >> /etc/fstab
diff -u /root/backups/fstab /etc/fstab
--- /root/backups/fstab	2013-10-28 12:18:04.000000000 -0400
+++ /etc/fstab	2013-11-01 01:14:39.035500695 -0400
@@ -13,3 +13,4 @@
 devpts                  /dev/pts                devpts  gid=5,mode=620  0 0
 sysfs                   /sys                    sysfs   defaults        0 0
 proc                    /proc                   proc    defaults        0 0
+UUID=774883e8-d0fe-a068-3969-4bb7dc679960 /shared		gfs2	defaults,noatime,nodiratime	0 0

This looks good. Note that for this diff, we used the -u option. This shows a couple lines on either side of the changes. We see the existing entries above the new one, so we know we didn't accidentally over-write the existing data.

Now we need to make sure that the /etc/init.d/gfs2 initialization script can see the new partition. If it can, we know the /etc/fstab entry works properly.

an-a05n01
/etc/init.d/gfs2 status
Configured GFS2 mountpoints: 
/shared
Active GFS2 mountpoints: 
/shared
an-a05n02
/etc/init.d/gfs2 status
Configured GFS2 mountpoints: 
/shared
Active GFS2 mountpoints: 
/shared

That works.

The last test is to create the sub-directories we talked about earlier. We'll do this on an-a05n01, then we will do a simple ls on an-a05n02. If everything is working properly, we should see the new directories immediately.

an-a05n01
mkdir /shared/{definitions,provision,archive,files}
an-a05n02
ls -lah /shared/
total 40K
drwxr-xr-x.  6 root root 3.8K Nov  1 01:23 .
dr-xr-xr-x. 24 root root 4.0K Oct 31 21:02 ..
drwxr-xr-x.  2 root root 3.8K Nov  1 01:23 archive
drwxr-xr-x.  2 root root 3.8K Nov  1 01:23 definitions
drwxr-xr-x.  2 root root 3.8K Nov  1 01:23 files
drwxr-xr-x.  2 root root 3.8K Nov  1 01:23 provision

Fantastic!

Our clustered storage is complete. The last thing we need to do now is put the clustered storage under rgmanager's control.

Stopping All Clustered Storage Components

In the next step, we're going to put gfs2, clvmd and drbd under the cluster's control. Let's stop these daemons now so that we can watch rgmanager start them shortly.

an-a05n01
/etc/init.d/gfs2 stop && /etc/init.d/clvmd stop && /etc/init.d/drbd stop
Unmounting GFS2 filesystem (/shared):                      [  OK  ]
Deactivating clustered VG(s):   0 logical volume(s) in volume group "an-a05n02_vg0" now active
  0 logical volume(s) in volume group "an-a05n01_vg0" now active
                                                           [  OK  ]
Signaling clvmd to exit                                    [  OK  ]
clvmd terminated                                           [  OK  ]
Stopping all DRBD resources: .
an-a05n02
/etc/init.d/gfs2 stop && /etc/init.d/clvmd stop && /etc/init.d/drbd stop
Unmounting GFS2 filesystem (/shared):                      [  OK  ]
Deactivating clustered VG(s):   0 logical volume(s) in volume group "an-a05n02_vg0" now active
  clvmd not running on node an-a05n01.alteeve.ca
  0 logical volume(s) in volume group "an-a05n01_vg0" now active
  clvmd not running on node an-a05n01.alteeve.ca
                                                           [  OK  ]
Signaling clvmd to exit                                    [  OK  ]
clvmd terminated                                           [  OK  ]
Stopping all DRBD resources: .

Done.

Managing Storage In The Cluster

A little while back, we spoke about how the cluster is split into two components; cluster communication managed by cman and resource management provided by rgmanager. It is the latter which we will now begin to configure.

In the cluster.conf, the rgmanager component is contained within the <rm /> element tags. Within this element are three types of child elements. They are:

  • Fail-over Domains - <failoverdomains />;
    • These are optional constraints which allow control over which nodes, and under what circumstances, services may run. When not used, a service will be allowed to run on any node in the cluster without constraints or ordering.
  • Resources - <resources />;
    • Within this element, available resources are defined. Simply having a resource here will not put it under cluster control. Rather, it makes it available for use in <service /> elements.
  • Services - <service />;
    • This element contains one or more parallel or series child-elements which are themselves references to resources defined in the <resources /> element. When in parallel, the resources will start and stop at the same time. When in series, the resources start in order and stop in reverse order. We will also see a specialized type of service that uses the <vm /> element name which, as you can probably guess, is used for creating virtual machine services.

We'll look at each of these components in more detail shortly.

A Note on Daemon Starting

Note: Readers of the old tutorial will notice that libvirtd has been removed. We found that, on rare occasions, bleeding-edge client software, like modern versions of "Virtual Machine Manager" on Fedora workstations, connecting to the libvirtd daemon could cause it to crash. This didn't interfere with the servers, but the cluster would try to recover the storage stack, causing the service to enter a failed state. The servers kept running, but it left a mess to clean up that is easily avoided by simply removing libvirtd from the storage stack. To address this, we will monitor libvirtd as its own service. Should it fail, it will restart without impacting the storage daemons.

There are four daemons we will be putting under cluster control:

  • drbd; Replicated storage.
  • clvmd; Clustered LVM.
  • gfs2; Mounts and unmounts the configured GFS2 partition. We will manage this using the clusterfs resource agent.
  • libvirtd; Enables access to the KVM hypervisor via the libvirt suite of tools.

The reason we do not want to start these daemons with the system is so that we can let the cluster do it. This way, should any fail, the cluster will detect the failure and fail the entire service tree.

For example, let's say that drbd fails to start; rgmanager will fail the storage service and give up, rather than continue trying to start clvmd and the rest.

If we had left these daemons to start on boot, the failure of drbd would not affect the start-up of clvmd, which would then not find its PVs given that DRBD is down. The system would then try to start the gfs2 daemon, which would also fail as the LV backing the partition would not be available.
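
On EL6, the usual way to keep a SysV init script from starting with the system is chkconfig. As a quick sketch, checking the four daemons and disabling any that are set to start at boot would look like this:

# Show the current boot-time settings for each daemon
for daemon in drbd clvmd gfs2 libvirtd; do chkconfig --list $daemon; done

# Disable any that are set to start at boot, so that only rgmanager starts them
for daemon in drbd clvmd gfs2 libvirtd; do chkconfig $daemon off; done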

Defining the Resources

Note: All of these edits will be done on an-a05n01. Once we're done and the config has been validated, we'll use the cluster's cman_tool to push the update to an-a05n02 and update the running cluster's config.

Lets start by first defining our clustered resources.

As stated before, the addition of these resources does not, in itself, put the defined resources under the cluster's management. Instead, it defines resources, like init.d scripts, which can then be used by one or more <service /> elements, as we will see shortly. For now, it is enough to know that, until a resource is defined, it can not be used in the cluster.

Given that this is the first component of rgmanager being added to cluster.conf, we will be creating the parent <rm /> elements here as well.

Let's take a look at the new section, then discuss the parts.

an-a05n01
<?xml version="1.0"?>
<cluster name="an-anvil-05" config_version="8">
	<cman expected_votes="1" two_node="1" />
	<clusternodes>
		<clusternode name="an-a05n01.alteeve.ca" nodeid="1">
			<fence>
				<method name="ipmi">
					<device name="ipmi_n01" action="reboot" delay="15" />
				</method>
				<method name="pdu">
					<device name="pdu1" port="1" action="reboot" />
					<device name="pdu2" port="1" action="reboot" />
				</method>
			</fence>
		</clusternode>
		<clusternode name="an-a05n02.alteeve.ca" nodeid="2">
			<fence>
				<method name="ipmi">
					<device name="ipmi_n02" action="reboot" />
				</method>
				<method name="pdu">
					<device name="pdu1" port="2" action="reboot" />
					<device name="pdu2" port="2" action="reboot" />
				</method>
			</fence>
		</clusternode>
	</clusternodes>
	<fencedevices>
		<fencedevice name="ipmi_n01" agent="fence_ipmilan" ipaddr="an-a05n01.ipmi" login="admin" passwd="secret" />
		<fencedevice name="ipmi_n02" agent="fence_ipmilan" ipaddr="an-a05n02.ipmi" login="admin" passwd="secret" />
		<fencedevice agent="fence_apc_snmp" ipaddr="an-pdu01.alteeve.ca" name="pdu1" />
		<fencedevice agent="fence_apc_snmp" ipaddr="an-pdu02.alteeve.ca" name="pdu2" />
	</fencedevices>
	<fence_daemon post_join_delay="30" />
	<totem rrp_mode="none" secauth="off"/>
	<rm log_level="5">
		<resources>
			<script file="/etc/init.d/drbd" name="drbd"/>
			<script file="/etc/init.d/clvmd" name="clvmd"/>
			<clusterfs device="/dev/an-a05n01_vg0/shared" force_unmount="1" fstype="gfs2" mountpoint="/shared" name="sharedfs" />
			<script file="/etc/init.d/libvirtd" name="libvirtd"/>
		</resources>
	</rm>
</cluster>

First and foremost; Note that we've incremented the configuration version to 8. As always, "increment and then edit".

Let's focus on the new section;

an-a05n01
	<rm log_level="5">
		<resources>
			<script file="/etc/init.d/drbd" name="drbd"/>
			<script file="/etc/init.d/clvmd" name="clvmd"/>
			<clusterfs device="/dev/an-a05n01_vg0/shared" force_unmount="1" fstype="gfs2" mountpoint="/shared" name="sharedfs" />
			<script file="/etc/init.d/libvirtd" name="libvirtd"/>
		</resources>
	</rm>

We've added the attribute log_level="5" to the <rm> element to cut down on the log entries in /var/log/messages. Every 10 seconds, rgmanager calls /etc/init.d/$foo status on all script services. At the default log level, these checks are logged. So without this, every ten seconds, several status messages would be printed to the system log. That can make it difficult to tail the logs when testing or debugging.

The <resources>...</resources> element contains our four resources; three <script .../> resources and one <clusterfs .../> resource. The <script .../> resource type specifically handles the starting and stopping of init.d style scripts. That is, the script must exit with LSB compliant codes, and it must properly react to being called with the sole argument of start, stop or status.
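
To make that contract concrete, here is a bare-bones sketch of the shape rgmanager expects from a script resource. This is not a real daemon and is not part of our configuration; it only illustrates the start/stop/status handling and LSB-style exit codes:

#!/bin/bash
# Illustrative only; a real init script would start and stop an actual daemon.
case "$1" in
    start)
        # Start the daemon here; exit 0 on success, non-zero on failure.
        exit 0
        ;;
    stop)
        # Stop the daemon here; exit 0 on success.
        exit 0
        ;;
    status)
        # rgmanager calls this every ten seconds; exit 0 if running, 3 if stopped.
        exit 0
        ;;
    *)
        echo "Usage: $0 {start|stop|status}"
        exit 2
        ;;
esac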

There are many other types of resources which, with the exception of <vm .../>, we will not be looking at in this tutorial. Should you be interested in them, please look in /usr/share/cluster for the various scripts (executable files that end with .sh).

Each of our <script ... /> resources has two attributes:

  • file="..."; The full path to the script to be managed.
  • name="..."; A unique name used to reference this resource later on in the <service /> elements.

Other resources are more involved, but the <script .../> resources are quite simple.

Creating Failover Domains

Fail-over domains are, at their most basic, a collection of one or more nodes in the cluster with a particular set of rules associated with them. Services can then be configured to operate within the context of a given fail-over domain. There are a few key options to be aware of.

Fail-over domains are optional and can be left out of the cluster, generally speaking. However, in our cluster, we will need them for our storage services, as we will later see, so please do not skip this step.

  • A fail-over domain can be unordered or prioritized.
    • When unordered, a service will start on any node in the domain. Should that node later fail, it will restart on another random node in the domain.
    • When prioritized, a service will start on the available node with the highest priority in the domain. Should that node later fail, the service will restart on the available node with the next highest priority.
  • A fail-over domain can be restricted or unrestricted.
    • When restricted, a service is only allowed to start on, or restart on, a node in the domain. When no nodes are available, the service will be stopped.
    • When unrestricted, a service will try to start on, or restart on, a node in the domain. However, when no domain members are available, the cluster will pick another available node at random to start the service on.
  • A fail-over domain can have a fail-back policy.
    • When a domain allows for fail-back and the domain is ordered, and a node with a higher priority (re)joins the cluster, services within the domain will migrate to that higher-priority node. This allows for automated restoration of services on a failed node when it rejoins the cluster.
    • When a domain does not allow for fail-back, but is unrestricted, fail-back of services that fell out of the domain will happen anyway. That is to say, nofailback="1" is ignored if a service was running on a node outside of the fail-over domain and a node within the domain joins the cluster. However, once the service is on a node within the domain, the service will not relocate to a higher-priority node should one join the cluster later.
    • When a domain does not allow for fail-back and is restricted, then fail-back of services will never occur.

What we need to do at this stage is to create something of a hack. Let me explain;

As discussed earlier, we need to start a set of local daemons on all nodes. These aren't really clustered resources though, as they can only ever run on their host node. They will never be relocated or restarted elsewhere in the cluster and, as such, are not highly available. So to work around this desire to "cluster the unclusterable", we're going to create a fail-over domain for each node in the cluster. Each of these domains will have only one of the cluster nodes as a member, and the domain will be restricted, unordered and have no fail-back. With this configuration, any service group using it will only ever run on the one node in the domain.

In the next step, we will create a service group, then replicate it once for each node in the cluster. The only difference will be the failoverdomain each is set to use. With our configuration of two nodes then, we will have two fail-over domains, one for each node, and we will define the clustered storage service twice, each one using one of the two fail-over domains.

Let's look at the complete updated cluster.conf, then we will focus closer on the new section.

an-a05n01
<?xml version="1.0"?>
<cluster name="an-anvil-05" config_version="9">
	<cman expected_votes="1" two_node="1" />
	<clusternodes>
		<clusternode name="an-a05n01.alteeve.ca" nodeid="1">
			<fence>
				<method name="ipmi">
					<device name="ipmi_n01" action="reboot" delay="15" />
				</method>
				<method name="pdu">
					<device name="pdu1" port="1" action="reboot" />
					<device name="pdu2" port="1" action="reboot" />
				</method>
			</fence>
		</clusternode>
		<clusternode name="an-a05n02.alteeve.ca" nodeid="2">
			<fence>
				<method name="ipmi">
					<device name="ipmi_n02" action="reboot" />
				</method>
				<method name="pdu">
					<device name="pdu1" port="2" action="reboot" />
					<device name="pdu2" port="2" action="reboot" />
				</method>
			</fence>
		</clusternode>
	</clusternodes>
	<fencedevices>
		<fencedevice name="ipmi_n01" agent="fence_ipmilan" ipaddr="an-a05n01.ipmi" login="admin" passwd="secret" />
		<fencedevice name="ipmi_n02" agent="fence_ipmilan" ipaddr="an-a05n02.ipmi" login="admin" passwd="secret" />
		<fencedevice agent="fence_apc_snmp" ipaddr="an-pdu01.alteeve.ca" name="pdu1" />
		<fencedevice agent="fence_apc_snmp" ipaddr="an-pdu02.alteeve.ca" name="pdu2" />
	</fencedevices>
	<fence_daemon post_join_delay="30" />
	<totem rrp_mode="none" secauth="off"/>
	<rm log_level="5">
		<resources>
			<script file="/etc/init.d/drbd" name="drbd"/>
			<script file="/etc/init.d/clvmd" name="clvmd"/>
			<clusterfs device="/dev/an-a05n01_vg0/shared" force_unmount="1" fstype="gfs2" mountpoint="/shared" name="sharedfs" />
			<script file="/etc/init.d/libvirtd" name="libvirtd"/>
		</resources>
		<failoverdomains>
			<failoverdomain name="only_n01" nofailback="1" ordered="0" restricted="1">
				<failoverdomainnode name="an-a05n01.alteeve.ca"/>
			</failoverdomain>
			<failoverdomain name="only_n02" nofailback="1" ordered="0" restricted="1">
				<failoverdomainnode name="an-a05n02.alteeve.ca"/>
			</failoverdomain>
		</failoverdomains>
	</rm>
</cluster>

As always, the version was incremented, this time to 9. We've also added the new <failoverdomains>...</failoverdomains> element. Let's take a closer look at this new element.

an-a05n01
		<failoverdomains>
			<failoverdomain name="only_n01" nofailback="1" ordered="0" restricted="1">
				<failoverdomainnode name="an-a05n01.alteeve.ca"/>
			</failoverdomain>
			<failoverdomain name="only_n02" nofailback="1" ordered="0" restricted="1">
				<failoverdomainnode name="an-a05n02.alteeve.ca"/>
			</failoverdomain>
		</failoverdomains>

The first thing to note is that there are two <failoverdomain...>...</failoverdomain> child elements:

  • The first has the name only_n01 and contains only the node an-a05n01 as a member.
  • The second is effectively identical, save that the domain's name is only_n02 and it contains only the node an-a05n02 as a member.

The <failoverdomain ...> element has four attributes:

  • The name="..." attribute sets the unique name of the domain which we will later use to bind a service to the domain.
  • The nofailback="1" attribute tells the cluster to never "fail back" any services in this domain. This seems redundant, given there is only one node, but when combined with restricted="1", it prevents any migration of services.
  • The ordered="0" attribute is also somewhat redundant, given that there is only one node defined in the domain, but I don't like to leave attributes undefined, so I have set it here.
  • The restricted="1" attribute is key in that it tells the cluster to not try to restart services within this domain on any other nodes outside of the one defined in the fail-over domain.

Each of the <failoverdomain...> elements has a single <failoverdomainnode .../> child element. This is a very simple element which has, at this time, only one attribute:

  • name="..."; The name of the node to include in the fail-over domain. This name must match the corresponding <clusternode name="..." node name.

At this point, we're ready to finally create our clustered storage and libvirtd monitoring services.

Creating Clustered Storage and libvirtd Service

With the resources defined and the fail-over domains created, we can set about creating our services.

Generally speaking, services can have one or more resources within them. When two or more resources exist, they can be put into a dependency tree, used in parallel, or used as a combination of parallel and dependent resources.

When you create a service dependency tree, you put each dependent resource as a child element of its parent. The resources are then started in order, from the top of the tree down to the deepest child resource. If at any point one of the resources fails, the entire service will be declared failed and no attempt will be made to start any further child resources. Conversely, stopping the service will stop the deepest child resource first, then the second deepest, and so on up to the top resource. This is exactly the behaviour we want, as we will see shortly.

When resources are defined in parallel, all defined resources will be started at the same time. Should any one of the resources fail to start, the entire service will be declared failed. Stopping the service will likewise cause a simultaneous call to stop all resources.

As before, let's take a look at the entire updated cluster.conf file, then we'll focus in on the new service section.

an-a05n01
<?xml version="1.0"?>
<cluster name="an-anvil-05" config_version="10">
	<cman expected_votes="1" two_node="1" />
	<clusternodes>
		<clusternode name="an-a05n01.alteeve.ca" nodeid="1">
			<fence>
				<method name="ipmi">
					<device name="ipmi_n01" action="reboot" delay="15" />
				</method>
				<method name="pdu">
					<device name="pdu1" port="1" action="reboot" />
					<device name="pdu2" port="1" action="reboot" />
				</method>
			</fence>
		</clusternode>
		<clusternode name="an-a05n02.alteeve.ca" nodeid="2">
			<fence>
				<method name="ipmi">
					<device name="ipmi_n02" action="reboot" />
				</method>
				<method name="pdu">
					<device name="pdu1" port="2" action="reboot" />
					<device name="pdu2" port="2" action="reboot" />
				</method>
			</fence>
		</clusternode>
	</clusternodes>
	<fencedevices>
		<fencedevice name="ipmi_n01" agent="fence_ipmilan" ipaddr="an-a05n01.ipmi" login="admin" passwd="secret" />
		<fencedevice name="ipmi_n02" agent="fence_ipmilan" ipaddr="an-a05n02.ipmi" login="admin" passwd="secret" />
		<fencedevice agent="fence_apc_snmp" ipaddr="an-pdu01.alteeve.ca" name="pdu1" />
		<fencedevice agent="fence_apc_snmp" ipaddr="an-pdu02.alteeve.ca" name="pdu2" />
	</fencedevices>
	<fence_daemon post_join_delay="30" />
	<totem rrp_mode="none" secauth="off"/>
	<rm log_level="5">
		<resources>
			<script file="/etc/init.d/drbd" name="drbd"/>
			<script file="/etc/init.d/clvmd" name="clvmd"/>
			<clusterfs device="/dev/an-a05n01_vg0/shared" force_unmount="1" fstype="gfs2" mountpoint="/shared" name="sharedfs" />
			<script file="/etc/init.d/libvirtd" name="libvirtd"/>
		</resources>
		<failoverdomains>
			<failoverdomain name="only_n01" nofailback="1" ordered="0" restricted="1">
				<failoverdomainnode name="an-a05n01.alteeve.ca"/>
			</failoverdomain>
			<failoverdomain name="only_n02" nofailback="1" ordered="0" restricted="1">
				<failoverdomainnode name="an-a05n02.alteeve.ca"/>
			</failoverdomain>
		</failoverdomains>
		<service name="storage_n01" autostart="1" domain="only_n01" exclusive="0" recovery="restart">
			<script ref="drbd">
				<script ref="clvmd">
					<clusterfs ref="sharedfs"/>
				</script>
			</script>
		</service>
		<service name="storage_n02" autostart="1" domain="only_n02" exclusive="0" recovery="restart">
			<script ref="drbd">
				<script ref="clvmd">
					<clusterfs ref="sharedfs"/>
				</script>
			</script>
		</service>
		<service name="libvirtd_n01" autostart="1" domain="only_n01" exclusive="0" recovery="restart">
			<script ref="libvirtd"/>
		</service>
		<service name="libvirtd_n02" autostart="1" domain="only_n02" exclusive="0" recovery="restart">
			<script ref="libvirtd"/>
		</service>
	</rm>
</cluster>

With the version now at 10, we have added four <service...>...</service> elements. Two of them contain the storage resources in a service tree configuration; the other two each have a single libvirtd resource for managing the hypervisors.

Let's take a closer look.

an-a05n01
		<failoverdomains>
			<failoverdomain name="only_n01" nofailback="1" ordered="0" restricted="1">
				<failoverdomainnode name="an-a05n01.alteeve.ca"/>
			</failoverdomain>
			<failoverdomain name="only_n02" nofailback="1" ordered="0" restricted="1">
				<failoverdomainnode name="an-a05n02.alteeve.ca"/>
			</failoverdomain>
		</failoverdomains>
		<service name="storage_n01" autostart="1" domain="only_n01" exclusive="0" recovery="restart">
			<script ref="drbd">
				<script ref="clvmd">
					<clusterfs ref="sharedfs"/>
				</script>
			</script>
		</service>
		<service name="storage_n02" autostart="1" domain="only_n02" exclusive="0" recovery="restart">
			<script ref="drbd">
				<script ref="clvmd">
					<clusterfs ref="sharedfs"/>
				</script>
			</script>
		</service>
		<service name="libvirtd_n01" autostart="1" domain="only_n01" exclusive="0" recovery="restart">
			<script ref="libvirtd"/>
		</service>
		<service name="libvirtd_n02" autostart="1" domain="only_n02" exclusive="0" recovery="restart">
			<script ref="libvirtd"/>
		</service>

The <service ...>...</service> elements have five attributes each:

  • The name="..." attribute is a unique name that will be used to identify the service, as we will see later.
  • The autostart="1" attribute tells the cluster that, when it starts, it should automatically start this service.
  • The domain="..." attribute tells the cluster which fail-over domain this service must run within. The two otherwise identical services each point to a different fail-over domain, as we discussed in the previous section.
  • The exclusive="0" attribute tells the cluster that a node running this service is allowed to have other services running as well.
  • The recovery="restart" attribute sets the service recovery policy. As the name implies, the cluster will try to restart this service should it fail. Should the service fail multiple times in a row, it will be disabled. The exact number of failures allowed before disabling is configurable using the optional max_restarts and restart_expire_time attributes, which are not covered here.
Warning: It is a fairly common mistake to interpret exclusive to mean that a service is only allowed to run on one node at a time. In fact, exclusive="1" tells the cluster that the service may only run on a node that is running no other services, and that no other service may start on that node while it runs. Please do not use this attribute incorrectly.

Within each of the first two <service ...>...</service> elements are two <script...> type resources and a clusterfs type resource. These are configured as a service tree in the order:

  • drbd -> clvmd -> clusterfs.

The other two <service ...>...</service> elements are there to simply monitor the libvirtd daemon on each node. Should it fail for any reason, the cluster will restart the service right away.

Each of these <script ...> elements has just one attribute; ref="..." which points to a corresponding script resource.

The clusterfs element has five attributes:

  • name is the name used to reference this resource in the service tree.
  • device is the logical volume we formatted as a gfs2 file system.
  • force_unmount, when set to 1, tells the system to try and kill any processes that might be holding the mount open. This is useful if, for example, you left a terminal window open where you had browsed into /shared. Without it, the service would fail and restart.
  • fstype is the file system type. If you do not specify this, the system will try to determine it automatically. To be safe, we will set it.
  • mountpoint is where the device should be mounted.
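If you ever want to double-check these values against the actual storage, a couple of quick commands will do. This is only a sanity check, run on either node while the storage service is up; it is not part of the cluster configuration itself.

# Confirm the file system type on the logical volume backing the clusterfs resource.
blkid /dev/an-a05n01_vg0/shared

# Once mounted, confirm the device, mount point and file system type in one line.
grep /shared /proc/mounts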

The logic for the storage resource tree is:

  • DRBD needs to start so that the bare clustered storage devices become available.
  • Clustered LVM must next start so that the logical volumes used by GFS2 and our VMs become available.
  • Finally, the GFS2 partition contains the XML definition files needed to start our servers, host shared files and so on.

From the other direction, we need the stop order to be the reverse:

  • We need the GFS2 partition to unmount first.
  • With the GFS2 partition stopped, we can safely say that all LVs are no longer in use and thus clvmd can stop.
  • With Clustered LVM now stopped, nothing should be using our DRBD resources any more, so we can safely stop them, too.

All in all, it's a surprisingly simple and effective configuration.
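If you ever need to walk through this order by hand, say with rgmanager stopped for maintenance, a rough manual equivalent is sketched below. Note that rgmanager actually mounts and unmounts /shared through the clusterfs resource agent rather than the gfs2 init script, so this is only an illustration of the ordering, not exactly what the cluster runs.

# Stop order; deepest child first.
/etc/init.d/gfs2 stop
/etc/init.d/clvmd stop
/etc/init.d/drbd stop

# Start order; top of the tree down to the deepest child.
/etc/init.d/drbd start
/etc/init.d/clvmd start
/etc/init.d/gfs2 start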

Validating and Pushing the Changes

We've made a big change, so it's all the more important that we validate the config before proceeding.

an-a05n01
ccs_config_validate
Configuration validates
an-a05n02
cman_tool version
6.2.0 config 7

Good; there were no errors, and we can see that the cluster is currently running configuration version 7.

We need to now tell the cluster to use the new configuration file. Unlike last time, we won't use rsync. Now that the cluster is up and running, we can use it to push out the updated configuration file using cman_tool. This is the first time we've used the cluster to push out an updated cluster.conf file, so we will have to enter the password we set earlier for the ricci user on both nodes.

an-a05n01
cman_tool version -r
You have not authenticated to the ricci daemon on an-a05n01.alteeve.ca
Password:
You have not authenticated to the ricci daemon on an-a05n02.alteeve.ca
Password:
an-a05n02
cman_tool version
6.2.0 config 10

As confirmed on an-a05n02, the new configuration loaded properly! Note as well that we had to enter the ricci user's password for both nodes. Once done, you will not have to do that again on an-a05n01. Later, if you push an update from an-a05n02, you will need to enter the passwords once again, but not after that. You authenticate from a node only one time.
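If the push fails because ricci can't be reached or authenticated, the usual causes are that the ricci password was never set or the daemon isn't running. The commands below, run on each node, are a reminder of how to sort that out; the password is the one we set for the ricci user earlier.

# Set (or reset) the ricci user's password, if needed.
passwd ricci

# Make sure the ricci daemon is running now and will start on boot.
/etc/init.d/ricci start
chkconfig ricci on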

If you were watching syslog, you will have seen entries like the ones below.

an-a05n01
Nov  1 17:47:48 an-a05n01 ricci[26853]: Executing '/usr/bin/virsh nodeinfo'
Nov  1 17:47:50 an-a05n01 ricci[26856]: Executing '/usr/libexec/ricci/ricci-worker -f /var/lib/ricci/queue/533317550'
Nov  1 17:47:50 an-a05n01 modcluster: Updating cluster.conf
Nov  1 17:47:50 an-a05n01 corosync[6448]:   [QUORUM] Members[2]: 1 2
an-a05n02
Nov  1 17:47:50 an-a05n02 ricci[26653]: Executing '/usr/bin/virsh nodeinfo'
Nov  1 17:47:50 an-a05n02 ricci[26656]: Executing '/usr/libexec/ricci/ricci-worker -f /var/lib/ricci/queue/15604613'
Nov  1 17:47:50 an-a05n02 modcluster: Updating cluster.conf
Nov  1 17:47:50 an-a05n02 corosync[6404]:   [QUORUM] Members[2]: 1 2

Checking the Cluster's Status

Now let's look at a new tool; clustat, cluster status. We'll be using clustat extensively from here on out to monitor the status of the cluster members and managed services. It does not manage the cluster in any way, it is simply a status tool.
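We will mostly run clustat by hand, but it can also redraw itself on an interval, which is handy when watching a change happen. This assumes the -i switch is available in your version of clustat.

# Redraw the cluster status every two seconds; press ctrl+c to exit.
clustat -i 2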

Let's take a look.

an-a05n01
clustat
Cluster Status for an-anvil-05 @ Fri Nov  1 18:08:20 2013
Member Status: Quorate

 Member Name                                         ID   Status
 ------ ----                                         ---- ------
 an-a05n01.alteeve.ca                                    1 Online, Local
 an-a05n02.alteeve.ca                                    2 Online
an-a05n02
clustat
Cluster Status for an-anvil-05 @ Fri Nov  1 18:08:20 2013
Member Status: Quorate

 Member Name                                         ID   Status
 ------ ----                                         ---- ------
 an-a05n01.alteeve.ca                                    1 Online
 an-a05n02.alteeve.ca                                    2 Online, Local

At this point, we're only running the foundation of the cluster, so we can only see which nodes are members.

We'll now start rgmanager. It will read the cluster.conf configuration file and parse the <rm> child elements. It will find our four new services and, according to their configuration, start them.

Warning: We've configured the storage services to start automatically. When we start rgmanager now, it will start the storage resources, including DRBD. In turn, DRBD will block for up to five minutes, waiting for its peer. This will cause the first node you start rgmanager on to appear to hang until the other node's rgmanager has started DRBD as well. If the other node doesn't start DRBD in time, it will be fenced. So be sure to start rgmanager on both nodes at the same time.
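One way to start rgmanager on both nodes at nearly the same time, assuming you have ssh access to both nodes from a third machine such as your workstation or the Striker dashboard, is a small loop like the sketch below. It is just a convenience; the per-node commands that follow work just as well if you run them side by side.

# Start rgmanager on both nodes in parallel, then wait for both to return.
for node in an-a05n01.alteeve.ca an-a05n02.alteeve.ca
do
    ssh root@${node} "/etc/init.d/rgmanager start" &
done
wait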
an-a05n01
/etc/init.d/rgmanager start
Starting Cluster Service Manager:                          [  OK  ]
an-a05n02
/etc/init.d/rgmanager start
Starting Cluster Service Manager:                          [  OK  ]

Now let's run clustat again, and see what's new.

an-a05n01
clustat
Cluster Status for an-anvil-05 @ Fri Nov  1 19:04:27 2013
Member Status: Quorate

 Member Name                                         ID   Status
 ------ ----                                         ---- ------
 an-a05n01.alteeve.ca                                    1 Online, Local, rgmanager
 an-a05n02.alteeve.ca                                    2 Online, rgmanager

 Service Name                               Owner (Last)                               State         
 ------- ----                               ----- ------                               -----         
 service:libvirtd_n01                       an-a05n01.alteeve.ca                       started       
 service:libvirtd_n02                       an-a05n02.alteeve.ca                       started       
 service:storage_n01                        an-a05n01.alteeve.ca                       started       
 service:storage_n02                        an-a05n02.alteeve.ca                       started
an-a05n02
clustat
Cluster Status for an-anvil-05 @ Fri Nov  1 19:04:27 2013
Member Status: Quorate

 Member Name                                         ID   Status
 ------ ----                                         ---- ------
 an-a05n01.alteeve.ca                                    1 Online, rgmanager
 an-a05n02.alteeve.ca                                    2 Online, Local, rgmanager

 Service Name                               Owner (Last)                               State         
 ------- ----                               ----- ------                               -----         
 service:libvirtd_n01                       an-a05n01.alteeve.ca                       started       
 service:libvirtd_n02                       an-a05n02.alteeve.ca                       started       
 service:storage_n01                        an-a05n01.alteeve.ca                       started       
 service:storage_n02                        an-a05n02.alteeve.ca                       started

What we see are two sections; The top section shows the cluster members and the lower part covers the managed resources.

We can see that both members, an-a05n01.alteeve.ca and an-a05n02.alteeve.ca are Online, meaning that cman is running and that they've joined the cluster. It also shows us that both members are running rgmanager. You will always see Local beside the name of the node you ran the actual clustat command from.

Under the services section, you can see the four new services we created, each shown with the service: prefix. Each service is in the started state, meaning that all of its resources are up and running properly, and we can see which node each service is running on.

If you were watching the system log, you will have seen that, very shortly after rgmanager started, drbd started, then clvmd, and then the gfs2 file system was mounted. Somewhere in there, libvirtd started as well.

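If you want to follow along on your own nodes, the simplest way is to tail the system log on each node while the services start:

# Watch new log messages as they arrive; press ctrl+c to exit.
tail -f -n 0 /var/log/messages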
Let's take a look.

an-a05n01
Nov  1 19:04:07 an-a05n01 kernel: dlm: Using TCP for communications
Nov  1 19:04:08 an-a05n01 kernel: dlm: connecting to 2
Nov  1 19:04:08 an-a05n01 rgmanager[10738]: I am node #1
Nov  1 19:04:08 an-a05n01 rgmanager[10738]: Resource Group Manager Starting
Nov  1 19:04:10 an-a05n01 rgmanager[10738]: Starting stopped service service:storage_n01
Nov  1 19:04:10 an-a05n01 rgmanager[10738]: Marking service:storage_n02 as stopped: Restricted domain unavailable
Nov  1 19:04:10 an-a05n01 kernel: drbd: initialized. Version: 8.3.16 (api:88/proto:86-97)
Nov  1 19:04:10 an-a05n01 kernel: drbd: GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2013-09-27 16:00:43
Nov  1 19:04:10 an-a05n01 kernel: drbd: registered as block device major 147
Nov  1 19:04:10 an-a05n01 kernel: drbd: minor_table @ 0xffff880638752a80
Nov  1 19:04:11 an-a05n01 kernel: block drbd0: Starting worker thread (from cqueue [5069])
Nov  1 19:04:11 an-a05n01 kernel: block drbd0: disk( Diskless -> Attaching ) 
Nov  1 19:04:11 an-a05n01 kernel: block drbd0: Found 4 transactions (126 active extents) in activity log.
Nov  1 19:04:11 an-a05n01 kernel: block drbd0: Method to ensure write ordering: flush
Nov  1 19:04:11 an-a05n01 kernel: block drbd0: max BIO size = 131072
Nov  1 19:04:11 an-a05n01 kernel: block drbd0: drbd_bm_resize called with capacity == 1059008888
Nov  1 19:04:11 an-a05n01 kernel: block drbd0: resync bitmap: bits=132376111 words=2068377 pages=4040
Nov  1 19:04:11 an-a05n01 kernel: block drbd0: size = 505 GB (529504444 KB)
Nov  1 19:04:11 an-a05n01 kernel: block drbd0: bitmap READ of 4040 pages took 9 jiffies
Nov  1 19:04:11 an-a05n01 kernel: block drbd0: recounting of set bits took additional 10 jiffies
Nov  1 19:04:11 an-a05n01 kernel: block drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
Nov  1 19:04:11 an-a05n01 kernel: block drbd0: disk( Attaching -> UpToDate ) pdsk( DUnknown -> Outdated ) 
Nov  1 19:04:11 an-a05n01 kernel: block drbd0: attached to UUIDs D62CF91BB06F1B41:AB8866B4CD6A5E71:F1BA98C02D0BA9B9:F1B998C02D0BA9B9
Nov  1 19:04:11 an-a05n01 kernel: block drbd1: Starting worker thread (from cqueue [5069])
Nov  1 19:04:11 an-a05n01 kernel: block drbd1: disk( Diskless -> Attaching ) 
Nov  1 19:04:11 an-a05n01 kernel: block drbd1: Found 1 transactions (1 active extents) in activity log.
Nov  1 19:04:11 an-a05n01 kernel: block drbd1: Method to ensure write ordering: flush
Nov  1 19:04:11 an-a05n01 kernel: block drbd1: max BIO size = 131072
Nov  1 19:04:11 an-a05n01 kernel: block drbd1: drbd_bm_resize called with capacity == 602165224
Nov  1 19:04:11 an-a05n01 kernel: block drbd1: resync bitmap: bits=75270653 words=1176104 pages=2298
Nov  1 19:04:11 an-a05n01 kernel: block drbd1: size = 287 GB (301082612 KB)
Nov  1 19:04:11 an-a05n01 kernel: block drbd1: bitmap READ of 2298 pages took 6 jiffies
Nov  1 19:04:11 an-a05n01 kernel: block drbd1: recounting of set bits took additional 6 jiffies
Nov  1 19:04:11 an-a05n01 kernel: block drbd1: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
Nov  1 19:04:11 an-a05n01 kernel: block drbd1: disk( Attaching -> UpToDate ) pdsk( DUnknown -> Outdated ) 
Nov  1 19:04:11 an-a05n01 kernel: block drbd1: attached to UUIDs FF678525C82359F3:CFC177C83C414547:0EC499BF75166A0D:0EC399BF75166A0D
Nov  1 19:04:11 an-a05n01 kernel: block drbd0: conn( StandAlone -> Unconnected ) 
Nov  1 19:04:11 an-a05n01 kernel: block drbd0: Starting receiver thread (from drbd0_worker [12026])
Nov  1 19:04:11 an-a05n01 kernel: block drbd0: receiver (re)started
Nov  1 19:04:11 an-a05n01 kernel: block drbd0: conn( Unconnected -> WFConnection ) 
Nov  1 19:04:11 an-a05n01 kernel: block drbd1: conn( StandAlone -> Unconnected ) 
Nov  1 19:04:11 an-a05n01 kernel: block drbd1: Starting receiver thread (from drbd1_worker [12041])
Nov  1 19:04:11 an-a05n01 kernel: block drbd1: receiver (re)started
Nov  1 19:04:11 an-a05n01 kernel: block drbd1: conn( Unconnected -> WFConnection ) 
Nov  1 19:04:11 an-a05n01 rgmanager[10738]: Starting stopped service service:libvirtd_n01
Nov  1 19:04:11 an-a05n01 rgmanager[10738]: Service service:libvirtd_n01 started
Nov  1 19:04:11 an-a05n01 kernel: lo: Disabled Privacy Extensions
Nov  1 19:04:12 an-a05n01 kernel: block drbd0: Handshake successful: Agreed network protocol version 97
Nov  1 19:04:12 an-a05n01 kernel: block drbd0: conn( WFConnection -> WFReportParams ) 
Nov  1 19:04:12 an-a05n01 kernel: block drbd0: Starting asender thread (from drbd0_receiver [12058])
Nov  1 19:04:12 an-a05n01 kernel: block drbd0: data-integrity-alg: <not-used>
Nov  1 19:04:12 an-a05n01 kernel: block drbd0: drbd_sync_handshake:
Nov  1 19:04:12 an-a05n01 kernel: block drbd0: self D62CF91BB06F1B40:AB8866B4CD6A5E71:F1BA98C02D0BA9B9:F1B998C02D0BA9B9 bits:0 flags:0
Nov  1 19:04:12 an-a05n01 kernel: block drbd0: peer AB8866B4CD6A5E70:0000000000000000:F1BA98C02D0BA9B9:F1B998C02D0BA9B9 bits:0 flags:0
Nov  1 19:04:12 an-a05n01 kernel: block drbd0: uuid_compare()=1 by rule 70
Nov  1 19:04:12 an-a05n01 kernel: block drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( Outdated -> Consistent ) 
Nov  1 19:04:12 an-a05n01 kernel: block drbd1: Handshake successful: Agreed network protocol version 97
Nov  1 19:04:12 an-a05n01 kernel: block drbd1: conn( WFConnection -> WFReportParams ) 
Nov  1 19:04:12 an-a05n01 kernel: block drbd1: Starting asender thread (from drbd1_receiver [12063])
Nov  1 19:04:12 an-a05n01 kernel: block drbd1: data-integrity-alg: <not-used>
Nov  1 19:04:12 an-a05n01 kernel: block drbd1: drbd_sync_handshake:
Nov  1 19:04:12 an-a05n01 kernel: block drbd1: self FF678525C82359F2:CFC177C83C414547:0EC499BF75166A0D:0EC399BF75166A0D bits:0 flags:0
Nov  1 19:04:12 an-a05n01 kernel: block drbd1: peer CFC177C83C414546:0000000000000000:0EC499BF75166A0D:0EC399BF75166A0D bits:0 flags:0
Nov  1 19:04:12 an-a05n01 kernel: block drbd1: uuid_compare()=1 by rule 70
Nov  1 19:04:12 an-a05n01 kernel: block drbd1: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( Outdated -> Consistent ) 
Nov  1 19:04:12 an-a05n01 kernel: block drbd0: peer( Secondary -> Primary ) 
Nov  1 19:04:12 an-a05n01 kernel: block drbd1: peer( Secondary -> Primary ) 
Nov  1 19:04:12 an-a05n01 kernel: block drbd1: helper command: /sbin/drbdadm before-resync-source minor-1
Nov  1 19:04:12 an-a05n01 kernel: block drbd0: role( Secondary -> Primary ) 
Nov  1 19:04:12 an-a05n01 kernel: block drbd1: helper command: /sbin/drbdadm before-resync-source minor-1 exit code 0 (0x0)
Nov  1 19:04:12 an-a05n01 kernel: block drbd1: conn( WFBitMapS -> SyncSource ) pdsk( Consistent -> Inconsistent ) 
Nov  1 19:04:12 an-a05n01 kernel: block drbd1: Began resync as SyncSource (will sync 0 KB [0 bits set]).
Nov  1 19:04:12 an-a05n01 kernel: block drbd1: updated sync UUID FF678525C82359F2:CFC277C83C414547:CFC177C83C414547:0EC499BF75166A0D
Nov  1 19:04:12 an-a05n01 kernel: block drbd0: helper command: /sbin/drbdadm before-resync-source minor-0
Nov  1 19:04:12 an-a05n01 kernel: block drbd1: role( Secondary -> Primary ) 
Nov  1 19:04:12 an-a05n01 kernel: block drbd0: helper command: /sbin/drbdadm before-resync-source minor-0 exit code 0 (0x0)
Nov  1 19:04:12 an-a05n01 kernel: block drbd0: conn( WFBitMapS -> SyncSource ) pdsk( Consistent -> Inconsistent ) 
Nov  1 19:04:12 an-a05n01 kernel: block drbd0: Began resync as SyncSource (will sync 0 KB [0 bits set]).
Nov  1 19:04:12 an-a05n01 kernel: block drbd0: updated sync UUID D62CF91BB06F1B41:AB8966B4CD6A5E71:AB8866B4CD6A5E71:F1BA98C02D0BA9B9
Nov  1 19:04:12 an-a05n01 kernel: block drbd1: Resync done (total 1 sec; paused 0 sec; 0 K/sec)
Nov  1 19:04:12 an-a05n01 kernel: block drbd1: updated UUIDs FF678525C82359F3:0000000000000000:CFC277C83C414547:CFC177C83C414547
Nov  1 19:04:12 an-a05n01 kernel: block drbd1: conn( SyncSource -> Connected ) pdsk( Inconsistent -> UpToDate ) 
Nov  1 19:04:12 an-a05n01 kernel: block drbd0: Resync done (total 1 sec; paused 0 sec; 0 K/sec)
Nov  1 19:04:12 an-a05n01 kernel: block drbd0: updated UUIDs D62CF91BB06F1B41:0000000000000000:AB8966B4CD6A5E71:AB8866B4CD6A5E71
Nov  1 19:04:12 an-a05n01 kernel: block drbd0: conn( SyncSource -> Connected ) pdsk( Inconsistent -> UpToDate ) 
Nov  1 19:04:12 an-a05n01 kernel: block drbd1: bitmap WRITE of 2298 pages took 12 jiffies
Nov  1 19:04:12 an-a05n01 kernel: block drbd1: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
Nov  1 19:04:12 an-a05n01 kernel: block drbd0: bitmap WRITE of 4040 pages took 15 jiffies
Nov  1 19:04:12 an-a05n01 kernel: block drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
Nov  1 19:04:14 an-a05n01 clvmd: Cluster LVM daemon started - connected to CMAN
Nov  1 19:04:14 an-a05n01 kernel: Slow work thread pool: Starting up
Nov  1 19:04:14 an-a05n01 kernel: Slow work thread pool: Ready
Nov  1 19:04:14 an-a05n01 kernel: GFS2 (built Sep 14 2013 05:33:49) installed
Nov  1 19:04:14 an-a05n01 kernel: GFS2: fsid=: Trying to join cluster "lock_dlm", "an-anvil-05:shared"
Nov  1 19:04:14 an-a05n01 kernel: GFS2: fsid=an-anvil-05:shared.1: Joined cluster. Now mounting FS...
Nov  1 19:04:14 an-a05n01 kernel: GFS2: fsid=an-anvil-05:shared.1: jid=1, already locked for use
Nov  1 19:04:14 an-a05n01 kernel: GFS2: fsid=an-anvil-05:shared.1: jid=1: Looking at journal...
Nov  1 19:04:14 an-a05n01 kernel: GFS2: fsid=an-anvil-05:shared.1: jid=1: Done
Nov  1 19:04:14 an-a05n01 rgmanager[10738]: Service service:storage_n01 started
an-a05n02
Nov  1 19:04:08 an-a05n02 kernel: dlm: Using TCP for communications
Nov  1 19:04:08 an-a05n02 kernel: dlm: got connection from 1
Nov  1 19:04:09 an-a05n02 rgmanager[10547]: I am node #2
Nov  1 19:04:09 an-a05n02 rgmanager[10547]: Resource Group Manager Starting
Nov  1 19:04:11 an-a05n02 rgmanager[10547]: Starting stopped service service:storage_n02
Nov  1 19:04:11 an-a05n02 kernel: drbd: initialized. Version: 8.3.16 (api:88/proto:86-97)
Nov  1 19:04:11 an-a05n02 kernel: drbd: GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2013-09-27 16:00:43
Nov  1 19:04:11 an-a05n02 kernel: drbd: registered as block device major 147
Nov  1 19:04:11 an-a05n02 kernel: drbd: minor_table @ 0xffff880638440280
Nov  1 19:04:11 an-a05n02 kernel: block drbd0: Starting worker thread (from cqueue [5161])
Nov  1 19:04:11 an-a05n02 kernel: block drbd0: disk( Diskless -> Attaching ) 
Nov  1 19:04:11 an-a05n02 kernel: block drbd0: Found 4 transactions (4 active extents) in activity log.
Nov  1 19:04:11 an-a05n02 kernel: block drbd0: Method to ensure write ordering: flush
Nov  1 19:04:11 an-a05n02 kernel: block drbd0: max BIO size = 131072
Nov  1 19:04:11 an-a05n02 kernel: block drbd0: drbd_bm_resize called with capacity == 1059008888
Nov  1 19:04:11 an-a05n02 kernel: block drbd0: resync bitmap: bits=132376111 words=2068377 pages=4040
Nov  1 19:04:11 an-a05n02 kernel: block drbd0: size = 505 GB (529504444 KB)
Nov  1 19:04:11 an-a05n02 kernel: block drbd0: bitmap READ of 4040 pages took 10 jiffies
Nov  1 19:04:11 an-a05n02 kernel: block drbd0: recounting of set bits took additional 10 jiffies
Nov  1 19:04:11 an-a05n02 kernel: block drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
Nov  1 19:04:11 an-a05n02 kernel: block drbd0: disk( Attaching -> Outdated ) 
Nov  1 19:04:11 an-a05n02 kernel: block drbd0: attached to UUIDs AB8866B4CD6A5E70:0000000000000000:F1BA98C02D0BA9B9:F1B998C02D0BA9B9
Nov  1 19:04:11 an-a05n02 kernel: block drbd1: Starting worker thread (from cqueue [5161])
Nov  1 19:04:11 an-a05n02 kernel: block drbd1: disk( Diskless -> Attaching ) 
Nov  1 19:04:11 an-a05n02 kernel: block drbd1: No usable activity log found.
Nov  1 19:04:11 an-a05n02 kernel: block drbd1: Method to ensure write ordering: flush
Nov  1 19:04:11 an-a05n02 kernel: block drbd1: max BIO size = 131072
Nov  1 19:04:11 an-a05n02 kernel: block drbd1: drbd_bm_resize called with capacity == 602165224
Nov  1 19:04:11 an-a05n02 kernel: block drbd1: resync bitmap: bits=75270653 words=1176104 pages=2298
Nov  1 19:04:11 an-a05n02 kernel: block drbd1: size = 287 GB (301082612 KB)
Nov  1 19:04:11 an-a05n02 kernel: block drbd1: bitmap READ of 2298 pages took 6 jiffies
Nov  1 19:04:11 an-a05n02 kernel: block drbd1: recounting of set bits took additional 6 jiffies
Nov  1 19:04:11 an-a05n02 kernel: block drbd1: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
Nov  1 19:04:11 an-a05n02 kernel: block drbd1: disk( Attaching -> Outdated ) 
Nov  1 19:04:11 an-a05n02 kernel: block drbd1: attached to UUIDs CFC177C83C414546:0000000000000000:0EC499BF75166A0D:0EC399BF75166A0D
Nov  1 19:04:11 an-a05n02 kernel: block drbd0: conn( StandAlone -> Unconnected ) 
Nov  1 19:04:11 an-a05n02 kernel: block drbd0: Starting receiver thread (from drbd0_worker [11833])
Nov  1 19:04:11 an-a05n02 kernel: block drbd0: receiver (re)started
Nov  1 19:04:11 an-a05n02 kernel: block drbd0: conn( Unconnected -> WFConnection ) 
Nov  1 19:04:11 an-a05n02 kernel: block drbd1: conn( StandAlone -> Unconnected ) 
Nov  1 19:04:11 an-a05n02 kernel: block drbd1: Starting receiver thread (from drbd1_worker [11848])
Nov  1 19:04:11 an-a05n02 kernel: block drbd1: receiver (re)started
Nov  1 19:04:11 an-a05n02 kernel: block drbd1: conn( Unconnected -> WFConnection ) 
Nov  1 19:04:11 an-a05n02 rgmanager[10547]: Starting stopped service service:libvirtd_n02
Nov  1 19:04:12 an-a05n02 rgmanager[10547]: Service service:libvirtd_n02 started
Nov  1 19:04:12 an-a05n02 kernel: block drbd0: Handshake successful: Agreed network protocol version 97
Nov  1 19:04:12 an-a05n02 kernel: block drbd0: conn( WFConnection -> WFReportParams ) 
Nov  1 19:04:12 an-a05n02 kernel: block drbd0: Starting asender thread (from drbd0_receiver [11865])
Nov  1 19:04:12 an-a05n02 kernel: block drbd0: data-integrity-alg: <not-used>
Nov  1 19:04:12 an-a05n02 kernel: block drbd0: drbd_sync_handshake:
Nov  1 19:04:12 an-a05n02 kernel: block drbd0: self AB8866B4CD6A5E70:0000000000000000:F1BA98C02D0BA9B9:F1B998C02D0BA9B9 bits:0 flags:0
Nov  1 19:04:12 an-a05n02 kernel: block drbd0: peer D62CF91BB06F1B40:AB8866B4CD6A5E71:F1BA98C02D0BA9B9:F1B998C02D0BA9B9 bits:0 flags:0
Nov  1 19:04:12 an-a05n02 kernel: block drbd0: uuid_compare()=-1 by rule 50
Nov  1 19:04:12 an-a05n02 kernel: block drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapT ) pdsk( DUnknown -> UpToDate ) 
Nov  1 19:04:12 an-a05n02 kernel: block drbd1: Handshake successful: Agreed network protocol version 97
Nov  1 19:04:12 an-a05n02 kernel: block drbd1: conn( WFConnection -> WFReportParams ) 
Nov  1 19:04:12 an-a05n02 kernel: block drbd1: Starting asender thread (from drbd1_receiver [11869])
Nov  1 19:04:12 an-a05n02 kernel: block drbd1: data-integrity-alg: <not-used>
Nov  1 19:04:12 an-a05n02 kernel: block drbd1: drbd_sync_handshake:
Nov  1 19:04:12 an-a05n02 kernel: block drbd1: self CFC177C83C414546:0000000000000000:0EC499BF75166A0D:0EC399BF75166A0D bits:0 flags:0
Nov  1 19:04:12 an-a05n02 kernel: block drbd1: peer FF678525C82359F2:CFC177C83C414547:0EC499BF75166A0D:0EC399BF75166A0D bits:0 flags:0
Nov  1 19:04:12 an-a05n02 kernel: block drbd1: uuid_compare()=-1 by rule 50
Nov  1 19:04:12 an-a05n02 kernel: block drbd1: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapT ) pdsk( DUnknown -> UpToDate ) 
Nov  1 19:04:12 an-a05n02 kernel: block drbd0: role( Secondary -> Primary ) 
Nov  1 19:04:12 an-a05n02 kernel: block drbd1: role( Secondary -> Primary ) 
Nov  1 19:04:12 an-a05n02 kernel: lo: Disabled Privacy Extensions
Nov  1 19:04:12 an-a05n02 kernel: block drbd1: conn( WFBitMapT -> WFSyncUUID ) 
Nov  1 19:04:12 an-a05n02 kernel: block drbd0: conn( WFBitMapT -> WFSyncUUID ) 
Nov  1 19:04:12 an-a05n02 kernel: block drbd0: peer( Secondary -> Primary ) 
Nov  1 19:04:12 an-a05n02 kernel: block drbd1: updated sync uuid CFC277C83C414547:0000000000000000:0EC499BF75166A0D:0EC399BF75166A0D
Nov  1 19:04:12 an-a05n02 kernel: block drbd1: helper command: /sbin/drbdadm before-resync-target minor-1
Nov  1 19:04:12 an-a05n02 kernel: block drbd1: helper command: /sbin/drbdadm before-resync-target minor-1 exit code 0 (0x0)
Nov  1 19:04:12 an-a05n02 kernel: block drbd1: conn( WFSyncUUID -> SyncTarget ) disk( Outdated -> Inconsistent ) 
Nov  1 19:04:12 an-a05n02 kernel: block drbd1: Began resync as SyncTarget (will sync 0 KB [0 bits set]).
Nov  1 19:04:12 an-a05n02 kernel: block drbd1: peer( Secondary -> Primary ) 
Nov  1 19:04:12 an-a05n02 kernel: block drbd0: updated sync uuid AB8966B4CD6A5E71:0000000000000000:F1BA98C02D0BA9B9:F1B998C02D0BA9B9
Nov  1 19:04:12 an-a05n02 kernel: block drbd0: helper command: /sbin/drbdadm before-resync-target minor-0
Nov  1 19:04:12 an-a05n02 kernel: block drbd1: Resync done (total 1 sec; paused 0 sec; 0 K/sec)
Nov  1 19:04:12 an-a05n02 kernel: block drbd1: updated UUIDs FF678525C82359F3:0000000000000000:CFC277C83C414547:CFC177C83C414547
Nov  1 19:04:12 an-a05n02 kernel: block drbd1: conn( SyncTarget -> Connected ) disk( Inconsistent -> UpToDate ) 
Nov  1 19:04:12 an-a05n02 kernel: block drbd1: helper command: /sbin/drbdadm after-resync-target minor-1
Nov  1 19:04:12 an-a05n02 kernel: block drbd0: helper command: /sbin/drbdadm before-resync-target minor-0 exit code 0 (0x0)
Nov  1 19:04:12 an-a05n02 kernel: block drbd0: conn( WFSyncUUID -> SyncTarget ) disk( Outdated -> Inconsistent ) 
Nov  1 19:04:12 an-a05n02 kernel: block drbd0: Began resync as SyncTarget (will sync 0 KB [0 bits set]).
Nov  1 19:04:12 an-a05n02 kernel: block drbd1: helper command: /sbin/drbdadm after-resync-target minor-1 exit code 0 (0x0)
Nov  1 19:04:12 an-a05n02 kernel: block drbd0: Resync done (total 1 sec; paused 0 sec; 0 K/sec)
Nov  1 19:04:12 an-a05n02 kernel: block drbd0: updated UUIDs D62CF91BB06F1B41:0000000000000000:AB8966B4CD6A5E71:AB8866B4CD6A5E71
Nov  1 19:04:12 an-a05n02 kernel: block drbd0: conn( SyncTarget -> Connected ) disk( Inconsistent -> UpToDate ) 
Nov  1 19:04:12 an-a05n02 kernel: block drbd0: helper command: /sbin/drbdadm after-resync-target minor-0
Nov  1 19:04:12 an-a05n02 kernel: block drbd0: helper command: /sbin/drbdadm after-resync-target minor-0 exit code 0 (0x0)
Nov  1 19:04:12 an-a05n02 kernel: block drbd1: bitmap WRITE of 2298 pages took 14 jiffies
Nov  1 19:04:12 an-a05n02 kernel: block drbd1: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
Nov  1 19:04:12 an-a05n02 kernel: block drbd0: bitmap WRITE of 4040 pages took 15 jiffies
Nov  1 19:04:12 an-a05n02 kernel: block drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
Nov  1 19:04:13 an-a05n02 clvmd: Cluster LVM daemon started - connected to CMAN
Nov  1 19:04:13 an-a05n02 kernel: Slow work thread pool: Starting up
Nov  1 19:04:13 an-a05n02 kernel: Slow work thread pool: Ready
Nov  1 19:04:13 an-a05n02 kernel: GFS2 (built Sep 14 2013 05:33:49) installed
Nov  1 19:04:13 an-a05n02 kernel: GFS2: fsid=: Trying to join cluster "lock_dlm", "an-anvil-05:shared"
Nov  1 19:04:13 an-a05n02 kernel: GFS2: fsid=an-anvil-05:shared.0: Joined cluster. Now mounting FS...
Nov  1 19:04:13 an-a05n02 kernel: GFS2: fsid=an-anvil-05:shared.0: jid=0, already locked for use
Nov  1 19:04:13 an-a05n02 kernel: GFS2: fsid=an-anvil-05:shared.0: jid=0: Looking at journal...
Nov  1 19:04:13 an-a05n02 kernel: GFS2: fsid=an-anvil-05:shared.0: jid=0: Done
Nov  1 19:04:13 an-a05n02 kernel: GFS2: fsid=an-anvil-05:shared.0: jid=1: Trying to acquire journal lock...
Nov  1 19:04:13 an-a05n02 kernel: GFS2: fsid=an-anvil-05:shared.0: jid=1: Looking at journal...
Nov  1 19:04:13 an-a05n02 kernel: GFS2: fsid=an-anvil-05:shared.0: jid=1: Done
Nov  1 19:04:14 an-a05n02 rgmanager[10547]: Service service:storage_n02 started

Sure enough, we can confirm that everything started properly.

DRBD;

an-a05n01
/etc/init.d/drbd status
drbd driver loaded OK; device status:
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2013-09-27 16:00:43
m:res  cs         ro               ds                 p  mounted  fstype
0:r0   Connected  Primary/Primary  UpToDate/UpToDate  C
1:r1   Connected  Primary/Primary  UpToDate/UpToDate  C
an-a05n02
/etc/init.d/drbd status
drbd driver loaded OK; device status:
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2013-09-27 16:00:43
m:res  cs         ro               ds                 p  mounted  fstype
0:r0   Connected  Primary/Primary  UpToDate/UpToDate  C
1:r1   Connected  Primary/Primary  UpToDate/UpToDate  C

Looks good. Let's look at clustered LVM;

an-a05n01
/etc/init.d/clvmd status
clvmd (pid  29009) is running...
Clustered Volume Groups: an-a05n02_vg0 an-a05n01_vg0
Active clustered Logical Volumes: shared
an-a05n02
/etc/init.d/clvmd status
clvmd (pid  28801) is running...
Clustered Volume Groups: an-a05n02_vg0 an-a05n01_vg0
Active clustered Logical Volumes: shared

Looking good, too. The last storage component to check is GFS2;

GFS2;

an-a05n01
/etc/init.d/gfs2 status
Configured GFS2 mountpoints: 
/shared
Active GFS2 mountpoints: 
/shared
an-a05n02
/etc/init.d/gfs2 status
Configured GFS2 mountpoints: 
/shared
Active GFS2 mountpoints: 
/shared

Finally, our stand-alone service for libvirtd.

an-a05n01
/etc/init.d/libvirtd status
libvirtd (pid  12131) is running...
an-a05n02
/etc/init.d/libvirtd status
libvirtd (pid  11939) is running...

Nice, eh?

Managing Cluster Resources

Managing services in the cluster is done with a fairly simple tool called clusvcadm.

We're going to look at two commands at this time.

Command                              Description
clusvcadm -e <service> -m <node>     Enable the <service> on the specified <node>. When a <node> is not specified, the local node where the command was run is assumed.
clusvcadm -d <service>               Disable (stop) the <service>.

Stopping Clustered Storage - A Preview to Cold-Stopping the Cluster

Let's take a look at how we can use clusvcadm to stop our storage services.

Note: Services with the service: prefix can be called with their name alone. As we will see later, other services will need to have the service type prefix included.

Before doing any work on an Anvil!, start by confirming the current state of affairs.

an-a05n01
clustat
Cluster Status for an-anvil-05 @ Fri Nov  1 23:22:44 2013
Member Status: Quorate

 Member Name                                         ID   Status
 ------ ----                                         ---- ------
 an-a05n01.alteeve.ca                                    1 Online, Local, rgmanager
 an-a05n02.alteeve.ca                                    2 Online, rgmanager

 Service Name                               Owner (Last)                               State         
 ------- ----                               ----- ------                               -----         
 service:libvirtd_n01                       an-a05n01.alteeve.ca                       started       
 service:libvirtd_n02                       an-a05n02.alteeve.ca                       started       
 service:storage_n01                        an-a05n01.alteeve.ca                       started       
 service:storage_n02                        an-a05n02.alteeve.ca                       started
an-a05n02
clustat
Cluster Status for an-anvil-05 @ Fri Nov  1 23:22:44 2013
Member Status: Quorate

 Member Name                                         ID   Status
 ------ ----                                         ---- ------
 an-a05n01.alteeve.ca                                    1 Online, rgmanager
 an-a05n02.alteeve.ca                                    2 Online, Local, rgmanager

 Service Name                               Owner (Last)                               State         
 ------- ----                               ----- ------                               -----         
 service:libvirtd_n01                       an-a05n01.alteeve.ca                       started       
 service:libvirtd_n02                       an-a05n02.alteeve.ca                       started       
 service:storage_n01                        an-a05n01.alteeve.ca                       started       
 service:storage_n02                        an-a05n02.alteeve.ca                       started

Everything is running, as expected. Let's stop an-a05n01's storage_n01 service.

On an-a05n01, run:

an-a05n01
clusvcadm -d storage_n01
Local machine disabling service:storage_n01...Success

If we run clustat now, we should see that storage_n01 has stopped.

an-a05n01
clustat
Cluster Status for an-anvil-05 @ Fri Nov  1 23:25:39 2013
Member Status: Quorate

 Member Name                                         ID   Status
 ------ ----                                         ---- ------
 an-a05n01.alteeve.ca                                    1 Online, Local, rgmanager
 an-a05n02.alteeve.ca                                    2 Online, rgmanager

 Service Name                               Owner (Last)                               State         
 ------- ----                               ----- ------                               -----         
 service:libvirtd_n01                       an-a05n01.alteeve.ca                       started       
 service:libvirtd_n02                       an-a05n02.alteeve.ca                       started       
 service:storage_n01                        (an-a05n01.alteeve.ca)                     disabled      
 service:storage_n02                        an-a05n02.alteeve.ca                       started
an-a05n02
clustat
Cluster Status for an-anvil-05 @ Fri Nov  1 23:25:40 2013
Member Status: Quorate

 Member Name                                         ID   Status
 ------ ----                                         ---- ------
 an-a05n01.alteeve.ca                                    1 Online, rgmanager
 an-a05n02.alteeve.ca                                    2 Online, Local, rgmanager

 Service Name                               Owner (Last)                               State         
 ------- ----                               ----- ------                               -----         
 service:libvirtd_n01                       an-a05n01.alteeve.ca                       started       
 service:libvirtd_n02                       an-a05n02.alteeve.ca                       started       
 service:storage_n01                        (an-a05n01.alteeve.ca)                     disabled      
 service:storage_n02                        an-a05n02.alteeve.ca                       started

Notice how service:storage_n01 is now in the disabled state? If you check the status of drbd now, you will see that an-a05n01 is indeed down.

an-a05n01
/etc/init.d/drbd status
drbd not loaded
an-a05n02
/etc/init.d/drbd status
drbd driver loaded OK; device status:
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2013-09-27 16:00:43
m:res  cs            ro               ds                 p  mounted  fstype
0:r0   WFConnection  Primary/Unknown  UpToDate/Outdated  C
1:r1   WFConnection  Primary/Unknown  UpToDate/Outdated  C

You'll find that clvmd and gfs2 are stopped as well.

Pretty simple!

Starting Clustered Storage

As we saw earlier, the storage and libvirtd services start automatically. It's still important to know how to manually start these services though. So that is what we'll cover here.

The main difference from stopping the service is that we swap the -d switch for the -e, enable, switch. We will also add the target cluster member name using the -m switch. We didn't need to use the member switch while stopping because the cluster could tell where the service was running and, thus, which member to contact to stop the service.

Should you omit the member name, the cluster will try to start the service on the local node. Note that the service will start on the node the command was issued on, regardless of the fail-over domain's ordered policy. That is to say, a service will not start on another node in the cluster when the member option is not specified, even if the fail-over domain is configured to prefer another node.

As always, start by verifying the current state of the services.

an-a05n01
clustat
Cluster Status for an-anvil-05 @ Fri Nov  1 23:36:32 2013
Member Status: Quorate

 Member Name                                         ID   Status
 ------ ----                                         ---- ------
 an-a05n01.alteeve.ca                                    1 Online, Local, rgmanager
 an-a05n02.alteeve.ca                                    2 Online, rgmanager

 Service Name                               Owner (Last)                               State         
 ------- ----                               ----- ------                               -----         
 service:libvirtd_n01                       an-a05n01.alteeve.ca                       started       
 service:libvirtd_n02                       an-a05n02.alteeve.ca                       started       
 service:storage_n01                        (an-a05n01.alteeve.ca)                     disabled      
 service:storage_n02                        an-a05n02.alteeve.ca                       started
an-a05n02
clustat
Cluster Status for an-anvil-05 @ Fri Nov  1 23:36:32 2013
Member Status: Quorate

 Member Name                                         ID   Status
 ------ ----                                         ---- ------
 an-a05n01.alteeve.ca                                    1 Online, rgmanager
 an-a05n02.alteeve.ca                                    2 Online, Local, rgmanager

 Service Name                               Owner (Last)                               State         
 ------- ----                               ----- ------                               -----         
 service:libvirtd_n01                       an-a05n01.alteeve.ca                       started       
 service:libvirtd_n02                       an-a05n02.alteeve.ca                       started       
 service:storage_n01                        (an-a05n01.alteeve.ca)                     disabled      
 service:storage_n02                        an-a05n02.alteeve.ca                       started

As expected, storage_n01 is disabled. Let's start it up.


an-a05n01
clusvcadm -e storage_n01 -m an-a05n01.alteeve.ca
Member an-a05n01.alteeve.ca trying to enable service:storage_n01...Success
service:storage_n01 is now running on an-a05n01.alteeve.ca

Verify with another clustat call.

an-a05n01
clustat
Cluster Status for an-anvil-05 @ Fri Nov  1 23:45:20 2013
Member Status: Quorate

 Member Name                                         ID   Status
 ------ ----                                         ---- ------
 an-a05n01.alteeve.ca                                    1 Online, Local, rgmanager
 an-a05n02.alteeve.ca                                    2 Online, rgmanager

 Service Name                               Owner (Last)                               State         
 ------- ----                               ----- ------                               -----         
 service:libvirtd_n01                       an-a05n01.alteeve.ca                       started       
 service:libvirtd_n02                       an-a05n02.alteeve.ca                       started       
 service:storage_n01                        an-a05n01.alteeve.ca                       started       
 service:storage_n02                        an-a05n02.alteeve.ca                       started
an-a05n02
clustat
Cluster Status for an-anvil-05 @ Fri Nov  1 23:45:20 2013
Member Status: Quorate

 Member Name                                         ID   Status
 ------ ----                                         ---- ------
 an-a05n01.alteeve.ca                                    1 Online, rgmanager
 an-a05n02.alteeve.ca                                    2 Online, Local, rgmanager

 Service Name                               Owner (Last)                               State         
 ------- ----                               ----- ------                               -----         
 service:libvirtd_n01                       an-a05n01.alteeve.ca                       started       
 service:libvirtd_n02                       an-a05n02.alteeve.ca                       started       
 service:storage_n01                        an-a05n01.alteeve.ca                       started       
 service:storage_n02                        an-a05n02.alteeve.ca                       started

If we look at DRBD now, it will show as being up and running on both nodes.

Note: If the DRBD status shows the resource still stopped on the node, give it a minute and check again. It can sometimes take a few moments before the resources in the service start.
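Rather than re-running the status command by hand, you can let watch re-run it for you until both resources show Connected and Primary/Primary:

# Re-check the DRBD status every two seconds; press ctrl+c to exit.
watch -n 2 /etc/init.d/drbd status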
an-a05n01
/etc/init.d/drbd status
drbd driver loaded OK; device status:
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2013-09-27 16:00:43
m:res  cs         ro               ds                 p  mounted  fstype
0:r0   Connected  Primary/Primary  UpToDate/UpToDate  C
1:r1   Connected  Primary/Primary  UpToDate/UpToDate  C
an-a05n02
/etc/init.d/drbd status
drbd driver loaded OK; device status:
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2013-09-27 16:00:43
m:res  cs         ro               ds                 p  mounted  fstype
0:r0   Connected  Primary/Primary  UpToDate/UpToDate  C
1:r1   Connected  Primary/Primary  UpToDate/UpToDate  C

Everything is back up and running normally.

Testing Network Redundancy

Now that the Anvil! is up and running, it's time to test the network's fault tolerance capabilities.

We wanted to wait this long because we need to see how our cluster and storage software handles the failure and recovery of various networking components. Had we tested before now, we would have had to rely on simple tests, like ping responses, which do not give us a complete picture of the network's real resiliency.

We will perform the following tests:

  • Pull each network cable and confirm that the bond it belongs to failed over to the other interface.
  • Kill the primary switch entirely and then recover it.
  • Kill the backup switch entirely and then recover it.

During these tests, we will watch the following:

  • Watch a special /proc file for each bond to see how its state changes.
  • Run a ping flood from each node to the other node, using each of our three networks.
  • Watch the cluster membership.
  • Watch the status of the DRBD resources.
  • Tail the system log files.

The cluster will be formed and the storage services will be running. We do not need to have the servers running, so we will turn them off; if something goes wrong here, it will almost certainly end with a node being fenced, and there is no need to risk hurting the servers. Whether they are running or not will have no effect on the tests.

What we will be Watching

Before setting up for the tests, let's take a minute to look at the various things we'll be monitoring for faults.

Understanding '/proc/net/bonding/{bcn,sn,ifn}_bond1'

When a bond is created, a special procfs file is created whose name matches the name of the new bond device. We created three bonds; bcn_bond1, sn_bond1 and ifn_bond1, so we'll find /proc/net/bonding/bcn_bond1, /proc/net/bonding/sn_bond1 and /proc/net/bonding/ifn_bond1, respectively.
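A quick way to confirm that all three files exist is simply to list the directory; the output shown in the comments is what we would expect to see, given the bonds created earlier.

ls -1 /proc/net/bonding/
# Expected output:
# bcn_bond1
# ifn_bond1
# sn_bond1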

These look like normal files, and we can read them like files, but they're actually representations of kernel values; specifically, the health and state of the bond device, its slaves and the current performance settings. Let's take a look at bcn_bond1 on an-a05n01.

an-a05n01
cat /proc/net/bonding/bcn_bond1
Ethernet Channel Bonding Driver: v3.6.0 (September 26, 2009)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: bcn_link1 (primary_reselect always)
Currently Active Slave: bcn_link1
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 120000
Down Delay (ms): 0

Slave Interface: bcn_link1
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:19:99:9c:9b:9e
Slave queue ID: 0

Slave Interface: bcn_link2
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:1b:21:81:c3:35
Slave queue ID: 0

If you recall from the network setup step, we made bcn_link1 the primary interface and bcn_link2 the backup interface for bcn_bond1. Indeed, we can see that these two interfaces are slaved to bcn_bond1.

The data here is in three sections:

  • The first section shows the state of the overall bond.
an-a05n01
Bonding Mode: fault-tolerance (active-backup)
Primary Slave: bcn_link1 (primary_reselect always)
Currently Active Slave: bcn_link1
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 120000
Down Delay (ms): 0

This tells us that we're using the "Active/Backup" bonding mode, that the currently active interface is bcn_link1 and that bcn_link1 will always be used when both interfaces are healthy, though the bond will wait two minutes (120,000 ms) after bcn_link1 returns before switching back to it. It also tells us that the driver checks the link state of the slaved interfaces every 100 ms.
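All of these values trace back to the bonding options set when the bond was created earlier in this tutorial. As a point of reference only, the BONDING_OPTS line in the bond's ifcfg file would look something like the sketch below; the exact file on your nodes may include additional options.

# Excerpt from /etc/sysconfig/network-scripts/ifcfg-bcn_bond1 (illustrative only).
BONDING_OPTS="mode=1 miimon=100 updelay=120000 downdelay=0 primary=bcn_link1 primary_reselect=always"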

The next two sections cover the two slaved interfaces:

  • Information on bcn_link1
an-a05n01
Slave Interface: bcn_link1
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:19:99:9c:9b:9e
Slave queue ID: 0

We see here that the link (MII Status) is up and running at 1000 Mbps in full duplex mode. It shows us that it has not seen any failures on this interface since the bond was last started. It also shows us the interface's real MAC address. This is important because, from the point of view of ifconfig or ip addr, both slaved interfaces will appear to have the same MAC address (which depends on the currently active interface). This is a trick done in active-backup (mode=1) bonding to speed up fail-over. The queue ID is used in other bonding modes for routing traffic down certain slaves when possible; we can ignore it here.
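If you want to see this for yourself, ethtool can report a slave's permanent hardware address, which you can compare against what ip addr shows while the interface is slaved. This assumes your ethtool supports the -P (show permanent address) switch.

# Show the permanent (burned-in) MAC address of the slave.
ethtool -P bcn_link1

# Compare with the MAC address currently presented by the slaved interface.
ip addr show bcn_link1 | grep link/ether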

  • Information on bcn_link2:
an-a05n01
Slave Interface: bcn_link2
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:1b:21:81:c3:35
Slave queue ID: 0

The bcn_link2 information is more or less the same as the first. This is expected because, usually, the hardware is the same. The only expected differences are the device name and MAC address, of course.

Understanding '/etc/init.d/drbd status'

Earlier, we looked at another procfs file called /proc/drbd in order to watch the state of our DRBD resources. There is another way we can monitor DRBD using its initialization script. We'll use that method here.

Let's look at an-a05n01.

an-a05n01
/etc/init.d/drbd status
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2013-09-27 16:00:43
m:res  cs         ro               ds                 p  mounted  fstype
0:r0   Connected  Primary/Primary  UpToDate/UpToDate  C
1:r1   Connected  Primary/Primary  UpToDate/UpToDate  C

You will notice that the output is almost exactly the same as cat /proc/drbd's output, but formatted a little nicer.
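
If you prefer the raw procfs view during testing, the same information can be kept on screen with watch. Either form works; use whichever you find easier to read.

watch -n 1 cat /proc/drbd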

Understanding 'cman_tool nodes'

This is a more specific cman_tool call than we've used in the past. Before, we called cman_tool status to get a broad overview of the cluster's state. The tool can be used in many other ways to get more specific information about the cluster.

If you recall, cman_tool status would show us the simple sum of nodes in the cluster; Nodes: 2. If we want to know more about the nodes, we can use cman_tool nodes. Let's see what that looks like on an-a05n01.

an-a05n01
cman_tool nodes
Node  Sts   Inc   Joined               Name
   1   M    332   2013-11-27 14:11:01  an-a05n01.alteeve.ca
   2   M    340   2013-11-27 14:11:02  an-a05n02.alteeve.ca

Slightly more informative.
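
As an aside, if you only want the node count and quorum details out of the broader status output, a quick filter does the job. This is just a convenience on output we've already seen.

cman_tool status | grep -e "Nodes" -e "Quorum"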

Network Testing Terminal Layout

If you have a decent resolution monitor (or multiple monitors), you should be able to open 18 terminals at once. This is how many are needed to run ping floods, watch the bond status files, watch the system logs, watch DRBD and watch cluster membership all at the same time. This configuration makes it very easy to keep a near real-time, complete view of all network components.

Personally, I have a 1920 x 1080 screen, which is pretty typical these days. I use a 9-point monospace font in my gnome terminals and I disable the menu bar. With that, the layout below fits nicely;

Terminal layout used for HA network testing; Calls running.

The details of that are:

Terminal layout for monitoring during network testing.

an-a05n01:

  • Watch bcn_bond1 (terminal window @ 70 x 10)
  • Ping flood an-a05n02.bcn (terminal window @ 70 x 10)
  • Watch cman_tool nodes (terminal window @ 127 x 10)
  • Watch sn_bond1 (terminal window @ 70 x 10)
  • Ping flood an-a05n02.sn (terminal window @ 70 x 10)
  • Watch /etc/init.d/drbd status (terminal window @ 127 x 10)
  • Watch ifn_bond1 (terminal window @ 70 x 10)
  • Ping flood an-a05n02.ifn (terminal window @ 70 x 10)
  • Watch tail -f -n 0 /var/log/messages (terminal window @ 127 x 10)

an-a05n02:

  • Watch bcn_bond1 (terminal window @ 70 x 10)
  • Ping flood an-a05n01.bcn (terminal window @ 70 x 10)
  • Watch cman_tool nodes (terminal window @ 127 x 10)
  • Watch sn_bond1 (terminal window @ 70 x 10)
  • Ping flood an-a05n01.sn (terminal window @ 70 x 10)
  • Watch /etc/init.d/drbd status (terminal window @ 127 x 10)
  • Watch ifn_bond1 (terminal window @ 70 x 10)
  • Ping flood an-a05n01.ifn (terminal window @ 70 x 10)
  • Watch tail -f -n 0 /var/log/messages (terminal window @ 127 x 10)
The actual commands we will use are:

an-a05n01 Task Command
Watch bcn_bond1 watch "cat /proc/net/bonding/bcn_bond1 | grep -e Slave -e Status | grep -v queue"
Watch sn_bond1 watch "cat /proc/net/bonding/sn_bond1 | grep -e Slave -e Status | grep -v queue"
Watch ifn_bond1 watch "cat /proc/net/bonding/ifn_bond1 | grep -e Slave -e Status | grep -v queue"
Ping flood an-a05n02.bcn clear; ping -f an-a05n02.bcn
Ping flood an-a05n02.sn clear; ping -f an-a05n02.sn
Ping flood an-a05n02.ifn clear; ping -f an-a05n02.ifn
Watch cluster membership watch cman_tool nodes
Watch DRBD resource status watch /etc/init.d/drbd status
tail system logs clear; tail -f -n 0 /var/log/messages
an-a05n02 Task Command
Watch bcn_bond1 watch "cat /proc/net/bonding/bcn_bond1 | grep -e Slave -e Status | grep -v queue"
Watch sn_bond1 watch "cat /proc/net/bonding/sn_bond1 | grep -e Slave -e Status | grep -v queue"
Watch ifn_bond1 watch "cat /proc/net/bonding/ifn_bond1 | grep -e Slave -e Status | grep -v queue"
Ping flood an-a05n01.bcn clear; ping -f an-a05n01.bcn
Ping flood an-a05n01.sn clear; ping -f an-a05n01.sn
Ping flood an-a05n01.ifn clear; ping -f an-a05n01.ifn
Watch cluster membership watch cman_tool nodes
Watch DRBD resource status watch /etc/init.d/drbd status
tail system logs clear; tail -f -n 0 /var/log/messages

With this, we can keep a real-time overview of the status of all network, DRBD and cluster components for both nodes. It may take a little bit to set up, but it will make the following network failure and recovery tests much easier to keep track of. Most importantly, it will allow you to quickly see if any of the tests fail.

How to Know if the Tests Passed

Well, the most obvious answer to this question is whether or not the cluster stack blows up.

We can be a little more subtle than that though.

We will be watching for:

  • Bonds not failing over to or back from their backup links when the primary link fails.
  • More than 20 or 30 lost packets on each affected bond as it fails over or back. This may sound like a lot of dropped packets, but we're flooding the network with as many pings as the hardware can push out, so 20 to 30 lost packets is actually very low packet loss.
  • Corosync declaring the peer node lost and cluster membership changing / node fencing.
  • DRBD losing connection to the peer / node fencing.

Breaking things!

Documenting the testing of every failure condition would add substantially to the length of this tutorial without adding much value.

Instead, we will look at sample failures to see what to expect. You can then use them as references for your own testing.

Failing a Bond's Primary Interface

For this test, we will pull bcn_link1's network cable out of an-a05n01. This will trigger a fail-over to bcn_link2, which we will see in an-a05n01's bcn_bond1 file, and messages about the failure will appear in the system logs. Both an-a05n01 and an-a05n02's ping floods on the BCN will show a number of dropped packets.

Assuming all goes well, corosync should not report any errors or react in any way to this test.
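
Physically pulling the cable is the best test because it exercises the NIC port, the cable and the switch port. If you can't get physical access during a given run, a rough software approximation is sketched below; note that this is a weaker test than a real pull, and it is offered only as an alternative, not the method used in the results that follow.

# Rough software approximation of a pulled cable (does not exercise the physical layer):
ip link set bcn_link1 down
# ... observe the fail-over, then restore the link:
ip link set bcn_link1 up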

So pull the cable and see if your results match ours.

an-a05n01

bcn_bond1 data:

Primary Slave: bcn_link1 (primary_reselect always)
Currently Active Slave: bcn_link2
MII Status: up
Slave Interface: bcn_link1
MII Status: down
Slave Interface: bcn_link2
MII Status: up

System log entries:

Nov 27 19:54:44 an-a05n01 kernel: igb: bcn_link1 NIC Link is Down
Nov 27 19:54:44 an-a05n01 kernel: bonding: bcn_bond1: link status definitely down for interface bcn_link1, disabling it
Nov 27 19:54:44 an-a05n01 kernel: bonding: bcn_bond1: making interface bcn_link2 the new active one.

This shows that bcn_link2 became the active link and bcn_link1 shows as down.

Let's look at the ping flood:

an-a05n01 an-a05n02
PING an-a05n02.bcn (10.20.50.2) 56(84) bytes of data.
..........................
PING an-a05n01.bcn (10.20.50.1) 56(84) bytes of data.
..........................

Exactly in line with what we expected! If you look at the cluster membership and system logs, you will see that nothing was noticed outside of the bonding driver!

So let's plug the cable back in.

We'll notice that the bond driver sees the link return, changes the state of bcn_link1 to "going back" and, at first, nothing more happens. After two minutes, bcn_bond1 will switch back to using bcn_link1 and there will be another short burst of dropped packets.
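
If you want to confirm the configured delay and the active slave without reading the whole procfs file, the bonding driver also exposes these values under sysfs; the paths below are standard for the bonding module.

cat /sys/class/net/bcn_bond1/bonding/updelay
cat /sys/class/net/bcn_bond1/bonding/active_slave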

an-a05n01

bcn_bond1 data:

Primary Slave: bcn_link1 (primary_reselect always)
Currently Active Slave: bcn_link2
MII Status: up
Slave Interface: bcn_link1
MII Status: going back
Slave Interface: bcn_link2
MII Status: up

System log entries:

Nov 27 20:02:24 an-a05n01 kernel: igb: bcn_link1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
Nov 27 20:02:24 an-a05n01 kernel: bonding: bcn_bond1: link status up for interface bcn_link1, enabling it in 120000 ms.

Now we wait for two minutes.

Ding!

an-a05n01

bcn_bond1 data:

Primary Slave: bcn_link1 (primary_reselect always)
Currently Active Slave: bcn_link1
MII Status: up
Slave Interface: bcn_link1
MII Status: up
Slave Interface: bcn_link2
MII Status: up

System log entries:

Nov 27 20:04:24 an-a05n01 kernel: bcn_bond1: link status definitely up for interface bcn_link1, 1000 Mbps full duplex.
Nov 27 20:04:24 an-a05n01 kernel: bonding: bcn_bond1: making interface bcn_link1 the new active one.

Now let's look at the dropped packets when the switch-back happened:

an-a05n01 an-a05n02
PING an-a05n02.bcn (10.20.50.2) 56(84) bytes of data.
.
PING an-a05n01.bcn (10.20.50.1) 56(84) bytes of data.
...

Notice how an-a05n01 didn't lose a packet and an-a05n02 only lost a few? The switch-back was controlled, so no time was lost detecting a link failure.

Success!

Note: Don't be tempted to test only a few links!

Repeat this test for all network connections on both nodes. Ensure that each link fails and recovers in the same way. We have a complex network, and tests like this help find cabling and configuration issues. These tests have value beyond simply verifying fail-over and recovery.
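
Between tests, it can help to quickly confirm that every bond and both of its slaves report their links as up before pulling the next cable. A simple count per bond file works; when healthy, each file should report three "up" lines (the bond itself plus its two slaves).

grep -c "MII Status: up" /proc/net/bonding/*_bond1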

Failing the Network Switches

Failing and then recovering the primary switch tests a few things:

  • Can all the bonds fail over to their backup links at the same time?
  • Does the switch stack handle the loss of the primary switch properly?
  • Does the switch interrupt traffic when it recovers?

Even if you don't have a stacked switch, this test is still very important. We set the updelay to two minutes, but there is a chance that this is still not long enough for your switch. This test will expose issues like that.

Note: If you don't have port trunking, be sure to switch your workstation's links or network uplink from the primary to the backup switch before proceeding. This will ensure you can monitor the nodes during the test without interruption.

Before we start, let's take a look at the current view of things:

an-a05n01
Watching bcn_bond1
Primary Slave: bcn_link1 (primary_reselect always)
Currently Active Slave: bcn_link1
MII Status: up
Slave Interface: bcn_link1
MII Status: up
Slave Interface: bcn_link2
MII Status: up
Ping flooding an-a05n02.bcn
PING an-a05n02.bcn (10.20.50.2) 56(84) bytes of data.
.
Watching cman_tool nodes
Node  Sts   Inc   Joined               Name
   1   M    348   2013-12-02 10:05:17  an-a05n01.alteeve.ca
   2   M    360   2013-12-02 10:17:45  an-a05n02.alteeve.ca
Watching sn_bond1
Primary Slave: sn_link1 (primary_reselect always)
Currently Active Slave: sn_link1
MII Status: up
Slave Interface: sn_link1
MII Status: up
Slave Interface: sn_link2
MII Status: up
Ping flooding an-a05n02.sn
PING an-a05n02.sn (10.10.50.2) 56(84) bytes of data.
.
Watching /etc/init.d/drbd status
drbd driver loaded OK; device status:
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2013-09-27 16:00:43
m:res  cs         ro               ds                 p  mounted  fstype
0:r0   Connected  Primary/Primary  UpToDate/UpToDate  C
1:r1   Connected  Primary/Primary  UpToDate/UpToDate  C
Watching ifn_bond1
Primary Slave: ifn_link1 (primary_reselect always)
Currently Active Slave: ifn_link1
MII Status: up
Slave Interface: ifn_link1
MII Status: up
Slave Interface: ifn_link2
MII Status: up
Ping flooding an-a05n02.ifn
PING an-a05n02.ifn (10.255.50.2) 56(84) bytes of data.
.
Watching tail -f -n 0 /var/log/messages
an-a05n02
Watching bcn_bond1
Primary Slave: bcn_link1 (primary_reselect always)
Currently Active Slave: bcn_link1
MII Status: up
Slave Interface: bcn_link1
MII Status: up
Slave Interface: bcn_link2
MII Status: up
Ping flooding an-a05n01.bcn
PING an-a05n01.bcn (10.20.50.1) 56(84) bytes of data.
.
Watching cman_tool nodes
Node  Sts   Inc   Joined               Name
   1   M    360   2013-12-02 10:17:45  an-a05n01.alteeve.ca
   2   M    356   2013-12-02 10:17:45  an-a05n02.alteeve.ca
Watching sn_bond1
Primary Slave: sn_link1 (primary_reselect always)
Currently Active Slave: sn_link1
MII Status: up
Slave Interface: sn_link1
MII Status: up
Slave Interface: sn_link2
MII Status: up
Ping flooding an-a05n01.sn
PING an-a05n01.sn (10.10.50.1) 56(84) bytes of data.
.
Watching /etc/init.d/drbd status
drbd driver loaded OK; device status:
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2013-09-27 16:00:43
m:res  cs         ro               ds                 p  mounted  fstype
0:r0   Connected  Primary/Primary  UpToDate/UpToDate  C
1:r1   Connected  Primary/Primary  UpToDate/UpToDate  C
Watching ifn_bond1
Primary Slave: ifn_link1 (primary_reselect always)
Currently Active Slave: ifn_link1
MII Status: up
Slave Interface: ifn_link1
MII Status: up
Slave Interface: ifn_link2
MII Status: up
Ping flooding an-a05n01.ifn
PING an-a05n01.ifn (10.255.50.1) 56(84) bytes of data.
.
Watching tail -f -n 0 /var/log/messages

So now we will pull the power cable out of the primary switch and wait for things to settle.

an-a05n01
Watching bcn_bond1
Primary Slave: bcn_link1 (primary_reselect always)
Currently Active Slave: bcn_link2
MII Status: up
Slave Interface: bcn_link1
MII Status: down
Slave Interface: bcn_link2
MII Status: up
Ping flooding an-a05n02.bcn
PING an-a05n02.bcn (10.20.50.2) 56(84) bytes of data.
.............................
Watching cman_tool nodes
Node  Sts   Inc   Joined               Name
   1   M    348   2013-12-02 10:05:17  an-a05n01.alteeve.ca
   2   M    360   2013-12-02 10:17:45  an-a05n02.alteeve.ca
Watching sn_bond1
Primary Slave: sn_link1 (primary_reselect always)
Currently Active Slave: sn_link2
MII Status: up
Slave Interface: sn_link1
MII Status: down
Slave Interface: sn_link2
MII Status: up
Ping flooding an-a05n02.sn
PING an-a05n02.sn (10.10.50.2) 56(84) bytes of data.
................................
Watching /etc/init.d/drbd status
drbd driver loaded OK; device status:
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2013-09-27 16:00:43
m:res  cs         ro               ds                 p  mounted  fstype
0:r0   Connected  Primary/Primary  UpToDate/UpToDate  C
1:r1   Connected  Primary/Primary  UpToDate/UpToDate  C
Watching ifn_bond1
Primary Slave: ifn_link1 (primary_reselect always)
Currently Active Slave: ifn_link2
MII Status: up
Slave Interface: ifn_link1
MII Status: down
Slave Interface: ifn_link2
MII Status: up
Ping flooding an-a05n02.ifn
PING an-a05n02.ifn (10.255.50.2) 56(84) bytes of data.
..............................
Watching tail -f -n 0 /var/log/messages
Dec  2 14:30:33 an-a05n01 kernel: e1000e: bcn_link1 NIC Link is Down
Dec  2 14:30:33 an-a05n01 kernel: igb: ifn_link1 NIC Link is Down
Dec  2 14:30:33 an-a05n01 kernel: igb: sn_link1 NIC Link is Down
Dec  2 14:30:33 an-a05n01 kernel: bonding: sn_bond1: link status definitely down for interface sn_link1, disabling it
Dec  2 14:30:33 an-a05n01 kernel: bonding: sn_bond1: making interface sn_link2 the new active one.
Dec  2 14:30:33 an-a05n01 kernel: bonding: ifn_bond1: link status definitely down for interface ifn_link1, disabling it
Dec  2 14:30:33 an-a05n01 kernel: bonding: ifn_bond1: making interface ifn_link2 the new active one.
Dec  2 14:30:33 an-a05n01 kernel: device ifn_link1 left promiscuous mode
Dec  2 14:30:33 an-a05n01 kernel: device ifn_link2 entered promiscuous mode
Dec  2 14:30:33 an-a05n01 kernel: bonding: bcn_bond1: link status definitely down for interface bcn_link1, disabling it
Dec  2 14:30:33 an-a05n01 kernel: bonding: bcn_bond1: making interface bcn_link2 the new active one.
an-a05n02
Watching bcn_bond1
Primary Slave: bcn_link1 (primary_reselect always)
Currently Active Slave: bcn_link2
MII Status: up
Slave Interface: bcn_link1
MII Status: down
Slave Interface: bcn_link2
MII Status: up
Ping flooding an-a05n01.bcn
PING an-a05n01.bcn (10.20.50.1) 56(84) bytes of data.
................................
Watching cman_tool nodes
Node  Sts   Inc   Joined               Name
   1   M    360   2013-12-02 10:17:45  an-a05n01.alteeve.ca
   2   M    356   2013-12-02 10:17:45  an-a05n02.alteeve.ca
Watching sn_bond1
Primary Slave: sn_link1 (primary_reselect always)
Currently Active Slave: sn_link2
MII Status: up
Slave Interface: sn_link1
MII Status: down
Slave Interface: sn_link2
MII Status: up
Ping flooding an-a05n01.sn
PING an-a05n01.sn (10.10.50.1) 56(84) bytes of data.
.............................
Watching /etc/init.d/drbd status
drbd driver loaded OK; device status:
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2013-09-27 16:00:43
m:res  cs         ro               ds                 p  mounted  fstype
0:r0   Connected  Primary/Primary  UpToDate/UpToDate  C
1:r1   Connected  Primary/Primary  UpToDate/UpToDate  C
Watching ifn_bond1
Primary Slave: ifn_link1 (primary_reselect always)
Currently Active Slave: ifn_link2
MII Status: up
Slave Interface: ifn_link1
MII Status: down
Slave Interface: ifn_link2
MII Status: up
Ping flooding an-a05n01.ifn
PING an-a05n01.ifn (10.255.50.1) 56(84) bytes of data.
..................................
Watching tail -f -n 0 /var/log/messages
Dec  2 14:30:33 an-a05n02 kernel: e1000e: bcn_link1 NIC Link is Down
Dec  2 14:30:33 an-a05n02 kernel: igb: ifn_link1 NIC Link is Down
Dec  2 14:30:33 an-a05n02 kernel: igb: sn_link1 NIC Link is Down
Dec  2 14:30:33 an-a05n02 kernel: bonding: bcn_bond1: link status definitely down for interface bcn_link1, disabling it
Dec  2 14:30:33 an-a05n02 kernel: bonding: bcn_bond1: making interface bcn_link2 the new active one.
Dec  2 14:30:33 an-a05n02 kernel: bonding: sn_bond1: link status definitely down for interface sn_link1, disabling it
Dec  2 14:30:33 an-a05n02 kernel: bonding: sn_bond1: making interface sn_link2 the new active one.
Dec  2 14:30:33 an-a05n02 kernel: bonding: ifn_bond1: link status definitely down for interface ifn_link1, disabling it
Dec  2 14:30:33 an-a05n02 kernel: bonding: ifn_bond1: making interface ifn_link2 the new active one.
Dec  2 14:30:33 an-a05n02 kernel: device ifn_link1 left promiscuous mode
Dec  2 14:30:33 an-a05n02 kernel: device ifn_link2 entered promiscuous mode

Excellent! All of the bonds failed over to their backup interfaces and the cluster stayed stable. Both cluster membership and DRBD continued without interruption!

Now to test recovery of the primary switch. If everything was configured properly, the switch will come back up, the primary links will wait two minutes before being re-enabled and the actual cut-over will complete with few dropped packets.

an-a05n01
Watching bcn_bond1
Primary Slave: bcn_link1 (primary_reselect always)
Currently Active Slave: bcn_link1
MII Status: up
Slave Interface: bcn_link1
MII Status: up
Slave Interface: bcn_link2
MII Status: up
Ping flooding an-a05n02.bcn
PING an-a05n02.bcn (10.20.50.2) 56(84) bytes of data.
.
Watching cman_tool nodes
Node  Sts   Inc   Joined               Name
   1   M    348   2013-12-02 10:05:17  an-a05n01.alteeve.ca
   2   M    360   2013-12-02 10:17:45  an-a05n02.alteeve.ca
Watching sn_bond1
Primary Slave: sn_link1 (primary_reselect always)
Currently Active Slave: sn_link1
MII Status: up
Slave Interface: sn_link1
MII Status: up
Slave Interface: sn_link2
MII Status: up
Ping flooding an-a05n02.sn
PING an-a05n02.sn (10.10.50.2) 56(84) bytes of data.
.
Watching /etc/init.d/drbd status
drbd driver loaded OK; device status:
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2013-09-27 16:00:43
m:res  cs         ro               ds                 p  mounted  fstype
0:r0   Connected  Primary/Primary  UpToDate/UpToDate  C
1:r1   Connected  Primary/Primary  UpToDate/UpToDate  C
Watching ifn_bond1
Primary Slave: ifn_link1 (primary_reselect always)
Currently Active Slave: ifn_link1
MII Status: up
Slave Interface: ifn_link1
MII Status: up
Slave Interface: ifn_link2
MII Status: up
Ping flooding an-a05n02.ifn
PING an-a05n02.ifn (10.255.50.2) 56(84) bytes of data.
..
Watching tail -f -n 0 /var/log/messages
Dec  2 15:20:51 an-a05n01 kernel: e1000e: ifn_link1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Dec  2 15:20:51 an-a05n01 kernel: bonding: ifn_bond1: link status up for interface ifn_link1, enabling it in 120000 ms.
Dec  2 15:20:52 an-a05n01 kernel: igb: bcn_link1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
Dec  2 15:20:52 an-a05n01 kernel: igb: sn_link1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
Dec  2 15:20:52 an-a05n01 kernel: bonding: sn_bond1: link status up for interface sn_link1, enabling it in 120000 ms.
Dec  2 15:20:52 an-a05n01 kernel: bonding: bcn_bond1: link status up for interface bcn_link1, enabling it in 120000 ms.
Dec  2 15:22:51 an-a05n01 kernel: ifn_bond1: link status definitely up for interface ifn_link1, 1000 Mbps full duplex.
Dec  2 15:22:51 an-a05n01 kernel: bonding: ifn_bond1: making interface ifn_link1 the new active one.
Dec  2 15:22:51 an-a05n01 kernel: device ifn_link2 left promiscuous mode
Dec  2 15:22:51 an-a05n01 kernel: device ifn_link1 entered promiscuous mode
Dec  2 15:22:52 an-a05n01 kernel: sn_bond1: link status definitely up for interface sn_link1, 1000 Mbps full duplex.
Dec  2 15:22:52 an-a05n01 kernel: bonding: sn_bond1: making interface sn_link1 the new active one.
Dec  2 15:22:52 an-a05n01 kernel: bcn_bond1: link status definitely up for interface bcn_link1, 1000 Mbps full duplex.
Dec  2 15:22:52 an-a05n01 kernel: bonding: bcn_bond1: making interface bcn_link1 the new active one.
an-a05n02
Watching bcn_bond1
Primary Slave: bcn_link1 (primary_reselect always)
Currently Active Slave: bcn_link1
MII Status: up
Slave Interface: bcn_link1
MII Status: up
Slave Interface: bcn_link2
MII Status: up
Ping flooding an-a05n01.bcn
PING an-a05n01.bcn (10.20.50.1) 56(84) bytes of data.
...
Watching cman_tool nodes
Node  Sts   Inc   Joined               Name
   1   M    360   2013-12-02 10:17:45  an-a05n01.alteeve.ca
   2   M    356   2013-12-02 10:17:45  an-a05n02.alteeve.ca
Watching sn_bond1
Primary Slave: sn_link1 (primary_reselect always)
Currently Active Slave: sn_link1
MII Status: up
Slave Interface: sn_link1
MII Status: up
Slave Interface: sn_link2
MII Status: up
Ping flooding an-a05n01.sn
PING an-a05n01.sn (10.10.50.1) 56(84) bytes of data.
...
Watching /etc/init.d/drbd status
drbd driver loaded OK; device status:
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2013-09-27 16:00:43
m:res  cs         ro               ds                 p  mounted  fstype
0:r0   Connected  Primary/Primary  UpToDate/UpToDate  C
1:r1   Connected  Primary/Primary  UpToDate/UpToDate  C
Watching ifn_bond1
Primary Slave: ifn_link1 (primary_reselect always)
Currently Active Slave: ifn_link1
MII Status: up
Slave Interface: ifn_link1
MII Status: up
Slave Interface: ifn_link2
MII Status: up
Ping flooding an-a05n01.ifn
PING an-a05n01.ifn (10.255.50.1) 56(84) bytes of data.
.
Watching tail -f -n 0 /var/log/messages
Dec  2 15:20:51 an-a05n02 kernel: e1000e: ifn_link1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Dec  2 15:20:51 an-a05n02 kernel: bonding: ifn_bond1: link status up for interface ifn_link1, enabling it in 120000 ms.
Dec  2 15:20:52 an-a05n02 kernel: igb: sn_link1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
Dec  2 15:20:52 an-a05n02 kernel: bonding: sn_bond1: link status up for interface sn_link1, enabling it in 120000 ms.
Dec  2 15:20:52 an-a05n02 kernel: igb: bcn_link1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
Dec  2 15:20:53 an-a05n02 kernel: bonding: bcn_bond1: link status up for interface bcn_link1, enabling it in 120000 ms.
Dec  2 15:22:51 an-a05n02 kernel: ifn_bond1: link status definitely up for interface ifn_link1, 1000 Mbps full duplex.
Dec  2 15:22:51 an-a05n02 kernel: bonding: ifn_bond1: making interface ifn_link1 the new active one.
Dec  2 15:22:51 an-a05n02 kernel: device ifn_link2 left promiscuous mode
Dec  2 15:22:51 an-a05n02 kernel: device ifn_link1 entered promiscuous mode
Dec  2 15:22:52 an-a05n02 kernel: sn_bond1: link status definitely up for interface sn_link1, 1000 Mbps full duplex.
Dec  2 15:22:52 an-a05n02 kernel: bonding: sn_bond1: making interface sn_link1 the new active one.
Dec  2 15:22:53 an-a05n02 kernel: bcn_bond1: link status definitely up for interface bcn_link1, 1000 Mbps full duplex.
Dec  2 15:22:53 an-a05n02 kernel: bonding: bcn_bond1: making interface bcn_link1 the new active one.

Perfect!

Note: Some switches will show a link and then drop the connection a few times as they boot. If your switch is like this, you will see this reflected in the system logs. This should be fine because of the two minute updelay value.
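
If you suspect a switch is bouncing its links as it boots, the per-slave failure counters in the bond files make it easy to spot. On a healthy setup, the counts below should only climb by one per deliberate test.

grep -e "Slave Interface" -e "Link Failure Count" /proc/net/bonding/bcn_bond1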

Now repeat this test by failing and recovering the backup switch. Do not assume that, because the first switch cycled successfully, the second switch will as well. A bad configuration can easily allow the primary switch to pass this test while the secondary switch would cause a failure.

With the second switch test complete, we can be confident that the networking infrastructure is totally fault tolerant.

Provisioning Virtual Machines

Now we're getting to the purpose of our cluster: provisioning virtual machines!

We have two steps left:

  • Provision our VMs.
  • Add the VMs to rgmanager.

"Provisioning" a virtual machine simple means to create it; Assign a collection of emulated hardware, connected to physical devices, to a given virtual machine and begin the process of installing the operating system on it. This tutorial is more about clustering than it is about virtual machine administration, so some experience with managing virtual machines has to be assumed. If you need to brush up, here are some resources:

When you feel comfortable, proceed.

Before We Begin - Building a Dashboard

Striker dashboard with server "monitor" displayed.

One of the biggest advances since the original tutorial was written is the creation of the Striker - Cluster Dashboard.

It provides a very easy to use web-based user interface for building, modifying and removing servers on the Anvil! platform.

It also provides a "KVM switch" style access to the servers you create. This gives you direct access to your servers, just as if you have a physical keyboard, mouse and monitor plugged into a physical server. You can watch the server boot from the virtual, boot into recovery consoles or off of repair "DVDs" and so forth.

The link above covers the dashboard and its use, and includes a link to an installer showing how to set up a dashboard for yourself. Now is a good time to take a break from this tutorial and set up that dashboard.

If you do not wish to build a dashboard, that is fine. It is not required in this tutorial.

If you decide not to, though, you will need to set up "Virtual Machine Manager" on your (Linux) computer in order to get access to the servers we are about to build. You will need this in order to walk through the installation process for your new servers. Of course, once the install is complete, you can switch to another, traditional form of remote access, like RDP on Windows servers or ssh on *nix servers.

If you want to use "Virtual Machine Manager", look for a package from your distribution package manager with a name like virt-manager. Once it is installed, add the connections to your Anvil! nodes. Once that's done, you're ready to proceed to the next section!

A Note on the Following Server Installations

We wanted to show as many different server installations as possible. Obviously, it's unlikely that you will want or need all of the operating systems we're about to install. Please feel free to skip over the installation of servers that are not interesting to you.

Provision Planning

Note: We're going to spend a lot of time provisioning vm01-win2008. If you plan to skip it, please be sure to refer back to it if you run into questions on a later install.

If you recall, back when we were planning out our partitions, we already chose which servers will draw from which storage pools and how big their "hard drives" will be. The last thing to consider is RAM allocation. The servers we're using to write this tutorial are a little modest in the RAM department, with only 24 GiB of RAM. We need to subtract at least 2 GiB for the host nodes, leaving us with a total of 22 GiB.

That needs to be divided up amongst our eight servers. Now, nothing says you have to use it all, of course. It's perfectly fine to leave some RAM unallocated for future use. This is really up to you and your needs.

Let's put together a table with the RAM we plan to allocate and a summary of the logical volume we're going to create for each server. The LVs will be named after the server they'll be assigned to, with the suffix _0. Later, if we add a second "hard drive" to a server, it will have the suffix _1, and so on. A quick way to confirm that the storage pools have enough free space follows the table below.

Server RAM (GiB) Storage Pool (VG) LV name LV size
vm01-win2008 3 an-a05n01 vm01-win2008_0 150 GB
vm02-win2012 4 an-a05n02 vm02-win2012_0 150 GB
vm03-win7 3 an-a05n01 vm03-win7_0 100 GB
vm04-win8 4 an-a05n01 vm04-win8_0 100 GB
vm05-freebsd9 2 an-a05n02 vm05-freebsd9_0 50 GB
vm06-solaris11 2 an-a05n02 vm06-solaris11_0 100 GB
vm07-rhel6 2 an-a05n01 vm07-rhel6_0 50 GB
vm08-sles11 2 an-a05n01 vm08-sles11_0 100 GB
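
Before carving out the logical volumes planned above, it's worth confirming that each storage pool actually has enough free space for its share of the table. A quick check, using the volume group names from earlier in this tutorial:

vgs an-a05n01_vg0 an-a05n02_vg0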

If you plan to set static IP addresses for your servers, now would be a good time to select them, too. It's not needed, of course, but it certainly can make things easier to have all the details in one place.

Note: Not to spoil the surprise, but if you plan not to follow this tutorial exactly, please be sure to read the notes in the vm06-solaris11 section.

Provisioning vm01-win2008

View of vm01-win2008's desktop.

Before we can install the OS, we need to copy the installation media, and our driver disk if needed, into /shared/files.

Windows is licensed software, so you will need to purchase a copy. You can get an evaluation copy from Microsoft's website. In either case, downloading a copy of the installation media is an exercise for you, I am afraid.

As for drivers: we're going to use a special kind of emulated SCSI controller and a special kind of emulated network card for this and our other three Windows installs. These are called virtio devices (http://www.linux-kvm.org/page/Virtio) and they are designed to significantly improve storage and network speeds on KVM guests.

If you have ever installed windows on a newer server, you're probably already familiar with the process of installing drivers in order to see SCSI and RAID controllers during the boot process. If so, then what we're going to do here will be no different. If you have never done this before, don't worry. It's a fairly simple task.

You can create install media from a physical disk or copy install media using Striker's "Media Connector" function. Of course, you can also copy files to the Anvil! using standard tools like rsync and wget. Whichever method you prefer is fine.

In my case, I will rsync the Windows install ISO from another machine on our network to /shared/files via an-a05n01.

rsync -av --progress /data0/VMs/files/Windows_Svr_2008_R2_64Bit_SP1.ISO root@10.255.50.1:/shared/files/
root@10.255.50.1's password:
sending incremental file list
Windows_Svr_2008_R2_64Bit_SP1.ISO
  3166720000 100%   65.53MB/s    0:00:46 (xfer#1, to-check=0/1)

sent 3167106674 bytes  received 31 bytes  59198256.17 bytes/sec
total size is 3166720000  speedup is 1.00

For virtio, let's use wget to grab the latest stable version from the project's website. At the time of this writing, the "stable" version was 0.1.102, as you will see in the download below.

Being conservative when it comes to servers, my preference is to use the "stable" version.

an-a05n01
cd /shared/files/
wget -c https://fedorapeople.org/groups/virt/virtio-win/direct-downloads/stable-virtio/virtio-win.iso
cd ~
--2015-09-10 12:24:17--  https://fedorapeople.org/groups/virt/virtio-win/direct-downloads/stable-virtio/virtio-win.iso
Resolving fedorapeople.org... 152.19.134.196, 2610:28:3090:3001:5054:ff:feff:683f
Connecting to fedorapeople.org|152.19.134.196|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://fedorapeople.org/groups/virt/virtio-win/direct-downloads/archive-virtio/virtio-win-0.1.102/virtio-win.iso [following]
--2015-09-10 12:24:17--  https://fedorapeople.org/groups/virt/virtio-win/direct-downloads/archive-virtio/virtio-win-0.1.102/virtio-win.iso
Reusing existing connection to fedorapeople.org:443.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://fedorapeople.org/groups/virt/virtio-win/direct-downloads/archive-virtio/virtio-win-0.1.102/virtio-win-0.1.102.iso [following]
--2015-09-10 12:24:17--  https://fedorapeople.org/groups/virt/virtio-win/direct-downloads/archive-virtio/virtio-win-0.1.102/virtio-win-0.1.102.iso
Reusing existing connection to fedorapeople.org:443.
HTTP request sent, awaiting response... 200 OK
Length: 160755712 (153M) [application/octet-stream]
Saving to: `virtio-win.iso'

100%[====================================================================>] 160,755,712 1.11M/s   in 2m 36s  

2015-09-10 12:26:54 (1004 KB/s) - `virtio-win.iso' saved [160755712/160755712]

Notice that the original file name was virtio-win-0.1.102.iso, but the downloaded file ended up being called virtio-win.iso? Let's fix that so that, down the road, we know which version we have. We'll also make sure the file is world-readable.

an-a05n01
mv /shared/files/virtio-win.iso /shared/files/virtio-win-0.1.102.iso
chmod 644 /shared/files/virtio-win-0.1.102.iso
an-a05n02
ls -lah /shared/files/
total 3.1G
drwxr-xr-x. 2 root root 3.8K Nov  2 10:48 .
drwxr-xr-x. 6 root root 3.8K Nov  1 01:23 ..
-rw-r--r--  1 root root 154M Apr 26 18:25 virtio-win-0.1.102.iso
-rw-rw-r--. 1 qemu qemu 3.0G Oct 14  2011 Windows_Svr_2008_R2_64Bit_SP1.ISO

Ok, we're ready!

Creating vm01-win2008's Storage

Note: Earlier, we used parted to examine our free space and create our DRBD partitions. Unfortunately, parted shows sizes in GB (base 10) where LVM uses GiB (base 2). If we used LVM's "xxG" size notation, it would use more space than we expect, relative to our planning in the parted stage. LVM doesn't allow specifying new LV sizes in GB instead of GiB, so here we will specify sizes in MiB to help narrow the difference. You can read more about this issue here.
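
To make the difference concrete: asking LVM for "150G" would allocate 150 GiB, which is roughly 161.1 GB, noticeably more than the 150 GB we planned in parted. Asking for 150,000 MiB instead allocates about 146.5 GiB, or roughly 157.3 GB, which is closer to (though still a little above) the planned size.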

Creating vm01-win2008's "hard drive" is a simple process. Recall that we want a 150 GB logical volume carved from the an-a05n01_vg0 volume group (the "storage pool" for servers designed to run on an-a05n01). Knowing this, the command to create the new LV is below.

an-a05n01
lvcreate -L 150000M -n vm01-win2008_0 /dev/an-a05n01_vg0
  Logical volume "vm01-win2008_0" created
an-a05n02
lvdisplay /dev/an-a05n01_vg0/vm01-win2008_0
  --- Logical volume ---
  LV Path                /dev/an-a05n01_vg0/vm01-win2008_0
  LV Name                vm01-win2008_0
  VG Name                an-a05n01_vg0
  LV UUID                bT0zon-H2LN-0jmi-refA-J0QX-zHjT-nEY7YY
  LV Write Access        read/write
  LV Creation host, time an-a05n01.alteeve.ca, 2013-11-02 11:04:44 -0400
  LV Status              available
  # open                 0
  LV Size                146.48 GiB
  Current LE             37500
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:1

Notice how we see 146.48 GiB? That is roughly the difference between "150 GB" and "150 GiB".

Creating vm01-win2008's virt-install Call

Now, with the storage created, we can craft the virt-install command. We'll put this into a file under the /shared/provision/ directory for future reference. Let's take a look at the command, then we'll discuss what the switches are for.

an-a05n01
touch /shared/provision/vm01-win2008.sh
chmod 755 /shared/provision/vm01-win2008.sh 
vim /shared/provision/vm01-win2008.sh
virt-install --connect qemu:///system \
  --name vm01-win2008 \
  --ram 3072 \
  --arch x86_64 \
  --vcpus 2 \
  --cdrom /shared/files/Windows_Svr_2008_R2_64Bit_SP1.ISO \
  --disk path=/shared/files/virtio-win-0.1-52.iso,device=cdrom --force \
  --os-variant win2k8 \
  --network bridge=ifn_bridge1,model=virtio \
  --disk path=/dev/an-a05n01_vg0/vm01-win2008_0,bus=virtio \
  --graphics spice > /var/log/an-install_vm01-win2008.log &
Note: Don't use tabs to indent the lines.

Let's break it down;

Switch Descriptions
--connect qemu:///system This tells virt-install to use the QEMU hardware emulator (as opposed to Xen, for example) and to install the server on the local node.
--name vm01-win2008 This sets the name of the server. It is the name we will use in the cluster configuration and whenever we use the libvirtd tools, like virsh.
--ram 3072 This sets the amount of RAM, in MiB, to allocate to this server. Here, we're allocating 3 GiB, which is 3,072 MiB.
--arch x86_64 This sets the emulated CPU's architecture to 64-bit. This can be used even when you plan to install a 32-bit OS, but not the other way around, of course.
--vcpus 2 This sets the number of CPU cores to allocate to this server. Here, we're allocating two CPUs.
--cdrom /shared/files/Windows_Svr_2008_R2_64Bit_SP1.ISO This tells the hypervisor to create a cd-rom (dvd-rom) drive and to "insert" the specified ISO as if it was a physical disk. This will be the initial boot device, too.
--disk path=/shared/files/virtio-win-0.1-52.iso,device=cdrom --force We need to make the virtio drivers available during the install process. This command is similar to the --cdrom above, but crafted as if it was a disk drive with the device=cdrom switch. This helps make sure that the cdrom above is used as the boot drive. Also note the --force option. This is used because, normally, if the ISO was "inserted" into another server's cd-rom, it would refuse to work here. The nature of ISOs ensures they're read-only, so we can safely force two or more servers to use the same ISO at the same time.
--os-variant win2k8 This tweaks the virt-manager's initial method of running and tunes the hypervisor to try and get the best performance for the server. There are many possible values here for many, many different operating systems. If you run virt-install --os-variant list on your node, you will get a full list of available operating systems. If you can't find your exact operating system, select the one that is the closest match.
--network bridge=ifn_bridge1,model=virtio This tells the hypervisor that we want to create a network card using the virtio "hardware" and that we want it plugged into the ifn_bridge1 bridge. We only need one network card, but if you wanted two or more, simply repeat this command. If you create two or more bridges, you can have different network devices connect to different bridges.
--disk path=/dev/an-a05n01_vg0/vm01-win2008_0,bus=virtio This tells the hypervisor what logical volume to use for the server's "hard drive". It also tells it to use the virtio emulated SCSI controller.
--graphics spice > /var/log/an-install_vm01-win2008.log Finally, this tells the hypervisor to use the spice emulated video card. It is a bit simplistic to call it simply a "graphics card", but that's close enough for now. Given that this is the last line, we close off the virt-install command with a simple redirection to a log file. Later, if we want to examine the install process, we can review /var/log/an-install_vm01-win2008.log for details on the install process.

Initializing vm01-win2008's Install

On your dashboard or workstation, open the "Virtual Machine Manager" and connect to both nodes.

We can install any server from either node. However, we know that each server has a preferred host, so it's sensible to use that host for the installation stage. In the case of vm01-win2008, the preferred host is an-a05n01, so we'll use it to kick off the install.

Once the install begins, the new server should appear in "Virtual Machine Manager". Double-click on it and you will see that the new server is booting off of the install cd-rom. We're installing Windows, so that will begin the install process.

Time to start the install!

an-a05n01
/shared/provision/vm01-win2008.sh
 Cannot open display: 
Run 'virt-viewer --help' to see a full list of available command line options

And it's off!

Installation of vm01-win2008 begins!
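
If you'd like to confirm from the node's command line that the new server is now defined and running while the installer boots, virsh can show it; the name will match the --name value used in the script above.

virsh list --all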

Follow the install process, entering the values you want. When you get to the install target screen, you will see that Windows can't find the hard drive.

The vm01-win2008 server doesn't see its hard drive.

This was expected because windows 2008 does not natively support virtio. That's why we used two virtual cd-rom drives and "inserted" the virtio driver disk into the second drive.

Warning: Since this tutorial was written, the virtio project has significantly changed the directory structure where drivers are held. The storage drivers are now found in viostor/2k8/amd64/. The trick of loading the other drivers is still possible by loading the "wrong" driver, one at a time, but it is quite a bit easier to now load the network and other drivers after the install completes. To do so, go to "Device Manager" and select the devices with a yellow exclamation mark, choose to update their drivers and tell it to search all subdirectories on the "dvd".

Click on "Load Driver" on the bottom right.

The vm01-win2008 server's "Load Driver" menu.

Click on "Browse".

The vm01-win2008 server's "Browse" menu.

The driver disk is in the second (virtual) cd-rom drive, mounted as drive e:. The drivers for Windows 2008 are the same as for Windows 7, so browse to E:\WIN7\AMD64 (assuming you are installing the 64-bit version of Windows) and click on "OK".

Selecting the network and storage drivers for the vm01-win2008 server.
Note: If you forget to select the network drivers here, you will have to manually install the drivers for the network card after the install has completed.

Press and hold the <control> key and click on both the "Red Hat VirtIO Ethernet Adapter" and the "Red Hat VirtIO SCSI Controller" drivers. By doing this, we won't have to install the network card's drivers later. Click on "Next" and the drivers will be installed.

Now we see the vm01-win2008 server's hard drive! Complete the install from here as you normally would.

Now you can finish installing Windows 2008 just as you would on a bare-iron server!

Install of vm01-win2008 is complete!

What you do from here is entirely up to you and your needs.

Note: If you wish, jump to Making vm01-win2008 a Highly Available Service now to immediately add vm01-win2008 to the cluster manager.

Provisioning vm02-win2012

Note: This install references steps taken in the vm01-win2008 install. If you skipped it, you may wish to look at it to get a better idea of some of the steps performed here.
View of vm02-win2012's desktop.

Before we can install the OS, we need to copy the installation media, and our driver disk if needed, into /shared/files.

Windows is licensed software, so you will need to purchase a copy. You can get an evaluation copy from Microsoft's website. In either case, downloading a copy of the installation media is an exercise for you, I am afraid.

As for drivers: we're going to use a special kind of emulated SCSI controller and a special kind of emulated network card for this and our other three Windows installs. These are called virtio devices (http://www.linux-kvm.org/page/Virtio) and they are designed to significantly improve storage and network speeds on KVM guests.

If you have ever installed windows on a newer server, you're probably already familiar with the process of installing drivers in order to see SCSI and RAID controllers during the boot process. If so, then what we're going to do here will be no different. If you have never done this before, don't worry. It's a fairly simple task.

You can create install media from a physical disk or copy install media using Striker's "Media Connector" function. Of course, you can also copy files to the Anvil! using standard tools like rsync and wget. Whichever method you prefer is fine.

In my case, I will rsync the Windows install ISO from another machine on our network to /shared/files via an-a05n01.

rsync -av --progress /data0/VMs/files/Windows_2012_R2_64-bit_Preview.iso root@10.255.50.1:/shared/files/
root@10.255.50.1's password:
sending incremental file list
Windows_2012_R2_64-bit_Preview.iso
  4128862208 100%   66.03MB/s    0:00:59 (xfer#1, to-check=0/1)

sent 4129366322 bytes  received 31 bytes  65029391.39 bytes/sec
total size is 4128862208  speedup is 1.00

For virtio, we can simply re-use the ISO we uploaded for vm01-win2008.

Note: We've planned to run vm02-win2012 on an-a05n02, so we will use that node for the provisioning stage.
an-a05n02
ls -lah /shared/files/
total 6.9G
drwxr-xr-x. 2 root root 3.8K Nov 11 11:28 .
drwxr-xr-x. 6 root root 3.8K Nov  1 01:23 ..
-rw-r--r--. 1 qemu qemu  56M Jan 22  2013 virtio-win-0.1-52.iso
-rw-rw-r--. 1 1000 1000 3.9G Oct  2 22:31 Windows_2012_R2_64-bit_Preview.iso
-rw-rw-r--. 1 qemu qemu 3.0G Oct 14  2011 Windows_Svr_2008_R2_64Bit_SP1.ISO

Ok, we're ready!

Creating vm02-win2012's Storage

Note: Earlier, we used parted to examine our free space and create our DRBD partitions. Unfortunately, parted shows sizes in GB (base 10) where LVM uses GiB (base 2). If we used LVM's "xxG" size notation, it would use more space than we expect, relative to our planning in the parted stage. LVM doesn't allow specifying new LV sizes in GB instead of GiB, so here we will specify sizes in MiB to help narrow the difference. You can read more about this issue here.

Creating vm02-win2012's "hard drive" is a simple process. Recall that we want a 150 GB logical volume carved from the an-a05n02_vg0 volume group (the "storage pool" for servers designed to run on an-a05n02). Knowing this, the command to create the new LV is below.

an-a05n02
lvcreate -L 150000M -n vm02-win2012_0 /dev/an-a05n02_vg0
  Logical volume "vm02-win2012_0" created
an-a05n01
lvdisplay /dev/an-a05n02_vg0/vm02-win2012_0
  --- Logical volume ---
  LV Path                /dev/an-a05n02_vg0/vm02-win2012_0
  LV Name                vm02-win2012_0
  VG Name                an-a05n02_vg0
  LV UUID                Lnyg1f-kNNV-qjfn-P7X3-LxLw-1Uyh-dfNfL0
  LV Write Access        read/write
  LV Creation host, time an-a05n02.alteeve.ca, 2013-11-11 11:30:55 -0500
  LV Status              available
  # open                 0
  LV Size                146.48 GiB
  Current LE             37500
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:2

Notice how we see 146.48 GiB? That is roughly the difference between "150 GB" and "150 GiB".

Creating vm02-win2012's virt-install Call

Now, with the storage created, we can craft the virt-install command. We'll put this into a file under the /shared/provision/ directory for future reference. Let's take a look at the command, then we'll discuss what the switches are for.

an-a05n02
touch /shared/provision/vm02-win2012.sh
chmod 755 /shared/provision/vm02-win2012.sh 
vim /shared/provision/vm02-win2012.sh
virt-install --connect qemu:///system \
  --name vm02-win2012 \
  --ram 4096 \
  --arch x86_64 \
  --vcpus 2 \
  --cdrom /shared/files/Windows_2012_R2_64-bit_Preview.iso \
  --disk path=/shared/files/virtio-win-0.1-52.iso,device=cdrom --force \
  --os-variant win2k8 \
  --network bridge=ifn_bridge1,model=virtio \
  --disk path=/dev/an-a05n02_vg0/vm02-win2012_0,bus=virtio \
  --graphics spice > /var/log/an-install_vm02-win2012.log &
Note: Don't use tabs to indent the lines.

Let's look at the differences from vm01-win2008;

Switch Descriptions
--name vm02-win2012 This is the name we're going to use for this server in the cluster and with the libvirtd tools.
--ram 4096 This sets the amount of RAM, in MiB, to allocate to this server. Here, we're allocating 4 GiB, which is 4,096 MiB.
--cdrom /shared/files/Windows_2012_R2_64-bit_Preview.iso This tells the hypervisor to create a cd-rom (dvd-rom) drive and to "insert" the specified ISO as if it was a physical disk. This will be the initial boot device, too.
--disk path=/shared/files/virtio-win-0.1-52.iso,device=cdrom --force This is the same as the vm01-win2008 provision script, but this is where the --force comes in handy. If this ISO was still "mounted" in vm01-win2008's cd-rom tray, the install would abort without --force.
--os-variant win2k8 This is also the same as the vm01-win2008 provision script. At the time of writing, there wasn't an entry for win2012, so we're using the closest match which is win2k8.
--disk path=/dev/an-a05n02_vg0/vm02-win2012_0,bus=virtio This tells the hypervisor what logical volume to use for the server's "hard drive". It also tells it to use the virtio emulated SCSI controller.
--graphics spice > /var/log/an-install_vm02-win2012.log We're using a new log file for our bash redirection. Later, if we want to examine the install process, we can review /var/log/an-install_vm02-win2012.log for details on the install process.

Initializing vm02-win2012's Install

On your dashboard or workstation, open the "Virtual Machine Manager" and connect to both nodes.

We can install any server from either node. However, we know that each server has a preferred host, so it's sensible to use that host for the installation stage. In the case of vm02-win2012, the preferred host is an-a05n02, so we'll use it to kick off the install.

Once the install begins, the new server should appear in "Virtual Machine Manager". Double-click on it and you will see that the new server is booting off of the install cd-rom. We're installing Windows, so that will begin the install process.

Time to start the install!

an-a05n02
/shared/provision/vm02-win2012.sh
 Cannot open display: 
Run 'virt-viewer --help' to see a full list of available command line options

And it's off!

Installation of vm02-win2012 begins!

Follow the install process, entering the values you want. When you get to the install target screen, you will see that Windows can't find the hard drive.

The vm02-win2012 server doesn't see its hard drive.
Warning: Since this tutorial was written, the virtio project has significantly changed the directory structure where drivers are held. The storage drivers are now found in viostor/2k12R2/amd64/. The trick of loading the other drivers is still possible by loading the "wrong" driver, one at a time, but it is quite a bit easier to now load the network and other drivers after the install completes. To do so, go to "Device Manager" and select the devices with a yellow exclamation mark, choose to update their drivers and tell it to search all subdirectories on the "dvd".

This was expected because Windows 2012 does not natively support virtio. That's why we used two virtual cd-rom drives and "inserted" the virtio driver disk into the second drive.

Click on "Load Driver" on the bottom right.

The vm02-win2012 server's "Load Driver" menu.

Click on "Browse".

The vm02-win2012 server's "Browse" menu.

The driver disk is in the second (virtual) cd-rom drive, mounted as drive e:. The drivers for Windows 2012 are the same as for Windows 8, so browse to E:\WIN8\AMD64 (assuming you are installing the 64-bit version of Windows) and click on "OK".

Selecting the network and storage drivers for the vm02-win2012 server.
Note: If you forget to select the network drivers here, you will have to manually install the drivers for the network card after the install has completed.

Press and hold the <control> key and click on both the "Red Hat VirtIO Ethernet Adapter" and the "Red Hat VirtIO SCSI Controller" drivers. By doing this, we won't have to install the network card's drivers later. Click on "Next" and the drivers will be installed.

Now we see the vm02-win2012 server's hard drive! Complete the install from here as you normally would.

Now you can finish installing Windows 2012 just as you would on a bare-iron server!

Install of vm02-win2012 is complete!

What you do from here is entirely up to you and your needs.

Note: If you wish, jump to Making vm02-win2012 a Highly Available Service now to immediately add vm02-win2012 to the cluster manager.

Provisioning vm03-win7

Note: This install references steps taken in the vm01-win2008 install. If you skipped it, you may wish to look at it to get a better idea of some of the steps performed here.
View of vm03-win7's desktop.

Before we can install the OS, we need to copy the installation media, and our driver disk if needed, into /shared/files.

Windows is licensed software, so you will need to purchase a copy. You can get an evaluation copy from Microsoft's website. In either case, downloading a copy of the installation media is an exercise for you, I am afraid.

As we did for the previous two servers, we're going to use a special kind of emulated SCSI controller and a special kind of emulated network card. These are called virtio devices (http://www.linux-kvm.org/page/Virtio) and they are designed to significantly improve storage and network speeds on KVM guests.

You can create install media from a physical disk or copy install media using Striker's "Media Connector" function. Of course, you can also copy files to the Anvil! using standard tools like rsync and wget. Whichever method you prefer is fine.

In my case, I will rsync the Windows install ISO from another machine on our network to /shared/files via an-a05n01.

rsync -av --progress /data0/VMs/files/Windows_7_Pro_SP1_64bit_OEM_English.iso root@10.255.50.1:/shared/files/
root@10.255.50.1's password:
sending incremental file list
Windows_7_Pro_SP1_64bit_OEM_English.iso
  3321233408 100%   83.97MB/s    0:00:37 (xfer#1, to-check=0/1)

sent 3321638948 bytes  received 31 bytes  80039493.47 bytes/sec
total size is 3321233408  speedup is 1.00

For virtio, we can simply re-use the ISO we uploaded for vm01-win2008.

Note: We've planned to run vm03-win7 on an-a05n01, so we will use that node for the provisioning stage.
an-a05n01
ls -lah /shared/files/
total 10G
drwxr-xr-x. 2 root root 3.8K Nov 12 11:32 .
drwxr-xr-x. 6 root root 3.8K Nov  1 01:23 ..
-rw-r--r--. 1 qemu qemu  56M Jan 22  2013 virtio-win-0.1-52.iso
-rw-rw-r--. 1 qemu qemu 3.9G Oct  2 22:31 Windows_2012_R2_64-bit_Preview.iso
-rw-rw-rw-. 1 qemu qemu 3.1G Jun  8  2011 Windows_7_Pro_SP1_64bit_OEM_English.iso
-rw-rw-r--. 1 qemu qemu 3.0G Oct 14  2011 Windows_Svr_2008_R2_64Bit_SP1.ISO

Ok, we're ready!

Creating vm03-win7's Storage

Note: Earlier, we used parted to examine our free space and create our DRBD partitions. Unfortunately, parted shows sizes in GB (base 10) where LVM uses GiB (base 2). If we used LVM's "xxG" size notation, it would use more space than we expect, relative to our planning in the parted stage. LVM doesn't allow specifying new LV sizes in GB instead of GiB, so here we will specify sizes in MiB to help narrow the difference. You can read more about this issue here.

Creating vm03-win7's "hard drive" is a simple process. Recall that we want a 100 GB logical volume carved from the an-a05n01_vg0 volume group (the "storage pool" for servers designed to run on an-a05n01). Knowing this, the command to create the new LV is below.

an-a05n01
lvcreate -L 100000M -n vm03-win7_0 /dev/an-a05n01_vg0
  Logical volume "vm03-win7_0" created
an-a05n02
lvdisplay /dev/an-a05n01_vg0/vm03-win7_0
  --- Logical volume ---
  LV Path                /dev/an-a05n01_vg0/vm03-win7_0
  LV Name                vm03-win7_0
  VG Name                an-a05n01_vg0
  LV UUID                vgdtEm-aOsU-hatQ-2PxO-BN1e-sGLM-J7NVcn
  LV Write Access        read/write
  LV Creation host, time an-a05n01.alteeve.ca, 2013-11-12 12:08:52 -0500
  LV Status              available
  # open                 0
  LV Size                97.66 GiB
  Current LE             25000
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:3

Notice how we see 97.66 GiB? That is roughly the difference between "100 GB" and "100 GiB".

Creating vm03-win7's virt-install Call

Now, with the storage created, we can craft the virt-install command. We'll put this into a file under the /shared/provision/ directory for future reference. Let's take a look at the command, then we'll discuss what the switches are for.

an-a05n01
touch /shared/provision/vm03-win7.sh
chmod 755 /shared/provision/vm03-win7.sh 
vim /shared/provision/vm03-win7.sh
virt-install --connect qemu:///system \
  --name vm03-win7 \
  --ram 3072 \
  --arch x86_64 \
  --vcpus 2 \
  --cdrom /shared/files/Windows_7_Pro_SP1_64bit_OEM_English.iso \
  --disk path=/shared/files/virtio-win-0.1-52.iso,device=cdrom --force \
  --os-variant win7 \
  --network bridge=ifn_bridge1,model=virtio \
  --disk path=/dev/an-a05n01_vg0/vm03-win7_0,bus=virtio \
  --graphics spice > /var/log/an-install_vm03-win7.log &
Note: Don't use tabs to indent the lines.

Let's look at the differences from vm01-win2008;

Switch Descriptions
--name vm03-win7 This is the name we're going to use for this server in the cluster and with the libvirtd tools.
--ram 3072 This sets the amount of RAM, in MiB, to allocate to this server. Here, we're allocating 3 GiB, which is 3,072 MiB.
--cdrom /shared/files/Windows_7_Pro_SP1_64bit_OEM_English.iso This tells the hypervisor to create a cd-rom (dvd-rom) drive and to "insert" the specified ISO as if it was a physical disk. This will be the initial boot device, too.
--disk path=/shared/files/virtio-win-0.1-52.iso,device=cdrom --force This is the same as the vm01-win2008 provision script, but this is where the --force comes in handy. If this ISO were still "mounted" in vm01-win2008's cd-rom tray, the install would abort without --force (a quick way to check for this is shown after this table).
--os-variant win7 This tells the KVM hypervisor to optimize for running Windows 7.
--disk path=/dev/an-a05n01_vg0/vm03-win7_0,bus=virtio This tells the hypervisor what logical volume to use for the server's "hard drive". It also tells it to use the virtio emulated SCSI controller.
--graphics spice > /var/log/an-install_vm03-win7.log We're using a new log file for our bash redirection. Later, if we want to review how the install went, we can look at /var/log/an-install_vm03-win7.log.
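
As a quick sanity check before kicking off the install, you can ask libvirt whether that driver ISO is still attached to the earlier server. This is only a sketch; it assumes the earlier server was provisioned with the domain name vm01-win2008 and must be run on whichever node is currently hosting it:

# run on the node currently hosting vm01-win2008
virsh dumpxml vm01-win2008 | grep -i virtio-win

If a <source file='/shared/files/virtio-win-0.1-52.iso'/> line comes back, the ISO is still "inserted" in that server's cd-rom tray, which is exactly the situation --force exists to handle.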

Initializing vm03-win7's Install

On your dashboard or workstation, open the "Virtual Machine Manager" and connect to both nodes.

We can install any server from either node. However, we know that each server has a preferred host, so it's sensible to use that host for the installation stage. In the case of vm03-win7, the preferred host is an-a05n01, so we'll use it to kick off the install.

Once the install begins, the new server should appear in "Virtual Machine Manager". Double-click on it and you will see that the new server is booting off of the install cd-rom. We're installing Windows, so that will begin the install process.

Time to start the install!

an-a05n01
/shared/provision/vm03-win7.sh
Cannot open display: 
Run 'virt-viewer --help' to see a full list of available command line options

And it's off!
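
If you want to keep an eye on things from the node itself while the guest console is open in "Virtual Machine Manager", the install log and virsh are enough. Output will vary, but something like this works:

an-a05n01
# follow the virt-install output as it is written
tail -f /var/log/an-install_vm03-win7.log
# in another terminal, confirm the new server is defined and running
virsh list --all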

Installation of vm03-win7 begins!
Warning: Since this tutorial was written, the virtio project has significantly changed the directory structure where the drivers are kept. The storage drivers are now found in viostor/w7/amd64/. The trick of loading the other drivers by picking the "wrong" driver, one at a time, is still possible, but it is now quite a bit easier to load the network and other drivers after the install completes. To do so, go to "Device Manager", select the devices with a yellow exclamation mark, choose to update their drivers and tell Windows to search all subdirectories on the "dvd".

Follow the install process, entering the values you want. When you get to the install target screen, you will see that Windows can't find the hard drive.

The vm03-win7 server doesn't see its hard drive.

This was expected because Windows 7 does not natively support virtio. That's why we used two virtual cd-rom drives and "inserted" the virtio driver disk into the second drive.

Click on "Load Driver" on the bottom right.

The vm03-win7 server's "Load Driver" menu.

Click on "Browse".

The vm03-win7 server's "Browse" menu.

The driver disk is in the second (virtual) cd-rom drive, mounted as drive E:. Browse to E:\WIN8\AMD64 (assuming you are installing the 64-bit version of Windows) and click on "OK".

Selecting the network and storage drivers for the vm03-win7 server.
Note: If you forget to select the network drivers here, you will have to manually install the drivers for the network card after the install has completed.

Press and hold the <control> key and click on both the "Red Hat VirtIO Ethernet Adapter" and the "Red Hat VirtIO SCSI Controller" drivers. By doing this, we won't have to install the network card's drivers later. Click on "Next" and the drivers will be installed.

Now we see the vm03-win7 server's hard drive! Complete the install from here as you normally would.

Now you can finish installing Windows 7 just as you would on a bare-iron server!

Install of vm03-win7 is complete!

What you do from here is entirely up to you and your needs.

Note: If you wish, jump to Making vm03-win7 a Highly Available Service now to immediately add vm03-win7 to the cluster manager.

Provisioning vm04-win8

Note: This install references steps taken in the vm01-win2008 install. If you skipped it, you may wish to look at it to get a better idea of some of the steps performed here.
View of vm04-win8's desktop.

Our last Microsoft operating system!

As always, we need to copy the installation media and our driver disk into /shared/files.

Windows is licensed software, so you will need to purchase a copy. You can get an evaluation copy from Microsoft's website. In either case, downloading a copy of the installation media is an exercise for you, I am afraid.

As we did for the previous three servers, we're going to use a special kind of SCSI controller and a special kind of emulated network card. These are called virtio devices (http://www.linux-kvm.org/page/Virtio) and they are designed to significantly improve storage and network speeds on KVM guests.

You can create install media from a physical disk or copy install media using Striker's "Media Connector" function. Of course, you can also copy files to the Anvil! using standard tools like rsync and wget. Whatever method you prefer, the goal is to get the installation ISO into /shared/files.

In my case, I will rsync the Windows install ISO from another machine on our network to /shared/files via an-a05n01.

rsync -av --progress /data0/VMs/files/Win8.1_Enterprise_64-bit_eval.iso root@10.255.50.1:/shared/files/
root@10.255.50.1's password:
sending incremental file list
Win8.1_Enterprise_64-bit_eval.iso
  3797866496 100%   62.02MB/s    0:00:58 (xfer#1, to-check=0/1)

sent 3798330205 bytes  received 31 bytes  60773283.78 bytes/sec
total size is 3797866496  speedup is 1.00

For virtio, we can simply re-use the ISO we uploaded for vm01-win2008.

Note: We've planned to run vm04-win8 on an-a05n01, so we will use that node for the provisioning stage.
an-a05n01
ls -lah /shared/files/
total 14G
drwxr-xr-x. 2 root root 3.8K Nov 12 18:12 .
drwxr-xr-x. 6 root root 3.8K Nov  1 01:23 ..
-rw-r--r--. 1 qemu qemu  56M Jan 22  2013 virtio-win-0.1-52.iso
-rw-r--r--. 1 qemu qemu 3.6G Oct 31 01:44 Win8.1_Enterprise_64-bit_eval.iso
-rw-rw-r--. 1 qemu qemu 3.9G Oct  2 22:31 Windows_2012_R2_64-bit_Preview.iso
-rw-rw-rw-. 1 qemu qemu 3.1G Jun  8  2011 Windows_7_Pro_SP1_64bit_OEM_English.iso
-rw-rw-r--. 1 qemu qemu 3.0G Oct 14  2011 Windows_Svr_2008_R2_64Bit_SP1.ISO

Ok, we're ready!

Creating vm04-win8's Storage

Note: Earlier, we used parted to examine our free space and create our DRBD partitions. Unfortunately, parted shows sizes in GB (base 10) where LVM uses GiB (base 2). If we used LVM's "xxG" size notation, it would use more space than we expect, relative to our planning in the parted stage. LVM doesn't allow specifying new LV sizes in GB instead of GiB, so here we will specify sizes in MiB to help narrow the differences. You can read more about this issue here.

Creating the vm04-win8's "hard drive" is a simple process. Recall that we want a 100 GB logical volume carved from the an-a05n01_vg0 volume group (the "storage pool" for servers designed to run on an-a05n01). Knowing this, the command to create the new LV is below.

an-a05n01
lvcreate -L 100000M -n vm04-win8_0 /dev/an-a05n01_vg0
  Logical volume "vm04-win8_0" created
an-a05n02
lvdisplay /dev/an-a05n01_vg0/vm04-win8_0
  --- Logical volume ---
  LV Path                /dev/an-a05n01_vg0/vm04-win8_0
  LV Name                vm04-win8_0
  VG Name                an-a05n01_vg0
  LV UUID                WZIGmp-xkyZ-Q6Qs-ovMP-qr1k-9xC2-PmbcUD
  LV Write Access        read/write
  LV Creation host, time an-a05n01.alteeve.ca, 2013-11-12 18:13:53 -0500
  LV Status              available
  # open                 0
  LV Size                97.66 GiB
  Current LE             25000
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:4

Notice how we see 97.66 GiB? That is roughly the difference between "100 GB" and "100 GiB".

Creating vm04-win8's virt-install Call

Now with the storage created, we can craft the virt-install command. We'll put this into a file under the /shared/provision/ directory for future reference. Let's take a look at the command, then we'll discuss what the switches are for.

an-a05n01
touch /shared/provision/vm04-win8.sh
chmod 755 /shared/provision/vm04-win8.sh 
vim /shared/provision/vm04-win8.sh
virt-install --connect qemu:///system \
  --name vm04-win8 \
  --ram 4096 \
  --arch x86_64 \
  --vcpus 2 \
  --cdrom /shared/files/Win8.1_Enterprise_64-bit_eval.iso \
  --disk path=/shared/files/virtio-win-0.1-52.iso,device=cdrom --force \
  --os-variant win7 \
  --network bridge=ifn_bridge1,model=virtio \
  --disk path=/dev/an-a05n01_vg0/vm04-win8_0,bus=virtio \
  --graphics spice > /var/log/an-install_vm04-win8.log &
Note: Don't use tabs to indent the lines.

Let's look at the differences from vm01-win2008;

Switch Descriptions
--name vm04-win8 This is the name we're going to use for this server in the cluster and with the libvirtd tools.
--ram 4096 This sets the amount of RAM, in MiB, to allocate to this server. Here, we're allocating 4 GiB, which is 4,096 MiB.
--cdrom /shared/files/Win8.1_Enterprise_64-bit_eval.iso This tells the hypervisor to create a cd-rom (dvd-rom) drive and to "insert" the specified ISO as if it was a physical disk. This will be the initial boot device, too.
--disk path=/shared/files/virtio-win-0.1-52.iso,device=cdrom --force This is the same as the vm01-win2008 provision script, but this is where the --force comes in handy. If this ISO were still "mounted" in vm01-win2008's cd-rom tray, the install would abort without --force.
--os-variant win7 This tells the KVM hypervisor to optimize for running Windows 7.
--disk path=/dev/an-a05n01_vg0/vm04-win8_0,bus=virtio This tells the hypervisor what logical volume to use for the server's "hard drive". It also tells it to use the virtio emulated SCSI controller.
--graphics spice > /var/log/an-install_vm04-win8.log We're using a new log file for our bash redirection. Later, if we want to review how the install went, we can look at /var/log/an-install_vm04-win8.log.

Initializing vm04-win8's Install

On your dashboard or workstation, open the "Virtual Machine Manager" and connect to both nodes.

We can install any server from either node. However, we know that each server has a preferred host, so it's sensible to use that host for the installation stage. In the case of vm04-win8, the preferred host is an-a05n01, so we'll use it to kick off the install.

Once the install begins, the new server should appear in "Virtual Machine Manager". Double-click on it and you will see that the new server is booting off of the install cd-rom. We're installing Windows, so that will begin the install process.

Time to start the install!

an-a05n01
/shared/provision/vm04-win8.sh
Cannot open display: 
Run 'virt-viewer --help' to see a full list of available command line options

And it's off!

Installation of vm04-win8 begins!

Follow the install process, entering the values you want. When you get to the install target screen, you will see that Windows can't find the hard drive.

The vm04-win8 server doesn't see its hard drive.
Warning: Since this tutorial was written, the virtio project has significantly changed the directory structure where the drivers are kept. The storage drivers are now found in viostor/w8/amd64/. The trick of loading the other drivers by picking the "wrong" driver, one at a time, is still possible, but it is now quite a bit easier to load the network and other drivers after the install completes. To do so, go to "Device Manager", select the devices with a yellow exclamation mark, choose to update their drivers and tell Windows to search all subdirectories on the "dvd".

This was expected because Windows 8 does not natively support virtio. That's why we used two virtual cd-rom drives and "inserted" the virtio driver disk into the second drive.

Click on "Load Driver" on the bottom right.

The vm04-win8 server's "Load Driver" menu.

Click on "Browse".

The vm04-win8 server's "Browse" menu.

The driver disk is in the second (virtual) cd-rom drive, mounted as drive E:. Browse to E:\WIN8\AMD64 (assuming you are installing the 64-bit version of Windows) and click on "OK".

Selecting the network and storage drivers for the vm04-win8 server.
Note: If you forget to select the network drivers here, you will have to manually install the drivers for the network card after the install has completed.

Press and hold the <control> key and click on both the "Red Hat VirtIO Ethernet Adapter" and the "Red Hat VirtIO SCSI Controller" drivers. By doing this, we won't have to install the network card's drivers later. Click on "Next" and the drivers will be installed.

Now we see the vm04-win8 server's hard drive! Complete the install from here as you normally would.

Now you can finish installing Windows 8 just as you would on a bare-iron server!

Install of vm04-win8 is complete!

What you do from here is entirely up to you and your needs.

Note: If you wish, jump to Making vm04-win8 a Highly Available Service now to immediately add vm04-win8 to the cluster manager.

Provisioning vm05-freebsd9

Note: This install references steps taken in the vm01-win2008 install. If you skipped it, you may wish to look at it to get a better idea of some of the steps performed here.
View of vm05-freebsd9's desktop.

Our first non-Microsoft OS!

As always, we need to copy the installation disk into /shared/files.

FreeBSD is free software and can be downloaded directly from their website.

an-a05n02
cd /shared/files/
wget -c ftp://ftp.freebsd.org/pub/FreeBSD/releases/amd64/amd64/ISO-IMAGES/9.2/FreeBSD-9.2-RELEASE-amd64-dvd1.iso
--2013-11-18 15:48:09--  ftp://ftp.freebsd.org/pub/FreeBSD/releases/amd64/amd64/ISO-IMAGES/9.2/FreeBSD-9.2-RELEASE-amd64-dvd1.iso
           => `FreeBSD-9.2-RELEASE-amd64-dvd1.iso'
Resolving ftp.freebsd.org... 204.152.184.73, 2001:4f8:0:2::e
Connecting to ftp.freebsd.org|204.152.184.73|:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done.    ==> PWD ... done.
==> TYPE I ... done.  ==> CWD (1) /pub/FreeBSD/releases/amd64/amd64/ISO-IMAGES/9.2 ... done.
==> SIZE FreeBSD-9.2-RELEASE-amd64-dvd1.iso ... 2554132480
==> PASV ... done.    ==> RETR FreeBSD-9.2-RELEASE-amd64-dvd1.iso ... done.
Length: 2554132480 (2.4G) (unauthoritative)

100%[=============================================================>] 2,554,132,480  465K/s   in 45m 9s  

2013-11-18 16:33:19 (921 KB/s) - `FreeBSD-9.2-RELEASE-amd64-dvd1.iso' saved [2554132480]
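
A download this size is worth verifying before we use it. FreeBSD publishes checksum files alongside its ISOs; assuming the CHECKSUM.SHA256 file is available at the same FTP path (it normally is), a quick check looks like this:

an-a05n02
cd /shared/files/
wget ftp://ftp.freebsd.org/pub/FreeBSD/releases/amd64/amd64/ISO-IMAGES/9.2/CHECKSUM.SHA256
# print the published hash for the dvd1 ISO, then compute ours; the two must match
grep dvd1 CHECKSUM.SHA256
sha256sum FreeBSD-9.2-RELEASE-amd64-dvd1.iso
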
Note: We've planned to run vm05-freebsd9 on an-a05n02, so we will use that node for the provisioning stage.
an-a05n02
ls -lah /shared/files/
drwxr-xr-x. 2 root root 3.8K Nov 18 15:48 .
drwxr-xr-x. 6 root root 3.8K Nov 18 16:35 ..
-rw-r--r--. 1 root root 2.4G Nov 18 16:33 FreeBSD-9.2-RELEASE-amd64-dvd1.iso
-rw-r--r--. 1 qemu qemu  56M Jan 22  2013 virtio-win-0.1-52.iso
-rw-r--r--. 1 qemu qemu 3.6G Oct 31 01:44 Win8.1_Enterprise_64-bit_eval.iso
-rw-rw-r--. 1 qemu qemu 3.9G Oct  2 22:31 Windows_2012_R2_64-bit_Preview.iso
-rw-rw-rw-. 1 qemu qemu 3.1G Jun  8  2011 Windows_7_Pro_SP1_64bit_OEM_English.iso
-rw-rw-r--. 1 qemu qemu 3.0G Oct 14  2011 Windows_Svr_2008_R2_64Bit_SP1.ISO

Ok, we're ready!

Creating vm05-freebsd9's Storage

Note: Earlier, we used parted to examine our free space and create our DRBD partitions. Unfortunately, parted shows sizes in GB (base 10) where LVM uses GiB (base 2). If we used LVM's "xxG" size notation, it would use more space than we expect, relative to our planning in the parted stage. LVM doesn't allow specifying new LV sizes in GB instead of GiB, so here we will specify sizes in MiB to help narrow the differences. You can read more about this issue here.

Creating the vm05-freebsd9's "hard drive" is a simple process. Recall that we want a 50 GB logical volume carved from the an-a05n02_vg0 volume group (the "storage pool" for servers designed to run on an-a05n02). Knowing this, the command to create the new LV is below.

an-a05n01
lvcreate -L 50000M -n vm05-freebsd9_0 /dev/an-a05n02_vg0
  Logical volume "vm05-freebsd9_0" created
an-a05n02
lvdisplay /dev/an-a05n02_vg0/vm05-freebsd9_0
  --- Logical volume ---
  LV Path                /dev/an-a05n02_vg0/vm05-freebsd9_0
  LV Name                vm05-freebsd9_0
  VG Name                an-a05n02_vg0
  LV UUID                ioF6jU-pXEQ-wAhm-1zkB-LTDw-PQPG-1SPdkD
  LV Write Access        read/write
  LV Creation host, time an-a05n01.alteeve.ca, 2013-11-18 16:41:30 -0500
  LV Status              available
  # open                 0
  LV Size                48.83 GiB
  Current LE             12500
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:5

Notice how we see 48.83 GiB? That is roughly the difference between "50 GB" and "50 GiB".

Creating vm05-freebsd9's virt-install Call

Now with the storage created, we can craft the virt-install command. We'll put this into a file under the /shared/provision/ directory for future reference. Let's take a look at the command, then we'll discuss what the switches are for.

an-a05n02
touch /shared/provision/vm05-freebsd9.sh
chmod 755 /shared/provision/vm05-freebsd9.sh 
vim /shared/provision/vm05-freebsd9.sh
virt-install --connect qemu:///system \
  --name vm05-freebsd9 \
  --ram 2048 \
  --arch x86_64 \
  --vcpus 2 \
  --cdrom /shared/files/FreeBSD-9.2-RELEASE-amd64-dvd1.iso \
  --os-variant freebsd8 \
  --network bridge=ifn_bridge1,model=virtio \
  --disk path=/dev/an-a05n02_vg0/vm05-freebsd9_0,bus=virtio \
  --graphics spice > /var/log/an-install_vm05-freebsd9.log &
Note: Don't use tabs to indent the lines.

Let's look at the differences from vm01-win2008;

Switch Descriptions
--name vm05-freebsd9 This is the name we're going to use for this server in the cluster and with the libvirtd tools.
--ram 2048 This sets the amount of RAM, in MiB, to allocate to this server. Here, we're allocating 2 GiB, which is 2,048 MiB.
--cdrom /shared/files/FreeBSD-9.2-RELEASE-amd64-dvd1.iso This tells the hypervisor to create a cd-rom (dvd-rom) drive and to "insert" the specified ISO as if it was a physical disk. This will be the initial boot device, too.
--os-variant freebsd8 This tells the KVM hypervisor to optimize for running FreeBSD 8, which is the closest optimization available.
--disk path=/dev/an-a05n02_vg0/vm05-freebsd9_0,bus=virtio This tells the hypervisor what logical volume to use for the server's "hard drive". It also tells it to use the virtio emulated SCSI controller.
--graphics spice > /var/log/an-install_vm05-freebsd9.log We're using a new log file for our bash redirection. Later, if we want to review how the install went, we can look at /var/log/an-install_vm05-freebsd9.log.

Initializing vm05-freebsd9's Install

On your dashboard or workstation, open the "Virtual Machine Manager" and connect to both nodes.

We can install any server from either node. However, we know that each server has a preferred host, so it's sensible to use that host for the installation stage. In the case of vm05-freebsd9, the preferred host is an-a05n02, so we'll use it to kick off the install.

Once the install begins, the new server should appear in "Virtual Machine Manager". Double-click on it and you will see that the new server is booting off of the install cd-rom, which will start the FreeBSD installer.

Time to start the install!

an-a05n02
/shared/provision/vm05-freebsd9.sh
Cannot open display: 
Run 'virt-viewer --help' to see a full list of available command line options

And it's off!

Installation of vm05-freebsd9 begins!

The entire install process for FreeBSD is normal. It has native support for virtio, so the virtual hard drive and network card will "just work".

The hard drive for vm05-freebsd9 is found without loading drivers.
The network card for vm05-freebsd9 is also found without loading drivers.

There is one trick with installing FreeBSD 9, though. Because we optimized for freebsd8, the guest will not restart on its own; when the installer finishes and tries to reboot, the server simply stays off.

The vm05-freebsd9 server stays off after the initial install completes.

Obviously, the server is not yet in the cluster, so we can't use clusvcadm -e. Instead, we'll use virsh to boot it back up.

an-a05n02
virsh start vm05-freebsd9
Domain vm05-freebsd9 started
The vm05-freebsd9 server is back up and running.
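
If you want to confirm the server's state for yourself before and after, virsh list --all shows every guest defined on the node, running or not:

an-a05n02
# vm05-freebsd9 shows as "shut off" before the start call and "running" after
virsh list --all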

What you do from here is entirely up to you and your needs.

Note: If you wish, jump to Making vm05-freebsd9 a Highly Available Service now to immediately add vm05-freebsd9 to the cluster manager.

Provisioning vm06-solaris11

Note: This install references steps taken in the vm01-win2008 install. If you skipped it, you may wish to look at it to get a better idea of some of the steps performed here.
View of vm06-solaris11's desktop.

Oracle's Solaris operating system is a commercial UNIX product. You can download an evaluation version from their website. We'll be using the x86 version.

As always, we need to copy the installation disk into /shared/files.

rsync -av --progress /data0/VMs/files/sol-11-1111-text-x86.iso root@10.255.50.1:/shared/files/
root@10.255.50.1's password:
sending incremental file list
sol-11-1111-text-x86.iso
   450799616 100%  108.12MB/s    0:00:03 (xfer#1, to-check=0/1)

sent 450854737 bytes  received 31 bytes  69362272.00 bytes/sec
total size is 450799616  speedup is 1.00
Note: We've planned to run vm06-solaris11 on an-a05n02, so we will use that node for the provisioning stage.
an-a05n02
ls -lah /shared/files/
total 17G
drwxr-xr-x. 2 root root 3.8K Nov 19 17:11 .
drwxr-xr-x. 6 root root 3.8K Nov 19 17:04 ..
-rw-r--r--. 1 qemu qemu 2.4G Nov 18 16:33 FreeBSD-9.2-RELEASE-amd64-dvd1.iso
-rw-rw-r--. 1 root root 430M Sep 28  2012 sol-11-1111-text-x86.iso
-rw-r--r--. 1 qemu qemu  56M Jan 22  2013 virtio-win-0.1-52.iso
-rw-r--r--. 1 qemu qemu 3.6G Oct 31 01:44 Win8.1_Enterprise_64-bit_eval.iso
-rw-rw-r--. 1 qemu qemu 3.9G Oct  2 22:31 Windows_2012_R2_64-bit_Preview.iso
-rw-rw-rw-. 1 qemu qemu 3.1G Jun  8  2011 Windows_7_Pro_SP1_64bit_OEM_English.iso
-rw-rw-r--. 1 qemu qemu 3.0G Oct 14  2011 Windows_Svr_2008_R2_64Bit_SP1.ISO

Ok, we're ready!

Creating vm06-solaris11's Storage

Note: Earlier, we used parted to examine our free space and create our DRBD partitions. Unfortunately, parted shows sizes in GB (base 10) where LVM uses GiB (base 2). If we used LVM's "xxG" size notation, it would use more space than we expect, relative to our planning in the parted stage. LVM doesn't allow specifying new LV sizes in GB instead of GiB, so here we will specify sizes in MiB to help narrow the differences. You can read more about this issue here.

Creating the vm06-solaris11's "hard drive" is a simple process. Recall that we want a 100 GB logical volume carved from the an-a05n02_vg0 volume group (the "storage pool" for servers designed to run on an-a05n02). Knowing this, the command to create the new LV is below.

an-a05n01
lvcreate -L 100000M -n vm06-solaris11_0 /dev/an-a05n02_vg0
  Volume group "an-a05n02_vg0" has insufficient free space (23506 extents): 25000 required.

What's this?!

Calculating Free Space; Converting GiB to MiB

What we have here is that, despite our efforts to mitigate the GiB-versus-GB issue, we've run out of space.

This highlights the need for careful design planning. We weren't careful enough, so now we have to deal with the resources we have left.

Let's figure out how much space is left in the an-a05n02 volume group.

an-a05n02
vgdisplay an-a05n02_vg0
  --- Volume group ---
  VG Name               an-a05n02_vg0
  System ID             
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  3
  VG Access             read/write
  VG Status             resizable
  Clustered             yes
  Shared                no
  MAX LV                0
  Cur LV                2
  Open LV               2
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               287.13 GiB
  PE Size               4.00 MiB
  Total PE              73506
  Alloc PE / Size       50000 / 195.31 GiB
  Free  PE / Size       23506 / 91.82 GiB
  VG UUID               1h5Gzk-6UX6-xvUo-GWVH-ZMFM-YLop-dYiC7L

You can see that there is 91.82 GiB left (23,506 "extents" which are 4.00 MiB each).

Knowing this, there are a few ways we could proceed.

  1. Use the lvcreate -l xx syntax, which says to use xx extents. We have 23,506 extents free, so we could just do lvcreate -l 23506
  2. Use the "percentage free" method of defining free space. That would be lvcreate -l 100%FREE which simply uses all remaining free space.
  3. Calculate the number of MiB in 91.82 GiB.

The first two are self-evident, so let's look at the third option, because math is awesome!

To do this, remember that the M we have been passing to lvcreate actually means MiB (that's why -L 100000M gave us 97.66 GiB earlier). The simplest route is to work from the extents themselves: each extent is 4.00 MiB and we have 23,506 of them free, so (23506 * 4) gives us exactly 94,024 MiB.

We can sanity-check that against the reported size, too: 91.82 GiB times 1,024 is 94,023.68 MiB. The small shortfall is only rounding error; "91.82" shows just two decimal places of the real value.

Both methods agree, so if we wanted to, we could use lvcreate -L 94024M to keep in line with our previous usage of lvcreate and consume all of the remaining extents.
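
If you would rather let LVM do the unit conversion for you, vgs can report the free space directly in binary megabytes, the same unit lvcreate's M suffix uses. A minimal sketch using the volume group above (the exact output formatting can differ a little between LVM versions):

an-a05n02
vgs --noheadings --units m -o vg_free an-a05n02_vg0
  94024.00m

That matches the extent math, and is the number you would hand to lvcreate -L.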

That was fun!

Now we'll be boring and practical and use lvcreate -l 100%FREE because it's safe.

an-a05n01
lvcreate -l 100%FREE -n vm06-solaris11_0 /dev/an-a05n02_vg0
  Logical volume "vm06-solaris11_0" created
an-a05n02
lvdisplay /dev/an-a05n02_vg0/vm06-solaris11_0
  --- Logical volume ---
  LV Path                /dev/an-a05n02_vg0/vm06-solaris11_0
  LV Name                vm06-solaris11_0
  VG Name                an-a05n02_vg0
  LV UUID                3BQgmu-QHca-0XtE-PRQB-btQc-LmdF-rTVyi5
  LV Write Access        read/write
  LV Creation host, time an-a05n01.alteeve.ca, 2013-11-19 15:37:29 -0500
  LV Status              available
  # open                 0
  LV Size                91.82 GiB
  Current LE             23506
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:6

So we're a little smaller than we originally planned. A good and simple way to avoid this problem is to plan your storage to have more free space than you think you will need. Storage space is, relatively speaking, fairly cheap.

Creating vm06-solaris11's virt-install Call

Note: Solaris 11 does not support virtio, so we will be emulating a simple scsi storage controller and e1000 (Intel 1 Gbps) network card.

Now with the storage created, we can craft the virt-install command. We'll put this into a file under the /shared/provision/ directory for future reference. Let's take a look at the command, then we'll discuss what the switches are for.

an-a05n02
touch /shared/provision/vm06-solaris11.sh
chmod 755 /shared/provision/vm06-solaris11.sh 
vim /shared/provision/vm06-solaris11.sh
virt-install --connect qemu:///system \
  --name vm06-solaris11 \
  --ram 2048 \
  --arch x86_64 \
  --vcpus 2 \
  --cdrom /shared/files/sol-11-1111-text-x86.iso \
  --os-variant solaris10 \
  --network bridge=ifn_bridge1,model=e1000 \
  --disk path=/dev/an-a05n02_vg0/vm06-solaris11_0 \
  --graphics spice > /var/log/an-install_vm06-solaris11.log &
Note: Don't use tabs to indent the lines.

Let's look at the differences from vm01-win2008;

Switch Descriptions
--name vm06-solaris11 This is the name we're going to use for this server in the cluster and with the libvirtd tools.
--ram 2048 This sets the amount of RAM, in MiB, to allocate to this server. Here, we're allocating 2 GiB, which is 2,048 MiB.
--cdrom /shared/files/sol-11-1111-text-x86.iso This tells the hypervisor to create a cd-rom (dvd-rom) drive and to "insert" the specified ISO as if it was a physical disk. This will be the initial boot device, too.
--os-variant solaris10 This tells the KVM hypervisor to optimize for running Solaris 10, which is the closest optimization available.
--disk path=/dev/an-a05n02_vg0/vm06-solaris11_0 This tells the hypervisor what logical volume to use for the server's "hard drive". It does not specify any bus=, unlike the other servers, so the default emulated bus is used instead of virtio.
--network bridge=ifn_bridge1,model=e1000 This tells the hypervisor to emulate an Intel gigabit network controller.
--graphics spice > /var/log/an-install_vm06-solaris11.log We're using a new log file for our bash redirection. Later, if we want to review how the install went, we can look at /var/log/an-install_vm06-solaris11.log.
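
Once the script has been run and the guest is defined (next section), you can confirm from the node that libvirt really is emulating an e1000 network card and a non-virtio disk. A small sketch, assuming the domain name vm06-solaris11:

an-a05n02
# the "Model" column should read e1000
virsh domiflist vm06-solaris11
# the disk target will not be a vdX device, since we did not ask for the virtio bus
virsh domblklist vm06-solaris11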

Initializing vm06-solaris11's Install

On your dashboard or workstation, open the "Virtual Machine Manager" and connect to both nodes.

We can install any server from either node. However, we know that each server has a preferred host, so it's sensible to use that host for the installation stage. In the case of vm06-solaris11, the preferred host is an-a05n02, so we'll use it to kick off the install.

Once the install begins, the new server should appear in "Virtual Machine Manager". Dou