Fujitsu Notes: Difference between revisions

Revision as of 17:59, 3 August 2014

 AN!Wiki :: How To :: Fujitsu Notes

This covers specific setup notes for Fujitsu Primergy servers on EL6.

Primergy RX200 S8

BIOS

  • Advanced
    • SATA Configuration
      • SATA Mode -> Disabled
  • Server Mgmt
    • Asset Tag -> (short host name)
    • Temperature Monitoring -> Enabled
  • Boot
    • Bootup NumLock State -> On
    • PXE Boot Option Retry -> Enabled

RAID Controller (D3116C)

  • Configuration Wizard
    • New Configuration -> Next
    • Confirm clear config -> Yes
    • Manual Configuration -> Next
      • Press and hold <ctrl>, Click to highlight all Drives in left pane -> Add to Array -> Accept DG -> Next.
      • Add to SPAN -> Next
        • RAID Level; 1-8 drives == RAID 5, 9+ drives == RAID 6
        • Write Policy -> Write Back with BBU
        • Select Size -> Enter size in green text under right pane; R5 size for RAID 5, R6 size for RAID 6.
        • Accept -> Confirm cache policy; Yes -> Next
    • Accept
    • Save the configuration; Yes -> Confirm existing data wipe; Yes
    • Click to select Set Boot Drive -> Go -> Back
  • Exit -> Confirm exit; Yes

Reboot.

Primergy RX300 S7

RAID

Install the MegaCLI tools;

Check for an updated MegaCLI from here (under "Management Software and Tools"). If there is an updated version, follow the steps below, substituting the new version number for 8.04.07.

mkdir ~/temp
cd ~/temp
# Download the updated 8.04.07_MegaCLI.zip here
unzip 8.04.07_MegaCLI.zip
unzip CLI_Lin_8.04.07.zip 
unzip MegaCliLin.zip
rpm -Uvh MegaCli-8.04.07-1.noarch.rpm Lib_Utils-1.00-09.noarch.rpm

# This makes MegaCli64 app available without the full path
ln -s /opt/MegaRAID/MegaCli/MegaCli64 /sbin/

If you want to install from the AN!Cache, you can do this;

rpm -Uvh https://alteeve.ca/files/Lib_Utils-1.00-09.noarch.rpm https://alteeve.ca/files/MegaCli-8.04.07-1.noarch.rpm

# This makes MegaCli64 app available without the full path
ln -s /opt/MegaRAID/MegaCli/MegaCli64 /sbin/

Once installed, verify that you can see your hardware:
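
The command for this step is missing from the page. As a minimal sketch, assuming the adapter query used later on this page ('MegaCli64 AdpAllInfo aAll') is an acceptable check, something like the following could be used. The function name 'megacli_check' is hypothetical.

```shell
# Hypothetical helper: confirm MegaCli64 is reachable without the full path
# and print the first lines of the adapter summary.
megacli_check() {
    if ! command -v MegaCli64 >/dev/null 2>&1; then
        echo "MegaCli64 not found in PATH" >&2
        return 1
    fi
    MegaCli64 AdpAllInfo aAll | head -n 20
}
```

Run 'megacli_check' after creating the symlink; if it reports that MegaCli64 was not found, the 'ln -s' step probably did not take effect.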

Replacing a Failed Drive

Replacing a failed drive involves two steps;

  1. Identify the drive that is failing and gather the data needed to request the RMA.
  2. Swap the actual drive when the replacement arrives on the client site.

Identify the Failing Drive

If the drive has failed entirely, the red front LED on the drive should be lit, making identification and RMA request simple.

However, if the drive has not yet failed, identifying the drive and confirming its pending failure requires a little extra work.

Identify the failed drive:

MegaCli64 PDList aAll
<snip>

Enclosure Device ID: 10
Slot Number: 5
Drive's postion: DiskGroup: 0, Span: 0, Arm: 5
Enclosure position: 1
Device Id: 7
WWN: 5000C50054AE9C38
Sequence Number: 2
Media Error Count: 0
Other Error Count: 2
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SAS

Raw Size: 279.396 GB [0x22ecb25c Sectors]
Non Coerced Size: 278.896 GB [0x22dcb25c Sectors]
Coerced Size: 278.875 GB [0x22dc0000 Sectors]
Firmware state: Online, Spun Up
Device Firmware Level: 5301
Shield Counter: 0
Successful diagnostics completion on :  N/A
SAS Address(0): 0x5000c50054ae9c39
SAS Address(1): 0x0
Connected Port Number: 0(path0)
Inquiry Data: SEAGATE ST9300653SS     53016XN1EMF2    @#87980
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 6.0Gb/s
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive Temperature :29C (84.20 F)
PI Eligibility:  No
Drive is formatted for PI information:  No
PI: No PI
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Port-1 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Drive has flagged a S.M.A.R.T alert : No

Notice the line:

Other Error Count: 2

This is a sign of impending failure, despite SMART showing the drive as still healthy. To gather more details (which Fujitsu will require to verify impending failure), run:

MegaCli64 -AdpEventLog -GetEvents -f raid_events.log -aALL

When this finishes gathering data, it will create a file called raid_events.log. Send this file to your Fujitsu support rep. They will validate the pending failure and issue an RMA.
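
Reading the full PDList output by eye gets tedious with many drives. The filter below is a rough sketch that prints only the drives whose error counters are not all zero. The function name 'pd_error_summary' is hypothetical, and it assumes the counters appear in the order shown in the output above.

```shell
# Hypothetical helper: reads "MegaCli64 PDList aAll" output on stdin and
# prints one line per drive whose error counters are not all zero.
pd_error_summary() {
    awk -F': *' '
        /^Enclosure Device ID:/      { encl  = $2 }
        /^Slot Number:/              { slot  = $2 }
        /^Media Error Count:/        { media = $2 }
        /^Other Error Count:/        { other = $2 }
        /^Predictive Failure Count:/ {
            # ASSUMPTION: this counter is printed after the other two,
            # as in the sample output on this page.
            if (media + other + $2 > 0)
                printf "[%s:%s] media=%s other=%s predictive=%s\n", encl, slot, media, other, $2
        }
    '
}
```

Typical use would be 'MegaCli64 PDList aAll | pd_error_summary'. For the drive shown above, it would print '[10:5] media=0 other=2 predictive=0'.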

Identifying The Drive Prior to Replacement

If the drive has failed, identifying the drive is as simple as looking at the front of the node for the drive with the red error LED lit.

If the drive has not failed yet, then you can use the identify command to blink the LED. That is what we'll cover here.

In the previous section, we identified the failed drive using the MegaCli64 command. You need to note the following;

MegaCli64 PDList aAll
Enclosure Device ID: 10
Slot Number: 5
...
Other Error Count: 2

The two bits of information you need are the enclosure ID and slot number. In this case, that is 10 and 5, respectively.

With that info, you can trigger the drive's red LED using the following command;

MegaCli64 -PdLocate -start -physdrv [10:5] -aALL
Adapter: 0: Device at EnclId-10 SlotId-5  -- PD Locate Start Command was successfully sent to Firmware 

Exit Code: 0x00

Once you've located the drive, you can stop the "locate" command using:

MegaCli64 -PdLocate -stop -physdrv [10:5] -aALL
Adapter: 0: Device at EnclId-10 SlotId-5  -- PD Locate Stop Command was successfully sent to Firmware 

Exit Code: 0x00
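
If you tend to forget to turn the LED off again, the start and stop commands can be wrapped together. This is only a sketch; the function name 'pd_blink' and the 30 second default are my own choices, not anything MegaCli provides.

```shell
# Hypothetical wrapper: blink the locate LED for a while, then stop it.
# Usage: pd_blink ENCLOSURE SLOT [SECONDS]
pd_blink() {
    encl=$1; slot=$2; secs=${3:-30}
    # All three values must be plain non-negative integers.
    for v in "$encl" "$slot" "$secs"; do
        case $v in
            ''|*[!0-9]*)
                echo "usage: pd_blink ENCLOSURE SLOT [SECONDS]" >&2
                return 2 ;;
        esac
    done
    MegaCli64 -PdLocate -start -physdrv "[$encl:$slot]" -aALL || return 1
    sleep "$secs"
    MegaCli64 -PdLocate -stop -physdrv "[$encl:$slot]" -aALL
}
```

For the drive above, 'pd_blink 10 5 120' would give you two minutes to walk to the rack.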

Now that you know which drive has failed, you can OFFLINE it in preparation for replacing it.

MegaCli64 -PDOffline -physdrv [10:5] -aALL
Adapter: 0: EnclId-10 SlotId-5 state changed to OffLine.

Exit Code: 0x00

You can now physically remove the failed disk and insert the replacement disk.

Monitoring the Rebuild

When a replacement disk is inserted, the array should recognise it and automatically begin rebuilding the array. You can monitor this operation by calling;

MegaCli64 -PDRbld -ProgDsply -PhysDrv [10:5] -aALL

This will display the rebuild progress as a textual bar graph. A rebuild of a 300 GB 15,000rpm SAS drive in a 6-drive array took about 30 minutes. How long it takes in your case will vary depending on disk speed, array size and load.

Growing an Array

This assumes you've added additional disks to the node. In our example, we're growing a 3-disk RAID 5 array to a 4-disk array.

Start Point

View from parted:

parted -a opt /dev/sda "print free"
Model: LSI RAID 5/6 SAS 6G (scsi)
Disk /dev/sda: 599GB
Sector size (logical/physical): 512B/4096B
Partition Table: msdos

Number  Start   End     Size    Type      File system     Flags
        32.3kB  1049kB  1016kB            Free Space
 1      1049kB  538MB   537MB   primary   ext4            boot
 2      538MB   4833MB  4295MB  primary   linux-swap(v1)
 3      4833MB  26.3GB  21.5GB  primary   ext4
 4      26.3GB  599GB   573GB   extended                  lba
 5      26.3GB  333GB   307GB   logical
 6      333GB   599GB   266GB   logical

And the view from MegaCli64:

MegaCli64 LDInfo Lall aAll
Adapter 0 -- Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Name                :
RAID Level          : Primary-5, Secondary-0, RAID Level Qualifier-3
Size                : 557.75 GB
Sector Size         : 512
Is VD emulated      : No
Parity Size         : 278.875 GB
State               : Optimal
Strip Size          : 64 KB
Number Of Drives    : 3
Span Depth          : 1
Default Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy   : Disabled
Encryption Type     : None
Bad Blocks Exist: No
Is VD Cached: No

MegaCli64 PDList aAll
Adapter #0

Enclosure Device ID: 252
Slot Number: 0
Drive's position: DiskGroup: 0, Span: 0, Arm: 0
Enclosure position: N/A
Device Id: 13
WWN: 5000C50043EE29E0
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SAS

Raw Size: 279.396 GB [0x22ecb25c Sectors]
Non Coerced Size: 278.896 GB [0x22dcb25c Sectors]
Coerced Size: 278.875 GB [0x22dc0000 Sectors]
Sector Size:  0
Logical Sector Size:  0
Physical Sector Size:  0
Firmware state: Online, Spun Up
Device Firmware Level: 1703
Shield Counter: 0
Successful diagnostics completion on :  N/A
SAS Address(0): 0x5000c50043ee29e1
SAS Address(1): 0x0
Connected Port Number: 2(path0) 
Inquiry Data: SEAGATE ST3300657SS     17036SJ3T7X6    @#87980 
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None 
Device Speed: 6.0Gb/s 
Link Speed: 6.0Gb/s 
Media Type: Hard Disk Device
Drive:  Not Certified
Drive Temperature :42C (107.60 F)
PI Eligibility:  No 
Drive is formatted for PI information:  No
PI: No PI
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s 
Port-1 :
Port status: Active
Port's Linkspeed: Unknown 
Drive has flagged a S.M.A.R.T alert : No



Enclosure Device ID: 252
Slot Number: 1
Drive's position: DiskGroup: 0, Span: 0, Arm: 1
Enclosure position: N/A
Device Id: 12
WWN: 5000C5004310F4B4
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SAS

Raw Size: 279.396 GB [0x22ecb25c Sectors]
Non Coerced Size: 278.896 GB [0x22dcb25c Sectors]
Coerced Size: 278.875 GB [0x22dc0000 Sectors]
Sector Size:  0
Logical Sector Size:  0
Physical Sector Size:  0
Firmware state: Online, Spun Up
Device Firmware Level: 1703
Shield Counter: 0
Successful diagnostics completion on :  N/A
SAS Address(0): 0x5000c5004310f4b5
SAS Address(1): 0x0
Connected Port Number: 1(path0) 
Inquiry Data: SEAGATE ST3300657SS     17036SJ3CMMC    @#87980 
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None 
Device Speed: 6.0Gb/s 
Link Speed: 6.0Gb/s 
Media Type: Hard Disk Device
Drive:  Not Certified
Drive Temperature :45C (113.00 F)
PI Eligibility:  No 
Drive is formatted for PI information:  No
PI: No PI
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s 
Port-1 :
Port status: Active
Port's Linkspeed: Unknown 
Drive has flagged a S.M.A.R.T alert : No



Enclosure Device ID: 252
Slot Number: 2
Drive's position: DiskGroup: 0, Span: 0, Arm: 2
Enclosure position: N/A
Device Id: 11
WWN: 5000C500430189E4
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SAS

Raw Size: 279.396 GB [0x22ecb25c Sectors]
Non Coerced Size: 278.896 GB [0x22dcb25c Sectors]
Coerced Size: 278.875 GB [0x22dc0000 Sectors]
Sector Size:  0
Logical Sector Size:  0
Physical Sector Size:  0
Firmware state: Online, Spun Up
Device Firmware Level: 1703
Shield Counter: 0
Successful diagnostics completion on :  N/A
SAS Address(0): 0x5000c500430189e5
SAS Address(1): 0x0
Connected Port Number: 0(path0) 
Inquiry Data: SEAGATE ST3300657SS     17036SJ3CD2Z    @#87980 
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None 
Device Speed: 6.0Gb/s 
Link Speed: 6.0Gb/s 
Media Type: Hard Disk Device
Drive:  Not Certified
Drive Temperature :41C (105.80 F)
PI Eligibility:  No 
Drive is formatted for PI information:  No
PI: No PI
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s 
Port-1 :
Port status: Active
Port's Linkspeed: Unknown 
Drive has flagged a S.M.A.R.T alert : No

Insert New Disk

Insert the new physical disk into the enclosure and make sure it shows up.

MegaCli64 PDList aAll
<snip>

Enclosure Device ID: 252
Slot Number: 6
Enclosure position: N/A
Device Id: 5
WWN: 5000CCA00F5CA29F
Sequence Number: 9
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SAS

Raw Size: 419.186 GB [0x3465f870 Sectors]
Non Coerced Size: 418.686 GB [0x3455f870 Sectors]
Coerced Size: 418.656 GB [0x34550000 Sectors]
Sector Size:  0
Logical Sector Size:  0
Physical Sector Size:  0
Firmware state: Unconfigured(good), Spun Up
Device Firmware Level: A42B
Shield Counter: 0
Successful diagnostics completion on :  N/A
SAS Address(0): 0x5000cca00f5ca29d
SAS Address(1): 0x0
Connected Port Number: 3(path0) 
Inquiry Data: HITACHI HUS156045VLS600 A42BJVWMYA6L            
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None 
Device Speed: 6.0Gb/s 
Link Speed: 6.0Gb/s 
Media Type: Hard Disk Device
Drive:  Not Certified
Drive Temperature :29C (84.20 F)
PI Eligibility:  No 
Drive is formatted for PI information:  No
PI: No PI
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s 
Port-1 :
Port status: Active
Port's Linkspeed: Unknown 
Drive has flagged a S.M.A.R.T alert : No

There it is, [252:6].

Grow the Logical Disk

MegaCli64 LDRecon Start r5 add PhysDrv[252:6] L0 a0
Start Reconstruction of Virtual Drive Success.

Exit Code: 0x00

You can check the progress with:

MegaCli64 LDRecon ShowProg L0 a0
Reconstruction on VD #0 (target id #0) Completed 0% in 0 Minutes.

Exit Code: 0x00

Note: This took about 3 hours to complete on my Fujitsu RX300 S6 nodes, adding a 450 GB disk to an existing RAID 5 array of 3x 300 GB disks. All disks were 15krpm SAS disks. How long this process takes will depend on your system, but expect it to take a while and expect a high disk load during this time. Schedule accordingly!
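
Rather than re-running the progress command by hand for hours, a small polling loop can log the percentage for you. This is a sketch only. The function names are hypothetical, the parsing relies on the "Completed N% in M Minutes." line shown above, and it ASSUMES that no such line is printed once the reconstruction has finished, which I have not confirmed.

```shell
# Hypothetical helper: extract N from a "... Completed N% in M Minutes."
# line read on stdin; prints nothing if no such line is present.
recon_percent() {
    sed -n 's/.*Completed \([0-9][0-9]*\)% in .*/\1/p' | head -n 1
}

# Poll every INTERVAL seconds (default 60) while a percentage is reported.
# ASSUMPTION: the "Completed N%" line disappears when reconstruction is done.
wait_for_recon() {
    interval=${1:-60}
    while pct=$(MegaCli64 LDRecon ShowProg L0 a0 | recon_percent) && [ -n "$pct" ]; do
        echo "$(date '+%F %T') reconstruction at ${pct}%"
        sleep "$interval"
    done
    echo "No reconstruction progress reported; check LDInfo."
}
```

Running 'wait_for_recon 300' inside a screen session would log progress every five minutes. Always confirm the result with LDInfo rather than trusting the loop.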

Sit back and wait. When it's done, we can verify that the LD is larger.

MegaCli64 LDInfo Lall aAll
Adapter 0 -- Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Name                :
RAID Level          : Primary-5, Secondary-0, RAID Level Qualifier-3
Size                : 836.625 GB
Sector Size         : 512
Is VD emulated      : No
Parity Size         : 278.875 GB
State               : Optimal
Strip Size          : 64 KB
Number Of Drives    : 4
Span Depth          : 1
Default Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy   : Disabled
Ongoing Progresses:
  Background Initialization: Completed 35%, Taken 0 min.
Encryption Type     : None
Bad Blocks Exist: No
Is VD Cached: No

Used to be 557.75 GB, now it is 836.625 GB. Excellent!
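
Those two figures are consistent with the usual RAID 5 rule of (number of drives - 1) x member size, using the 278.875 GB coerced size of the original disks. The larger new disk only contributes as much space as the smallest member. A quick sanity check, with a hypothetical function name:

```shell
# Usable RAID 5 capacity: (drives - 1) x smallest member size, in GB.
raid5_usable_gb() {
    awk -v n="$1" -v size="$2" 'BEGIN { printf "%.3f\n", (n - 1) * size }'
}

raid5_usable_gb 3 278.875   # prints 557.750
raid5_usable_gb 4 278.875   # prints 836.625
```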

Update the OS

We need to tell the kernel to rescan /dev/sda so that it sees the new disk geometry:

echo 1 > /sys/block/sda/device/rescan

Nothing will be printed to screen, but if you check syslog, you will see:

Jul 30 22:41:23 an-c05n02 kernel: sda: detected capacity change from 598879502336 to 898319253504
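
The byte counts in that log line line up with both tools once the units are accounted for: MegaCli64's "GB" figures work out to binary units (2^30 bytes), while parted reports decimal GB. A small sketch to check the conversion, with hypothetical function names:

```shell
# Convert a byte count to binary GiB (what MegaCli64 labels "GB").
bytes_to_gib() {
    awk -v b="$1" 'BEGIN { printf "%.3f\n", b / (1024 * 1024 * 1024) }'
}

# Convert a byte count to decimal GB, rounded (what parted shows).
bytes_to_gb() {
    awk -v b="$1" 'BEGIN { printf "%.0f\n", b / 1000000000 }'
}

bytes_to_gib 598879502336   # prints 557.750
bytes_to_gib 898319253504   # prints 836.625
bytes_to_gb  898319253504   # prints 898
```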

We can confirm this with parted:

parted /dev/sda "print free"
Model: LSI RAID 5/6 SAS 6G (scsi)
Disk /dev/sda: 898GB
Sector size (logical/physical): 512B/4096B
Partition Table: msdos

Number  Start   End     Size    Type      File system     Flags
        32.3kB  1049kB  1016kB            Free Space
 1      1049kB  538MB   537MB   primary   ext4            boot
 2      538MB   4833MB  4295MB  primary   linux-swap(v1)
 3      4833MB  26.3GB  21.5GB  primary   ext4
 4      26.3GB  599GB   573GB   extended                  lba
 5      26.3GB  333GB   307GB   logical
 6      333GB   599GB   266GB   logical
        599GB   898GB   299GB             Free Space

Bingo!

Repartitioning the nodes

TODO: Document these steps properly.

We're going to have to reboot when we're done, so be sure that the first node is out of the cluster.

Warning: Running drbdadm invalidate <res> after recreating the partitions seems to be critical to ensure that a partial sync doesn't occur after the partitions are rebuilt, particularly for "r0", whose partition didn't really move.

  1. Once out, use parted to delete partitions 4, 5 and 6 (the extended partition and the two logical partitions backing DRBD's r0 and r1).
  2. Create the new extended and logical partitions with the new sizes.
  3. Reboot to make sure the OS reads the partition table changes.
  4. Zero-out the new /dev/sda5 and /dev/sda6. (dd if=/dev/zero of=/dev/sdaX bs=4M count=1000)
  5. Use drbdadm create-md r{0,1} to setup the new resources.
  6. Start cman to ensure fencing is available, then modprobe drbd.
  7. Attach the backing devices, INVALIDATE both resources and then connect the resources. They should begin to sync.
  8. Wait for the sync to complete, then restart rgmanager, migrate the servers over, withdraw an-c05n02 and delete/recreate its partitions.
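
Until these steps are documented properly, here is a dry-run sketch of steps 1 through 7. It only prints commands and runs nothing. The function name and all four arguments are placeholders; in particular, the boundary between the two logical partitions is something you need to choose yourself from the "print free" output. The resource names r0 and r1 come from the steps above.

```shell
# Dry-run sketch of steps 1-7 above: prints commands, executes nothing.
# Arguments (all placeholders):
#   $1 disk, $2 start of extended partition,
#   $3 boundary between the two logical partitions, $4 end of disk
repartition_plan() {
    disk=$1; start=$2; split=$3; end=$4
    cat <<EOF
# Step 1: remove the logical partitions, then the extended partition.
parted -s -a opt $disk "rm 6"
parted -s -a opt $disk "rm 5"
parted -s -a opt $disk "rm 4"
# Step 2: recreate them with the new sizes.
parted -s -a opt $disk "mkpart extended $start $end"
parted -s -a opt $disk "mkpart logical $start $split"
parted -s -a opt $disk "mkpart logical $split $end"
# Step 3: reboot so the OS re-reads the partition table.
reboot
# Step 4 (after the reboot): zero the start of each new partition.
dd if=/dev/zero of=${disk}5 bs=4M count=1000
dd if=/dev/zero of=${disk}6 bs=4M count=1000
# Step 5: create fresh DRBD metadata.
drbdadm create-md r0
drbdadm create-md r1
# Step 6: make fencing available, then load DRBD.
/etc/init.d/cman start
modprobe drbd
# Step 7: attach, invalidate, connect.
drbdadm attach r0
drbdadm attach r1
drbdadm invalidate r0
drbdadm invalidate r1
drbdadm connect r0
drbdadm connect r1
EOF
}
```

For example, 'repartition_plan /dev/sda 26.3GB 400GB 898GB' prints a plan where 400GB is a made-up split point. Review every line before running any of it; these commands destroy data.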

Grow DRBD Resources


Partition Manipulation

TBC

Checking the BBU

Check the BBU;

MegaCli64 AdpBbuCmd aAll

Trigger a re-learn cycle:

MegaCli64 AdpBbuCmd BbuLearn aAll

Monitor a learn cycle:

watch "MegaCli64 AdpBbuCmd aAll | grep -i -e learn -e status -e charge -e operation"

Updating Firmware

Tested on:

  • RX200 S8

Dependencies

yum install compat-libstdc++-33.i686 libstdc++.i686 libstdc++-devel.i686

Update Disk

Download:

Download 'ftp://ftp.ts.fujitsu.com/images/serverview/UPDATE_DVD_111402_00.iso' (or a more recent version).

iRMC

Warning: Update the iRMC before updating the BIOS.

Go to: http://support.ts.fujitsu.com/download/Showdescription.asp

Select the machine, RHEL 6 x86_64, choose Server Management Controller - iRMC S4 (Kronos 4) (Onboard on D3302-A1x) - Download top 'iRMC S4 (Kronos4) Firmware - RX200 S8 (ASP for Linux)'.

# -u == update only, don't reflash or downgrade
# -k == keep extracted files.
sh RX200S8_MangtCtr_FW0722F_SDR354.scexe -u -k

This will start the iRMC firmware update. You need to confirm:

***************** PRIMERGY Autonomous Support Package ***************

Description: iRMC S4 (Kronos4) Firmware - RX200 S8  
VersionMajor: RX200S8_7.22F_sdr03.54
VersionBuild: 1.0.0
Software Class - Category: Flash - Firmware 
Software Class - Name: (SV) Flash - Firmware 
Vendor: Fujitsu Technology Solutions 

Caution!
After firmware update has finished, iRMC will be rebooted automatically.

*********************************************************************

Continue processing this ASP?
Please answer: yes/y or no/n

The install will take a while.

*********************************************************************


                          CAUTION!

         Currently a new version is being installed.

       The installation process will take a long time.

                   .... please wait ....

          Don't interrupt this installation process!



*********************************************************************
*********************************************************************


                        SUCCESS!

           A new version was successfully installed.

*********************************************************************

When it's done, you will hear the fans spin.

BIOS

Go to: http://support.ts.fujitsu.com/download/Showdescription.asp

Select the machine, RHEL 6 x86_64, choose Flash - BIOS - Flash Bios for D3302-A1x - Download the top one.

Extract on the target and run:

# -u == update only, don't reflash or downgrade
# -k == keep extracted files.
sh RX200S8_D3302_BiosV4654_R160.scexe -u -k
***************** PRIMERGY Autonomous Support Package ***************

Description: Flash BIOS RX200 S8  
VersionMajor: V4.6.5.4
VersionMinor: R1.6.0
VersionBuild: 1.0.0
Software Class - Category: Flash - BIOS 
Software Class - Name: (SV) Flash Bios 
Vendor: Fujitsu Technology Solutions 

*********************************************************************

Continue processing this ASP?
Please answer: yes/y or no/n

ACK

*********************************************************************


                          CAUTION!

         Currently a new version is being installed.

       The installation process will take a long time.

                   .... please wait ....

          Don't interrupt this installation process!



*********************************************************************
*********************************************************************


                           FAILED!

         The new version was not correctly installed.



*********************************************************************

In this case, the BIOS was already up to date. Verify via 'dmidecode':

# dmidecode 2.12
SMBIOS 2.7 present.
88 structures occupying 4084 bytes.
Table at 0x7C8BD018.

Handle 0x0000, DMI type 0, 24 bytes
BIOS Information
        Vendor: FUJITSU // American Megatrends Inc.
        Version: V4.6.5.4 R1.6.0 for D3302-A1x
        Release Date: 01/30/2014
        Address: 0xF0000
        Runtime Size: 64 kB
        ROM Size: 13248 kB
        Characteristics:
                PCI is supported
                BIOS is upgradeable
                BIOS shadowing is allowed
                Boot from CD is supported
                Selectable boot is supported
                EDD is supported
                Print screen service is supported (int 5h)
                Serial services are supported (int 14h)
                Printer services are supported (int 17h)
                ACPI is supported
                USB legacy is supported
                BIOS boot specification is supported
                Targeted content distribution is supported
                UEFI is supported
        BIOS Revision: 1.6

And compare against the latest version on the download page under "BIOS Update - Admin Pack for D3302-A1x"

V4.6.5.4 - R1.6.0

Warning: If the update succeeds, reboot immediately afterwards.
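
If you would rather not read the whole 'dmidecode' dump, the version string alone can be queried with 'dmidecode -s bios-version' and compared against the version on the download page. A minimal sketch, with a hypothetical function name:

```shell
# Hypothetical helper: succeed if the running BIOS version string starts
# with the expected version, e.g. "V4.6.5.4 R1.6.0".
bios_matches() {
    current=$(dmidecode -s bios-version) || return 2
    case $current in
        "$1"*) echo "BIOS is at $current"; return 0 ;;
        *)     echo "BIOS is at $current, expected $1"; return 1 ;;
    esac
}
```

On the node above, 'bios_matches "V4.6.5.4 R1.6.0"' should succeed, given the version reported by 'dmidecode'. Note that the download page writes the version with a dash between the two parts, while 'dmidecode' does not.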

Done!

D3116C RAID Controller

Warning: Immediately reboot when update completes!

Go to: http://support.ts.fujitsu.com/download/Showdescription.asp

Select the machine, RHEL 6 x86_64, choose SAS RAID - RAID Ctrl SAS 6G 1GB (D3116C) - Firmware for RAID Ctrl SAS 6G D3116C (ASP for Linux)

Extract on the target and run:

# -u == update only, don't reflash or downgrade
# -k == keep extracted files.
sh FTS_FirmwareforRAIDCtrlSAS6GD3116CASPforLinu_2390003332201252420100_1106185.SCEXE -u -k
***************** PRIMERGY Autonomous Support Package ***************

Description: Firmware for RAID Ctrl SAS 6G D3116C  
VersionMajor: 23.9.0-0033
VersionMinor: 3.220.125-2420
VersionBuild: 1.0.0
Software Class - Category: Flash - Firmware 
Software Class - Name: (SV) Flash - Firmware 
Vendor: LSI Logic 

*********************************************************************

Continue processing this ASP?
Please answer: yes/y or no/n

ACK

*********************************************************************


                          CAUTION!

         Currently a new version is being installed.

       The installation process will take a long time.

                   .... please wait ....

          Don't interrupt this installation process!



*********************************************************************
*********************************************************************


                       Attention! 

                     Successfully flashed.
                      Reboot your system.




*********************************************************************

REBOOT

reboot

Once rebooted, reset the controller to factory defaults. This will _not_ affect the Logical Disk! If you followed the AN!Cluster Tutorial 2, then you will not need to change any settings. It does require another reboot, though.

Record the current settings (we'll diff after the reset):

MegaCli64 AdpAllInfo aAll > Adapter.pre-reset
MegaCli64 LDInfo Lall aAll > LD.pre-reset

Reset:

MegaCli64 AdpFacDefSet a0
Adapter 0: Factory Default Set Successfully. 
Please reboot the system for the changes to take effect

Exit Code: 0x00

Now reboot again.

reboot

Now dump the new settings and check for changes:

MegaCli64 AdpAllInfo aAll > Adapter.post-reset
MegaCli64 LDInfo Lall aAll > LD.post-reset

Check for differences:

diff -u Adapter.pre-reset Adapter.post-reset
--- Adapter.pre-reset	2014-04-14 15:48:35.011122984 -0400
+++ Adapter.post-reset	2014-04-14 15:52:27.445603604 -0400
@@ -72,11 +72,11 @@
 Temperature sensor for ROC    : Present
 Temperature sensor for controller    : Absent
 
-ROC temperature : 74  degree Celsius
+ROC temperature : 75  degree Celsius
 
                 Settings
                 ================
-Current Time                     : 19:48:35 4/14, 2014
+Current Time                     : 19:52:26 4/14, 2014
 Predictive Fail Poll Interval    : 300sec
 Interrupt Throttle Active Count  : 16
 Interrupt Throttle Completion    : 50us
diff -u LD.pre-reset LD.post-reset
# no output
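
The only differences above are the temperature reading and the clock, which will change on every run. To make real setting changes stand out, those two lines can be filtered before diffing. This is a sketch; the function name is hypothetical and the two patterns are simply the lines that differed in my output.

```shell
# Hypothetical helper: diff two AdpAllInfo dumps, ignoring lines that
# change on every run (ROC temperature and current time).
settings_diff() {
    pre=$(mktemp) || return 2
    post=$(mktemp) || { rm -f "$pre"; return 2; }
    grep -v -e '^ROC temperature' -e '^Current Time' "$1" > "$pre"
    grep -v -e '^ROC temperature' -e '^Current Time' "$2" > "$post"
    rc=0
    diff -u "$pre" "$post" || rc=$?
    rm -f "$pre" "$post"
    return "$rc"
}
```

'settings_diff Adapter.pre-reset Adapter.post-reset' prints nothing and returns 0 when no real settings changed.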

Done!

Creating a Backup Set

ToDo: Explain this...

mkdir /root/base
cd /root/base
mkdir /root/base/root
mkdir -p /root/base/etc/sysconfig/network-scripts
mkdir -p /root/base/etc/udev/rules.d
mkdir -p /root/base/etc/init.d
mkdir -p /root/base/var/spool/cron

# Root user
rsync -av /root/.bashrc   /root/base/root/
rsync -av /root/.ssh      /root/base/root/
rsync -av /root/an-cm*    /root/base/root/
rsync -av /root/archive_* /root/base/root/

# Directories
rsync -av /etc/ssh     /root/base/etc/
rsync -av /etc/apcupsd /root/base/etc/
rsync -av /etc/cluster /root/base/etc/
rsync -av /etc/drbd.*  /root/base/etc/
rsync -av /etc/an      /root/base/etc/
rsync -av /etc/yum     /root/base/etc/
rsync -av /etc/pki     /root/base/etc/
rsync -av --exclude 'archive' --exclude 'cache' --exclude 'backup' /etc/lvm /root/base/etc/

# Specific files.
rsync -av /etc/sysconfig/network-scripts/ifcfg-{eth*,bond*,vbr*} /root/base/etc/sysconfig/network-scripts/
rsync -av /etc/udev/rules.d/70-persistent-net.rules              /root/base/etc/udev/rules.d/
rsync -av /etc/sysconfig/network /root/base/etc/sysconfig/
rsync -av /etc/hosts             /root/base/etc/
rsync -av /etc/ntp.conf          /root/base/etc/
rsync -av /etc/init.d/apcupsd    /root/base/etc/init.d/
rsync -av /var/spool/cron/root   /root/base/var/spool/cron/

# Save recreating user accounts.
rsync -av /etc/passwd            /root/base/etc/
rsync -av /etc/group             /root/base/etc/
rsync -av /etc/shadow            /root/base/etc/
rsync -av /etc/gshadow           /root/base/etc/

# If you have the cluster built and want to back up its configs.
# (-p avoids an error if the rsync calls above already created these.)
mkdir -p /root/base/etc/cluster
mkdir -p /root/base/etc/lvm
rsync -av /etc/cluster/cluster.conf /root/base/etc/cluster/

# NOTE: DRBD won't work until you've manually created the partitions.
rsync -av /etc/drbd.d /root/base/etc/

# If you're running RHEL and want to backup your registration info;
if [ -e "/etc/sysconfig/rhn" ]
then
	rsync -av /etc/sysconfig/rhn /root/base/etc/sysconfig/
fi

# Back up the logical and extended partition structure.
# The shebang is written once, before the loop, so that a second disk does
# not truncate the commands already generated for the first one.
echo '#!/bin/bash' > /root/base/root/partition_drives.sh
for d in $(fdisk -l | grep 'Disk /dev' | grep -v mapper | sed 's/Disk \(.*\):.*/\1/')
do
        # Machine-readable lines look like "4:26.3GB:599GB:573GB:::lba;".
        for i in $(parted -m -s -a opt $d "print free" | grep '^[4-9]:')
        do
                if echo "$i" | grep -q '^4:'
                then
                        # Partition 4 is the extended partition.
                        echo "$d:$i" | perl -pe 's/^(.*?):(\d+):(.*?):(.*?):.*/parted -s -a opt \1 "mkpart extended \3 \4"/'
                else
                        echo "$d:$i" | perl -pe 's/^(.*?):(\d+):(.*?):(.*?):.*/parted -s -a opt \1 "mkpart logical \3 \4"/'
                fi
        done
done >> /root/base/root/partition_drives.sh
chmod 755 /root/base/root/partition_drives.sh

# Pack it up
# NOTE: Change the name to suit your node.
cd /root/
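# Store paths relative to /root/base so that extracting with '-C /' puts
# the files back in their original locations.
tar -cvf base_$(hostname -s).tar -C /root/base etc root var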
ls -lah /root/base_*
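
Before copying the archive anywhere, it may be worth a quick look at what is actually inside it. The helper below is a sketch with a hypothetical name; it lists the unique top-level entries of a tar file.

```shell
# Hypothetical helper: list the unique top-level entries in a tar archive.
archive_top_dirs() {
    tar -tf "$1" | cut -d/ -f1 | sort -u
}
```

Running 'archive_top_dirs /root/base_$(hostname -s).tar' shows at a glance whether the archive will unpack relative to / (entries such as etc, root and var) or into a nested path.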

Now copy it to your PXE server. In my case, that is 10.255.255.250. I am backing up RHEL6 nodes, so I store my backups in /var/www/html/rhel6/x86_64/files/.

rsync -av /root/base_$(hostname -s).tar root@10.255.255.250:/var/www/html/rhel6/x86_64/files/
root@10.255.255.250's password: 
sending incremental file list
base_an-c05n02.tar

sent 4045378 bytes  received 31 bytes  898979.78 bytes/sec
total size is 4044800  speedup is 1.00

Now that it is on the server, I can use the following %post kickstart script entry in dedicated, per-node kickstart scripts.

%post
# Download the backup files and load them.
cd ~
wget http://10.255.255.250/rhel6/x86_64/files/base_an-c05n01.tar
cp base_an-c05n01.tar /mnt/sysimage/root/
/etc/init.d/network stop
# The archive is an uncompressed tar, so no -z here.
tar -xvf base_an-c05n01.tar -C /
rm -f /etc/udev/rules.d/70-persistent-net.rules
start_udev
/etc/init.d/network start
/mnt/sysimage/root/partition_drives.sh

When the install finishes, that will load all the files we backed up. So when the node reboots, all of its old RHN registration, network configs and so on will be restored. This should dramatically reduce recovery time!

Specific Fujitsu Model Notes

RX200 S7

BIOS Changes for use in Anvil!s

To enter the BIOS, press F2 during POST.

  • Advanced
    • SATA Configuration
      • SATA Mode -> Disabled (if no optical drive)
      • - OR -
      • SATA Mode -> AHCI Mode (if optical drive)
    • Onboard Device Configuration
      • LAN 2 Oprom -> PXE
      • Onboard SAS/SATA (SCU) -> Disabled (if no optical drive)
  • Server Mgmt
    • Asset Tag -> short host name of the node, lower case
    • Temperature Monitoring -> Enabled
  • Boot
    • Bootup NumLock State -> On

Save and Exit

LSI RAID Controller Setup

Warning: These instructions assume NO existing data. This is a destructive process!

After POST, press <ctrl> + H to enter the controller's WebBIOS

  • Choose the controller (usually only one available), click Start
  • Choose Configuration Wizard
    • Choose New Configuration -> click Next
    • Confirm the clear by clicking Yes
    • Choose Manual Configuration -> click Next
      • Click on the first drive, usually Slot: 0,..., press and hold <ctrl> button and then click to select the rest of the drives. Click Add to Array.
      • Click on Accept DG to create the drive group then click on Next
      • Click on Add to SPAN, then click on Next
      • Configure the array;
        • For 1 to 8 drives;
          • RAID Level -> RAID 5
          • Write Policy -> Write Back with BBU
          • Look at the R5:xxxx size on the right and enter that size in the Select Size section. Be sure to match the GB or TB suffix.
        • For 9 or more drives;
          • RAID Level -> RAID 6
          • Write Policy -> Write Back with BBU
          • Look at the R6:xxxx size on the right and enter that size in the Select Size section. Be sure to match the GB or TB suffix.
        • Click on Accept then click on Yes to accept the warning.
        • The virtual disk will now be shown on the right. Click Next to proceed.
      • Click Accept and then click Yes to save the configuration. Click Yes to acknowledge the warning and initialize the drive.
    • Click on Set Boot Drive and then click Go.
    • Click on Home.
  • Click on Exit and then on Yes.

Reboot and you are done.


 

© Alteeve's Niche! Inc. 1997-2024   Anvil! "Intelligent Availability®" Platform
legal stuff: All info is provided "As-Is". Do not use anything here unless you are willing and able to take responsibility for your own actions.