Fujitsu Notes

This covers specific setup notes for Fujitsu Primergy servers on EL6.

iRMC Remote KVM Access Through a Firewall

  1. Port-forward port 80 (e.g. <public ip>:41080 -> <ipmi ip>:80).
  2. Log in, then go to Network Settings -> Ports and Services -> HTTPS Port.
    1. The default is '443', but with multiple nodes they can't all listen on 443 (unless you have many external IPs). In this example, I change it to '41443'.
    2. Save and log out.
  3. Update the firewall to forward <public ip>:41443 -> <ipmi ip>:41443.
  4. Connect to https://<public ip>:41443

Now iKVM will work.
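
As a minimal sketch of the two forwards above, assuming the firewall is a Linux machine using iptables with IP forwarding already enabled (the page does not say what firewall is in use), the helper below only prints the rules so they can be reviewed before being applied. The function name and the addresses 203.0.113.10 and 10.20.41.1 are hypothetical placeholders.

```shell
# Print (do not apply) the iptables rules that forward one public port to an
# iRMC. Arguments: <public ip> <ipmi ip> <public port> <ipmi port>
irmc_forward_rules() (
    public_ip="$1"; ipmi_ip="$2"; public_port="$3"; ipmi_port="$4"
    # DNAT rewrites the destination of packets arriving on the public address.
    echo "iptables -t nat -A PREROUTING -d ${public_ip} -p tcp --dport ${public_port} -j DNAT --to-destination ${ipmi_ip}:${ipmi_port}"
    # The rewritten packets must also be allowed through the FORWARD chain.
    echo "iptables -A FORWARD -d ${ipmi_ip} -p tcp --dport ${ipmi_port} -j ACCEPT"
)

# Example (placeholder addresses):
#   irmc_forward_rules 203.0.113.10 10.20.41.1 41080 80
#   irmc_forward_rules 203.0.113.10 10.20.41.1 41443 41443
```

Printing the rules rather than running them keeps the sketch safe to try on any machine; pipe the output to `sh` on the firewall once it looks right.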

iRMC HTML5 Video Redirection

Any server using iRMC S4 version 8.01 and up (8.24F current as of June 2016) has the option of using HTML5 rather than Java for video console redirection. To enable said functionality, from the iRMC web interface:

-> Console Redirection -> Video Redirection

And select HTML5 Viewer Enabled.


Primergy RX200 S8

IPMI

Early RX200 S8 servers suffered from the IPMI BMC hang bug. A hard power cycle of the system (full removal of input power) clears this issue.

In addition, when power is restored, an IPMI power-on command will occasionally return 'failed' and then complete the power-on about 30-60 seconds later. This is not a problem in itself, but the error message is concerning when you see it.
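
Because the power-on can still complete after the command reports 'failed', a script can poll the power state before treating it as an error. This is a minimal sketch, assuming ipmitool is the IPMI client (the page does not name one) and relying on the "Chassis Power is on" text that `ipmitool chassis power status` prints. The function name and retry values are hypothetical.

```shell
# Poll a status command until it reports that the chassis is powered on.
# Arguments: <tries> <delay in seconds> <status command and its arguments...>
wait_for_power_on() (
    tries="$1"; delay="$2"; shift 2
    while [ "$tries" -gt 0 ]; do
        if "$@" 2>/dev/null | grep -q 'Chassis Power is on'; then
            exit 0
        fi
        tries=$((tries - 1))
        # Only sleep if another attempt is still to come.
        [ "$tries" -gt 0 ] && sleep "$delay"
    done
    exit 1
)

# Example: check every 10 seconds for up to two minutes.
#   wait_for_power_on 12 10 ipmitool -I lanplus -H <ipmi ip> -U <user> -P <password> chassis power status
```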

BIOS

  • Advanced
    • SATA Configuration
      • SATA Mode -> Disabled
  • Server Mgmt
    • Asset Tag -> (short host name)
    • Temperature Monitoring -> Enabled
  • Boot
    • Bootup NumLock State -> On
    • PXE Boot Option Retry -> Enabled

RAID Controller (D3116C)

  • Configuration Wizard
    • New Configuration -> Next
    • Confirm clear config -> Yes
    • Manual Configuration -> Next
      • Press and hold <ctrl>, Click to highlight all Drives in left pane -> Add to Array -> Accept DG -> Next.
      • Add to SPAN -> Next
        • RAID Level; 1-8 drives == RAID 5, 9+ drives == RAID 6
        • Write Policy -> Write Back with BBU
        • Select Size -> Enter size in green text under right pane; R5 size for RAID 5, R6 size for RAID 6.
        • Accept -> Confirm cache policy; Yes -> Next
    • Accept
    • Save the configuration; Yes -> Confirm existing data wipe; Yes
    • Click to select Set Boot Drive -> Go -> Back
  • Exit -> Confirm exit; Yes

Reboot.

Primergy RX300 S7

RAID

Install the MegaCLI tools;

Check for an updated MegaCLI from here (under "Management Software and Tools"). If there is an updated version, follow the steps below, substituting the new version number in the file names.

mkdir ~/temp
cd ~/temp
# Download the updated 8.04.07_MegaCLI.zip here
unzip 8.04.07_MegaCLI.zip
unzip CLI_Lin_8.04.07.zip 
unzip MegaCliLin.zip
rpm -Uvh MegaCli-8.04.07-1.noarch.rpm Lib_Utils-1.00-09.noarch.rpm
 
# This makes MegaCli64 app available without the full path
ln -s /opt/MegaRAID/MegaCli/MegaCli64 /sbin/

If you want to install from the AN!Cache, you can do this;

rpm -Uvh https://alteeve.ca/files/Lib_Utils-1.00-09.noarch.rpm https://alteeve.ca/files/MegaCli-8.04.07-1.noarch.rpm
 
# This makes MegaCli64 app available without the full path
ln -s /opt/MegaRAID/MegaCli/MegaCli64 /sbin/

Once installed, verify that you can see your hardware:
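
One way to run that check, as a minimal sketch: confirm the binary is reachable, then dump the adapter information with the same `AdpAllInfo aAll` call used later on this page. The function name and the optional argument (a different binary name or full path) are hypothetical.

```shell
# Confirm MegaCli64 is reachable, then print the adapter details.
# Optional argument: the binary to use (defaults to MegaCli64 found in PATH).
check_megacli() (
    tool="${1:-MegaCli64}"
    if ! command -v "$tool" >/dev/null 2>&1; then
        echo "$tool not found; check the symlink created in /sbin" >&2
        exit 1
    fi
    "$tool" AdpAllInfo aAll
)

# Example:
#   check_megacli
#   check_megacli /opt/MegaRAID/MegaCli/MegaCli64
```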

Replacing a Failed Drive

Replacing a failed drive involves two steps;

  1. Identifying the failing drive and gathering the data needed to request the RMA.
  2. Swapping the drive when the replacement arrives at the client site.

Identify the Failing Drive

If the drive has failed entirely, the red front LED on the drive should be lit, making identification and RMA request simple.

However, if the drive has not yet failed, identifying the drive and confirming its pending failure requires a little extra work.

Identify the failed drive:

MegaCli64 PDList aAll
<snip>
 
Enclosure Device ID: 10
Slot Number: 5
Drive's postion: DiskGroup: 0, Span: 0, Arm: 5
Enclosure position: 1
Device Id: 7
WWN: 5000C50054AE9C38
Sequence Number: 2
Media Error Count: 0
Other Error Count: 2
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SAS
 
Raw Size: 279.396 GB [0x22ecb25c Sectors]
Non Coerced Size: 278.896 GB [0x22dcb25c Sectors]
Coerced Size: 278.875 GB [0x22dc0000 Sectors]
Firmware state: Online, Spun Up
Device Firmware Level: 5301
Shield Counter: 0
Successful diagnostics completion on :  N/A
SAS Address(0): 0x5000c50054ae9c39
SAS Address(1): 0x0
Connected Port Number: 0(path0)
Inquiry Data: SEAGATE ST9300653SS     53016XN1EMF2    @#87980
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 6.0Gb/s
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive Temperature :29C (84.20 F)
PI Eligibility:  No
Drive is formatted for PI information:  No
PI: No PI
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Port-1 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Drive has flagged a S.M.A.R.T alert : No

Notice the line:

Other Error Count: 2

This is a sign of impending failure, despite SMART showing the drive as still healthy. To gather more details (which Fujitsu will require to verify impending failure), run:

MegaCli64 -AdpEventLog -GetEvents -f raid_events.log -aALL

When this finishes gathering data, it will create a file called raid_events.log. Send this file to your Fujitsu support rep. They will validate the pending failure and issue an RMA.
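
On a node with many drives, scanning the full PDList output by eye is tedious. The following is a minimal sketch that reads that output on standard input and prints only the drives with a non-zero error counter, using the field names shown in the listing above. The function name and the output layout are hypothetical.

```shell
# Read 'MegaCli64 PDList aAll' output on stdin and print one line for each
# drive whose media, other or predictive failure count is non-zero.
pd_error_summary() {
    awk -F': *' '
        function report() {
            if (slot != "" && (media + other + pred) > 0)
                printf "[%s:%s] media=%d other=%d predictive=%d\n", encl, slot, media, other, pred
        }
        # A new "Enclosure Device ID" line starts the next drive record.
        /^Enclosure Device ID:/      { report(); encl = $2; slot = ""; media = other = pred = 0 }
        /^Slot Number:/              { slot = $2 }
        /^Media Error Count:/        { media = $2 }
        /^Other Error Count:/        { other = $2 }
        /^Predictive Failure Count:/ { pred = $2 }
        END                          { report() }
    '
}

# Example:
#   MegaCli64 PDList aAll | pd_error_summary
```

The bracketed prefix is the enclosure ID and slot number, which is the form the `-physdrv` argument expects.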

Identifying The Drive Prior to Replacement

If the drive has failed, identifying the drive is as simple as looking at the front of the node for the drive with the red error LED lit.

If the drive has not failed yet, then you can use the identify command to blink the LED. That is what we'll cover here.

In the previous section, we identified the failed drive using the MegaCli64 command. You need to note the following;

MegaCli64 PDList aAll
Enclosure Device ID: 10
Slot Number: 5
...
Other Error Count: 2

The two bits of information you need are the enclosure ID and slot number. In this case, that is 10 and 5, respectively.

With that info, you can trigger the drive's red LED using the following command;

MegaCli64 -PdLocate -start -physdrv [10:5] -aALL
Adapter: 0: Device at EnclId-10 SlotId-5  -- PD Locate Start Command was successfully sent to Firmware 
 
Exit Code: 0x00

Once you've located the drive, you can stop the "locate" command using:

MegaCli64 -PdLocate -stop -physdrv [10:5] -aALL
Adapter: 0: Device at EnclId-10 SlotId-5  -- PD Locate Stop Command was successfully sent to Firmware 
 
Exit Code: 0x00

Now that you know which drive is failing, you can OFFLINE it in preparation for replacing it.

MegaCli64 -PDOffline -physdrv [10:5] -aALL
Adapter: 0: EnclId-10 SlotId-5 state changed to OffLine.
 
Exit Code: 0x00

You can now physically remove the failed disk and insert the replacement disk.

Monitoring the Rebuild

When a replacement disk is inserted, the array should recognise it and automatically begin rebuilding the array. You can monitor this operation by calling;

MegaCli64 -PDRbld -ProgDsply -PhysDrv [10:5] -aALL

This will display the rebuild progress as a textual bar graph. A rebuild of a 300 GB 15,000rpm SAS drive in a 6-drive array took about 30 minutes. How long it takes in your case will vary depending on disk speed, array size and load.
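
For unattended monitoring, MegaCli also has a `-ShowProg` variant of the same command that prints the progress once instead of drawing a bar. This is a minimal sketch of extracting the percentage from it. It assumes the output contains text of the form "Completed 38% in 5 Minutes", so check that against your MegaCli version. The function name is hypothetical.

```shell
# Extract the rebuild percentage from 'MegaCli64 -PDRbld -ShowProg ...'
# output read on stdin. Prints nothing if no percentage is present.
# ASSUMPTION: the output contains "Completed <n>%".
rebuild_percent() {
    sed -n 's/.*Completed \([0-9][0-9]*\)%.*/\1/p'
}

# Example: report once a minute until no percentage is returned.
#   while pct="$(MegaCli64 -PDRbld -ShowProg -PhysDrv '[10:5]' -aALL | rebuild_percent)" && [ -n "$pct" ]; do
#       echo "rebuild at ${pct}%"
#       sleep 60
#   done
```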

Checking the BBU

Check the BBU;

MegaCli64 AdpBbuCmd aAll

Trigger a re-learn cycle:

MegaCli64 AdpBbuCmd BbuLearn aAll

Monitor a learn cycle:

watch "MegaCli64 AdpBbuCmd aAll | grep -i -e learn -e status -e charge -e operation"

Updating Firmware

Tested on:

  • RX200 S8

Dependencies

yum install kernel-headers gcc gcc-c++ compat-libstdc++-33.i686 libstdc++.i686 libstdc++-devel.i686 kernel-devel

iRMC

Warning: Update the iRMC before updating the BIOS.

Go to: Fujitsu Downloads

Select:

  • "Product Search"
  • Enter the server's model number in the 'Product Search' text field. For example; "RX2540M1". Note that you can enter a partial model number and it will ask you to select the proper model from a list.
  • Click the 'Selected operating system' drop down list and choose "Red Hat Enterprise Linux 6 (x86_64)". Note that you might need to click on the selection box twice to get the pop-up menu with the OS selection list to appear.
  • Under the "Driver" tab;
  • Click to expand "Server Management Controller"
    • Click to expand "iRMC S4 (Kronos 4) (Onboard on D3289-A1x)"
    • Verify the title is 'iRMC S4 (Kronos4) Firmware - RX2540 M1 (ASP for Linux)'. If so, click on "Direct Download" on the left.
Warning: Read and understand the notes and warnings!
  • When you are ready, click to check the "Terms of Use" checkbox and then click on "Download File".
  • Locate the file on your computer and extract the zip file.
  • Copy the RX2540M1_MangtCtr_<version>.scexe to the node.
Note: In this tutorial, the file name we're using is 'RX2540M1_MangtCtr_FW0824F_SDR367.scexe'.

Log into the node and run:

# -u == update only, don't reflash or downgrade
# -k == keep extracted files.
sh RX2540M1_MangtCtr_FW0824F_SDR367.scexe -u -k

This will start the iRMC firmware update. You need to confirm:

 

The install will take a while, be patient!

#...

When it's done, you will hear the fans spin.

BIOS

Note: If you already selected the machine type in the previous section, you will not need to select the machine type or operating system a second time.

Go to: Fujitsu Downloads

Select:

  • "Product Search"
  • Enter the server's model number in the 'Product Search' text field. For example; "RX2540M1". Note that you can enter a partial model number and it will ask you to select the proper model from a list.
  • Click the 'Selected operating system' drop down list and choose "Red Hat Enterprise Linux 6 (x86_64)". Note that you might need to click on the selection box twice to get the pop-up menu with the OS selection list to appear.
  • Click on the "BIOS" tab.
  • Click to expand "Flash - BIOS"
    • Click to expand "Flash BIOS for D3289-A1x" (the final ID may differ depending on your machine)
    • Verify the title is 'Flash BIOS RX2540 M1 (ASP for Linux)'. If so, click on "Direct Download" on the left.
Warning: Read and understand the notes and warnings!
  • When you are ready, click to check the "Terms of Use" checkbox and then click on "Download File".

Extract on the target and run:

# -u == update only, don't reflash or downgrade
# -k == keep extracted files.
sh RX200S8_D3302_BiosV4654_R160.scexe -u -k
***************** PRIMERGY Autonomous Support Package ***************
 
Description: Flash BIOS RX200 S8  
VersionMajor: V4.6.5.4
VersionMinor: R1.6.0
VersionBuild: 1.0.0
Software Class - Category: Flash - BIOS 
Software Class - Name: (SV) Flash Bios 
Vendor: Fujitsu Technology Solutions 
 
*********************************************************************
 
Continue processing this ASP?
Please answer: yes/y or no/n

ACK

*********************************************************************
 
 
                          CAUTION!
 
         Currently a new version is being installed.
 
       The installation process will take a long time.
 
                   .... please wait ....
 
          Don't interrupt this installation process!
 
 
 
*********************************************************************
*********************************************************************
 
 
                           FAILED!
 
         The new version was not correctly installed.
 
 
 
*********************************************************************

In this case, the BIOS was already up to date. Verify via 'dmidecode':

# dmidecode 2.12
SMBIOS 2.7 present.
88 structures occupying 4084 bytes.
Table at 0x7C8BD018.
 
Handle 0x0000, DMI type 0, 24 bytes
BIOS Information
        Vendor: FUJITSU // American Megatrends Inc.
        Version: V4.6.5.4 R1.6.0 for D3302-A1x
        Release Date: 01/30/2014
        Address: 0xF0000
        Runtime Size: 64 kB
        ROM Size: 13248 kB
        Characteristics:
                PCI is supported
                BIOS is upgradeable
                BIOS shadowing is allowed
                Boot from CD is supported
                Selectable boot is supported
                EDD is supported
                Print screen service is supported (int 5h)
                Serial services are supported (int 14h)
                Printer services are supported (int 17h)
                ACPI is supported
                USB legacy is supported
                BIOS boot specification is supported
                Targeted content distribution is supported
                UEFI is supported
        BIOS Revision: 1.6

And compare against the latest version on the download page under "BIOS Update - Admin Pack for D3302-A1x"

V4.6.5.4 - R1.6.0
Warning: If the update succeeds, immediately reboot after.
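
The version comparison can also be scripted. This is a minimal sketch that pulls the version string out of dmidecode output and checks it against the string copied from the download page. The download page writes the version with a " - " separator that dmidecode omits, so the sketch normalises that first. Both function names are hypothetical.

```shell
# Print the BIOS version from 'dmidecode -t bios' output read on stdin.
bios_version() {
    sed -n 's/^[[:space:]]*Version: //p' | head -n 1
}

# Succeed if the installed version string starts with the latest version.
# Arguments: <installed version> <latest version from the download page>
bios_is_current() (
    installed="$1"
    # "V4.6.5.4 - R1.6.0" on the download page is "V4.6.5.4 R1.6.0" in dmidecode.
    latest="$(printf '%s' "$2" | sed 's/ - / /')"
    case "$installed" in
        "$latest"*) exit 0 ;;
        *)          exit 1 ;;
    esac
)

# Example:
#   bios_is_current "$(dmidecode -t bios | bios_version)" "V4.6.5.4 - R1.6.0" && echo "BIOS is current"
```

`dmidecode -s bios-version` prints the same string directly, so it can replace the first function where that is more convenient.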

Done!

D3116C RAID Controller

Warning: Immediately reboot when update completes!

Go to: RX*** Downloads

Select the machine, RHEL 6 x86_64, choose SAS RAID - RAID Ctrl SAS 6G 1GB (D3116C) - Firmware for RAID Ctrl SAS 6G D3116C (ASP for Linux)

Extract on the target and run:

# -u == update only, don't reflash or downgrade
# -k == keep extracted files.
sh FTS_FirmwareforRAIDCtrlSAS6GD3116CASPforLinu_2390003332201252420100_1106185.SCEXE -u -k
***************** PRIMERGY Autonomous Support Package ***************
 
Description: Firmware for RAID Ctrl SAS 6G D3116C  
VersionMajor: 23.9.0-0033
VersionMinor: 3.220.125-2420
VersionBuild: 1.0.0
Software Class - Category: Flash - Firmware 
Software Class - Name: (SV) Flash - Firmware 
Vendor: LSI Logic 
 
*********************************************************************
 
Continue processing this ASP?
Please answer: yes/y or no/n

ACK

*********************************************************************
 
 
                          CAUTION!
 
         Currently a new version is being installed.
 
       The installation process will take a long time.
 
                   .... please wait ....
 
          Don't interrupt this installation process!
 
 
 
*********************************************************************
*********************************************************************
 
 
                       Attention! 
 
                     Successfully flashed.
                      Reboot your system.
 
 
 
 
*********************************************************************

REBOOT

reboot

Once rebooted, reset the controller to factory defaults. This will _not_ affect the Logical Disk! If you followed the AN!Cluster Tutorial 2, then you will not need to change any settings. It does require another reboot though.

Record the current settings (we'll diff after the reset):

MegaCli64 AdpAllInfo aAll > Adapter.pre-reset
MegaCli64 LDInfo Lall aAll > LD.pre-reset

Reset:

MegaCli64 AdpFacDefSet a0
Adapter 0: Factory Default Set Successfully. 
Please reboot the system for the changes to take effect
 
Exit Code: 0x00

Now reboot again.

reboot

Now dump the new settings and check for changes:

MegaCli64 AdpAllInfo aAll > Adapter.post-reset
MegaCli64 LDInfo Lall aAll > LD.post-reset

Check for differences:

diff -u Adapter.pre-reset Adapter.post-reset
--- Adapter.pre-reset	2014-04-14 15:48:35.011122984 -0400
+++ Adapter.post-reset	2014-04-14 15:52:27.445603604 -0400
@@ -72,11 +72,11 @@
 Temperature sensor for ROC    : Present
 Temperature sensor for controller    : Absent
 
-ROC temperature : 74  degree Celsius
+ROC temperature : 75  degree Celsius
 
                 Settings
                 ================
-Current Time                     : 19:48:35 4/14, 2014
+Current Time                     : 19:52:26 4/14, 2014
 Predictive Fail Poll Interval    : 300sec
 Interrupt Throttle Active Count  : 16
 Interrupt Throttle Completion    : 50us
diff -u LD.pre-reset LD.post-reset
# no output
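
The only differences in the adapter dump above are the temperature and clock readings, which change on every run. This is a minimal sketch of a comparison that filters those two lines out first, so that any remaining output is a real settings change. The function name is hypothetical, and the two filtered line prefixes are taken from the diff shown above.

```shell
# Diff two 'AdpAllInfo' dumps, ignoring lines that change on every run.
# Arguments: <pre-reset file> <post-reset file>
# Exit status is that of diff: 0 when nothing else differs.
adapter_diff() (
    pre="$1"; post="$2"
    tmp_pre="$(mktemp)"; tmp_post="$(mktemp)"
    # Temperature and clock readings always differ, so drop them.
    grep -v -e '^ROC temperature' -e '^Current Time' "$pre"  > "$tmp_pre"  || true
    grep -v -e '^ROC temperature' -e '^Current Time' "$post" > "$tmp_post" || true
    status=0
    diff -u "$tmp_pre" "$tmp_post" || status=$?
    rm -f "$tmp_pre" "$tmp_post"
    exit "$status"
)

# Example:
#   adapter_diff Adapter.pre-reset Adapter.post-reset && echo "No setting changes"
```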

Done!

Creating a Backup Set

ToDo: Explain this...

mkdir /root/base
cd /root/base
mkdir /root/base/root
mkdir -p /root/base/etc/sysconfig/network-scripts
mkdir -p /root/base/etc/udev/rules.d
mkdir -p /root/base/etc/init.d
mkdir -p /root/base/var/spool/cron
 
# Root user
rsync -av /root/.bashrc   /root/base/root/
rsync -av /root/.ssh      /root/base/root/
rsync -av /root/an-cm*    /root/base/root/
rsync -av /root/archive_* /root/base/root/
 
# Directories
rsync -av /etc/ssh     /root/base/etc/
rsync -av /etc/apcupsd /root/base/etc/
rsync -av /etc/cluster /root/base/etc/
rsync -av /etc/drbd.*  /root/base/etc/
rsync -av /etc/an      /root/base/etc/
rsync -av /etc/yum     /root/base/etc/
rsync -av /etc/pki     /root/base/etc/
rsync -av --exclude 'archive' --exclude 'cache' --exclude 'backup' /etc/lvm /root/base/etc/
 
# Specific files.
rsync -av /etc/sysconfig/network-scripts/ifcfg-{eth*,bond*,vbr*} /root/base/etc/sysconfig/network-scripts/
rsync -av /etc/udev/rules.d/70-persistent-net.rules              /root/base/etc/udev/rules.d/
rsync -av /etc/sysconfig/network /root/base/etc/sysconfig/
rsync -av /etc/hosts             /root/base/etc/
rsync -av /etc/ntp.conf          /root/base/etc/
rsync -av /etc/init.d/apcupsd    /root/base/etc/init.d/
rsync -av /var/spool/cron/root   /root/base/var/spool/cron/
 
# Save recreating user accounts.
rsync -av /etc/passwd            /root/base/etc/
rsync -av /etc/group             /root/base/etc/
rsync -av /etc/shadow            /root/base/etc/
rsync -av /etc/gshadow           /root/base/etc/
 
# If you have the cluster built and want to back up its configs.
# NOTE: -p avoids an error if the rsync calls above already created these.
mkdir -p /root/base/etc/cluster
mkdir -p /root/base/etc/lvm
rsync -av /etc/cluster/cluster.conf /root/base/etc/cluster/
 
# NOTE: DRBD won't work until you've manually created the partitions.
rsync -av /etc/drbd.d /root/base/etc/
 
# If you're running RHEL and want to back up your registration info;
if [ -e "/etc/sysconfig/rhn" ]
then
	rsync -av /etc/sysconfig/rhn /root/base/etc/sysconfig/
fi
 
# Back up the logical and extended partition structure
# Write the shebang once, before the loop, so that later disks don't truncate
# the entries already written for earlier disks.
echo '#!/bin/bash' > /root/base/root/partition_drives.sh
for d in $(fdisk -l | grep 'Disk /dev' | grep -v mapper | sed 's/Disk \(.*\):.*/\1/')
do
        for i in $(parted -m -s -a opt $d "print free" | grep '^[4-9]')
        do
                # Partition 4 is the extended partition; higher numbers are logical.
                if echo "$i" | grep -q '^4:'
                then
                        echo "$d:$i" | perl -pe 's/^(.*?):(\d+):(.*?):(.*?):.*/parted -s -a opt $1 "mkpart extended $3 $4"/'
                else
                        echo "$d:$i" | perl -pe 's/^(.*?):(\d+):(.*?):(.*?):.*/parted -s -a opt $1 "mkpart logical $3 $4"/'
                fi
        done
done >> /root/base/root/partition_drives.sh
chmod 755 /root/base/root/partition_drives.sh
 
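# Pack it up
# NOTE: Change the name to suit your node.
# NOTE: -C stores the paths relative to /root/base (etc/, root/, var/) so that
#       extracting with '-C /' puts the files back in their original places.
cd /root/
tar -cvf /root/base_$(hostname -s).tar -C /root/base etc root var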
ls -lah /root/base_*

Now copy it to your PXE server. In my case, that is 10.255.255.250. I am backing up RHEL6 nodes, so I store my backups in /var/www/html/rhel6/x86_64/files/.

rsync -av /root/base_$(hostname -s).tar root@10.255.255.250:/var/www/html/rhel6/x86_64/files/
root@10.255.255.250's password: 
sending incremental file list
base_an-c05n02.tar
 
sent 4045378 bytes  received 31 bytes  898979.78 bytes/sec
total size is 4044800  speedup is 1.00

Now that it is on the server, I can use the following %post kickstart script entry in dedicated, per-node kickstart scripts.

%post
# Download the backup files and load them.
cd ~
wget http://10.255.255.250/rhel6/x86_64/files/base_an-c05n01.tar
cp base_an-c05n01.tar /mnt/sysimage/root/
/etc/init.d/network stop
# The archive is an uncompressed tar, so no -z here.
tar -xvf base_an-c05n01.tar -C /
rm -f /etc/udev/rules.d/70-persistent-net.rules
start_udev
/etc/init.d/network start
/mnt/systemroot/root/partition_drives.sh

When the install finishes, this will load all of the files we backed up. So when the node reboots, all of its old RHN registration, network configs and so on will be restored. This should dramatically reduce recovery time!

Specific Fujitsu Model Notes

RX200 S7

BIOS Changes for use in Anvil!s

To enter the BIOS, press F2 during POST.

  • Advanced
    • SATA Configuration
      • SATA Mode -> Disabled (if no optical drive)
      • - OR -
      • SATA Mode -> AHCI Mode (if optical drive)
    • Onboard Device Configuration
      • LAN 2 Oprom -> PXE
      • Onboard SAS/SATA (SCU) -> Disabled (if no optical drive)
  • Server Mgmt
    • Asset Tag -> short host name of the node, lower case
    • Temperature Monitoring -> Enabled
  • Boot
    • Bootup NumLock State -> On

Save and Exit

LSI RAID Controller Setup

Warning: These instructions assume NO existing data. This is a destructive process!

After POST, press <ctrl> + H to enter the controller's WebBIOS

  • Choose the controller (usually only one available), click Start
  • Choose Configuration Wizard
    • Choose New Configuration -> click Next
    • Confirm the clear by clicking Yes
    • Choose Manual Configuration -> click Next
      • Click on the first drive, usually Slot: 0,..., press and hold <ctrl> button and then click to select the rest of the drives. Click Add to Array.
      • Click on Accept DG to create the drive group then click on Next
      • Click on Add to SPAN, then click on Next
      • Configure the array;
        • For 1 to 8 drives;
          • RAID Level -> RAID 5
          • Write Policy -> Write Back with BBU
          • Look at the R5:xxxx size on the right and enter that size in the Select Size section. Be sure to match the GB or TB suffix.
        • For 9 or more drives;
          • RAID Level -> RAID 6
          • Write Policy -> Write Back with BBU
          • Look at the R6:xxxx size on the right and enter that size in the Select Size section. Be sure to match the GB or TB suffix.
        • Click on Accept then click on Yes to accept the warning.
        • The virtual disk will now be shown on the right. Click Next to proceed.
      • Click Accept and then click Yes to save the configuration. Click Yes to acknowledge the warning and initialize the drive.
    • Click on Set Boot Drive and then click Go.
    • Click on Home.
  • Click on Exit and then on Yes.

Reboot and you are done.
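
The RAID level rule used above (RAID 5 for up to 8 drives, RAID 6 for 9 or more) can be sketched as a small helper that also estimates capacity. RAID 5 gives up one drive's worth of space to parity and RAID 6 gives up two, and RAID 5 needs at least three drives. The function name is hypothetical and the result is only a rough figure in whole GB; the R5/R6 size shown in WebBIOS remains the value to enter.

```shell
# Suggest a RAID level and estimate capacity.
# Arguments: <number of drives> <size of one drive in whole GB>
raid_plan() (
    drives="$1"; size_gb="$2"
    if [ "$drives" -lt 3 ]; then
        echo "RAID 5 needs at least 3 drives" >&2
        exit 1
    elif [ "$drives" -le 8 ]; then
        level=5; parity=1
    else
        level=6; parity=2
    fi
    echo "RAID ${level}, about $(( (drives - parity) * size_gb )) GB before overhead"
)

# Example:
#   raid_plan 6 278
#   raid_plan 10 278
```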


 
 

 

Any questions, feedback, advice, complaints or meanderings are welcome.
Us: Alteeve's Niche! Support: Mailing List IRC: #clusterlabs on Freenode   © Alteeve's Niche! Inc. 1997-2019
legal stuff: All info is provided "As-Is". Do not use anything here unless you are willing and able to take responsibility for your own actions.