
Monitoring LSI SAS RAID Controllers on Linux

See also LSI MegaRAID

Cheat Sheet: http://tools.rapidsoft.de/perc/

LSI MegaRAID SAS User's Guide: http://www.lsi.com/downloads/Public/MegaRAID%20Common%20Files/MegaRAID_SAS_SW_UG_51530-00_RevI.pdf

This configuration uses cron and the MegaCLI RAID configuration utility to monitor a Linux server with a newer LSI SAS RAID controller.

Warning messages are sent to the system administrator using the sendmail command.

The script sends an alert when the controller reports a logical drive in the 'Degraded' state. Running it from cron every 10 minutes is a reasonable interval.

XenServer

XenServer has no 'mail' command and no MTA daemon running, but it does come with SSMTP installed. SSMTP is handy because it needs no extra process running on the box, and handing mail straight to an existing mail server avoids many of the problems you can run into with mail relaying.

Configure SSMTP

vi /etc/ssmtp/ssmtp.conf

root=admin@yourdomain.com
mailhub=mailserver.yourdomain.com:465
RewriteDomain=yourdomain.com
FromLineOverride=YES
UseTLS=YES
AuthUser=smtpuser
AuthPass=smtppassword

Test SSMTP

Test with no 'mail' command and no subject line:

echo test | ssmtp -v admin@yourdomain.com
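
To test with a subject line as well, the headers can be written on stdin ahead of the body, separated from it by a blank line. The following is a minimal sketch using the same placeholder address as above; `build_test_message` is a hypothetical helper name for illustration, not something SSMTP provides.

```shell
# Hypothetical helper: print a minimal mail message on stdout
# (To and Subject headers, a blank line, then the body).
build_test_message() {
  printf 'To: %s\nSubject: %s\n\n%s\n' "$1" "$2" "$3"
}

# Usage (commented out so that sourcing this sketch does not send mail):
# build_test_message admin@yourdomain.com "SSMTP test" "test" | ssmtp -v admin@yourdomain.com
```

The RAID script further down builds its message the same way, with the headers first and the body after an empty line.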

Install the MegaCLI Utility

See also LSI MegaRAID

cron

Add this to /etc/crontab to run the script every 10 minutes:

# check for RAID array problems every 10 minutes and send out a message
*/10 * * * * root /root/raid-status-megacli.sh 2>&1

raid-status-megacli.sh

Place this script in the /root directory and make it executable:

chmod +x /root/raid-status-megacli.sh

You will want to edit the variables to fit your needs.

#!/bin/bash
#
# raid-status-megacli.sh - Gene Cooper <gcooper at sonoracomm.com>
#
# For newer LSI RAID controllers
#
# Uses MegaCLI utility and sendmail command and should be run from cron

# Sender of e-mail warnings
# For WHMCS Ticket System, use the contact address here for automatic account assignment
FROM='"RAID Status" <user@domain.com>'

# Subject of e-mail warning
SUBJECT="RAID Array Failure on $(hostname)"

# Recipient of e-mail warnings
ADMIN='"Your Name" <you@yourdomain.com>'

# RAID utility check command
COMMAND='/opt/MegaRAID/MegaCli/MegaCli -ldinfo -Lall -Aall -NoLog'

# RAID status check results
STATUS='/root/raidinfo.txt'

# dump hardware failure info to a text file
nice -n 19 ${COMMAND} > "$STATUS"

# Comment this section and run manually to test mail sending
# Exit quietly unless a logical drive reports the Degraded state
if ! grep -q "^State.*: Degraded$" "$STATUS"; then
  exit 0
fi

# Send mail
/usr/sbin/sendmail -t -i <<EOF
Date: $(date)
To:  $ADMIN
Subject: $SUBJECT
From: $FROM

$(cat $STATUS)

EOF
exit
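
The check above matches only the exact state 'Degraded'. One possible variation is to alert on any state other than 'Optimal'. The sketch below assumes each logical drive is reported on a line of the form `State : Optimal`. The sample text in it is illustrative, not captured from a real controller, and `raid_needs_attention` is a hypothetical function name.

```shell
# Hypothetical helper: exit 0 if any "State" line read from stdin reports
# something other than Optimal, exit 1 otherwise.
raid_needs_attention() {
  awk -F': *' '
    /^State[ \t]*:/ {
      sub(/[ \t\r]+$/, "", $2)        # drop trailing blanks or a CR
      if ($2 != "Optimal") bad = 1
    }
    END { exit (bad ? 0 : 1) }'
}

# Illustrative input only; real MegaCli output contains many more lines.
sample='Virtual Drive: 0 (Target Id: 0)
State               : Degraded'

if printf '%s\n' "$sample" | raid_needs_attention; then
  echo "RAID needs attention"
fi
```

In the script above, this would replace the grep test, with the status file fed to the function on stdin.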

Here is someone else's script for ideas:

#!/bin/sh
CONT="a0"
STATUS=0
MEGACLI=/opt/MegaRAID/MegaCli/MegaCli64

echo "Checking RAID status on $(hostname)"
for a in $CONT
 do

   NAME=`$MEGACLI -AdpAllInfo -$a |grep "Product Name" | cut -d: -f2`
   echo "Controller $a: $NAME"
   noonline=`$MEGACLI -PDList -$a | grep Online | wc -l`
   echo "No of Physical disks online : $noonline"
   # query the controller being iterated rather than a hardcoded a0
   DEGRADED=`$MEGACLI -AdpAllInfo -$a |grep "Degrade"`
   echo $DEGRADED
   NUM_DEGRADED=`echo $DEGRADED |cut -d" " -f3`
   [ "$NUM_DEGRADED" -ne 0 ] && STATUS=1
   FAILED=`$MEGACLI -AdpAllInfo -$a |grep "Failed Disks"`
   echo $FAILED
   NUM_FAILED=`echo $FAILED |cut -d" " -f4`
   [ "$NUM_FAILED" -ne 0 ] && STATUS=1

 done

exit $STATUS
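
The `cut -d" " -f3` and `-f4` calls in this script work only because the unquoted `echo` collapses runs of spaces first. A less position-sensitive approach, sketched here as an assumption rather than as part of either script, is to match the label before the colon and print the value after it. `adp_count` is a hypothetical helper name, and the sample lines in the usage comment are illustrative.

```shell
# Hypothetical helper: read "Label : value" lines on stdin and print the
# value from the first line whose label exactly equals $1.
adp_count() {
  awk -F: -v label="$1" '
    { key = $1; gsub(/^[ \t]+|[ \t\r]+$/, "", key) }   # trim the label
    key == label {
      val = $2
      gsub(/[ \t\r]/, "", val)                         # strip blanks and CR
      print val
      exit
    }'
}

# Usage with illustrative input:
# printf '  Degraded        : 0\n  Failed Disks    : 2\n' | adp_count "Failed Disks"
```

Because the label must match exactly, a line such as `Critical Disks` is not mistaken for `Failed Disks`.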