
Monitoring LSI SAS RAID Controllers on Linux

See also LSI MegaRAID

Cheat Sheet: http://tools.rapidsoft.de/perc/

LSI MegaRAID SAS User's Guide: http://www.lsi.com/downloads/Public/MegaRAID%20Common%20Files/MegaRAID_SAS_SW_UG_51530-00_RevI.pdf

This configuration uses cron and the MegaCLI RAID configuration utility to monitor a Linux server with a newer LSI SAS RAID controller.

Warning messages are sent to the system administrator using the sendmail command.

The script checks whether the controller reports any logical drive in the 'Degraded' state. Running it from cron every 10 minutes is a reasonable interval.

XenServer

XenServer has no 'mail' command and runs no MTA daemon, but it does come with SSMTP installed. SSMTP is handy here because it needs no extra process running on the box: it hands each message straight to the configured mail hub, which avoids many of the problems you can run into with mail relaying.

Configure SSMTP

vim /etc/ssmtp/ssmtp.conf

root=admin@yourdomain.com
mailhub=mailserver.yourdomain.com:465
RewriteDomain=yourdomain.com
FromLineOverride=YES
UseTLS=YES
AuthUser=smtpuser
AuthPass=smtppassword

Test SSMTP

Send a test message directly through ssmtp (no 'mail' command is needed; the message will arrive without a subject line):

echo test | ssmtp -v admin@yourdomain.com
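
A test message is easier to spot in an inbox if it carries a subject. The following is a minimal sketch, assuming ssmtp honors header lines supplied at the top of its standard input; the function name build_test_message and the subject text are illustrative only and are not part of SSMTP.

```shell
#!/bin/sh
# Build a minimal mail message: header lines, one blank line, then the body.
# $1 = recipient, $2 = subject, $3 = body text (all placeholders).
build_test_message() {
  printf 'To: %s\nSubject: %s\n\n%s\n' "$1" "$2" "$3"
}

# Usage (assumes ssmtp is configured as shown above):
# build_test_message admin@yourdomain.com "SSMTP test from $(hostname)" "test" \
#   | ssmtp -v admin@yourdomain.com
```

The recipient is still given on the ssmtp command line, exactly as in the one-line test, so only the message text changes.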

Install the MegaCLI Utility

Log in as 'root', then:

mkdir ~/MegaCLI
cd ~/MegaCLI
wget http://www.lsi.com/downloads/Public/MegaRAID%20Common%20Files/8.07.06_MegaCLI.zip
unzip 8.07.06_MegaCLI.zip

rpm -Uvh Linux/MegaCli-8.07.06-1.noarch.rpm
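
The scripts on this page refer to the installed binary under two names, MegaCli and MegaCli64, both in /opt/MegaRAID/MegaCli. Which one is present may depend on the package and architecture, so it can be worth checking before wiring up cron. A small sketch, with the hypothetical helper name first_executable:

```shell
#!/bin/sh
# Print the first argument that is an executable file; return 1 if none is.
first_executable() {
  for candidate in "$@"; do
    if [ -x "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Usage: pick whichever binary the RPM installed.
# MEGACLI="$(first_executable /opt/MegaRAID/MegaCli/MegaCli64 \
#                             /opt/MegaRAID/MegaCli/MegaCli)" \
#   || echo "MegaCli not found under /opt/MegaRAID/MegaCli" >&2
```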

CRON

I added this to my /etc/crontab to run the script every 10 minutes:

# check for RAID array problems every 10 minutes and send out a message
*/10 * * * * root /root/raid-status-megacli.sh 2>&1

raid-status-megacli.sh

Place this script in the /root directory.

You will want to edit the variables to fit your needs.

#!/bin/bash
#
# raid-status-megacli.sh - Gene Cooper <gcooper@sonoracomm.com>
#
# For newer LSI RAID controllers
#
# Uses MegaCLI utility and sendmail command and should be run from cron

# Subject of e-mail warning
# (double quotes so that $HOSTNAME is expanded)
SUBJECT="RAID Array Failure on $HOSTNAME"

# Recipient of e-mail warnings
ADMIN='you@yourdomain.com'

# RAID utility check command
COMMAND='/opt/MegaRAID/MegaCli/MegaCli -ldinfo -Lall -Aall -NoLog'

# RAID status check results
STATUS='/root/raidinfo.txt'

# dump logical drive info to a text file
nice -n 19 ${COMMAND} > "$STATUS"

# exit quietly unless at least one logical drive reports 'Degraded'
if ! grep -q "^State.*: Degraded$" "$STATUS"; then
  exit 0
fi

# send mail
/usr/sbin/sendmail -t -i <<EOF
Date: $(date)
To: $ADMIN
Subject: $SUBJECT
From: $ADMIN

$(cat $STATUS)

EOF
exit
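
The check above matches only a state of exactly 'Degraded'. MegaCli is understood to report other unhealthy states as well (for example 'Partially Degraded' or 'Offline'), which that pattern would not catch. The sketch below takes the opposite approach and flags any logical drive whose State line is not 'Optimal'. It assumes the "State : value" line layout that the grep pattern above already relies on; the function name non_optimal_states is hypothetical.

```shell
#!/bin/sh
# Read MegaCli -ldinfo style text on stdin and print the state of every
# logical drive that is not "Optimal". Prints nothing if all are healthy.
non_optimal_states() {
  awk -F':' '
    /^State[ \t]*:/ {
      s = $2
      gsub(/^[ \t]+|[ \t\r]+$/, "", s)   # trim surrounding whitespace
      if (s != "Optimal") print s
    }'
}

# Usage inside the monitoring script (STATUS is the file written above):
# if [ -z "$(non_optimal_states < "$STATUS")" ]; then
#   exit 0
# fi
```

Alerting on anything other than 'Optimal' errs on the side of sending mail, which may be preferable for a monitor that runs unattended.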

Here is someone else's script for ideas:

#!/bin/sh
CONT="a0"
STATUS=0
MEGACLI=/opt/MegaRAID/MegaCli/MegaCli64

# 'echo -n' is not portable under /bin/sh, so print the line in one go
echo "Checking RAID status on $(hostname)"
for a in $CONT
 do

   # controller model
   NAME=`$MEGACLI -AdpAllInfo -$a |grep "Product Name" | cut -d: -f2`
   echo "Controller $a: $NAME"
   # physical disks currently online
   noonline=`$MEGACLI -PDList -$a | grep Online | wc -l`
   echo "No of Physical disks online : $noonline"
   # degraded virtual drives (query the controller being looped over, not always a0)
   DEGRADED=`$MEGACLI -AdpAllInfo -$a |grep "Degrade"`
   echo $DEGRADED
   NUM_DEGRADED=`echo $DEGRADED |cut -d" " -f3`
   [ "$NUM_DEGRADED" -ne 0 ] && STATUS=1
   # failed physical disks
   FAILED=`$MEGACLI -AdpAllInfo -$a |grep "Failed Disks"`
   echo $FAILED
   NUM_FAILED=`echo $FAILED |cut -d" " -f4`
   [ "$NUM_FAILED" -ne 0 ] && STATUS=1

 done

exit $STATUS
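
The script above pulls its counts out with grep and cut on space-separated fields, which depends on exact spacing and on no other line containing the same text. As one possible alternative, the sketch below matches the label exactly and takes whatever digits follow the colon. It assumes the "Label : number" layout that the cut commands above imply; the helper name adapter_count is hypothetical.

```shell
#!/bin/sh
# Read MegaCli -AdpAllInfo style text on stdin and print the number that
# follows the colon on the first line whose label equals $1 exactly.
adapter_count() {
  awk -F':' -v label="$1" '
    {
      name = $1
      gsub(/^[ \t]+|[ \t]+$/, "", name)  # trim the label
    }
    name == label {
      v = $2
      gsub(/[^0-9]/, "", v)              # keep digits only
      print v
      exit
    }'
}

# Usage (MEGACLI and the controller id are set as in the script above):
# NUM_FAILED="$($MEGACLI -AdpAllInfo -a0 | adapter_count "Failed Disks")"
# [ "${NUM_FAILED:-0}" -ne 0 ] && STATUS=1
```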
computing/storage/lsi/megaraid_monitor.1362245790.txt.gz · Last modified: 2013/03/02 10:36 by gcooper