See also LSI MegaRAID
Cheat Sheet: http://tools.rapidsoft.de/perc/
LSI MegaRAID SAS User's Guide: http://www.lsi.com/downloads/Public/MegaRAID%20Common%20Files/MegaRAID_SAS_SW_UG_51530-00_RevI.pdf
This configuration uses cron and the MegaCLI RAID configuration utility to monitor a Linux server with a newer LSI SAS RAID controller. Warning messages are sent to the system administrator using the sendmail command.
The script checks whether the controller reports a logical drive in the 'Degraded' state. A reasonable schedule is to run it from cron every 10 minutes.
XenServer has neither a 'mail' command nor a running MTA daemon, but it does come with SSMTP installed. SSMTP is cool because it needs no extra process running on your box and it avoids a lot of the problems you can run into with mail relaying.
vi /etc/ssmtp/ssmtp.conf

root=admin@yourdomain.com
mailhub=mailserver.yourdomain.com:465
RewriteDomain=yourdomain.com
FromLineOverride=YES
UseTLS=YES
AuthUser=smtpuser
AuthPass=smtppassword
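Because AuthPass sits in this file as plain text, it may be worth restricting who can read it. The following is a minimal sketch, assuming only root sends mail through ssmtp on this box (true for the cron job described here, but an assumption in general); `lock_down_conf` is a hypothetical helper name, not something ssmtp provides.

```shell
# lock_down_conf is a hypothetical helper; the default path is the file edited above.
lock_down_conf() {
    local conf=${1:-/etc/ssmtp/ssmtp.conf}
    # Owner read/write only, since AuthPass is stored in plain text
    chmod 600 "$conf"
}
```

If other accounts also need to send mail through ssmtp, a group-readable mode would be needed instead.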
Test by piping a message straight to ssmtp (there is no 'mail' command, so the message has no subject line):
echo test | ssmtp -v admin@yourdomain.com
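ssmtp reads the whole message, headers included, from standard input, so a subject line can be supplied by hand. A small sketch follows; `make_test_message` is a hypothetical helper name used only for illustration, and the addresses are the same placeholders as above.

```shell
# Build a minimal message: To: and Subject: headers, a blank line, then the body.
# make_test_message is a hypothetical helper name used only for this sketch.
make_test_message() {
    printf 'To: %s\nSubject: %s\n\n%s\n' "$1" "$2" "$3"
}

# Example (not run here):
#   make_test_message admin@yourdomain.com "ssmtp test" "Test from $(hostname)" \
#       | ssmtp -v admin@yourdomain.com
```

The blank line between the headers and the body matters: without it the body text would be parsed as part of the headers.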
Add the following to /etc/crontab to run the script every 10 minutes:
# check for RAID array problems every 10 minutes and send out a message
*/10 * * * * root /root/raid-status-megacli.sh 2>&1
Place this script in the /root directory and make it executable:
chmod +x /root/raid-status-megacli.sh
You will want to edit the variables to fit your needs.
#!/bin/bash
#
# raid-status-megacli.sh - Gene Cooper <gcooper at sonoracomm.com>
#
# For newer LSI RAID controllers
#
# Uses MegaCLI utility and sendmail command and should be run from cron

# Sender of e-mail warnings
# For WHMCS Ticket System, use the contact address here for automatic account assignment
FROM='"RAID Status" <user@domain.com>'

# Subject of e-mail warning
SUBJECT="RAID Array Failure on `hostname`"

# Recipient of e-mail warnings
ADMIN='"Your Name" <you@yourdomain.com>'

# RAID utility check command
COMMAND='/opt/MegaRAID/MegaCli/MegaCli -ldinfo -Lall -Aall -NoLog'

# RAID status check results
STATUS='/root/raidinfo.txt'

# dump hardware failure info to a text file
nice -n 19 ${COMMAND} > $STATUS

# Comment this section and run manually to test mail sending
# Check for error
if [ -z "$(grep "^State.*: Degraded$" $STATUS | awk '{print $NF}')" ]; then
    exit 0
fi

# Send mail
/usr/sbin/sendmail -t -i <<EOF
Date: $(date)
To: $ADMIN
Subject: $SUBJECT
From: $FROM

$(cat $STATUS)
EOF

exit
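The check above fires only on a State line that reads exactly 'Degraded'. If you would rather be warned about any state other than 'Optimal' (for example 'Offline'), something along these lines could replace the grep test. This is a sketch under assumptions: `raid_needs_attention` is a hypothetical helper, and it assumes the -ldinfo output has one "State : value" line per logical drive and no other lines starting with "State", which you should confirm against your own controller's output.

```shell
# raid_needs_attention is a hypothetical helper, not part of the script above.
# It returns 0 (true) when any "State" line in the given file is not "Optimal".
raid_needs_attention() {
    awk -F: '
        /^State/ {
            state = $2
            gsub(/^[ \t]+|[ \t\r]+$/, "", state)   # trim padding and any stray CR
            if (state != "Optimal") bad = 1
        }
        END { exit !bad }
    ' "$1"
}
```

In the script this would be used as `raid_needs_attention "$STATUS" || exit 0` in place of the `if ... fi` block.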
Here is someone else's script for ideas:
#!/bin/sh
CONT="a0"
STATUS=0
MEGACLI=/opt/MegaRAID/MegaCli/MegaCli64

echo -n "Checking RAID status on "
hostname

for a in $CONT
do
    NAME=`$MEGACLI -AdpAllInfo -$a |grep "Product Name" | cut -d: -f2`
    echo "Controller $a: $NAME"
    noonline=`$MEGACLI PDList -$a | grep Online | wc -l`
    echo "No of Physical disks online : $noonline"

    # query the controller being iterated over rather than a hard-coded a0
    DEGRADED=`$MEGACLI -AdpAllInfo -$a |grep "Degrade"`
    echo $DEGRADED
    NUM_DEGRADED=`echo $DEGRADED |cut -d" " -f3`
    [ "$NUM_DEGRADED" -ne 0 ] && STATUS=1

    FAILED=`$MEGACLI -AdpAllInfo -$a |grep "Failed Disks"`
    echo $FAILED
    NUM_FAILED=`echo $FAILED |cut -d" " -f4`
    [ "$NUM_FAILED" -ne 0 ] && STATUS=1
done

exit $STATUS
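This second script only prints its findings and signals trouble through its exit status, so it sends no mail by itself. One way to hook it up is a small wrapper that mails the output only when the exit status is non-zero. The sketch below is an illustration, not part of either script: `report_on_failure`, `CHECK_SCRIPT`, `ADMIN` and `MAILER` are hypothetical names, and the default script path is a placeholder.

```shell
#!/bin/bash
# Hypothetical wrapper: mail a status script's output only when it exits non-zero.
# CHECK_SCRIPT, ADMIN and MAILER are placeholders chosen for this sketch.
CHECK_SCRIPT=${CHECK_SCRIPT:-/root/raid-check.sh}
ADMIN=${ADMIN:-admin@yourdomain.com}
MAILER=${MAILER:-/usr/sbin/sendmail -t -i}

report_on_failure() {
    local output rc
    output=$("$CHECK_SCRIPT" 2>&1)
    rc=$?
    # Exit status 0 means the check found nothing wrong, so stay quiet
    [ "$rc" -eq 0 ] && return 0
    # $MAILER is left unquoted on purpose so its options are split into words
    printf 'To: %s\nSubject: RAID check failed on %s (exit %s)\n\n%s\n' \
        "$ADMIN" "$(hostname)" "$rc" "$output" | $MAILER
}
```

Add a call to report_on_failure as the last line of the wrapper and point the cron entry at the wrapper instead of the check script. Setting MAILER=cat lets you preview the message on the terminal without sending anything.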