====== Adaptec RAID Monitoring ======

  * Storage Manager download (RAID 3405): http://www.adaptec.com/en-US/downloads/storage_manager/sm?productId=SAS-3405&dn=Adaptec+RAID+3405
  * CLI user's guide: http://download.adaptec.com/pdfs/user_guides/CLI_v6_10_Users_Guide.pdf

===== Rules =====

  * Never pull a SATA drive out of a running array
    * Shut the server down first
    * Hot-swap SATA is a myth
  * Always back up a degraded array before screwing with it
  * Be afraid!
  * Never buy Adaptec RAID controllers
    * Especially SATA/SAS
  * Never trust the output of ''arcconf''
    * Carefully reboot the server and verify using the BIOS

===== Adaptec CLI Tools =====

These commands download the StorMan package and extract the ''arcconf'' tool from it:

  wget ftp://ftp.supermicro.com/CDR-LPZCR1_1.11_for_AOC-LPZCR1_AOC-SOZCR1/ASM/ASM4859/linux/StorMan-4.11.i386.rpm
  rpm2cpio StorMan-4.11.i386.rpm | cpio -ivmur '*arcconf'

Install the extracted binary as ''/usr/sbin/arcconf'', make it executable, and query controller 1 to confirm that it works:

  chmod +x /usr/sbin/arcconf
  arcconf getstatus 1
  arcconf getconfig 1

===== Monitoring Script =====

Save this as ''/root/raidmon.sh'' and make it executable:

  #!/bin/bash
  # raidmon.sh
  #
  # The intent of this script is to be run frequently from cron to alert a
  # sysadmin to RAID problems with Adaptec controllers
  #
  # Who to send alerts to
  ALERT="you@yourdomain.com"
  #
  # Where is the arcconf utility
  ARCCONF="/usr/sbin/arcconf"
  #
  # note: this only tests the first array even if there is more than one
  RAIDSTATUS=$("$ARCCONF" getconfig 1 | awk '/Status of logical device/ {print $6; exit}')
  #
  # Mail the controller status whenever the array is anything but Optimal
  if [ "$RAIDSTATUS" != "Optimal" ]
  then
    "$ARCCONF" getstatus 1 | mail -s "RAID Problem Report for ${HOSTNAME}" "$ALERT"
  fi
  #
  exit 0

===== Monitor with cron =====

Add to ''/etc/crontab'':

  # check for RAID array problems every 10 minutes and send out a message
  */10 * * * * root /root/raidmon.sh 2>&1

==== SME Server ====

On SME Server, ''/etc/crontab'' is generated from templates, so put the entry in a custom template fragment instead of editing the file directly:

  mkdir -p /etc/e-smith/templates-custom/etc/crontab
  vi /etc/e-smith/templates-custom/etc/crontab/40raidmon

Add the cron entry shown above to that file, then regenerate ''/etc/crontab'' and check the result:

  expand-template /etc/crontab
  cat /etc/crontab
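
===== Checking Every Logical Device =====

The monitoring script only tests the first array. The following is a minimal sketch of one way to cover all of them, not a tested replacement. It assumes that ''arcconf getconfig 1'' prints a "Logical device number N" line followed by a "Status of logical device : <state>" line for each array; verify this against the output on your own controller. The function name ''failed_logical_devices'' is made up for this example.

```shell
#!/bin/bash
# Sketch: list every logical device whose status is not "Optimal".
# ASSUMPTION: each array appears in the arcconf output as
#   Logical device number N
#   ...
#   Status of logical device : <state>

# Reads arcconf getconfig output on stdin and prints "<number> <state>"
# for each logical device that is not Optimal. Prints nothing if all are fine.
failed_logical_devices() {
    awk '
        # Remember the number of the logical device currently being described
        tolower($0) ~ /logical device number/ { ld = $NF }
        /Status of logical device/ {
            state = $0
            sub(/^[^:]*:[ \t]*/, "", state)   # keep only the text after the colon
            sub(/[ \t\r]+$/, "", state)       # drop trailing whitespace
            if (state != "Optimal") print ld, state
        }
    '
}
```

With the same ''ARCCONF'' and ''ALERT'' variables as in ''raidmon.sh'', it could be used by capturing the output of ''"$ARCCONF" getconfig 1 | failed_logical_devices'' in a variable and mailing it only when that variable is non-empty. Parsing text after the colon, rather than a fixed field number, keeps multi-word states intact.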