Using free software to monitor S.M.A.R.T. drives on SmartArray (cciss) controllers

It is possible to access S.M.A.R.T. capabilities of drives attached to HP SmartArray (cciss) hardware RAID controllers using Free Software. Here are some examples of how I do this on Debian systems.

smartctl command line

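Install the smartmontools package. As root, run smartctl with the special option "-d cciss,N", where N is the number of the drive on the controller that you want to query (numbering starts at 0). The device node (here /dev/cciss/c0d0) only identifies the controller; N selects the physical disk behind it. For example: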

# /usr/sbin/smartctl -d cciss,0 -a /dev/cciss/c0d0
smartctl version 5.38 [x86_64-unknown-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

Device: HP       DG146ABAB4       Version: HPD7
Serial number: 3NM0FYSF000097255W39
Device type: disk
Transport protocol: SAS
Local Time is: Sat Jun 28 04:19:34 2008 MDT
Device supports SMART and is Enabled
Temperature Warning Enabled
SMART Health Status: OK

Current Drive Temperature:     27 C
Drive Trip Temperature:        68 C
Elements in grown defect list: 87
Vendor (Seagate) cache information
  Blocks sent to initiator = 707632981
  Blocks received from initiator = 480660099
  Blocks read from cache and sent to initiator = 1692237021
  Number of read and write commands whose size <= segment size = 194230242
  Number of read and write commands whose size > segment size = 0
Vendor (Seagate/Hitachi) factory information
  number of hours powered up = 6695.73
  number of minutes until next internal SMART test = 25

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:          0        0         0         0          0          0.000           0
write:         0        0         0         0          0          0.000           0

Non-medium error count:        0

SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background short  Completed                   -    6693                 - [-   -    -]
# 2  Background short  Completed                   -    6669                 - [-   -    -]
# 3  Background short  Completed                   -    6645                 - [-   -    -]

Long (extended) Self Test duration: 2070 seconds [34.5 minutes]

Useful options: -i prints device info, -H prints health status, -A prints attributes, and -a prints all SMART information. Read the smartctl(8) man page for more info.
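
If you want to check these values from a script rather than by eye, the output above can be parsed. The following is only a minimal sketch in Python: the function name parse_smartctl and the field names are my own, and the patterns assume the SAS/SCSI labels printed by smartctl 5.38 as shown above. Other smartctl versions, or ATA drives, may print different labels.

```python
import re

# Labels copied from the smartctl 5.38 SAS output shown above.
# Assumption: other versions or drive types may word these differently.
PATTERNS = {
    "health": re.compile(r"^SMART Health Status:\s*(\S+)", re.M),
    "temperature_c": re.compile(r"^Current Drive Temperature:\s*(\d+) C", re.M),
    "grown_defects": re.compile(r"^Elements in grown defect list:\s*(\d+)", re.M),
}


def parse_smartctl(text):
    """Return a dict of the fields found in the text; missing fields are omitted."""
    result = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        if match:
            value = match.group(1)
            # Health is a word such as "OK"; the other fields are integers.
            result[name] = value if name == "health" else int(value)
    return result


# A few lines taken from the example output above.
SAMPLE = """\
SMART Health Status: OK

Current Drive Temperature:     27 C
Drive Trip Temperature:        68 C
Elements in grown defect list: 87
"""

if __name__ == "__main__":
    print(parse_smartctl(SAMPLE))
```

To use it on a live system you would capture the output of the smartctl command shown above (for example with subprocess.run, as root) and pass the text to parse_smartctl.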

smartd monitoring

Once you know that smartctl works, you can set up smartd to do periodic testing. Edit /etc/smartd.conf, comment out the line beginning with "DEVICESCAN" to turn off the automatic detection of all drives, then add something like the following:

# Monitor 2 disks connected to the first HP SmartArray controller which
# uses the cciss driver. Start long tests on Sunday nights and short
# self-tests every night and send errors to root
/dev/cciss/c0d0 -d cciss,0 -a -s (L/../../7/02|S/../.././02) -m root
/dev/cciss/c0d0 -d cciss,1 -a -s (L/../../7/03|S/../.././03) -m root
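
With more than a couple of drives, writing these lines by hand gets tedious. Here is a rough Python sketch that generates them in the same pattern as above: one line per drive, with the test hour shifted by one for each drive so the tests don't all start at once. The function name smartd_lines and its parameters are hypothetical, and the one-hour stagger is just my reading of the two example lines, not something smartd requires.

```python
def smartd_lines(num_drives, device="/dev/cciss/c0d0", first_hour=2, mail_to="root"):
    """Build smartd.conf lines for drives 0..num_drives-1 on one cciss controller.

    Each drive gets a long test on Sundays (day 7) and a short test every day,
    both at the same hour; the hour is staggered by one per drive.
    """
    lines = []
    for n in range(num_drives):
        hour = first_hour + n
        if hour > 23:
            raise ValueError("ran out of hours in the day to stagger tests")
        schedule = "(L/../../7/%02d|S/../.././%02d)" % (hour, hour)
        lines.append("%s -d cciss,%d -a -s %s -m %s" % (device, n, schedule, mail_to))
    return lines


if __name__ == "__main__":
    for line in smartd_lines(2):
        print(line)
```

Called with 2 drives and the defaults, it should reproduce the two configuration lines above; review the result before pasting it into /etc/smartd.conf.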

I filed a bug to add the above as examples to the config file and to add more cciss details to the man pages; see #488371 for more info. UPDATE: patch accepted in 5.38-2.

munin hddtemp_smartctl plugin

Support for cciss was added in munin version 1.2.6, but you need a recent version of smartmontools (newer than the one in etch) for it to work; see #488357. UPDATE: backports.org now has a new enough version backported to etch. Current versions of munin and smartmontools in unstable backport to etch with no changes, so if you are using etch it's pretty easy. Once you have new enough versions installed, edit the hddtemp_smartctl stanza in /etc/munin/plugin-conf.d/munin-node (or add it if it doesn't exist) and add entries for cciss. For example, to monitor two drives on the first cciss controller:

[hddtemp_smartctl]
user root
env.drives cciss0 cciss1
env.type_cciss0 cciss,0
env.dev_cciss0 cciss/c0d0
env.type_cciss1 cciss,1
env.dev_cciss1 cciss/c0d0
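
The stanza follows a regular pattern (one name in env.drives plus an env.type_ and an env.dev_ entry per drive), so it can also be generated. This is only an illustrative Python sketch: the function name munin_stanza and the cciss0, cciss1, ... naming simply mirror the example above and are not anything munin itself defines.

```python
def munin_stanza(num_drives, controller_dev="cciss/c0d0"):
    """Build an hddtemp_smartctl stanza for drives 0..num_drives-1 on one controller."""
    # Drive labels follow the example above: cciss0, cciss1, ...
    names = ["cciss%d" % n for n in range(num_drives)]
    lines = ["[hddtemp_smartctl]", "user root", "env.drives " + " ".join(names)]
    for n, name in enumerate(names):
        lines.append("env.type_%s cciss,%d" % (name, n))
        lines.append("env.dev_%s %s" % (name, controller_dev))
    return "\n".join(lines)


if __name__ == "__main__":
    print(munin_stanza(2))
```

For two drives this should print the same stanza as the example above.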

munin smart_ plugin

The munin smart_ plugin doesn't yet support cciss. I filed a bug with a non-working attempt at a patch; hopefully someone more familiar with the code can make it work properly. See #488360 for more info.

Matt Taggart <matt@lackof.org>
2008-06-25