[RFT,v3,0/1] Summary: hwmon driver for disk and solid state drives with temperature sensors

Message ID 20191226175051.31664-1-linux@roeck-us.net (mailing list archive)

Guenter Roeck Dec. 26, 2019, 5:50 p.m. UTC
In the past, several attempts have been made to add support for reporting
SCSI/[S]ATA drive temperatures to the Linux kernel. Such support is
desirable because it lets userspace read drive temperatures without root
privileges and in a standard format, and because it allows the reported
temperatures to be tied into the thermal subsystem.

The most recent attempt was [1] by Linus Walleij. It went through a total
of seven iterations. In the end, it was rejected for a number of reasons;
see the provided link for details. This implementation resides in the
SCSI core. It originally resided in libata but was moved to SCSI per
maintainer request, where it was ultimately rejected.

An earlier submission of a driver to report SCSI/SATA drive temperatures
was made back in 2009 by Constantin Baranov [2]. This submission resides
in the hardware monitoring subsystem. It does not rely on changes in the
SCSI subsystem or in libata-scsi. Instead, it registers itself with the
SCSI subsystem using scsi_register_interface(). It was rejected primarily
because it executes ATA passthrough commands without verification that it
is actually connected to an ATA drive.

Both submissions use SMART attributes to read drive temperature information.
[1] also tries to identify temperature limits from those attributes.
Unfortunately, SMART attributes are not well defined, resulting in relatively
complex code trying to identify the exact format of the reported data.

With the available information and feedback, we can make a number of
observations and conclusions.
a) Converting available (S)ATA drive temperature information to a SCSI log
   page (see SAT-5 [3]) is an interesting idea. On the downside, it would
   add a substantial amount of complexity to libata-scsi. The code would
   either have to be optional, or it would have to be built into the kernel
   even if it is never used on a given system. Without access to SCSI drives
   supporting this feature, it would be all but impossible to test the code
   against such a drive. It would not be possible to test the correctness
   of either the code in libata-scsi or the driver using that information.
   Overall it would be much easier and much less risky to implement such
   code on the receiving side (i.e. in a driver reporting the temperatures)
   instead of trying to convert the information from one format to another
   first. In summary, this approach is neither practical nor feasible. On
   top of that, there is no guarantee that code implementing this
   functionality would ever be accepted into the kernel for this very reason.
b) The code needed to read and analyze SCSI temperature log pages is quite
   complex (see smartmontools [5]). There is no existing support code
   in the Linux kernel; such code would have to be written. This makes
   the approach discussed in a) even more risky and less practical.
c) Overall, any attempt to report temperature information for anything
   but SATA drives in the kernel is not practical due to the complexity
   involved, and due to the inability to test the resulting code with
   non-SATA drives.
d) Using SMART data for anything but basic temperature reporting is not
   really feasible due to the lack of standardization. Any attempt to do
   this would add a substantial amount of code, ambiguity, and risk.

This submission implements a driver to report the temperature of SATA
drives through the hardware monitoring subsystem. It is implemented as a
stand-alone driver in the hardware monitoring subsystem. The driver uses
the mechanism from submission [2], scsi_register_interface(), to register
with the SCSI subsystem. By using this mechanism, changes in the SCSI or
ATA subsystems are not required. To reduce risk and complexity, it only
instantiates after reliably validating that it is connected to a SATA
drive. It does not attempt to report the temperature of non-SATA drives.
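
To illustrate the kind of validation meant here, the following userspace
sketch (Python, not kernel code) inspects an ATA Information VPD page
(page 89h, as defined by SAT) and checks whether the embedded IDENTIFY
DEVICE data advertises SCT Command Transport. The function names are
hypothetical, and the exact checks performed by the patch are not shown
in this cover letter; this is only an approximation under those
assumptions.

```python
import struct

ATA_INFO_VPD_PAGE = 0x89         # page code of the ATA Information VPD page
ATA_CMD_IDENTIFY_DEVICE = 0xEC   # byte 56: command used to obtain the data
IDENTIFY_OFFSET = 60             # IDENTIFY DEVICE data starts at byte 60
VPD_PAGE_LEN = IDENTIFY_OFFSET + 512
IDENTIFY_WORD_SCT = 206          # word 206, bit 0: SCT Command Transport


def identify_words(vpd):
    """Return the 256 IDENTIFY DEVICE words, or None if the page does not
    look like the ATA Information page of an ATA (non-ATAPI) device."""
    if len(vpd) < VPD_PAGE_LEN or vpd[1] != ATA_INFO_VPD_PAGE:
        return None
    if vpd[56] != ATA_CMD_IDENTIFY_DEVICE:
        return None
    # IDENTIFY DEVICE data consists of 256 little-endian 16-bit words.
    return struct.unpack_from("<256H", vpd, IDENTIFY_OFFSET)


def supports_sct(words):
    """True if the drive advertises the SCT Command Transport feature set."""
    return bool(words[IDENTIFY_WORD_SCT] & 0x0001)
```

A caller would treat a None result from identify_words() as "not an ATA
drive" and decline to instantiate.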

The driver uses the SCT Command Transport feature set as specified in
ATA8-ACS [4] to read and report the temperature as well as temperature
limits and lowest/highest temperature information (if available) for
SATA drives. If a drive does not support SCT Command Transport, the driver
attempts to access a limited set of well-known SMART attributes to read
the drive temperature. In that case, only the current drive temperature
is reported.
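
The selection between SCT and SMART can be sketched as follows. This is
a userspace illustration in Python, not the driver code. It assumes the
512-byte SCT status response carries the current, lowest and highest
temperature as signed bytes at offsets 200, 201 and 202, with 0x80
meaning "not available", and that the SMART data sector holds 30
12-byte attribute entries starting at offset 2, with the temperature in
the first raw byte of attribute 194 or 190. The SMART layout is vendor
convention rather than a standard, and the attribute IDs and function
names are assumptions made for this example.

```python
SCT_STATUS_TEMP = 200           # current temperature
SCT_STATUS_TEMP_LOWEST = 201    # lowest temperature, current power cycle
SCT_STATUS_TEMP_HIGHEST = 202   # highest temperature, current power cycle
SCT_TEMP_INVALID = 0x80

SMART_ATTR_OFFSET = 2
SMART_ATTR_SIZE = 12
SMART_ATTR_COUNT = 30
SMART_TEMP_IDS = (194, 190)     # assumed IDs, in order of preference


def sct_temp(status, offset):
    """Decode one SCT temperature byte into millidegrees Celsius."""
    raw = status[offset]
    if raw == SCT_TEMP_INVALID:
        return None
    signed = raw - 256 if raw > 127 else raw
    return signed * 1000


def smart_temp(data):
    """Find a temperature attribute in a 512-byte SMART data sector."""
    if len(data) != 512 or sum(data) & 0xFF:
        return None             # wrong size or bad checksum
    found = {}
    for i in range(SMART_ATTR_COUNT):
        start = SMART_ATTR_OFFSET + i * SMART_ATTR_SIZE
        attr_id = data[start]
        if attr_id in SMART_TEMP_IDS:
            # Entry byte 5 is the first byte of the raw value.
            found.setdefault(attr_id, data[start + 5])
    for attr_id in SMART_TEMP_IDS:
        if attr_id in found:
            return found[attr_id] * 1000
    return None


def current_temp(sct_status, smart_data):
    """Use SCT if the drive supports it (sct_status is not None),
    otherwise fall back to SMART."""
    if sct_status is not None:
        return sct_temp(sct_status, SCT_STATUS_TEMP)
    return smart_temp(smart_data)
```

Only the SCT path can supply lowest/highest values; the SMART fallback
yields the current temperature alone, matching the behavior described
above.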

The driver does not currently report temperatures for SCSI drives. This
will be added with a subsequent patch.
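
For reference, a userspace consumer of a hwmon driver like this one
might look like the sketch below. It relies on the standard hwmon sysfs
convention that temp1_input reports millidegrees Celsius. That the
device's name attribute reads "drivetemp" is an assumption based on the
driver name, and drive_temperatures() is a hypothetical helper, not
part of the patch.

```python
from pathlib import Path


def drive_temperatures(root="/sys/class/hwmon", name="drivetemp"):
    """Map hwmon device directory names to temperatures in degrees Celsius
    for every hwmon device whose 'name' attribute matches."""
    temps = {}
    for dev in sorted(Path(root).glob("hwmon*")):
        try:
            if (dev / "name").read_text().strip() != name:
                continue
            # hwmon reports temperatures in millidegrees Celsius.
            milli = int((dev / "temp1_input").read_text())
        except (OSError, ValueError):
            continue            # attribute missing or unreadable
        temps[dev.name] = milli / 1000.0
    return temps


if __name__ == "__main__":
    for dev, temp in drive_temperatures().items():
        print(f"{dev}: {temp:.1f} C")
```

No root privileges are needed for this, since the hwmon attributes are
world-readable.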

---
v3: Rename satatemp -> drivetemp
    Use cached VPD page 89 data (available with v5.5 and later kernels)
    Relax ATA drive detection; still check if inquiry data is
    present, but don't use it for access detection.
    Modify VPD data analysis following guidance from Martin K. Petersen
    Separate SATA drive detection into separate function
    Marked as RFT. Martin K. Petersen reports:
    "I get a crash in the driver core during probe if the drivetemp module
     is loaded prior to loading ahci or a SCSI HBA driver. This crash is
     unrelated to my changes. Haven't had time to debug."
    This will require further testing before the patch is applied.

v2: scsi_cmd variable is no longer static
    Fixed drive name in Kconfig 
    Describe heuristics used to select SCT or SMART in commit message
    Added Reviewed-by: from Linus Walleij

---
References:
[1] https://patchwork.kernel.org/patch/10688021/
[2] https://lore.kernel.org/lkml/20090913040104.ab1d0b69.const@mimas.ru/
[3] http://www.t10.org/cgi-bin/ac.pl?t=f&f=sat5r02.pdf
    Information technology - SCSI / ATA Translation - 5 (SAT-5),
    section 10.3.8 (Temperature log page).
[4] http://www.t13.org/documents/uploadeddocuments/docs2008/d1699r6a-ata8-acs.pdf
    ANS T13/1699-D "Information technology - AT Attachment 8 - ATA/ATAPI Command
    Set (ATA8-ACS)"
[5] https://github.com/mirror/smartmontools.git