Message ID | 1531342898-15790-3-git-send-email-eajames@linux.vnet.ibm.com (mailing list archive) |
---|---|
State | Changes Requested |
Headers | show |
On Wed, Jul 11, 2018 at 04:01:31PM -0500, Eddie James wrote: > Document the hwmon interface for the OCC. > > Signed-off-by: Eddie James <eajames@linux.vnet.ibm.com> > --- > Documentation/hwmon/occ | 73 +++++++++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 73 insertions(+) > create mode 100644 Documentation/hwmon/occ > > diff --git a/Documentation/hwmon/occ b/Documentation/hwmon/occ > new file mode 100644 > index 0000000..465fa1a > --- /dev/null > +++ b/Documentation/hwmon/occ > @@ -0,0 +1,73 @@ > +Kernel driver occ-hwmon > +======================= > + > +Supported chips: > + * POWER8 > + * POWER9 > + > +Author: Eddie James <eajames@linux.vnet.ibm.com> > + > +Description > +----------- > + > +This driver supports hardware monitoring for the On-Chip Controller (OCC) > +embedded on POWER processors. The OCC is a device that collects and aggregates > +sensor data from the processor and the system. The OCC can provide the raw > +sensor data as well as perform thermal and power management on the system. > + > +The P8 version of this driver is a client driver of I2C. It may be probed > +manually if an "ibm,p8-occ-hwmon" compatible device is found under the > +appropriate I2C bus node in the device-tree. > + > +The P9 version of this driver is a client driver of the FSI-based OCC driver. > +It will be probed automatically by the FSI-based OCC driver. > + > +Sysfs entries > +------------- > + > +The following attributes are supported. All attributes are read-only unless > +specified. > + > +temp[1-n]_label OCC sensor id. > +temp[1-n]_input Measured temperature in millidegrees C. > +[with temperature sensor version 2+] > + temp[1-n]_fru_type Given FRU (Field Replaceable Unit) type. What is this ? An integer ? A string ? > + temp[1-n]_fault Temperature sensor fault. > + > +freq[1-n]_label OCC sensor id. > +freq[1-n]_input Measured frequency. What does that have to do with hardware monitoring, and what exactly does it measure ? AC voltage frequency ? Frequency of rainstorms in the surrounding area ? > + > +power[1-n]_label OCC sensor id. > +power[1-n]_input Measured power in microwatts. > +power[1-n]_update_tag Number of 250us samples represented in accumulator. update_tag to represent number of samples ? Odd choice for an attribute name. Why not "_samples" ? Also, if each sample represents a specific amount of time, why not report a time ? > +power[1-n]_accumulator Accumulation of 250us power readings. There is no explanation of "accumulation". Is this the energy ? If so, why not use energy attributes ? And what is the unit of this measurement ? > +[with power sensor version 2+] > + power[1-n]_function_id Identifies what the power reading is for. String ? Number ? Slot index ? Bitmap ? And why isn't that reported in the label ? After all, that is what the label is supposed to be used for. > + power[1-n]_apss_channel Indicates APSS channel. > + Does that provide any value to the user ? > +[power version 0xa0 only] > +power1_id OCC sensor id. This is inconsistent with the other attributes and even with itself. > +power[1-n]_label Sensor type, "system", "proc", "vdd", or "vdn". > +power[1-n]_input Most recent power reading in microwatts. Overall I am left with no idea what _id _label _function_id _apps_channel are and how they relate to each other, except that it all looks quite inconsistent. You might want to consider merging all those attributes into the label in some consistent way. > +power[1-n]_update_tag Number of samples in the accumulator. > +power[1-n]_accumulator Accumulation of power readings. Same as above. > +[with sensor type "system" and "proc" only] > + power[1-n]_update_time Time in us that the power value is read. > + > +caps1_current Current OCC power cap in watts. > +caps1_reading Current system output power in watts. > +caps1_norm Power cap without redundant power. > +caps1_max Maximum power cap. Why do those have to be non-standard attributes ? Please explain why you can not use power[1-n]_cap attributes. > +[caps version 1 and 2 only] > + caps1_min Minimum power cap. > +[caps version 3+] > + caps1_min_hard Hard minimum cap that can be set and held. > + caps1_min_soft Soft minimum cap below hard, not guaranteed. > +caps1_user The powercap specified by the user. Will be 0 if no > + user powercap exists. This attribute is read-write. > +[caps version 1+] > + caps1_user_source Indicates how the user power limit was set. > + > +extn[1-n]_label ASCII id or sensor id. > +extn[1-n]_flags Indicates type of label attribute. > +extn[1-n]_input Data. Great non-explanation. Not reviewing the series further. I am sure I asked that each non-standard attribute is explained. There is neither an explanation why the attributes are needed nor, in many cases, why non-standard attributes were chosen instead of standard ones. On top of that, the non-standard attributes are not even documented properly, leaving the reader wondering not only why they are needed, but what they are used for in the first place. Guenter -- To unsubscribe from this list: send the line "unsubscribe linux-hwmon" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 07/25/2018 11:36 AM, Guenter Roeck wrote: > On Wed, Jul 11, 2018 at 04:01:31PM -0500, Eddie James wrote: >> Document the hwmon interface for the OCC. >> >> Signed-off-by: Eddie James <eajames@linux.vnet.ibm.com> >> --- >> Documentation/hwmon/occ | 73 +++++++++++++++++++++++++++++++++++++++++++++++++ >> 1 file changed, 73 insertions(+) >> create mode 100644 Documentation/hwmon/occ >> >> diff --git a/Documentation/hwmon/occ b/Documentation/hwmon/occ >> new file mode 100644 >> index 0000000..465fa1a >> --- /dev/null >> +++ b/Documentation/hwmon/occ >> @@ -0,0 +1,73 @@ >> +Kernel driver occ-hwmon >> +======================= >> + >> +Supported chips: >> + * POWER8 >> + * POWER9 >> + >> +Author: Eddie James <eajames@linux.vnet.ibm.com> >> + >> +Description >> +----------- >> + >> +This driver supports hardware monitoring for the On-Chip Controller (OCC) >> +embedded on POWER processors. The OCC is a device that collects and aggregates >> +sensor data from the processor and the system. The OCC can provide the raw >> +sensor data as well as perform thermal and power management on the system. >> + >> +The P8 version of this driver is a client driver of I2C. It may be probed >> +manually if an "ibm,p8-occ-hwmon" compatible device is found under the >> +appropriate I2C bus node in the device-tree. >> + >> +The P9 version of this driver is a client driver of the FSI-based OCC driver. >> +It will be probed automatically by the FSI-based OCC driver. >> + >> +Sysfs entries >> +------------- >> + >> +The following attributes are supported. All attributes are read-only unless >> +specified. >> + >> +temp[1-n]_label OCC sensor id. >> +temp[1-n]_input Measured temperature in millidegrees C. >> +[with temperature sensor version 2+] >> + temp[1-n]_fru_type Given FRU (Field Replaceable Unit) type. > What is this ? An integer ? A string ? > >> + temp[1-n]_fault Temperature sensor fault. >> + >> +freq[1-n]_label OCC sensor id. >> +freq[1-n]_input Measured frequency. > What does that have to do with hardware monitoring, and what exactly does it > measure ? AC voltage frequency ? Frequency of rainstorms in the surrounding > area ? > >> + >> +power[1-n]_label OCC sensor id. >> +power[1-n]_input Measured power in microwatts. >> +power[1-n]_update_tag Number of 250us samples represented in accumulator. > update_tag to represent number of samples ? Odd choice for > an attribute name. Why not "_samples" ? Also, if each sample > represents a specific amount of time, why not report a time ? > >> +power[1-n]_accumulator Accumulation of 250us power readings. > There is no explanation of "accumulation". Is this the energy ? > If so, why not use energy attributes ? And what is the unit of > this measurement ? > >> +[with power sensor version 2+] >> + power[1-n]_function_id Identifies what the power reading is for. > String ? Number ? Slot index ? Bitmap ? And why isn't that reported > in the label ? After all, that is what the label is supposed to be > used for. > >> + power[1-n]_apss_channel Indicates APSS channel. >> + > Does that provide any value to the user ? > >> +[power version 0xa0 only] >> +power1_id OCC sensor id. > This is inconsistent with the other attributes and even with itself. > >> +power[1-n]_label Sensor type, "system", "proc", "vdd", or "vdn". >> +power[1-n]_input Most recent power reading in microwatts. > Overall I am left with no idea what > _id > _label > _function_id > _apps_channel > are and how they relate to each other, except that it all looks quite > inconsistent. You might want to consider merging all those attributes into > the label in some consistent way. > >> +power[1-n]_update_tag Number of samples in the accumulator. >> +power[1-n]_accumulator Accumulation of power readings. > Same as above. > >> +[with sensor type "system" and "proc" only] >> + power[1-n]_update_time Time in us that the power value is read. >> + >> +caps1_current Current OCC power cap in watts. >> +caps1_reading Current system output power in watts. >> +caps1_norm Power cap without redundant power. >> +caps1_max Maximum power cap. > Why do those have to be non-standard attributes ? Please explain why you can not > use power[1-n]_cap attributes. > >> +[caps version 1 and 2 only] >> + caps1_min Minimum power cap. >> +[caps version 3+] >> + caps1_min_hard Hard minimum cap that can be set and held. >> + caps1_min_soft Soft minimum cap below hard, not guaranteed. >> +caps1_user The powercap specified by the user. Will be 0 if no >> + user powercap exists. This attribute is read-write. >> +[caps version 1+] >> + caps1_user_source Indicates how the user power limit was set. >> + >> +extn[1-n]_label ASCII id or sensor id. >> +extn[1-n]_flags Indicates type of label attribute. >> +extn[1-n]_input Data. > Great non-explanation. > > Not reviewing the series further. I am sure I asked that each non-standard > attribute is explained. There is neither an explanation why the attributes > are needed nor, in many cases, why non-standard attributes were chosen > instead of standard ones. On top of that, the non-standard attributes are > not even documented properly, leaving the reader wondering not only why > they are needed, but what they are used for in the first place. Hi, Thanks for the feedback Guenter. I am about to put up a new patch set with fixes for many of the issues you indicated and better descriptions. Now all the attributes except two should conform to standard hwmon attributes. The exceptions are for the user power cap and user power cap source. These are needed in order to make decisions about power management while controlling the system. Please look at their documentation to see if they're acceptable. Let me know what you think! Thanks, Eddie > > Guenter >
diff --git a/Documentation/hwmon/occ b/Documentation/hwmon/occ new file mode 100644 index 0000000..465fa1a --- /dev/null +++ b/Documentation/hwmon/occ @@ -0,0 +1,73 @@ +Kernel driver occ-hwmon +======================= + +Supported chips: + * POWER8 + * POWER9 + +Author: Eddie James <eajames@linux.vnet.ibm.com> + +Description +----------- + +This driver supports hardware monitoring for the On-Chip Controller (OCC) +embedded on POWER processors. The OCC is a device that collects and aggregates +sensor data from the processor and the system. The OCC can provide the raw +sensor data as well as perform thermal and power management on the system. + +The P8 version of this driver is a client driver of I2C. It may be probed +manually if an "ibm,p8-occ-hwmon" compatible device is found under the +appropriate I2C bus node in the device-tree. + +The P9 version of this driver is a client driver of the FSI-based OCC driver. +It will be probed automatically by the FSI-based OCC driver. + +Sysfs entries +------------- + +The following attributes are supported. All attributes are read-only unless +specified. + +temp[1-n]_label OCC sensor id. +temp[1-n]_input Measured temperature in millidegrees C. +[with temperature sensor version 2+] + temp[1-n]_fru_type Given FRU (Field Replaceable Unit) type. + temp[1-n]_fault Temperature sensor fault. + +freq[1-n]_label OCC sensor id. +freq[1-n]_input Measured frequency. + +power[1-n]_label OCC sensor id. +power[1-n]_input Measured power in microwatts. +power[1-n]_update_tag Number of 250us samples represented in accumulator. +power[1-n]_accumulator Accumulation of 250us power readings. +[with power sensor version 2+] + power[1-n]_function_id Identifies what the power reading is for. + power[1-n]_apss_channel Indicates APSS channel. + +[power version 0xa0 only] +power1_id OCC sensor id. +power[1-n]_label Sensor type, "system", "proc", "vdd", or "vdn". +power[1-n]_input Most recent power reading in microwatts. +power[1-n]_update_tag Number of samples in the accumulator. +power[1-n]_accumulator Accumulation of power readings. +[with sensor type "system" and "proc" only] + power[1-n]_update_time Time in us that the power value is read. + +caps1_current Current OCC power cap in watts. +caps1_reading Current system output power in watts. +caps1_norm Power cap without redundant power. +caps1_max Maximum power cap. +[caps version 1 and 2 only] + caps1_min Minimum power cap. +[caps version 3+] + caps1_min_hard Hard minimum cap that can be set and held. + caps1_min_soft Soft minimum cap below hard, not guaranteed. +caps1_user The powercap specified by the user. Will be 0 if no + user powercap exists. This attribute is read-write. +[caps version 1+] + caps1_user_source Indicates how the user power limit was set. + +extn[1-n]_label ASCII id or sensor id. +extn[1-n]_flags Indicates type of label attribute. +extn[1-n]_input Data.
Document the hwmon interface for the OCC. Signed-off-by: Eddie James <eajames@linux.vnet.ibm.com> --- Documentation/hwmon/occ | 73 +++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 73 insertions(+) create mode 100644 Documentation/hwmon/occ