diff mbox

HPE SMART data retreival

Message ID 20160728094518.cqtaavikkrzqnf7k@c203.arch.suse.de (mailing list archive)
State New, archived
Headers show

Commit Message

Johannes Thumshirn July 28, 2016, 9:45 a.m. UTC
On Wed, Jul 27, 2016 at 02:23:51PM -0400, Linda Knippers wrote:
> Hi Johannes,
> 
> I hope Dan and Jerry also reply but I have recently started looking at
> this too and have some comments below.

So if we two are looking into it we might want to work together so we don't do
all the work twice.

I've pasted my current work to export HPE DIMM commands at the bottom of the
mail, but I don't like it very much (see below for a proposal of a more
elegant way).

> 
> On 7/27/2016 6:35 AM, Johannes Thumshirn wrote:
> > Hi Dan and Jerry,
> > 
> > I'm currently looking into SMART data retrieval on HPE NVDIMMs.
> > 
> > After the first obstacle (like getting cat
> > /sys/class/nd/ndctl0/device/nmem0/commands reutrn smart so ndctl will issue
> > the ioctl) I ran into a rather nasty problem. According to [1] HPEDIMMs
> > need the input buffer specially crafted for SMART data, according to [2]
> > Intel DIMMs don't.
> 
> It is unfortunate that the DSM functions are different for different NVDIMM
> 
> types.  There is a desire for standardization but for now this is what we have.

Sure, let's see what we can come up with to "fix" the situation.

I had a quick chat here internally at SUSE and one suggestion was to create
something, let's call it "filter drivers", to support different DSMs for
different NVDIMM families. I actually like the idea and if no-one objects I'll
try to implement some kind of prototype for the "filter drivers".

> 
> > Adding translation functions for the DIMMs accepted commands is one thing and
> > should be more or less trivial for all DIMMs (I'll post an RFC patch as soon
> > as Linus merged Dan's 4.8 pull request so I can rebase it) but doing this
> > type of conversation for each and every command for every defined vendor
> > family for both the input and output buffers will drive us all mad I guess.
> 
> In my opinion, I don't think we need ndctl to be able to issue every function
> and decode every response but I would like to see the --health option to report
> the "SMART and Health" data reported by functions 1 and 2.  I could be wrong
> but I think that's all it does for the NVDIMMs that use the Intel example DSM.

That's what I think as well.

> 
> There are functions in libndctl that issue other DSMs but I don't think
> those aren't used by the ndctl command itself.  Whether we need the other
> functions in libndctl to work with non-Intel DSM devices is still TBD.
> 
> > Especially from the distribution's POV I'm not to keen on having customers
> > with some new NVDIMM family and we would need to re-implement all the
> > translators again. Adding a new ID is one thing but translation tables are a
> > totally different story.
> > 
> > So the question is have I overlooked something and there is a clean and easy
> > solution to this problem, or not.
> 
> I don't think it's clean and easy today.  I think it needs to be made a bit
> more modular to handle the NVDIMMs that have DSMs implemented using the
> pass-through mode.
> 
> > @Jerry have you tested SMART data retrieval with ndctl? Did it work for you?
> 
> I tried it and it does not work.

Ok this at least rules out my own stupidity.


As said, here's my current translation code, I'll try to implement the filter
driver idea and report back to the ML once I have something working. The
emphasis is on SMART and Health data currently but I'd like to expand it so we
have most of the documented DSMs supported until we've found a standard way.

From 7572ea9aaae7382133da2afd972ee948adb9bf0f Mon Sep 17 00:00:00 2001
From: Johannes Thumshirn <jthumshirn@suse.de>
Date: Thu, 28 Jul 2016 09:07:13 +0200
Subject: [PATCH] nfit: Read DSM Mask for HPE DIMMs as well

Currently the nfit driver only reads the DIMMs supported DSM mask for Intel
DIMMs, so we end up with only "cmd_call" in sysfs' supported commands file
/sys/class/nd/ndctlX/device/nmemX/commands.

Introduce some translation functions to map NVDIMM_FAMILY_HPE1 and
NVDIMM_FAMILY_HPE2 to the currently implemented Intel commands.

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
---
 drivers/acpi/nfit.c | 59 +++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 57 insertions(+), 2 deletions(-)

Comments

Linda Knippers July 28, 2016, 5:43 p.m. UTC | #1
On 7/28/2016 5:45 AM, Johannes Thumshirn wrote:
> On Wed, Jul 27, 2016 at 02:23:51PM -0400, Linda Knippers wrote:
>> Hi Johannes,
>>
>> I hope Dan and Jerry also reply but I have recently started looking at
>> this too and have some comments below.
> 
> So if we two are looking into it we might want to work together so we don't do
> all the work twice.

Definitely!

> I've pasted my current work to export HPE DIMM commands at the bottom of the
> mail, but I don't like it very much (see below for a proposal of a more
> elegant way).
> 
>>
>> On 7/27/2016 6:35 AM, Johannes Thumshirn wrote:
>>> Hi Dan and Jerry,
>>>
>>> I'm currently looking into SMART data retrieval on HPE NVDIMMs.
>>>
>>> After the first obstacle (like getting cat
>>> /sys/class/nd/ndctl0/device/nmem0/commands reutrn smart so ndctl will issue
>>> the ioctl) I ran into a rather nasty problem. According to [1] HPEDIMMs
>>> need the input buffer specially crafted for SMART data, according to [2]
>>> Intel DIMMs don't.
>>
>> It is unfortunate that the DSM functions are different for different NVDIMM
>>
>> types.  There is a desire for standardization but for now this is what we have.
> 
> Sure, let's see what we can come up with to "fix" the situation.
> 
> I had a quick chat here internally at SUSE and one suggestion was to create
> something, let's call it "filter drivers", to support different DSMs for
> different NVDIMM families. I actually like the idea and if no-one objects I'll
> try to implement some kind of prototype for the "filter drivers".

I think a question we need to answer is what should be in the kernel vs. in user
space and what the interface between the two should look like.  The kernel
doesn't necessarily need to know that function 1 returns health information
and function 2 returns threshold information unless it needs the information
itself.  I think the intent of cmd_call was just as a pass-through mechanism
to keep from adding more code into the kernel.  We could create ndctl functions
instead.  I know this is different from how the original Intel DSM functions
are implemented but that could be considered legacy code to support existing
applications at this point.

For history, the first discussions about this approach were on the nvdimm
list back in November and it went from there.
https://lists.01.org/pipermail/linux-nvdimm/2015-November/002721.html

>>
>>> Adding translation functions for the DIMMs accepted commands is one thing and
>>> should be more or less trivial for all DIMMs (I'll post an RFC patch as soon
>>> as Linus merged Dan's 4.8 pull request so I can rebase it) but doing this
>>> type of conversation for each and every command for every defined vendor
>>> family for both the input and output buffers will drive us all mad I guess.
>>
>> In my opinion, I don't think we need ndctl to be able to issue every function
>> and decode every response but I would like to see the --health option to report
>> the "SMART and Health" data reported by functions 1 and 2.  I could be wrong
>> but I think that's all it does for the NVDIMMs that use the Intel example DSM.
> 
> That's what I think as well.
> 
>>
>> There are functions in libndctl that issue other DSMs but I don't think
>> those aren't used by the ndctl command itself.  Whether we need the other
>> functions in libndctl to work with non-Intel DSM devices is still TBD.
>>
>>> Especially from the distribution's POV I'm not to keen on having customers
>>> with some new NVDIMM family and we would need to re-implement all the
>>> translators again. Adding a new ID is one thing but translation tables are a
>>> totally different story.
>>>
>>> So the question is have I overlooked something and there is a clean and easy
>>> solution to this problem, or not.
>>
>> I don't think it's clean and easy today.  I think it needs to be made a bit
>> more modular to handle the NVDIMMs that have DSMs implemented using the
>> pass-through mode.
>>
>>> @Jerry have you tested SMART data retrieval with ndctl? Did it work for you?
>>
>> I tried it and it does not work.
> 
> Ok this at least rules out my own stupidity.
> 
> 
> As said, here's my current translation code, I'll try to implement the filter
> driver idea and report back to the ML once I have something working. The
> emphasis is on SMART and Health data currently but I'd like to expand it so we
> have most of the documented DSMs supported until we've found a standard way.

My comments above may change all this but I'll comment on the code anyway....
> 
> From 7572ea9aaae7382133da2afd972ee948adb9bf0f Mon Sep 17 00:00:00 2001
> From: Johannes Thumshirn <jthumshirn@suse.de>
> Date: Thu, 28 Jul 2016 09:07:13 +0200
> Subject: [PATCH] nfit: Read DSM Mask for HPE DIMMs as well
> 
> Currently the nfit driver only reads the DIMMs supported DSM mask for Intel
> DIMMs, so we end up with only "cmd_call" in sysfs' supported commands file
> /sys/class/nd/ndctlX/device/nmemX/commands.
> 
> Introduce some translation functions to map NVDIMM_FAMILY_HPE1 and
> NVDIMM_FAMILY_HPE2 to the currently implemented Intel commands.

Today, we have no platform that implements HPE2 without also implementing
HPE1. HPE1 will be found and used first by the current Linux code.
We may discontinue HPE2...still being discussed...so we might not need
the HPE2 functions.
> 
> Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
> ---
>  drivers/acpi/nfit.c | 59 +++++++++++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 57 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/acpi/nfit.c b/drivers/acpi/nfit.c
> index ddff49d..50a1b81 100644
> --- a/drivers/acpi/nfit.c
> +++ b/drivers/acpi/nfit.c
> @@ -1163,6 +1163,53 @@ static int acpi_nfit_add_dimm(struct acpi_nfit_desc *acpi_desc,
>  	return 0;
>  }
>  
> +static inline unsigned long
> +acpi_nfit_decode_dsm_mask_from_intel(unsigned long dsm_mask)
> +{
> +	return dsm_mask;
> +}
> +
> +static inline unsigned long
> +acpi_nfit_decode_dsm_mask_from_hpe1(unsigned long dsm_mask)
> +{
> +	unsigned long cmd_mask = 0;
> +
> +	if (test_bit(1, &dsm_mask))
> +		set_bit(ND_CMD_SMART, &cmd_mask);
> +	if (test_bit(2, &dsm_mask))
> +		set_bit(ND_CMD_SMART_THRESHOLD, &cmd_mask);
> +
> +	return cmd_mask;
> +}
> +
> +static inline unsigned long
> +acpi_nfit_decode_dsm_mask_from_hpe2(unsigned long dsm_mask)
> +{
> +	unsigned long cmd_mask = 0;
> +
> +	if (test_bit(1, &dsm_mask))
> +		set_bit(ND_CMD_SMART, &dsm_mask);
> +	if (test_bit(2, &dsm_mask))
> +		set_bit(ND_CMD_SMART_THRESHOLD, &dsm_mask);
> +	if (test_bit(4, &dsm_mask))
> +		set_bit(ND_CMD_DIMM_FLAGS, &dsm_mask);
> +	if (test_bit(6, &dsm_mask))
> +		set_bit(ND_CMD_VENDOR_EFFECT_LOG_SIZE, &dsm_mask);
> +	if (test_bit(7, &dsm_mask))
> +		set_bit(ND_CMD_VENDOR_EFFECT_LOG, &dsm_mask);
> +	if (test_bit(8, &dsm_mask))
> +		set_bit(ND_CMD_VENDOR, &dsm_mask);
> +

You're actually setting dsm_mask, not cmd_mask above.

I'm not sure I see the value of re-using the ND_CMD_* values that are
defined for the Intel DSM where they happen to be the same as for other
DSM vs. just defining a new set of values that match each DSM implementation.

Or are you trying to map ND_CMD_SMART from user space into the function
number that implements that command for the particular DSM?

I think the only way to hid the implementation differences, including
the function differences, from user space would be to define generic
functions with generic output buffers that can represent all of the
data returned by any of the DSMs in a standard format.

> +	return cmd_mask;
> +}
> +
> +static unsigned long (*decoder_funcs[])(unsigned long dsm_mask) = {
> +	[NVDIMM_FAMILY_INTEL] = acpi_nfit_decode_dsm_mask_from_intel,
> +	[NVDIMM_FAMILY_HPE1] = acpi_nfit_decode_dsm_mask_from_hpe1,
> +	[NVDIMM_FAMILY_HPE2] = acpi_nfit_decode_dsm_mask_from_hpe2,

I think Dan just pushed the code for yet another family.

> +};
> +
> +
>  static int acpi_nfit_register_dimms(struct acpi_nfit_desc *acpi_desc)
>  {
>  	struct nfit_mem *nfit_mem;
> @@ -1199,8 +1246,16 @@ static int acpi_nfit_register_dimms(struct acpi_nfit_desc *acpi_desc)
>  		 * userspace interface.
>  		 */
>  		cmd_mask = 1UL << ND_CMD_CALL;
> -		if (nfit_mem->family == NVDIMM_FAMILY_INTEL)
> -			cmd_mask |= nfit_mem->dsm_mask;
> +		if (nfit_mem->family >= 0 &&
> +		    nfit_mem->family <= NVDIMM_FAMILY_HPE2) {
> +			int family = nfit_mem->family;
> +			unsigned long dsm_mask = nfit_mem->dsm_mask;
> +
> +			cmd_mask |= (*decoder_funcs[family])(dsm_mask);
> +		} else {
> +			dev_dbg(acpi_desc->dev, "Unknown NVDIMM Family %d\n",
> +				nfit_mem->family);
> +		}
>  
>  		nvdimm = nvdimm_create(acpi_desc->nvdimm_bus, nfit_mem,
>  				acpi_nfit_dimm_attribute_groups,
>
Dan Williams July 28, 2016, 6:20 p.m. UTC | #2
On Thu, Jul 28, 2016 at 10:43 AM, Linda Knippers <linda.knippers@hpe.com> wrote:
>
>
> On 7/28/2016 5:45 AM, Johannes Thumshirn wrote:
>> On Wed, Jul 27, 2016 at 02:23:51PM -0400, Linda Knippers wrote:
>>> Hi Johannes,
>>>
>>> I hope Dan and Jerry also reply but I have recently started looking at
>>> this too and have some comments below.
>>
>> So if we two are looking into it we might want to work together so we don't do
>> all the work twice.
>
> Definitely!
>
>> I've pasted my current work to export HPE DIMM commands at the bottom of the
>> mail, but I don't like it very much (see below for a proposal of a more
>> elegant way).
>>
>>>
>>> On 7/27/2016 6:35 AM, Johannes Thumshirn wrote:
>>>> Hi Dan and Jerry,
>>>>
>>>> I'm currently looking into SMART data retrieval on HPE NVDIMMs.
>>>>
>>>> After the first obstacle (like getting cat
>>>> /sys/class/nd/ndctl0/device/nmem0/commands reutrn smart so ndctl will issue
>>>> the ioctl) I ran into a rather nasty problem. According to [1] HPEDIMMs
>>>> need the input buffer specially crafted for SMART data, according to [2]
>>>> Intel DIMMs don't.
>>>
>>> It is unfortunate that the DSM functions are different for different NVDIMM
>>>
>>> types.  There is a desire for standardization but for now this is what we have.
>>
>> Sure, let's see what we can come up with to "fix" the situation.
>>
>> I had a quick chat here internally at SUSE and one suggestion was to create
>> something, let's call it "filter drivers", to support different DSMs for
>> different NVDIMM families. I actually like the idea and if no-one objects I'll
>> try to implement some kind of prototype for the "filter drivers".
>
> I think a question we need to answer is what should be in the kernel vs. in user
> space and what the interface between the two should look like.  The kernel
> doesn't necessarily need to know that function 1 returns health information
> and function 2 returns threshold information unless it needs the information
> itself.  I think the intent of cmd_call was just as a pass-through mechanism
> to keep from adding more code into the kernel.  We could create ndctl functions
> instead.  I know this is different from how the original Intel DSM functions
> are implemented but that could be considered legacy code to support existing
> applications at this point.
>
> For history, the first discussions about this approach were on the nvdimm
> list back in November and it went from there.
> https://lists.01.org/pipermail/linux-nvdimm/2015-November/002721.html
>

To me, the cmd_call mechanism is useful for commands that libnvdimm
does not and may never care about.  For the generally useful commands
that the kernel needs to be involved with I'd like to define a common
kernel command set and do in-kernel translation across the
different-for-difference sake implementations.  It might be good to
add versioning info in the payload so we can extend results without
breaking old binaries.
Linda Knippers July 28, 2016, 6:46 p.m. UTC | #3
On 7/28/2016 2:20 PM, Dan Williams wrote:
> On Thu, Jul 28, 2016 at 10:43 AM, Linda Knippers <linda.knippers@hpe.com> wrote:
>>
>>
>> On 7/28/2016 5:45 AM, Johannes Thumshirn wrote:
>>> On Wed, Jul 27, 2016 at 02:23:51PM -0400, Linda Knippers wrote:
>>>> Hi Johannes,
>>>>
>>>> I hope Dan and Jerry also reply but I have recently started looking at
>>>> this too and have some comments below.
>>>
>>> So if we two are looking into it we might want to work together so we don't do
>>> all the work twice.
>>
>> Definitely!
>>
>>> I've pasted my current work to export HPE DIMM commands at the bottom of the
>>> mail, but I don't like it very much (see below for a proposal of a more
>>> elegant way).
>>>
>>>>
>>>> On 7/27/2016 6:35 AM, Johannes Thumshirn wrote:
>>>>> Hi Dan and Jerry,
>>>>>
>>>>> I'm currently looking into SMART data retrieval on HPE NVDIMMs.
>>>>>
>>>>> After the first obstacle (like getting cat
>>>>> /sys/class/nd/ndctl0/device/nmem0/commands reutrn smart so ndctl will issue
>>>>> the ioctl) I ran into a rather nasty problem. According to [1] HPEDIMMs
>>>>> need the input buffer specially crafted for SMART data, according to [2]
>>>>> Intel DIMMs don't.
>>>>
>>>> It is unfortunate that the DSM functions are different for different NVDIMM
>>>>
>>>> types.  There is a desire for standardization but for now this is what we have.
>>>
>>> Sure, let's see what we can come up with to "fix" the situation.
>>>
>>> I had a quick chat here internally at SUSE and one suggestion was to create
>>> something, let's call it "filter drivers", to support different DSMs for
>>> different NVDIMM families. I actually like the idea and if no-one objects I'll
>>> try to implement some kind of prototype for the "filter drivers".
>>
>> I think a question we need to answer is what should be in the kernel vs. in user
>> space and what the interface between the two should look like.  The kernel
>> doesn't necessarily need to know that function 1 returns health information
>> and function 2 returns threshold information unless it needs the information
>> itself.  I think the intent of cmd_call was just as a pass-through mechanism
>> to keep from adding more code into the kernel.  We could create ndctl functions
>> instead.  I know this is different from how the original Intel DSM functions
>> are implemented but that could be considered legacy code to support existing
>> applications at this point.
>>
>> For history, the first discussions about this approach were on the nvdimm
>> list back in November and it went from there.
>> https://lists.01.org/pipermail/linux-nvdimm/2015-November/002721.html
>>
> 
> To me, the cmd_call mechanism is useful for commands that libnvdimm
> does not and may never care about.  For the generally useful commands
> that the kernel needs to be involved with I'd like to define a common
> kernel command set and do in-kernel translation across the
> different-for-difference sake implementations.  

I don't think any of them are different-for-different sake.

No one has time for busy work.



It takes a lot of work and collaboration to standardize these things

and so far, we're still not even through the potential root device functions.

Standardizing functions across different device types with different

characteristics is even harder.



-- ljk


> It might be good to
> add versioning info in the payload so we can extend results without
> breaking old binaries.
Dan Williams July 28, 2016, 7:57 p.m. UTC | #4
On Thu, Jul 28, 2016 at 11:46 AM, Linda Knippers <linda.knippers@hpe.com> wrote:
>
>
> On 7/28/2016 2:20 PM, Dan Williams wrote:
>> On Thu, Jul 28, 2016 at 10:43 AM, Linda Knippers <linda.knippers@hpe.com> wrote:
>>>
>>>
>>> On 7/28/2016 5:45 AM, Johannes Thumshirn wrote:
>>>> On Wed, Jul 27, 2016 at 02:23:51PM -0400, Linda Knippers wrote:
>>>>> Hi Johannes,
>>>>>
>>>>> I hope Dan and Jerry also reply but I have recently started looking at
>>>>> this too and have some comments below.
>>>>
>>>> So if we two are looking into it we might want to work together so we don't do
>>>> all the work twice.
>>>
>>> Definitely!
>>>
>>>> I've pasted my current work to export HPE DIMM commands at the bottom of the
>>>> mail, but I don't like it very much (see below for a proposal of a more
>>>> elegant way).
>>>>
>>>>>
>>>>> On 7/27/2016 6:35 AM, Johannes Thumshirn wrote:
>>>>>> Hi Dan and Jerry,
>>>>>>
>>>>>> I'm currently looking into SMART data retrieval on HPE NVDIMMs.
>>>>>>
>>>>>> After the first obstacle (like getting cat
>>>>>> /sys/class/nd/ndctl0/device/nmem0/commands reutrn smart so ndctl will issue
>>>>>> the ioctl) I ran into a rather nasty problem. According to [1] HPEDIMMs
>>>>>> need the input buffer specially crafted for SMART data, according to [2]
>>>>>> Intel DIMMs don't.
>>>>>
>>>>> It is unfortunate that the DSM functions are different for different NVDIMM
>>>>>
>>>>> types.  There is a desire for standardization but for now this is what we have.
>>>>
>>>> Sure, let's see what we can come up with to "fix" the situation.
>>>>
>>>> I had a quick chat here internally at SUSE and one suggestion was to create
>>>> something, let's call it "filter drivers", to support different DSMs for
>>>> different NVDIMM families. I actually like the idea and if no-one objects I'll
>>>> try to implement some kind of prototype for the "filter drivers".
>>>
>>> I think a question we need to answer is what should be in the kernel vs. in user
>>> space and what the interface between the two should look like.  The kernel
>>> doesn't necessarily need to know that function 1 returns health information
>>> and function 2 returns threshold information unless it needs the information
>>> itself.  I think the intent of cmd_call was just as a pass-through mechanism
>>> to keep from adding more code into the kernel.  We could create ndctl functions
>>> instead.  I know this is different from how the original Intel DSM functions
>>> are implemented but that could be considered legacy code to support existing
>>> applications at this point.
>>>
>>> For history, the first discussions about this approach were on the nvdimm
>>> list back in November and it went from there.
>>> https://lists.01.org/pipermail/linux-nvdimm/2015-November/002721.html
>>>
>>
>> To me, the cmd_call mechanism is useful for commands that libnvdimm
>> does not and may never care about.  For the generally useful commands
>> that the kernel needs to be involved with I'd like to define a common
>> kernel command set and do in-kernel translation across the
>> different-for-difference sake implementations.
>
> I don't think any of them are different-for-different sake.
>
> No one has time for busy work.
>
>
>
> It takes a lot of work and collaboration to standardize these things
>
> and so far, we're still not even through the potential root device functions.
>
> Standardizing functions across different device types with different
>
> characteristics is even harder.

I understand that it is hard, we are where we are due to expediency,
and Linux now supports all the known first generation command sets.
The "different for difference sake" comment was made with my Linux hat
on where I see the kernel supporting the _INTEL and _HPE2 command sets
which are frustratingly similar.

Linux has always been a vehicle for discovering and refactoring
commonality across vendors.  We can use a common Linux command format
to inform the standardization discussion going forward.
Johannes Thumshirn July 29, 2016, 11:55 a.m. UTC | #5
On Thu, Jul 28, 2016 at 01:43:17PM -0400, Linda Knippers wrote:

[...]

> > +acpi_nfit_decode_dsm_mask_from_hpe2(unsigned long dsm_mask)
> > +{
> > +	unsigned long cmd_mask = 0;
> > +
> > +	if (test_bit(1, &dsm_mask))
> > +		set_bit(ND_CMD_SMART, &dsm_mask);
> > +	if (test_bit(2, &dsm_mask))
> > +		set_bit(ND_CMD_SMART_THRESHOLD, &dsm_mask);
> > +	if (test_bit(4, &dsm_mask))
> > +		set_bit(ND_CMD_DIMM_FLAGS, &dsm_mask);
> > +	if (test_bit(6, &dsm_mask))
> > +		set_bit(ND_CMD_VENDOR_EFFECT_LOG_SIZE, &dsm_mask);
> > +	if (test_bit(7, &dsm_mask))
> > +		set_bit(ND_CMD_VENDOR_EFFECT_LOG, &dsm_mask);
> > +	if (test_bit(8, &dsm_mask))
> > +		set_bit(ND_CMD_VENDOR, &dsm_mask);
> > +
> 
> You're actually setting dsm_mask, not cmd_mask above.

*duh* good catch.

> 
> I'm not sure I see the value of re-using the ND_CMD_* values that are
> defined for the Intel DSM where they happen to be the same as for other
> DSM vs. just defining a new set of values that match each DSM implementation.
> 
> Or are you trying to map ND_CMD_SMART from user space into the function
> number that implements that command for the particular DSM?
> 
> I think the only way to hid the implementation differences, including
> the function differences, from user space would be to define generic
> functions with generic output buffers that can represent all of the
> data returned by any of the DSMs in a standard format.
> 

I'd like to go down the "map what can be mapped and create new commands
for things we don't have yet" road. And yes I've been looking into the
Microsoft DSM spec just to discover they define 31 methods and I'm not
sure if we can map them. This is why I wrote the patches in a "draw on the
whiteboard" style so you and Dan can comment early enough.

> > +	return cmd_mask;
> > +}
> > +
> > +static unsigned long (*decoder_funcs[])(unsigned long dsm_mask) = {
> > +	[NVDIMM_FAMILY_INTEL] = acpi_nfit_decode_dsm_mask_from_intel,
> > +	[NVDIMM_FAMILY_HPE1] = acpi_nfit_decode_dsm_mask_from_hpe1,
> > +	[NVDIMM_FAMILY_HPE2] = acpi_nfit_decode_dsm_mask_from_hpe2,
> 
> I think Dan just pushed the code for yet another family.
> 

Yes I know, but my development tree doesn't have it merged yet.

Anyways, how about this approach (for NFIT_FAMILY_HPE1, NFIT_FAMILY_INTEL also
started)?

I think this is the way we should go to a) handle pre-standard differences in
the kernel and b) don't end up with massively bloated object files as that
have code for multiple DSM specs all in one file.

/*
 * Copyright(c) 2016 SUSE Linux GmbH. All rights reserved.
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of version 2 of the GNU General Public License as
 * published by the Free Software Foundation.
 *
 * This program is distributed in the hope that it will be useful, but
 * WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 * General Public License for more details.
 */
#include <linux/module.h>
#include <linux/acpi.h>
#include <linux/nvdimm.h>

#include "dsm_filter.h"


static int
nfit_hpe1_acpi_ctl(struct nvdimm_bus_descriptor *nd_desc, struct nvdimm *nvdimm,
		   unsigned int cmd, void *buf, unsigned int buf_len,
		   int *cmd_rc)
{
	/* do magic here */
}

static unsigned long nfit_hpe1_get_command_mask(unsigned long dsm_mask)
{
	unsigned long cmd_mask = 0;

	if (test_bit(1, &dsm_mask))
		set_bit(ND_CMD_SMART, &cmd_mask);
	if (test_bit(2, &dsm_mask))
		set_bit(ND_CMD_SMART_THRESHOLD, &cmd_mask);

	return cmd_mask;
}

static nfit_dsm_ops hpe1_dsm_ops = {
	.get_command_mask = nfit_hpe1_get_command_mask,
	.acpi_nfit_ctl = nfit_hpe1_acpi_ctl,
};

static in nfit_hpe1_probe(struct nfit_dsm_device *ddev, const u8 *uuid)
{
	return 0;
}

static void nfit_hpe1_remove(struct nfit_dsm_device *ddev)
{
	return;
}

static u8 *nfit_hpe1_uuids[16] = {
	"9002c334-acf3-4c0e-9642-a235f0d53bc6",
	0,
};
MODULE_DEVICE_TABLE(nfit_dsm, nfit_hpe1_uuids);

static struct nfit_dsm_driver hpe1_dsm_driver = {
	.driver = {
		.name = "nfit_dsm_hpe1",
		.owner = THIS_MODULE,
	},
	.probe = nfit_hpe1_probe,
	.remove = nfit_hpe1_remove,
	.uuids = nfit_hpe1_uuids,
};

static int __init nfit_dsm_hpe1_init(void)
{
	return nfit_register_dsm_filter(&hpe1_dsm_driver);
}
module_init(nfit_dsm_hpe1_init);

static void __exit nfit_dsm_hpe1_exit(void)
{
	nfit_unregister_dsm_filter(&hpe1_dsm_driver);
}
module_exit(nfit_dsm_hpe1_exit);

MODULE_LICENSE("GPL v2");
MODULE_AUTHOR("Johannes Thumshirn <jthumshirn@suse.de>");
MODULE_DESCRIPTION("NFIT DSM filter driver for HPE Type 1 NVDIMMs");

What's missing in here is a) the glue code to drivers/acpi/nfit.c and b) the
implementations of HPE1 DSMs.

Once I have these two I'll post a RFC patch series to the ML, but in the
meanwhile I'm already open to comments.

The glue code to nfit will be a bus so we have all the nice and shiny .probe()
.remove() .uevent() and what not.

Thanks,
	Johannes
Linda Knippers Aug. 1, 2016, 8:57 p.m. UTC | #6
On 7/29/2016 7:55 AM, Johannes Thumshirn wrote:
> On Thu, Jul 28, 2016 at 01:43:17PM -0400, Linda Knippers wrote:
> 
> [...]
>>
>> I'm not sure I see the value of re-using the ND_CMD_* values that are
>> defined for the Intel DSM where they happen to be the same as for other
>> DSM vs. just defining a new set of values that match each DSM implementation.
>>
>> Or are you trying to map ND_CMD_SMART from user space into the function
>> number that implements that command for the particular DSM?
>>
>> I think the only way to hid the implementation differences, including
>> the function differences, from user space would be to define generic
>> functions with generic output buffers that can represent all of the
>> data returned by any of the DSMs in a standard format.
>>
> 
> I'd like to go down the "map what can be mapped and create new commands
> for things we don't have yet" road. And yes I've been looking into the
> Microsoft DSM spec just to discover they define 31 methods and I'm not
> sure if we can map them. This is why I wrote the patches in a "draw on the
> whiteboard" style so you and Dan can comment early enough.
> 
>>> +	return cmd_mask;
>>> +}
>>> +
>>> +static unsigned long (*decoder_funcs[])(unsigned long dsm_mask) = {
>>> +	[NVDIMM_FAMILY_INTEL] = acpi_nfit_decode_dsm_mask_from_intel,
>>> +	[NVDIMM_FAMILY_HPE1] = acpi_nfit_decode_dsm_mask_from_hpe1,
>>> +	[NVDIMM_FAMILY_HPE2] = acpi_nfit_decode_dsm_mask_from_hpe2,
>>
>> I think Dan just pushed the code for yet another family.
>>
> 
> Yes I know, but my development tree doesn't have it merged yet.
> 
> Anyways, how about this approach (for NFIT_FAMILY_HPE1, NFIT_FAMILY_INTEL also
> started)?
> 
> I think this is the way we should go to a) handle pre-standard differences in
> the kernel and b) don't end up with massively bloated object files as that
> have code for multiple DSM specs all in one file.
> 
> /*
>  * Copyright(c) 2016 SUSE Linux GmbH. All rights reserved.
>  *
>  * This program is free software; you can redistribute it and/or modify
>  * it under the terms of version 2 of the GNU General Public License as
>  * published by the Free Software Foundation.
>  *
>  * This program is distributed in the hope that it will be useful, but
>  * WITHOUT ANY WARRANTY; without even the implied warranty of
>  * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>  * General Public License for more details.
>  */
> #include <linux/module.h>
> #include <linux/acpi.h>
> #include <linux/nvdimm.h>
> 
> #include "dsm_filter.h"
> 
> 
> static int
> nfit_hpe1_acpi_ctl(struct nvdimm_bus_descriptor *nd_desc, struct nvdimm *nvdimm,
> 		   unsigned int cmd, void *buf, unsigned int buf_len,
> 		   int *cmd_rc)
> {
> 	/* do magic here */
> }
> 
> static unsigned long nfit_hpe1_get_command_mask(unsigned long dsm_mask)
> {
> 	unsigned long cmd_mask = 0;
> 
> 	if (test_bit(1, &dsm_mask))
> 		set_bit(ND_CMD_SMART, &cmd_mask);
> 	if (test_bit(2, &dsm_mask))
> 		set_bit(ND_CMD_SMART_THRESHOLD, &cmd_mask);
> 
> 	return cmd_mask;
> }
> 
> static nfit_dsm_ops hpe1_dsm_ops = {
> 	.get_command_mask = nfit_hpe1_get_command_mask,
> 	.acpi_nfit_ctl = nfit_hpe1_acpi_ctl,
> };
> 
> static in nfit_hpe1_probe(struct nfit_dsm_device *ddev, const u8 *uuid)
> {
> 	return 0;
> }
> 
> static void nfit_hpe1_remove(struct nfit_dsm_device *ddev)
> {
> 	return;
> }
> 
> static u8 *nfit_hpe1_uuids[16] = {
> 	"9002c334-acf3-4c0e-9642-a235f0d53bc6",
> 	0,
> };
> MODULE_DEVICE_TABLE(nfit_dsm, nfit_hpe1_uuids);
> 
> static struct nfit_dsm_driver hpe1_dsm_driver = {
> 	.driver = {
> 		.name = "nfit_dsm_hpe1",
> 		.owner = THIS_MODULE,
> 	},
> 	.probe = nfit_hpe1_probe,
> 	.remove = nfit_hpe1_remove,
> 	.uuids = nfit_hpe1_uuids,
> };
> 
> static int __init nfit_dsm_hpe1_init(void)
> {
> 	return nfit_register_dsm_filter(&hpe1_dsm_driver);
> }
> module_init(nfit_dsm_hpe1_init);
> 
> static void __exit nfit_dsm_hpe1_exit(void)
> {
> 	nfit_unregister_dsm_filter(&hpe1_dsm_driver);
> }
> module_exit(nfit_dsm_hpe1_exit);
> 
> MODULE_LICENSE("GPL v2");
> MODULE_AUTHOR("Johannes Thumshirn <jthumshirn@suse.de>");
> MODULE_DESCRIPTION("NFIT DSM filter driver for HPE Type 1 NVDIMMs");
> 
> What's missing in here is a) the glue code to drivers/acpi/nfit.c and b) the
> implementations of HPE1 DSMs.
> 
> Once I have these two I'll post a RFC patch series to the ML, but in the
> meanwhile I'm already open to comments.
> 
> The glue code to nfit will be a bus so we have all the nice and shiny .probe()
> .remove() .uevent() and what not.

There are some things I like about this approach.  What I like most is
that it add modularity.  I think that when the ACPI tables for NVDIMM
devices were designed, the idea was that you could have different types
of NVDIMMs with different capabilities and management requirements, and
have generic as well as device-specific software.  Section 9.20.5
describes the different fields that might be used to distinguish the
hardware and load device or vendor-specific drivers.  We don't have
flexibility in the current code.  Perhaps it makes sense to evolve our
current code base to one where the root functions are in the Generic NVDIMM
management driver and we have vendor-specific management drivers for the
non-root device DSMs.  I don't know if we need that for the other driver
types but management is becoming more complicated without them.

-- ljk
> 
> Thanks,
> 	Johannes
>
Dan Williams Aug. 1, 2016, 10:31 p.m. UTC | #7
On Mon, Aug 1, 2016 at 1:57 PM, Linda Knippers <linda.knippers@hpe.com> wrote:
>> What's missing in here is a) the glue code to drivers/acpi/nfit.c and b) the
>> implementations of HPE1 DSMs.
>>
>> Once I have these two I'll post a RFC patch series to the ML, but in the
>> meanwhile I'm already open to comments.
>>
>> The glue code to nfit will be a bus so we have all the nice and shiny .probe()
>> .remove() .uevent() and what not.
>
> There are some things I like about this approach.  What I like most is
> that it add modularity.  I think that when the ACPI tables for NVDIMM
> devices were designed, the idea was that you could have different types
> of NVDIMMs with different capabilities and management requirements, and
> have generic as well as device-specific software.  Section 9.20.5
> describes the different fields that might be used to distinguish the
> hardware and load device or vendor-specific drivers.  We don't have
> flexibility in the current code.  Perhaps it makes sense to evolve our
> current code base to one where the root functions are in the Generic NVDIMM
> management driver and we have vendor-specific management drivers for the
> non-root device DSMs.  I don't know if we need that for the other driver
> types but management is becoming more complicated without them.

I don't think that is quite what Johannes is proposing.  This isn't a
management model per-vendor, this is a scheme to have the kernel
present unified management model across vendors and minimize the
surface area of the userspace ABI as much as possible.
Linda Knippers Aug. 1, 2016, 10:51 p.m. UTC | #8
On 8/1/2016 6:31 PM, Dan Williams wrote:
> On Mon, Aug 1, 2016 at 1:57 PM, Linda Knippers <linda.knippers@hpe.com> wrote:
>>> What's missing in here is a) the glue code to drivers/acpi/nfit.c and b) the
>>> implementations of HPE1 DSMs.
>>>
>>> Once I have these two I'll post a RFC patch series to the ML, but in the
>>> meanwhile I'm already open to comments.
>>>
>>> The glue code to nfit will be a bus so we have all the nice and shiny .probe()
>>> .remove() .uevent() and what not.
>>
>> There are some things I like about this approach.  What I like most is
>> that it add modularity.  I think that when the ACPI tables for NVDIMM
>> devices were designed, the idea was that you could have different types
>> of NVDIMMs with different capabilities and management requirements, and
>> have generic as well as device-specific software.  Section 9.20.5
>> describes the different fields that might be used to distinguish the
>> hardware and load device or vendor-specific drivers.  We don't have
>> flexibility in the current code.  Perhaps it makes sense to evolve our
>> current code base to one where the root functions are in the Generic NVDIMM
>> management driver and we have vendor-specific management drivers for the
>> non-root device DSMs.  I don't know if we need that for the other driver
>> types but management is becoming more complicated without them.
> 
> I don't think that is quite what Johannes is proposing.  This isn't a
> management model per-vendor, this is a scheme to have the kernel
> present unified management model across vendors and minimize the
> surface area of the userspace ABI as much as possible.

I don't think having multiple modules means there have to be multiple
management models, it just allows for multiple management implementations.
ACPI is all about abstracting differences.  The userspace ABI could be
more generic where the hardware-specific module is responsible for doing
the right thing for that piece of hardware to make it happen, whether
it's getting health information or creating a namespace.  Right now we
have multiple userspace ABIs because the DSMs are exposed directly and
except for the root device, there is no standard.

-- ljk
Dan Williams Aug. 2, 2016, 12:16 a.m. UTC | #9
On Mon, Aug 1, 2016 at 3:51 PM, Linda Knippers <linda.knippers@hpe.com> wrote:
> I don't think having multiple modules means there have to be multiple
> management models, it just allows for multiple management implementations.
> ACPI is all about abstracting differences.

ACPI is also about vendors talking to each other to minimize divergence.
diff mbox

Patch

diff --git a/drivers/acpi/nfit.c b/drivers/acpi/nfit.c
index ddff49d..50a1b81 100644
--- a/drivers/acpi/nfit.c
+++ b/drivers/acpi/nfit.c
@@ -1163,6 +1163,53 @@  static int acpi_nfit_add_dimm(struct acpi_nfit_desc *acpi_desc,
 	return 0;
 }
 
+static inline unsigned long
+acpi_nfit_decode_dsm_mask_from_intel(unsigned long dsm_mask)
+{
+	return dsm_mask;
+}
+
+static inline unsigned long
+acpi_nfit_decode_dsm_mask_from_hpe1(unsigned long dsm_mask)
+{
+	unsigned long cmd_mask = 0;
+
+	if (test_bit(1, &dsm_mask))
+		set_bit(ND_CMD_SMART, &cmd_mask);
+	if (test_bit(2, &dsm_mask))
+		set_bit(ND_CMD_SMART_THRESHOLD, &cmd_mask);
+
+	return cmd_mask;
+}
+
+static inline unsigned long
+acpi_nfit_decode_dsm_mask_from_hpe2(unsigned long dsm_mask)
+{
+	unsigned long cmd_mask = 0;
+
+	if (test_bit(1, &dsm_mask))
+		set_bit(ND_CMD_SMART, &dsm_mask);
+	if (test_bit(2, &dsm_mask))
+		set_bit(ND_CMD_SMART_THRESHOLD, &dsm_mask);
+	if (test_bit(4, &dsm_mask))
+		set_bit(ND_CMD_DIMM_FLAGS, &dsm_mask);
+	if (test_bit(6, &dsm_mask))
+		set_bit(ND_CMD_VENDOR_EFFECT_LOG_SIZE, &dsm_mask);
+	if (test_bit(7, &dsm_mask))
+		set_bit(ND_CMD_VENDOR_EFFECT_LOG, &dsm_mask);
+	if (test_bit(8, &dsm_mask))
+		set_bit(ND_CMD_VENDOR, &dsm_mask);
+
+	return cmd_mask;
+}
+
+static unsigned long (*decoder_funcs[])(unsigned long dsm_mask) = {
+	[NVDIMM_FAMILY_INTEL] = acpi_nfit_decode_dsm_mask_from_intel,
+	[NVDIMM_FAMILY_HPE1] = acpi_nfit_decode_dsm_mask_from_hpe1,
+	[NVDIMM_FAMILY_HPE2] = acpi_nfit_decode_dsm_mask_from_hpe2,
+};
+
+
 static int acpi_nfit_register_dimms(struct acpi_nfit_desc *acpi_desc)
 {
 	struct nfit_mem *nfit_mem;
@@ -1199,8 +1246,16 @@  static int acpi_nfit_register_dimms(struct acpi_nfit_desc *acpi_desc)
 		 * userspace interface.
 		 */
 		cmd_mask = 1UL << ND_CMD_CALL;
-		if (nfit_mem->family == NVDIMM_FAMILY_INTEL)
-			cmd_mask |= nfit_mem->dsm_mask;
+		if (nfit_mem->family >= 0 &&
+		    nfit_mem->family <= NVDIMM_FAMILY_HPE2) {
+			int family = nfit_mem->family;
+			unsigned long dsm_mask = nfit_mem->dsm_mask;
+
+			cmd_mask |= (*decoder_funcs[family])(dsm_mask);
+		} else {
+			dev_dbg(acpi_desc->dev, "Unknown NVDIMM Family %d\n",
+				nfit_mem->family);
+		}
 
 		nvdimm = nvdimm_create(acpi_desc->nvdimm_bus, nfit_mem,
 				acpi_nfit_dimm_attribute_groups,