mbox series

[ndctl,v1,0/8] daxctl: Add device align and range mapping allocation

Message ID 20200716184707.23018-1-joao.m.martins@oracle.com (mailing list archive)
Headers show
Series daxctl: Add device align and range mapping allocation | expand

Message

Joao Martins July 16, 2020, 6:46 p.m. UTC
Hey,

This series builds on top of this one[0] and does the following improvements
to the Soft-Reserved subdivision:

 1) Support for {create,reconfigure}-device for selecting @align (hugepage size).
 Here we add a '-a|--align 4K|2M|1G' option to the existing commands;

 2) Listing improvements for device alignment and mappings;
 Note: Perhaps it is better to hide the mappings by default, and only
       print with -v|--verbose. This would align with ndctl, as the mappings
       info can be quite large.

 3) Allow creating devices from selecting ranges. This allows to keep the
   same GPA->HPA mapping as before we kexec the hypervisor with running guests:

   daxctl list -d dax0.1 > /var/log/dax0.1.json
   kexec -d -l bzImage
   systemctl kexec
   daxctl create -u --restore /var/log/dax0.1.json

   The JSON was what I though it would be easier for an user, given that it is
   the data format daxctl outputs. Alternatives could be adding multiple:
   	--mapping <pgoff>:<start>-<end>

   But that could end up in a gigantic line and a little more
   unmanageable I think.

This series requires this series[0] on top of Dan's patches[1]:

 [0] https://lore.kernel.org/linux-nvdimm/20200716172913.19658-1-joao.m.martins@oracle.com/
 [1] https://lore.kernel.org/linux-nvdimm/159457116473.754248.7879464730875147365.stgit@dwillia2-desk3.amr.corp.intel.com/

The only TODO here is docs and improving tests to validate mappings, and test
the restore path.

Suggestions/comments are welcome.

	Joao

Joao Martins (8):
  daxctl: add daxctl_dev_{get,set}_align()
  util/json: Print device align
  daxctl: add align support in reconfigure-device
  daxctl: add align support in create-device
  libdaxctl: add mapping iterator APIs
  daxctl: include mappings when listing
  libdaxctl: add daxctl_dev_set_mapping()
  daxctl: Allow restore devices from JSON metadata

 daxctl/device.c                | 154 +++++++++++++++++++++++++++++++++++++++--
 daxctl/lib/libdaxctl-private.h |   9 +++
 daxctl/lib/libdaxctl.c         | 152 +++++++++++++++++++++++++++++++++++++++-
 daxctl/lib/libdaxctl.sym       |   9 +++
 daxctl/libdaxctl.h             |  16 +++++
 util/json.c                    |  63 ++++++++++++++++-
 util/json.h                    |   3 +
 7 files changed, 396 insertions(+), 10 deletions(-)

Comments

Joao Martins Dec. 16, 2020, 11:39 a.m. UTC | #1
On 7/16/20 7:46 PM, Joao Martins wrote:
> Hey,
> 
> This series builds on top of this one[0] and does the following improvements
> to the Soft-Reserved subdivision:
> 
>  1) Support for {create,reconfigure}-device for selecting @align (hugepage size).
>  Here we add a '-a|--align 4K|2M|1G' option to the existing commands;
> 
>  2) Listing improvements for device alignment and mappings;
>  Note: Perhaps it is better to hide the mappings by default, and only
>        print with -v|--verbose. This would align with ndctl, as the mappings
>        info can be quite large.
> 
>  3) Allow creating devices from selecting ranges. This allows to keep the
>    same GPA->HPA mapping as before we kexec the hypervisor with running guests:
> 
>    daxctl list -d dax0.1 > /var/log/dax0.1.json
>    kexec -d -l bzImage
>    systemctl kexec
>    daxctl create -u --restore /var/log/dax0.1.json
> 
>    The JSON was what I though it would be easier for an user, given that it is
>    the data format daxctl outputs. Alternatives could be adding multiple:
>    	--mapping <pgoff>:<start>-<end>
> 
>    But that could end up in a gigantic line and a little more
>    unmanageable I think.
> 
> This series requires this series[0] on top of Dan's patches[1]:
> 
>  [0] https://lore.kernel.org/linux-nvdimm/20200716172913.19658-1-joao.m.martins@oracle.com/
>  [1] https://lore.kernel.org/linux-nvdimm/159457116473.754248.7879464730875147365.stgit@dwillia2-desk3.amr.corp.intel.com/
> 
> The only TODO here is docs and improving tests to validate mappings, and test
> the restore path.
> 
> Suggestions/comments are welcome.
> 
There's a couple of issues in this series regarding daxctl-reconfigure options and
breakage of ndctl with kernels (<5.10) that do not supply a device @align upon testing
with NVDIMMs. Plus it is missing daxctl-create.sh unit test for @align.

I will fix those and respin, and probably take out the last patch as it's more RFC-ish and
in need of feedback.

	Joao

> Joao Martins (8):
>   daxctl: add daxctl_dev_{get,set}_align()
>   util/json: Print device align
>   daxctl: add align support in reconfigure-device
>   daxctl: add align support in create-device
>   libdaxctl: add mapping iterator APIs
>   daxctl: include mappings when listing
>   libdaxctl: add daxctl_dev_set_mapping()
>   daxctl: Allow restore devices from JSON metadata
> 
>  daxctl/device.c                | 154 +++++++++++++++++++++++++++++++++++++++--
>  daxctl/lib/libdaxctl-private.h |   9 +++
>  daxctl/lib/libdaxctl.c         | 152 +++++++++++++++++++++++++++++++++++++++-
>  daxctl/lib/libdaxctl.sym       |   9 +++
>  daxctl/libdaxctl.h             |  16 +++++
>  util/json.c                    |  63 ++++++++++++++++-
>  util/json.h                    |   3 +
>  7 files changed, 396 insertions(+), 10 deletions(-)
>
Verma, Vishal L Dec. 16, 2020, 6:42 p.m. UTC | #2
On Wed, 2020-12-16 at 11:39 +0000, Joao Martins wrote:
> On 7/16/20 7:46 PM, Joao Martins wrote:
> > Hey,
> > 
> > This series builds on top of this one[0] and does the following improvements
> > to the Soft-Reserved subdivision:
> > 
> >  1) Support for {create,reconfigure}-device for selecting @align (hugepage size).
> >  Here we add a '-a|--align 4K|2M|1G' option to the existing commands;
> > 
> >  2) Listing improvements for device alignment and mappings;
> >  Note: Perhaps it is better to hide the mappings by default, and only
> >        print with -v|--verbose. This would align with ndctl, as the mappings
> >        info can be quite large.
> > 
> >  3) Allow creating devices from selecting ranges. This allows to keep the
> >    same GPA->HPA mapping as before we kexec the hypervisor with running guests:
> > 
> >    daxctl list -d dax0.1 > /var/log/dax0.1.json
> >    kexec -d -l bzImage
> >    systemctl kexec
> >    daxctl create -u --restore /var/log/dax0.1.json
> > 
> >    The JSON was what I though it would be easier for an user, given that it is
> >    the data format daxctl outputs. Alternatives could be adding multiple:
> >    	--mapping <pgoff>:<start>-<end>
> > 
> >    But that could end up in a gigantic line and a little more
> >    unmanageable I think.
> > 
> > This series requires this series[0] on top of Dan's patches[1]:
> > 
> >  [0] https://lore.kernel.org/linux-nvdimm/20200716172913.19658-1-joao.m.martins@oracle.com/
> >  [1] https://lore.kernel.org/linux-nvdimm/159457116473.754248.7879464730875147365.stgit@dwillia2-desk3.amr.corp.intel.com/
> > 
> > The only TODO here is docs and improving tests to validate mappings, and test
> > the restore path.
> > 
> > Suggestions/comments are welcome.
> > 
> There's a couple of issues in this series regarding daxctl-reconfigure options and
> breakage of ndctl with kernels (<5.10) that do not supply a device @align upon testing
> with NVDIMMs. Plus it is missing daxctl-create.sh unit test for @align.
> 
> I will fix those and respin, and probably take out the last patch as it's more RFC-ish and
> in need of feedback.

Sounds good. Any objections to releasing v70 with the initial support,
and then adding this series on for the next one? I'm thinking I'll do a
much quicker v72 release say in early January with this and anything
else that missed v71.

> 
> 	Joao
> 
> > Joao Martins (8):
> >   daxctl: add daxctl_dev_{get,set}_align()
> >   util/json: Print device align
> >   daxctl: add align support in reconfigure-device
> >   daxctl: add align support in create-device
> >   libdaxctl: add mapping iterator APIs
> >   daxctl: include mappings when listing
> >   libdaxctl: add daxctl_dev_set_mapping()
> >   daxctl: Allow restore devices from JSON metadata
> > 
> >  daxctl/device.c                | 154 +++++++++++++++++++++++++++++++++++++++--
> >  daxctl/lib/libdaxctl-private.h |   9 +++
> >  daxctl/lib/libdaxctl.c         | 152 +++++++++++++++++++++++++++++++++++++++-
> >  daxctl/lib/libdaxctl.sym       |   9 +++
> >  daxctl/libdaxctl.h             |  16 +++++
> >  util/json.c                    |  63 ++++++++++++++++-
> >  util/json.h                    |   3 +
> >  7 files changed, 396 insertions(+), 10 deletions(-)
> >
Dan Williams Dec. 16, 2020, 7:13 p.m. UTC | #3
On Wed, Dec 16, 2020 at 3:41 AM Joao Martins <joao.m.martins@oracle.com> wrote:
>
> On 7/16/20 7:46 PM, Joao Martins wrote:
> > Hey,
> >
> > This series builds on top of this one[0] and does the following improvements
> > to the Soft-Reserved subdivision:
> >
> >  1) Support for {create,reconfigure}-device for selecting @align (hugepage size).
> >  Here we add a '-a|--align 4K|2M|1G' option to the existing commands;
> >
> >  2) Listing improvements for device alignment and mappings;
> >  Note: Perhaps it is better to hide the mappings by default, and only
> >        print with -v|--verbose. This would align with ndctl, as the mappings
> >        info can be quite large.
> >
> >  3) Allow creating devices from selecting ranges. This allows to keep the
> >    same GPA->HPA mapping as before we kexec the hypervisor with running guests:
> >
> >    daxctl list -d dax0.1 > /var/log/dax0.1.json
> >    kexec -d -l bzImage
> >    systemctl kexec
> >    daxctl create -u --restore /var/log/dax0.1.json
> >
> >    The JSON was what I though it would be easier for an user, given that it is
> >    the data format daxctl outputs. Alternatives could be adding multiple:
> >       --mapping <pgoff>:<start>-<end>
> >
> >    But that could end up in a gigantic line and a little more
> >    unmanageable I think.
> >
> > This series requires this series[0] on top of Dan's patches[1]:
> >
> >  [0] https://lore.kernel.org/linux-nvdimm/20200716172913.19658-1-joao.m.martins@oracle.com/
> >  [1] https://lore.kernel.org/linux-nvdimm/159457116473.754248.7879464730875147365.stgit@dwillia2-desk3.amr.corp.intel.com/
> >
> > The only TODO here is docs and improving tests to validate mappings, and test
> > the restore path.
> >
> > Suggestions/comments are welcome.
> >
> There's a couple of issues in this series regarding daxctl-reconfigure options and
> breakage of ndctl with kernels (<5.10) that do not supply a device @align upon testing
> with NVDIMMs. Plus it is missing daxctl-create.sh unit test for @align.

What's the breakage with older kernels, is it the kernel regressing
old daxctl, or is it new daxctl being incompatible with old kernels?
If it's the latter, it needs a fixup, if it's the former it needs a
kernel compat change.
Joao Martins Dec. 16, 2020, 9:35 p.m. UTC | #4
On 12/16/20 7:13 PM, Dan Williams wrote:
> On Wed, Dec 16, 2020 at 3:41 AM Joao Martins <joao.m.martins@oracle.com> wrote:
>>
>> On 7/16/20 7:46 PM, Joao Martins wrote:
>>> Hey,
>>>
>>> This series builds on top of this one[0] and does the following improvements
>>> to the Soft-Reserved subdivision:
>>>
>>>  1) Support for {create,reconfigure}-device for selecting @align (hugepage size).
>>>  Here we add a '-a|--align 4K|2M|1G' option to the existing commands;
>>>
>>>  2) Listing improvements for device alignment and mappings;
>>>  Note: Perhaps it is better to hide the mappings by default, and only
>>>        print with -v|--verbose. This would align with ndctl, as the mappings
>>>        info can be quite large.
>>>
>>>  3) Allow creating devices from selecting ranges. This allows to keep the
>>>    same GPA->HPA mapping as before we kexec the hypervisor with running guests:
>>>
>>>    daxctl list -d dax0.1 > /var/log/dax0.1.json
>>>    kexec -d -l bzImage
>>>    systemctl kexec
>>>    daxctl create -u --restore /var/log/dax0.1.json
>>>
>>>    The JSON was what I though it would be easier for an user, given that it is
>>>    the data format daxctl outputs. Alternatives could be adding multiple:
>>>       --mapping <pgoff>:<start>-<end>
>>>
>>>    But that could end up in a gigantic line and a little more
>>>    unmanageable I think.
>>>
>>> This series requires this series[0] on top of Dan's patches[1]:
>>>
>>>  [0] https://lore.kernel.org/linux-nvdimm/20200716172913.19658-1-joao.m.martins@oracle.com/
>>>  [1] https://lore.kernel.org/linux-nvdimm/159457116473.754248.7879464730875147365.stgit@dwillia2-desk3.amr.corp.intel.com/
>>>
>>> The only TODO here is docs and improving tests to validate mappings, and test
>>> the restore path.
>>>
>>> Suggestions/comments are welcome.
>>>
>> There's a couple of issues in this series regarding daxctl-reconfigure options and
>> breakage of ndctl with kernels (<5.10) that do not supply a device @align upon testing
>> with NVDIMMs. Plus it is missing daxctl-create.sh unit test for @align.
> 
> What's the breakage with older kernels, is it the kernel regressing
> old daxctl, or is it new daxctl being incompatible with old kernels?
> If it's the latter, it needs a fixup, if it's the former it needs a
> kernel compat change.

It's the latter i.e. new daxctl being incompatible with old kernels, because of a change
in the first patch. Essentially a wrong assumption of device align being always available
in add_dax_dev().

The fixup would be this snip below to the first patch. But I will respin the first four
patches today or my morning tomorrow, with a test.

diff --git a/daxctl/lib/libdaxctl.c b/daxctl/lib/libdaxctl.c
index 14bf48dd00bf..b01cc916eb6e 100644
--- a/daxctl/lib/libdaxctl.c
+++ b/daxctl/lib/libdaxctl.c
@@ -498,10 +498,12 @@ static void *add_dax_dev(void *parent, int id, const char *daxdev_base)
                goto err_read;
        dev->size = strtoull(buf, NULL, 0);

+       /* Device align attribute is only available in v5.10 or up */
        sprintf(path, "%s/align", daxdev_base);
-       if (sysfs_read_attr(ctx, path, buf) < 0)
-               goto err_read;
-       dev->align = strtoull(buf, NULL, 0);
+       if (!sysfs_read_attr(ctx, path, buf))
+               dev->align = strtoull(buf, NULL, 0);
+       else
+               dev->align = 0;

        dev->dev_path = strdup(daxdev_base);
        if (!dev->dev_path)
Joao Martins Dec. 16, 2020, 9:49 p.m. UTC | #5
On 12/16/20 6:42 PM, Verma, Vishal L wrote:
> On Wed, 2020-12-16 at 11:39 +0000, Joao Martins wrote:
>> On 7/16/20 7:46 PM, Joao Martins wrote:
>>> Hey,
>>>
>>> This series builds on top of this one[0] and does the following improvements
>>> to the Soft-Reserved subdivision:
>>>
>>>  1) Support for {create,reconfigure}-device for selecting @align (hugepage size).
>>>  Here we add a '-a|--align 4K|2M|1G' option to the existing commands;
>>>
>>>  2) Listing improvements for device alignment and mappings;
>>>  Note: Perhaps it is better to hide the mappings by default, and only
>>>        print with -v|--verbose. This would align with ndctl, as the mappings
>>>        info can be quite large.
>>>
>>>  3) Allow creating devices from selecting ranges. This allows to keep the
>>>    same GPA->HPA mapping as before we kexec the hypervisor with running guests:
>>>
>>>    daxctl list -d dax0.1 > /var/log/dax0.1.json
>>>    kexec -d -l bzImage
>>>    systemctl kexec
>>>    daxctl create -u --restore /var/log/dax0.1.json
>>>
>>>    The JSON was what I though it would be easier for an user, given that it is
>>>    the data format daxctl outputs. Alternatives could be adding multiple:
>>>    	--mapping <pgoff>:<start>-<end>
>>>
>>>    But that could end up in a gigantic line and a little more
>>>    unmanageable I think.
>>>
>>> This series requires this series[0] on top of Dan's patches[1]:
>>>
>>>  [0] https://lore.kernel.org/linux-nvdimm/20200716172913.19658-1-joao.m.martins@oracle.com/
>>>  [1] https://lore.kernel.org/linux-nvdimm/159457116473.754248.7879464730875147365.stgit@dwillia2-desk3.amr.corp.intel.com/
>>>
>>> The only TODO here is docs and improving tests to validate mappings, and test
>>> the restore path.
>>>
>>> Suggestions/comments are welcome.
>>>
>> There's a couple of issues in this series regarding daxctl-reconfigure options and
>> breakage of ndctl with kernels (<5.10) that do not supply a device @align upon testing
>> with NVDIMMs. Plus it is missing daxctl-create.sh unit test for @align.
>>
>> I will fix those and respin, and probably take out the last patch as it's more RFC-ish and
>> in need of feedback.
> 
> Sounds good. Any objections to releasing v70 with the initial support,
> and then adding this series on for the next one? I'm thinking I'll do a
> much quicker v72 release say in early January with this and anything
> else that missed v71.

If we're able to wait until tomorrow, I could respin these first four patches with the
fixes and include the align support in the initial set. Otherwise, I am also good if you
prefer defering it to v72.

	Joao
Dan Williams Dec. 16, 2020, 10:31 p.m. UTC | #6
On Wed, Dec 16, 2020 at 1:51 PM Joao Martins <joao.m.martins@oracle.com> wrote:
>
> On 12/16/20 6:42 PM, Verma, Vishal L wrote:
> > On Wed, 2020-12-16 at 11:39 +0000, Joao Martins wrote:
> >> On 7/16/20 7:46 PM, Joao Martins wrote:
> >>> Hey,
> >>>
> >>> This series builds on top of this one[0] and does the following improvements
> >>> to the Soft-Reserved subdivision:
> >>>
> >>>  1) Support for {create,reconfigure}-device for selecting @align (hugepage size).
> >>>  Here we add a '-a|--align 4K|2M|1G' option to the existing commands;
> >>>
> >>>  2) Listing improvements for device alignment and mappings;
> >>>  Note: Perhaps it is better to hide the mappings by default, and only
> >>>        print with -v|--verbose. This would align with ndctl, as the mappings
> >>>        info can be quite large.
> >>>
> >>>  3) Allow creating devices from selecting ranges. This allows to keep the
> >>>    same GPA->HPA mapping as before we kexec the hypervisor with running guests:
> >>>
> >>>    daxctl list -d dax0.1 > /var/log/dax0.1.json
> >>>    kexec -d -l bzImage
> >>>    systemctl kexec
> >>>    daxctl create -u --restore /var/log/dax0.1.json
> >>>
> >>>    The JSON was what I though it would be easier for an user, given that it is
> >>>    the data format daxctl outputs. Alternatives could be adding multiple:
> >>>     --mapping <pgoff>:<start>-<end>
> >>>
> >>>    But that could end up in a gigantic line and a little more
> >>>    unmanageable I think.
> >>>
> >>> This series requires this series[0] on top of Dan's patches[1]:
> >>>
> >>>  [0] https://lore.kernel.org/linux-nvdimm/20200716172913.19658-1-joao.m.martins@oracle.com/
> >>>  [1] https://lore.kernel.org/linux-nvdimm/159457116473.754248.7879464730875147365.stgit@dwillia2-desk3.amr.corp.intel.com/
> >>>
> >>> The only TODO here is docs and improving tests to validate mappings, and test
> >>> the restore path.
> >>>
> >>> Suggestions/comments are welcome.
> >>>
> >> There's a couple of issues in this series regarding daxctl-reconfigure options and
> >> breakage of ndctl with kernels (<5.10) that do not supply a device @align upon testing
> >> with NVDIMMs. Plus it is missing daxctl-create.sh unit test for @align.
> >>
> >> I will fix those and respin, and probably take out the last patch as it's more RFC-ish and
> >> in need of feedback.
> >
> > Sounds good. Any objections to releasing v70 with the initial support,
> > and then adding this series on for the next one? I'm thinking I'll do a
> > much quicker v72 release say in early January with this and anything
> > else that missed v71.
>
> If we're able to wait until tomorrow, I could respin these first four patches with the
> fixes and include the align support in the initial set. Otherwise, I am also good if you
> prefer defering it to v72.
>

Does this change the JSON output? Might be nice to keep that all in
one update rather than invalidate some testing with the old format
betweem v71 and v72.
Joao Martins Dec. 16, 2020, 10:53 p.m. UTC | #7
On 12/16/20 10:31 PM, Dan Williams wrote:
> On Wed, Dec 16, 2020 at 1:51 PM Joao Martins <joao.m.martins@oracle.com> wrote:
>>
>> On 12/16/20 6:42 PM, Verma, Vishal L wrote:
>>> On Wed, 2020-12-16 at 11:39 +0000, Joao Martins wrote:
>>>> On 7/16/20 7:46 PM, Joao Martins wrote:
>>>>> Hey,
>>>>>
>>>>> This series builds on top of this one[0] and does the following improvements
>>>>> to the Soft-Reserved subdivision:
>>>>>
>>>>>  1) Support for {create,reconfigure}-device for selecting @align (hugepage size).
>>>>>  Here we add a '-a|--align 4K|2M|1G' option to the existing commands;
>>>>>
>>>>>  2) Listing improvements for device alignment and mappings;
>>>>>  Note: Perhaps it is better to hide the mappings by default, and only
>>>>>        print with -v|--verbose. This would align with ndctl, as the mappings
>>>>>        info can be quite large.
>>>>>
>>>>>  3) Allow creating devices from selecting ranges. This allows to keep the
>>>>>    same GPA->HPA mapping as before we kexec the hypervisor with running guests:
>>>>>
>>>>>    daxctl list -d dax0.1 > /var/log/dax0.1.json
>>>>>    kexec -d -l bzImage
>>>>>    systemctl kexec
>>>>>    daxctl create -u --restore /var/log/dax0.1.json
>>>>>
>>>>>    The JSON was what I though it would be easier for an user, given that it is
>>>>>    the data format daxctl outputs. Alternatives could be adding multiple:
>>>>>     --mapping <pgoff>:<start>-<end>
>>>>>
>>>>>    But that could end up in a gigantic line and a little more
>>>>>    unmanageable I think.
>>>>>
>>>>> This series requires this series[0] on top of Dan's patches[1]:
>>>>>
>>>>>  [0] https://lore.kernel.org/linux-nvdimm/20200716172913.19658-1-joao.m.martins@oracle.com/
>>>>>  [1] https://lore.kernel.org/linux-nvdimm/159457116473.754248.7879464730875147365.stgit@dwillia2-desk3.amr.corp.intel.com/
>>>>>
>>>>> The only TODO here is docs and improving tests to validate mappings, and test
>>>>> the restore path.
>>>>>
>>>>> Suggestions/comments are welcome.
>>>>>
>>>> There's a couple of issues in this series regarding daxctl-reconfigure options and
>>>> breakage of ndctl with kernels (<5.10) that do not supply a device @align upon testing
>>>> with NVDIMMs. Plus it is missing daxctl-create.sh unit test for @align.
>>>>
>>>> I will fix those and respin, and probably take out the last patch as it's more RFC-ish and
>>>> in need of feedback.
>>>
>>> Sounds good. Any objections to releasing v70 with the initial support,
>>> and then adding this series on for the next one? I'm thinking I'll do a
>>> much quicker v72 release say in early January with this and anything
>>> else that missed v71.
>>
>> If we're able to wait until tomorrow, I could respin these first four patches with the
>> fixes and include the align support in the initial set. Otherwise, I am also good if you
>> prefer defering it to v72.
>>
> 
> Does this change the JSON output? Might be nice to keep that all in
> one update rather than invalidate some testing with the old format
> betweem v71 and v72.
> 
Ugh, sent the v2 too early before seeing this.

The only change to the output is on daxctl when listing devices for 5.10+.
It starts displaying an "align" key/value.

	Joao
Dan Williams Dec. 16, 2020, 11:42 p.m. UTC | #8
On Wed, Dec 16, 2020 at 2:54 PM Joao Martins <joao.m.martins@oracle.com> wrote:
>
> On 12/16/20 10:31 PM, Dan Williams wrote:
> > On Wed, Dec 16, 2020 at 1:51 PM Joao Martins <joao.m.martins@oracle.com> wrote:
> >>
> >> On 12/16/20 6:42 PM, Verma, Vishal L wrote:
> >>> On Wed, 2020-12-16 at 11:39 +0000, Joao Martins wrote:
> >>>> On 7/16/20 7:46 PM, Joao Martins wrote:
> >>>>> Hey,
> >>>>>
> >>>>> This series builds on top of this one[0] and does the following improvements
> >>>>> to the Soft-Reserved subdivision:
> >>>>>
> >>>>>  1) Support for {create,reconfigure}-device for selecting @align (hugepage size).
> >>>>>  Here we add a '-a|--align 4K|2M|1G' option to the existing commands;
> >>>>>
> >>>>>  2) Listing improvements for device alignment and mappings;
> >>>>>  Note: Perhaps it is better to hide the mappings by default, and only
> >>>>>        print with -v|--verbose. This would align with ndctl, as the mappings
> >>>>>        info can be quite large.
> >>>>>
> >>>>>  3) Allow creating devices from selecting ranges. This allows to keep the
> >>>>>    same GPA->HPA mapping as before we kexec the hypervisor with running guests:
> >>>>>
> >>>>>    daxctl list -d dax0.1 > /var/log/dax0.1.json
> >>>>>    kexec -d -l bzImage
> >>>>>    systemctl kexec
> >>>>>    daxctl create -u --restore /var/log/dax0.1.json
> >>>>>
> >>>>>    The JSON was what I though it would be easier for an user, given that it is
> >>>>>    the data format daxctl outputs. Alternatives could be adding multiple:
> >>>>>     --mapping <pgoff>:<start>-<end>
> >>>>>
> >>>>>    But that could end up in a gigantic line and a little more
> >>>>>    unmanageable I think.
> >>>>>
> >>>>> This series requires this series[0] on top of Dan's patches[1]:
> >>>>>
> >>>>>  [0] https://lore.kernel.org/linux-nvdimm/20200716172913.19658-1-joao.m.martins@oracle.com/
> >>>>>  [1] https://lore.kernel.org/linux-nvdimm/159457116473.754248.7879464730875147365.stgit@dwillia2-desk3.amr.corp.intel.com/
> >>>>>
> >>>>> The only TODO here is docs and improving tests to validate mappings, and test
> >>>>> the restore path.
> >>>>>
> >>>>> Suggestions/comments are welcome.
> >>>>>
> >>>> There's a couple of issues in this series regarding daxctl-reconfigure options and
> >>>> breakage of ndctl with kernels (<5.10) that do not supply a device @align upon testing
> >>>> with NVDIMMs. Plus it is missing daxctl-create.sh unit test for @align.
> >>>>
> >>>> I will fix those and respin, and probably take out the last patch as it's more RFC-ish and
> >>>> in need of feedback.
> >>>
> >>> Sounds good. Any objections to releasing v70 with the initial support,
> >>> and then adding this series on for the next one? I'm thinking I'll do a
> >>> much quicker v72 release say in early January with this and anything
> >>> else that missed v71.
> >>
> >> If we're able to wait until tomorrow, I could respin these first four patches with the
> >> fixes and include the align support in the initial set. Otherwise, I am also good if you
> >> prefer defering it to v72.
> >>
> >
> > Does this change the JSON output? Might be nice to keep that all in
> > one update rather than invalidate some testing with the old format
> > betweem v71 and v72.
> >
> Ugh, sent the v2 too early before seeing this.
>
> The only change to the output is on daxctl when listing devices for 5.10+.
> It starts displaying an "align" key/value.

No worries. Vishal and I chatted and it looks to me like your
improvements to the provisioning flow are worthwhile to sneak into the
v71 release if you want to include those changes as well.
Joao Martins Dec. 17, 2020, 11:23 a.m. UTC | #9
On 12/16/20 11:42 PM, Dan Williams wrote:
> On Wed, Dec 16, 2020 at 2:54 PM Joao Martins <joao.m.martins@oracle.com> wrote:
>> On 12/16/20 10:31 PM, Dan Williams wrote:
>>> On Wed, Dec 16, 2020 at 1:51 PM Joao Martins <joao.m.martins@oracle.com> wrote:
>>>> On 12/16/20 6:42 PM, Verma, Vishal L wrote:
>>>>> On Wed, 2020-12-16 at 11:39 +0000, Joao Martins wrote:
>>>>>> On 7/16/20 7:46 PM, Joao Martins wrote:
>>>>>>> Hey,
>>>>>>>
>>>>>>> This series builds on top of this one[0] and does the following improvements
>>>>>>> to the Soft-Reserved subdivision:
>>>>>>>
>>>>>>>  1) Support for {create,reconfigure}-device for selecting @align (hugepage size).
>>>>>>>  Here we add a '-a|--align 4K|2M|1G' option to the existing commands;
>>>>>>>
>>>>>>>  2) Listing improvements for device alignment and mappings;
>>>>>>>  Note: Perhaps it is better to hide the mappings by default, and only
>>>>>>>        print with -v|--verbose. This would align with ndctl, as the mappings
>>>>>>>        info can be quite large.
>>>>>>>
>>>>>>>  3) Allow creating devices from selecting ranges. This allows to keep the
>>>>>>>    same GPA->HPA mapping as before we kexec the hypervisor with running guests:
>>>>>>>
>>>>>>>    daxctl list -d dax0.1 > /var/log/dax0.1.json
>>>>>>>    kexec -d -l bzImage
>>>>>>>    systemctl kexec
>>>>>>>    daxctl create -u --restore /var/log/dax0.1.json
>>>>>>>
>>>>>>>    The JSON was what I though it would be easier for an user, given that it is
>>>>>>>    the data format daxctl outputs. Alternatives could be adding multiple:
>>>>>>>     --mapping <pgoff>:<start>-<end>
>>>>>>>
>>>>>>>    But that could end up in a gigantic line and a little more
>>>>>>>    unmanageable I think.
>>>>>>>
>>>>>>> This series requires this series[0] on top of Dan's patches[1]:
>>>>>>>
>>>>>>>  [0] https://lore.kernel.org/linux-nvdimm/20200716172913.19658-1-joao.m.martins@oracle.com/
>>>>>>>  [1] https://lore.kernel.org/linux-nvdimm/159457116473.754248.7879464730875147365.stgit@dwillia2-desk3.amr.corp.intel.com/
>>>>>>>
>>>>>>> The only TODO here is docs and improving tests to validate mappings, and test
>>>>>>> the restore path.
>>>>>>>
>>>>>>> Suggestions/comments are welcome.
>>>>>>>
>>>>>> There's a couple of issues in this series regarding daxctl-reconfigure options and
>>>>>> breakage of ndctl with kernels (<5.10) that do not supply a device @align upon testing
>>>>>> with NVDIMMs. Plus it is missing daxctl-create.sh unit test for @align.
>>>>>>
>>>>>> I will fix those and respin, and probably take out the last patch as it's more RFC-ish and
>>>>>> in need of feedback.
>>>>>
>>>>> Sounds good. Any objections to releasing v70 with the initial support,
>>>>> and then adding this series on for the next one? I'm thinking I'll do a
>>>>> much quicker v72 release say in early January with this and anything
>>>>> else that missed v71.
>>>>
>>>> If we're able to wait until tomorrow, I could respin these first four patches with the
>>>> fixes and include the align support in the initial set. Otherwise, I am also good if you
>>>> prefer defering it to v72.
>>>
>>> Does this change the JSON output? Might be nice to keep that all in
>>> one update rather than invalidate some testing with the old format
>>> betweem v71 and v72.
>>>
>> Ugh, sent the v2 too early before seeing this.
>>
>> The only change to the output is on daxctl when listing devices for 5.10+.
>> It starts displaying an "align" key/value.
> 
> No worries. Vishal and I chatted and it looks to me like your
> improvements to the provisioning flow are worthwhile to sneak into the
> v71 release if you want to include those changes as well.

The provisioning flow additions has some questions open about the daxctl
user interface. To summarize:

1) Should we always list mappings, or only list them with a -v option? Or
maybe instead of -v to use instead a new -M option which enables the
listing of mappings?

The reason being that it can get quite verbose with a device picks a lot
of mappings, hence I would imagine this info isn't necessary for the casual
daxctl listing.

2) Does the '--restore <file.json>' should instead be called it
instead '--device'? I feel the name '--restore' is too tied to one specific
way of using it when the feature can be used by a tool which wants to manage
its own range mappings. Additionally, I was thinking that if one wants to
manually add/fixup ranges more easily that we would add
a --mapping <pgoff>:<start>-<end> sort of syntax. But I suppose this could
be added later if its really desired.

And with these clarified, I could respin it over. Oh and I'm missing a
mappings test as well.

It's worth mentioning that kexec will need fixing, as dax_hmem regions
created with HMAT or manually assigned with efi_fake_mem= get lost on
kexec because we do not pass the EFI_MEMMAP conventional memory ranges
to the second kernel (only runtime code/boot services). I have a
RFC patch for x86 efi handling, but should get that conversation started
after holidays.

	Joao
Verma, Vishal L Dec. 17, 2020, 8:18 p.m. UTC | #10
On Thu, 2020-12-17 at 11:23 +0000, Joao Martins wrote:
> 
> The provisioning flow additions has some questions open about the daxctl
> user interface. To summarize:
> 
> 1) Should we always list mappings, or only list them with a -v option? Or
> maybe instead of -v to use instead a new -M option which enables the
> listing of mappings?
> 
> The reason being that it can get quite verbose with a device picks a lot
> of mappings, hence I would imagine this info isn't necessary for the casual
> daxctl listing.

I think hiding them behind a new option is probably best. And then we
can have different verbosity levels turn on or off. The verbosity levels
stuff is implemented in ndctl, but I don't think it is in daxctl yet, so
we can just do the specific option to display mappings for now, and then
revisit verbosity levels in the future if we feel like the number of
options is getting out of hand.

> 
> 2) Does the '--restore <file.json>' should instead be called it
> instead '--device'? I feel the name '--restore' is too tied to one specific
> way of using it when the feature can be used by a tool which wants to manage

Hm, I looked at other commands that take an input file - write labels
just calls it --input, so there might be value in staying consistent
with that. But write-infoblock just uses stdin - so that could be
another option. I'd be fine with either of those.

> its own range mappings. Additionally, I was thinking that if one wants to
> manually add/fixup ranges more easily that we would add
> a --mapping <pgoff>:<start>-<end> sort of syntax. But I suppose this could
> be added later if its really desired.

Agreed with adding this later if needed.

> 
> And with these clarified, I could respin it over. Oh and I'm missing a
> mappings test as well.

Sounds good I'll wait to get these in.

> 
> It's worth mentioning that kexec will need fixing, as dax_hmem regions
> created with HMAT or manually assigned with efi_fake_mem= get lost on
> kexec because we do not pass the EFI_MEMMAP conventional memory ranges
> to the second kernel (only runtime code/boot services). I have a
> RFC patch for x86 efi handling, but should get that conversation started
> after holidays.
> 
> 	Joao