Message ID: 20220804045029.2905056-1-dave@stgolabs.net
Series: cxl: BG operations and device sanitation
*sigh* Cc Dave.

On Wed, 03 Aug 2022, Davidlohr Bueso wrote:

>Hello,
>
>The following is a followup to some of the discussions around CXL device security
>and sanitation[0, 1]. It is marked as RFC mostly to see if the background handling
>will satisfy all users, not just sanitize/overwrite. For example there have been ideas
>to avoid a command hogging the device and/or interleaving scan media regions instead
>of doing it all in one go, etc. More details in each patch, but:
>
>Patch 1 adds the required background cmd handling bits. While this is currently
>only polling, it would be good to know if there are any fundamental blockers for
>supporting irqs (beyond just background ops) between PCIe and CXL. For example,
>are there any requirements/difficulties that are not the standard MSI/MSI-X PCI
>vector allocation + devm_request_irq()? I have not looked into this in detail, but
>the topic has come up in the past as 'delicate', iirc.
>
>Patch 2 implements the sanitation commands (overwrite and secure erase).
>
>As for testing, while I used the mock device to test the secure erase command, I
>ended up hacking up a prototype[2] for qemu to support overwrite and bg commands.
>So while some of the paths this series introduces have been exercised, there
>is of course a lot more to do.
>
>Applies against Dave's cxl-security branch[3] which deals with the pmem-only bits.
>
>Thanks,
>Davidlohr
>
>[0]: https://lore.kernel.org/all/20220708172455.gi37dh3od4w5lqrd@offworld/
>[1]: https://lore.kernel.org/all/165791918718.2491387.4203738301057301285.stgit@djiang5-desk3.ch.intel.com/
>[2]: https://github.com/davidlohr/qemu/commit/64a93a5b824b59d3b547f06f7fbb1269fb4790ce
>[3]: https://git.kernel.org/pub/scm/linux/kernel/git/djiang/linux.git/log/?h=cxl-security
>
>Davidlohr Bueso (2):
>  cxl/mbox: Add background operation handling machinery
>  cxl/mem: Support sanitation commands
>
> Documentation/ABI/testing/sysfs-bus-cxl |  19 ++
> drivers/cxl/core/core.h                 |   2 +-
> drivers/cxl/core/mbox.c                 | 304 +++++++++++++++++++++++-
> drivers/cxl/core/memdev.c               |  58 +++++
> drivers/cxl/core/port.c                 |   9 +-
> drivers/cxl/cxl.h                       |   8 +
> drivers/cxl/cxlmem.h                    |  65 ++++-
> drivers/cxl/pci.c                       |   3 +-
> drivers/cxl/pmem.c                      |   5 +-
> drivers/cxl/security.c                  |  13 +-
> include/uapi/linux/cxl_mem.h            |   2 +
> 11 files changed, 461 insertions(+), 27 deletions(-)
>
>--
>2.37.1
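For reference, the "standard MSI/MSI-X vector allocation + devm_request_irq()" path the cover letter asks about would, in rough strokes, look something like the sketch below. This is illustrative only and not part of the series; the cxl_bg_ctx structure, the cxl_bgcmd_irq handler, and the single-vector choice are assumptions made here for the example.

#include <linux/pci.h>
#include <linux/interrupt.h>
#include <linux/completion.h>

/* Hypothetical per-device context for this example; not a real CXL struct. */
struct cxl_bg_ctx {
	struct completion done;	/* signaled when a background op completes */
};

static irqreturn_t cxl_bgcmd_irq(int irq, void *data)
{
	struct cxl_bg_ctx *ctx = data;

	/* Real code would check/ack the mailbox background status first. */
	complete(&ctx->done);
	return IRQ_HANDLED;
}

static int cxl_pci_setup_bg_irq(struct pci_dev *pdev, struct cxl_bg_ctx *ctx)
{
	int rc, irq;

	init_completion(&ctx->done);

	/* Ask for a single vector, preferring MSI-X over MSI. */
	rc = pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_MSIX | PCI_IRQ_MSI);
	if (rc < 0)
		return rc;

	irq = pci_irq_vector(pdev, 0);
	if (irq < 0)
		return irq;

	/* Device-managed, so the irq is released automatically on unbind. */
	return devm_request_irq(&pdev->dev, irq, cxl_bgcmd_irq, 0,
				"cxl-bgcmd", ctx);
}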
On 8/3/2022 9:50 PM, Davidlohr Bueso wrote:
> Hello,
>
> The following is a followup to some of the discussions around CXL device security
> and sanitation[0, 1]. It is marked as RFC mostly to see if the background handling
> will satisfy all users, not just sanitize/overwrite. For example there have been ideas
> to avoid a command hogging the device and/or interleaving scan media regions instead
> of doing it all in one go, etc. More details in each patch, but:
>
> Patch 1 adds the required background cmd handling bits. While this is currently
> only polling, it would be good to know if there are any fundamental blockers for
> supporting irqs (beyond just background ops) between PCIe and CXL. For example,
> are there any requirements/difficulties that are not the standard MSI/MSI-X PCI
> vector allocation + devm_request_irq()? I have not looked into this in detail, but
> the topic has come up in the past as 'delicate', iirc.
>
> Patch 2 implements the sanitation commands (overwrite and secure erase).
>
> As for testing, while I used the mock device to test the secure erase command, I
> ended up hacking up a prototype[2] for qemu to support overwrite and bg commands.
> So while some of the paths this series introduces have been exercised, there
> is of course a lot more to do.
>
> Applies against Dave's cxl-security branch[3] which deals with the pmem-only bits.

From an operational sense everything looks good to me. As for the polling
delay on overwrite, with pre-CXL pmem on Optane, we've discovered that
overwrite can take a long time depending on the size. Sometimes MANY hours
if the size is really large. We just opted to increase the polling interval
as time went on [1] instead of basing it on size.

[1]: https://elixir.bootlin.com/linux/v5.19/source/drivers/nvdimm/security.c#L434

> Thanks,
> Davidlohr
>
> [0]: https://lore.kernel.org/all/20220708172455.gi37dh3od4w5lqrd@offworld/
> [1]: https://lore.kernel.org/all/165791918718.2491387.4203738301057301285.stgit@djiang5-desk3.ch.intel.com/
> [2]: https://github.com/davidlohr/qemu/commit/64a93a5b824b59d3b547f06f7fbb1269fb4790ce
> [3]: https://git.kernel.org/pub/scm/linux/kernel/git/djiang/linux.git/log/?h=cxl-security
>
> Davidlohr Bueso (2):
>   cxl/mbox: Add background operation handling machinery
>   cxl/mem: Support sanitation commands
>
>  Documentation/ABI/testing/sysfs-bus-cxl |  19 ++
>  drivers/cxl/core/core.h                 |   2 +-
>  drivers/cxl/core/mbox.c                 | 304 +++++++++++++++++++++++-
>  drivers/cxl/core/memdev.c               |  58 +++++
>  drivers/cxl/core/port.c                 |   9 +-
>  drivers/cxl/cxl.h                       |   8 +
>  drivers/cxl/cxlmem.h                    |  65 ++++-
>  drivers/cxl/pci.c                       |   3 +-
>  drivers/cxl/pmem.c                      |   5 +-
>  drivers/cxl/security.c                  |  13 +-
>  include/uapi/linux/cxl_mem.h            |   2 +
>  11 files changed, 461 insertions(+), 27 deletions(-)
>
> --
> 2.37.1
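A time-based backoff along the lines Dave describes could look something like the sketch below. This is loosely in the spirit of the nvdimm overwrite polling referenced in [1] above, not the actual series code; cxl_bg_op_complete() is a hypothetical status query and the 15 minute cap is an arbitrary choice for illustration.

#include <linux/delay.h>
#include <linux/minmax.h>

/* Poll with an interval that grows over time: the longer the operation
 * has been running, the less often we bother the device. */
static void cxl_wait_for_bg_op(struct cxl_dev_state *cxlds)
{
	unsigned int sleep_secs = 1;

	while (!cxl_bg_op_complete(cxlds)) {	/* hypothetical helper */
		ssleep(sleep_secs);
		/* Double the interval each pass, capped at 15 minutes. */
		sleep_secs = min(sleep_secs * 2, 15u * 60);
	}
}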
On Thu, 04 Aug 2022, Dave Jiang wrote:

>From an operational sense everything looks good to me. As for the
>polling delay on overwrite, with pre-CXL pmem on Optane, we've
>discovered that overwrite can take a long time depending on the size.
>Sometimes MANY hours if the size is really large. We just opted to
>increase the polling interval as time went on [1] instead of basing it
>on size.

Thanks for having a look. Sure, we can do that; I have no particular
attachment to doing it based on size (it's just the way it occurred to me).
I am curious, though: regardless of size- vs time-based estimates, are the
numbers expected to be similar for volatile regions? All the numbers I have
are from the nvdimm DSM docs.

Thanks,
Davidlohr
On 8/4/2022 1:07 PM, Davidlohr Bueso wrote:
> On Thu, 04 Aug 2022, Dave Jiang wrote:
>
>> From an operational sense everything looks good to me. As for the
>> polling delay on overwrite, with pre-CXL pmem on Optane, we've
>> discovered that overwrite can take a long time depending on the size.
>> Sometimes MANY hours if the size is really large. We just opted to
>> increase the polling interval as time went on [1] instead of basing it
>> on size.
>
> Thanks for having a look. Sure, we can do that; I have no particular
> attachment to doing it based on size (it's just the way it occurred to
> me). I am curious, though: regardless of size- vs time-based estimates,
> are the numbers expected to be similar for volatile regions? All the
> numbers I have are from the nvdimm DSM docs.

I don't either. Just pointing out that's what we did with the Optane stuff.
I think the volatile devices (DRAM?) would probably be a lot faster when it
comes to writes, so maybe it won't take as long. And perhaps they'll also be
smaller in size in the immediate future? Just guessing.

> Thanks,
> Davidlohr
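For comparison, a size-based variant like the one Davidlohr alludes to might derive its polling cadence from the device capacity and an assumed overwrite rate. The helper name, the ~100 MB/s rate, and the "poll roughly 100 times over the estimated duration" heuristic below are purely illustrative assumptions; none of them come from the CXL spec or from this series.

#include <linux/kernel.h>
#include <linux/math64.h>

/* Pick a polling interval (in seconds) from capacity, assuming a
 * conservative, made-up overwrite rate of ~100 MB/s. */
static unsigned long cxl_overwrite_poll_secs(u64 capacity_bytes)
{
	const u64 assumed_bytes_per_sec = 100ULL << 20;
	u64 est_secs = div64_u64(capacity_bytes, assumed_bytes_per_sec);

	/* Poll ~100 times over the estimated duration, at least every 1s. */
	return max_t(unsigned long, div64_u64(est_secs, 100), 1);
}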