Message ID | 20230604-dcd-type2-upstream-v2-0-f740c47e7916@intel.com |
---|---|
Series | DCD: Add support for Dynamic Capacity Devices (DCD) |
On Mon, Aug 28, 2023 at 10:20:51PM -0700, Ira Weiny wrote:
> A Dynamic Capacity Device (DCD) (CXL 3.0 spec 9.13.3) is a CXL memory
> device that implements dynamic capacity. Dynamic capacity feature
> allows memory capacity to change dynamically, without the need for
> resetting the device.
>
> Even though this is marked v2 by b4, this is effectively a whole new
> series for DCD support. Quite a bit of the core support was completed
> by Navneet in [4]. However, the architecture through the CXL region,
> DAX region, and DAX Device layers is completely different. Particular
> attention was paid to:
>
> 1) managing skip resources in the hardware device
> 2) ensuring the host OS only sent a release memory mailbox
>    response when all DAX devices are done using an extent
> 3) allowing dax devices to span extents
> 4) allowing dax devices to use parts of extents
>
> I could say all of the review comments from v1 are addressed but frankly
> the series has changed so much that I can't guarantee anything.
>
> The series continues to be based on the type-2 work posted from Dan.[2]
> However, my branch with that work is a bit dated. Therefore I have
> posted this series on github here.[5]
>
> Testing was sped up with cxl-test and ndctl dcd support. A preview of
> that work is on github.[6] In addition Fan Ni's Qemu DCD series was
> used part of the time.[3]
>
> The major parts of this series are:
>
> - Get the dynamic capacity (DC) region information from cxl device
> - Configure device DC regions reported by hardware
> - Enhance CXL and DAX regions for DC
>   a. maintain separation between the hardware extents and the CXL
>      region extents to provide for the addition of interleaving in
>      the future.
> - Get and maintain the hardware extent lists for each device via an
>   initial extent list and DC event records
>   a. Add capacity Events
>   b. Add capacity response
>   c. Release capacity events
>   d. Release capacity response
> - Notify region layers of extent changes
> - Allow for DAX devices to be created on extents which are surfaced
> - Maintain references on extents which are in use
>   a. Send Release capacity Response only when DAX devices are not
>      using memory
> - Allow DAX region extent labels to change to allow for flexibility in
>   DAX device creation in the future (further enhancements are required
>   to ndctl for this)
> - Trace Dynamic Capacity events
> - Add cxl-test infrastructure to allow for faster unit testing
>
> To: Dan Williams <dan.j.williams@intel.com>
> Cc: Navneet Singh <navneet.singh@intel.com>
> Cc: Fan Ni <fan.ni@samsung.com>
> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Cc: Davidlohr Bueso <dave@stgolabs.net>
> Cc: Dave Jiang <dave.jiang@intel.com>
> Cc: Alison Schofield <alison.schofield@intel.com>
> Cc: Vishal Verma <vishal.l.verma@intel.com>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Cc: linux-cxl@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
>
> [1] https://lore.kernel.org/all/64326437c1496_934b2949f@dwillia2-mobl3.amr.corp.intel.com.notmuch/
> [2] https://lore.kernel.org/all/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com/
> [3] https://lore.kernel.org/all/6483946e8152f_f1132294a2@iweiny-mobl.notmuch/
> [4] https://lore.kernel.org/r/20230604-dcd-type2-upstream-v1-0-71b6341bae54@intel.com
> [5] https://github.com/weiny2/linux-kernel/commits/dcd-v2-2023-08-28
> [6] https://github.com/weiny2/ndctl/tree/dcd-region2
>

Hi Ira,

I tried to test the patch series with the qemu dcd patches, however, I
hit some issues, and would like to check the following with you.

1. After we create a region for DC before any extents are added, a dax
device will show under /dev. Is that what we want? If I remember it
correctly, the dax device used to show up after a dc extent is added.

2. add/release extent does not work correctly for me. The code path is
not called, and I made the following changes to make it pass.

---
 drivers/cxl/cxl.h    | 3 ++-
 drivers/cxl/cxlmem.h | 1 +
 drivers/cxl/pci.c    | 7 +++++++
 3 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 2c73a30980b6..0d132c1739ce 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -168,7 +168,8 @@ static inline int ways_to_eiw(unsigned int ways, u8 *eiw)
 #define CXLDEV_EVENT_STATUS_ALL (CXLDEV_EVENT_STATUS_INFO | \
                                  CXLDEV_EVENT_STATUS_WARN | \
                                  CXLDEV_EVENT_STATUS_FAIL | \
-                                 CXLDEV_EVENT_STATUS_FATAL)
+                                 CXLDEV_EVENT_STATUS_FATAL| \
+                                 CXLDEV_EVENT_STATUS_DCD)

 /* CXL rev 3.0 section 8.2.9.2.4; Table 8-52 */
 #define CXLDEV_EVENT_INT_MODE_MASK GENMASK(1, 0)
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 8ca81fd067c2..ae9dcb291c75 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -235,6 +235,7 @@ struct cxl_event_interrupt_policy {
         u8 warn_settings;
         u8 failure_settings;
         u8 fatal_settings;
+        u8 dyncap_settings;
 } __packed;

 /**
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 10c1a583113c..e30fe0304514 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -686,6 +686,7 @@ static int cxl_event_config_msgnums(struct cxl_memdev_state *mds,
                 .warn_settings = CXL_INT_MSI_MSIX,
                 .failure_settings = CXL_INT_MSI_MSIX,
                 .fatal_settings = CXL_INT_MSI_MSIX,
+                .dyncap_settings = CXL_INT_MSI_MSIX,
         };

         mbox_cmd = (struct cxl_mbox_cmd) {
@@ -739,6 +740,12 @@ static int cxl_event_irqsetup(struct cxl_memdev_state *mds)
                 return rc;
         }

+        rc = cxl_event_req_irq(cxlds, policy.dyncap_settings);
+        if (rc) {
+                dev_err(cxlds->dev, "Failed to get interrupt for event dyncap log\n");
+                return rc;
+        }
+
         return 0;
 }

--

3. With changes made in 2, the code for add/release dc extent can be called,
however, the system behaviour seems different from before. Previously, after a
dc extent is added, it will show up with lsmem command and listed as offline.
Now, nothing is showing. Is it expected? What should we do to make it usable
as system ram?

Please let me know if I miss something or did something wrong. Thanks.

Fan

> ---
> Changes in v2:
> - iweiny: Complete rework of the entire series
> - Link to v1: https://lore.kernel.org/r/20230604-dcd-type2-upstream-v1-0-71b6341bae54@intel.com
>
> ---
> Ira Weiny (15):
>       cxl/hdm: Debug, use decoder name function
>       cxl/mbox: Flag support for Dynamic Capacity Devices (DCD)
>       cxl/region: Add Dynamic Capacity decoder and region modes
>       cxl/port: Add Dynamic Capacity mode support to endpoint decoders
>       cxl/port: Add Dynamic Capacity size support to endpoint decoders
>       cxl/region: Add Dynamic Capacity CXL region support
>       cxl/mem: Read extents on memory device discovery
>       cxl/mem: Handle DCD add and release capacity events.
>       cxl/region: Expose DC extents on region driver load
>       cxl/region: Notify regions of DC changes
>       dax/bus: Factor out dev dax resize logic
>       dax/region: Support DAX device creation on dynamic DAX regions
>       tools/testing/cxl: Make event logs dynamic
>       tools/testing/cxl: Add DC Regions to mock mem data
>       tools/testing/cxl: Add Dynamic Capacity events
>
> Navneet Singh (3):
>       cxl/mem: Read Dynamic capacity configuration from the device
>       cxl/mem: Expose device dynamic capacity configuration
>       cxl/mem: Trace Dynamic capacity Event Record
>
>  Documentation/ABI/testing/sysfs-bus-cxl |  56 ++-
>  drivers/cxl/core/core.h                 |   1 +
>  drivers/cxl/core/hdm.c                  | 215 ++++++++-
>  drivers/cxl/core/mbox.c                 | 646 +++++++++++++++++++++++++-
>  drivers/cxl/core/memdev.c               |  77 ++++
>  drivers/cxl/core/port.c                 |  19 +
>  drivers/cxl/core/region.c               | 418 +++++++++++++++--
>  drivers/cxl/core/trace.h                |  65 +++
>  drivers/cxl/cxl.h                       |  99 +++-
>  drivers/cxl/cxlmem.h                    | 138 +++++-
>  drivers/cxl/mem.c                       |  50 ++
>  drivers/cxl/pci.c                       |   8 +
>  drivers/dax/Makefile                    |   1 +
>  drivers/dax/bus.c                       | 263 ++++++++---
>  drivers/dax/bus.h                       |   1 +
>  drivers/dax/cxl.c                       | 213 ++++++++-
>  drivers/dax/dax-private.h               |  61 +++
>  drivers/dax/extent.c                    | 133 ++++++
>  tools/testing/cxl/test/mem.c            | 782 +++++++++++++++++++++++++++-----
>  19 files changed, 3005 insertions(+), 241 deletions(-)
> ---
> base-commit: c76cce37fb6f3796e8e146677ba98d3cca30a488
> change-id: 20230604-dcd-type2-upstream-0cd15f6216fd
>
> Best regards,
> --
> Ira Weiny <ira.weiny@intel.com>
>
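Why the add/release path in point 2 is never reached without the mask change can be illustrated with a small user-space sketch. It assumes the event handler only acts on status bits covered by the "all events" mask. The bit positions and the helper event_is_handled() are hypothetical and chosen for illustration; they are not copied from the series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit layout, for illustration only. */
#define EVENT_STATUS_INFO   (1u << 0)
#define EVENT_STATUS_WARN   (1u << 1)
#define EVENT_STATUS_FAIL   (1u << 2)
#define EVENT_STATUS_FATAL  (1u << 3)
#define EVENT_STATUS_DCD    (1u << 4)

/* Mask as it was before the change: the DCD log is not included. */
#define EVENT_STATUS_ALL_OLD (EVENT_STATUS_INFO | EVENT_STATUS_WARN | \
                              EVENT_STATUS_FAIL | EVENT_STATUS_FATAL)

/* Mask after the change: the DCD log is included. */
#define EVENT_STATUS_ALL_NEW (EVENT_STATUS_ALL_OLD | EVENT_STATUS_DCD)

/*
 * Hypothetical helper: returns true when a handler that filters the
 * device status with `mask` would act on `status`.
 */
static bool event_is_handled(uint32_t status, uint32_t mask)
{
        return (status & mask) != 0;
}
```

With the old mask, a status word that only has the DCD bit set is filtered out, so no DCD record is ever read. With the new mask the same status word is handled.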
Fan Ni wrote:
> On Mon, Aug 28, 2023 at 10:20:51PM -0700, Ira Weiny wrote:

Sorry for the delay, I've been walking through the responses and just saw this.

>
> Hi Ira,
>
> I tried to test the patch series with the qemu dcd patches, however, I
> hit some issues, and would like to check the following with you.
>
> 1. After we create a region for DC before any extents are added, a dax
> device will show under /dev. Is that what we want?

Yes, see

cxl/region: Add Dynamic Capacity CXL region support

"Special case DC capable CXL regions to create a 0 sized seed DAX device
until others can be created on dynamic space later."

The seed device is required but is left empty. It can be resized when
extents are added later.

> If I remember it
> correctly, the dax device used to show up after a dc extent is added.
>
>
> 2. add/release extent does not work correctly for me. The code path is
> not called, and I made the following changes to make it pass.

:-(

This is the problem with cxl_test... I've just realized this after seeing
Jorgen's email regarding the interrupt configuration code.

I've added it back in. I'm not sure where it got lost along the way but it
was completely gone from this RFC v2. Sorry about that.

> ---
>  drivers/cxl/cxl.h    | 3 ++-
>  drivers/cxl/cxlmem.h | 1 +
>  drivers/cxl/pci.c    | 7 +++++++
>  3 files changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 2c73a30980b6..0d132c1739ce 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -168,7 +168,8 @@ static inline int ways_to_eiw(unsigned int ways, u8 *eiw)
>  #define CXLDEV_EVENT_STATUS_ALL (CXLDEV_EVENT_STATUS_INFO | \
>                                   CXLDEV_EVENT_STATUS_WARN | \
>                                   CXLDEV_EVENT_STATUS_FAIL | \
> -                                 CXLDEV_EVENT_STATUS_FATAL)
> +                                 CXLDEV_EVENT_STATUS_FATAL| \
> +                                 CXLDEV_EVENT_STATUS_DCD)
>
>  /* CXL rev 3.0 section 8.2.9.2.4; Table 8-52 */
>  #define CXLDEV_EVENT_INT_MODE_MASK GENMASK(1, 0)
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 8ca81fd067c2..ae9dcb291c75 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -235,6 +235,7 @@ struct cxl_event_interrupt_policy {
>          u8 warn_settings;
>          u8 failure_settings;
>          u8 fatal_settings;
> +        u8 dyncap_settings;
>  } __packed;
>
>  /**
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 10c1a583113c..e30fe0304514 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -686,6 +686,7 @@ static int cxl_event_config_msgnums(struct cxl_memdev_state *mds,
>                  .warn_settings = CXL_INT_MSI_MSIX,
>                  .failure_settings = CXL_INT_MSI_MSIX,
>                  .fatal_settings = CXL_INT_MSI_MSIX,
> +                .dyncap_settings = CXL_INT_MSI_MSIX,
>          };
>
>          mbox_cmd = (struct cxl_mbox_cmd) {
> @@ -739,6 +740,12 @@ static int cxl_event_irqsetup(struct cxl_memdev_state *mds)
>                  return rc;
>          }
>
> +        rc = cxl_event_req_irq(cxlds, policy.dyncap_settings);
> +        if (rc) {
> +                dev_err(cxlds->dev, "Failed to get interrupt for event dyncap log\n");
> +                return rc;
> +        }
> +
>          return 0;
>  }
>
> --
>
> 3. With changes made in 2, the code for add/release dc extent can be called,
> however, the system behaviour seems different from before. Previously, after a
> dc extent is added, it will show up with lsmem command and listed as offline.
> Now, nothing is showing. Is it expected? What should we do to make it usable
> as system ram?

Yes, this is expected. The previous behavior was not correct. It should be
possible to create DAX devices anywhere in the region, either within extents
or across extents. Dave Jiang mentioned to me internally that it might help
to add some ASCII art documentation regarding how this works.

Generally, the dax region available size will increase when extents are
added, and new dax devices can be created to utilize that space.

Check out the dcd-test.sh in ndctl at this link for the commands to create a
dax device in the new architecture.

https://github.com/weiny2/ndctl/tree/dcd-region2

Hope this helps.

>
> Please let me know if I miss something or did something wrong. Thanks.

You did not. I thought the new dax code would explain this new dax device
operation. Some new documentation is in order.

Ira
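
The behavior described above (available size grows as extents are added, DAX devices may use part of an extent or span several, and an extent can only be released once nothing uses it) can be sketched as a toy user-space model. This is not the kernel implementation. Every name here (toy_extent, toy_region, toy_add_extent, toy_available, toy_create_dax, toy_can_release) is hypothetical, and the model does not track which device holds which reference.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_EXTENTS 8

/* One extent surfaced by the device (toy model, hypothetical layout). */
struct toy_extent {
        uint64_t size;  /* bytes provided by this extent */
        uint64_t used;  /* bytes already claimed by DAX devices */
        int refs;       /* number of DAX devices using this extent */
};

struct toy_region {
        struct toy_extent ext[TOY_MAX_EXTENTS];
        size_t nr;
};

/* Adding an extent grows the space available in the region. */
static int toy_add_extent(struct toy_region *r, uint64_t size)
{
        if (r->nr == TOY_MAX_EXTENTS)
                return -1;
        r->ext[r->nr++] = (struct toy_extent){ .size = size };
        return 0;
}

/* Available size is the sum of the unclaimed bytes of all extents. */
static uint64_t toy_available(const struct toy_region *r)
{
        uint64_t avail = 0;

        for (size_t i = 0; i < r->nr; i++)
                avail += r->ext[i].size - r->ext[i].used;
        return avail;
}

/*
 * Claim `size` bytes for one DAX device. The claim may cover part of
 * an extent or span several; each extent touched gains one reference.
 */
static int toy_create_dax(struct toy_region *r, uint64_t size)
{
        if (size == 0 || size > toy_available(r))
                return -1;

        for (size_t i = 0; i < r->nr && size; i++) {
                uint64_t free_bytes = r->ext[i].size - r->ext[i].used;
                uint64_t take = free_bytes < size ? free_bytes : size;

                if (!take)
                        continue;
                r->ext[i].used += take;
                r->ext[i].refs++;
                size -= take;
        }
        return 0;
}

/* An extent may be released only when no DAX device references it. */
static int toy_can_release(const struct toy_region *r, size_t idx)
{
        return idx < r->nr && r->ext[idx].refs == 0;
}
```

In this model a region with no extents has zero available bytes, which mirrors the empty seed device: nothing can be sized until an extent arrives. After adding extents of 100 and 50 bytes, a 120 byte device spans both, leaves 30 bytes available, and blocks release of either extent.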