Message ID: cover.1686712819.git.alison.schofield@intel.com
Series: CXL: Apply SRAT defined PXM to entire CFMWS window
On Tue, Jun 13, 2023 at 09:35:23PM -0700, alison.schofield@intel.com wrote:
> The CXL subsystem requires the creation of NUMA nodes for CFMWS
The thing is, CXL is some persistent memory thing, right? But what is this
CFMWS thing? I don't think I've ever seen that particular combination of
letters together.
On Wed, 14 Jun 2023 10:32:40 +0200 Peter Zijlstra <peterz@infradead.org> wrote:
> On Tue, Jun 13, 2023 at 09:35:23PM -0700, alison.schofield@intel.com wrote:
> > The CXL subsystem requires the creation of NUMA nodes for CFMWS
>
> The thing is, CXL is some persistent memory thing, right? But what is this
> CFMWS thing? I don't think I've ever seen that particular combination of
> letters together.

Hi Peter,

To save time before the US based folk wake up.

Both persistent and volatile memory are found on CXL devices (mostly volatile on early devices).

A CXL Fixed Memory Window is described by a CXL Fixed Memory Window Structure (CFMWS, defined in 9.17.1.3 of CXL r3.0) found in an ACPI table (CEDT). CFMWS, as a term, is sometimes abused in the kernel (and here) to mean the window rather than the structure describing the window (the S on the end).

CFMWS - A region of Host Physical Address (HPA) space which routes accesses to CXL host bridges. A CFMWS describes interleaving as well (so multiple target host bridges). If multiple interleave setups are available, then you'll see multiple CFMWS entries - so different static regions of HPA can route to the same host bridges with different interleave setups (decoding via the configurable part to hit different actual memory on the downstream devices).

Where accesses are routed after that depends on the configurable parts of the CXL topology (Host-Managed Device Memory (HDM) decoders in host bridges, switches etc). Note that a CFMWS address may route to nowhere if downstream devices aren't available / configured yet.

CFMWS is the CXL specification's way of avoiding defining interfaces for controlling the host address space to CXL host bridge mapping, as those vary so much across host implementations and are not always configurable at runtime anyway. It also includes a bunch of other details about the region (too many details perhaps!)
Who does the configuration (BIOS / kernel) varies across implementations, and we have OS-managed hotplug, so the OS always has to do some of it (personally I prefer the kernel doing everything :) It's made messier by CXL 1.1 hosts, where a lot less was discoverable, so generally the BIOS has to do the heavy lifting. For CXL 2.0 onwards the OS 'might' do everything except whatever is needed on the host to configure the CXL Fixed Memory Windows it is advertising.

Note there is no requirement that the access characteristics of memory mapped into a given CFMWS be remotely consistent across the whole window - some of the window may route through switches, and some to directly connected devices. That's a simplifying assumption made today, as we don't yet know the full scope of what people are building.

Hope that helps (rather than causing confusion!)

Jonathan
Jonathan Cameron wrote:
> On Wed, 14 Jun 2023 10:32:40 +0200
> Peter Zijlstra <peterz@infradead.org> wrote:
>
> > On Tue, Jun 13, 2023 at 09:35:23PM -0700, alison.schofield@intel.com wrote:
> > > The CXL subsystem requires the creation of NUMA nodes for CFMWS
> >
> > The thing is, CXL is some persistent memory thing, right? But what is this
> > CFMWS thing? I don't think I've ever seen that particular combination of
> > letters together.
>
> Hi Peter,
>
> To save time before the US based folk wake up.
> [..]
> Note there is no requirement that the access characteristics of memory mapped
> into a given CFMWS should be remotely consistent across the whole window
> - some of the window may route through switches, and to directly connected
> devices.
> That's a simplifying assumption made today as we don't yet know the full
> scope of what people are building.
>
> Hope that helps (rather than causing confusion!)

Thanks Jonathan! Patch 1 changelog also goes into more detail.
From: Alison Schofield <alison.schofield@intel.com>

Along with the changes in v2 listed below, Dan questioned the maintenance burden of x86 not switching to use the memblock API. See Dan Williams & Mike Rapoport discuss the issue in the v1 link. [1] IIUC switching existing x86 meminfo usage to memblock is the pre-existing outstanding work, and per Mike 'that's quite some work needed to make that happen', and since the memblock API doesn't support something like numa_fill_memblks(), add that work on top. So, with that open awaiting feedback from x86 maintainers, here's v2.

Changes in v2:

Patch 1/2: x86/numa: Introduce numa_fill_memblks()
- Update commit log with policy description. (Dan)
- Collect memblks with any HPA range intersect. (Dan)
- Adjust head or tail memblk to include, not align to, HPA range.
- Let the case of a single memblk simply fall through.
- Simplify the sort compare function to use start only.
- Rewrite and simplify the fill loop.
- Add code comment for exclusive memblk->end. (Dan)
- Add code comment for memblks being adjusted in place. (Dan)
- Add Tags: Reported-by, Suggested-by, Tested-by

Patch 2/2: ACPI: NUMA: Apply SRAT proximity domain to entire CFMWS window
- Add Tags: Reported-by, Suggested-by, Tested-by
- No changes in patch body.

[1] v1: https://lore.kernel.org/linux-cxl/cover.1684448934.git.alison.schofield@intel.com/

Cover Letter:

The CXL subsystem requires the creation of NUMA nodes for CFMWS windows not described in the SRAT. The existing implementation only addresses windows that the SRAT describes completely or not at all. This work addresses the case of partially described CFMWS windows by extending proximity domains in a portion of a CFMWS window to the entire window.

Introduce a NUMA helper, numa_fill_memblks(), to fill gaps in a numa_meminfo memblk address range. Update the CFMWS parsing in the ACPI driver to use numa_fill_memblks() to extend SRAT defined proximity domains to entire CXL windows.
An RFC of this patchset was previously posted for CXL folks' review. [2] The RFC feedback led to the implementation here, extending existing memblks (Dan). Also, both Jonathan and Dan influenced the changelog comments in the ACPI patch, with regards to setting expectations on this evolving heuristic. Repeating here to set reviewer expectations:

*Note that this heuristic will evolve when CFMWS windows present a wider range of characteristics. The extension of the proximity domain, implemented here, is likely a step in developing a more sophisticated performance profile in the future.

[2] https://lore.kernel.org/linux-cxl/cover.1683742429.git.alison.schofield@intel.com/

Alison Schofield (2):
  x86/numa: Introduce numa_fill_memblks()
  ACPI: NUMA: Apply SRAT proximity domain to entire CFMWS window

 arch/x86/include/asm/sparsemem.h |  2 +
 arch/x86/mm/numa.c               | 87 ++++++++++++++++++++++++++++++++
 drivers/acpi/numa/srat.c         | 11 ++--
 include/linux/numa.h             |  7 +++
 4 files changed, 104 insertions(+), 3 deletions(-)

base-commit: 6e2e1e779912345f0b5f86ef01facc2802bd97cc