[v1,00/14] Add MEMORY_DEVICE_PUBLIC for CPU-accessible coherent device memory

Message ID: 20210825034828.12927-1-alex.sierra@amd.com

Sierra Guiza, Alejandro (Alex) Aug. 25, 2021, 3:48 a.m. UTC

AMD is building a system architecture for the Frontier supercomputer
with a coherent interconnect between CPUs and GPUs. This hardware
architecture allows the CPUs to coherently access GPU device memory.
We have hardware in our labs and we are working with our partner HPE on
the BIOS, firmware and software for delivery to the DOE.

The system BIOS advertises the GPU device memory (aka VRAM) as SPM
(special purpose memory) in the UEFI system address map. The amdgpu
driver registers this memory with devmap as MEMORY_DEVICE_PUBLIC using
devm_memremap_pages().

This patch series adds MEMORY_DEVICE_PUBLIC. Like MEMORY_DEVICE_GENERIC,
it can be mapped for CPU access; like MEMORY_DEVICE_PRIVATE, it supports
migration of this memory. We also included and updated two patches from
Ralph Campbell (Nvidia) that change ZONE_DEVICE reference counting, as
requested in previous reviews of this patch series
(see https://patchwork.freedesktop.org/series/90706/).
Finally, we extended hmm_test to cover migration of MEMORY_DEVICE_PUBLIC.
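
To make the comparison between the three device memory types concrete,
here is a small userspace model of the properties described above. It is
only an illustrative sketch, not kernel code: the model_* names and the
trait table are hypothetical, and the traits are an assumption that
encodes the wording of this cover letter (GENERIC is CPU-mappable,
PRIVATE is migratable but not CPU-accessible, PUBLIC is both).

```c
/*
 * Userspace model only. The enum and trait table below are hypothetical
 * and mirror the cover letter's description; they are NOT the kernel's
 * definitions from include/linux/memremap.h.
 */
#include <stdbool.h>

enum model_memory_type {
	MODEL_DEVICE_PRIVATE,	/* stands in for MEMORY_DEVICE_PRIVATE */
	MODEL_DEVICE_GENERIC,	/* stands in for MEMORY_DEVICE_GENERIC */
	MODEL_DEVICE_PUBLIC,	/* stands in for the proposed MEMORY_DEVICE_PUBLIC */
};

struct model_type_traits {
	const char *name;
	bool cpu_mappable;	/* CPU can map and access the memory directly */
	bool migratable;	/* pages can be migrated to and from system RAM */
};

/* Assumed traits, taken from the prose above. */
static const struct model_type_traits model_traits[] = {
	[MODEL_DEVICE_PRIVATE] = { "MEMORY_DEVICE_PRIVATE", false, true  },
	[MODEL_DEVICE_GENERIC] = { "MEMORY_DEVICE_GENERIC", true,  false },
	[MODEL_DEVICE_PUBLIC]  = { "MEMORY_DEVICE_PUBLIC",  true,  true  },
};

/* True if the modelled type can be mapped for direct CPU access. */
static inline bool model_cpu_mappable(enum model_memory_type type)
{
	return model_traits[type].cpu_mappable;
}

/* True if the modelled type supports page migration. */
static inline bool model_migratable(enum model_memory_type type)
{
	return model_traits[type].migratable;
}

/* Human-readable name of the kernel type being modelled. */
static inline const char *model_type_name(enum model_memory_type type)
{
	return model_traits[type].name;
}
```

In this model, MODEL_DEVICE_PUBLIC is the only entry for which both
model_cpu_mappable() and model_migratable() return true, which is the
gap this series is meant to fill.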

This work is based on HMM and our SVM memory manager, which recently
landed in Linux 5.14.

Alex Sierra (12):
  mm: add iomem vma selection for memory migration
  mm: add zone device public type memory support
  drm/amdkfd: ref count init for device pages
  drm/amdkfd: add SPM support for SVM
  drm/amdkfd: public type as sys mem on migration to ram
  mm: add public type support to migrate_vma helpers
  mm: call pgmap->ops->page_free for DEVICE_PUBLIC pages
  lib: test_hmm add ioctl to get zone device type
  lib: test_hmm add module param for zone device type
  lib: add support for device public type in test_hmm
  tools: update hmm-test to support device public type
  tools: update test_hmm script to support SP config

Ralph Campbell (2):
  ext4/xfs: add page refcount helper
  mm: remove extra ZONE_DEVICE struct page refcount

 arch/powerpc/kvm/book3s_hv_uvmem.c       |   2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c |  36 ++--
 drivers/gpu/drm/nouveau/nouveau_dmem.c   |   2 +-
 fs/dax.c                                 |   8 +-
 fs/ext4/inode.c                          |   5 +-
 fs/fuse/dax.c                            |   4 +-
 fs/xfs/xfs_file.c                        |   4 +-
 include/linux/dax.h                      |  10 +
 include/linux/memremap.h                 |  15 +-
 include/linux/migrate.h                  |   1 +
 include/linux/mm.h                       |  19 +-
 lib/test_hmm.c                           | 247 +++++++++++++++--------
 lib/test_hmm_uapi.h                      |  16 ++
 mm/internal.h                            |   8 +
 mm/memcontrol.c                          |   6 +-
 mm/memory-failure.c                      |   6 +-
 mm/memremap.c                            |  70 ++-----
 mm/migrate.c                             |  27 +--
 mm/page_alloc.c                          |   3 +
 mm/swap.c                                |  45 +----
 tools/testing/selftests/vm/hmm-tests.c   | 142 +++++++++++--
 tools/testing/selftests/vm/test_hmm.sh   |  20 +-
 22 files changed, 443 insertions(+), 253 deletions(-)