mbox series

[v11,00/10] Add support for SVM atomics in Nouveau

Message ID 20210616105937.23201-1-apopple@nvidia.com (mailing list archive)
Headers show
Series Add support for SVM atomics in Nouveau | expand

Message

Alistair Popple June 16, 2021, 10:59 a.m. UTC
Hi Andrew,

This is my series to add support for SVM atomics in Nouveau rebased onto
v5.13-rc6-mmots-2021-06-15-20-28. Functionally there are no changes from
the previous v10, however some changes were made during the rebase. If
anyone would like to see a diff-of-diffs I can post it but my exhaustive
(assuming I didn't miss anything in the diff-of-diff) summary below is
probably easier to read.

Hugh - based on our previous disscussion I'm reasonably confident I haven't
missed anything from you in the rebase, but patch 4 ("Split migration into
its own function") might be worth looking at as that contained the largest
conflicts. Thanks.

[PATCH 01/10] mm: Remove special swap entry functions

- Fixed migration_entry_to_page() reference in __split_huge_pmd_locked()
  introduced by Hugh's "mm/thp: fix __split_huge_pmd_locked() on shmem
  migration entry".

- Fixed migration_entry_to_page() reference in page_vma_mapped_walk()
  changed by Hugh in "mm: page_vma_mapped_walk(): prettify PVMW_MIGRATION
  block"

[PATCH 02/10] mm/swapops: Rework swap entry manipulation code

- No changes required.

[PATCH 03/10] mm/rmap: Split try_to_munlock from try_to_unmap

- No changes required.

[PATCH 04/10] mm/rmap: Split migration into its own function

- Updated try_to_migrate_one() and try_to_migrate() to accept TTU_SYNC
  flag.

- Updated mmu notifier range calculation in try_to_migrate_one() to use
  vma_address_end() introduced by Hugh in "mm/thp: fix vma_address() if
  virtual address below file offset".

- Update to unmap_page() in mm/hugh_memory.c to pass TTU_SYNC and check
  page_mapped() now that try_to_{unmap|migrate} return void.

[PATCH 05/10] mm: Rename migrate_pgmap_owner

- Added Peter's Reviewed-by.

[PATCH 06/10] mm/memory.c: Allow different return codes for copy_non_present_pte()

- Updated call to copy_nonpresent_pte() for extra arguments added by Peter
  in "mm/userfaultfd: fix uffd-wp special cases for fork()".

- Added Peter's Reviewed-by.

[PATCH 07/10] mm: Device exclusive memory access

- s/vma/src_vma/ in copy_nonpresent_pte() due to "mm/userfaultfd: fix
  uffd-wp special cases for fork()".

- Skip calling rmap_walk on tail pages.

[PATCH 08/10] mm: Selftests for exclusive device memory

- Squashed in a fix from Colin King (see
  https://lore.kernel.org/kernel-janitors/20210526170530.3766167-1-colin.king@canonical.com/).
  Not sure what the best way of attributing that is though given it was
  against next.

[PATCH 09/10] nouveau/svm: Refactor nouveau_range_fault

- No changes.

[PATCH 10/10] nouveau/svm: Implement atomic SVM access

- No changes.

Alistair Popple (10):
  mm: Remove special swap entry functions
  mm/swapops: Rework swap entry manipulation code
  mm/rmap: Split try_to_munlock from try_to_unmap
  mm/rmap: Split migration into its own function
  mm: Rename migrate_pgmap_owner
  mm/memory.c: Allow different return codes for copy_nonpresent_pte()
  mm: Device exclusive memory access
  mm: Selftests for exclusive device memory
  nouveau/svm: Refactor nouveau_range_fault
  nouveau/svm: Implement atomic SVM access

 Documentation/vm/hmm.rst                      |  19 +-
 Documentation/vm/unevictable-lru.rst          |  33 +-
 arch/s390/mm/pgtable.c                        |   2 +-
 drivers/gpu/drm/nouveau/include/nvif/if000c.h |   1 +
 drivers/gpu/drm/nouveau/nouveau_svm.c         | 156 ++++-
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h |   1 +
 .../drm/nouveau/nvkm/subdev/mmu/vmmgp100.c    |   6 +
 fs/proc/task_mmu.c                            |  23 +-
 include/linux/mmu_notifier.h                  |  26 +-
 include/linux/rmap.h                          |  11 +-
 include/linux/swap.h                          |  13 +-
 include/linux/swapops.h                       | 123 ++--
 lib/test_hmm.c                                | 127 +++-
 lib/test_hmm_uapi.h                           |   2 +
 mm/debug_vm_pgtable.c                         |  12 +-
 mm/hmm.c                                      |  12 +-
 mm/huge_memory.c                              |  48 +-
 mm/hugetlb.c                                  |  10 +-
 mm/memcontrol.c                               |   2 +-
 mm/memory.c                                   | 175 ++++-
 mm/migrate.c                                  |  51 +-
 mm/mlock.c                                    |  12 +-
 mm/mprotect.c                                 |  18 +-
 mm/page_vma_mapped.c                          |  15 +-
 mm/rmap.c                                     | 612 +++++++++++++++---
 tools/testing/selftests/vm/hmm-tests.c        | 158 +++++
 26 files changed, 1341 insertions(+), 327 deletions(-)

Comments

Andrew Morton June 16, 2021, 11:35 p.m. UTC | #1
On Wed, 16 Jun 2021 20:59:27 +1000 Alistair Popple <apopple@nvidia.com> wrote:

> This is my series to add support for SVM atomics in Nouveau

Can we please have a nice [0/n] overview for this patchset?
Alistair Popple June 17, 2021, 1:29 a.m. UTC | #2
Introduction
============

Some devices have features such as atomic PTE bits that can be used to
implement atomic access to system memory. To support atomic operations to a
shared virtual memory page such a device needs access to that page which is
exclusive of the CPU. This series introduces a mechanism to temporarily
unmap pages granting exclusive access to a device.

These changes are required to support OpenCL atomic operations in Nouveau
to shared virtual memory (SVM) regions allocated with the
CL_MEM_SVM_ATOMICS clSVMAlloc flag. A more complete description of the
OpenCL SVM feature is available at
https://www.khronos.org/registry/OpenCL/specs/3.0-unified/html/
OpenCL_API.html#_shared_virtual_memory .

Implementation
==============

Exclusive device access is implemented by adding a new swap entry type
(SWAP_DEVICE_EXCLUSIVE) which is similar to a migration entry. The main
difference is that on fault the original entry is immediately restored by
the fault handler instead of waiting.

Restoring the entry triggers calls to MMU notifers which allows a device
driver to revoke the atomic access permission from the GPU prior to the CPU
finalising the entry.

Patches
=======

Patches 1 & 2 refactor existing migration and device private entry
functions.

Patches 3 & 4 rework try_to_unmap_one() by splitting out unrelated
functionality into separate functions - try_to_migrate_one() and
try_to_munlock_one().

Patch 5 renames some existing code but does not introduce functionality.

Patch 6 is a small clean-up to swap entry handling in copy_pte_range().

Patch 7 contains the bulk of the implementation for device exclusive
memory.

Patch 8 contains some additions to the HMM selftests to ensure everything
works as expected.

Patch 9 is a cleanup for the Nouveau SVM implementation.

Patch 10 contains the implementation of atomic access for the Nouveau
driver.

Testing
=======

This has been tested with upstream Mesa 21.1.0 and a simple OpenCL program
which checks that GPU atomic accesses to system memory are atomic. Without
this series the test fails as there is no way of write-protecting the page
mapping which results in the device clobbering CPU writes. For reference
the test is available at https://ozlabs.org/~apopple/opencl_svm_atomics/

Further testing has been performed by adding support for testing exclusive
access to the hmm-tests kselftests.
On Wednesday, 16 June 2021 8:59:27 PM AEST Alistair Popple wrote:

Alistair Popple (10):
  mm: Remove special swap entry functions
  mm/swapops: Rework swap entry manipulation code
  mm/rmap: Split try_to_munlock from try_to_unmap
  mm/rmap: Split migration into its own function
  mm: Rename migrate_pgmap_owner
  mm/memory.c: Allow different return codes for copy_nonpresent_pte()
  mm: Device exclusive memory access
  mm: Selftests for exclusive device memory
  nouveau/svm: Refactor nouveau_range_fault
  nouveau/svm: Implement atomic SVM access

 Documentation/vm/hmm.rst                      |  19 +-
 Documentation/vm/unevictable-lru.rst          |  33 +-
 arch/s390/mm/pgtable.c                        |   2 +-
 drivers/gpu/drm/nouveau/include/nvif/if000c.h |   1 +
 drivers/gpu/drm/nouveau/nouveau_svm.c         | 156 ++++-
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h |   1 +
 .../drm/nouveau/nvkm/subdev/mmu/vmmgp100.c    |   6 +
 fs/proc/task_mmu.c                            |  23 +-
 include/linux/mmu_notifier.h                  |  26 +-
 include/linux/rmap.h                          |  11 +-
 include/linux/swap.h                          |  13 +-
 include/linux/swapops.h                       | 123 ++--
 lib/test_hmm.c                                | 127 +++-
 lib/test_hmm_uapi.h                           |   2 +
 mm/debug_vm_pgtable.c                         |  12 +-
 mm/hmm.c                                      |  12 +-
 mm/huge_memory.c                              |  48 +-
 mm/hugetlb.c                                  |  10 +-
 mm/memcontrol.c                               |   2 +-
 mm/memory.c                                   | 175 ++++-
 mm/migrate.c                                  |  51 +-
 mm/mlock.c                                    |  12 +-
 mm/mprotect.c                                 |  18 +-
 mm/page_vma_mapped.c                          |  15 +-
 mm/rmap.c                                     | 612 +++++++++++++++---
 tools/testing/selftests/vm/hmm-tests.c        | 158 +++++
 26 files changed, 1341 insertions(+), 327 deletions(-)
Alistair Popple June 17, 2021, 1:32 a.m. UTC | #3
On Thursday, 17 June 2021 9:35:29 AM AEST Andrew Morton wrote:
> On Wed, 16 Jun 2021 20:59:27 +1000 Alistair Popple <apopple@nvidia.com> 
wrote:
> 
> > This is my series to add support for SVM atomics in Nouveau
> 
> Can we please have a nice [0/n] overview for this patchset?
> 

Sorry, I forgot to include that this time. Please see the update I just 
posted. Let me know if you need me to resend the whole series with the updated 
cover letter. Thanks.

 - Alistair