[v2] nouveau/hmm: map pages after migration

Message ID	20200303010023.2983-1-rcampbell@nvidia.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <SRS0=qryl=4U=kvack.org=owner-linux-mm@kernel.org> DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 0D65B24677 TLS: TLSv1.2, DES-CBC3-SHA) id <B5e5dac220000>; Mon, 02 Mar 2020 17:00:18 -0800 From: Ralph Campbell <rcampbell@nvidia.com> To: <dri-devel@lists.freedesktop.org>, <linux-rdma@vger.kernel.org>, <linux-mm@kvack.org>, <linux-kernel@vger.kernel.org>, <nouveau@lists.freedesktop.org> CC: Jerome Glisse <jglisse@redhat.com>, John Hubbard <jhubbard@nvidia.com>, Christoph Hellwig <hch@lst.de>, Jason Gunthorpe <jgg@mellanox.com>, "Andrew Morton" <akpm@linux-foundation.org>, Ben Skeggs <bskeggs@redhat.com>, "Ralph Campbell" <rcampbell@nvidia.com> Subject: [PATCH v2] nouveau/hmm: map pages after migration Date: Mon, 2 Mar 2020 17:00:23 -0800 Message-ID: <20200303010023.2983-1-rcampbell@nvidia.com> MIME-Version: 1.0 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Sender: owner-linux-mm@kvack.org Precedence: bulk
Series	[v2] nouveau/hmm: map pages after migration \| expand [v2] nouveau/hmm: map pages after migration

Message ID

20200303010023.2983-1-rcampbell@nvidia.com (mailing list archive)

State

New, archived

Headers

DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 0D65B24677
From: Ralph Campbell <rcampbell@nvidia.com>
To: <dri-devel@lists.freedesktop.org>, <linux-rdma@vger.kernel.org>,
	<linux-mm@kvack.org>, <linux-kernel@vger.kernel.org>,
	<nouveau@lists.freedesktop.org>
CC: Jerome Glisse <jglisse@redhat.com>, John Hubbard <jhubbard@nvidia.com>,
	Christoph Hellwig <hch@lst.de>, Jason Gunthorpe <jgg@mellanox.com>, "Andrew
 Morton" <akpm@linux-foundation.org>, Ben Skeggs <bskeggs@redhat.com>, "Ralph
 Campbell" <rcampbell@nvidia.com>
Subject: [PATCH v2] nouveau/hmm: map pages after migration
Date: Mon, 2 Mar 2020 17:00:23 -0800
Message-ID: <20200303010023.2983-1-rcampbell@nvidia.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Sender: owner-linux-mm@kvack.org
Precedence: bulk

Series

[v2] nouveau/hmm: map pages after migration | expand

Commit Message

Ralph Campbell March 3, 2020, 1 a.m. UTC

When memory is migrated to the GPU, it is likely to be accessed by GPU
code soon afterwards. Instead of waiting for a GPU fault, map the
migrated memory into the GPU page tables with the same access permissions
as the source CPU page table entries. This preserves copy on write
semantics.

Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
---

Originally this patch was targeted for Jason's rdma tree since other HMM
related changes were queued there. Now that those have been merged, this
patch just contains changes to nouveau so it could go through any tree.
I guess Ben Skeggs' tree would be appropriate.

Changes since v1:
 Rebase to linux-5.6.0-rc4
 Address Christoph Hellwig's comments

 drivers/gpu/drm/nouveau/nouveau_dmem.c | 44 ++++++++-----
 drivers/gpu/drm/nouveau/nouveau_svm.c  | 85 ++++++++++++++++++++++++++
 drivers/gpu/drm/nouveau/nouveau_svm.h  |  5 ++
 3 files changed, 118 insertions(+), 16 deletions(-)

Comments

Jason Gunthorpe March 3, 2020, 12:42 p.m. UTC | #1

On Mon, Mar 02, 2020 at 05:00:23PM -0800, Ralph Campbell wrote:
> When memory is migrated to the GPU, it is likely to be accessed by GPU
> code soon afterwards. Instead of waiting for a GPU fault, map the
> migrated memory into the GPU page tables with the same access permissions
> as the source CPU page table entries. This preserves copy on write
> semantics.
> 
> Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Jason Gunthorpe <jgg@mellanox.com>
> Cc: "Jérôme Glisse" <jglisse@redhat.com>
> Cc: Ben Skeggs <bskeggs@redhat.com>
> ---
> 
> Originally this patch was targeted for Jason's rdma tree since other HMM
> related changes were queued there. Now that those have been merged, this
> patch just contains changes to nouveau so it could go through any tree.
> I guess Ben Skeggs' tree would be appropriate.

Yep

> +static inline struct nouveau_pfnmap_args *
> +nouveau_pfns_to_args(void *pfns)

don't use static inline inside C files

> +{
> +	struct nvif_vmm_pfnmap_v0 *p =
> +		container_of(pfns, struct nvif_vmm_pfnmap_v0, phys);
> +
> +	return container_of(p, struct nouveau_pfnmap_args, p);

And this should just be 

   return container_of(pfns, struct nouveau_pfnmap_args, p.phys);

> +static struct nouveau_svmm *
> +nouveau_find_svmm(struct nouveau_svm *svm, struct mm_struct *mm)
> +{
> +	struct nouveau_ivmm *ivmm;
> +
> +	list_for_each_entry(ivmm, &svm->inst, head) {
> +		if (ivmm->svmm->notifier.mm == mm)
> +			return ivmm->svmm;
> +	}
> +	return NULL;
> +}

Is this re-implementing mmu_notifier_get() ?

Jason

Ralph Campbell March 3, 2020, 9:15 p.m. UTC | #2

On 3/3/20 4:42 AM, Jason Gunthorpe wrote:
> On Mon, Mar 02, 2020 at 05:00:23PM -0800, Ralph Campbell wrote:
>> When memory is migrated to the GPU, it is likely to be accessed by GPU
>> code soon afterwards. Instead of waiting for a GPU fault, map the
>> migrated memory into the GPU page tables with the same access permissions
>> as the source CPU page table entries. This preserves copy on write
>> semantics.
>>
>> Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
>> Cc: Christoph Hellwig <hch@lst.de>
>> Cc: Jason Gunthorpe <jgg@mellanox.com>
>> Cc: "Jérôme Glisse" <jglisse@redhat.com>
>> Cc: Ben Skeggs <bskeggs@redhat.com>
>> ---
>>
>> Originally this patch was targeted for Jason's rdma tree since other HMM
>> related changes were queued there. Now that those have been merged, this
>> patch just contains changes to nouveau so it could go through any tree.
>> I guess Ben Skeggs' tree would be appropriate.
> 
> Yep
> 
>> +static inline struct nouveau_pfnmap_args *
>> +nouveau_pfns_to_args(void *pfns)
> 
> don't use static inline inside C files

OK.

>> +{
>> +	struct nvif_vmm_pfnmap_v0 *p =
>> +		container_of(pfns, struct nvif_vmm_pfnmap_v0, phys);
>> +
>> +	return container_of(p, struct nouveau_pfnmap_args, p);
> 
> And this should just be
> 
>     return container_of(pfns, struct nouveau_pfnmap_args, p.phys);

Much simpler, thanks.

>> +static struct nouveau_svmm *
>> +nouveau_find_svmm(struct nouveau_svm *svm, struct mm_struct *mm)
>> +{
>> +	struct nouveau_ivmm *ivmm;
>> +
>> +	list_for_each_entry(ivmm, &svm->inst, head) {
>> +		if (ivmm->svmm->notifier.mm == mm)
>> +			return ivmm->svmm;
>> +	}
>> +	return NULL;
>> +}
> 
> Is this re-implementing mmu_notifier_get() ?
> 
> Jason

Not quite. This is being called from an ioctl() call on the GPU device
file which calls nouveau_svmm_bind() which locks mmap_sem for reading,
walks the vmas for the address range given in the ioctl() data, and migrates
the pages to the GPU memory.
mmu_notifier_get() would try to lock mmap_sem for writing so that would deadlock.
But it is similar in that the GPU specific process context (nouveau_svmm) needs
to be found for the given ioctl caller.
If find_get_mmu_notifier() was exported, I think that could work.
Now that I look at this again, there is an easier way to find the svmm and I see
some other bugs that need fixing. I'll post a v3 as soon as I get those written
and tested.

Thanks for the review.

Christoph Hellwig March 4, 2020, 4:27 p.m. UTC | #3

On Tue, Mar 03, 2020 at 01:15:21PM -0800, Ralph Campbell wrote:
>>> +static inline struct nouveau_pfnmap_args *
>>> +nouveau_pfns_to_args(void *pfns)
>>
>> don't use static inline inside C files
>
> OK.
>
>>> +{
>>> +	struct nvif_vmm_pfnmap_v0 *p =
>>> +		container_of(pfns, struct nvif_vmm_pfnmap_v0, phys);
>>> +
>>> +	return container_of(p, struct nouveau_pfnmap_args, p);
>>
>> And this should just be
>>
>>     return container_of(pfns, struct nouveau_pfnmap_args, p.phys);
>
> Much simpler, thanks.

Btw, for the case where we just have an container_of wrapper I strongly
disagree with avoiding the inline - not inlining this would be stupid,
but unfortunately compilers often behave stupidly.  It also is a very
clear marker.

diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
index 0ad5d87b5a8e..172e0c98cec5 100644
--- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
@@ -25,11 +25,13 @@ 
 #include "nouveau_dma.h"
 #include "nouveau_mem.h"
 #include "nouveau_bo.h"
+#include "nouveau_svm.h"
 
 #include <nvif/class.h>
 #include <nvif/object.h>
 #include <nvif/if500b.h>
 #include <nvif/if900b.h>
+#include <nvif/if000c.h>
 
 #include <linux/sched/mm.h>
 #include <linux/hmm.h>
@@ -558,10 +560,11 @@  nouveau_dmem_init(struct nouveau_drm *drm)
 }
 
 static unsigned long nouveau_dmem_migrate_copy_one(struct nouveau_drm *drm,
-		unsigned long src, dma_addr_t *dma_addr)
+		unsigned long src, dma_addr_t *dma_addr, u64 *pfn)
 {
 	struct device *dev = drm->dev->dev;
 	struct page *dpage, *spage;
+	unsigned long paddr;
 
 	spage = migrate_pfn_to_page(src);
 	if (!spage || !(src & MIGRATE_PFN_MIGRATE))
@@ -569,17 +572,21 @@  static unsigned long nouveau_dmem_migrate_copy_one(struct nouveau_drm *drm,
 
 	dpage = nouveau_dmem_page_alloc_locked(drm);
 	if (!dpage)
-		return 0;
+		goto out;
 
 	*dma_addr = dma_map_page(dev, spage, 0, PAGE_SIZE, DMA_BIDIRECTIONAL);
 	if (dma_mapping_error(dev, *dma_addr))
 		goto out_free_page;
 
+	paddr = nouveau_dmem_page_addr(dpage);
 	if (drm->dmem->migrate.copy_func(drm, 1, NOUVEAU_APER_VRAM,
-			nouveau_dmem_page_addr(dpage), NOUVEAU_APER_HOST,
-			*dma_addr))
+			paddr, NOUVEAU_APER_HOST, *dma_addr))
 		goto out_dma_unmap;
 
+	*pfn = NVIF_VMM_PFNMAP_V0_V | NVIF_VMM_PFNMAP_V0_VRAM |
+		((paddr >> PAGE_SHIFT) << NVIF_VMM_PFNMAP_V0_ADDR_SHIFT);
+	if (src & MIGRATE_PFN_WRITE)
+		*pfn |= NVIF_VMM_PFNMAP_V0_W;
 	return migrate_pfn(page_to_pfn(dpage)) | MIGRATE_PFN_LOCKED;
 
 out_dma_unmap:
@@ -587,18 +594,19 @@  static unsigned long nouveau_dmem_migrate_copy_one(struct nouveau_drm *drm,
 out_free_page:
 	nouveau_dmem_page_free_locked(drm, dpage);
 out:
+	*pfn = NVIF_VMM_PFNMAP_V0_NONE;
 	return 0;
 }
 
 static void nouveau_dmem_migrate_chunk(struct nouveau_drm *drm,
-		struct migrate_vma *args, dma_addr_t *dma_addrs)
+		struct migrate_vma *args, dma_addr_t *dma_addrs, u64 *pfns)
 {
 	struct nouveau_fence *fence;
 	unsigned long addr = args->start, nr_dma = 0, i;
 
 	for (i = 0; addr < args->end; i++) {
 		args->dst[i] = nouveau_dmem_migrate_copy_one(drm, args->src[i],
-				dma_addrs + nr_dma);
+				dma_addrs + nr_dma, pfns + i);
 		if (args->dst[i])
 			nr_dma++;
 		addr += PAGE_SIZE;
@@ -607,15 +615,12 @@  static void nouveau_dmem_migrate_chunk(struct nouveau_drm *drm,
 	nouveau_fence_new(drm->dmem->migrate.chan, false, &fence);
 	migrate_vma_pages(args);
 	nouveau_dmem_fence_done(&fence);
+	nouveau_pfns_map(drm, args->vma->vm_mm, args->start, pfns, i);
 
 	while (nr_dma--) {
 		dma_unmap_page(drm->dev->dev, dma_addrs[nr_dma], PAGE_SIZE,
 				DMA_BIDIRECTIONAL);
 	}
-	/*
-	 * FIXME optimization: update GPU page table to point to newly migrated
-	 * memory.
-	 */
 	migrate_vma_finalize(args);
 }
 
@@ -632,7 +637,8 @@  nouveau_dmem_migrate_vma(struct nouveau_drm *drm,
 		.vma		= vma,
 		.start		= start,
 	};
-	unsigned long c, i;
+	unsigned long i;
+	u64 *pfns;
 	int ret = -ENOMEM;
 
 	args.src = kcalloc(max, sizeof(*args.src), GFP_KERNEL);
@@ -646,19 +652,25 @@  nouveau_dmem_migrate_vma(struct nouveau_drm *drm,
 	if (!dma_addrs)
 		goto out_free_dst;
 
-	for (i = 0; i < npages; i += c) {
-		c = min(SG_MAX_SINGLE_ALLOC, npages);
-		args.end = start + (c << PAGE_SHIFT);
+	pfns = nouveau_pfns_alloc(max);
+	if (!pfns)
+		goto out_free_dma;
+
+	for (i = 0; i < npages; i += max) {
+		args.end = start + (max << PAGE_SHIFT);
 		ret = migrate_vma_setup(&args);
 		if (ret)
-			goto out_free_dma;
+			goto out_free_pfns;
 
 		if (args.cpages)
-			nouveau_dmem_migrate_chunk(drm, &args, dma_addrs);
+			nouveau_dmem_migrate_chunk(drm, &args, dma_addrs,
+						   pfns);
 		args.start = args.end;
 	}
 
 	ret = 0;
+out_free_pfns:
+	nouveau_pfns_free(pfns);
 out_free_dma:
 	kfree(dma_addrs);
 out_free_dst:
diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c
index df9bf1fd1bc0..8c629918a3c6 100644
--- a/drivers/gpu/drm/nouveau/nouveau_svm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
@@ -70,6 +70,12 @@  struct nouveau_svm {
 #define SVM_DBG(s,f,a...) NV_DEBUG((s)->drm, "svm: "f"\n", ##a)
 #define SVM_ERR(s,f,a...) NV_WARN((s)->drm, "svm: "f"\n", ##a)
 
+struct nouveau_pfnmap_args {
+	struct nvif_ioctl_v0 i;
+	struct nvif_ioctl_mthd_v0 m;
+	struct nvif_vmm_pfnmap_v0 p;
+};
+
 struct nouveau_ivmm {
 	struct nouveau_svmm *svmm;
 	u64 inst;
@@ -782,6 +788,85 @@  nouveau_svm_fault(struct nvif_notify *notify)
 	return NVIF_NOTIFY_KEEP;
 }
 
+static inline struct nouveau_pfnmap_args *
+nouveau_pfns_to_args(void *pfns)
+{
+	struct nvif_vmm_pfnmap_v0 *p =
+		container_of(pfns, struct nvif_vmm_pfnmap_v0, phys);
+
+	return container_of(p, struct nouveau_pfnmap_args, p);
+}
+
+u64 *
+nouveau_pfns_alloc(unsigned long npages)
+{
+	struct nouveau_pfnmap_args *args;
+
+	args = kzalloc(struct_size(args, p.phys, npages), GFP_KERNEL);
+	if (!args)
+		return NULL;
+
+	args->i.type = NVIF_IOCTL_V0_MTHD;
+	args->m.method = NVIF_VMM_V0_PFNMAP;
+	args->p.page = PAGE_SHIFT;
+
+	return args->p.phys;
+}
+
+void
+nouveau_pfns_free(u64 *pfns)
+{
+	struct nouveau_pfnmap_args *args = nouveau_pfns_to_args(pfns);
+
+	kfree(args);
+}
+
+static struct nouveau_svmm *
+nouveau_find_svmm(struct nouveau_svm *svm, struct mm_struct *mm)
+{
+	struct nouveau_ivmm *ivmm;
+
+	list_for_each_entry(ivmm, &svm->inst, head) {
+		if (ivmm->svmm->notifier.mm == mm)
+			return ivmm->svmm;
+	}
+	return NULL;
+}
+
+void
+nouveau_pfns_map(struct nouveau_drm *drm, struct mm_struct *mm,
+		 unsigned long addr, u64 *pfns, unsigned long npages)
+{
+	struct nouveau_svm *svm = drm->svm;
+	struct nouveau_svmm *svmm;
+	struct nouveau_pfnmap_args *args;
+	int ret;
+
+	if (!svm)
+		return;
+
+	mutex_lock(&svm->mutex);
+	svmm = nouveau_find_svmm(svm, mm);
+	if (!svmm) {
+		mutex_unlock(&svm->mutex);
+		return;
+	}
+	mutex_unlock(&svm->mutex);
+
+	args = nouveau_pfns_to_args(pfns);
+	args->p.addr = addr;
+	args->p.size = npages << PAGE_SHIFT;
+
+	mutex_lock(&svmm->mutex);
+
+	svmm->vmm->vmm.object.client->super = true;
+	ret = nvif_object_ioctl(&svmm->vmm->vmm.object, args, sizeof(*args) +
+				npages * sizeof(args->p.phys[0]), NULL);
+	svmm->vmm->vmm.object.client->super = false;
+
+	mutex_unlock(&svmm->mutex);
+}
+
 static void
 nouveau_svm_fault_buffer_fini(struct nouveau_svm *svm, int id)
 {
diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.h b/drivers/gpu/drm/nouveau/nouveau_svm.h
index e839d8189461..0649f8d587a8 100644
--- a/drivers/gpu/drm/nouveau/nouveau_svm.h
+++ b/drivers/gpu/drm/nouveau/nouveau_svm.h
@@ -18,6 +18,11 @@  void nouveau_svmm_fini(struct nouveau_svmm **);
 int nouveau_svmm_join(struct nouveau_svmm *, u64 inst);
 void nouveau_svmm_part(struct nouveau_svmm *, u64 inst);
 int nouveau_svmm_bind(struct drm_device *, void *, struct drm_file *);
+
+u64 *nouveau_pfns_alloc(unsigned long npages);
+void nouveau_pfns_free(u64 *pfns);
+void nouveau_pfns_map(struct nouveau_drm *drm, struct mm_struct *mm,
+		      unsigned long addr, u64 *pfns, unsigned long npages);
 #else /* IS_ENABLED(CONFIG_DRM_NOUVEAU_SVM) */
 static inline void nouveau_svm_init(struct nouveau_drm *drm) {}
 static inline void nouveau_svm_fini(struct nouveau_drm *drm) {}

[v2] nouveau/hmm: map pages after migration

Commit Message

Comments

Patch