Message ID: 20230722234746.205949-3-dmitry.osipenko@collabora.com (mailing list archive)
State: New, archived
Series: Add generic memory shrinker to VirtIO-GPU and Panfrost DRM drivers
On Sun, 23 Jul 2023 02:47:36 +0300
Dmitry Osipenko <dmitry.osipenko@collabora.com> wrote:

> Add a new pages_pin_count field to struct drm_gem_shmem_object that will
> determine whether pages are evictable by the memory shrinker. The pages
> will be evictable only when pages_pin_count=0. This patch prepares code
> for addition of the memory shrinker that will utilize the new field.
>
> Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
> ---
>  drivers/gpu/drm/drm_gem_shmem_helper.c | 9 +++++++++
>  include/drm/drm_gem_shmem_helper.h     | 9 +++++++++
>  2 files changed, 18 insertions(+)
>
> diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
> index 267153853e2c..42ba201dda50 100644
> --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
> +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
> @@ -274,15 +274,24 @@ static int drm_gem_shmem_pin_locked(struct drm_gem_shmem_object *shmem)
>          dma_resv_assert_held(shmem->base.resv);
>  
>          ret = drm_gem_shmem_get_pages(shmem);
> +        if (!ret)
> +                shmem->pages_pin_count++;
>  
>          return ret;
>  }
>  
>  static void drm_gem_shmem_unpin_locked(struct drm_gem_shmem_object *shmem)
>  {
> +        struct drm_gem_object *obj = &shmem->base;
> +
>          dma_resv_assert_held(shmem->base.resv);
>  
> +        if (drm_WARN_ON_ONCE(obj->dev, !shmem->pages_pin_count))
> +                return;
> +
>          drm_gem_shmem_put_pages(shmem);
> +
> +        shmem->pages_pin_count--;
>  }
>  
>  /**
> diff --git a/include/drm/drm_gem_shmem_helper.h b/include/drm/drm_gem_shmem_helper.h
> index bf0c31aa8fbe..7111f5743006 100644
> --- a/include/drm/drm_gem_shmem_helper.h
> +++ b/include/drm/drm_gem_shmem_helper.h
> @@ -39,6 +39,15 @@ struct drm_gem_shmem_object {
>           */
>          unsigned int pages_use_count;
>  
> +        /**
> +         * @pages_pin_count:
> +         *
> +         * Reference count on the pinned pages table.
> +         * The pages are allowed to be evicted by the memory shrinker
> +         * only when the count is zero.
> +         */
> +        unsigned int pages_pin_count;

Can we make it an atomic_t, so that we can avoid taking the lock when the
GEM has already been pinned? That's something I need in order to grab a
pin-ref in a path where the GEM resv lock is already held[1]. We could of
course expose the locked version, but in my case I want to enforce the
fact that the GEM has been pinned before the drm_gem_shmem_pin() call in
the section protected by the resv lock, so catching a "refcount 0 -> 1"
situation would be useful. Besides, using an atomic to avoid the
lock/unlock dance when refcount > 1 might be beneficial to everyone.

[1] https://gitlab.freedesktop.org/bbrezillon/linux/-/commit/4420fa0d5768ebdc35b34d58d4ae5fad9fbb93f9

> +
>          /**
>           * @madv: State for madvise
>           *
On Tue, 25 Jul 2023 09:27:09 +0200
Boris Brezillon <boris.brezillon@collabora.com> wrote:

> Can we make it an atomic_t, so that we can avoid taking the lock when
> the GEM has already been pinned? That's something I need in order to
> grab a pin-ref in a path where the GEM resv lock is already held[1]. We
> could of course expose the locked version,

My bad, that's actually not true. The problem is not that I call
drm_gem_shmem_pin() with the resv lock already held, but that I call
drm_gem_shmem_pin() in a dma-signaling path where I'm not allowed to take
a resv lock. I know for sure pin_count > 0, because all GEM objects
mapped to a VM have their memory pinned right now, and this should stand
until we decide to add support for live-GEM eviction, at which point
we'll probably have a way to detect when a GEM is evicted, and avoid
calling drm_gem_shmem_pin() on it.

TLDR; I can't trade the atomic_t for a drm_gem_shmem_pin_locked(),
because that wouldn't solve my problem. The other solution would be to
add an atomic_t at the driver-GEM level, and only call
drm_gem_shmem_[un]pin() on 0 <-> 1 transitions, but I thought using an
atomic at the GEM-shmem level, to avoid locking when we can, would be
beneficial to the rest of the eco-system. Let me know if that's not an
option, and I'll go back to the driver-specific atomic_t.

> but in my case I want to enforce the fact that the GEM has been pinned
> before the drm_gem_shmem_pin() call in the section protected by the
> resv lock, so catching a "refcount 0 -> 1" situation would be useful.
> Besides, using an atomic to avoid the lock/unlock dance when
> refcount > 1 might be beneficial to everyone.
>
> [1] https://gitlab.freedesktop.org/bbrezillon/linux/-/commit/4420fa0d5768ebdc35b34d58d4ae5fad9fbb93f9
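For illustration, a minimal sketch of the lockless fast path being
discussed could look as follows. It assumes pages_pin_count has been
converted to an atomic_t and that the function lives in
drm_gem_shmem_helper.c next to drm_gem_shmem_get_pages(); the name and
exact structure are assumptions, not the posted patch or the eventual
implementation.

/*
 * Sketch only: pin without the resv lock when the object is already
 * pinned, fall back to the locked slow path for the 0 -> 1 transition.
 */
static int drm_gem_shmem_pin_sketch(struct drm_gem_shmem_object *shmem)
{
        int ret;

        /* Fast path: N -> N+1 with N > 0 needs no locking. */
        if (atomic_inc_not_zero(&shmem->pages_pin_count))
                return 0;

        /* Slow path: take the resv lock to handle the 0 -> 1 transition. */
        ret = dma_resv_lock_interruptible(shmem->base.resv, NULL);
        if (ret)
                return ret;

        /* Re-check under the lock: someone may have pinned it meanwhile. */
        if (!atomic_inc_not_zero(&shmem->pages_pin_count)) {
                ret = drm_gem_shmem_get_pages(shmem);
                if (!ret)
                        atomic_set(&shmem->pages_pin_count, 1);
        }

        dma_resv_unlock(shmem->base.resv);

        return ret;
}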
On 7/25/23 11:32, Boris Brezillon wrote:
>> Can we make it an atomic_t, so that we can avoid taking the lock when
>> the GEM has already been pinned? That's something I need in order to
>> grab a pin-ref in a path where the GEM resv lock is already held[1].
>> We could of course expose the locked version,
> My bad, that's actually not true. The problem is not that I call
> drm_gem_shmem_pin() with the resv lock already held, but that I call
> drm_gem_shmem_pin() in a dma-signaling path where I'm not allowed to
> take a resv lock. I know for sure pin_count > 0, because all GEM objects
> mapped to a VM have their memory pinned right now, and this should
> stand until we decide to add support for live-GEM eviction, at which
> point we'll probably have a way to detect when a GEM is evicted, and
> avoid calling drm_gem_shmem_pin() on it.
>
> TLDR; I can't trade the atomic_t for a drm_gem_shmem_pin_locked(),
> because that wouldn't solve my problem. The other solution would be to
> add an atomic_t at the driver-GEM level, and only call
> drm_gem_shmem_[un]pin() on 0 <-> 1 transitions, but I thought using an
> atomic at the GEM-shmem level, to avoid locking when we can, would be
> beneficial to the rest of the eco-system. Let me know if that's not an
> option, and I'll go back to the driver-specific atomic_t.

Could you please explain why you need to pin the GEM in a signal handler?
This is not something drivers usually do or need to do. You likely also
shouldn't need to detect that a GEM is evicted in your driver. I'd expect
that Panthor shouldn't differ from Panfrost in regards to how GEM memory
management is done, and Panfrost doesn't need to do anything special.

Note that patch #14 makes the locked pin/unpin functions public and turns
the unlocked variants into helpers, so you'll be able to experiment with
these funcs in the Panthor driver.

In general, using atomic_t or kref should be a good thing to do, but
AFAICS it shouldn't bring benefits to today's drm-shmem users. I'd want
to understand what you're trying to achieve in the Panthor driver.
On 7/31/23 15:27, Dmitry Osipenko wrote:
> Note that patch #14 makes the locked pin/unpin functions public and
> turns the unlocked variants into helpers, so you'll be able to
> experiment with these funcs in the Panthor driver.

correction: that's patch #10
+Danilo, to confirm my understanding of the gpuva remap operation is
correct.

On Mon, 31 Jul 2023 15:27:31 +0300
Dmitry Osipenko <dmitry.osipenko@collabora.com> wrote:

> Could you please explain why you need to pin the GEM in a signal
> handler? This is not something drivers usually do or need to do. You
> likely also shouldn't need to detect that a GEM is evicted in your
> driver. I'd expect that Panthor shouldn't differ from Panfrost in
> regards to how GEM memory management is done, and Panfrost doesn't need
> to do anything special.

Panthor VM management is completely different, and the case I'm referring
to is 'asynchronous VM_BIND': mapping a GEM object to a GPU VM
asynchronously, so we can make it depend on other operations, encoded as
syncobjs passed to the VM_BIND operation.

Here is the workflow we have for this use case:

1. Create + push a VM_BIND job to the VM_BIND queue (a drm_sched_entity
that's taking care of asynchronous VM map/unmap operations). Because this
operation is asynchronous, and the execution itself happens in a
dma-signaling path (drm_sched::run_job()), we need to pre-allocate the
MMU page tables for the worst-case scenario, and make sure the GEM pages
are pinned at job creation time.

2. The VM operation itself is executed when all dependencies are met
(drm_sched calls run_job()). In case of a map operation, we call
drm_gpuva_sm_map(), which might split the map operation into
remap+unmap+map ones if the region being mapped covers a region that was
previously mapped to a different GEM object or a different portion of the
same GEM object (see the gpuva_mgr doc [1]). A remap operation is just a
way to split an existing mapping into 2 mappings covering the left/right
side of the previous mapping, plus a hole in the middle. This means that
our VM mapping object (drm_gpuva), which was pointing to a GEM object
that had its pages pinned, is now turned into 2 mapping objects, and we
need to make sure those 2 mappings own a reference to the pages,
otherwise we'll have an unbalanced refcount when we release those 2
mappings further down the road.

3. Release resources attached to mappings that were removed (that
includes releasing the ref we had on GEM pages) and free the mapping
objects. We do that asynchronously, outside of the dma-signaling path.

> Note that patch #14 makes the locked pin/unpin functions public and
> turns the unlocked variants into helpers, so you'll be able to
> experiment with these funcs in the Panthor driver.

Unfortunately, those won't help. I really need a way to increment the
refcount without holding the lock, because we're in a dma-signaling path
when we call drm_gpuva_sm_map(). Note that I could live with a
drm_gem_shmem_pin_if_already_pinned() variant that would return NULL if
pin_count == 0 instead of trying to acquire the lock, but I'd still need
this refcount to be an atomic_t.

As I said, an alternative to this approach would be to have a separate
atomic refcount at the panthor_gem_object level, but I feel like we'd
just be duplicating something that exists already.

[1] https://cgit.freedesktop.org/drm/drm-misc/tree/drivers/gpu/drm/drm_gpuva_mgr.c#n67
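To make the constraint above concrete, here is a sketch of the "pin only
if already pinned" helper and a remap step that would use it. The helper
name, the panthor_* callback, and the exact drm_gpuva_op field accesses
are assumptions based on the gpuva_mgr documentation and the discussion
here, not existing code, and the sketch again assumes an atomic_t
pages_pin_count.

/* Hypothetical helper: grabs a pin ref without the resv lock, but only
 * succeeds if the pages are already pinned. */
static bool drm_gem_shmem_pin_if_already_pinned(struct drm_gem_shmem_object *shmem)
{
        return atomic_inc_not_zero(&shmem->pages_pin_count);
}

/* Hypothetical remap step, invoked by drm_gpuva_sm_map() from run_job(),
 * i.e. from a dma-signaling path where no resv lock may be taken. */
static int panthor_vm_sm_step_remap(struct drm_gpuva_op *op, void *priv)
{
        struct drm_gem_object *obj = op->remap.unmap->va->gem.obj;
        struct drm_gem_shmem_object *shmem = to_drm_gem_shmem_obj(obj);

        /*
         * The split produces up to two new mappings (prev/next); each one
         * must own its own reference on the GEM pages. The ref held by
         * the mapping being removed is dropped later, outside the
         * dma-signaling path (step 3 above).
         */
        if (op->remap.prev && WARN_ON(!drm_gem_shmem_pin_if_already_pinned(shmem)))
                return -EINVAL;
        if (op->remap.next && WARN_ON(!drm_gem_shmem_pin_if_already_pinned(shmem)))
                return -EINVAL;

        /* Page-table and driver mapping-object updates would go here. */
        return 0;
}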
On 7/31/23 15:35, Boris Brezillon wrote:
> +Danilo, to confirm my understanding of the gpuva remap operation is
> correct.

Your understanding is correct. Unfortunately, re-mapping things has such
implications.

I'm currently working on tracking external GEM objects in the GPUVA
manager, where, ideally, you'd want to add the extobj to the VM when the
first mapping backed by this GEM is created and remove it when the last
mapping backed by this GEM is removed. Hence, extobjs need to be
ref-counted based on how many mappings they back.

However, when re-mapping such a mapping, the reference counter might drop
to 0 temporarily, and the slot of the data structure tracking the extobj
is cleaned up and needs to be re-allocated. Surely, we could just
increase the reference count while re-mapping or for the whole
transaction (job), but this would make the API kinda bulky.
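As a rough illustration of the ref-counted extobj tracking described
above: all names below are hypothetical, this is not the actual
GPUVA-manager code, only a sketch of the bookkeeping being discussed.

/* Hypothetical per-VM tracking entry for an external (shared) GEM.
 * num_mappings counts how many mappings in the VM are backed by @obj, so
 * the entry can be dropped when the last such mapping goes away. */
struct hypothetical_vm_extobj {
        struct drm_gem_object *obj;
        struct list_head vm_node;
        refcount_t num_mappings;
};

/* Called with the VM lock held when a mapping backed by the extobj is
 * added. A remap temporarily dropping the count to 0 is exactly the case
 * that forces re-insertion (and possibly re-allocation) of the entry. */
static void vm_extobj_account_map(struct list_head *vm_extobjs,
                                  struct hypothetical_vm_extobj *eobj)
{
        if (!refcount_inc_not_zero(&eobj->num_mappings)) {
                refcount_set(&eobj->num_mappings, 1);
                list_add_tail(&eobj->vm_node, vm_extobjs);
        }
}

/* Called with the VM lock held when a mapping backed by the extobj is
 * removed. */
static void vm_extobj_account_unmap(struct hypothetical_vm_extobj *eobj)
{
        if (refcount_dec_and_test(&eobj->num_mappings))
                list_del_init(&eobj->vm_node);
}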
On Wed, 2 Aug 2023 04:31:52 +0200
Danilo Krummrich <dakr@redhat.com> wrote:

> Your understanding is correct. Unfortunately, re-mapping things has
> such implications.
>
> I'm currently working on tracking external GEM objects in the GPUVA
> manager, where, ideally, you'd want to add the extobj to the VM when
> the first mapping backed by this GEM is created and remove it when the
> last mapping backed by this GEM is removed. Hence, extobjs need to be
> ref-counted based on how many mappings they back.

Uh, right. I went for a much simpler (but also less efficient) approach,
where I basically track things at the mapping level (my panthor_vma
object, which inherits from drm_gpuva, has a list node so it can be
inserted in a shared_bos list tracked at the VM level) instead of at the
GEM level. So we'd basically be trying to acquire resv locks multiple
times and reserving multiple slots if the same shared GEM is mapped
multiple times. With the IGNORE_DUPLICATES flag passed to drm_exec, that
works, but it might not be ideal if we expect shared BOs to be mapped
multiple times in the same VM.

> However, when re-mapping such a mapping, the reference counter might
> drop to 0 temporarily, and the slot of the data structure tracking the
> extobj is cleaned up and needs to be re-allocated. Surely, we could
> just increase the reference count while re-mapping or for the whole
> transaction (job), but this would make the API kinda bulky.

With things happening in the dma-signaling path, we'd need to
pre-allocate this shared-bo container object anyway, because we can't
assume there will be one available by the time we get to run the VM
operation. So I think it's safe to assume that, even if the unmap part of
the remap operation drops the last ref on this container object, when you
get to map the same BO again, you'll have another container to play with.
It's just a matter of pre-allocating one more thing when
bo_is_shared == true && op == map, I think.
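A minimal sketch of that pre-allocation idea, reusing the hypothetical
hypothetical_vm_extobj type from the earlier sketch; the job structure
and function names are made up for illustration, since run_job() executes
in a dma-signaling path where allocation is not allowed.

/* Hypothetical per-job scratch area filled at job-creation time and only
 * consumed later, from run_job(). */
struct hypothetical_vm_bind_job {
        /* Worst-case page-table memory reserved up front. */
        void *preallocated_pts;
        /* Spare extobj entry, consumed only if the map targets a shared
         * BO and the remap's unmap side dropped the last tracking entry. */
        struct hypothetical_vm_extobj *spare_extobj;
};

static int hypothetical_vm_bind_job_prepare(struct hypothetical_vm_bind_job *job,
                                            bool bo_is_shared, bool is_map)
{
        if (bo_is_shared && is_map) {
                job->spare_extobj = kzalloc(sizeof(*job->spare_extobj),
                                            GFP_KERNEL);
                if (!job->spare_extobj)
                        return -ENOMEM;
        }

        return 0;
}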
diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
index 267153853e2c..42ba201dda50 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -274,15 +274,24 @@ static int drm_gem_shmem_pin_locked(struct drm_gem_shmem_object *shmem)
         dma_resv_assert_held(shmem->base.resv);
 
         ret = drm_gem_shmem_get_pages(shmem);
+        if (!ret)
+                shmem->pages_pin_count++;
 
         return ret;
 }
 
 static void drm_gem_shmem_unpin_locked(struct drm_gem_shmem_object *shmem)
 {
+        struct drm_gem_object *obj = &shmem->base;
+
         dma_resv_assert_held(shmem->base.resv);
 
+        if (drm_WARN_ON_ONCE(obj->dev, !shmem->pages_pin_count))
+                return;
+
         drm_gem_shmem_put_pages(shmem);
+
+        shmem->pages_pin_count--;
 }
 
 /**
diff --git a/include/drm/drm_gem_shmem_helper.h b/include/drm/drm_gem_shmem_helper.h
index bf0c31aa8fbe..7111f5743006 100644
--- a/include/drm/drm_gem_shmem_helper.h
+++ b/include/drm/drm_gem_shmem_helper.h
@@ -39,6 +39,15 @@ struct drm_gem_shmem_object {
          */
         unsigned int pages_use_count;
 
+        /**
+         * @pages_pin_count:
+         *
+         * Reference count on the pinned pages table.
+         * The pages are allowed to be evicted by the memory shrinker
+         * only when the count is zero.
+         */
+        unsigned int pages_pin_count;
+
         /**
          * @madv: State for madvise
          *
Add a new pages_pin_count field to struct drm_gem_shmem_object that will
determine whether pages are evictable by the memory shrinker. The pages
will be evictable only when pages_pin_count=0. This patch prepares code
for addition of the memory shrinker that will utilize the new field.

Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
---
 drivers/gpu/drm/drm_gem_shmem_helper.c | 9 +++++++++
 include/drm/drm_gem_shmem_helper.h     | 9 +++++++++
 2 files changed, 18 insertions(+)
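For context, a minimal sketch of how a follow-up shrinker patch might use
the new field to decide whether an object is an eviction candidate; this
helper is illustrative only and is not part of this patch.

/* Sketch: pages can only be evicted when nobody holds a pin on them.
 * A real shrinker would also check the madvise state, userspace
 * mmap'ings, whether the object is already evicted/purged, etc. */
static bool drm_gem_shmem_is_evictable_sketch(struct drm_gem_shmem_object *shmem)
{
        dma_resv_assert_held(shmem->base.resv);

        return shmem->pages && !shmem->pages_pin_count;
}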