
[0/4] drm/gpuvm: Add support for single-page-filled mappings

Message ID 20250202-gpuvm-single-page-v1-0-8cbd44fdcbd4@asahilina.net (mailing list archive)

Message

Asahi Lina Feb. 2, 2025, 1:34 p.m. UTC
Some hardware requires dummy page mappings to efficiently implement
Vulkan sparse features. These mappings consist of the same physical
memory page, repeated for a large range of address space (e.g. 16GiB).

Add support for this to drm_gpuvm. Currently, drm_gpuvm expects BO
ranges to correspond 1:1 to virtual memory ranges that are mapped, and
does math on the BO offset accordingly. To make single page mappings
work, we need a way to turn off that math, keeping the BO offset always
constant and pointing to the same page (typically BO offset 0).

To make this work, we need to handle all the corner cases when these
mappings intersect with regular mappings. The rules are simply to never
mix or merge a "regular" mapping with a single page mapping.

drm_gpuvm has support for a flags field in drm_gpuva objects. This is
normally managed by drivers directly. We can introduce a
DRM_GPUVA_SINGLE_PAGE flag to handle this. However, to make it work,
sm_map and friends need to know ahead of time whether the new mapping is
a single page mapping or not. Therefore, we need to add an argument to
these functions so drivers can provide the flags to be filled into
drm_gpuva.flags.
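
As a rough sketch, a driver-side call with the extended API could look
like this (the parameter order of the new flags argument is an
assumption here; only the idea of passing DRM_GPUVA_SINGLE_PAGE through
sm_map is what the series actually proposes):

    /* Regular mapping: the BO offset advances with the VA as usual. */
    ret = drm_gpuvm_sm_map(gpuvm, priv, va_addr, va_range,
                           obj, bo_offset, 0);

    /* Sparse placeholder: the same BO page backs the whole VA range. */
    ret = drm_gpuvm_sm_map(gpuvm, priv, va_addr, va_range,
                           dummy_obj, 0, DRM_GPUVA_SINGLE_PAGE);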

These changes should not affect any existing drivers that use drm_gpuvm
other than the API change:

- imagination: Does not use flags at all
- nouveau: Only uses drm_gpuva_invalidate(), which is only called on
  existing drm_gpuva objects (after the map steps)
- panthor: Does not use flags at all
- xe: Does not use drm_gpuva_init_from_op() or
  drm_gpuva_remap()/drm_gpuva_map() (which call it). This means that the
  flags field of the gpuva object is managed by the driver only, so these
  changes cannot clobber it.

Note that the way this is implemented, drm_gpuvm does not need to know
the GPU page size. It only has to never do math on the BO offset to meet
the requirements.
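
To make the "no math on the BO offset" rule concrete, the remap-side
offset computation amounts to something like the following (a
hypothetical helper written for illustration, not code from the
patches):

    static u64 remap_bo_offset(const struct drm_gpuva *va, u64 new_addr)
    {
            /* Single-page mappings keep the BO offset fixed... */
            if (va->flags & DRM_GPUVA_SINGLE_PAGE)
                    return va->gem.offset;

            /* ...regular mappings advance it along with the VA. */
            return va->gem.offset + (new_addr - va->va.addr);
    }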

I suspect that after this change there could be some cleanup possible in
the xe driver (which right now passes flags around in various
driver-specific ways from the map step through to drm_gpuva objects),
but I'll leave that to the Xe folks.

Signed-off-by: Asahi Lina <lina@asahilina.net>
---
Asahi Lina (4):
      drm/gpuvm: Add a flags argument to drm_gpuvm_sm_map[_*]
      drm/gpuvm: Plumb through flags into drm_gpuva_op_map
      drm/gpuvm: Add DRM_GPUVA_SINGLE_PAGE flag and logic
      drm/gpuvm: Plumb through flags into drm_gpuva_init

 drivers/gpu/drm/drm_gpuvm.c            | 72 ++++++++++++++++++++++++++--------
 drivers/gpu/drm/imagination/pvr_vm.c   |  3 +-
 drivers/gpu/drm/nouveau/nouveau_uvmm.c |  3 +-
 drivers/gpu/drm/panthor/panthor_mmu.c  |  3 +-
 drivers/gpu/drm/xe/xe_vm.c             |  3 +-
 include/drm/drm_gpuvm.h                | 26 +++++++++---
 6 files changed, 84 insertions(+), 26 deletions(-)
---
base-commit: ffd294d346d185b70e28b1a28abe367bbfe53c04
change-id: 20250202-gpuvm-single-page-253346a74677

Cheers,
~~ Lina

Comments

Danilo Krummrich Feb. 2, 2025, 6:53 p.m. UTC | #1
Hi Lina,

On Sun, Feb 02, 2025 at 10:34:49PM +0900, Asahi Lina wrote:
> Some hardware requires dummy page mappings to efficiently implement
> Vulkan sparse features. These mappings consist of the same physical
> memory page, repeated for a large range of address space (e.g. 16GiB).
> 
> Add support for this to drm_gpuvm. Currently, drm_gpuvm expects BO
> ranges to correspond 1:1 to virtual memory ranges that are mapped, and
> does math on the BO offset accordingly. To make single page mappings
> work, we need a way to turn off that math, keeping the BO offset always
> constant and pointing to the same page (typically BO offset 0).
> 
> To make this work, we need to handle all the corner cases when these
> mappings intersect with regular mappings. The rules are simply to never
> mix or merge a "regular" mapping with a single page mapping.
> 
> drm_gpuvm has support for a flags field in drm_gpuva objects. This is
> normally managed by drivers directly. We can introduce a
> DRM_GPUVA_SINGLE_PAGE flag to handle this. However, to make it work,
> sm_map and friends need to know ahead of time whether the new mapping is
> a single page mapping or not. Therefore, we need to add an argument to
> these functions so drivers can provide the flags to be filled into
> drm_gpuva.flags.
> 
> These changes should not affect any existing drivers that use drm_gpuvm
> other than the API change:
> 
> - imagination: Does not use flags at all
> - nouveau: Only uses drm_gpuva_invalidate(), which is only called on
>   existing drm_gpuva objects (after the map steps)
> - panthor: Does not use flags at all
> - xe: Does not use drm_gpuva_init_from_op() or
>   drm_gpuva_remap()/drm_gpuva_map() (which call it). This means that the
> flags field of the gpuva object is managed by the driver only, so these
> changes cannot clobber it.
> 
> Note that the way this is implemented, drm_gpuvm does not need to know
> the GPU page size. It only has to never do math on the BO offset to meet
> the requirements.
> 
> I suspect that after this change there could be some cleanup possible in
> the xe driver (which right now passes flags around in various
> driver-specific ways from the map step through to drm_gpuva objects),
> but I'll leave that to the Xe folks.
> 
> Signed-off-by: Asahi Lina <lina@asahilina.net>
> ---
> Asahi Lina (4):
>       drm/gpuvm: Add a flags argument to drm_gpuvm_sm_map[_*]
>       drm/gpuvm: Plumb through flags into drm_gpuva_op_map
>       drm/gpuvm: Add DRM_GPUVA_SINGLE_PAGE flag and logic
>       drm/gpuvm: Plumb through flags into drm_gpuva_init

Without looking into any details yet:

This is a bit of a tricky one, since we're not even close to having a user for
this new feature upstream yet, are we?

I wonder if we could make an exception by adding some KUnit tests (which
admittedly I never got to) validating things with and without this new feature.

Speaking of tests, how did you test this change to validate the behavior
without DRM_GPUVA_SINGLE_PAGE?

- Danilo
Asahi Lina Feb. 2, 2025, 11:56 p.m. UTC | #2
On 2/3/25 3:53 AM, Danilo Krummrich wrote:
> Hi Lina,
> 
> On Sun, Feb 02, 2025 at 10:34:49PM +0900, Asahi Lina wrote:
>> Some hardware requires dummy page mappings to efficiently implement
>> Vulkan sparse features. These mappings consist of the same physical
>> memory page, repeated for a large range of address space (e.g. 16GiB).
>>
>> Add support for this to drm_gpuvm. Currently, drm_gpuvm expects BO
>> ranges to correspond 1:1 to virtual memory ranges that are mapped, and
>> does math on the BO offset accordingly. To make single page mappings
>> work, we need a way to turn off that math, keeping the BO offset always
>> constant and pointing to the same page (typically BO offset 0).
>>
>> To make this work, we need to handle all the corner cases when these
>> mappings intersect with regular mappings. The rules are simply to never
>> mix or merge a "regular" mapping with a single page mapping.
>>
>> drm_gpuvm has support for a flags field in drm_gpuva objects. This is
>> normally managed by drivers directly. We can introduce a
>> DRM_GPUVA_SINGLE_PAGE flag to handle this. However, to make it work,
>> sm_map and friends need to know ahead of time whether the new mapping is
>> a single page mapping or not. Therefore, we need to add an argument to
>> these functions so drivers can provide the flags to be filled into
>> drm_gpuva.flags.
>>
>> These changes should not affect any existing drivers that use drm_gpuvm
>> other than the API change:
>>
>> - imagination: Does not use flags at all
>> - nouveau: Only uses drm_gpuva_invalidate(), which is only called on
>>   existing drm_gpuva objects (after the map steps)
>> - panthor: Does not use flags at all
>> - xe: Does not use drm_gpuva_init_from_op() or
>>   drm_gpuva_remap()/drm_gpuva_map() (which call it). This means that the
>> flags field of the gpuva object is managed by the driver only, so these
>> changes cannot clobber it.
>>
>> Note that the way this is implemented, drm_gpuvm does not need to know
>> the GPU page size. It only has to never do math on the BO offset to meet
>> the requirements.
>>
>> I suspect that after this change there could be some cleanup possible in
>> the xe driver (which right now passes flags around in various
>> driver-specific ways from the map step through to drm_gpuva objects),
>> but I'll leave that to the Xe folks.
>>
>> Signed-off-by: Asahi Lina <lina@asahilina.net>
>> ---
>> Asahi Lina (4):
>>       drm/gpuvm: Add a flags argument to drm_gpuvm_sm_map[_*]
>>       drm/gpuvm: Plumb through flags into drm_gpuva_op_map
>>       drm/gpuvm: Add DRM_GPUVA_SINGLE_PAGE flag and logic
>>       drm/gpuvm: Plumb through flags into drm_gpuva_init
> 
> Without looking into any details yet:
> 
> This is a bit of tricky one, since we're not even close to having a user for
> this new feature upstream yet, are we?

I'd hope we're at least somewhere "this year" close to upstreaming
drm/asahi!

> 
> I wonder if we could do an exception by adding some KUnit tests (which
> admittedly I never got to) validating things with and without this new feature.
> 
> Speaking of tests, how did you validate this change to validate the behavior
> without DRM_GPUVA_SINGLE_PAGE?

Mostly just making sure our driver passes GL/Vulkan CTS including sparse
after this change. I do want to put together some low-level tests in
igt-gpu-tools, but I haven't gotten around to it yet...

~~ Lina
Boris Brezillon Feb. 3, 2025, 9:21 a.m. UTC | #3
+Akash with whom we've been discussing adding a 'REPEAT' mode to
drm_gpuvm/panthor.

On Sun, 2 Feb 2025 19:53:47 +0100
Danilo Krummrich <dakr@kernel.org> wrote:

> Hi Lina,
> 
> On Sun, Feb 02, 2025 at 10:34:49PM +0900, Asahi Lina wrote:
> > Some hardware requires dummy page mappings to efficiently implement
> > Vulkan sparse features. These mappings consist of the same physical
> > memory page, repeated for a large range of address space (e.g. 16GiB).
> > 
> > Add support for this to drm_gpuvm. Currently, drm_gpuvm expects BO
> > ranges to correspond 1:1 to virtual memory ranges that are mapped, and
> > does math on the BO offset accordingly. To make single page mappings
> > work, we need a way to turn off that math, keeping the BO offset always
> > constant and pointing to the same page (typically BO offset 0).
> > 
> > To make this work, we need to handle all the corner cases when these
> > mappings intersect with regular mappings. The rules are simply to never
> > mix or merge a "regular" mapping with a single page mapping.
> > 
> > drm_gpuvm has support for a flags field in drm_gpuva objects. This is
> > normally managed by drivers directly. We can introduce a
> > DRM_GPUVA_SINGLE_PAGE flag to handle this. However, to make it work,
> > sm_map and friends need to know ahead of time whether the new mapping is
> > a single page mapping or not. Therefore, we need to add an argument to
> > these functions so drivers can provide the flags to be filled into
> > drm_gpuva.flags.
> > 
> > These changes should not affect any existing drivers that use drm_gpuvm
> > other than the API change:
> > 
> > - imagination: Does not use flags at all
> > - nouveau: Only uses drm_gpuva_invalidate(), which is only called on
> >   existing drm_gpuva objects (after the map steps)
> > - panthor: Does not use flags at all
> > - xe: Does not use drm_gpuva_init_from_op() or
> >   drm_gpuva_remap()/drm_gpuva_map() (which call it). This means that the
> > flags field of the gpuva object is managed by the driver only, so these
> > changes cannot clobber it.
> > 
> > Note that the way this is implemented, drm_gpuvm does not need to know
> > the GPU page size. It only has to never do math on the BO offset to meet
> > the requirements.
> > 
> > I suspect that after this change there could be some cleanup possible in
> > the xe driver (which right now passes flags around in various
> > driver-specific ways from the map step through to drm_gpuva objects),
> > but I'll leave that to the Xe folks.
> > 
> > Signed-off-by: Asahi Lina <lina@asahilina.net>
> > ---
> > Asahi Lina (4):
> >       drm/gpuvm: Add a flags argument to drm_gpuvm_sm_map[_*]
> >       drm/gpuvm: Plumb through flags into drm_gpuva_op_map
> >       drm/gpuvm: Add DRM_GPUVA_SINGLE_PAGE flag and logic
> >       drm/gpuvm: Plumb through flags into drm_gpuva_init  
> 
> Without looking into any details yet:
> 
> This is a bit of tricky one, since we're not even close to having a user for
> this new feature upstream yet, are we?

Actually, we would be interested in having this feature hooked up in
panthor. One use case we have is Vulkan sparse bindings, of course. But
we also have cases where we need to map a dummy page repeatedly on the
FW side. The approach we've been considering is slightly different:
pass a DRM_GPUVA_REPEAT_FLAG along with a GEM range, so we can repeat a
range of the GEM (see the below diff, which is completely untested by
the way), but I think we'd be fine with this SINGLE_PAGE flag.

--->8---
diff --git a/drivers/gpu/drm/drm_gpuvm.c b/drivers/gpu/drm/drm_gpuvm.c
index f9eb56f24bef..ea61f3ffaddf 100644
--- a/drivers/gpu/drm/drm_gpuvm.c
+++ b/drivers/gpu/drm/drm_gpuvm.c
@@ -2053,16 +2053,17 @@ EXPORT_SYMBOL_GPL(drm_gpuva_unmap);
 
 static int
 op_map_cb(const struct drm_gpuvm_ops *fn, void *priv,
-      u64 addr, u64 range,
-      struct drm_gem_object *obj, u64 offset)
+      u64 addr, u64 va_range,
+      struct drm_gem_object *obj, u64 offset, u64 gem_range)
 {
     struct drm_gpuva_op op = {};
 
     op.op = DRM_GPUVA_OP_MAP;
     op.map.va.addr = addr;
-    op.map.va.range = range;
+    op.map.va.range = va_range;
     op.map.gem.obj = obj;
     op.map.gem.offset = offset;
+    op.map.gem.range = gem_range;
 
     return fn->sm_step_map(&op, priv);
 }
@@ -2102,7 +2103,8 @@ static int
 __drm_gpuvm_sm_map(struct drm_gpuvm *gpuvm,
            const struct drm_gpuvm_ops *ops, void *priv,
            u64 req_addr, u64 req_range,
-           struct drm_gem_object *req_obj, u64 req_offset)
+           struct drm_gem_object *req_obj,
+           u64 req_offset, u64 req_gem_range)
 {
     struct drm_gpuva *va, *next;
     u64 req_end = req_addr + req_range;
@@ -2237,7 +2239,7 @@ __drm_gpuvm_sm_map(struct drm_gpuvm *gpuvm,
 
     return op_map_cb(ops, priv,
              req_addr, req_range,
-             req_obj, req_offset);
+             req_obj, req_offset, req_gem_range);
 }
 
 static int
@@ -2344,10 +2346,43 @@ drm_gpuvm_sm_map(struct drm_gpuvm *gpuvm, void *priv,
 
     return __drm_gpuvm_sm_map(gpuvm, ops, priv,
                   req_addr, req_range,
-                  req_obj, req_offset);
+                  req_obj, req_offset, 0);
 }
 EXPORT_SYMBOL_GPL(drm_gpuvm_sm_map);
 
+/**
+ * drm_gpuvm_sm_map_repeat() - repeatedly maps a GEM range over a VA range
+ * @gpuvm: the &drm_gpuvm representing the GPU VA space
+ * @priv: pointer to a driver private data structure
+ * @req_addr: the start address of the new mapping
+ * @req_range: the range of the new mapping
+ * @req_obj: the &drm_gem_object to map
+ * @req_offset: the offset within the &drm_gem_object
+ * @req_gem_range: the range within the &drm_gem_object to repeat
+ *
+ * Same as drm_gpuvm_sm_map() except it repeats a GEM range over a VA range
+ *
+ * Returns: 0 on success or a negative error code
+ */
+int
+drm_gpuvm_sm_map_repeat(struct drm_gpuvm *gpuvm, void *priv,
+            u64 req_addr, u64 req_range,
+            struct drm_gem_object *req_obj,
+            u64 req_offset, u64 req_gem_range)
+{
+    const struct drm_gpuvm_ops *ops = gpuvm->ops;
+
+    if (unlikely(!(ops && ops->sm_step_map &&
+               ops->sm_step_remap &&
+               ops->sm_step_unmap)))
+        return -EINVAL;
+
+    return __drm_gpuvm_sm_map(gpuvm, ops, priv,
+                  req_addr, req_range,
+                  req_obj, req_offset, req_gem_range);
+}
+EXPORT_SYMBOL_GPL(drm_gpuvm_sm_map_repeat);
+
 /**
  * drm_gpuvm_sm_unmap() - creates the &drm_gpuva_ops to split on unmap
  * @gpuvm: the &drm_gpuvm representing the GPU VA space
@@ -2536,7 +2571,7 @@ drm_gpuvm_sm_map_ops_create(struct drm_gpuvm *gpuvm,
 
     ret = __drm_gpuvm_sm_map(gpuvm, &gpuvm_list_ops, &args,
                  req_addr, req_range,
-                 req_obj, req_offset);
+                 req_obj, req_offset, 0);
     if (ret)
         goto err_free_ops;
 
diff --git a/include/drm/drm_gpuvm.h b/include/drm/drm_gpuvm.h
index 00d4e43b76b6..8157ede365d1 100644
--- a/include/drm/drm_gpuvm.h
+++ b/include/drm/drm_gpuvm.h
@@ -846,6 +846,14 @@ struct drm_gpuva_op_map {
          */
         u64 offset;
 
+        /**
+         * @gem.range: the range of the GEM to map
+         *
+         * If smaller than va.range, the GEM range should be mapped
+         * multiple times over the VA range.
+         */
+        u64 range;
+
         /**
          * @gem.obj: the &drm_gem_object to map
          */
@@ -1203,6 +1211,11 @@ int drm_gpuvm_sm_map(struct drm_gpuvm *gpuvm, void *priv,
              u64 addr, u64 range,
              struct drm_gem_object *obj, u64 offset);
 
+int drm_gpuvm_sm_map_repeat(struct drm_gpuvm *gpuvm, void *priv,
+                u64 addr, u64 range,
+                struct drm_gem_object *obj,
+                u64 offset, u64 gem_range);
+
 int drm_gpuvm_sm_unmap(struct drm_gpuvm *gpuvm, void *priv,
                u64 addr, u64 range);
Liviu Dudau Feb. 3, 2025, 11:23 a.m. UTC | #4
On Mon, Feb 03, 2025 at 10:21:53AM +0100, Boris Brezillon wrote:
> +Akash with whom we've been discussing adding a 'REPEAT' mode to
> drm_gpuvm/panthor.
> 
> On Sun, 2 Feb 2025 19:53:47 +0100
> Danilo Krummrich <dakr@kernel.org> wrote:
> 
> > Hi Lina,
> > 
> > On Sun, Feb 02, 2025 at 10:34:49PM +0900, Asahi Lina wrote:
> > > Some hardware requires dummy page mappings to efficiently implement
> > > Vulkan sparse features. These mappings consist of the same physical
> > > memory page, repeated for a large range of address space (e.g. 16GiB).
> > > 
> > > Add support for this to drm_gpuvm. Currently, drm_gpuvm expects BO
> > > ranges to correspond 1:1 to virtual memory ranges that are mapped, and
> > > does math on the BO offset accordingly. To make single page mappings
> > > work, we need a way to turn off that math, keeping the BO offset always
> > > constant and pointing to the same page (typically BO offset 0).
> > > 
> > > To make this work, we need to handle all the corner cases when these
> > > mappings intersect with regular mappings. The rules are simply to never
> > > mix or merge a "regular" mapping with a single page mapping.
> > > 
> > > drm_gpuvm has support for a flags field in drm_gpuva objects. This is
> > > normally managed by drivers directly. We can introduce a
> > > DRM_GPUVA_SINGLE_PAGE flag to handle this. However, to make it work,
> > > sm_map and friends need to know ahead of time whether the new mapping is
> > > a single page mapping or not. Therefore, we need to add an argument to
> > > these functions so drivers can provide the flags to be filled into
> > > drm_gpuva.flags.
> > > 
> > > These changes should not affect any existing drivers that use drm_gpuvm
> > > other than the API change:
> > > 
> > > - imagination: Does not use flags at all
> > > - nouveau: Only uses drm_gpuva_invalidate(), which is only called on
> > >   existing drm_gpuva objects (after the map steps)
> > > - panthor: Does not use flags at all
> > > - xe: Does not use drm_gpuva_init_from_op() or
> > >   drm_gpuva_remap()/drm_gpuva_map() (which call it). This means that the
> > > flags field of the gpuva object is managed by the driver only, so these
> > > changes cannot clobber it.
> > > 
> > > Note that the way this is implemented, drm_gpuvm does not need to know
> > > the GPU page size. It only has to never do math on the BO offset to meet
> > > the requirements.
> > > 
> > > I suspect that after this change there could be some cleanup possible in
> > > the xe driver (which right now passes flags around in various
> > > driver-specific ways from the map step through to drm_gpuva objects),
> > > but I'll leave that to the Xe folks.
> > > 
> > > Signed-off-by: Asahi Lina <lina@asahilina.net>
> > > ---
> > > Asahi Lina (4):
> > >       drm/gpuvm: Add a flags argument to drm_gpuvm_sm_map[_*]
> > >       drm/gpuvm: Plumb through flags into drm_gpuva_op_map
> > >       drm/gpuvm: Add DRM_GPUVA_SINGLE_PAGE flag and logic
> > >       drm/gpuvm: Plumb through flags into drm_gpuva_init  
> > 
> > Without looking into any details yet:
> > 
> > This is a bit of tricky one, since we're not even close to having a user for
> > this new feature upstream yet, are we?
> 
> Actually, we would be interesting in having this feature hooked up in
> panthor. One use case we have is Vulkan sparse bindings, of course. But
> we also have cases where we need to map a dummy page repeatedly on the
> FW side. The approach we've been considering is slightly different:
> pass a DRM_GPUVA_REPEAT_FLAG along with GEM range, so we can repeat a
> range of the GEM (see the below diff, which is completely untested by
> the way), but I think we'd be fine with this SINGLE_PAGE flag.

Unless I've misunderstood the intent completely, it looks like Xe also wants
something similar, although they call it CPU_ADDR_MIRROR for some reason:

https://lore.kernel.org/r/20250129195212.745731-9-matthew.brost@intel.com

Best regards,
Liviu

Boris Brezillon Feb. 3, 2025, 12:12 p.m. UTC | #5
On Mon, 3 Feb 2025 11:23:53 +0000
Liviu Dudau <liviu.dudau@arm.com> wrote:

> On Mon, Feb 03, 2025 at 10:21:53AM +0100, Boris Brezillon wrote:
> > +Akash with whom we've been discussing adding a 'REPEAT' mode to
> > drm_gpuvm/panthor.
> > 
> > On Sun, 2 Feb 2025 19:53:47 +0100
> > Danilo Krummrich <dakr@kernel.org> wrote:
> >   
> > > Hi Lina,
> > > 
> > > On Sun, Feb 02, 2025 at 10:34:49PM +0900, Asahi Lina wrote:  
> > > > Some hardware requires dummy page mappings to efficiently implement
> > > > Vulkan sparse features. These mappings consist of the same physical
> > > > memory page, repeated for a large range of address space (e.g. 16GiB).
> > > > 
> > > > Add support for this to drm_gpuvm. Currently, drm_gpuvm expects BO
> > > > ranges to correspond 1:1 to virtual memory ranges that are mapped, and
> > > > does math on the BO offset accordingly. To make single page mappings
> > > > work, we need a way to turn off that math, keeping the BO offset always
> > > > constant and pointing to the same page (typically BO offset 0).
> > > > 
> > > > To make this work, we need to handle all the corner cases when these
> > > > mappings intersect with regular mappings. The rules are simply to never
> > > > mix or merge a "regular" mapping with a single page mapping.
> > > > 
> > > > drm_gpuvm has support for a flags field in drm_gpuva objects. This is
> > > > normally managed by drivers directly. We can introduce a
> > > > DRM_GPUVA_SINGLE_PAGE flag to handle this. However, to make it work,
> > > > sm_map and friends need to know ahead of time whether the new mapping is
> > > > a single page mapping or not. Therefore, we need to add an argument to
> > > > these functions so drivers can provide the flags to be filled into
> > > > drm_gpuva.flags.
> > > > 
> > > > These changes should not affect any existing drivers that use drm_gpuvm
> > > > other than the API change:
> > > > 
> > > > - imagination: Does not use flags at all
> > > > - nouveau: Only uses drm_gpuva_invalidate(), which is only called on
> > > >   existing drm_gpuva objects (after the map steps)
> > > > - panthor: Does not use flags at all
> > > > - xe: Does not use drm_gpuva_init_from_op() or
> > > >   drm_gpuva_remap()/drm_gpuva_map() (which call it). This means that the
> > > > flags field of the gpuva object is managed by the driver only, so these
> > > > changes cannot clobber it.
> > > > 
> > > > Note that the way this is implemented, drm_gpuvm does not need to know
> > > > the GPU page size. It only has to never do math on the BO offset to meet
> > > > the requirements.
> > > > 
> > > > I suspect that after this change there could be some cleanup possible in
> > > > the xe driver (which right now passes flags around in various
> > > > driver-specific ways from the map step through to drm_gpuva objects),
> > > > but I'll leave that to the Xe folks.
> > > > 
> > > > Signed-off-by: Asahi Lina <lina@asahilina.net>
> > > > ---
> > > > Asahi Lina (4):
> > > >       drm/gpuvm: Add a flags argument to drm_gpuvm_sm_map[_*]
> > > >       drm/gpuvm: Plumb through flags into drm_gpuva_op_map
> > > >       drm/gpuvm: Add DRM_GPUVA_SINGLE_PAGE flag and logic
> > > >       drm/gpuvm: Plumb through flags into drm_gpuva_init    
> > > 
> > > Without looking into any details yet:
> > > 
> > > This is a bit of tricky one, since we're not even close to having a user for
> > > this new feature upstream yet, are we?  
> > 
> > Actually, we would be interesting in having this feature hooked up in
> > panthor. One use case we have is Vulkan sparse bindings, of course. But
> > we also have cases where we need to map a dummy page repeatedly on the
> > FW side. The approach we've been considering is slightly different:
> > pass a DRM_GPUVA_REPEAT_FLAG along with GEM range, so we can repeat a
> > range of the GEM (see the below diff, which is completely untested by
> > the way), but I think we'd be fine with this SINGLE_PAGE flag.  
> 
> Unless I've misunderstood the intent completely, it looks like Xe also wants
> something similar although they call it CPU_ADDR_MIRROR for some reason:
> 
> https://lore.kernel.org/r/20250129195212.745731-9-matthew.brost@intel.com

At first glance, it doesn't seem related. The Xe stuff looks more like
an alloc-on-fault mechanism. SINGLE_PAGE is about mapping a dummy page
repeatedly over a virtual address range so that sparse Vulkan images
never get a fault when an unbound image region is accessed.
Matthew Auld Feb. 3, 2025, 12:17 p.m. UTC | #6
On 03/02/2025 12:12, Boris Brezillon wrote:
> On Mon, 3 Feb 2025 11:23:53 +0000
> Liviu Dudau <liviu.dudau@arm.com> wrote:
> 
>> On Mon, Feb 03, 2025 at 10:21:53AM +0100, Boris Brezillon wrote:
>>> +Akash with whom we've been discussing adding a 'REPEAT' mode to
>>> drm_gpuvm/panthor.
>>>
>>> On Sun, 2 Feb 2025 19:53:47 +0100
>>> Danilo Krummrich <dakr@kernel.org> wrote:
>>>    
>>>> Hi Lina,
>>>>
>>>> On Sun, Feb 02, 2025 at 10:34:49PM +0900, Asahi Lina wrote:
>>>>> Some hardware requires dummy page mappings to efficiently implement
>>>>> Vulkan sparse features. These mappings consist of the same physical
>>>>> memory page, repeated for a large range of address space (e.g. 16GiB).
>>>>>
>>>>> Add support for this to drm_gpuvm. Currently, drm_gpuvm expects BO
>>>>> ranges to correspond 1:1 to virtual memory ranges that are mapped, and
>>>>> does math on the BO offset accordingly. To make single page mappings
>>>>> work, we need a way to turn off that math, keeping the BO offset always
>>>>> constant and pointing to the same page (typically BO offset 0).
>>>>>
>>>>> To make this work, we need to handle all the corner cases when these
>>>>> mappings intersect with regular mappings. The rules are simply to never
>>>>> mix or merge a "regular" mapping with a single page mapping.
>>>>>
>>>>> drm_gpuvm has support for a flags field in drm_gpuva objects. This is
>>>>> normally managed by drivers directly. We can introduce a
>>>>> DRM_GPUVA_SINGLE_PAGE flag to handle this. However, to make it work,
>>>>> sm_map and friends need to know ahead of time whether the new mapping is
>>>>> a single page mapping or not. Therefore, we need to add an argument to
>>>>> these functions so drivers can provide the flags to be filled into
>>>>> drm_gpuva.flags.
>>>>>
>>>>> These changes should not affect any existing drivers that use drm_gpuvm
>>>>> other than the API change:
>>>>>
>>>>> - imagination: Does not use flags at all
>>>>> - nouveau: Only uses drm_gpuva_invalidate(), which is only called on
>>>>>    existing drm_gpuva objects (after the map steps)
>>>>> - panthor: Does not use flags at all
>>>>> - xe: Does not use drm_gpuva_init_from_op() or
>>>>>    drm_gpuva_remap()/drm_gpuva_map() (which call it). This means that the
>>>>> flags field of the gpuva object is managed by the driver only, so these
>>>>> changes cannot clobber it.
>>>>>
>>>>> Note that the way this is implemented, drm_gpuvm does not need to know
>>>>> the GPU page size. It only has to never do math on the BO offset to meet
>>>>> the requirements.
>>>>>
>>>>> I suspect that after this change there could be some cleanup possible in
>>>>> the xe driver (which right now passes flags around in various
>>>>> driver-specific ways from the map step through to drm_gpuva objects),
>>>>> but I'll leave that to the Xe folks.
>>>>>
>>>>> Signed-off-by: Asahi Lina <lina@asahilina.net>
>>>>> ---
>>>>> Asahi Lina (4):
>>>>>        drm/gpuvm: Add a flags argument to drm_gpuvm_sm_map[_*]
>>>>>        drm/gpuvm: Plumb through flags into drm_gpuva_op_map
>>>>>        drm/gpuvm: Add DRM_GPUVA_SINGLE_PAGE flag and logic
>>>>>        drm/gpuvm: Plumb through flags into drm_gpuva_init
>>>>
>>>> Without looking into any details yet:
>>>>
>>>> This is a bit of tricky one, since we're not even close to having a user for
>>>> this new feature upstream yet, are we?
>>>
>>> Actually, we would be interesting in having this feature hooked up in
>>> panthor. One use case we have is Vulkan sparse bindings, of course. But
>>> we also have cases where we need to map a dummy page repeatedly on the
>>> FW side. The approach we've been considering is slightly different:
>>> pass a DRM_GPUVA_REPEAT_FLAG along with GEM range, so we can repeat a
>>> range of the GEM (see the below diff, which is completely untested by
>>> the way), but I think we'd be fine with this SINGLE_PAGE flag.
>>
>> Unless I've misunderstood the intent completely, it looks like Xe also wants
>> something similar although they call it CPU_ADDR_MIRROR for some reason:
>>
>> https://lore.kernel.org/r/20250129195212.745731-9-matthew.brost@intel.com
> 
> At first glance, it doesn't seem related. The Xe stuff looks more like
> an alloc-on-fault mechanism. SINGLE_PAGE is about mapping a dummy page
> repeatedly over a virtual address range so that sparse Vulkan images
> never get a fault when an unbound image region is accessed.

Yeah, the CPU_ADDR_MIRROR is for upcoming svm stuff and not related. I
believe that in xe the sparse/repeat functionality in this series would
instead be done using a special HW feature called "NULL page": a bit you
can set in the PTE (also usable with huge pages) that creates a special
mapping which doesn't actually point to any real memory underneath;
reads always return zeroes and any writes to it are dropped by the HW.
Asahi Lina Feb. 3, 2025, 1:46 p.m. UTC | #7
Hi,

On 2/3/25 6:21 PM, Boris Brezillon wrote:
> +Akash with whom we've been discussing adding a 'REPEAT' mode to
> drm_gpuvm/panthor.
> 
> On Sun, 2 Feb 2025 19:53:47 +0100
> Danilo Krummrich <dakr@kernel.org> wrote:
> 
>> Hi Lina,
>>
>> On Sun, Feb 02, 2025 at 10:34:49PM +0900, Asahi Lina wrote:
>>> Some hardware requires dummy page mappings to efficiently implement
>>> Vulkan sparse features. These mappings consist of the same physical
>>> memory page, repeated for a large range of address space (e.g. 16GiB).
>>>
>>> Add support for this to drm_gpuvm. Currently, drm_gpuvm expects BO
>>> ranges to correspond 1:1 to virtual memory ranges that are mapped, and
>>> does math on the BO offset accordingly. To make single page mappings
>>> work, we need a way to turn off that math, keeping the BO offset always
>>> constant and pointing to the same page (typically BO offset 0).
>>>
>>> To make this work, we need to handle all the corner cases when these
>>> mappings intersect with regular mappings. The rules are simply to never
>>> mix or merge a "regular" mapping with a single page mapping.
>>>
>>> drm_gpuvm has support for a flags field in drm_gpuva objects. This is
>>> normally managed by drivers directly. We can introduce a
>>> DRM_GPUVA_SINGLE_PAGE flag to handle this. However, to make it work,
>>> sm_map and friends need to know ahead of time whether the new mapping is
>>> a single page mapping or not. Therefore, we need to add an argument to
>>> these functions so drivers can provide the flags to be filled into
>>> drm_gpuva.flags.
>>>
>>> These changes should not affect any existing drivers that use drm_gpuvm
>>> other than the API change:
>>>
>>> - imagination: Does not use flags at all
>>> - nouveau: Only uses drm_gpuva_invalidate(), which is only called on
>>>   existing drm_gpuva objects (after the map steps)
>>> - panthor: Does not use flags at all
>>> - xe: Does not use drm_gpuva_init_from_op() or
>>>   drm_gpuva_remap()/drm_gpuva_map() (which call it). This means that the
>>> flags field of the gpuva object is managed by the driver only, so these
>>> changes cannot clobber it.
>>>
>>> Note that the way this is implemented, drm_gpuvm does not need to know
>>> the GPU page size. It only has to never do math on the BO offset to meet
>>> the requirements.
>>>
>>> I suspect that after this change there could be some cleanup possible in
>>> the xe driver (which right now passes flags around in various
>>> driver-specific ways from the map step through to drm_gpuva objects),
>>> but I'll leave that to the Xe folks.
>>>
>>> Signed-off-by: Asahi Lina <lina@asahilina.net>
>>> ---
>>> Asahi Lina (4):
>>>       drm/gpuvm: Add a flags argument to drm_gpuvm_sm_map[_*]
>>>       drm/gpuvm: Plumb through flags into drm_gpuva_op_map
>>>       drm/gpuvm: Add DRM_GPUVA_SINGLE_PAGE flag and logic
>>>       drm/gpuvm: Plumb through flags into drm_gpuva_init  
>>
>> Without looking into any details yet:
>>
>> This is a bit of tricky one, since we're not even close to having a user for
>> this new feature upstream yet, are we?
> 
> Actually, we would be interesting in having this feature hooked up in
> panthor. One use case we have is Vulkan sparse bindings, of course. But
> we also have cases where we need to map a dummy page repeatedly on the
> FW side. The approach we've been considering is slightly different:
> pass a DRM_GPUVA_REPEAT_FLAG along with GEM range, so we can repeat a
> range of the GEM (see the below diff, which is completely untested by
> the way), but I think we'd be fine with this SINGLE_PAGE flag.

That sounds similar, though your patch does not handle gpuva
splitting/remapping and all the other corner cases. I think you'll find
that once you handle those, the logic will become significantly more
complicated, since you need to start storing the start offset within the
repeat range on GPUVAs to be able to split them while keeping the
mappings identical, and do modular arithmetic to keep it all consistent
across all the corner cases.
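
(Purely as an illustration, with hypothetical per-GPUVA state that
drm_gpuvm does not track today: splitting a repeated mapping at
split_addr would need something like

    /* keep the resulting page-table contents identical across the split */
    new_phase = (old_phase + (split_addr - old_va_addr)) % repeat_range;

to carry the phase within the repeat range over to the new GPUVA.)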

If SINGLE_PAGE works for you then I would advocate for that. It keeps
complexity down to a minimum in drm_gpuvm. You can still have a range
that's greater than one page in practice, you'd just have to handle it
internally in the driver and pass the desired range out of band as a
flag or other field. For example, you could decide that the mapping is
always congruent to the VA (GEM page offset = start offset + VA % range)
and always treat SINGLE_PAGE mappings like that when you actually set up
the page tables, or pass in an extra offset to be able to shift the
phase of the mapping to whatever you want. You just need to ensure that,
if you mix range sizes or other configurations, you don't do that for
the same GEM BO at the same offset, so that the drm_gpuvm core does not
wrongly consider them equivalent.
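
A minimal sketch of that driver-internal convention (repeat_range here
is a hypothetical out-of-band value the driver tracks itself, not
something drm_gpuvm knows about):

    /* "Congruent to the VA": derive the BO page purely from the VA, so
     * any split of a SINGLE_PAGE gpuva still programs identical PTEs.
     */
    static u64 repeated_bo_offset(u64 bo_start_offset, u64 va_addr,
                                  u64 repeat_range)
    {
            return bo_start_offset + (va_addr % repeat_range);
    }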

Maybe I should rename SINGLE_PAGE to something else, since it isn't
technically limited to that as far as gpuvm is concerned. Something like
FIXED_OFFSET?


~~ Lina
Boris Brezillon Feb. 3, 2025, 5:13 p.m. UTC | #8
On Mon, 3 Feb 2025 22:46:15 +0900
Asahi Lina <lina@asahilina.net> wrote:

> Hi,
> 
> On 2/3/25 6:21 PM, Boris Brezillon wrote:
> > +Akash with whom we've been discussing adding a 'REPEAT' mode to
> > drm_gpuvm/panthor.
> > 
> > On Sun, 2 Feb 2025 19:53:47 +0100
> > Danilo Krummrich <dakr@kernel.org> wrote:
> >   
> >> Hi Lina,
> >>
> >> On Sun, Feb 02, 2025 at 10:34:49PM +0900, Asahi Lina wrote:  
> >>> Some hardware requires dummy page mappings to efficiently implement
> >>> Vulkan sparse features. These mappings consist of the same physical
> >>> memory page, repeated for a large range of address space (e.g. 16GiB).
> >>>
> >>> Add support for this to drm_gpuvm. Currently, drm_gpuvm expects BO
> >>> ranges to correspond 1:1 to virtual memory ranges that are mapped, and
> >>> does math on the BO offset accordingly. To make single page mappings
> >>> work, we need a way to turn off that math, keeping the BO offset always
> >>> constant and pointing to the same page (typically BO offset 0).
> >>>
> >>> To make this work, we need to handle all the corner cases when these
> >>> mappings intersect with regular mappings. The rules are simply to never
> >>> mix or merge a "regular" mapping with a single page mapping.
> >>>
> >>> drm_gpuvm has support for a flags field in drm_gpuva objects. This is
> >>> normally managed by drivers directly. We can introduce a
> >>> DRM_GPUVA_SINGLE_PAGE flag to handle this. However, to make it work,
> >>> sm_map and friends need to know ahead of time whether the new mapping is
> >>> a single page mapping or not. Therefore, we need to add an argument to
> >>> these functions so drivers can provide the flags to be filled into
> >>> drm_gpuva.flags.
> >>>
> >>> These changes should not affect any existing drivers that use drm_gpuvm
> >>> other than the API change:
> >>>
> >>> - imagination: Does not use flags at all
> >>> - nouveau: Only uses drm_gpuva_invalidate(), which is only called on
> >>>   existing drm_gpuva objects (after the map steps)
> >>> - panthor: Does not use flags at all
> >>> - xe: Does not use drm_gpuva_init_from_op() or
> >>>   drm_gpuva_remap()/drm_gpuva_map() (which call it). This means that the
> >>> flags field of the gpuva object is managed by the driver only, so these
> >>> changes cannot clobber it.
> >>>
> >>> Note that the way this is implemented, drm_gpuvm does not need to know
> >>> the GPU page size. It only has to never do math on the BO offset to meet
> >>> the requirements.
> >>>
> >>> I suspect that after this change there could be some cleanup possible in
> >>> the xe driver (which right now passes flags around in various
> >>> driver-specific ways from the map step through to drm_gpuva objects),
> >>> but I'll leave that to the Xe folks.
> >>>
> >>> Signed-off-by: Asahi Lina <lina@asahilina.net>
> >>> ---
> >>> Asahi Lina (4):
> >>>       drm/gpuvm: Add a flags argument to drm_gpuvm_sm_map[_*]
> >>>       drm/gpuvm: Plumb through flags into drm_gpuva_op_map
> >>>       drm/gpuvm: Add DRM_GPUVA_SINGLE_PAGE flag and logic
> >>>       drm/gpuvm: Plumb through flags into drm_gpuva_init    
> >>
> >> Without looking into any details yet:
> >>
> >> This is a bit of tricky one, since we're not even close to having a user for
> >> this new feature upstream yet, are we?  
> > 
> > Actually, we would be interesting in having this feature hooked up in
> > panthor. One use case we have is Vulkan sparse bindings, of course. But
> > we also have cases where we need to map a dummy page repeatedly on the
> > FW side. The approach we've been considering is slightly different:
> > pass a DRM_GPUVA_REPEAT_FLAG along with GEM range, so we can repeat a
> > range of the GEM (see the below diff, which is completely untested by
> > the way), but I think we'd be fine with this SINGLE_PAGE flag.  
> 
> That sounds similar, though your patch does not handle gpuva
> splitting/remapping and all the other corner cases.

Indeed, I didn't really consider that the remapping could be in the
middle of a repeated region, and I see how it complicates things.

> I think you'll find
> that once you handle those, the logic will become significantly more
> complicated, since you need to start storing the start offset within the
> repeat range on GPUVAs to be able to split them while keeping the
> mappings identical, and do modular arithmetic to keep it all consistent
> across all the corner cases.
> 
> If SINGLE_PAGE works for you then I would advocate for that.

I'm perfectly fine with that.

> It keeps
> complexity down to a minimum in drm_gpuvm. You can still have a range
> that's greater than one page in practice, you'd just have to handle it
> driver-internal and pass the desired range out of band as a flag or
> other field. For example, you could decide that the mapping is always
> congruent to the VA (GEM page offset = start offset + VA % range) and
> always treat SINGLE_PAGE mappings like that when you actually set up the
> page tables, or pass in an extra offset to be able to shift the phase of
> the mapping to whatever you want. You just need to ensure that, if you
> mix range sizes or other configuration, you don't do that for the same
> GEM BO at the same offset, so that the drm_gpuvm core does not wrongly
> consider them equivalent.
> 
> Maybe I should rename SINGLE_PAGE to something else, since it isn't
> technically limited to that as far as gpuvm is concerned. Something like
> FIXED_OFFSET?

FWIW, I think I prefer SINGLE_PAGE or REPEAT over FIXED_OFFSET. I mean,
the documentation should clear up any confusion, but I like it when names are
obvious enough that people can guess their purpose without having to go
read the doc, and I don't think FIXED_OFFSET is clear enough in this
regard.