[4/4] drm/i915/uapi: reject set_domain for discrete

Message ID 20210715101536.2606307-5-matthew.auld@intel.com (mailing list archive)
State New, archived
Series Some DG1 uAPI cleanup

Commit Message

Matthew Auld July 15, 2021, 10:15 a.m. UTC
The CPU domain should be static for discrete, and on DG1 we don't need
any flushing since everything is already coherent, so really all this
does is an object wait, for which we have an ioctl. Longer term the
desired caching should be an immutable creation time property for the
BO, which can be set with something like gem_create_ext.

One other user is iris + userptr, which uses the set_domain to probe all
the pages to check if the GUP succeeds, however we now have a PROBE
flag for this purpose.

v2: add some more kernel doc, also add the implicit rules with caching

Suggested-by: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Ramalingam C <ramalingam.c@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_domain.c |  3 +++
 include/uapi/drm/i915_drm.h                | 19 +++++++++++++++++++
 2 files changed, 22 insertions(+)
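
The commit message argues that on discrete all set_domain really does is an object wait. From the userspace side, that suggests a fallback along the following lines. This is only a sketch under stated assumptions: struct fake_dev, fake_set_domain_cpu(), fake_wait() and bo_prepare_cpu_access() are hypothetical names invented for illustration, standing in for a DRM fd and the set_domain and wait ioctls, so that the logic can run without a GPU. It is not taken from iris or any other driver.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a DRM file descriptor, so the sketch runs
 * without a GPU. Real code would issue the set_domain and wait ioctls
 * through ioctl(2) instead.
 */
struct fake_dev {
	bool is_discrete;	/* models the IS_DGFX() check in the patch */
	unsigned int waits;	/* number of wait calls issued so far */
};

/* Models the patched ioctl: rejected with -ENODEV on discrete. */
static int fake_set_domain_cpu(struct fake_dev *dev, uint32_t handle)
{
	(void)handle;
	return dev->is_discrete ? -ENODEV : 0;
}

/* Models the object wait ioctl; it always succeeds in this sketch. */
static int fake_wait(struct fake_dev *dev, uint32_t handle)
{
	(void)handle;
	dev->waits++;
	return 0;
}

/*
 * Prepare a buffer object for CPU access. Try set_domain first; if it is
 * rejected with -ENODEV, fall back to a plain wait, since no flushing is
 * needed when everything is already coherent.
 */
static int bo_prepare_cpu_access(struct fake_dev *dev, uint32_t handle)
{
	int err;

	assert(dev);

	err = fake_set_domain_cpu(dev, handle);
	if (err != -ENODEV)
		return err;

	return fake_wait(dev, handle);
}
```

A real driver would more likely decide once at device init whether to use set_domain at all, rather than probing with it on every access; the per-call fallback above is just the shortest way to show the equivalence.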

Comments

Tvrtko Ursulin July 16, 2021, 2:52 p.m. UTC | #1
On 15/07/2021 11:15, Matthew Auld wrote:
> The CPU domain should be static for discrete, and on DG1 we don't need
> any flushing since everything is already coherent, so really all this
> does is an object wait, for which we have an ioctl. Longer term the
> desired caching should be an immutable creation time property for the
> BO, which can be set with something like gem_create_ext.
> 
> One other user is iris + userptr, which uses the set_domain to probe all
> the pages to check if the GUP succeeds, however we now have a PROBE
> flag for this purpose.
> 
> v2: add some more kernel doc, also add the implicit rules with caching
> 
> Suggested-by: Daniel Vetter <daniel@ffwll.ch>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> Cc: Jordan Justen <jordan.l.justen@intel.com>
> Cc: Kenneth Graunke <kenneth@whitecape.org>
> Cc: Jason Ekstrand <jason@jlekstrand.net>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Ramalingam C <ramalingam.c@intel.com>
> Reviewed-by: Ramalingam C <ramalingam.c@intel.com>
> ---
>   drivers/gpu/drm/i915/gem/i915_gem_domain.c |  3 +++
>   include/uapi/drm/i915_drm.h                | 19 +++++++++++++++++++
>   2 files changed, 22 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> index 43004bef55cb..b684a62bf3b0 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> @@ -490,6 +490,9 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data,
>   	u32 write_domain = args->write_domain;
>   	int err;
>   
> +	if (IS_DGFX(to_i915(dev)))
> +		return -ENODEV;
> +
>   	/* Only handle setting domains to types used by the CPU. */
>   	if ((write_domain | read_domains) & I915_GEM_GPU_DOMAINS)
>   		return -EINVAL;
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index 2e4112bf4d38..04ce310e7ee6 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -901,6 +901,25 @@ struct drm_i915_gem_mmap_offset {
>    *	- I915_GEM_DOMAIN_GTT: Mappable aperture domain
>    *
>    * All other domains are rejected.
> + *
> + * Note that for discrete, starting from DG1, this is no longer supported, and
> + * is instead rejected. On such platforms the CPU domain is effectively static,
> + * where we also only support a single &drm_i915_gem_mmap_offset cache mode,
> + * which can't be set explicitly and instead depends on the object placements,
> + * as per the below.
> + *
> + * Implicit caching rules, starting from DG1:
> + *
> + *	- If any of the object placements (see &drm_i915_gem_create_ext_memory_regions)
> + *	  contain I915_MEMORY_CLASS_DEVICE then the object will be allocated and
> + *	  mapped as write-combined only.

A note about write-combine buffer? I guess saying it is userspace 
responsibility to do it and how.

> + *
> + *	- Everything else is always allocated and mapped as write-back, with the
> + *	  guarantee that everything is also coherent with the GPU.

Haven't been following this so just a question on this one - it is not 
considered interesting to offer non-coherent modes, or even write 
combine, with system memory buffers, for a specific reason?

Regards,

Tvrtko

> + *
> + * Note that this is likely to change in the future again, where we might need
> + * more flexibility on future devices, so making this all explicit as part of a
> + * new &drm_i915_gem_create_ext extension is probable.
>    */
>   struct drm_i915_gem_set_domain {
>   	/** @handle: Handle for the object. */
>
Jason Ekstrand July 16, 2021, 3:23 p.m. UTC | #2
On Fri, Jul 16, 2021 at 9:52 AM Tvrtko Ursulin
<tvrtko.ursulin@linux.intel.com> wrote:
>
>
> On 15/07/2021 11:15, Matthew Auld wrote:
> > The CPU domain should be static for discrete, and on DG1 we don't need
> > any flushing since everything is already coherent, so really all this
> > does is an object wait, for which we have an ioctl. Longer term the
> > desired caching should be an immutable creation time property for the
> > BO, which can be set with something like gem_create_ext.
> >
> > One other user is iris + userptr, which uses the set_domain to probe all
> > the pages to check if the GUP succeeds, however we now have a PROBE
> > flag for this purpose.
> >
> > v2: add some more kernel doc, also add the implicit rules with caching
> >
> > Suggested-by: Daniel Vetter <daniel@ffwll.ch>
> > Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> > Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> > Cc: Jordan Justen <jordan.l.justen@intel.com>
> > Cc: Kenneth Graunke <kenneth@whitecape.org>
> > Cc: Jason Ekstrand <jason@jlekstrand.net>
> > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > Cc: Ramalingam C <ramalingam.c@intel.com>
> > Reviewed-by: Ramalingam C <ramalingam.c@intel.com>
> > ---
> >   drivers/gpu/drm/i915/gem/i915_gem_domain.c |  3 +++
> >   include/uapi/drm/i915_drm.h                | 19 +++++++++++++++++++
> >   2 files changed, 22 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> > index 43004bef55cb..b684a62bf3b0 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> > @@ -490,6 +490,9 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data,
> >       u32 write_domain = args->write_domain;
> >       int err;
> >
> > +     if (IS_DGFX(to_i915(dev)))
> > +             return -ENODEV;
> > +
> >       /* Only handle setting domains to types used by the CPU. */
> >       if ((write_domain | read_domains) & I915_GEM_GPU_DOMAINS)
> >               return -EINVAL;
> > diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> > index 2e4112bf4d38..04ce310e7ee6 100644
> > --- a/include/uapi/drm/i915_drm.h
> > +++ b/include/uapi/drm/i915_drm.h
> > @@ -901,6 +901,25 @@ struct drm_i915_gem_mmap_offset {
> >    *  - I915_GEM_DOMAIN_GTT: Mappable aperture domain
> >    *
> >    * All other domains are rejected.
> > + *
> > + * Note that for discrete, starting from DG1, this is no longer supported, and
> > + * is instead rejected. On such platforms the CPU domain is effectively static,
> > + * where we also only support a single &drm_i915_gem_mmap_offset cache mode,
> > + * which can't be set explicitly and instead depends on the object placements,
> > + * as per the below.
> > + *
> > + * Implicit caching rules, starting from DG1:
> > + *
> > + *   - If any of the object placements (see &drm_i915_gem_create_ext_memory_regions)
> > + *     contain I915_MEMORY_CLASS_DEVICE then the object will be allocated and
> > + *     mapped as write-combined only.

Is this accurate?  I thought they got WB when living in SMEM and WC
when on the device.  But, since both are coherent, it's safe to lie to
userspace and say it's all WC.  Is that correct or am I missing
something?

> A note about write-combine buffer? I guess saying it is userspace
> responsibility to do it and how.

What exactly are you thinking is userspace's responsibility?

> > + *
> > + *   - Everything else is always allocated and mapped as write-back, with the
> > + *     guarantee that everything is also coherent with the GPU.
>
> Haven't been following this so just a question on this one - it is not
> considered interesting to offer non-coherent modes, or even write
> combine, with system memory buffers, for a specific reason?

We only care about non-coherent modes on integrated little-core.
There, we share memory between CPU and GPU but snooping from the GPU
is optional.  Depending on access patterns, we might want WB with GPU
snooping or we might want WC.  I don't think we care about WC for SMEM
allocations on discrete.  For that matter, I'm not sure you can
actually shut snooping off when going across a "real" PCIe bus.  At
least not with DG1.

--Jason

> Regards,
>
> Tvrtko
>
> > + *
> > + * Note that this is likely to change in the future again, where we might need
> > + * more flexibility on future devices, so making this all explicit as part of a
> > + * new &drm_i915_gem_create_ext extension is probable.
> >    */
> >   struct drm_i915_gem_set_domain {
> >       /** @handle: Handle for the object. */
> >
Tvrtko Ursulin July 19, 2021, 9 a.m. UTC | #3
On 16/07/2021 16:23, Jason Ekstrand wrote:
> On Fri, Jul 16, 2021 at 9:52 AM Tvrtko Ursulin
> <tvrtko.ursulin@linux.intel.com> wrote:
>>
>>
>> On 15/07/2021 11:15, Matthew Auld wrote:
>>> The CPU domain should be static for discrete, and on DG1 we don't need
>>> any flushing since everything is already coherent, so really all this
>>> does is an object wait, for which we have an ioctl. Longer term the
>>> desired caching should be an immutable creation time property for the
>>> BO, which can be set with something like gem_create_ext.
>>>
>>> One other user is iris + userptr, which uses the set_domain to probe all
>>> the pages to check if the GUP succeeds, however we now have a PROBE
>>> flag for this purpose.
>>>
>>> v2: add some more kernel doc, also add the implicit rules with caching
>>>
>>> Suggested-by: Daniel Vetter <daniel@ffwll.ch>
>>> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
>>> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
>>> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
>>> Cc: Jordan Justen <jordan.l.justen@intel.com>
>>> Cc: Kenneth Graunke <kenneth@whitecape.org>
>>> Cc: Jason Ekstrand <jason@jlekstrand.net>
>>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>>> Cc: Ramalingam C <ramalingam.c@intel.com>
>>> Reviewed-by: Ramalingam C <ramalingam.c@intel.com>
>>> ---
>>>    drivers/gpu/drm/i915/gem/i915_gem_domain.c |  3 +++
>>>    include/uapi/drm/i915_drm.h                | 19 +++++++++++++++++++
>>>    2 files changed, 22 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
>>> index 43004bef55cb..b684a62bf3b0 100644
>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
>>> @@ -490,6 +490,9 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data,
>>>        u32 write_domain = args->write_domain;
>>>        int err;
>>>
>>> +     if (IS_DGFX(to_i915(dev)))
>>> +             return -ENODEV;
>>> +
>>>        /* Only handle setting domains to types used by the CPU. */
>>>        if ((write_domain | read_domains) & I915_GEM_GPU_DOMAINS)
>>>                return -EINVAL;
>>> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
>>> index 2e4112bf4d38..04ce310e7ee6 100644
>>> --- a/include/uapi/drm/i915_drm.h
>>> +++ b/include/uapi/drm/i915_drm.h
>>> @@ -901,6 +901,25 @@ struct drm_i915_gem_mmap_offset {
>>>     *  - I915_GEM_DOMAIN_GTT: Mappable aperture domain
>>>     *
>>>     * All other domains are rejected.
>>> + *
>>> + * Note that for discrete, starting from DG1, this is no longer supported, and
>>> + * is instead rejected. On such platforms the CPU domain is effectively static,
>>> + * where we also only support a single &drm_i915_gem_mmap_offset cache mode,
>>> + * which can't be set explicitly and instead depends on the object placements,
>>> + * as per the below.
>>> + *
>>> + * Implicit caching rules, starting from DG1:
>>> + *
>>> + *   - If any of the object placements (see &drm_i915_gem_create_ext_memory_regions)
>>> + *     contain I915_MEMORY_CLASS_DEVICE then the object will be allocated and
>>> + *     mapped as write-combined only.
> 
> Is this accurate?  I thought they got WB when living in SMEM and WC
> when on the device.  But, since both are coherent, it's safe to lie to
> userspace and say it's all WC.  Is that correct or am I missing
> something?
> 
>> A note about write-combine buffer? I guess saying it is userspace
>> responsibility to do it and how.
> 
> What exactly are you thinking is userspace's responsibility?

Flushing of the write combine buffer.

> 
>>> + *
>>> + *   - Everything else is always allocated and mapped as write-back, with the
>>> + *     guarantee that everything is also coherent with the GPU.
>>
>> Haven't been following this so just a question on this one - it is not
>> considered interesting to offer non-coherent modes, or even write
>> combine, with system memory buffers, for a specific reason?
> 
> We only care about non-coherent modes on integrated little-core.
> There, we share memory between CPU and GPU but snooping from the GPU
> is optional.  Depending on access patterns, we might want WB with GPU
> snooping or we might want WC.  I don't think we care about WC for SMEM
> allocations on discrete.  For that matter, I'm not sure you can
> actually shut snooping off when going across a "real" PCIe bus.  At
> least not with DG1.

But writes to system memory buffers aren't going over the PCIe bus?!

Anyways, I am not claiming it is an interesting use case, just wondering 
about the reasoning for making the modes fixed.

Regards,

Tvrtko

> 
> --Jason
> 
>> Regards,
>>
>> Tvrtko
>>
>>> + *
>>> + * Note that this is likely to change in the future again, where we might need
>>> + * more flexibility on future devices, so making this all explicit as part of a
>>> + * new &drm_i915_gem_create_ext extension is probable.
>>>     */
>>>    struct drm_i915_gem_set_domain {
>>>        /** @handle: Handle for the object. */
>>>
Matthew Auld July 19, 2021, 9:09 a.m. UTC | #4
On Fri, 16 Jul 2021 at 16:23, Jason Ekstrand <jason@jlekstrand.net> wrote:
>
> On Fri, Jul 16, 2021 at 9:52 AM Tvrtko Ursulin
> <tvrtko.ursulin@linux.intel.com> wrote:
> >
> >
> > On 15/07/2021 11:15, Matthew Auld wrote:
> > > The CPU domain should be static for discrete, and on DG1 we don't need
> > > any flushing since everything is already coherent, so really all this
> > > does is an object wait, for which we have an ioctl. Longer term the
> > > desired caching should be an immutable creation time property for the
> > > BO, which can be set with something like gem_create_ext.
> > >
> > > One other user is iris + userptr, which uses the set_domain to probe all
> > > the pages to check if the GUP succeeds, however we now have a PROBE
> > > flag for this purpose.
> > >
> > > v2: add some more kernel doc, also add the implicit rules with caching
> > >
> > > Suggested-by: Daniel Vetter <daniel@ffwll.ch>
> > > Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> > > Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> > > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> > > Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> > > Cc: Jordan Justen <jordan.l.justen@intel.com>
> > > Cc: Kenneth Graunke <kenneth@whitecape.org>
> > > Cc: Jason Ekstrand <jason@jlekstrand.net>
> > > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > > Cc: Ramalingam C <ramalingam.c@intel.com>
> > > Reviewed-by: Ramalingam C <ramalingam.c@intel.com>
> > > ---
> > >   drivers/gpu/drm/i915/gem/i915_gem_domain.c |  3 +++
> > >   include/uapi/drm/i915_drm.h                | 19 +++++++++++++++++++
> > >   2 files changed, 22 insertions(+)
> > >
> > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> > > index 43004bef55cb..b684a62bf3b0 100644
> > > --- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> > > @@ -490,6 +490,9 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data,
> > >       u32 write_domain = args->write_domain;
> > >       int err;
> > >
> > > +     if (IS_DGFX(to_i915(dev)))
> > > +             return -ENODEV;
> > > +
> > >       /* Only handle setting domains to types used by the CPU. */
> > >       if ((write_domain | read_domains) & I915_GEM_GPU_DOMAINS)
> > >               return -EINVAL;
> > > diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> > > index 2e4112bf4d38..04ce310e7ee6 100644
> > > --- a/include/uapi/drm/i915_drm.h
> > > +++ b/include/uapi/drm/i915_drm.h
> > > @@ -901,6 +901,25 @@ struct drm_i915_gem_mmap_offset {
> > >    *  - I915_GEM_DOMAIN_GTT: Mappable aperture domain
> > >    *
> > >    * All other domains are rejected.
> > > + *
> > > + * Note that for discrete, starting from DG1, this is no longer supported, and
> > > + * is instead rejected. On such platforms the CPU domain is effectively static,
> > > + * where we also only support a single &drm_i915_gem_mmap_offset cache mode,
> > > + * which can't be set explicitly and instead depends on the object placements,
> > > + * as per the below.
> > > + *
> > > + * Implicit caching rules, starting from DG1:
> > > + *
> > > + *   - If any of the object placements (see &drm_i915_gem_create_ext_memory_regions)
> > > + *     contain I915_MEMORY_CLASS_DEVICE then the object will be allocated and
> > > + *     mapped as write-combined only.
>
> Is this accurate?  I thought they got WB when living in SMEM and WC
> when on the device.  But, since both are coherent, it's safe to lie to
> userspace and say it's all WC.  Is that correct or am I missing
> something?

Yes, it's accurate, it will be allocated and mapped as WC. I think we
can just make select_tt_caching always return cached if we want, and
it looks like ttm seems to be fine with having different caching
values for the tt vs io resource. Daniel, should we adjust this?

>
> > A note about write-combine buffer? I guess saying it is userspace
> > responsibility to do it and how.
>
> What exactly are you thinking is userspace's responsibility?
>
> > > + *
> > > + *   - Everything else is always allocated and mapped as write-back, with the
> > > + *     guarantee that everything is also coherent with the GPU.
> >
> > Haven't been following this so just a question on this one - it is not
> > considered interesting to offer non-coherent modes, or even write
> > combine, with system memory buffers, for a specific reason?
>
> We only care about non-coherent modes on integrated little-core.
> There, we share memory between CPU and GPU but snooping from the GPU
> is optional.  Depending on access patterns, we might want WB with GPU
> snooping or we might want WC.  I don't think we care about WC for SMEM
> allocations on discrete.  For that matter, I'm not sure you can
> actually shut snooping off when going across a "real" PCIe bus.  At
> least not with DG1.
>
> --Jason
>
> > Regards,
> >
> > Tvrtko
> >
> > > + *
> > > + * Note that this is likely to change in the future again, where we might need
> > > + * more flexibility on future devices, so making this all explicit as part of a
> > > + * new &drm_i915_gem_create_ext extension is probable.
> > >    */
> > >   struct drm_i915_gem_set_domain {
> > >       /** @handle: Handle for the object. */
> > >
Jason Ekstrand July 19, 2021, 7:57 p.m. UTC | #5
On Mon, Jul 19, 2021 at 4:10 AM Matthew Auld
<matthew.william.auld@gmail.com> wrote:
>
> On Fri, 16 Jul 2021 at 16:23, Jason Ekstrand <jason@jlekstrand.net> wrote:
> >
> > On Fri, Jul 16, 2021 at 9:52 AM Tvrtko Ursulin
> > <tvrtko.ursulin@linux.intel.com> wrote:
> > >
> > >
> > > On 15/07/2021 11:15, Matthew Auld wrote:
> > > > The CPU domain should be static for discrete, and on DG1 we don't need
> > > > any flushing since everything is already coherent, so really all this
> > > > does is an object wait, for which we have an ioctl. Longer term the
> > > > desired caching should be an immutable creation time property for the
> > > > BO, which can be set with something like gem_create_ext.
> > > >
> > > > One other user is iris + userptr, which uses the set_domain to probe all
> > > > the pages to check if the GUP succeeds, however we now have a PROBE
> > > > flag for this purpose.
> > > >
> > > > v2: add some more kernel doc, also add the implicit rules with caching
> > > >
> > > > Suggested-by: Daniel Vetter <daniel@ffwll.ch>
> > > > Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> > > > Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> > > > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> > > > Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> > > > Cc: Jordan Justen <jordan.l.justen@intel.com>
> > > > Cc: Kenneth Graunke <kenneth@whitecape.org>
> > > > Cc: Jason Ekstrand <jason@jlekstrand.net>
> > > > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > > > Cc: Ramalingam C <ramalingam.c@intel.com>
> > > > Reviewed-by: Ramalingam C <ramalingam.c@intel.com>
> > > > ---
> > > >   drivers/gpu/drm/i915/gem/i915_gem_domain.c |  3 +++
> > > >   include/uapi/drm/i915_drm.h                | 19 +++++++++++++++++++
> > > >   2 files changed, 22 insertions(+)
> > > >
> > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> > > > index 43004bef55cb..b684a62bf3b0 100644
> > > > --- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> > > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
> > > > @@ -490,6 +490,9 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data,
> > > >       u32 write_domain = args->write_domain;
> > > >       int err;
> > > >
> > > > +     if (IS_DGFX(to_i915(dev)))
> > > > +             return -ENODEV;
> > > > +
> > > >       /* Only handle setting domains to types used by the CPU. */
> > > >       if ((write_domain | read_domains) & I915_GEM_GPU_DOMAINS)
> > > >               return -EINVAL;
> > > > diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> > > > index 2e4112bf4d38..04ce310e7ee6 100644
> > > > --- a/include/uapi/drm/i915_drm.h
> > > > +++ b/include/uapi/drm/i915_drm.h
> > > > @@ -901,6 +901,25 @@ struct drm_i915_gem_mmap_offset {
> > > >    *  - I915_GEM_DOMAIN_GTT: Mappable aperture domain
> > > >    *
> > > >    * All other domains are rejected.
> > > > + *
> > > > + * Note that for discrete, starting from DG1, this is no longer supported, and
> > > > + * is instead rejected. On such platforms the CPU domain is effectively static,
> > > > + * where we also only support a single &drm_i915_gem_mmap_offset cache mode,
> > > > + * which can't be set explicitly and instead depends on the object placements,
> > > > + * as per the below.
> > > > + *
> > > > + * Implicit caching rules, starting from DG1:
> > > > + *
> > > > + *   - If any of the object placements (see &drm_i915_gem_create_ext_memory_regions)
> > > > + *     contain I915_MEMORY_CLASS_DEVICE then the object will be allocated and
> > > > + *     mapped as write-combined only.
> >
> > Is this accurate?  I thought they got WB when living in SMEM and WC
> > when on the device.  But, since both are coherent, it's safe to lie to
> > userspace and say it's all WC.  Is that correct or am I missing
> > something?
>
> Yes, it's accurate, it will be allocated and mapped as WC. I think we
> can just make select_tt_caching always return cached if we want, and
> it looks like ttm seems to be fine with having different caching
> values for the tt vs io resource. Daniel, should we adjust this?

Mildly related, we had an issue some time back with i915+amdgpu where
we were choosing different caching settings for SMEM shared BOs and
the fallout was that we had all sorts of caching trouble when running
an integrated+discrete setup with them.  I don't remember how all that
shook out but we should think about it here.  Why is this important?
Because our mmap caching settings are going to be related to our
snooping settings for GPU access across the PCIe bar to SMEM.  If
we're WC then we can avoid snooping but if we're WB then we need
snooping enabled.  WC+snooping might work but I'm not sure off-hand.

--Jason

> >
> > > A note about write-combine buffer? I guess saying it is userspace
> > > responsibility to do it and how.
> >
> > What exactly are you thinking is userspace's responsibility?
> >
> > > > + *
> > > > + *   - Everything else is always allocated and mapped as write-back, with the
> > > > + *     guarantee that everything is also coherent with the GPU.
> > >
> > > Haven't been following this so just a question on this one - it is not
> > > considered interesting to offer non-coherent modes, or even write
> > > combine, with system memory buffers, for a specific reason?
> >
> > We only care about non-coherent modes on integrated little-core.
> > There, we share memory between CPU and GPU but snooping from the GPU
> > is optional.  Depending on access patterns, we might want WB with GPU
> > snooping or we might want WC.  I don't think we care about WC for SMEM
> > allocations on discrete.  For that matter, I'm not sure you can
> > actually shut snooping off when going across a "real" PCIe bus.  At
> > least not with DG1.
> >
> > --Jason
> >
> > > Regards,
> > >
> > > Tvrtko
> > >
> > > > + *
> > > > + * Note that this is likely to change in the future again, where we might need
> > > > + * more flexibility on future devices, so making this all explicit as part of a
> > > > + * new &drm_i915_gem_create_ext extension is probable.
> > > >    */
> > > >   struct drm_i915_gem_set_domain {
> > > >       /** @handle: Handle for the object. */
> > > >
Patch

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
index 43004bef55cb..b684a62bf3b0 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
@@ -490,6 +490,9 @@  i915_gem_set_domain_ioctl(struct drm_device *dev, void *data,
 	u32 write_domain = args->write_domain;
 	int err;
 
+	if (IS_DGFX(to_i915(dev)))
+		return -ENODEV;
+
 	/* Only handle setting domains to types used by the CPU. */
 	if ((write_domain | read_domains) & I915_GEM_GPU_DOMAINS)
 		return -EINVAL;
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 2e4112bf4d38..04ce310e7ee6 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -901,6 +901,25 @@  struct drm_i915_gem_mmap_offset {
  *	- I915_GEM_DOMAIN_GTT: Mappable aperture domain
  *
  * All other domains are rejected.
+ *
+ * Note that for discrete, starting from DG1, this is no longer supported, and
+ * is instead rejected. On such platforms the CPU domain is effectively static,
+ * where we also only support a single &drm_i915_gem_mmap_offset cache mode,
+ * which can't be set explicitly and instead depends on the object placements,
+ * as per the below.
+ *
+ * Implicit caching rules, starting from DG1:
+ *
+ *	- If any of the object placements (see &drm_i915_gem_create_ext_memory_regions)
+ *	  contain I915_MEMORY_CLASS_DEVICE then the object will be allocated and
+ *	  mapped as write-combined only.
+ *
+ *	- Everything else is always allocated and mapped as write-back, with the
+ *	  guarantee that everything is also coherent with the GPU.
+ *
+ * Note that this is likely to change in the future again, where we might need
+ * more flexibility on future devices, so making this all explicit as part of a
+ * new &drm_i915_gem_create_ext extension is probable.
  */
 struct drm_i915_gem_set_domain {
 	/** @handle: Handle for the object. */
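
The implicit caching rules in the kernel-doc above can be restated as a small decision function. The sketch below is a model only: enum mem_class, enum cache_mode and implicit_cache_mode() are hypothetical local names, not the uAPI definitions, and treating an empty placement list as "everything else" is an assumption made here for illustration.

```c
#include <stddef.h>

/* Local, hypothetical mirrors of the two memory classes named in the doc. */
enum mem_class {
	MEM_CLASS_SYSTEM,
	MEM_CLASS_DEVICE,
};

enum cache_mode {
	CACHE_MODE_WB,	/* write-back, coherent with the GPU */
	CACHE_MODE_WC,	/* write-combined */
};

/*
 * Rule 1: if any placement is device memory, the object is write-combined.
 * Rule 2: everything else is write-back.
 */
static enum cache_mode
implicit_cache_mode(const enum mem_class *placements, size_t count)
{
	size_t i;

	for (i = 0; i < count; i++) {
		if (placements[i] == MEM_CLASS_DEVICE)
			return CACHE_MODE_WC;
	}

	return CACHE_MODE_WB;
}
```

Note that under these rules an object with placements { system, device } is write-combined even while it currently resides in system memory, which is the case Jason asks about in the thread.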