Message ID | 20231009132204.15098-1-ville.syrjala@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Series | [1/4] drm/i915/dsb: Allocate command buffer from local memory |
> -----Original Message-----
> From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of Ville
> Syrjala
> Sent: Monday, October 9, 2023 6:52 PM
> To: intel-gfx@lists.freedesktop.org
> Subject: [Intel-gfx] [PATCH 1/4] drm/i915/dsb: Allocate command buffer from
> local memory
>
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
>
> Using system memory for the DSB command buffer doesn't appear to work.
> On DG2 it seems like the hardware internally replaces the actual memory reads
> with zeroes, and so we end up executing a bunch of NOOPs instead of whatever
> commands we put in the buffer. To determine that I measured the time it takes to
> execute the instructions, and the results are always more or less consistent with
> executing a buffer full of NOOPs from local memory.
>
> Another theory I considered was some kind of cache coherency issue.
> Looks like i915_gem_object_pin_map_unlocked() will in fact give you a WB
> mapping for system memory on DGFX regardless of what mapping mode was
> requested (WC in case of the DSB code). But clflush did not change the behaviour
> at all, so that theory seems moot.
>
> On DG1 it looks like the hardware might actually be fetching data from system
> memory as the logs indicate that we just get underruns. But that is equally bad, so
> doesn't look like we can really use system memory on DG1 either.
>
> Thus always allocate the DSB command buffer from local memory on discrete
> GPUs.
This seems fair to do,
Reviewed-by: Uma Shankar <uma.shankar@intel.com>

> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/display/intel_dsb.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/display/intel_dsb.c b/drivers/gpu/drm/i915/display/intel_dsb.c
> index 3e32aa49b8eb..7410ba3126f9 100644
> --- a/drivers/gpu/drm/i915/display/intel_dsb.c
> +++ b/drivers/gpu/drm/i915/display/intel_dsb.c
> @@ -5,6 +5,7 @@
>   */
>
>  #include "gem/i915_gem_internal.h"
> +#include "gem/i915_gem_lmem.h"
>
>  #include "i915_drv.h"
>  #include "i915_irq.h"
> @@ -461,7 +462,11 @@ struct intel_dsb *intel_dsb_prepare(const struct intel_crtc_state *crtc_state,
>  	/* ~1 qword per instruction, full cachelines */
>  	size = ALIGN(max_cmds * 8, CACHELINE_BYTES);
>
> -	obj = i915_gem_object_create_internal(i915, PAGE_ALIGN(size));
> +	if (HAS_LMEM(i915))
> +		obj = i915_gem_object_create_lmem(i915, PAGE_ALIGN(size),
> +						  I915_BO_ALLOC_CONTIGUOUS);
> +	else
> +		obj = i915_gem_object_create_internal(i915, PAGE_ALIGN(size));
>  	if (IS_ERR(obj))
>  		goto out_put_rpm;
>
> --
> 2.41.0
diff --git a/drivers/gpu/drm/i915/display/intel_dsb.c b/drivers/gpu/drm/i915/display/intel_dsb.c
index 3e32aa49b8eb..7410ba3126f9 100644
--- a/drivers/gpu/drm/i915/display/intel_dsb.c
+++ b/drivers/gpu/drm/i915/display/intel_dsb.c
@@ -5,6 +5,7 @@
  */
 
 #include "gem/i915_gem_internal.h"
+#include "gem/i915_gem_lmem.h"
 
 #include "i915_drv.h"
 #include "i915_irq.h"
@@ -461,7 +462,11 @@ struct intel_dsb *intel_dsb_prepare(const struct intel_crtc_state *crtc_state,
 	/* ~1 qword per instruction, full cachelines */
 	size = ALIGN(max_cmds * 8, CACHELINE_BYTES);
 
-	obj = i915_gem_object_create_internal(i915, PAGE_ALIGN(size));
+	if (HAS_LMEM(i915))
+		obj = i915_gem_object_create_lmem(i915, PAGE_ALIGN(size),
+						  I915_BO_ALLOC_CONTIGUOUS);
+	else
+		obj = i915_gem_object_create_internal(i915, PAGE_ALIGN(size));
 	if (IS_ERR(obj))
 		goto out_put_rpm;
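
For readers who want to see the sizing arithmetic and the region choice from the hunk above in isolation, here is a minimal userspace sketch. It is not kernel code and not part of the patch. The names align_up, dsb_alloc_size, dsb_pick_region and the SKETCH_* constants are hypothetical, and the 64-byte cacheline and 4096-byte page are assumed values chosen for illustration; the kernel's CACHELINE_BYTES and PAGE_SIZE may differ by configuration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed values for illustration only; the kernel defines its own. */
#define SKETCH_CACHELINE_BYTES 64u
#define SKETCH_PAGE_SIZE       4096u

/* Round x up to the next multiple of a; a must be a power of two. */
static size_t align_up(size_t x, size_t a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return (x + a - 1) & ~(a - 1);
}

/*
 * Follows the arithmetic in the hunk: ~1 qword (8 bytes) per
 * instruction, rounded up to full cachelines, then to whole pages.
 */
static size_t dsb_alloc_size(size_t max_cmds)
{
	size_t size = align_up(max_cmds * 8, SKETCH_CACHELINE_BYTES);

	return align_up(size, SKETCH_PAGE_SIZE);
}

/* Hypothetical stand-in for the two allocation paths in the patch. */
enum sketch_region { SKETCH_REGION_SYSTEM, SKETCH_REGION_LOCAL };

/* Follows the patch's branch: local memory whenever the device has it. */
static enum sketch_region dsb_pick_region(bool has_lmem)
{
	return has_lmem ? SKETCH_REGION_LOCAL : SKETCH_REGION_SYSTEM;
}
```

Under these assumed constants, a buffer sized for 128 commands needs 1024 bytes and is padded to one 4096-byte page, while 513 commands need 4104 bytes, round up to 4160 for cachelines, and end up as two pages.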