Message ID | 1403906985-11045-1-git-send-email-benjamin.widawsky@intel.com (mailing list archive)
---|---
State | New, archived
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

On Fri, Jun 27, 2014 at 3:09 PM, Ben Widawsky <benjamin.widawsky@intel.com> wrote:
> There are some cases in the code where we need to know how many rings to
> iterate over, but cannot use for_each_ring(). These are always error cases
> which happen either before ring setup, or after ring teardown (or reset).
>
> Note, a NUM_RINGS issue exists in semaphores, but this is fixed by the
> remaining semaphore patches which Rodrigo will resubmit shortly. I'd
> rather see those patches for fixing the problem than fix it here.
>
> I found this initially for the BSD2 case where on the same platform we
> can have differing rings. AFAICT however this affects many platforms.
>
> I'd CC stable on this, except I think all the issues have been around
> for multiple releases without bug reports.
>
> Compile tested only for now.
>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> ---
>  drivers/gpu/drm/i915/i915_gem_context.c | 6 +++---
>  drivers/gpu/drm/i915/i915_gpu_error.c   | 2 +-
>  drivers/gpu/drm/i915/intel_ringbuffer.h | 2 ++
>  3 files changed, 6 insertions(+), 4 deletions(-)
>
> [diff snipped; the full patch appears at the bottom of this page]
On Fri, Jun 27, 2014 at 03:21:20PM -0700, Rodrigo Vivi wrote:
> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

I have a couple of spots that I think are important to add, all in error
state. I'll repost v2 after I can actually test it.

[snip]
On Fri, Jun 27, 2014 at 03:21:20PM -0700, Rodrigo Vivi wrote:
> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
>
> On Fri, Jun 27, 2014 at 3:09 PM, Ben Widawsky
> <benjamin.widawsky@intel.com> wrote:
>
> [snip]
>
> > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h
> > b/drivers/gpu/drm/i915/intel_ringbuffer.h
> > index e72017b..67e2919 100644
> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> > @@ -90,6 +90,8 @@ struct intel_engine_cs {
> >  	} id;
> >  #define I915_NUM_RINGS 5
> >  #define LAST_USER_RING (VECS + 1)
> > +#define I915_ACTIVE_RINGS(dev) hweight8(INTEL_INFO(dev)->ring_mask)

What does the popcount of the mask have to do with the validity of the
arrays being iterated over in this patch?
-Chris
On Sat, Jun 28, 2014 at 07:20:38AM +0100, Chris Wilson wrote:
> [snip]
>
> > > +#define I915_ACTIVE_RINGS(dev) hweight8(INTEL_INFO(dev)->ring_mask)
>
> What does the popcount of the mask have to do with the validity of the
> arrays being iterated over in this patch?
> -Chris

The popcount of the mask represents the number of rings available on the
specific SKU, as opposed to the total number of rings on any SKU ever.
It is not always correct to iterate on all rings in the system. Please
note, the patch is incomplete. I have a couple of other, perhaps more
interesting, cases which I've missed.
On Sat, Jun 28, 2014 at 08:26:15AM -0700, Ben Widawsky wrote:
> On Sat, Jun 28, 2014 at 07:20:38AM +0100, Chris Wilson wrote:
> > > > @@ -848,7 +848,7 @@ static uint32_t i915_error_generate_code(struct drm_i915_private *dev_priv,
> > > > -	for (i = 0; i < I915_NUM_RINGS; i++) {
> > > > +	for (i = 0; i < I915_ACTIVE_RINGS(dev_priv->dev); i++) {
> > > >  		if (error->ring[i].hangcheck_action == HANGCHECK_HUNG) {
> >
> > [snip]
>
> The popcount of the mask represents the number of rings available on the
> specific SKU, as opposed to the total number of rings on any SKU ever.
> It is not always correct to iterate on all rings in the system. Please
> note, the patch is incomplete. I have a couple of other, perhaps more
> interesting, cases which I've missed.

You still iterate over holes in the ring mask, and the iteration here is
over a completely different array, not rings.
-Chris
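Chris's objection becomes concrete with a gappy mask. A sketch under the same assumed bit layout as above (RCS=0 through VCS2=4); the mask value is hypothetical:

```c
#include <stdio.h>

int main(void)
{
	/* hypothetical future SKU: render (bit 0) and a second BSD ring
	 * (bit 4) only -- bits 1..3 are holes */
	unsigned int mask = (1 << 0) | (1 << 4);
	int active = __builtin_popcount(mask); /* = 2 */

	/* the patch walks indices 0 .. active-1, i.e. 0 and 1: it touches
	 * index 1 (not in the mask) and never reaches index 4 (which is) */
	for (int i = 0; i < active; i++)
		printf("patch visits index %d\n", i); /* 0, 1 */

	/* a mask-aware walk visits exactly the rings that exist */
	for (int i = 0; i < 5 /* I915_NUM_RINGS */; i++)
		if (mask & (1 << i))
			printf("mask-aware walk visits index %d\n", i); /* 0, 4 */
	return 0;
}
```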
On Sat, Jun 28, 2014 at 08:28:55PM +0100, Chris Wilson wrote:
> [snip]
>
> You still iterate over holes in the ring mask, and the iteration here is
> over a completely different array, not rings.
> -Chris

For the holes, I mentioned that in the commit message of the yet to be
submitted v2; it's not really an issue in the way things are today.
When/if we add a new ring, it will be.

What you're asking for has already been submitted multiple times with
seemingly no traction. I do realize the fixes (with my v2) are due to bugs
introduced in patches I've not yet submitted, so I think for that reason,
it's fair to drop this patch.

I'd rather the other patch get in (for_each_active_ring), but it's tied up
with execlists atm, and I continue to think this is a useful way to iterate
over the rings in error conditions and during reset.

As for your second point, assuming it's the code above, I don't quite
follow what you mean. Error code generation shouldn't be based upon
inactive rings. As for whether it changes any of the functionality, it
does not - but that wasn't the point of that hunk.
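The for_each_active_ring() Ben refers to never appears in this thread; the following mock only illustrates the shape such an iterator could take — every type and name in it is invented for illustration:

```c
#include <stdio.h>

#define I915_NUM_RINGS 5

/* invented stand-ins for the driver structures */
struct mock_engine { int dummy; };
struct mock_dev_priv {
	unsigned int ring_mask;
	struct mock_engine ring[I915_NUM_RINGS];
};

/* hypothetical iterator: test the mask bit at each index instead of
 * popcounting, so holes are skipped and high bits are still reached */
#define for_each_active_ring(ring__, dev_priv__, i__) \
	for ((i__) = 0; (i__) < I915_NUM_RINGS; (i__)++) \
		if (((dev_priv__)->ring_mask & (1u << (i__))) && \
		    (((ring__) = &(dev_priv__)->ring[(i__)]), 1))

int main(void)
{
	struct mock_dev_priv dp = { .ring_mask = (1 << 0) | (1 << 4) };
	struct mock_engine *ring;
	int i;

	for_each_active_ring(ring, &dp, i)
		printf("ring %d at %p\n", i, (void *)ring); /* 0 and 4 only */
	return 0;
}
```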
> -----Original Message-----
> From: Intel-gfx [mailto:intel-gfx-bounces@lists.freedesktop.org] On Behalf
> Of Ben Widawsky
> Sent: Saturday, June 28, 2014 10:56 PM
> To: Chris Wilson; Rodrigo Vivi; Widawsky, Benjamin; Intel GFX
> Subject: Re: [Intel-gfx] [PATCH] drm/i915: Fix some NUM_RING iterators
>
> [snip]
>
> I'd rather the other patch get in (for_each_active_ring), but it's tied up
> with execlists atm, and I continue to think this is a useful way to iterate
> over the rings in error conditions and during reset.

I dropped that patch, since it received some resistance and I couldn't
really justify it on the Execlists series anymore (on the latest versions
we don't introduce new for i < I915_NUM_RINGS). I imagine the patch could
be sent again as a standalone?

-- Oscar
On Mon, Jun 30, 2014 at 11:27:25AM +0000, Mateo Lozano, Oscar wrote:
> > [snip]
> >
> > I'd rather the other patch get in (for_each_active_ring), but it's tied
> > up with execlists atm, and I continue to think this is a useful way to
> > iterate over the rings in error conditions and during reset.
>
> I dropped that patch, since it received some resistance and I couldn't
> really justify it on the Execlists series anymore (on the latest versions
> we don't introduce new for i < I915_NUM_RINGS). I imagine the patch could
> be sent again as a standalone?

With Chris' patch to no longer tear down ring structures over reset/system
suspend we should be able to always use for_each_ring. If not that means we
still have some fun to look at.

In any case I'm always happy to merge such drive-by cleanup patches, no
need to have a big patch series to justify it. Well as long as it's indeed
a step forward, which occasionally is a contentious topic ...
-Daniel
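For reference, the for_each_ring() the thread wants usable everywhere looked roughly like the sketch below at the time (a reconstruction, not a verbatim quote of the era's i915_drv.h); its initialization filter is precisely why it cannot be used before ring setup or after teardown:

```c
/* Approximate shape of i915's for_each_ring() circa mid-2014
 * (reconstructed from memory, not verbatim): it yields only rings that
 * pass intel_ring_initialized(), so it produces nothing before ring
 * setup or after teardown -- the gap this patch tries to fill, and the
 * gap Chris' "keep ring structures across reset" work closes instead. */
#define for_each_ring(ring__, dev_priv__, i__) \
	for ((i__) = 0; (i__) < I915_NUM_RINGS; (i__)++) \
		if (((ring__) = &(dev_priv__)->ring[(i__)]), \
		    intel_ring_initialized((ring__)))
```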
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index b9bac25..0c044a9 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -403,7 +403,7 @@ void i915_gem_context_reset(struct drm_device *dev)
 
 	/* Prevent the hardware from restoring the last context (which hung) on
 	 * the next switch */
-	for (i = 0; i < I915_NUM_RINGS; i++) {
+	for (i = 0; i < I915_ACTIVE_RINGS(dev); i++) {
 		struct intel_engine_cs *ring = &dev_priv->ring[i];
 		struct intel_context *dctx = ring->default_context;
 
@@ -456,7 +456,7 @@ int i915_gem_context_init(struct drm_device *dev)
 	}
 
 	/* NB: RCS will hold a ref for all rings */
-	for (i = 0; i < I915_NUM_RINGS; i++)
+	for (i = 0; i < I915_ACTIVE_RINGS(dev); i++)
 		dev_priv->ring[i].default_context = ctx;
 
 	DRM_DEBUG_DRIVER("%s context support initialized\n",
 			dev_priv->hw_context_size ? "HW" : "fake");
@@ -493,7 +493,7 @@ void i915_gem_context_fini(struct drm_device *dev)
 		i915_gem_object_ggtt_unpin(dctx->obj);
 	}
 
-	for (i = 0; i < I915_NUM_RINGS; i++) {
+	for (i = 0; i < I915_ACTIVE_RINGS(dev); i++) {
 		struct intel_engine_cs *ring = &dev_priv->ring[i];
 
 		if (ring->last_context)
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 86362de..6e5250d 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -848,7 +848,7 @@ static uint32_t i915_error_generate_code(struct drm_i915_private *dev_priv,
 	 * synchronization commands which almost always appear in the case
 	 * strictly a client bug. Use instdone to differentiate those some.
 	 */
-	for (i = 0; i < I915_NUM_RINGS; i++) {
+	for (i = 0; i < I915_ACTIVE_RINGS(dev_priv->dev); i++) {
 		if (error->ring[i].hangcheck_action == HANGCHECK_HUNG) {
 			if (ring_id)
 				*ring_id = i;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index e72017b..67e2919 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -90,6 +90,8 @@ struct intel_engine_cs {
 	} id;
 #define I915_NUM_RINGS 5
 #define LAST_USER_RING (VECS + 1)
+#define I915_ACTIVE_RINGS(dev) hweight8(INTEL_INFO(dev)->ring_mask)
+
 	u32 mmio_base;
 	struct drm_device *dev;
 	struct intel_ringbuffer *buffer;
There are some cases in the code where we need to know how many rings to
iterate over, but cannot use for_each_ring(). These are always error cases
which happen either before ring setup, or after ring teardown (or reset).

Note, a NUM_RINGS issue exists in semaphores, but this is fixed by the
remaining semaphore patches which Rodrigo will resubmit shortly. I'd
rather see those patches for fixing the problem than fix it here.

I found this initially for the BSD2 case where on the same platform we
can have differing rings. AFAICT however this affects many platforms.

I'd CC stable on this, except I think all the issues have been around
for multiple releases without bug reports.

Compile tested only for now.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gem_context.c | 6 +++---
 drivers/gpu/drm/i915/i915_gpu_error.c   | 2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.h | 2 ++
 3 files changed, 6 insertions(+), 4 deletions(-)