Message ID | 1369794154-1639-19-git-send-email-ben@bwidawsk.net (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Tue, May 28, 2013 at 07:22:34PM -0700, Ben Widawsky wrote: > From: "Xiang, Haihao" <haihao.xiang@intel.com> > > This will let userland only try to use the new ring > when the appropriate kernel is present > > Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com> > Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net> So originally I wanted to take a closer look at the interrupt handling changes before merging them. But somehow my brain was notoriously not up to the task of reviewing tricky interrupt stuff. Anyway here's my list of things I've spotted while applying patches: - Unrelated, but spotted while checking interrupt code: ironlake_enable_display_irq is not called with the irq_lock held everywhere, and some are outside of the irq setup/teradown (so real races). Specifically commit 8664281b64c457705db72fc60143d03827e75ca9 Author: Paulo Zanoni <paulo.r.zanoni@intel.com> Date: Fri Apr 12 17:57:57 2013 -0300 drm/i915: report Gen5+ CPU and PCH FIFO underruns broke stuff. But I guess a full review of the interrupt handling code should be in order ... At least we should sprinkle assert_spin_locked harder. Now the real stuff: - rmw register access isn't paranoid, but imo fragile. Either we don't need it, and then it just obfuscates the code (especially with scary comments around). Or there's indeed a race somewhere, and then rmw is _really_ good at papering over it. Until it blows up randomly and no one has a clue why. So I'm not a fan, and I've pretty much exlcusive just killed them. These patches add _lots_ of them instead. - rps.lock could just be killed and existing users switched over to irq_lock. Would simplify a lot in the interrupt handling. E.g. the special ring->irqrefcount could just be dropped. - rps setup is async now (yeah, that's newer than these patches, still). I think I've seen a few races around that ... - There's no inconsistencies in the PM rps interrupt setup on snb vs ivb/hsw. Such inconsistency tend to be fragile (but I haven't spotted something which would blow up on snb). - CS_MASTER interrupt handling: None of the other rings seem to enable this on gen5+. Not sure whether it actually works correctly, I suspect at least i915_report_and_clear_eir needs some more code ... - Calling queue_work from the spin_lock protection smells a bit like cargo-culting (I've reinstated that one since later patches would have been broken). - hsw_pm_irq_handler has no need to unconditionally take the spinlock. Patches are merged for now, but I'm not happy by far. Cheers, Daniel
On Fri, May 31, 2013 at 08:52:29PM +0200, Daniel Vetter wrote: > On Tue, May 28, 2013 at 07:22:34PM -0700, Ben Widawsky wrote: > > From: "Xiang, Haihao" <haihao.xiang@intel.com> > > > > This will let userland only try to use the new ring > > when the appropriate kernel is present > > > > Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com> > > Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net> > > So originally I wanted to take a closer look at the interrupt handling > changes before merging them. But somehow my brain was notoriously not up > to the task of reviewing tricky interrupt stuff. Anyway here's my list of > things I've spotted while applying patches: > > - Unrelated, but spotted while checking interrupt code: > ironlake_enable_display_irq is not called with the irq_lock held > everywhere, and some are outside of the irq setup/teradown (so real > races). Specifically > > commit 8664281b64c457705db72fc60143d03827e75ca9 > Author: Paulo Zanoni <paulo.r.zanoni@intel.com> > Date: Fri Apr 12 17:57:57 2013 -0300 > > drm/i915: report Gen5+ CPU and PCH FIFO underruns > > broke stuff. But I guess a full review of the interrupt handling code > should be in order ... At least we should sprinkle assert_spin_locked > harder. > > Now the real stuff: > > - rmw register access isn't paranoid, but imo fragile. Either we don't > need it, and then it just obfuscates the code (especially with scary > comments around). Or there's indeed a race somewhere, and then rmw is > _really_ good at papering over it. Until it blows up randomly and no one > has a clue why. > > So I'm not a fan, and I've pretty much exlcusive just killed them. These > patches add _lots_ of them instead. > > - rps.lock could just be killed and existing users switched over to > irq_lock. Would simplify a lot in the interrupt handling. E.g. the > special ring->irqrefcount could just be dropped. > > - rps setup is async now (yeah, that's newer than these patches, still). I > think I've seen a few races around that ... > > - There's no inconsistencies in the PM rps interrupt setup on snb vs > ivb/hsw. Such inconsistency tend to be fragile (but I haven't spotted > something which would blow up on snb). > > - CS_MASTER interrupt handling: None of the other rings seem to enable > this on gen5+. Not sure whether it actually works correctly, I suspect > at least i915_report_and_clear_eir needs some more code ... > > - Calling queue_work from the spin_lock protection smells a bit like > cargo-culting (I've reinstated that one since later patches would have > been broken). > > - hsw_pm_irq_handler has no need to unconditionally take the spinlock. > > Patches are merged for now, but I'm not happy by far. > > Cheers, Daniel Of these, which have you changed/fixed? That way if someone wants to start addressing the issues, they know where to start.
On Fri, May 31, 2013 at 12:52:03PM -0700, Ben Widawsky wrote: > On Fri, May 31, 2013 at 08:52:29PM +0200, Daniel Vetter wrote: > > On Tue, May 28, 2013 at 07:22:34PM -0700, Ben Widawsky wrote: > > > From: "Xiang, Haihao" <haihao.xiang@intel.com> > > > > > > This will let userland only try to use the new ring > > > when the appropriate kernel is present > > > > > > Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com> > > > Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> > > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net> > > > > So originally I wanted to take a closer look at the interrupt handling > > changes before merging them. But somehow my brain was notoriously not up > > to the task of reviewing tricky interrupt stuff. Anyway here's my list of > > things I've spotted while applying patches: > > > > - Unrelated, but spotted while checking interrupt code: > > ironlake_enable_display_irq is not called with the irq_lock held > > everywhere, and some are outside of the irq setup/teradown (so real > > races). Specifically > > > > commit 8664281b64c457705db72fc60143d03827e75ca9 > > Author: Paulo Zanoni <paulo.r.zanoni@intel.com> > > Date: Fri Apr 12 17:57:57 2013 -0300 > > > > drm/i915: report Gen5+ CPU and PCH FIFO underruns > > > > broke stuff. But I guess a full review of the interrupt handling code > > should be in order ... At least we should sprinkle assert_spin_locked > > harder. > > > > Now the real stuff: > > > > - rmw register access isn't paranoid, but imo fragile. Either we don't > > need it, and then it just obfuscates the code (especially with scary > > comments around). Or there's indeed a race somewhere, and then rmw is > > _really_ good at papering over it. Until it blows up randomly and no one > > has a clue why. > > > > So I'm not a fan, and I've pretty much exlcusive just killed them. These > > patches add _lots_ of them instead. > > > > - rps.lock could just be killed and existing users switched over to > > irq_lock. Would simplify a lot in the interrupt handling. E.g. the > > special ring->irqrefcount could just be dropped. > > > > - rps setup is async now (yeah, that's newer than these patches, still). I > > think I've seen a few races around that ... > > > > - There's no inconsistencies in the PM rps interrupt setup on snb vs *now* > > ivb/hsw. Such inconsistency tend to be fragile (but I haven't spotted > > something which would blow up on snb). > > > > - CS_MASTER interrupt handling: None of the other rings seem to enable > > this on gen5+. Not sure whether it actually works correctly, I suspect > > at least i915_report_and_clear_eir needs some more code ... > > > > - Calling queue_work from the spin_lock protection smells a bit like > > cargo-culting (I've reinstated that one since later patches would have > > been broken). > > > > - hsw_pm_irq_handler has no need to unconditionally take the spinlock. > > > > Patches are merged for now, but I'm not happy by far. > > > > Cheers, Daniel > > Of these, which have you changed/fixed? That way if someone wants to > start addressing the issues, they know where to start. I've added a FIXME comment for the dropped WARN (which isn't in this list, oops). Otherwise I've dropped all my changes since it grew unwiedly, fast. But anyway that list above is only a rough guideline of things that had a funny smell while reading. Like I've said my brain wasn't on top of things to do a full irq handling review. -Daniel
diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c index 3cd2b60..03ee193 100644 --- a/drivers/gpu/drm/i915/i915_dma.c +++ b/drivers/gpu/drm/i915/i915_dma.c @@ -956,6 +956,9 @@ static int i915_getparam(struct drm_device *dev, void *data, case I915_PARAM_HAS_BLT: value = intel_ring_initialized(&dev_priv->ring[BCS]); break; + case I915_PARAM_HAS_VEBOX: + value = intel_ring_initialized(&dev_priv->ring[VECS]); + break; case I915_PARAM_HAS_RELAXED_FENCING: value = 1; break; diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h index 81b9981..923ed7f 100644 --- a/include/uapi/drm/i915_drm.h +++ b/include/uapi/drm/i915_drm.h @@ -305,7 +305,7 @@ typedef struct drm_i915_irq_wait { #define I915_PARAM_HAS_WAIT_TIMEOUT 19 #define I915_PARAM_HAS_SEMAPHORES 20 #define I915_PARAM_HAS_PRIME_VMAP_FLUSH 21 -#define I915_PARAM_RSVD_FOR_FUTURE_USE 22 +#define I915_PARAM_HAS_VEBOX 22 #define I915_PARAM_HAS_SECURE_BATCHES 23 #define I915_PARAM_HAS_PINNED_BATCHES 24 #define I915_PARAM_HAS_EXEC_NO_RELOC 25