Message ID | 1441138895-23732-5-git-send-email-eich@suse.de (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Tue, Sep 01, 2015 at 10:21:35PM +0200, Egbert Eich wrote: > A HPD interrupt may fire during intel_crt_detect_hotplug() - especially > when HPD interrupt storms occur. > Since the interrupt handler changes the enabled interrupt lines when it > detects a storm this races with intel_crt_detect_hotplug(). > To avoid this, shiled the rmw cycles with IRQ save spinlocks. > > Signed-off-by: Egbert Eich <eich@suse.de> I think this only reduces one source of such races, but fundamentally we can't avoid them. E.g. if you're _very_ unlucky you might cause a real hpd right when the strom code frobs around. Plugging the race with this known interrupt source here doesn't fix the fundamental issue. Also I think the actual interrupt deliver is fairly asynchronous, both in the hardware and in the sw handling. E.g. spin_lock_irq does not synchronize with the interrupt handler on SMP systems, it only guarantees that it's not running concurrently on the same cpu (which would deadlock). I think fundamentally this race is unfixable. -Daniel > --- > drivers/gpu/drm/i915/intel_crt.c | 17 ++++++++++++----- > 1 file changed, 12 insertions(+), 5 deletions(-) > > diff --git a/drivers/gpu/drm/i915/intel_crt.c b/drivers/gpu/drm/i915/intel_crt.c > index af5e43b..685f3de 100644 > --- a/drivers/gpu/drm/i915/intel_crt.c > +++ b/drivers/gpu/drm/i915/intel_crt.c > @@ -376,9 +376,10 @@ static bool intel_crt_detect_hotplug(struct drm_connector *connector) > { > struct drm_device *dev = connector->dev; > struct drm_i915_private *dev_priv = dev->dev_private; > - u32 hotplug_en, orig, stat; > + u32 hotplug_en, stat; > bool ret = false; > int i, tries = 0; > + unsigned long irqflags; > > if (HAS_PCH_SPLIT(dev)) > return intel_ironlake_crt_detect_hotplug(connector); > @@ -395,12 +396,14 @@ static bool intel_crt_detect_hotplug(struct drm_connector *connector) > tries = 2; > else > tries = 1; > - hotplug_en = orig = I915_READ(PORT_HOTPLUG_EN); > - hotplug_en |= CRT_HOTPLUG_FORCE_DETECT; > > for (i = 0; i < tries ; i++) { > /* turn on the FORCE_DETECT */ > - I915_WRITE(PORT_HOTPLUG_EN, hotplug_en); > + spin_lock_irqsave(&dev_priv->irq_lock, irqflags); > + hotplug_en = I915_READ(PORT_HOTPLUG_EN); > + I915_WRITE(PORT_HOTPLUG_EN, hotplug_en | > + CRT_HOTPLUG_FORCE_DETECT); > + spin_unlock_irqrestore(&dev_priv->irq_lock, irqflags); > /* wait for FORCE_DETECT to go off */ > if (wait_for((I915_READ(PORT_HOTPLUG_EN) & > CRT_HOTPLUG_FORCE_DETECT) == 0, > @@ -416,7 +419,11 @@ static bool intel_crt_detect_hotplug(struct drm_connector *connector) > I915_WRITE(PORT_HOTPLUG_STAT, CRT_HOTPLUG_INT_STATUS); > > /* and put the bits back */ > - I915_WRITE(PORT_HOTPLUG_EN, orig); > + spin_lock_irqsave(&dev_priv->irq_lock, irqflags); > + hotplug_en = I915_READ(PORT_HOTPLUG_EN); > + hotplug_en &= ~CRT_HOTPLUG_FORCE_DETECT; > + I915_WRITE(PORT_HOTPLUG_EN, hotplug_en); > + spin_unlock_irqrestore(&dev_priv->irq_lock, irqflags); > > return ret; > } > -- > 1.8.4.5 >
Daniel Vetter writes: > On Tue, Sep 01, 2015 at 10:21:35PM +0200, Egbert Eich wrote: > > A HPD interrupt may fire during intel_crt_detect_hotplug() - especially > > when HPD interrupt storms occur. > > Since the interrupt handler changes the enabled interrupt lines when it > > detects a storm this races with intel_crt_detect_hotplug(). > > To avoid this, shiled the rmw cycles with IRQ save spinlocks. > > > > Signed-off-by: Egbert Eich <eich@suse.de> > > I think this only reduces one source of such races, but fundamentally we > can't avoid them. E.g. if you're _very_ unlucky you might cause a real hpd > right when the strom code frobs around. Plugging the race with this known This is exactly the scenatio I'm getting here. I get HPD interrupts at an order of 10^4 / sec. > interrupt source here doesn't fix the fundamental issue. > > Also I think the actual interrupt deliver is fairly asynchronous, both in > the hardware and in the sw handling. E.g. spin_lock_irq does not > synchronize with the interrupt handler on SMP systems, it only guarantees > that it's not running concurrently on the same cpu (which would deadlock). > > I think fundamentally this race is unfixable. There is one important race we avoid with this patch: It is that we change the PORT_HOTPLUG_EN concurrently in the interrupt handler (thru xxx_hpd_irq_setup() and in intel_crt_detect_hotplug()). With the test system that I have here the old version of the code easily runs into this as the time spent inside intel_crt_detect_hotplug() is quite long - especially when no CRT is connected. What happens intel_crt_detect_hotplug() reads the value of PORT_HOTPLUG_EN at entry, then frobs around for ages, during this time several HPD interrupts occur, the storm is detected, the bit related to the stormy line is unset then after ages intel_crt_detect_hotplug() decides to give up and restores the value it read on entry. To remedy this this patch reads the value of PORT_HOTPLUG_EN whenever it is going to change it and adds the spin locks to make sure the read-modify-write cycles don't happen concurrently. PORT_HOTPLUG_EN is only touched in two places: in xxx_hpd_irq_setup() and in intel_crt_detect_hotplug(), the former can be called from an interrupt handler. Not sure why you see a problem here. Cheers, Egbert.
On Wed, 02 Sep 2015, Egbert Eich <eich@suse.com> wrote: > This is exactly the scenatio I'm getting here. I get HPD interrupts at an > order of 10^4 / sec. Makes you wonder if either you have faulty hardware or we are configuring the hardware wrong (we overlook some configuration about some voltage/duration threshold maybe, or get irqs from a line that's floating, or something). BR, Jani.
On Wed, Sep 02, 2015 at 04:19:00PM +0200, Egbert Eich wrote: > Daniel Vetter writes: > > On Tue, Sep 01, 2015 at 10:21:35PM +0200, Egbert Eich wrote: > > > A HPD interrupt may fire during intel_crt_detect_hotplug() - especially > > > when HPD interrupt storms occur. > > > Since the interrupt handler changes the enabled interrupt lines when it > > > detects a storm this races with intel_crt_detect_hotplug(). > > > To avoid this, shiled the rmw cycles with IRQ save spinlocks. > > > > > > Signed-off-by: Egbert Eich <eich@suse.de> > > > > I think this only reduces one source of such races, but fundamentally we > > can't avoid them. E.g. if you're _very_ unlucky you might cause a real hpd > > right when the strom code frobs around. Plugging the race with this known > > This is exactly the scenatio I'm getting here. I get HPD interrupts at an > order of 10^4 / sec. > > > interrupt source here doesn't fix the fundamental issue. > > > > Also I think the actual interrupt deliver is fairly asynchronous, both in > > the hardware and in the sw handling. E.g. spin_lock_irq does not > > synchronize with the interrupt handler on SMP systems, it only guarantees > > that it's not running concurrently on the same cpu (which would deadlock). > > > > I think fundamentally this race is unfixable. > > > There is one important race we avoid with this patch: It is that > we change the PORT_HOTPLUG_EN concurrently in the interrupt handler > (thru xxx_hpd_irq_setup() and in intel_crt_detect_hotplug()). > > With the test system that I have here the old version of the code > easily runs into this as the time spent inside intel_crt_detect_hotplug() > is quite long - especially when no CRT is connected. > > What happens intel_crt_detect_hotplug() reads the value of PORT_HOTPLUG_EN > at entry, then frobs around for ages, during this time several HPD interrupts > occur, the storm is detected, the bit related to the stormy line is unset > then after ages intel_crt_detect_hotplug() decides to give up and restores > the value it read on entry. > > To remedy this this patch reads the value of PORT_HOTPLUG_EN whenever > it is going to change it and adds the spin locks to make sure the > read-modify-write cycles don't happen concurrently. > > PORT_HOTPLUG_EN is only touched in two places: in xxx_hpd_irq_setup() > and in intel_crt_detect_hotplug(), the former can be called from an > interrupt handler. > > Not sure why you see a problem here. Hm I missed that this same register is also accessed by the irq handler code, and it's not just that touching these bits can cause interrupts. So yeah we need your patch, but it needs to be clearer in the commit message that there's also trouble with concurrent register access to PORT_HOTPLUG_EN. Also I think a commen in the code why we grab that spinlock would be good. For that extracting a small helper to manipulate the register (like we do with other irq mask registers with functions like ilk_update_gt_irq) would be good - then we have just one place to put that commment. -Daniel
Jani Nikula writes: > On Wed, 02 Sep 2015, Egbert Eich <eich@suse.com> wrote: > > This is exactly the scenatio I'm getting here. I get HPD interrupts at an > > order of 10^4 / sec. > > Makes you wonder if either you have faulty hardware or we are > configuring the hardware wrong (we overlook some configuration about > some voltage/duration threshold maybe, or get irqs from a line that's > floating, or something). It is faulty hardware. But it is not a single machine that broke. It is an entire series. IMHO due to bad signal routing and poor shilding there is crosstalk on the SDVO lines signaling the plug status. Since SDVO uses PCIe lines it is AC coupled, if I recall correctly from reading the specs long time ago, one status is signalled by a 10MHz signal, the other by 20MHz. At the time when I implemented this I've seen other reports from systems which showed similar problems under certain conditions(*) - although not quite as bad, therefore I thought of a general solution to get rid of this once and for all. If this had only been one system with this problem, I would just have blacklisted it. (*) It seems that this somewhat depends on the video mode set (supports the crosstalk theory) but I also had a report where this occurred at certain charging levels and whether a power supply was connected or not. Cheers, Egbert.
Daniel Vetter writes: > On Wed, Sep 02, 2015 at 04:19:00PM +0200, Egbert Eich wrote: > > Hm I missed that this same register is also accessed by the irq handler > code, and it's not just that touching these bits can cause interrupts. So > yeah we need your patch, but it needs to be clearer in the commit message > that there's also trouble with concurrent register access to > PORT_HOTPLUG_EN. > > Also I think a commen in the code why we grab that spinlock would be good. > For that extracting a small helper to manipulate the register (like we do > with other irq mask registers with functions like ilk_update_gt_irq) would > be good - then we have just one place to put that commment. OK, I will come up with a suggestion. Cheers, Egbert.
diff --git a/drivers/gpu/drm/i915/intel_crt.c b/drivers/gpu/drm/i915/intel_crt.c index af5e43b..685f3de 100644 --- a/drivers/gpu/drm/i915/intel_crt.c +++ b/drivers/gpu/drm/i915/intel_crt.c @@ -376,9 +376,10 @@ static bool intel_crt_detect_hotplug(struct drm_connector *connector) { struct drm_device *dev = connector->dev; struct drm_i915_private *dev_priv = dev->dev_private; - u32 hotplug_en, orig, stat; + u32 hotplug_en, stat; bool ret = false; int i, tries = 0; + unsigned long irqflags; if (HAS_PCH_SPLIT(dev)) return intel_ironlake_crt_detect_hotplug(connector); @@ -395,12 +396,14 @@ static bool intel_crt_detect_hotplug(struct drm_connector *connector) tries = 2; else tries = 1; - hotplug_en = orig = I915_READ(PORT_HOTPLUG_EN); - hotplug_en |= CRT_HOTPLUG_FORCE_DETECT; for (i = 0; i < tries ; i++) { /* turn on the FORCE_DETECT */ - I915_WRITE(PORT_HOTPLUG_EN, hotplug_en); + spin_lock_irqsave(&dev_priv->irq_lock, irqflags); + hotplug_en = I915_READ(PORT_HOTPLUG_EN); + I915_WRITE(PORT_HOTPLUG_EN, hotplug_en | + CRT_HOTPLUG_FORCE_DETECT); + spin_unlock_irqrestore(&dev_priv->irq_lock, irqflags); /* wait for FORCE_DETECT to go off */ if (wait_for((I915_READ(PORT_HOTPLUG_EN) & CRT_HOTPLUG_FORCE_DETECT) == 0, @@ -416,7 +419,11 @@ static bool intel_crt_detect_hotplug(struct drm_connector *connector) I915_WRITE(PORT_HOTPLUG_STAT, CRT_HOTPLUG_INT_STATUS); /* and put the bits back */ - I915_WRITE(PORT_HOTPLUG_EN, orig); + spin_lock_irqsave(&dev_priv->irq_lock, irqflags); + hotplug_en = I915_READ(PORT_HOTPLUG_EN); + hotplug_en &= ~CRT_HOTPLUG_FORCE_DETECT; + I915_WRITE(PORT_HOTPLUG_EN, hotplug_en); + spin_unlock_irqrestore(&dev_priv->irq_lock, irqflags); return ret; }
A HPD interrupt may fire during intel_crt_detect_hotplug() - especially when HPD interrupt storms occur. Since the interrupt handler changes the enabled interrupt lines when it detects a storm this races with intel_crt_detect_hotplug(). To avoid this, shiled the rmw cycles with IRQ save spinlocks. Signed-off-by: Egbert Eich <eich@suse.de> --- drivers/gpu/drm/i915/intel_crt.c | 17 ++++++++++++----- 1 file changed, 12 insertions(+), 5 deletions(-)