Message ID | YgqmfKhwU5spS069@linutronix.de (mailing list archive) |
---|---|
State | New, archived |
Series | drm/i915: Depend on !PREEMPT_RT. |
On 2022-02-14 19:59:08 [+0100], To intel-gfx@lists.freedesktop.org wrote:
> There are a few sections in the driver which are not compatible with
> PREEMPT_RT. They trigger warnings and can lead to deadlocks at runtime.
>
> Disable the i915 driver on a PREEMPT_RT enabled kernel. This way
> PREEMPT_RT itself can be enabled without needing to address the i915
> issues first. The RT related patches are still in RT queue and will be
> handled later.
>
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

A gentle ping ;)

> ---
>  drivers/gpu/drm/i915/Kconfig | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
> index a4c94dc2e2164..3aa719d5a0f0d 100644
> --- a/drivers/gpu/drm/i915/Kconfig
> +++ b/drivers/gpu/drm/i915/Kconfig
> @@ -3,6 +3,7 @@ config DRM_I915
>  	tristate "Intel 8xx/9xx/G3x/G4x/HD Graphics"
>  	depends on DRM
>  	depends on X86 && PCI
> +	depends on !PREEMPT_RT
>  	select INTEL_GTT
>  	select INTERVAL_TREE
>  	# we need shmfs for the swappable backing store, and in particular

Sebastian
Hi, On 25/02/2022 23:03, Sebastian Andrzej Siewior wrote: > On 2022-02-14 19:59:08 [+0100], To intel-gfx@lists.freedesktop.org wrote: >> There are a few sections in the driver which are not compatible with >> PREEMPT_RT. They trigger warnings and can lead to deadlocks at runtime. >> >> Disable the i915 driver on a PREEMPT_RT enabled kernel. This way >> PREEMPT_RT itself can be enabled without needing to address the i915 >> issues first. The RT related patches are still in RT queue and will be >> handled later. >> >> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> > > A gentle ping ;) Could you paste a link to the queue of i915 patches pending for a quick overview of how much work there is and in what areas? Also, I assume due absence of ARCH_SUPPORTS_RT being defined by any arch, that something more is not yet ready? Regards, Tvrtko >> --- >> drivers/gpu/drm/i915/Kconfig | 1 + >> 1 file changed, 1 insertion(+) >> >> diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig >> index a4c94dc2e2164..3aa719d5a0f0d 100644 >> --- a/drivers/gpu/drm/i915/Kconfig >> +++ b/drivers/gpu/drm/i915/Kconfig >> @@ -3,6 +3,7 @@ config DRM_I915 >> tristate "Intel 8xx/9xx/G3x/G4x/HD Graphics" >> depends on DRM >> depends on X86 && PCI >> + depends on !PREEMPT_RT >> select INTEL_GTT >> select INTERVAL_TREE >> # we need shmfs for the swappable backing store, and in particular > > Sebastian
On 2022-02-28 10:10:48 [+0000], Tvrtko Ursulin wrote:
> Hi,

Hi,

> Could you paste a link to the queue of i915 patches pending for a quick
> overview of how much work there is and in what areas?

Last post to the list:
https://lore.kernel.org/all/20211214140301.520464-1-bigeasy@linutronix.de/

or if you look at the DRM section in
https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git/tree/patches/series?h=v5.17-rc6-rt10-patches#n156

you see:
0003-drm-i915-Use-preempt_disable-enable_rt-where-recomme.patch
0004-drm-i915-Don-t-disable-interrupts-on-PREEMPT_RT-duri.patch
0005-drm-i915-Don-t-check-for-atomic-context-on-PREEMPT_R.patch
0006-drm-i915-Disable-tracing-points-on-PREEMPT_RT.patch
0007-drm-i915-skip-DRM_I915_LOW_LEVEL_TRACEPOINTS-with-NO.patch
0008-drm-i915-gt-Queue-and-wait-for-the-irq_work-item.patch
0009-drm-i915-gt-Use-spin_lock_irq-instead-of-local_irq_d.patch
0010-drm-i915-Drop-the-irqs_disabled-check.patch
Revert-drm-i915-Depend-on-PREEMPT_RT.patch

and you could view them from
https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git/tree/patches?h=v5.17-rc6-rt10-patches

> Also, I assume due absence of ARCH_SUPPORTS_RT being defined by any arch,
> that something more is not yet ready?

Correct. Looking at what I have queued for the next merge window I have
less than 20 patches (excluding i915 and printk) before ARCH_SUPPORTS_RT
can be enabled for x86-64.

> Regards,
>
> Tvrtko

Sebastian
On 28/02/2022 10:35, Sebastian Andrzej Siewior wrote: > On 2022-02-28 10:10:48 [+0000], Tvrtko Ursulin wrote: >> Hi, > Hi, > >> Could you paste a link to the queue of i915 patches pending for a quick >> overview of how much work there is and in what areas? > > Last post to the list: > https://https://lkml.kernel.org/r/.kernel.org/all/20211214140301.520464-1-bigeasy@linutronix.de/ > > or if you look at the DRM section in > https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git/tree/patches/series?h=v5.17-rc6-rt10-patches#n156 Thanks! > you see: > 0003-drm-i915-Use-preempt_disable-enable_rt-where-recomme.patch > 0004-drm-i915-Don-t-disable-interrupts-on-PREEMPT_RT-duri.patch Two for the display folks. > 0005-drm-i915-Don-t-check-for-atomic-context-on-PREEMPT_R.patch What do preempt_disable/enable do on PREEMPT_RT? Thinking if instead the solution could be to always force the !ATOMIC path (for the whole _wait_for_atomic macro) on PREEMPT_RT. > 0006-drm-i915-Disable-tracing-points-on-PREEMPT_RT.patch If the issue is only with certain trace points why disable all? > 0007-drm-i915-skip-DRM_I915_LOW_LEVEL_TRACEPOINTS-with-NO.patch Didn't quite fully understand, why is this not fixable? Especially thinking if the option of not blanket disabling all tracepoints in the previous patch. > 0008-drm-i915-gt-Queue-and-wait-for-the-irq_work-item.patch Not sure about why cond_resched was put between irq_work_queue and irq_work_sync - would it not be like-for-like change to have the two together? Commit message makes me think _queue already starts the handler on x86 at least. > 0009-drm-i915-gt-Use-spin_lock_irq-instead-of-local_irq_d.patch I think this is okay. The part after the unlock is serialized by the tasklet already. Slight doubt due the comment: local_irq_enable(); /* flush irq_work (e.g. breadcrumb enabling) */ Makes me want to think about it harder but not now. 
Another thing to check is if execlists_context_status_change still needs the atomic notifier chain with this change. > 0010-drm-i915-Drop-the-irqs_disabled-check.patch LGTM. > Revert-drm-i915-Depend-on-PREEMPT_RT.patch Okay. And finally for this very patch (the thread I am replying to): Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Regards, Tvrtko > > and you could view them from > https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git/tree/patches?h=v5.17-rc6-rt10-patches > >> Also, I assume due absence of ARCH_SUPPORTS_RT being defined by any arch, >> that something more is not yet ready? > > Correct. Looking at what I have queued for the next merge window I have > less than 20 patches (excluding i915 and printk) before ARCH_SUPPORTS_RT > can be enabled for x86-64. > >> Regards, >> >> Tvrtko > > Sebastian
On 2022-03-01 14:27:18 [+0000], Tvrtko Ursulin wrote: > > you see: > > 0003-drm-i915-Use-preempt_disable-enable_rt-where-recomme.patch > > 0004-drm-i915-Don-t-disable-interrupts-on-PREEMPT_RT-duri.patch > > Two for the display folks. > > > 0005-drm-i915-Don-t-check-for-atomic-context-on-PREEMPT_R.patch > > What do preempt_disable/enable do on PREEMPT_RT? Thinking if instead the > solution could be to always force the !ATOMIC path (for the whole > _wait_for_atomic macro) on PREEMPT_RT. Could be one way to handle it. But please don't disable preemption and or interrupts for longer period of time as all of it increases the overall latency. Side note: All of these patches is a collection over time. I personally have only a single i7-sandybridge with i915 and here I don't really enter all the possible paths here. People report, I patch and look around and then they are quiet so I assume that it is working. > > 0006-drm-i915-Disable-tracing-points-on-PREEMPT_RT.patch > > If the issue is only with certain trace points why disable all? It is a class and it is easier that way. > > 0007-drm-i915-skip-DRM_I915_LOW_LEVEL_TRACEPOINTS-with-NO.patch > > Didn't quite fully understand, why is this not fixable? Especially thinking > if the option of not blanket disabling all tracepoints in the previous > patch. The problem is that you can't acquire that lock from within that trace-point on PREEMPT_RT. On !RT it is possible but it is also problematic because LOCKDEP does not see possible dead locks unless that trace-point is enabled. I've been talking to Steven (after https://lkml.kernel.org/r/20211214115837.6f33a9b2@gandalf.local.home) and he wants to come up with something where you can pass a lock as argument to the tracing-API. That way the lock can be acquired before the trace event is invoked and lockdep will see it even if the trace event is disabled. So there is an idea how to get it to work eventually without disabling it in the long term. 
Making the register a raw_spinlock_t would solve problem immediately but I am a little worried given the increased latency in a quick test: https://lore.kernel.org/all/20211006164628.s2mtsdd2jdbfyf7g@linutronix.de/ also, this one single hardware but the upper limit atomic-polls is high. > > 0008-drm-i915-gt-Queue-and-wait-for-the-irq_work-item.patch > > Not sure about why cond_resched was put between irq_work_queue and > irq_work_sync - would it not be like-for-like change to have the two > together? maybe it loops for a while and an additional scheduling would be nice. > Commit message makes me think _queue already starts the handler on > x86 at least. Yes, irq_work_queue() triggers the IRQ right away on x86, irq_work_sync() would wait for it to happen in case it did not happen. On architectures which don't provide an IRQ-work interrupt, it is delayed to the HZ tick timer interrupt. So this serves also as an example in case someone want to copy the code ;) > > 0009-drm-i915-gt-Use-spin_lock_irq-instead-of-local_irq_d.patch > > I think this is okay. The part after the unlock is serialized by the tasklet > already. > > Slight doubt due the comment: > > local_irq_enable(); /* flush irq_work (e.g. breadcrumb enabling) */ > > Makes me want to think about it harder but not now. Clark reported it and confirmed that the warning is gone on RT and everything appears to work ;) PREEMPT_RT wise there is no synchronisation vs irq_work other than an actual lock (in case it is needed). > Another thing to check is if execlists_context_status_change still needs the > atomic notifier chain with this change. > > > 0010-drm-i915-Drop-the-irqs_disabled-check.patch > > LGTM. Do you want me to repost that one? > > Revert-drm-i915-Depend-on-PREEMPT_RT.patch > > Okay. > > And finally for this very patch (the thread I am replying to): > > Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Thanks. > Regards, > > Tvrtko Sebastian
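For readers following the irq_work discussion: the difference between an architecture that raises the irq_work interrupt right away and one that defers it to the tick can be illustrated with a toy userspace model. This is only a sketch. `toy_work`, `toy_work_queue()`, `toy_work_sync()` and the `immediate` flag are hypothetical names invented here and are not the kernel's irq_work API. The model assumes a single thread, so "waiting" in the sync helper simply delivers the pending tick itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a deferred work item.
 * This is NOT the kernel's struct irq_work. */
struct toy_work {
	void (*fn)(struct toy_work *w);
	bool pending;
	int runs;
};

/* Run the handler if the item is pending, as an interrupt or a
 * timer tick would do. */
static void toy_work_run(struct toy_work *w)
{
	if (w->pending) {
		w->pending = false;
		w->fn(w);
	}
}

/* Queue the item. With immediate == true the handler runs right away
 * (models an architecture with a dedicated interrupt); otherwise the
 * item stays pending until the next tick.
 * Returns false if the item was already pending. */
static bool toy_work_queue(struct toy_work *w, bool immediate)
{
	assert(w != NULL && w->fn != NULL);
	if (w->pending)
		return false;
	w->pending = true;
	if (immediate)
		toy_work_run(w);
	return true;
}

/* Wait until the item is no longer pending. In this single-threaded
 * model, waiting means delivering the tick ourselves.
 * Returns the number of ticks that were needed. */
static int toy_work_sync(struct toy_work *w)
{
	int ticks = 0;

	while (w->pending) {
		toy_work_run(w);
		ticks++;
	}
	return ticks;
}

/* Sample handler: count how often it ran. */
static void count_run(struct toy_work *w)
{
	w->runs++;
}
```

In this model a sync that directly follows a queue returns without waiting when the handler has already run, and only has work to do in the deferred case.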
On 01/03/2022 15:13, Sebastian Andrzej Siewior wrote: > On 2022-03-01 14:27:18 [+0000], Tvrtko Ursulin wrote: >>> you see: >>> 0003-drm-i915-Use-preempt_disable-enable_rt-where-recomme.patch >>> 0004-drm-i915-Don-t-disable-interrupts-on-PREEMPT_RT-duri.patch >> >> Two for the display folks. >> >>> 0005-drm-i915-Don-t-check-for-atomic-context-on-PREEMPT_R.patch >> >> What do preempt_disable/enable do on PREEMPT_RT? Thinking if instead the >> solution could be to always force the !ATOMIC path (for the whole >> _wait_for_atomic macro) on PREEMPT_RT. > > Could be one way to handle it. But please don't disable preemption and > or interrupts for longer period of time as all of it increases the > overall latency. I am looking for your guidance of what is the correct thing here. Main purpose of this macro on the i915 side is to do short waits on GPU registers changing post write from spin-locked sections. But there were rare cases when very short waits were needed from unlocked sections, shorter than 10us (which is AFAIR what usleep_range documents should be a lower limit). Which is why non-atomic path was added to the macro. That path uses preempt_disable/enable so it can use local_clock(). All this may, or may not be, compatible with PREEMPT_RT to start with? Or question phrased differently, how we should implement the <10us waits from non-atomic sections under PREEMPT_RT? > Side note: All of these patches is a collection over time. I personally > have only a single i7-sandybridge with i915 and here I don't really > enter all the possible paths here. People report, I patch and look > around and then they are quiet so I assume that it is working. > >>> 0006-drm-i915-Disable-tracing-points-on-PREEMPT_RT.patch >> >> If the issue is only with certain trace points why disable all? > > It is a class and it is easier that way. > >>> 0007-drm-i915-skip-DRM_I915_LOW_LEVEL_TRACEPOINTS-with-NO.patch >> >> Didn't quite fully understand, why is this not fixable? 
Especially thinking >> if the option of not blanket disabling all tracepoints in the previous >> patch. > > The problem is that you can't acquire that lock from within that > trace-point on PREEMPT_RT. On !RT it is possible but it is also > problematic because LOCKDEP does not see possible dead locks unless that > trace-point is enabled. Oh I meant could the include ordering problem be fixed differently? """ [PATCH 07/10] drm/i915: skip DRM_I915_LOW_LEVEL_TRACEPOINTS with NOTRACE The order of the header files is important. If this header file is included after tracepoint.h was included then the NOTRACE here becomes a nop. Currently this happens for two .c files which use the tracepoitns behind DRM_I915_LOW_LEVEL_TRACEPOINTS. """ Like these two .c files - can order of includes just be changed in them? > > I've been talking to Steven (after > https://lkml.kernel.org/r/20211214115837.6f33a9b2@gandalf.local.home) > and he wants to come up with something where you can pass a lock as > argument to the tracing-API. That way the lock can be acquired before > the trace event is invoked and lockdep will see it even if the trace > event is disabled. > So there is an idea how to get it to work eventually without disabling > it in the long term. > > Making the register a raw_spinlock_t would solve problem immediately but > I am a little worried given the increased latency in a quick test: > https://lore.kernel.org/all/20211006164628.s2mtsdd2jdbfyf7g@linutronix.de/ > > also, this one single hardware but the upper limit atomic-polls is high. > >>> 0008-drm-i915-gt-Queue-and-wait-for-the-irq_work-item.patch >> >> Not sure about why cond_resched was put between irq_work_queue and >> irq_work_sync - would it not be like-for-like change to have the two >> together? > > maybe it loops for a while and an additional scheduling would be nice. > >> Commit message makes me think _queue already starts the handler on >> x86 at least. 
> > Yes, irq_work_queue() triggers the IRQ right away on x86, > irq_work_sync() would wait for it to happen in case it did not happen. > On architectures which don't provide an IRQ-work interrupt, it is > delayed to the HZ tick timer interrupt. So this serves also as an > example in case someone want to copy the code ;) My question wasn't why is there a need_resched() in there, but why is the patch: + irq_work_queue(&b->irq_work); cond_resched(); + irq_work_sync(&b->irq_work); And not: + irq_work_queue(&b->irq_work); + irq_work_sync(&b->irq_work); cond_resched(); To preserve like for like, if my understanding of the commit message was correct. > >>> 0009-drm-i915-gt-Use-spin_lock_irq-instead-of-local_irq_d.patch >> >> I think this is okay. The part after the unlock is serialized by the tasklet >> already. >> >> Slight doubt due the comment: >> >> local_irq_enable(); /* flush irq_work (e.g. breadcrumb enabling) */ >> >> Makes me want to think about it harder but not now. > > Clark reported it and confirmed that the warning is gone on RT and > everything appears to work ;) I will need to think about it harder at some point. > PREEMPT_RT wise there is no synchronisation vs irq_work other than an > actual lock (in case it is needed). Okay, marking as todo/later for me. Need to see if enabling breadcrumbs earlier than it used to be after this patch makes any difference. >> Another thing to check is if execlists_context_status_change still needs the >> atomic notifier chain with this change. >> >>> 0010-drm-i915-Drop-the-irqs_disabled-check.patch >> >> LGTM. > > Do you want me to repost that one? I think it's up to you whether you go one by one, or repost the whole series or whatever. >>> Revert-drm-i915-Depend-on-PREEMPT_RT.patch >> >> Okay. >> >> And finally for this very patch (the thread I am replying to): >> >> Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> > > Thanks. Np - and I've pushed this one. Regards, Tvrtko
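The short register waits described in this mail boil down to polling a condition against a deadline. Below is a minimal userspace sketch of that pattern, assuming POSIX clock_gettime() in place of local_clock(). `poll_until()`, `now_ns()` and the two sample conditions are hypothetical names, and this is not the driver's _wait_for_atomic macro: it has no preemption control and does not pin the task to a CPU.

```c
#define _POSIX_C_SOURCE 199309L

#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Monotonic time in nanoseconds (userspace stand-in for a local clock). */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

typedef bool (*cond_fn)(void *ctx);

/* Busy-poll cond(ctx) until it returns true or timeout_us elapses.
 * Returns 0 on success and -ETIMEDOUT on timeout. The clock is read
 * before the condition, so the condition is always checked once more
 * after the deadline has passed. */
static int poll_until(cond_fn cond, void *ctx, unsigned int timeout_us)
{
	uint64_t deadline;

	assert(cond != NULL);
	deadline = now_ns() + (uint64_t)timeout_us * 1000ull;

	for (;;) {
		bool expired = now_ns() >= deadline;

		if (cond(ctx))
			return 0;
		if (expired)
			return -ETIMEDOUT;
	}
}

/* Sample condition: never becomes true. */
static bool always_false(void *ctx)
{
	(void)ctx;
	return false;
}

/* Sample condition: becomes true once the counter reaches zero. */
static bool countdown_done(void *ctx)
{
	int *n = ctx;

	return --(*n) <= 0;
}
```

Sampling the condition after the clock means a condition that turns true right at the deadline is still reported as success rather than as a timeout.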
On 2022-03-02 11:42:35 [+0000], Tvrtko Ursulin wrote:
> > > > 0005-drm-i915-Don-t-check-for-atomic-context-on-PREEMPT_R.patch
> > >
> > > What do preempt_disable/enable do on PREEMPT_RT? Thinking if instead the
> > > solution could be to always force the !ATOMIC path (for the whole
> > > _wait_for_atomic macro) on PREEMPT_RT.
> >
> > Could be one way to handle it. But please don't disable preemption and
> > or interrupts for longer period of time as all of it increases the
> > overall latency.
>
> I am looking for your guidance of what is the correct thing here.
>
> Main purpose of this macro on the i915 side is to do short waits on GPU
> registers changing post write from spin-locked sections. But there were rare
> cases when very short waits were needed from unlocked sections, shorter than
> 10us (which is AFAIR what usleep_range documents should be a lower limit).
> Which is why non-atomic path was added to the macro. That path uses
> preempt_disable/enable so it can use local_clock().
>
> All this may, or may not be, compatible with PREEMPT_RT to start with?

Your assumption about atomic is not correct and that is why I aim to
ignore it for RT. Or maybe alter it so it fits. It is assumed that
in_atomic() is true in an interrupt handler or with an acquired
spinlock_t, right? On PREEMPT_RT both conditions keep the context
preemptible, so the atomic check triggers. However, both (the
force-threaded interrupt handler and the spinlock_t) ensure that the
task is stuck on the CPU. So maybe your _WAIT_FOR_ATOMIC_CHECK() could
point to cant_migrate(). It looks like you try to ensure that
local_clock() is from the same CPU.

> Or question phrased differently, how we should implement the <10us waits
> from non-atomic sections under PREEMPT_RT?

I think if you swap the check in _WAIT_FOR_ATOMIC_CHECK() it should be
good. After all, the context remains preemptible during the condition
polls so it should work.

> > The problem is that you can't acquire that lock from within that
> > trace-point on PREEMPT_RT. On !RT it is possible but it is also
> > problematic because LOCKDEP does not see possible dead locks unless that
> > trace-point is enabled.
>
> Oh I meant could the include ordering problem be fixed differently?
>
> """
> [PATCH 07/10] drm/i915: skip DRM_I915_LOW_LEVEL_TRACEPOINTS with
> NOTRACE
>
> The order of the header files is important. If this header file is
> included after tracepoint.h was included then the NOTRACE here becomes a
> nop. Currently this happens for two .c files which use the tracepoints
> behind DRM_I915_LOW_LEVEL_TRACEPOINTS.
> """
>
> Like these two .c files - can order of includes just be changed in them?

Maybe. Let me check and get back to you.

> > I've been talking to Steven (after
> > https://lkml.kernel.org/r/20211214115837.6f33a9b2@gandalf.local.home)
> > and he wants to come up with something where you can pass a lock as
> > argument to the tracing-API. That way the lock can be acquired before
> > the trace event is invoked and lockdep will see it even if the trace
> > event is disabled.
> > So there is an idea how to get it to work eventually without disabling
> > it in the long term.
> >
> > Making the register a raw_spinlock_t would solve problem immediately but
> > I am a little worried given the increased latency in a quick test:
> > https://lore.kernel.org/all/20211006164628.s2mtsdd2jdbfyf7g@linutronix.de/
> >
> > also, this one single hardware but the upper limit atomic-polls is high.
> >
> > > > 0008-drm-i915-gt-Queue-and-wait-for-the-irq_work-item.patch
> > >
> > > Not sure about why cond_resched was put between irq_work_queue and
> > > irq_work_sync - would it not be like-for-like change to have the two
> > > together?
> >
> > maybe it loops for a while and an additional scheduling would be nice.
> >
> > > Commit message makes me think _queue already starts the handler on
> > > x86 at least.
> >
> > Yes, irq_work_queue() triggers the IRQ right away on x86,
> > irq_work_sync() would wait for it to happen in case it did not happen.
> > On architectures which don't provide an IRQ-work interrupt, it is
> > delayed to the HZ tick timer interrupt. So this serves also as an
> > example in case someone want to copy the code ;)
>
> My question wasn't why is there a need_resched() in there, but why is the
> patch:
>
> +	irq_work_queue(&b->irq_work);
> 	cond_resched();
> +	irq_work_sync(&b->irq_work);
>
> And not:
>
> +	irq_work_queue(&b->irq_work);
> +	irq_work_sync(&b->irq_work);
> 	cond_resched();
>
> To preserve like for like, if my understanding of the commit message was
> correct.

No strong need, it can be put as you suggest.
Should someone else schedule &b->irq_work from another CPU then you
could first attempt to cond_resched() and then wait for &b->irq_work's
completion. Assuming that this does not happen (because the irq_work was
previously queued and invoked immediately), irq_work_sync() will just
return.

> > > > 0009-drm-i915-gt-Use-spin_lock_irq-instead-of-local_irq_d.patch
> > >
> > > I think this is okay. The part after the unlock is serialized by the tasklet
> > > already.
> > >
> > > Slight doubt due the comment:
> > >
> > > local_irq_enable(); /* flush irq_work (e.g. breadcrumb enabling) */
> > >
> > > Makes me want to think about it harder but not now.
> >
> > Clark reported it and confirmed that the warning is gone on RT and
> > everything appears to work ;)
>
> I will need to think about it harder at some point.
>
> > PREEMPT_RT wise there is no synchronisation vs irq_work other than an
> > actual lock (in case it is needed).
>
> Okay, marking as todo/later for me. Need to see if enabling breadcrumbs
> earlier than it used to be after this patch makes any difference.

Okay.

> > > Another thing to check is if execlists_context_status_change still needs the
> > > atomic notifier chain with this change.
> > >
> > > > 0010-drm-i915-Drop-the-irqs_disabled-check.patch
> > >
> > > LGTM.
> >
> > Do you want me to repost that one?
>
> I think it's up to you whether you go one by one, or repost the whole series
> or whatever.

If it is up to me then let me repost that one single patch and I have it
out of my queue :)
And 0008-drm-i915-gt-Queue-and-wait-for-the-irq_work-item.patch without
cond_resched() in the middle.

> Regards,
>
> Tvrtko

Sebastian
diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
index a4c94dc2e2164..3aa719d5a0f0d 100644
--- a/drivers/gpu/drm/i915/Kconfig
+++ b/drivers/gpu/drm/i915/Kconfig
@@ -3,6 +3,7 @@ config DRM_I915
 	tristate "Intel 8xx/9xx/G3x/G4x/HD Graphics"
 	depends on DRM
 	depends on X86 && PCI
+	depends on !PREEMPT_RT
 	select INTEL_GTT
 	select INTERVAL_TREE
 	# we need shmfs for the swappable backing store, and in particular
There are a few sections in the driver which are not compatible with
PREEMPT_RT. They trigger warnings and can lead to deadlocks at runtime.

Disable the i915 driver on a PREEMPT_RT enabled kernel. This way
PREEMPT_RT itself can be enabled without needing to address the i915
issues first. The RT related patches are still in RT queue and will be
handled later.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 drivers/gpu/drm/i915/Kconfig | 1 +
 1 file changed, 1 insertion(+)