[v2,3/8] drm/i915: Don't check for atomic context on PREEMPT_RT

Message ID	20240613102818.4056866-4-bigeasy@linutronix.de (mailing list archive)
State	New, archived
Headers	show Return-Path: <intel-gfx-bounces@lists.freedesktop.org> From: Sebastian Andrzej Siewior <bigeasy@linutronix.de> To: intel-gfx@lists.freedesktop.org, intel-xe@lists.freedesktop.org Cc: Jani Nikula <jani.nikula@linux.intel.com>, Joonas Lahtinen <joonas.lahtinen@linux.intel.com>, Rodrigo Vivi <rodrigo.vivi@intel.com>, Thomas Gleixner <tglx@linutronix.de>, Tvrtko Ursulin <tursulin@ursulin.net>, Sebastian Andrzej Siewior <bigeasy@linutronix.de> Subject: [PATCH v2 3/8] drm/i915: Don't check for atomic context on PREEMPT_RT Date: Thu, 13 Jun 2024 12:20:20 +0200 Message-ID: <20240613102818.4056866-4-bigeasy@linutronix.de> In-Reply-To: <20240613102818.4056866-1-bigeasy@linutronix.de> References: <20240613102818.4056866-1-bigeasy@linutronix.de> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: list Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" <intel-gfx-bounces@lists.freedesktop.org>
Series	drm/i915: PREEMPT_RT related fixups. \| expand [v2,0/8] drm/i915: PREEMPT_RT related fixups. [v2,1/8] drm/i915: Use preempt_disable/enable_rt() where recommended [v2,2/8] drm/i915: Don't disable interrupts on PREEMPT_RT during atomic updates [v2,3/8] drm/i915: Don't check for atomic context on PREEMPT_RT [v2,4/8] drm/i915: Disable tracing points on PREEMPT_RT [v2,5/8] drm/i915/gt: Use spin_lock_irq() instead of local_irq_disable() + spin_lock() [v2,6/8] drm/i915: Drop the irqs_disabled() check [v2,7/8] drm/i915/guc: Consider also RCU depth in busy loop. [v2,8/8] Revert "drm/i915: Depend on !PREEMPT_RT."

Sebastian Andrzej Siewior June 13, 2024, 10:20 a.m. UTC

The !in_atomic() check in _wait_for_atomic() triggers on PREEMPT_RT
because the uncore::lock is a spinlock_t and does not disable
preemption or interrupts.

Changing the uncore:lock to a raw_spinlock_t doubles the worst case
latency on an otherwise idle testbox during testing. Therefore I'm
currently unsure about changing this.

Link: https://lore.kernel.org/all/20211006164628.s2mtsdd2jdbfyf7g@linutronix.de/
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 drivers/gpu/drm/i915/i915_utils.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Tvrtko Ursulin June 14, 2024, 8:32 a.m. UTC | #1

On 13/06/2024 11:20, Sebastian Andrzej Siewior wrote:
> The !in_atomic() check in _wait_for_atomic() triggers on PREEMPT_RT
> because the uncore::lock is a spinlock_t and does not disable
> preemption or interrupts.
> 
> Changing the uncore:lock to a raw_spinlock_t doubles the worst case
> latency on an otherwise idle testbox during testing. Therefore I'm
> currently unsure about changing this.
> 
> Link: https://lore.kernel.org/all/20211006164628.s2mtsdd2jdbfyf7g@linutronix.de/
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
>   drivers/gpu/drm/i915/i915_utils.h | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_utils.h b/drivers/gpu/drm/i915/i915_utils.h
> index 06ec6ceb61d57..2ca54bc235925 100644
> --- a/drivers/gpu/drm/i915/i915_utils.h
> +++ b/drivers/gpu/drm/i915/i915_utils.h
> @@ -274,7 +274,7 @@ wait_remaining_ms_from_jiffies(unsigned long timestamp_jiffies, int to_wait_ms)
>   #define wait_for(COND, MS)		_wait_for((COND), (MS) * 1000, 10, 1000)
>   
>   /* If CONFIG_PREEMPT_COUNT is disabled, in_atomic() always reports false. */
> -#if defined(CONFIG_DRM_I915_DEBUG) && defined(CONFIG_PREEMPT_COUNT)
> +#if defined(CONFIG_DRM_I915_DEBUG) && defined(CONFIG_PREEMPT_COUNT) && !defined(CONFIG_PREEMPT_RT)
>   # define _WAIT_FOR_ATOMIC_CHECK(ATOMIC) WARN_ON_ONCE((ATOMIC) && !in_atomic())
>   #else
>   # define _WAIT_FOR_ATOMIC_CHECK(ATOMIC) do { } while (0)

I think this could be okay-ish in principle, but the commit text is not 
entirely accurate because there is no direct coupling between the wait 
helpers and the uncore lock. They can be used from any atomic context.

Okay-ish in principle because there is sufficient testing in Intel's CI 
on non-PREEMPT_RT kernels to catch any conceptual misuses.

But see also the caller in skl_pcode_request. It is a bit harder to hit 
since it is the fallback path. Or gen5_rps_enable which nests under a 
different lock.

Hmm would there be a different helper, or combination of helpers, which 
could replace in_atomic() which would do the right thing on both 
kernels? Something to tell us we are neither under a spin_lock, nor 
preempt_disable, nor interrupts disabled, nor bottom-half. On either 
stock or PREEMPT_RT.

WARN_ON_ONCE((ATOMIC) && !(!preemptible() || in_hardirq() || 
in_serving_softirq())

Would this work?

Regards,

Tvrtko

Sebastian Andrzej Siewior June 14, 2024, 11:05 a.m. UTC | #2

On 2024-06-14 09:32:07 [+0100], Tvrtko Ursulin wrote:
> I think this could be okay-ish in principle, but the commit text is not
> entirely accurate because there is no direct coupling between the wait
> helpers and the uncore lock. They can be used from any atomic context.
> 
> Okay-ish in principle because there is sufficient testing in Intel's CI on
> non-PREEMPT_RT kernels to catch any conceptual misuses.

You just avoid disabling preemption if you expect to be in atomic
context to save a few cycles. It wouldn't hurt to disable it anyway. The
only reason you need it is to remain on the same CPU while reading the
clock because it is not guaranteed otherwise.

Delays > 50ms are detected at build time.

> But see also the caller in skl_pcode_request. It is a bit harder to hit
> since it is the fallback path. Or gen5_rps_enable which nests under a
> different lock.
> 
> Hmm would there be a different helper, or combination of helpers, which
> could replace in_atomic() which would do the right thing on both kernels?
> Something to tell us we are neither under a spin_lock, nor preempt_disable,
> nor interrupts disabled, nor bottom-half. On either stock or PREEMPT_RT.

There is nothing that you can use to deduct that you are under a
spin-lock. preemptible() works only if you have a preemption counter
which is not mandatory. It can affect RCU but not in all configurations.

> WARN_ON_ONCE((ATOMIC) && !(!preemptible() || in_hardirq() ||
> in_serving_softirq())
> 
> Would this work?

Nope. None of this triggers if you acquire a spinlock_t. And I can't
think of something that would always be true.

So the question is why do you need to know if the context is atomic?
The only impact is avoiding disabling preemption. Is it that important
to avoid it?
If so would cant_migrate() work? It requires CONFIG_DEBUG_ATOMIC_SLEEP=y
to do the trick.

> Regards,
> 
> Tvrtko

Sebastian

Tvrtko Ursulin June 14, 2024, 12:19 p.m. UTC | #3

On 14/06/2024 12:05, Sebastian Andrzej Siewior wrote:
> On 2024-06-14 09:32:07 [+0100], Tvrtko Ursulin wrote:
>> I think this could be okay-ish in principle, but the commit text is not
>> entirely accurate because there is no direct coupling between the wait
>> helpers and the uncore lock. They can be used from any atomic context.
>>
>> Okay-ish in principle because there is sufficient testing in Intel's CI on
>> non-PREEMPT_RT kernels to catch any conceptual misuses.
> 
> You just avoid disabling preemption if you expect to be in atomic
> context to save a few cycles. It wouldn't hurt to disable it anyway. The
> only reason you need it is to remain on the same CPU while reading the
> clock because it is not guaranteed otherwise.

Ah no, that is not why. Reason for conditional disabling of preemption 
is to have an implementation for very short delays which does not run 
with preemption permanently disabled. So it is disabled only around time 
tracking.

> Delays > 50ms are detected at build time.

Right, point of that is to ask the contributor if they are sure this is 
what they want. Catching misuse of the short delay wait helper step one..

>> But see also the caller in skl_pcode_request. It is a bit harder to hit
>> since it is the fallback path. Or gen5_rps_enable which nests under a
>> different lock.
>>
>> Hmm would there be a different helper, or combination of helpers, which
>> could replace in_atomic() which would do the right thing on both kernels?
>> Something to tell us we are neither under a spin_lock, nor preempt_disable,
>> nor interrupts disabled, nor bottom-half. On either stock or PREEMPT_RT.
> 
> There is nothing that you can use to deduct that you are under a
> spin-lock. preemptible() works only if you have a preemption counter
> which is not mandatory. It can affect RCU but not in all configurations.
> 
>> WARN_ON_ONCE((ATOMIC) && !(!preemptible() || in_hardirq() ||
>> in_serving_softirq())
>>
>> Would this work?
> 
> Nope. None of this triggers if you acquire a spinlock_t. And I can't
> think of something that would always be true.

Bummer.

> So the question is why do you need to know if the context is atomic?
> The only impact is avoiding disabling preemption. Is it that important
> to avoid it?
> If so would cant_migrate() work? It requires CONFIG_DEBUG_ATOMIC_SLEEP=y
> to do the trick.

... catching misuse of atomic wait helpers step 2 - are you calling it 
from a non-atomic context without the real need. So should use the 
non-atomic helper instead.

When i915 development was very active and with a lot of contributors it 
was beneficial to catch these things which code review would easily miss.

Now that the pace is much, much slower, it is probably not very 
important. So this patch is acceptable for what I am concerned and:

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>

Actually please also add the PREEMPT_RT angle to the comment above 
_WAIT_FOR_ATOMIC_CHECK. Sometimes lines change and git blame makes it 
hard to find the commit text.

Regards,

Tvrtko

> 
>> Regards,
>>
>> Tvrtko
> 
> Sebastian

Sebastian Andrzej Siewior June 17, 2024, 10:07 a.m. UTC | #4

On 2024-06-14 13:19:25 [+0100], Tvrtko Ursulin wrote:
> > So the question is why do you need to know if the context is atomic?
> > The only impact is avoiding disabling preemption. Is it that important
> > to avoid it?
> > If so would cant_migrate() work? It requires CONFIG_DEBUG_ATOMIC_SLEEP=y
> > to do the trick.
> 
> ... catching misuse of atomic wait helpers step 2 - are you calling it from
> a non-atomic context without the real need. So should use the non-atomic
> helper instead.
> 
> When i915 development was very active and with a lot of contributors it was
> beneficial to catch these things which code review would easily miss.
> 
> Now that the pace is much, much slower, it is probably not very important.
> So this patch is acceptable for what I am concerned and:
> 
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
> 
> Actually please also add the PREEMPT_RT angle to the comment above
> _WAIT_FOR_ATOMIC_CHECK. Sometimes lines change and git blame makes it hard
> to find the commit text.

Do you want me the repost the series? Are the bots happy enough?
I have the following as far this patch:

------->8--------------

The !in_atomic() check in _wait_for_atomic() triggers on PREEMPT_RT
because the uncore::lock is a spinlock_t and does not disable
preemption or interrupts.

Changing the uncore:lock to a raw_spinlock_t doubles the worst case
latency on an otherwise idle testbox during testing.

Ignore _WAIT_FOR_ATOMIC_CHECK() on PREEMPT_RT.

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Link: https://lore.kernel.org/all/20211006164628.s2mtsdd2jdbfyf7g@linutronix.de/
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 drivers/gpu/drm/i915/i915_utils.h | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_utils.h b/drivers/gpu/drm/i915/i915_utils.h
index 06ec6ceb61d57..f0d3c5cdc1b1b 100644
--- a/drivers/gpu/drm/i915/i915_utils.h
+++ b/drivers/gpu/drm/i915/i915_utils.h
@@ -273,8 +273,13 @@ wait_remaining_ms_from_jiffies(unsigned long timestamp_jiffies, int to_wait_ms)
 						   (Wmax))
 #define wait_for(COND, MS)		_wait_for((COND), (MS) * 1000, 10, 1000)
 
-/* If CONFIG_PREEMPT_COUNT is disabled, in_atomic() always reports false. */
-#if defined(CONFIG_DRM_I915_DEBUG) && defined(CONFIG_PREEMPT_COUNT)
+/*
+ * If CONFIG_PREEMPT_COUNT is disabled, in_atomic() always reports false.
+ * On PREEMPT_RT the context isn't becoming atomic because it is used in an
+ * interrupt handler or because a spinlock_t is acquired. This leads warnings
+ * which don't occur otherwise and is therefore disabled.
+ */
+#if defined(CONFIG_DRM_I915_DEBUG) && defined(CONFIG_PREEMPT_COUNT) && !defined(CONFIG_PREEMPT_RT)
 # define _WAIT_FOR_ATOMIC_CHECK(ATOMIC) WARN_ON_ONCE((ATOMIC) && !in_atomic())
 #else
 # define _WAIT_FOR_ATOMIC_CHECK(ATOMIC) do { } while (0)

> Regards,
> 
> Tvrtko

Sebastian

Tvrtko Ursulin June 18, 2024, 9 a.m. UTC | #5

On 17/06/2024 11:07, Sebastian Andrzej Siewior wrote:
> On 2024-06-14 13:19:25 [+0100], Tvrtko Ursulin wrote:
>>> So the question is why do you need to know if the context is atomic?
>>> The only impact is avoiding disabling preemption. Is it that important
>>> to avoid it?
>>> If so would cant_migrate() work? It requires CONFIG_DEBUG_ATOMIC_SLEEP=y
>>> to do the trick.
>>
>> ... catching misuse of atomic wait helpers step 2 - are you calling it from
>> a non-atomic context without the real need. So should use the non-atomic
>> helper instead.
>>
>> When i915 development was very active and with a lot of contributors it was
>> beneficial to catch these things which code review would easily miss.
>>
>> Now that the pace is much, much slower, it is probably not very important.
>> So this patch is acceptable for what I am concerned and:
>>
>> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>>
>> Actually please also add the PREEMPT_RT angle to the comment above
>> _WAIT_FOR_ATOMIC_CHECK. Sometimes lines change and git blame makes it hard
>> to find the commit text.
> 
> Do you want me the repost the series? Are the bots happy enough?

I did a re-test but am not 100% certain yet. CI looks frustratingly 
noisy at the moment.

igt@debugfs_test@read_all_entries appears to be a fluke which is not new.

But igt@gem_exec_parallel@engines@basic from the latest run seem new.

So I queued another re-test.

> I have the following as far this patch:
> 
> ------->8--------------
> 
> The !in_atomic() check in _wait_for_atomic() triggers on PREEMPT_RT
> because the uncore::lock is a spinlock_t and does not disable
> preemption or interrupts.
> 
> Changing the uncore:lock to a raw_spinlock_t doubles the worst case
> latency on an otherwise idle testbox during testing.
> 
> Ignore _WAIT_FOR_ATOMIC_CHECK() on PREEMPT_RT.
> 
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
> Link: https://lore.kernel.org/all/20211006164628.s2mtsdd2jdbfyf7g@linutronix.de/
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
>   drivers/gpu/drm/i915/i915_utils.h | 9 +++++++--
>   1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_utils.h b/drivers/gpu/drm/i915/i915_utils.h
> index 06ec6ceb61d57..f0d3c5cdc1b1b 100644
> --- a/drivers/gpu/drm/i915/i915_utils.h
> +++ b/drivers/gpu/drm/i915/i915_utils.h
> @@ -273,8 +273,13 @@ wait_remaining_ms_from_jiffies(unsigned long timestamp_jiffies, int to_wait_ms)
>   						   (Wmax))
>   #define wait_for(COND, MS)		_wait_for((COND), (MS) * 1000, 10, 1000)
>   
> -/* If CONFIG_PREEMPT_COUNT is disabled, in_atomic() always reports false. */
> -#if defined(CONFIG_DRM_I915_DEBUG) && defined(CONFIG_PREEMPT_COUNT)
> +/*
> + * If CONFIG_PREEMPT_COUNT is disabled, in_atomic() always reports false.
> + * On PREEMPT_RT the context isn't becoming atomic because it is used in an
> + * interrupt handler or because a spinlock_t is acquired. This leads warnings
> + * which don't occur otherwise and is therefore disabled.

Ack, thanks!

Regards,

Tvrtko

> + */
> +#if defined(CONFIG_DRM_I915_DEBUG) && defined(CONFIG_PREEMPT_COUNT) && !defined(CONFIG_PREEMPT_RT)
>   # define _WAIT_FOR_ATOMIC_CHECK(ATOMIC) WARN_ON_ONCE((ATOMIC) && !in_atomic())
>   #else
>   # define _WAIT_FOR_ATOMIC_CHECK(ATOMIC) do { } while (0)
> 
>> Regards,
>>
>> Tvrtko
> 
> Sebastian

Sebastian Andrzej Siewior June 18, 2024, 12:54 p.m. UTC | #6

On 2024-06-18 10:00:09 [+0100], Tvrtko Ursulin wrote:
> I did a re-test but am not 100% certain yet. CI looks frustratingly noisy at
> the moment.
> 
> igt@debugfs_test@read_all_entries appears to be a fluke which is not new.
> 
> But igt@gem_exec_parallel@engines@basic from the latest run seem new.
> 
> So I queued another re-test.

Okay. If you want me to repost the whole series or just parts of it,
just say so.

Sebastian

Tvrtko Ursulin June 19, 2024, 10:26 a.m. UTC | #7

On 18/06/2024 13:54, Sebastian Andrzej Siewior wrote:
> On 2024-06-18 10:00:09 [+0100], Tvrtko Ursulin wrote:
>> I did a re-test but am not 100% certain yet. CI looks frustratingly noisy at
>> the moment.
>>
>> igt@debugfs_test@read_all_entries appears to be a fluke which is not new.
>>
>> But igt@gem_exec_parallel@engines@basic from the latest run seem new.
>>
>> So I queued another re-test.
> 
> Okay. If you want me to repost the whole series or just parts of it,
> just say so.

Looks like a green set of results, both BAT and full run.

Patches 1&2 will need someone from the display side to bless though.

Regards,

Tvrtko

[v2,3/8] drm/i915: Don't check for atomic context on PREEMPT_RT

Commit Message

Comments

Patch