diff mbox series

[v3] drm/i915/gt: Use spin_lock_irqsave() in interruptible context

Message ID pusppq5ybyszau2oocboj3mtj5x574gwij323jlclm5zxvimmu@mnfg6odxbpsv (mailing list archive)
State New, archived
Headers show
Series [v3] drm/i915/gt: Use spin_lock_irqsave() in interruptible context | expand

Commit Message

Krzysztof Karas Jan. 16, 2025, 10:40 a.m. UTC
spin_lock/unlock() functions used in interrupt contexts could
result in a deadlock, as seen in GitLab issue #13399:
https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13399,
which occurs when interrupt comes in while holding a lock.

Try to remedy the problem by saving irq state before spin lock
acquisition.

v2: add irqs' state save/restore calls to all locks/unlocks in
 signal_irq_work() execution (Maciej)

v3: use with spin_lock_irqsave() in guc_lrc_desc_unpin() instead
 of other lock/unlock calls and add Fixes and Cc tags (Tvrtko);
 change title and commit message

Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Fixes: 2f2cc53b5fe7 ("drm/i915/guc: Close deregister-context race against CT-loss")
Cc: <stable@vger.kernel.org> # v6.9+
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Maciej Patelczyk Jan. 16, 2025, 11:22 a.m. UTC | #1
On 16.01.2025 11:40, Krzysztof Karas wrote:

> spin_lock/unlock() functions used in interrupt contexts could
> result in a deadlock, as seen in GitLab issue #13399:
> https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13399,
> which occurs when interrupt comes in while holding a lock.
>
> Try to remedy the problem by saving irq state before spin lock
> acquisition.
>
> v2: add irqs' state save/restore calls to all locks/unlocks in
>   signal_irq_work() execution (Maciej)
>
> v3: use with spin_lock_irqsave() in guc_lrc_desc_unpin() instead
>   of other lock/unlock calls and add Fixes and Cc tags (Tvrtko);
>   change title and commit message
>
> Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
> Fixes: 2f2cc53b5fe7 ("drm/i915/guc: Close deregister-context race against CT-loss")
> Cc: <stable@vger.kernel.org> # v6.9+
> ---
>   drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> index 12f1ba7ca9c1..29d9c81473cc 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> @@ -3433,10 +3433,10 @@ static inline int guc_lrc_desc_unpin(struct intel_context *ce)
>   	 */
>   	ret = deregister_context(ce, ce->guc_id.id);
>   	if (ret) {
> -		spin_lock(&ce->guc_state.lock);
> +		spin_lock_irqsave(&ce->guc_state.lock, flags);
>   		set_context_registered(ce);
>   		clr_context_destroyed(ce);
> -		spin_unlock(&ce->guc_state.lock);
> +		spin_unlock_irqrestore(&ce->guc_state.lock, flags);
>   		/*
>   		 * As gt-pm is awake at function entry, intel_wakeref_put_async merely decrements
>   		 * the wakeref immediately but per function spec usage call this after unlock.

Looked at this and wasn't seeing it. There is spin_lock_irqsave() used 
earlier.

Reviewed-by: Maciej Patelczyk <maciej.patelczyk@intel.com>
Krzysztof Karas Jan. 17, 2025, 11:51 a.m. UTC | #2
Hi CI Team,

On 2025-01-16 at 13:18:25 +0000, Patchwork wrote:
> == Series Details ==
> 
> Series: drm/i915/gt: Use spin_lock_irqsave() in interruptible context
> URL   : https://patchwork.freedesktop.org/series/143600/
> State : failure
>
...
> Possible new issues
> -------------------
> 
>   Here are the unknown changes that may have been introduced in Patchwork_143600v1:
> 
> ### IGT changes ###
> 
> #### Possible regressions ####
> 
>   * igt@kms_busy@basic@modeset:
>     - bat-apl-1:          [PASS][1] -> [DMESG-WARN][2] +1 other test dmesg-warn
>    [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_15966/bat-apl-1/igt@kms_busy@basic@modeset.html
>    [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_143600v1/bat-apl-1/igt@kms_busy@basic@modeset.html
> 
>   * igt@kms_cursor_legacy@basic-flip-before-cursor-legacy:
>     - fi-kbl-7567u:       [PASS][3] -> [DMESG-WARN][4]
>    [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_15966/fi-kbl-7567u/igt@kms_cursor_legacy@basic-flip-before-cursor-legacy.html
>    [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_143600v1/fi-kbl-7567u/igt@kms_cursor_legacy@basic-flip-before-cursor-legacy.html
> 
These seem unrelated to my patch and I've seen similar warnings about
FIFO underrun on other unrelated patches as well. Please re-report.

Krzysztof
>   
> Known issues
> ------------
> 
>   Here are the changes found in Patchwork_143600v1 that come from known issues:
> 
> ### IGT changes ###
> 
> #### Issues hit ####
> 
>   * igt@i915_pm_rpm@module-reload:
>     - bat-rpls-4:         [PASS][5] -> [FAIL][6] ([i915#13401])
>    [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_15966/bat-rpls-4/igt@i915_pm_rpm@module-reload.html
>    [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_143600v1/bat-rpls-4/igt@i915_pm_rpm@module-reload.html
> 
>   * igt@i915_selftest@live:
>     - bat-arlh-3:         [PASS][7] -> [DMESG-FAIL][8] ([i915#12061] / [i915#12435])
>    [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_15966/bat-arlh-3/igt@i915_selftest@live.html
>    [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_143600v1/bat-arlh-3/igt@i915_selftest@live.html
> 
>   * igt@i915_selftest@live@workarounds:
>     - bat-arlh-3:         [PASS][9] -> [DMESG-FAIL][10] ([i915#12061])
>    [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_15966/bat-arlh-3/igt@i915_selftest@live@workarounds.html
>    [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_143600v1/bat-arlh-3/igt@i915_selftest@live@workarounds.html
>     - bat-mtlp-6:         [PASS][11] -> [DMESG-FAIL][12] ([i915#12061]) +1 other test dmesg-fail
>    [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_15966/bat-mtlp-6/igt@i915_selftest@live@workarounds.html
>    [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_143600v1/bat-mtlp-6/igt@i915_selftest@live@workarounds.html
> 
>   * igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence:
>     - bat-dg2-11:         [PASS][13] -> [SKIP][14] ([i915#9197]) +2 other tests skip
>    [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_15966/bat-dg2-11/igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence.html
>    [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_143600v1/bat-dg2-11/igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence.html
> 
>   
> #### Possible fixes ####
> 
>   * igt@i915_selftest@live@gt_lrc:
>     - bat-twl-1:          [INCOMPLETE][15] ([i915#9413]) -> [PASS][16]
>    [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_15966/bat-twl-1/igt@i915_selftest@live@gt_lrc.html
>    [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_143600v1/bat-twl-1/igt@i915_selftest@live@gt_lrc.html
> 
>   
> #### Warnings ####
> 
>   * igt@gem_exec_gttfill@basic:
>     - fi-pnv-d510:        [SKIP][17] -> [ABORT][18] ([i915#13169])
>    [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_15966/fi-pnv-d510/igt@gem_exec_gttfill@basic.html
>    [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_143600v1/fi-pnv-d510/igt@gem_exec_gttfill@basic.html
> 
>   
>   [i915#12061]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12061
>   [i915#12435]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12435
>   [i915#13169]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13169
>   [i915#13401]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13401
>   [i915#9197]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/9197
>   [i915#9413]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/9413
> 
> 
> Build changes
> -------------
> 
>   * Linux: CI_DRM_15966 -> Patchwork_143600v1
> 
>   CI-20190529: 20190529
>   CI_DRM_15966: 2334d1179d652eb4032fcb0d8e6f53cbc6a1011c @ git://anongit.freedesktop.org/gfx-ci/linux
>   IGT_8194: c505b5e9853c3270751f264b0d4055d4d7ae13ce @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
>   Patchwork_143600v1: 2334d1179d652eb4032fcb0d8e6f53cbc6a1011c @ git://anongit.freedesktop.org/gfx-ci/linux
> 
> == Logs ==
> 
> For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_143600v1/index.html
Ravali, JupallyX Jan. 20, 2025, 9:25 a.m. UTC | #3
Hi,
https://patchwork.freedesktop.org/series/143600/ - Re-reported.
 
i915.CI.BAT - Re-reported.
 
Thanks,
Ravali.

-----Original Message-----
From: I915-ci-infra <i915-ci-infra-bounces@lists.freedesktop.org> On Behalf Of Krzysztof Karas
Sent: 17 January 2025 17:21
To: intel-gfx@lists.freedesktop.org; I915-ci-infra@lists.freedesktop.org
Subject: Re: ✗ i915.CI.BAT: failure for drm/i915/gt: Use spin_lock_irqsave() in interruptible context

Hi CI Team,

On 2025-01-16 at 13:18:25 +0000, Patchwork wrote:
> == Series Details ==
> 
> Series: drm/i915/gt: Use spin_lock_irqsave() in interruptible context
> URL   : https://patchwork.freedesktop.org/series/143600/
> State : failure
>
...
> Possible new issues
> -------------------
> 
>   Here are the unknown changes that may have been introduced in Patchwork_143600v1:
> 
> ### IGT changes ###
> 
> #### Possible regressions ####
> 
>   * igt@kms_busy@basic@modeset:
>     - bat-apl-1:          [PASS][1] -> [DMESG-WARN][2] +1 other test dmesg-warn
>    [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_15966/bat-apl-1/igt@kms_busy@basic@modeset.html
>    [2]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_143600v1/bat-apl-1/
> igt@kms_busy@basic@modeset.html
> 
>   * igt@kms_cursor_legacy@basic-flip-before-cursor-legacy:
>     - fi-kbl-7567u:       [PASS][3] -> [DMESG-WARN][4]
>    [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_15966/fi-kbl-7567u/igt@kms_cursor_legacy@basic-flip-before-cursor-legacy.html
>    [4]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_143600v1/fi-kbl-756
> 7u/igt@kms_cursor_legacy@basic-flip-before-cursor-legacy.html
> 
These seem unrelated to my patch and I've seen similar warnings about FIFO underrun on other unrelated patches as well. Please re-report.

Krzysztof
>   
> Known issues
> ------------
> 
>   Here are the changes found in Patchwork_143600v1 that come from known issues:
> 
> ### IGT changes ###
> 
> #### Issues hit ####
> 
>   * igt@i915_pm_rpm@module-reload:
>     - bat-rpls-4:         [PASS][5] -> [FAIL][6] ([i915#13401])
>    [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_15966/bat-rpls-4/igt@i915_pm_rpm@module-reload.html
>    [6]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_143600v1/bat-rpls-4
> /igt@i915_pm_rpm@module-reload.html
> 
>   * igt@i915_selftest@live:
>     - bat-arlh-3:         [PASS][7] -> [DMESG-FAIL][8] ([i915#12061] / [i915#12435])
>    [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_15966/bat-arlh-3/igt@i915_selftest@live.html
>    [8]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_143600v1/bat-arlh-3
> /igt@i915_selftest@live.html
> 
>   * igt@i915_selftest@live@workarounds:
>     - bat-arlh-3:         [PASS][9] -> [DMESG-FAIL][10] ([i915#12061])
>    [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_15966/bat-arlh-3/igt@i915_selftest@live@workarounds.html
>    [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_143600v1/bat-arlh-3/igt@i915_selftest@live@workarounds.html
>     - bat-mtlp-6:         [PASS][11] -> [DMESG-FAIL][12] ([i915#12061]) +1 other test dmesg-fail
>    [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_15966/bat-mtlp-6/igt@i915_selftest@live@workarounds.html
>    [12]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_143600v1/bat-mtlp-6
> /igt@i915_selftest@live@workarounds.html
> 
>   * igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence:
>     - bat-dg2-11:         [PASS][13] -> [SKIP][14] ([i915#9197]) +2 other tests skip
>    [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_15966/bat-dg2-11/igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence.html
>    [14]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_143600v1/bat-dg2-11
> /igt@kms_pipe_crc_basic@nonblocking-crc-frame-sequence.html
> 
>   
> #### Possible fixes ####
> 
>   * igt@i915_selftest@live@gt_lrc:
>     - bat-twl-1:          [INCOMPLETE][15] ([i915#9413]) -> [PASS][16]
>    [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_15966/bat-twl-1/igt@i915_selftest@live@gt_lrc.html
>    [16]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_143600v1/bat-twl-1/
> igt@i915_selftest@live@gt_lrc.html
> 
>   
> #### Warnings ####
> 
>   * igt@gem_exec_gttfill@basic:
>     - fi-pnv-d510:        [SKIP][17] -> [ABORT][18] ([i915#13169])
>    [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_15966/fi-pnv-d510/igt@gem_exec_gttfill@basic.html
>    [18]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_143600v1/fi-pnv-d51
> 0/igt@gem_exec_gttfill@basic.html
> 
>   
>   [i915#12061]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12061
>   [i915#12435]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12435
>   [i915#13169]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13169
>   [i915#13401]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13401
>   [i915#9197]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/9197
>   [i915#9413]: 
> https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/9413
> 
> 
> Build changes
> -------------
> 
>   * Linux: CI_DRM_15966 -> Patchwork_143600v1
> 
>   CI-20190529: 20190529
>   CI_DRM_15966: 2334d1179d652eb4032fcb0d8e6f53cbc6a1011c @ git://anongit.freedesktop.org/gfx-ci/linux
>   IGT_8194: c505b5e9853c3270751f264b0d4055d4d7ae13ce @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
>   Patchwork_143600v1: 2334d1179d652eb4032fcb0d8e6f53cbc6a1011c @ 
> git://anongit.freedesktop.org/gfx-ci/linux
> 
> == Logs ==
> 
> For more details see: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_143600v1/index.html
Andi Shyti Feb. 13, 2025, 6:56 p.m. UTC | #4
Hi Krzysztof,

On Thu, Jan 16, 2025 at 10:40:46AM +0000, Krzysztof Karas wrote:
> spin_lock/unlock() functions used in interrupt contexts could
> result in a deadlock, as seen in GitLab issue #13399:
> https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13399,

I moved this link in the tag sectoin as Closes:

> which occurs when interrupt comes in while holding a lock.
> 
> Try to remedy the problem by saving irq state before spin lock
> acquisition.
> 
> v2: add irqs' state save/restore calls to all locks/unlocks in
>  signal_irq_work() execution (Maciej)
> 
> v3: use with spin_lock_irqsave() in guc_lrc_desc_unpin() instead
>  of other lock/unlock calls and add Fixes and Cc tags (Tvrtko);
>  change title and commit message
> 
> Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
> Fixes: 2f2cc53b5fe7 ("drm/i915/guc: Close deregister-context race against CT-loss")

I moved Fixes: above your SoB.

> Cc: <stable@vger.kernel.org> # v6.9+

Anyway, good catch, but please, remember next time to relly add
in Cc the stable kernel mailing list, the guc guys in Cc for GuC
related patches and the author of the patch you are fixing.

Reviewed and merged in drm-intel-gt-next.

Thanks,
Andi
diff mbox series

Patch

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 12f1ba7ca9c1..29d9c81473cc 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -3433,10 +3433,10 @@  static inline int guc_lrc_desc_unpin(struct intel_context *ce)
 	 */
 	ret = deregister_context(ce, ce->guc_id.id);
 	if (ret) {
-		spin_lock(&ce->guc_state.lock);
+		spin_lock_irqsave(&ce->guc_state.lock, flags);
 		set_context_registered(ce);
 		clr_context_destroyed(ce);
-		spin_unlock(&ce->guc_state.lock);
+		spin_unlock_irqrestore(&ce->guc_state.lock, flags);
 		/*
 		 * As gt-pm is awake at function entry, intel_wakeref_put_async merely decrements
 		 * the wakeref immediately but per function spec usage call this after unlock.