diff mbox series

[v3] drm/i915/gt: Do not consider preemption during execlists_dequeue for gen8

Message ID 20240711163208.1355736-1-nitin.r.gote@intel.com (mailing list archive)
State New, archived
Headers show
Series [v3] drm/i915/gt: Do not consider preemption during execlists_dequeue for gen8 | expand

Commit Message

Nitin Gote July 11, 2024, 4:32 p.m. UTC
We're seeing a GPU HANG issue on a CHV platform, which was caused by
bac24f59f454 ("drm/i915/execlists: Enable coarse preemption boundaries for gen8").

Gen8 platform has only timeslice and doesn't support a preemption mechanism
as engines do not have a preemption timer and doesn't send an irq if the
preemption timeout expires. So, add a fix to not consider preemption
during dequeuing for gen8 platforms.

v2: Simplify can_preempt() function (Tvrtko Ursulin)

v3:
 - Inside need_preempt(), condition of can_preempt() is not required
   as simplified can_preempt() is enough. (Chris Wilson)

Fixes: bac24f59f454 ("drm/i915/execlists: Enable coarse preemption boundaries for gen8")
Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/11396
Suggested-by: Andi Shyti <andi.shyti@intel.com>
Signed-off-by: Nitin Gote <nitin.r.gote@intel.com>
Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
CC: <stable@vger.kernel.org> # v5.2+
---
 drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

Comments

Cavitt, Jonathan July 11, 2024, 4:28 p.m. UTC | #1
-----Original Message-----
From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of Nitin Gote
Sent: Thursday, July 11, 2024 9:32 AM
To: Wilson, Chris P <chris.p.wilson@intel.com>; tursulin@ursulin.net; intel-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org; Shyti, Andi <andi.shyti@intel.com>; Das, Nirmoy <nirmoy.das@intel.com>; janusz.krzysztofik@linux.intel.com; Gote, Nitin R <nitin.r.gote@intel.com>; Chris Wilson <chris.p.wilson@linux.intel.com>; stable@vger.kernel.org
Subject: [PATCH v3] drm/i915/gt: Do not consider preemption during execlists_dequeue for gen8
> 
> We're seeing a GPU HANG issue on a CHV platform, which was caused by
> bac24f59f454 ("drm/i915/execlists: Enable coarse preemption boundaries for gen8").
> 
> Gen8 platform has only timeslice and doesn't support a preemption mechanism
> as engines do not have a preemption timer and doesn't send an irq if the
> preemption timeout expires.

That seems to mean the original can_preempt function was inaccurately built,
so fixing it here makes the most sense to me, especially if it's causing problems.

Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
-Jonathan Cavitt

> So, add a fix to not consider preemption
> during dequeuing for gen8 platforms.
> 
> v2: Simplify can_preempt() function (Tvrtko Ursulin)
> 
> v3:
>  - Inside need_preempt(), condition of can_preempt() is not required
>    as simplified can_preempt() is enough. (Chris Wilson)
> 
> Fixes: bac24f59f454 ("drm/i915/execlists: Enable coarse preemption boundaries for gen8")
> Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/11396
> Suggested-by: Andi Shyti <andi.shyti@intel.com>
> Signed-off-by: Nitin Gote <nitin.r.gote@intel.com>
> Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
> CC: <stable@vger.kernel.org> # v5.2+
> ---
>  drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 6 +-----
>  1 file changed, 1 insertion(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> index 21829439e686..72090f52fb85 100644
> --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> @@ -3315,11 +3315,7 @@ static void remove_from_engine(struct i915_request *rq)
>  
>  static bool can_preempt(struct intel_engine_cs *engine)
>  {
> -	if (GRAPHICS_VER(engine->i915) > 8)
> -		return true;
> -
> -	/* GPGPU on bdw requires extra w/a; not implemented */
> -	return engine->class != RENDER_CLASS;
> +	return GRAPHICS_VER(engine->i915) > 8;
>  }
>  
>  static void kick_execlists(const struct i915_request *rq, int prio)
> -- 
> 2.25.1
> 
>
Rodrigo Vivi July 11, 2024, 6:09 p.m. UTC | #2
On Thu, Jul 11, 2024 at 04:28:53PM +0000, Cavitt, Jonathan wrote:
> -----Original Message-----
> From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of Nitin Gote
> Sent: Thursday, July 11, 2024 9:32 AM
> To: Wilson, Chris P <chris.p.wilson@intel.com>; tursulin@ursulin.net; intel-gfx@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org; Shyti, Andi <andi.shyti@intel.com>; Das, Nirmoy <nirmoy.das@intel.com>; janusz.krzysztofik@linux.intel.com; Gote, Nitin R <nitin.r.gote@intel.com>; Chris Wilson <chris.p.wilson@linux.intel.com>; stable@vger.kernel.org
> Subject: [PATCH v3] drm/i915/gt: Do not consider preemption during execlists_dequeue for gen8
> > 
> > We're seeing a GPU HANG issue on a CHV platform, which was caused by
> > bac24f59f454 ("drm/i915/execlists: Enable coarse preemption boundaries for gen8").
> > 
> > Gen8 platform has only timeslice and doesn't support a preemption mechanism
> > as engines do not have a preemption timer and doesn't send an irq if the
> > preemption timeout expires.
> 
> That seems to mean the original can_preempt function was inaccurately built,
> so fixing it here makes the most sense to me, especially if it's causing problems.
> 
> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
> -Jonathan Cavitt
> 
> > So, add a fix to not consider preemption
> > during dequeuing for gen8 platforms.
> > 
> > v2: Simplify can_preempt() function (Tvrtko Ursulin)
> > 
> > v3:
> >  - Inside need_preempt(), condition of can_preempt() is not required
> >    as simplified can_preempt() is enough. (Chris Wilson)
> > 
> > Fixes: bac24f59f454 ("drm/i915/execlists: Enable coarse preemption boundaries for gen8")

Something strange in here...

This patch is not using directly or indirectly (I915_ENGINE_HAS_PREEMPTION) the
can_preempt()...

> > Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/11396
> > Suggested-by: Andi Shyti <andi.shyti@intel.com>
> > Signed-off-by: Nitin Gote <nitin.r.gote@intel.com>
> > Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
> > CC: <stable@vger.kernel.org> # v5.2+
> > ---
> >  drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 6 +-----
> >  1 file changed, 1 insertion(+), 5 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > index 21829439e686..72090f52fb85 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > @@ -3315,11 +3315,7 @@ static void remove_from_engine(struct i915_request *rq)
> >  
> >  static bool can_preempt(struct intel_engine_cs *engine)
> >  {
> > -	if (GRAPHICS_VER(engine->i915) > 8)
> > -		return true;
> > -
> > -	/* GPGPU on bdw requires extra w/a; not implemented */
> > -	return engine->class != RENDER_CLASS;
> > +	return GRAPHICS_VER(engine->i915) > 8;
> >  }
> >  
> >  static void kick_execlists(const struct i915_request *rq, int prio)
> > -- 
> > 2.25.1
> > 
> >
Nitin Gote July 12, 2024, 1:25 p.m. UTC | #3
> -----Original Message-----
> From: Vivi, Rodrigo <rodrigo.vivi@intel.com>
> Sent: Thursday, July 11, 2024 11:39 PM
> To: Cavitt, Jonathan <jonathan.cavitt@intel.com>
> Cc: Gote, Nitin R <nitin.r.gote@intel.com>; Wilson, Chris P
> <chris.p.wilson@intel.com>; tursulin@ursulin.net; intel-
> gfx@lists.freedesktop.org; dri-devel@lists.freedesktop.org; Shyti, Andi
> <andi.shyti@intel.com>; Das, Nirmoy <nirmoy.das@intel.com>;
> janusz.krzysztofik@linux.intel.com; Chris Wilson
> <chris.p.wilson@linux.intel.com>; stable@vger.kernel.org
> Subject: Re: [PATCH v3] drm/i915/gt: Do not consider preemption during
> execlists_dequeue for gen8
> 
> On Thu, Jul 11, 2024 at 04:28:53PM +0000, Cavitt, Jonathan wrote:
> > -----Original Message-----
> > From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of
> > Nitin Gote
> > Sent: Thursday, July 11, 2024 9:32 AM
> > To: Wilson, Chris P <chris.p.wilson@intel.com>; tursulin@ursulin.net;
> > intel-gfx@lists.freedesktop.org
> > Cc: dri-devel@lists.freedesktop.org; Shyti, Andi
> > <andi.shyti@intel.com>; Das, Nirmoy <nirmoy.das@intel.com>;
> > janusz.krzysztofik@linux.intel.com; Gote, Nitin R
> > <nitin.r.gote@intel.com>; Chris Wilson
> > <chris.p.wilson@linux.intel.com>; stable@vger.kernel.org
> > Subject: [PATCH v3] drm/i915/gt: Do not consider preemption during
> > execlists_dequeue for gen8
> > >
> > > We're seeing a GPU HANG issue on a CHV platform, which was caused by
> > > bac24f59f454 ("drm/i915/execlists: Enable coarse preemption boundaries
> for gen8").
> > >
> > > Gen8 platform has only timeslice and doesn't support a preemption
> > > mechanism as engines do not have a preemption timer and doesn't send
> > > an irq if the preemption timeout expires.
> >
> > That seems to mean the original can_preempt function was inaccurately
> > built, so fixing it here makes the most sense to me, especially if it's causing
> problems.
> >
> > Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> -Jonathan
> > Cavitt
> >
> > > So, add a fix to not consider preemption during dequeuing for gen8
> > > platforms.
> > >
> > > v2: Simplify can_preempt() function (Tvrtko Ursulin)
> > >
> > > v3:
> > >  - Inside need_preempt(), condition of can_preempt() is not required
> > >    as simplified can_preempt() is enough. (Chris Wilson)
> > >
> > > Fixes: bac24f59f454 ("drm/i915/execlists: Enable coarse preemption
> > > boundaries for gen8")
> 
> Something strange in here...
> 
> This patch is not using directly or indirectly (I915_ENGINE_HAS_PREEMPTION)
> the can_preempt()...
>

Thank you Rodrigo for the review comment. Seems like you are right.
Fixes: bac24f59f454 is misleading as it's not using can_preempt(). 
The bug could be from the commit bac24f59f454 as mentioned in the issue
But this change fixes the original implementation of can_preempt()  in below commit.
Fixes: 751f82b353a6 ("drm/i915/gt: Only disable preemption on gen8 render engines").

I will update the Fixes in the commit description and will send in v4.

> > > Closes:
> > > https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/11396
> > > Suggested-by: Andi Shyti <andi.shyti@intel.com>
> > > Signed-off-by: Nitin Gote <nitin.r.gote@intel.com>
> > > Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
> > > CC: <stable@vger.kernel.org> # v5.2+
> > > ---
> > >  drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 6 +-----
> > >  1 file changed, 1 insertion(+), 5 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > index 21829439e686..72090f52fb85 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > @@ -3315,11 +3315,7 @@ static void remove_from_engine(struct
> > > i915_request *rq)
> > >
> > >  static bool can_preempt(struct intel_engine_cs *engine)  {
> > > -	if (GRAPHICS_VER(engine->i915) > 8)
> > > -		return true;
> > > -
> > > -	/* GPGPU on bdw requires extra w/a; not implemented */
> > > -	return engine->class != RENDER_CLASS;
> > > +	return GRAPHICS_VER(engine->i915) > 8;
> > >  }
> > >
> > >  static void kick_execlists(const struct i915_request *rq, int prio)
> > > --
> > > 2.25.1
> > >
> > >
Andi Shyti July 12, 2024, 1:39 p.m. UTC | #4
Hi Nitin,

> > > > We're seeing a GPU HANG issue on a CHV platform, which was caused by
> > > > bac24f59f454 ("drm/i915/execlists: Enable coarse preemption boundaries
> > for gen8").
> > > >
> > > > Gen8 platform has only timeslice and doesn't support a preemption
> > > > mechanism as engines do not have a preemption timer and doesn't send
> > > > an irq if the preemption timeout expires.
> > >
> > > That seems to mean the original can_preempt function was inaccurately
> > > built, so fixing it here makes the most sense to me, especially if it's causing
> > problems.
> > >
> > > Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> -Jonathan
> > > Cavitt
> > >
> > > > So, add a fix to not consider preemption during dequeuing for gen8
> > > > platforms.
> > > >
> > > > v2: Simplify can_preempt() function (Tvrtko Ursulin)
> > > >
> > > > v3:
> > > >  - Inside need_preempt(), condition of can_preempt() is not required
> > > >    as simplified can_preempt() is enough. (Chris Wilson)
> > > >
> > > > Fixes: bac24f59f454 ("drm/i915/execlists: Enable coarse preemption
> > > > boundaries for gen8")
> > 
> > Something strange in here...
> > 
> > This patch is not using directly or indirectly (I915_ENGINE_HAS_PREEMPTION)
> > the can_preempt()...
> >
> 
> Thank you Rodrigo for the review comment. Seems like you are right.
> Fixes: bac24f59f454 is misleading as it's not using can_preempt(). 
> The bug could be from the commit bac24f59f454 as mentioned in the issue
> But this change fixes the original implementation of can_preempt()  in below commit.
> Fixes: 751f82b353a6 ("drm/i915/gt: Only disable preemption on gen8 render engines").
> 
> I will update the Fixes in the commit description and will send in v4.

no need to resend it, I will update it before pushing.

Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>

I think the first mention in the commit log is correct, though,
as that's the reason where the issue was generated.

Thanks,
Andi

> > > > Closes:
> > > > https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/11396
> > > > Suggested-by: Andi Shyti <andi.shyti@intel.com>
> > > > Signed-off-by: Nitin Gote <nitin.r.gote@intel.com>
> > > > Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
> > > > CC: <stable@vger.kernel.org> # v5.2+
> > > > ---
> > > >  drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 6 +-----
> > > >  1 file changed, 1 insertion(+), 5 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > > b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > > index 21829439e686..72090f52fb85 100644
> > > > --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > > +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > > @@ -3315,11 +3315,7 @@ static void remove_from_engine(struct
> > > > i915_request *rq)
> > > >
> > > >  static bool can_preempt(struct intel_engine_cs *engine)  {
> > > > -	if (GRAPHICS_VER(engine->i915) > 8)
> > > > -		return true;
> > > > -
> > > > -	/* GPGPU on bdw requires extra w/a; not implemented */
> > > > -	return engine->class != RENDER_CLASS;
> > > > +	return GRAPHICS_VER(engine->i915) > 8;
> > > >  }
> > > >
> > > >  static void kick_execlists(const struct i915_request *rq, int prio)
> > > > --
> > > > 2.25.1
> > > >
> > > >
Andi Shyti July 15, 2024, 10:14 p.m. UTC | #5
Hi,

On Fri, Jul 12, 2024 at 03:25:23PM +0200, Gote, Nitin R wrote:
> > -----Original Message-----
> > From: Vivi, Rodrigo <rodrigo.vivi@intel.com>
> > Sent: Thursday, July 11, 2024 11:39 PM
> > To: Cavitt, Jonathan <jonathan.cavitt@intel.com>
> > Cc: Gote, Nitin R <nitin.r.gote@intel.com>; Wilson, Chris P
> > <chris.p.wilson@intel.com>; tursulin@ursulin.net; intel-
> > gfx@lists.freedesktop.org; dri-devel@lists.freedesktop.org; Shyti, Andi
> > <andi.shyti@intel.com>; Das, Nirmoy <nirmoy.das@intel.com>;
> > janusz.krzysztofik@linux.intel.com; Chris Wilson
> > <chris.p.wilson@linux.intel.com>; stable@vger.kernel.org
> > Subject: Re: [PATCH v3] drm/i915/gt: Do not consider preemption during
> > execlists_dequeue for gen8
> > 
> > On Thu, Jul 11, 2024 at 04:28:53PM +0000, Cavitt, Jonathan wrote:
> > > -----Original Message-----
> > > From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of
> > > Nitin Gote
> > > Sent: Thursday, July 11, 2024 9:32 AM
> > > To: Wilson, Chris P <chris.p.wilson@intel.com>; tursulin@ursulin.net;
> > > intel-gfx@lists.freedesktop.org
> > > Cc: dri-devel@lists.freedesktop.org; Shyti, Andi
> > > <andi.shyti@intel.com>; Das, Nirmoy <nirmoy.das@intel.com>;
> > > janusz.krzysztofik@linux.intel.com; Gote, Nitin R
> > > <nitin.r.gote@intel.com>; Chris Wilson
> > > <chris.p.wilson@linux.intel.com>; stable@vger.kernel.org
> > > Subject: [PATCH v3] drm/i915/gt: Do not consider preemption during
> > > execlists_dequeue for gen8
> > > >
> > > > We're seeing a GPU HANG issue on a CHV platform, which was caused by
> > > > bac24f59f454 ("drm/i915/execlists: Enable coarse preemption boundaries
> > for gen8").
> > > >
> > > > Gen8 platform has only timeslice and doesn't support a preemption
> > > > mechanism as engines do not have a preemption timer and doesn't send
> > > > an irq if the preemption timeout expires.
> > >
> > > That seems to mean the original can_preempt function was inaccurately
> > > built, so fixing it here makes the most sense to me, especially if it's causing
> > problems.
> > >
> > > Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> -Jonathan
> > > Cavitt
> > >
> > > > So, add a fix to not consider preemption during dequeuing for gen8
> > > > platforms.
> > > >
> > > > v2: Simplify can_preempt() function (Tvrtko Ursulin)
> > > >
> > > > v3:
> > > >  - Inside need_preempt(), condition of can_preempt() is not required
> > > >    as simplified can_preempt() is enough. (Chris Wilson)
> > > >
> > > > Fixes: bac24f59f454 ("drm/i915/execlists: Enable coarse preemption
> > > > boundaries for gen8")
> > 
> > Something strange in here...
> > 
> > This patch is not using directly or indirectly (I915_ENGINE_HAS_PREEMPTION)
> > the can_preempt()...
> >
> 
> Thank you Rodrigo for the review comment. Seems like you are right.
> Fixes: bac24f59f454 is misleading as it's not using can_preempt(). 
> The bug could be from the commit bac24f59f454 as mentioned in the issue
> But this change fixes the original implementation of can_preempt()  in below commit.
> Fixes: 751f82b353a6 ("drm/i915/gt: Only disable preemption on gen8 render engines").
> 
> I will update the Fixes in the commit description and will send in v4.

Can I reword the commit log to something similar:

    drm/i915/gt: Do not consider preemption during execlists_dequeue for gen8

    We're seeing a GPU hang issue on a CHV platform, which was caused by commit
    bac24f59f454 ("drm/i915/execlists: Enable coarse preemption boundaries for
    Gen8").

    The Gen8 platform only supports timeslicing and doesn't have a preemption
    mechanism, as its engines do not have a preemption timer.

    Commit 751f82b353a6 ("drm/i915/gt: Only disable preemption on Gen8 render
    engines") addressed this issue only for render engines. This patch extends
    that fix by ensuring that preemption is not considered for all engines on
    Gen8 platforms.

    v4:
     - Use the correct Fixes tag (Rodrigo Vivi)
     - Reworded commit log (Andi Shyti)

    v3:
     - Inside need_preempt(), condition of can_preempt() is not required
       as simplified can_preempt() is enough. (Chris Wilson)

    v2: Simplify can_preempt() function (Tvrtko Ursulin)

Andi
Nitin Gote July 16, 2024, 3:43 a.m. UTC | #6
Hi,

> -----Original Message-----
> From: Andi Shyti <andi.shyti@linux.intel.com>
> Sent: Tuesday, July 16, 2024 3:44 AM
> To: Gote, Nitin R <nitin.r.gote@intel.com>
> Cc: Vivi, Rodrigo <rodrigo.vivi@intel.com>; Cavitt, Jonathan
> <jonathan.cavitt@intel.com>; Wilson, Chris P <chris.p.wilson@intel.com>;
> tursulin@ursulin.net; intel-gfx@lists.freedesktop.org; dri-
> devel@lists.freedesktop.org; Das, Nirmoy <nirmoy.das@intel.com>;
> janusz.krzysztofik@linux.intel.com; Chris Wilson
> <chris.p.wilson@linux.intel.com>; stable@vger.kernel.org
> Subject: Re: [PATCH v3] drm/i915/gt: Do not consider preemption during
> execlists_dequeue for gen8
> 
> Hi,
> 
> On Fri, Jul 12, 2024 at 03:25:23PM +0200, Gote, Nitin R wrote:
> > > -----Original Message-----
> > > From: Vivi, Rodrigo <rodrigo.vivi@intel.com>
> > > Sent: Thursday, July 11, 2024 11:39 PM
> > > To: Cavitt, Jonathan <jonathan.cavitt@intel.com>
> > > Cc: Gote, Nitin R <nitin.r.gote@intel.com>; Wilson, Chris P
> > > <chris.p.wilson@intel.com>; tursulin@ursulin.net; intel-
> > > gfx@lists.freedesktop.org; dri-devel@lists.freedesktop.org; Shyti,
> > > Andi <andi.shyti@intel.com>; Das, Nirmoy <nirmoy.das@intel.com>;
> > > janusz.krzysztofik@linux.intel.com; Chris Wilson
> > > <chris.p.wilson@linux.intel.com>; stable@vger.kernel.org
> > > Subject: Re: [PATCH v3] drm/i915/gt: Do not consider preemption
> > > during execlists_dequeue for gen8
> > >
> > > On Thu, Jul 11, 2024 at 04:28:53PM +0000, Cavitt, Jonathan wrote:
> > > > -----Original Message-----
> > > > From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On
> > > > Behalf Of Nitin Gote
> > > > Sent: Thursday, July 11, 2024 9:32 AM
> > > > To: Wilson, Chris P <chris.p.wilson@intel.com>;
> > > > tursulin@ursulin.net; intel-gfx@lists.freedesktop.org
> > > > Cc: dri-devel@lists.freedesktop.org; Shyti, Andi
> > > > <andi.shyti@intel.com>; Das, Nirmoy <nirmoy.das@intel.com>;
> > > > janusz.krzysztofik@linux.intel.com; Gote, Nitin R
> > > > <nitin.r.gote@intel.com>; Chris Wilson
> > > > <chris.p.wilson@linux.intel.com>; stable@vger.kernel.org
> > > > Subject: [PATCH v3] drm/i915/gt: Do not consider preemption during
> > > > execlists_dequeue for gen8
> > > > >
> > > > > We're seeing a GPU HANG issue on a CHV platform, which was
> > > > > caused by
> > > > > bac24f59f454 ("drm/i915/execlists: Enable coarse preemption
> > > > > boundaries
> > > for gen8").
> > > > >
> > > > > Gen8 platform has only timeslice and doesn't support a
> > > > > preemption mechanism as engines do not have a preemption timer
> > > > > and doesn't send an irq if the preemption timeout expires.
> > > >
> > > > That seems to mean the original can_preempt function was
> > > > inaccurately built, so fixing it here makes the most sense to me,
> > > > especially if it's causing
> > > problems.
> > > >
> > > > Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> -Jonathan
> > > > Cavitt
> > > >
> > > > > So, add a fix to not consider preemption during dequeuing for
> > > > > gen8 platforms.
> > > > >
> > > > > v2: Simplify can_preempt() function (Tvrtko Ursulin)
> > > > >
> > > > > v3:
> > > > >  - Inside need_preempt(), condition of can_preempt() is not required
> > > > >    as simplified can_preempt() is enough. (Chris Wilson)
> > > > >
> > > > > Fixes: bac24f59f454 ("drm/i915/execlists: Enable coarse
> > > > > preemption boundaries for gen8")
> > >
> > > Something strange in here...
> > >
> > > This patch is not using directly or indirectly
> > > (I915_ENGINE_HAS_PREEMPTION) the can_preempt()...
> > >
> >
> > Thank you Rodrigo for the review comment. Seems like you are right.
> > Fixes: bac24f59f454 is misleading as it's not using can_preempt().
> > The bug could be from the commit bac24f59f454 as mentioned in the
> > issue But this change fixes the original implementation of can_preempt()  in
> below commit.
> > Fixes: 751f82b353a6 ("drm/i915/gt: Only disable preemption on gen8
> render engines").
> >
> > I will update the Fixes in the commit description and will send in v4.
> 
> Can I reword the commit log to something similar:
> 
>     drm/i915/gt: Do not consider preemption during execlists_dequeue for
> gen8
> 
>     We're seeing a GPU hang issue on a CHV platform, which was caused by
> commit
>     bac24f59f454 ("drm/i915/execlists: Enable coarse preemption boundaries
> for
>     Gen8").
> 
>     The Gen8 platform only supports timeslicing and doesn't have a
> preemption
>     mechanism, as its engines do not have a preemption timer.
> 
>     Commit 751f82b353a6 ("drm/i915/gt: Only disable preemption on Gen8
> render
>     engines") addressed this issue only for render engines. This patch extends
>     that fix by ensuring that preemption is not considered for all engines on
>     Gen8 platforms.
> 
>     v4:
>      - Use the correct Fixes tag (Rodrigo Vivi)
>      - Reworded commit log (Andi Shyti)
> 
>     v3:
>      - Inside need_preempt(), condition of can_preempt() is not required
>        as simplified can_preempt() is enough. (Chris Wilson)
> 
>     v2: Simplify can_preempt() function (Tvrtko Ursulin)
> 
> Andi

Sure. You can.

Thank you
- Nitin
Andi Shyti July 16, 2024, 9:54 a.m. UTC | #7
Hi Nitin,

On Thu, Jul 11, 2024 at 10:02:08PM +0530, Nitin Gote wrote:
> We're seeing a GPU HANG issue on a CHV platform, which was caused by
> bac24f59f454 ("drm/i915/execlists: Enable coarse preemption boundaries for gen8").
> 
> Gen8 platform has only timeslice and doesn't support a preemption mechanism
> as engines do not have a preemption timer and doesn't send an irq if the
> preemption timeout expires. So, add a fix to not consider preemption
> during dequeuing for gen8 platforms.
> 
> v2: Simplify can_preempt() function (Tvrtko Ursulin)
> 
> v3:
>  - Inside need_preempt(), condition of can_preempt() is not required
>    as simplified can_preempt() is enough. (Chris Wilson)
> 
> Fixes: bac24f59f454 ("drm/i915/execlists: Enable coarse preemption boundaries for gen8")
> Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/11396
> Suggested-by: Andi Shyti <andi.shyti@intel.com>
> Signed-off-by: Nitin Gote <nitin.r.gote@intel.com>
> Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
> CC: <stable@vger.kernel.org> # v5.2+

with the commit message fixed and the checkpatch as well, merged
to drm-intel-gt-next.

Thank you,
Andi
diff mbox series

Patch

diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index 21829439e686..72090f52fb85 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -3315,11 +3315,7 @@  static void remove_from_engine(struct i915_request *rq)
 
 static bool can_preempt(struct intel_engine_cs *engine)
 {
-	if (GRAPHICS_VER(engine->i915) > 8)
-		return true;
-
-	/* GPGPU on bdw requires extra w/a; not implemented */
-	return engine->class != RENDER_CLASS;
+	return GRAPHICS_VER(engine->i915) > 8;
 }
 
 static void kick_execlists(const struct i915_request *rq, int prio)