diff mbox series

[07/27] drm/i915/guc: Don't call switch_to_kernel_context with GuC submission

Message ID 20210820224446.30620-8-matthew.brost@intel.com (mailing list archive)
State New, archived
Series Parallel submission aka multi-bb execbuf

Commit Message

Matthew Brost Aug. 20, 2021, 10:44 p.m. UTC
Calling switch_to_kernel_context isn't needed if the engine PM reference
is taken while all contexts are pinned. By not calling
switch_to_kernel_context we save on issuing a request to the engine.

v2:
 (Daniel Vetter)
  - Add FIXME comment about pushing switch_to_kernel_context to backend

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/gt/intel_engine_pm.c | 9 +++++++++
 1 file changed, 9 insertions(+)

Comments

John Harrison Sept. 9, 2021, 10:51 p.m. UTC | #1
On 8/20/2021 15:44, Matthew Brost wrote:
> Calling switch_to_kernel_context isn't needed if the engine PM reference
> is taken while all contexts are pinned. By not calling
> switch_to_kernel_context we save on issuing a request to the engine.
I thought the intention of the switch_to_kernel was to ensure that the 
GPU is not touching any user context and is basically idle. That is not 
a valid assumption with an external scheduler such as GuC. So why is the 
description above only mentioning PM references? What is the connection 
between the PM ref and the switch_to_kernel?

Also, the comment in the code does not mention anything about PM 
references, it just says 'not necessary with GuC' but no explanation at all.


> v2:
>   (Daniel Vetter)
>    - Add FIXME comment about pushing switch_to_kernel_context to backend
>
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> ---
>   drivers/gpu/drm/i915/gt/intel_engine_pm.c | 9 +++++++++
>   1 file changed, 9 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> index 1f07ac4e0672..11fee66daf60 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> @@ -162,6 +162,15 @@ static bool switch_to_kernel_context(struct intel_engine_cs *engine)
>   	unsigned long flags;
>   	bool result = true;
>   
> +	/*
> +	 * No need to switch_to_kernel_context if GuC submission
> +	 *
> +	 * FIXME: This execlists specific backend behavior in generic code, this
"This execlists" -> "This is execlist"

"this should be" -> "it should be"

John.

> +	 * should be pushed to the backend.
> +	 */
> +	if (intel_engine_uses_guc(engine))
> +		return true;
> +
>   	/* GPU is pointing to the void, as good as in the kernel context. */
>   	if (intel_gt_is_wedged(engine->gt))
>   		return true;
Matthew Brost Sept. 13, 2021, 4:54 p.m. UTC | #2
On Thu, Sep 09, 2021 at 03:51:27PM -0700, John Harrison wrote:
> On 8/20/2021 15:44, Matthew Brost wrote:
> > Calling switch_to_kernel_context isn't needed if the engine PM reference
> > is taken while all contexts are pinned. By not calling
> > switch_to_kernel_context we save on issuing a request to the engine.
> I thought the intention of the switch_to_kernel was to ensure that the GPU
> is not touching any user context and is basically idle. That is not a valid
> assumption with an external scheduler such as GuC. So why is the description
> above only mentioning PM references? What is the connection between the PM
> ref and the switch_to_kernel?
> 
> Also, the comment in the code does not mention anything about PM references,
> it just says 'not necessary with GuC' but no explanation at all.
> 

Yeah, this needs to be explained better. How about this?

Calling switch_to_kernel_context isn't needed if the engine PM reference
is taken while all user contexts have scheduling enabled. Once scheduling
is disabled on all user contexts the GuC is guaranteed to not touch any
user context state, which is effectively the same as pointing to a kernel
context.

Matt

> 
> > v2:
> >   (Daniel Vetter)
> >    - Add FIXME comment about pushing switch_to_kernel_context to backend
> > 
> > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> > ---
> >   drivers/gpu/drm/i915/gt/intel_engine_pm.c | 9 +++++++++
> >   1 file changed, 9 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> > index 1f07ac4e0672..11fee66daf60 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> > @@ -162,6 +162,15 @@ static bool switch_to_kernel_context(struct intel_engine_cs *engine)
> >   	unsigned long flags;
> >   	bool result = true;
> > +	/*
> > +	 * No need to switch_to_kernel_context if GuC submission
> > +	 *
> > +	 * FIXME: This execlists specific backend behavior in generic code, this
> "This execlists" -> "This is execlist"
> 
> "this should be" -> "it should be"
> 
> John.
> 
> > +	 * should be pushed to the backend.
> > +	 */
> > +	if (intel_engine_uses_guc(engine))
> > +		return true;
> > +
> >   	/* GPU is pointing to the void, as good as in the kernel context. */
> >   	if (intel_gt_is_wedged(engine->gt))
> >   		return true;
>
Matthew Brost Sept. 13, 2021, 4:55 p.m. UTC | #3
On Thu, Sep 09, 2021 at 03:51:27PM -0700, John Harrison wrote:
> On 8/20/2021 15:44, Matthew Brost wrote:
> > Calling switch_to_kernel_context isn't needed if the engine PM reference
> > is taken while all contexts are pinned. By not calling
> > switch_to_kernel_context we save on issuing a request to the engine.
> I thought the intention of the switch_to_kernel was to ensure that the GPU
> is not touching any user context and is basically idle. That is not a valid
> assumption with an external scheduler such as GuC. So why is the description
> above only mentioning PM references? What is the connection between the PM
> ref and the switch_to_kernel?
> 
> Also, the comment in the code does not mention anything about PM references,
> it just says 'not necessary with GuC' but no explanation at all.
> 
> 
> > v2:
> >   (Daniel Vetter)
> >    - Add FIXME comment about pushing switch_to_kernel_context to backend
> > 
> > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> > ---
> >   drivers/gpu/drm/i915/gt/intel_engine_pm.c | 9 +++++++++
> >   1 file changed, 9 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> > index 1f07ac4e0672..11fee66daf60 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> > @@ -162,6 +162,15 @@ static bool switch_to_kernel_context(struct intel_engine_cs *engine)
> >   	unsigned long flags;
> >   	bool result = true;
> > +	/*
> > +	 * No need to switch_to_kernel_context if GuC submission
> > +	 *
> > +	 * FIXME: This execlists specific backend behavior in generic code, this
> "This execlists" -> "This is execlist"
> 
> "this should be" -> "it should be"
> 

Missed this. Will fix in next rev.

Matt

> John.
> 
> > +	 * should be pushed to the backend.
> > +	 */
> > +	if (intel_engine_uses_guc(engine))
> > +		return true;
> > +
> >   	/* GPU is pointing to the void, as good as in the kernel context. */
> >   	if (intel_gt_is_wedged(engine->gt))
> >   		return true;
>
John Harrison Sept. 13, 2021, 10:38 p.m. UTC | #4
On 9/13/2021 09:54, Matthew Brost wrote:
> On Thu, Sep 09, 2021 at 03:51:27PM -0700, John Harrison wrote:
>> On 8/20/2021 15:44, Matthew Brost wrote:
>>> Calling switch_to_kernel_context isn't needed if the engine PM reference
>>> is taken while all contexts are pinned. By not calling
>>> switch_to_kernel_context we save on issuing a request to the engine.
>> I thought the intention of the switch_to_kernel was to ensure that the GPU
>> is not touching any user context and is basically idle. That is not a valid
>> assumption with an external scheduler such as GuC. So why is the description
>> above only mentioning PM references? What is the connection between the PM
>> ref and the switch_to_kernel?
>>
>> Also, the comment in the code does not mention anything about PM references,
>> it just says 'not necessary with GuC' but no explanation at all.
>>
> Yeah, this needs to be explained better. How about this?
>
> Calling switch_to_kernel_context isn't needed if the engine PM reference
> is taken while all user contexts have scheduling enabled. Once scheduling
> is disabled on all user contexts the GuC is guaranteed to not touch any
> user context state, which is effectively the same as pointing to a kernel
> context.
>
> Matt
I'm still not seeing how the PM reference is involved?

Also, IMHO the focus is wrong in the above text. The fundamental 
requirement is to ensure the hardware is idle. Execlist achieves this 
by switching to a safe context. GuC achieves it by disabling scheduling. 
Indeed, switching to a 'safe' context really has no effect with GuC 
submission. So 'effectively the same as pointing to a kernel context' is 
an incorrect description. I would go with something like:

    "This is execlist specific behaviour intended to ensure the GPU is
    idle by switching to a known 'safe' context. With GuC submission,
    the same idle guarantee is achieved by other means (disabling
    scheduling). Further, switching to a 'safe' context has no effect
    with GuC submission as the scheduler can just switch back again.
    FIXME: Move this backend scheduler specific behaviour into the
    scheduler backend."


John.


>
>>> v2:
>>>    (Daniel Vetter)
>>>     - Add FIXME comment about pushing switch_to_kernel_context to backend
>>>
>>> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>>> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>>> ---
>>>    drivers/gpu/drm/i915/gt/intel_engine_pm.c | 9 +++++++++
>>>    1 file changed, 9 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>> index 1f07ac4e0672..11fee66daf60 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>> @@ -162,6 +162,15 @@ static bool switch_to_kernel_context(struct intel_engine_cs *engine)
>>>    	unsigned long flags;
>>>    	bool result = true;
>>> +	/*
>>> +	 * No need to switch_to_kernel_context if GuC submission
>>> +	 *
>>> +	 * FIXME: This execlists specific backend behavior in generic code, this
>> "This execlists" -> "This is execlist"
>>
>> "this should be" -> "it should be"
>>
>> John.
>>
>>> +	 * should be pushed to the backend.
>>> +	 */
>>> +	if (intel_engine_uses_guc(engine))
>>> +		return true;
>>> +
>>>    	/* GPU is pointing to the void, as good as in the kernel context. */
>>>    	if (intel_gt_is_wedged(engine->gt))
>>>    		return true;
Matthew Brost Sept. 14, 2021, 5:02 a.m. UTC | #5
On Mon, Sep 13, 2021 at 03:38:44PM -0700, John Harrison wrote:
> On 9/13/2021 09:54, Matthew Brost wrote:
> 
>     On Thu, Sep 09, 2021 at 03:51:27PM -0700, John Harrison wrote:
> 
>         On 8/20/2021 15:44, Matthew Brost wrote:
> 
>             Calling switch_to_kernel_context isn't needed if the engine PM reference
>             is taken while all contexts are pinned. By not calling
>             switch_to_kernel_context we save on issuing a request to the engine.
> 
>         I thought the intention of the switch_to_kernel was to ensure that the GPU
>         is not touching any user context and is basically idle. That is not a valid
>         assumption with an external scheduler such as GuC. So why is the description
>         above only mentioning PM references? What is the connection between the PM
>         ref and the switch_to_kernel?
> 
>         Also, the comment in the code does not mention anything about PM references,
>         it just says 'not necessary with GuC' but no explanation at all.
> 
> 
>     Yeah, this needs to be explained better. How about this?
> 
>     Calling switch_to_kernel_context isn't needed if the engine PM reference
>     is taken while all user contexts have scheduling enabled. Once scheduling
>     is disabled on all user contexts the GuC is guaranteed to not touch any
>     user context state, which is effectively the same as pointing to a kernel
>     context.
> 
>     Matt
> 
> I'm still not seeing how the PM reference is involved?
> 

We shouldn't trap into the GT PM park code while a user context has
scheduling enabled, as the GT PM park code may have side effects we don't
want to execute if a user context still has scheduling enabled. I guess
that isn't explained very well.

> Also, IMHO the focus is wrong in the above text. The fundamental requirement is
> to ensure the hardware is idle. Execlist achieves this by switching to a safe
> context. GuC achieves it by disabling scheduling. Indeed, switching to a 'safe'
> context really has no effect with GuC submission. So 'effectively the same as
> pointing to a kernel context' is an incorrect description. I would go with
> something like:
> 
>     "This is execlist specific behaviour intended to ensure the GPU is idle by
>     switching to a known 'safe' context. With GuC submission, the same idle
>     guarantee is achieved by other means (disabling scheduling). Further,
>     switching to a 'safe' context has no effect with GuC submission as the
>     scheduler can just switch back again.
>     FIXME: Move this backend scheduler specific behaviour into the scheduler
>     backend."
>

That is worded better. Will pull into the next rev.

Matt
 
> 
> John.
> 
> 
> 
> 
> 
>             v2:
>               (Daniel Vetter)
>                - Add FIXME comment about pushing switch_to_kernel_context to backend
> 
>             Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>             Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>             ---
>               drivers/gpu/drm/i915/gt/intel_engine_pm.c | 9 +++++++++
>               1 file changed, 9 insertions(+)
> 
>             diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>             index 1f07ac4e0672..11fee66daf60 100644
>             --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>             +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>             @@ -162,6 +162,15 @@ static bool switch_to_kernel_context(struct intel_engine_cs *engine)
>                     unsigned long flags;
>                     bool result = true;
>             +       /*
>             +        * No need to switch_to_kernel_context if GuC submission
>             +        *
>             +        * FIXME: This execlists specific backend behavior in generic code, this
> 
>         "This execlists" -> "This is execlist"
> 
>         "this should be" -> "it should be"
> 
>         John.
> 
> 
>             +        * should be pushed to the backend.
>             +        */
>             +       if (intel_engine_uses_guc(engine))
>             +               return true;
>             +
>                     /* GPU is pointing to the void, as good as in the kernel context. */
>                     if (intel_gt_is_wedged(engine->gt))
>                             return true;
> 
> 

Patch

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
index 1f07ac4e0672..11fee66daf60 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
@@ -162,6 +162,15 @@  static bool switch_to_kernel_context(struct intel_engine_cs *engine)
 	unsigned long flags;
 	bool result = true;
 
+	/*
+	 * No need to switch_to_kernel_context if GuC submission
+	 *
+	 * FIXME: This execlists specific backend behavior in generic code, this
+	 * should be pushed to the backend.
+	 */
+	if (intel_engine_uses_guc(engine))
+		return true;
+
 	/* GPU is pointing to the void, as good as in the kernel context. */
 	if (intel_gt_is_wedged(engine->gt))
 		return true;
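
For illustration, the early return added by this patch, combined with the
comment wording proposed in the review above, can be modeled in plain
userspace C. This is only a sketch: struct mock_engine, enum mock_backend,
mock_engine_uses_guc() and mock_switch_to_kernel_context() are hypothetical
stand-ins, not the i915 types or functions. The execlists path is reduced to
counting one issued request and returning false, which is assumed here to
mean that parking has to wait for that request.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver structures, for illustration only. */
enum mock_backend {
	MOCK_BACKEND_EXECLISTS,
	MOCK_BACKEND_GUC,
};

struct mock_engine {
	enum mock_backend backend;
	bool gt_wedged;               /* models intel_gt_is_wedged() */
	unsigned int requests_issued; /* kernel-context requests sent so far */
};

/* Models intel_engine_uses_guc(): true when the GuC owns scheduling. */
static bool mock_engine_uses_guc(const struct mock_engine *engine)
{
	return engine->backend == MOCK_BACKEND_GUC;
}

/*
 * Returns true when the engine may be parked right away, false when a
 * request had to be issued first (simplified model of the real function).
 */
static bool mock_switch_to_kernel_context(struct mock_engine *engine)
{
	assert(engine != NULL);

	/*
	 * This is execlist specific behaviour intended to ensure the GPU is
	 * idle by switching to a known 'safe' context. With GuC submission,
	 * the same idle guarantee is achieved by other means (disabling
	 * scheduling). Further, switching to a 'safe' context has no effect
	 * with GuC submission as the scheduler can just switch back again.
	 *
	 * FIXME: Move this backend scheduler specific behaviour into the
	 * scheduler backend.
	 */
	if (mock_engine_uses_guc(engine))
		return true;

	/* GPU is pointing to the void, as good as in the kernel context. */
	if (engine->gt_wedged)
		return true;

	/* Execlists model: issue one request and defer parking. */
	engine->requests_issued++;
	return false;
}
```

In this model a GuC engine returns true without issuing a request, which is
the saving the commit message describes, while an execlists engine issues
one request and returns false.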