diff mbox series

[RFC,49/97] drm/i915/guc: Disable engine barriers with GuC during unpin

Message ID 20210506191451.77768-50-matthew.brost@intel.com (mailing list archive)
State New, archived
Headers show
Series Basic GuC submission support in the i915 | expand

Commit Message

Matthew Brost May 6, 2021, 7:14 p.m. UTC
Disable engine barriers for unpinning with GuC. This feature isn't
needed with the GuC as it disables context scheduling before unpinning
which guarantees the HW will not reference the context. Hence it is
not necessary to defer unpinning until a kernel context request
completes on each engine in the context engine mask.

Cc: John Harrison <john.c.harrison@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_context.c    |  2 +-
 drivers/gpu/drm/i915/gt/intel_context.h    |  1 +
 drivers/gpu/drm/i915/gt/selftest_context.c | 10 ++++++++++
 drivers/gpu/drm/i915/i915_active.c         |  3 +++
 4 files changed, 15 insertions(+), 1 deletion(-)

Comments

Daniel Vetter May 11, 2021, 3:37 p.m. UTC | #1
On Thu, May 06, 2021 at 12:14:03PM -0700, Matthew Brost wrote:
> Disable engine barriers for unpinning with GuC. This feature isn't
> needed with the GuC as it disables context scheduling before unpinning
> which guarantees the HW will not reference the context. Hence it is
> not necessary to defer unpinning until a kernel context request
> completes on each engine in the context engine mask.
> 
> Cc: John Harrison <john.c.harrison@intel.com>
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

Instead of these ifs in the code, can we push this barrier business down
into backends?

Not in this series, but as one of the things to sort out as part of the
conversion to drm/scheduler.
-Daniel

> ---
>  drivers/gpu/drm/i915/gt/intel_context.c    |  2 +-
>  drivers/gpu/drm/i915/gt/intel_context.h    |  1 +
>  drivers/gpu/drm/i915/gt/selftest_context.c | 10 ++++++++++
>  drivers/gpu/drm/i915/i915_active.c         |  3 +++
>  4 files changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
> index 1499b8aace2a..7f97753ab164 100644
> --- a/drivers/gpu/drm/i915/gt/intel_context.c
> +++ b/drivers/gpu/drm/i915/gt/intel_context.c
> @@ -80,7 +80,7 @@ static int intel_context_active_acquire(struct intel_context *ce)
>  
>  	__i915_active_acquire(&ce->active);
>  
> -	if (intel_context_is_barrier(ce))
> +	if (intel_context_is_barrier(ce) || intel_engine_uses_guc(ce->engine))
>  		return 0;
>  
>  	/* Preallocate tracking nodes */
> diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h
> index 92ecbab8c1cd..9b211ca5ecc7 100644
> --- a/drivers/gpu/drm/i915/gt/intel_context.h
> +++ b/drivers/gpu/drm/i915/gt/intel_context.h
> @@ -16,6 +16,7 @@
>  #include "intel_engine_types.h"
>  #include "intel_ring_types.h"
>  #include "intel_timeline_types.h"
> +#include "uc/intel_guc_submission.h"
>  
>  #define CE_TRACE(ce, fmt, ...) do {					\
>  	const struct intel_context *ce__ = (ce);			\
> diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c b/drivers/gpu/drm/i915/gt/selftest_context.c
> index 26685b927169..fa7b99a671dd 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_context.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_context.c
> @@ -209,7 +209,13 @@ static int __live_active_context(struct intel_engine_cs *engine)
>  	 * This test makes sure that the context is kept alive until a
>  	 * subsequent idle-barrier (emitted when the engine wakeref hits 0
>  	 * with no more outstanding requests).
> +	 *
> +	 * In GuC submission mode we don't use idle barriers and we instead
> +	 * get a message from the GuC to signal that it is safe to unpin the
> +	 * context from memory.
>  	 */
> +	if (intel_engine_uses_guc(engine))
> +		return 0;
>  
>  	if (intel_engine_pm_is_awake(engine)) {
>  		pr_err("%s is awake before starting %s!\n",
> @@ -357,7 +363,11 @@ static int __live_remote_context(struct intel_engine_cs *engine)
>  	 * on the context image remotely (intel_context_prepare_remote_request),
>  	 * which inserts foreign fences into intel_context.active, does not
>  	 * clobber the idle-barrier.
> +	 *
> +	 * In GuC submission mode we don't use idle barriers.
>  	 */
> +	if (intel_engine_uses_guc(engine))
> +		return 0;
>  
>  	if (intel_engine_pm_is_awake(engine)) {
>  		pr_err("%s is awake before starting %s!\n",
> diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c
> index b1aa1c482c32..9a264898bb91 100644
> --- a/drivers/gpu/drm/i915/i915_active.c
> +++ b/drivers/gpu/drm/i915/i915_active.c
> @@ -968,6 +968,9 @@ void i915_active_acquire_barrier(struct i915_active *ref)
>  
>  	GEM_BUG_ON(i915_active_is_idle(ref));
>  
> +	if (llist_empty(&ref->preallocated_barriers))
> +		return;
> +
>  	/*
>  	 * Transfer the list of preallocated barriers into the
>  	 * i915_active rbtree, but only as proto-nodes. They will be
> -- 
> 2.28.0
>
Matthew Brost May 11, 2021, 4:31 p.m. UTC | #2
On Tue, May 11, 2021 at 05:37:54PM +0200, Daniel Vetter wrote:
> On Thu, May 06, 2021 at 12:14:03PM -0700, Matthew Brost wrote:
> > Disable engine barriers for unpinning with GuC. This feature isn't
> > needed with the GuC as it disables context scheduling before unpinning
> > which guarantees the HW will not reference the context. Hence it is
> > not necessary to defer unpinning until a kernel context request
> > completes on each engine in the context engine mask.
> > 
> > Cc: John Harrison <john.c.harrison@intel.com>
> > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> 
> Instead of these ifs in the code, can we push this barrier business down
> into backends?
> 

Not a bad idea. This is an example of what I think of implict behavior of the
backend creeping into the higher levels.

> Not in this series, but as one of the things to sort out as part of the
> conversion to drm/scheduler.

Agree. After basic GuC submission gets merged maybe we go through the code and
remove all the implict backend assumptions.

Matt

> -Daniel
> 
> > ---
> >  drivers/gpu/drm/i915/gt/intel_context.c    |  2 +-
> >  drivers/gpu/drm/i915/gt/intel_context.h    |  1 +
> >  drivers/gpu/drm/i915/gt/selftest_context.c | 10 ++++++++++
> >  drivers/gpu/drm/i915/i915_active.c         |  3 +++
> >  4 files changed, 15 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
> > index 1499b8aace2a..7f97753ab164 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_context.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_context.c
> > @@ -80,7 +80,7 @@ static int intel_context_active_acquire(struct intel_context *ce)
> >  
> >  	__i915_active_acquire(&ce->active);
> >  
> > -	if (intel_context_is_barrier(ce))
> > +	if (intel_context_is_barrier(ce) || intel_engine_uses_guc(ce->engine))
> >  		return 0;
> >  
> >  	/* Preallocate tracking nodes */
> > diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h
> > index 92ecbab8c1cd..9b211ca5ecc7 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_context.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_context.h
> > @@ -16,6 +16,7 @@
> >  #include "intel_engine_types.h"
> >  #include "intel_ring_types.h"
> >  #include "intel_timeline_types.h"
> > +#include "uc/intel_guc_submission.h"
> >  
> >  #define CE_TRACE(ce, fmt, ...) do {					\
> >  	const struct intel_context *ce__ = (ce);			\
> > diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c b/drivers/gpu/drm/i915/gt/selftest_context.c
> > index 26685b927169..fa7b99a671dd 100644
> > --- a/drivers/gpu/drm/i915/gt/selftest_context.c
> > +++ b/drivers/gpu/drm/i915/gt/selftest_context.c
> > @@ -209,7 +209,13 @@ static int __live_active_context(struct intel_engine_cs *engine)
> >  	 * This test makes sure that the context is kept alive until a
> >  	 * subsequent idle-barrier (emitted when the engine wakeref hits 0
> >  	 * with no more outstanding requests).
> > +	 *
> > +	 * In GuC submission mode we don't use idle barriers and we instead
> > +	 * get a message from the GuC to signal that it is safe to unpin the
> > +	 * context from memory.
> >  	 */
> > +	if (intel_engine_uses_guc(engine))
> > +		return 0;
> >  
> >  	if (intel_engine_pm_is_awake(engine)) {
> >  		pr_err("%s is awake before starting %s!\n",
> > @@ -357,7 +363,11 @@ static int __live_remote_context(struct intel_engine_cs *engine)
> >  	 * on the context image remotely (intel_context_prepare_remote_request),
> >  	 * which inserts foreign fences into intel_context.active, does not
> >  	 * clobber the idle-barrier.
> > +	 *
> > +	 * In GuC submission mode we don't use idle barriers.
> >  	 */
> > +	if (intel_engine_uses_guc(engine))
> > +		return 0;
> >  
> >  	if (intel_engine_pm_is_awake(engine)) {
> >  		pr_err("%s is awake before starting %s!\n",
> > diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c
> > index b1aa1c482c32..9a264898bb91 100644
> > --- a/drivers/gpu/drm/i915/i915_active.c
> > +++ b/drivers/gpu/drm/i915/i915_active.c
> > @@ -968,6 +968,9 @@ void i915_active_acquire_barrier(struct i915_active *ref)
> >  
> >  	GEM_BUG_ON(i915_active_is_idle(ref));
> >  
> > +	if (llist_empty(&ref->preallocated_barriers))
> > +		return;
> > +
> >  	/*
> >  	 * Transfer the list of preallocated barriers into the
> >  	 * i915_active rbtree, but only as proto-nodes. They will be
> > -- 
> > 2.28.0
> > 
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
Tvrtko Ursulin May 26, 2021, 10:26 a.m. UTC | #3
On 06/05/2021 20:14, Matthew Brost wrote:
> Disable engine barriers for unpinning with GuC. This feature isn't
> needed with the GuC as it disables context scheduling before unpinning

Just isn't needed or causes a problem somehow?

> which guarantees the HW will not reference the context. Hence it is
> not necessary to defer unpinning until a kernel context request
> completes on each engine in the context engine mask.

Hm context engine mask does not come across as something used in this patch.

Engine PM works fine with this change - i915 does not turn of the 
engine/gt too early? I mean context unpin is on retire and the guc 
disable of context scheduling is sync or async? Even when the kernel 
context request gets emitted on engine pm put there is no race?

> 
> Cc: John Harrison <john.c.harrison@intel.com>
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> ---
>   drivers/gpu/drm/i915/gt/intel_context.c    |  2 +-
>   drivers/gpu/drm/i915/gt/intel_context.h    |  1 +
>   drivers/gpu/drm/i915/gt/selftest_context.c | 10 ++++++++++
>   drivers/gpu/drm/i915/i915_active.c         |  3 +++
>   4 files changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
> index 1499b8aace2a..7f97753ab164 100644
> --- a/drivers/gpu/drm/i915/gt/intel_context.c
> +++ b/drivers/gpu/drm/i915/gt/intel_context.c
> @@ -80,7 +80,7 @@ static int intel_context_active_acquire(struct intel_context *ce)
>   
>   	__i915_active_acquire(&ce->active);
>   
> -	if (intel_context_is_barrier(ce))
> +	if (intel_context_is_barrier(ce) || intel_engine_uses_guc(ce->engine))
>   		return 0;
>   
>   	/* Preallocate tracking nodes */
> diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h
> index 92ecbab8c1cd..9b211ca5ecc7 100644
> --- a/drivers/gpu/drm/i915/gt/intel_context.h
> +++ b/drivers/gpu/drm/i915/gt/intel_context.h
> @@ -16,6 +16,7 @@
>   #include "intel_engine_types.h"
>   #include "intel_ring_types.h"
>   #include "intel_timeline_types.h"
> +#include "uc/intel_guc_submission.h"
>   
>   #define CE_TRACE(ce, fmt, ...) do {					\
>   	const struct intel_context *ce__ = (ce);			\
> diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c b/drivers/gpu/drm/i915/gt/selftest_context.c
> index 26685b927169..fa7b99a671dd 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_context.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_context.c
> @@ -209,7 +209,13 @@ static int __live_active_context(struct intel_engine_cs *engine)
>   	 * This test makes sure that the context is kept alive until a
>   	 * subsequent idle-barrier (emitted when the engine wakeref hits 0
>   	 * with no more outstanding requests).
> +	 *
> +	 * In GuC submission mode we don't use idle barriers and we instead
> +	 * get a message from the GuC to signal that it is safe to unpin the
> +	 * context from memory.
>   	 */
> +	if (intel_engine_uses_guc(engine))
> +		return 0;
>   
>   	if (intel_engine_pm_is_awake(engine)) {
>   		pr_err("%s is awake before starting %s!\n",
> @@ -357,7 +363,11 @@ static int __live_remote_context(struct intel_engine_cs *engine)
>   	 * on the context image remotely (intel_context_prepare_remote_request),
>   	 * which inserts foreign fences into intel_context.active, does not
>   	 * clobber the idle-barrier.
> +	 *
> +	 * In GuC submission mode we don't use idle barriers.
>   	 */
> +	if (intel_engine_uses_guc(engine))
> +		return 0;
>   
>   	if (intel_engine_pm_is_awake(engine)) {
>   		pr_err("%s is awake before starting %s!\n",
> diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c
> index b1aa1c482c32..9a264898bb91 100644
> --- a/drivers/gpu/drm/i915/i915_active.c
> +++ b/drivers/gpu/drm/i915/i915_active.c
> @@ -968,6 +968,9 @@ void i915_active_acquire_barrier(struct i915_active *ref)
>   
>   	GEM_BUG_ON(i915_active_is_idle(ref));
>   
> +	if (llist_empty(&ref->preallocated_barriers))
> +		return;

This hunk is not needed since the effectively same check is few lines below.

Regards,

Tvrtko

> +
>   	/*
>   	 * Transfer the list of preallocated barriers into the
>   	 * i915_active rbtree, but only as proto-nodes. They will be
>
diff mbox series

Patch

diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
index 1499b8aace2a..7f97753ab164 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -80,7 +80,7 @@  static int intel_context_active_acquire(struct intel_context *ce)
 
 	__i915_active_acquire(&ce->active);
 
-	if (intel_context_is_barrier(ce))
+	if (intel_context_is_barrier(ce) || intel_engine_uses_guc(ce->engine))
 		return 0;
 
 	/* Preallocate tracking nodes */
diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h
index 92ecbab8c1cd..9b211ca5ecc7 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.h
+++ b/drivers/gpu/drm/i915/gt/intel_context.h
@@ -16,6 +16,7 @@ 
 #include "intel_engine_types.h"
 #include "intel_ring_types.h"
 #include "intel_timeline_types.h"
+#include "uc/intel_guc_submission.h"
 
 #define CE_TRACE(ce, fmt, ...) do {					\
 	const struct intel_context *ce__ = (ce);			\
diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c b/drivers/gpu/drm/i915/gt/selftest_context.c
index 26685b927169..fa7b99a671dd 100644
--- a/drivers/gpu/drm/i915/gt/selftest_context.c
+++ b/drivers/gpu/drm/i915/gt/selftest_context.c
@@ -209,7 +209,13 @@  static int __live_active_context(struct intel_engine_cs *engine)
 	 * This test makes sure that the context is kept alive until a
 	 * subsequent idle-barrier (emitted when the engine wakeref hits 0
 	 * with no more outstanding requests).
+	 *
+	 * In GuC submission mode we don't use idle barriers and we instead
+	 * get a message from the GuC to signal that it is safe to unpin the
+	 * context from memory.
 	 */
+	if (intel_engine_uses_guc(engine))
+		return 0;
 
 	if (intel_engine_pm_is_awake(engine)) {
 		pr_err("%s is awake before starting %s!\n",
@@ -357,7 +363,11 @@  static int __live_remote_context(struct intel_engine_cs *engine)
 	 * on the context image remotely (intel_context_prepare_remote_request),
 	 * which inserts foreign fences into intel_context.active, does not
 	 * clobber the idle-barrier.
+	 *
+	 * In GuC submission mode we don't use idle barriers.
 	 */
+	if (intel_engine_uses_guc(engine))
+		return 0;
 
 	if (intel_engine_pm_is_awake(engine)) {
 		pr_err("%s is awake before starting %s!\n",
diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c
index b1aa1c482c32..9a264898bb91 100644
--- a/drivers/gpu/drm/i915/i915_active.c
+++ b/drivers/gpu/drm/i915/i915_active.c
@@ -968,6 +968,9 @@  void i915_active_acquire_barrier(struct i915_active *ref)
 
 	GEM_BUG_ON(i915_active_is_idle(ref));
 
+	if (llist_empty(&ref->preallocated_barriers))
+		return;
+
 	/*
 	 * Transfer the list of preallocated barriers into the
 	 * i915_active rbtree, but only as proto-nodes. They will be