[RFC,39/97] drm/i915/guc: Increase size of CTB buffers

Message ID	20210506191451.77768-40-matthew.brost@intel.com (mailing list archive)
State	New, archived
Headers	From: Matthew Brost <matthew.brost@intel.com>
To: <intel-gfx@lists.freedesktop.org>, <dri-devel@lists.freedesktop.org>
Cc: matthew.brost@intel.com, tvrtko.ursulin@intel.com, daniele.ceraolospurio@intel.com, jason.ekstrand@intel.com, jon.bloomfield@intel.com, daniel.vetter@intel.com, john.c.harrison@intel.com
Subject: [RFC PATCH 39/97] drm/i915/guc: Increase size of CTB buffers
Date: Thu, 6 May 2021 12:13:53 -0700
In-Reply-To: <20210506191451.77768-1-matthew.brost@intel.com>
Series	Basic GuC submission support in the i915

Matthew Brost May 6, 2021, 7:13 p.m. UTC

With the introduction of non-blocking CTBs more than one CTB can be in
flight at a time. Increasing the size of the CTBs should reduce how
often software hits the case where no space is available in the CTB
buffer.

Cc: John Harrison <john.c.harrison@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

Michal Wajdeczko May 24, 2021, 1:43 p.m. UTC | #1

On 06.05.2021 21:13, Matthew Brost wrote:
> With the introduction of non-blocking CTBs more than one CTB can be in
> flight at a time. Increasing the size of the CTBs should reduce how
> often software hits the case where no space is available in the CTB
> buffer.
> 
> Cc: John Harrison <john.c.harrison@intel.com>
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> ---
>  drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> index 77dfbc94dcc3..d6895d29ed2d 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> @@ -63,11 +63,16 @@ static inline struct drm_device *ct_to_drm(struct intel_guc_ct *ct)
>   *      +--------+-----------------------------------------------+------+
>   *
>   * Size of each `CT Buffer`_ must be multiple of 4K.
> - * As we don't expect too many messages, for now use minimum sizes.
> + * We don't expect too many messages in flight at any time, unless we are
> + * using the GuC submission. In that case each request requires a minimum
> + * 16 bytes which gives us a maximum 256 queue'd requests. Hopefully this

nit: all our CTB calculations are in dwords now, not bytes

> + * enough space to avoid backpressure on the driver. We increase the size
> + * of the receive buffer (relative to the send) to ensure a G2H response
> + * CTB has a landing spot.

Hmm, but we are not checking the G2H CTB yet; we only start doing that
around patch 54/97, so maybe that patch should be introduced earlier?

>   */
>  #define CTB_DESC_SIZE		ALIGN(sizeof(struct guc_ct_buffer_desc), SZ_2K)
>  #define CTB_H2G_BUFFER_SIZE	(SZ_4K)
> -#define CTB_G2H_BUFFER_SIZE	(SZ_4K)
> +#define CTB_G2H_BUFFER_SIZE	(4 * CTB_H2G_BUFFER_SIZE)

In theory, we (the host) should be faster than the GuC, so the G2H CTB
should almost always be empty. If this is not the case, maybe we should
start monitoring what is happening and report a warning if G2H is half full?
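
The half-full check suggested above could look roughly like the following.
This is a minimal sketch, not driver code: it assumes head, tail and size are
all expressed in dwords and reuses the wrap handling that ct_read() applies to
"available". The helper names ctb_used_dw() and ctb_over_half() are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Dwords written by the producer but not yet consumed.
 * Mirrors the "available" computation in ct_read(), including the
 * buffer wrap case where tail has wrapped around behind head.
 */
static uint32_t ctb_used_dw(uint32_t head, uint32_t tail, uint32_t size)
{
	int32_t available = (int32_t)tail - (int32_t)head;

	/* Both indices must lie inside the buffer. */
	assert(head < size && tail < size);

	/* beware of buffer wrap case */
	if (available < 0)
		available += (int32_t)size;

	return (uint32_t)available;
}

/* True when more than half of the buffer is waiting to be read. */
static bool ctb_over_half(uint32_t head, uint32_t tail, uint32_t size)
{
	return ctb_used_dw(head, tail, size) > size / 2;
}
```

A caller could evaluate ctb_over_half() on the G2H descriptor each time it
reads a message and emit a rate-limited warning when it returns true; whether
that is worth the noise is a separate question.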

>  
>  #define MAX_US_STALL_CTB	1000000
>  
> @@ -753,7 +758,7 @@ static int ct_read(struct intel_guc_ct *ct, struct ct_incoming_msg **msg)
>  	/* beware of buffer wrap case */
>  	if (unlikely(available < 0))
>  		available += size;
> -	CT_DEBUG(ct, "available %d (%u:%u)\n", available, head, tail);
> +	CT_DEBUG(ct, "available %d (%u:%u:%u)\n", available, head, tail, size);
>  	GEM_BUG_ON(available < 0);
>  
>  	header = cmds[head];
>

Matthew Brost May 24, 2021, 6:40 p.m. UTC | #2

On Mon, May 24, 2021 at 03:43:11PM +0200, Michal Wajdeczko wrote:
> 
> 
> On 06.05.2021 21:13, Matthew Brost wrote:
> > With the introduction of non-blocking CTBs more than one CTB can be in
> > flight at a time. Increasing the size of the CTBs should reduce how
> > often software hits the case where no space is available in the CTB
> > buffer.
> > 
> > Cc: John Harrison <john.c.harrison@intel.com>
> > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > ---
> >  drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 11 ++++++++---
> >  1 file changed, 8 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> > index 77dfbc94dcc3..d6895d29ed2d 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> > @@ -63,11 +63,16 @@ static inline struct drm_device *ct_to_drm(struct intel_guc_ct *ct)
> >   *      +--------+-----------------------------------------------+------+
> >   *
> >   * Size of each `CT Buffer`_ must be multiple of 4K.
> > - * As we don't expect too many messages, for now use minimum sizes.
> > + * We don't expect too many messages in flight at any time, unless we are
> > + * using the GuC submission. In that case each request requires a minimum
> > + * 16 bytes which gives us a maximum 256 queue'd requests. Hopefully this
> 
> nit: all our CTB calculations are in dwords now, not bytes
> 

I can change the wording to DW sizes.
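
Redoing the comment's arithmetic in dwords: a 4K buffer holds 1024 dwords, and
at a minimum of 4 dwords (16 bytes) per request that is at most 256 requests.
A minimal sketch of that calculation, assuming 4-byte dwords; DW_BYTES,
MIN_REQUEST_DW and the helper names are illustrative and not taken from the
driver:

```c
#include <assert.h>
#include <stdint.h>

#define SZ_4K		4096u
#define DW_BYTES	4u	/* one dword is 4 bytes */
#define MIN_REQUEST_DW	4u	/* hypothetical name: 16 bytes per the patch comment */

/* Buffer capacity in dwords for a size given in bytes. */
static uint32_t ctb_size_dw(uint32_t size_bytes)
{
	/* CT buffers must be a multiple of 4K, so this divides evenly. */
	assert(size_bytes % SZ_4K == 0);
	return size_bytes / DW_BYTES;
}

/* Upper bound on minimum-sized requests that fit in the buffer. */
static uint32_t ctb_max_requests(uint32_t size_bytes)
{
	return ctb_size_dw(size_bytes) / MIN_REQUEST_DW;
}
```

With these helpers, ctb_max_requests(SZ_4K) evaluates to 256 for the H2G size
and ctb_max_requests(4 * SZ_4K) to 1024 for the enlarged G2H size. A real ring
usually cannot use every slot, so these are upper bounds.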

> > + * enough space to avoid backpressure on the driver. We increase the size
> > + * of the receive buffer (relative to the send) to ensure a G2H response
> > + * CTB has a landing spot.
> 
> Hmm, but we are not checking the G2H CTB yet; we only start doing that
> around patch 54/97, so maybe that patch should be introduced earlier?
>

Yes, that patch is going to be pulled down to an earlier spot in the
series.
 
> >   */
> >  #define CTB_DESC_SIZE		ALIGN(sizeof(struct guc_ct_buffer_desc), SZ_2K)
> >  #define CTB_H2G_BUFFER_SIZE	(SZ_4K)
> > -#define CTB_G2H_BUFFER_SIZE	(SZ_4K)
> > +#define CTB_G2H_BUFFER_SIZE	(4 * CTB_H2G_BUFFER_SIZE)
> 
> In theory, we (the host) should be faster than the GuC, so the G2H CTB
> should almost always be empty. If this is not the case, maybe we should
> start monitoring what is happening and report a warning if G2H is half full?
>

I think some IGTs certainly put more pressure on the G2H channel than on
the H2G channel. This is something we can tune over time after this lands
upstream. IMO a message at this point is overkill.

Matt
 
> >  
> >  #define MAX_US_STALL_CTB	1000000
> >  
> > @@ -753,7 +758,7 @@ static int ct_read(struct intel_guc_ct *ct, struct ct_incoming_msg **msg)
> >  	/* beware of buffer wrap case */
> >  	if (unlikely(available < 0))
> >  		available += size;
> > -	CT_DEBUG(ct, "available %d (%u:%u)\n", available, head, tail);
> > +	CT_DEBUG(ct, "available %d (%u:%u:%u)\n", available, head, tail, size);
> >  	GEM_BUG_ON(available < 0);
> >  
> >  	header = cmds[head];
> >

Tvrtko Ursulin May 25, 2021, 9:24 a.m. UTC | #3

On 06/05/2021 20:13, Matthew Brost wrote:
> With the introduction of non-blocking CTBs more than one CTB can be in
> flight at a time. Increasing the size of the CTBs should reduce how
> often software hits the case where no space is available in the CTB
> buffer.

I'd move this before the patch which adds the non-blocking send since 
that one claims congestion should be rare with properly sized buffers. 
So it makes sense to have them sized properly back before that one.

Regards,

Tvrtko

> Cc: John Harrison <john.c.harrison@intel.com>
> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> ---
>   drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 11 ++++++++---
>   1 file changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> index 77dfbc94dcc3..d6895d29ed2d 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> @@ -63,11 +63,16 @@ static inline struct drm_device *ct_to_drm(struct intel_guc_ct *ct)
>    *      +--------+-----------------------------------------------+------+
>    *
>    * Size of each `CT Buffer`_ must be multiple of 4K.
> - * As we don't expect too many messages, for now use minimum sizes.
> + * We don't expect too many messages in flight at any time, unless we are
> + * using the GuC submission. In that case each request requires a minimum
> + * 16 bytes which gives us a maximum 256 queue'd requests. Hopefully this
> + * enough space to avoid backpressure on the driver. We increase the size
> + * of the receive buffer (relative to the send) to ensure a G2H response
> + * CTB has a landing spot.
>    */
>   #define CTB_DESC_SIZE		ALIGN(sizeof(struct guc_ct_buffer_desc), SZ_2K)
>   #define CTB_H2G_BUFFER_SIZE	(SZ_4K)
> -#define CTB_G2H_BUFFER_SIZE	(SZ_4K)
> +#define CTB_G2H_BUFFER_SIZE	(4 * CTB_H2G_BUFFER_SIZE)
>   
>   #define MAX_US_STALL_CTB	1000000
>   
> @@ -753,7 +758,7 @@ static int ct_read(struct intel_guc_ct *ct, struct ct_incoming_msg **msg)
>   	/* beware of buffer wrap case */
>   	if (unlikely(available < 0))
>   		available += size;
> -	CT_DEBUG(ct, "available %d (%u:%u)\n", available, head, tail);
> +	CT_DEBUG(ct, "available %d (%u:%u:%u)\n", available, head, tail, size);
>   	GEM_BUG_ON(available < 0);
>   
>   	header = cmds[head];
>

Matthew Brost May 25, 2021, 5:15 p.m. UTC | #4

On Tue, May 25, 2021 at 10:24:09AM +0100, Tvrtko Ursulin wrote:
> 
> On 06/05/2021 20:13, Matthew Brost wrote:
> > With the introduction of non-blocking CTBs more than one CTB can be in
> > flight at a time. Increasing the size of the CTBs should reduce how
> > often software hits the case where no space is available in the CTB
> > buffer.
> 
> I'd move this before the patch which adds the non-blocking send since that
> one claims congestion should be rare with properly sized buffers. So it
> makes sense to have them sized properly back before that one.
>

IMO patch ordering is a bit of a bikeshed. All of the CTB changes required
for GuC submission (34-40, 54) will get posted as their own series and
merged together. None of the individual patches breaks anything, nor is any
of this code really used until GuC submission is turned on. I can move
this when I post these patches by themselves, but I just don't really see
the point either way.

Matt
 
> Regards,
> 
> Tvrtko
> 
> > Cc: John Harrison <john.c.harrison@intel.com>
> > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > ---
> >   drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 11 ++++++++---
> >   1 file changed, 8 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> > index 77dfbc94dcc3..d6895d29ed2d 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> > @@ -63,11 +63,16 @@ static inline struct drm_device *ct_to_drm(struct intel_guc_ct *ct)
> >    *      +--------+-----------------------------------------------+------+
> >    *
> >    * Size of each `CT Buffer`_ must be multiple of 4K.
> > - * As we don't expect too many messages, for now use minimum sizes.
> > + * We don't expect too many messages in flight at any time, unless we are
> > + * using the GuC submission. In that case each request requires a minimum
> > + * 16 bytes which gives us a maximum 256 queue'd requests. Hopefully this
> > + * enough space to avoid backpressure on the driver. We increase the size
> > + * of the receive buffer (relative to the send) to ensure a G2H response
> > + * CTB has a landing spot.
> >    */
> >   #define CTB_DESC_SIZE		ALIGN(sizeof(struct guc_ct_buffer_desc), SZ_2K)
> >   #define CTB_H2G_BUFFER_SIZE	(SZ_4K)
> > -#define CTB_G2H_BUFFER_SIZE	(SZ_4K)
> > +#define CTB_G2H_BUFFER_SIZE	(4 * CTB_H2G_BUFFER_SIZE)
> >   #define MAX_US_STALL_CTB	1000000
> > @@ -753,7 +758,7 @@ static int ct_read(struct intel_guc_ct *ct, struct ct_incoming_msg **msg)
> >   	/* beware of buffer wrap case */
> >   	if (unlikely(available < 0))
> >   		available += size;
> > -	CT_DEBUG(ct, "available %d (%u:%u)\n", available, head, tail);
> > +	CT_DEBUG(ct, "available %d (%u:%u:%u)\n", available, head, tail, size);
> >   	GEM_BUG_ON(available < 0);
> >   	header = cmds[head];
> >

Tvrtko Ursulin May 26, 2021, 9:30 a.m. UTC | #5

On 25/05/2021 18:15, Matthew Brost wrote:
> On Tue, May 25, 2021 at 10:24:09AM +0100, Tvrtko Ursulin wrote:
>>
>> On 06/05/2021 20:13, Matthew Brost wrote:
>>> With the introduction of non-blocking CTBs more than one CTB can be in
>>> flight at a time. Increasing the size of the CTBs should reduce how
>>> often software hits the case where no space is available in the CTB
>>> buffer.
>>
>> I'd move this before the patch which adds the non-blocking send since that
>> one claims congestion should be rare with properly sized buffers. So it
>> makes sense to have them sized properly back before that one.
>>
> 
> IMO patch ordering is a bit of a bikeshed. All of the CTB changes required
> for GuC submission (34-40, 54) will get posted as their own series and
> merged together. None of the individual patches breaks anything, nor is any
> of this code really used until GuC submission is turned on. I can move
> this when I post these patches by themselves, but I just don't really see
> the point either way.

As a general principle we do try to have work in the order which makes 
sense functionality wise.

That includes trying to avoid adding and then removing, or changing a 
lot, the same code within the series. And also adding functionality 
which is known to not work completely well until later in the series.

With a master switch at the end of series you can sometimes get away 
with it, but if nothing else it at least makes it much easier to read if 
things are flowing in the expected way within (the series).

In this particular example sizing the buffers appropriately before 
starting to use the facility a lot more certainly sounds like a no 
brainer to me, especially since the patch is so trivial to move conflict 
wise.

Regards,

Tvrtko

> Matt
>   
>> Regards,
>>
>> Tvrtko
>>
>>> Cc: John Harrison <john.c.harrison@intel.com>
>>> Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>>> ---
>>>    drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 11 ++++++++---
>>>    1 file changed, 8 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
>>> index 77dfbc94dcc3..d6895d29ed2d 100644
>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
>>> @@ -63,11 +63,16 @@ static inline struct drm_device *ct_to_drm(struct intel_guc_ct *ct)
>>>     *      +--------+-----------------------------------------------+------+
>>>     *
>>>     * Size of each `CT Buffer`_ must be multiple of 4K.
>>> - * As we don't expect too many messages, for now use minimum sizes.
>>> + * We don't expect too many messages in flight at any time, unless we are
>>> + * using the GuC submission. In that case each request requires a minimum
>>> + * 16 bytes which gives us a maximum 256 queue'd requests. Hopefully this
>>> + * enough space to avoid backpressure on the driver. We increase the size
>>> + * of the receive buffer (relative to the send) to ensure a G2H response
>>> + * CTB has a landing spot.
>>>     */
>>>    #define CTB_DESC_SIZE		ALIGN(sizeof(struct guc_ct_buffer_desc), SZ_2K)
>>>    #define CTB_H2G_BUFFER_SIZE	(SZ_4K)
>>> -#define CTB_G2H_BUFFER_SIZE	(SZ_4K)
>>> +#define CTB_G2H_BUFFER_SIZE	(4 * CTB_H2G_BUFFER_SIZE)
>>>    #define MAX_US_STALL_CTB	1000000
>>> @@ -753,7 +758,7 @@ static int ct_read(struct intel_guc_ct *ct, struct ct_incoming_msg **msg)
>>>    	/* beware of buffer wrap case */
>>>    	if (unlikely(available < 0))
>>>    		available += size;
>>> -	CT_DEBUG(ct, "available %d (%u:%u)\n", available, head, tail);
>>> +	CT_DEBUG(ct, "available %d (%u:%u:%u)\n", available, head, tail, size);
>>>    	GEM_BUG_ON(available < 0);
>>>    	header = cmds[head];
>>>

Matthew Brost May 26, 2021, 6:20 p.m. UTC | #6

On Wed, May 26, 2021 at 10:30:27AM +0100, Tvrtko Ursulin wrote:
> 
> On 25/05/2021 18:15, Matthew Brost wrote:
> > On Tue, May 25, 2021 at 10:24:09AM +0100, Tvrtko Ursulin wrote:
> > > 
> > > On 06/05/2021 20:13, Matthew Brost wrote:
> > > > With the introduction of non-blocking CTBs more than one CTB can be in
> > > > flight at a time. Increasing the size of the CTBs should reduce how
> > > > often software hits the case where no space is available in the CTB
> > > > buffer.
> > > 
> > > I'd move this before the patch which adds the non-blocking send since that
> > > one claims congestion should be rare with properly sized buffers. So it
> > > makes sense to have them sized properly back before that one.
> > > 
> > 
> > IMO patch ordering is a bit of a bikeshed. All of the CTB changes required
> > for GuC submission (34-40, 54) will get posted as their own series and
> > merged together. None of the individual patches breaks anything, nor is any
> > of this code really used until GuC submission is turned on. I can move
> > this when I post these patches by themselves, but I just don't really see
> > the point either way.
> 
> As a general principle we do try to have work in the order which makes sense
> functionality wise.
> 
> That includes trying to avoid adding and then removing, or changing a lot,
> the same code within the series. And also adding functionality which is
> known to not work completely well until later in the series.
> 
> With a master switch at the end of series you can sometimes get away with
> it, but if nothing else it at least makes it much easier to read if things
> are flowing in the expected way within (the series).
> 
> In this particular example sizing the buffers appropriately before starting
> to use the facility a lot more certainly sounds like a no brainer to me,
> especially since the patch is so trivial to move conflict wise.
> 

Fair enough. I'll reorder these patches when I do a post to merge these
ones.

Matt

> Regards,
> 
> Tvrtko
> 
> > Matt
> > > Regards,
> > > 
> > > Tvrtko
> > > 
> > > > Cc: John Harrison <john.c.harrison@intel.com>
> > > > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > > > ---
> > > >    drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 11 ++++++++---
> > > >    1 file changed, 8 insertions(+), 3 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> > > > index 77dfbc94dcc3..d6895d29ed2d 100644
> > > > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> > > > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> > > > @@ -63,11 +63,16 @@ static inline struct drm_device *ct_to_drm(struct intel_guc_ct *ct)
> > > >     *      +--------+-----------------------------------------------+------+
> > > >     *
> > > >     * Size of each `CT Buffer`_ must be multiple of 4K.
> > > > - * As we don't expect too many messages, for now use minimum sizes.
> > > > + * We don't expect too many messages in flight at any time, unless we are
> > > > + * using the GuC submission. In that case each request requires a minimum
> > > > + * 16 bytes which gives us a maximum 256 queue'd requests. Hopefully this
> > > > + * enough space to avoid backpressure on the driver. We increase the size
> > > > + * of the receive buffer (relative to the send) to ensure a G2H response
> > > > + * CTB has a landing spot.
> > > >     */
> > > >    #define CTB_DESC_SIZE		ALIGN(sizeof(struct guc_ct_buffer_desc), SZ_2K)
> > > >    #define CTB_H2G_BUFFER_SIZE	(SZ_4K)
> > > > -#define CTB_G2H_BUFFER_SIZE	(SZ_4K)
> > > > +#define CTB_G2H_BUFFER_SIZE	(4 * CTB_H2G_BUFFER_SIZE)
> > > >    #define MAX_US_STALL_CTB	1000000
> > > > @@ -753,7 +758,7 @@ static int ct_read(struct intel_guc_ct *ct, struct ct_incoming_msg **msg)
> > > >    	/* beware of buffer wrap case */
> > > >    	if (unlikely(available < 0))
> > > >    		available += size;
> > > > -	CT_DEBUG(ct, "available %d (%u:%u)\n", available, head, tail);
> > > > +	CT_DEBUG(ct, "available %d (%u:%u:%u)\n", available, head, tail, size);
> > > >    	GEM_BUG_ON(available < 0);
> > > >    	header = cmds[head];
> > > >
