diff mbox

[v5,09/35] drm/i915: Force MMIO flips when scheduler enabled

Message ID 1455805644-6450-10-git-send-email-John.C.Harrison@Intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

John Harrison Feb. 18, 2016, 2:26 p.m. UTC
From: John Harrison <John.C.Harrison@Intel.com>

MMIO flips are the preferred mechanism now but more importantly, pipe
based flips cause issues for the scheduler. Specifically, submitting
work to the rings around the side of the scheduler could cause that
work to be lost if the scheduler generates a pre-emption event on that
ring.

For: VIZ-1587
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
---
 drivers/gpu/drm/i915/intel_display.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Jesse Barnes Feb. 19, 2016, 7:28 p.m. UTC | #1
On 02/18/2016 06:26 AM, John.C.Harrison@Intel.com wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
> 
> MMIO flips are the preferred mechanism now but more importantly, pipe
> based flips cause issues for the scheduler. Specifically, submitting
> work to the rings around the side of the scheduler could cause that
> work to be lost if the scheduler generates a pre-emption event on that
> ring.
> 
> For: VIZ-1587
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> ---
>  drivers/gpu/drm/i915/intel_display.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index 6e12ed7..731d20a 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -46,6 +46,7 @@
>  #include <linux/dma_remapping.h>
>  #include <linux/reservation.h>
>  #include <linux/dma-buf.h>
> +#include "i915_scheduler.h"
>  
>  /* Primary plane formats for gen <= 3 */
>  static const uint32_t i8xx_primary_formats[] = {
> @@ -11330,6 +11331,8 @@ static bool use_mmio_flip(struct intel_engine_cs *ring,
>  		return true;
>  	else if (i915.enable_execlists)
>  		return true;
> +	else if (i915_scheduler_is_enabled(ring->dev))
> +		return true;
>  	else if (obj->base.dma_buf &&
>  		 !reservation_object_test_signaled_rcu(obj->base.dma_buf->resv,
>  						       false))
> 

Why can't we use mmio flips unconditionally?  Maarten or Ville?

Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Ville Syrjälä Feb. 19, 2016, 7:53 p.m. UTC | #2
On Fri, Feb 19, 2016 at 11:28:05AM -0800, Jesse Barnes wrote:
> On 02/18/2016 06:26 AM, John.C.Harrison@Intel.com wrote:
> > From: John Harrison <John.C.Harrison@Intel.com>
> > 
> > MMIO flips are the preferred mechanism now but more importantly, pipe
> > based flips cause issues for the scheduler. Specifically, submitting
> > work to the rings around the side of the scheduler could cause that
> > work to be lost if the scheduler generates a pre-emption event on that
> > ring.
> > 
> > For: VIZ-1587
> > Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> > ---
> >  drivers/gpu/drm/i915/intel_display.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> > index 6e12ed7..731d20a 100644
> > --- a/drivers/gpu/drm/i915/intel_display.c
> > +++ b/drivers/gpu/drm/i915/intel_display.c
> > @@ -46,6 +46,7 @@
> >  #include <linux/dma_remapping.h>
> >  #include <linux/reservation.h>
> >  #include <linux/dma-buf.h>
> > +#include "i915_scheduler.h"
> >  
> >  /* Primary plane formats for gen <= 3 */
> >  static const uint32_t i8xx_primary_formats[] = {
> > @@ -11330,6 +11331,8 @@ static bool use_mmio_flip(struct intel_engine_cs *ring,
> >  		return true;
> >  	else if (i915.enable_execlists)
> >  		return true;
> > +	else if (i915_scheduler_is_enabled(ring->dev))
> > +		return true;
> >  	else if (obj->base.dma_buf &&
> >  		 !reservation_object_test_signaled_rcu(obj->base.dma_buf->resv,
> >  						       false))
> > 
> 
> Why can't we use mmio flips unconditionally?  Maarten or Ville?

We do when execlists are used, which is always on gen9+. So I guess I'm
missing the point of this patch. For gen5+ we could also do it trivially.

For older platforms it'd require a bit of work since we'd need to
complete the flips from the vblank interrupt. Well, we actually do
that already even with CS flips on those platforms, but we do look
at the flip pending interrupt to figure out if CS already issued
the flip or not. So that part would need changing.

I also think we should switch to using the vblank interrupt for this
stuff on all platforms, mainly since the flip done interrupt is somewhat
broken on at least BDW (no idea if it got fixed in SKL or later), and
doing things in more than one way certainly won't decrease our bug count.
Jesse Barnes Feb. 19, 2016, 8:01 p.m. UTC | #3
On 02/19/2016 11:53 AM, Ville Syrjälä wrote:
> On Fri, Feb 19, 2016 at 11:28:05AM -0800, Jesse Barnes wrote:
>> On 02/18/2016 06:26 AM, John.C.Harrison@Intel.com wrote:
>>> From: John Harrison <John.C.Harrison@Intel.com>
>>>
>>> MMIO flips are the preferred mechanism now but more importantly, pipe
>>> based flips cause issues for the scheduler. Specifically, submitting
>>> work to the rings around the side of the scheduler could cause that
>>> work to be lost if the scheduler generates a pre-emption event on that
>>> ring.
>>>
>>> For: VIZ-1587
>>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
>>> ---
>>>  drivers/gpu/drm/i915/intel_display.c | 3 +++
>>>  1 file changed, 3 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
>>> index 6e12ed7..731d20a 100644
>>> --- a/drivers/gpu/drm/i915/intel_display.c
>>> +++ b/drivers/gpu/drm/i915/intel_display.c
>>> @@ -46,6 +46,7 @@
>>>  #include <linux/dma_remapping.h>
>>>  #include <linux/reservation.h>
>>>  #include <linux/dma-buf.h>
>>> +#include "i915_scheduler.h"
>>>  
>>>  /* Primary plane formats for gen <= 3 */
>>>  static const uint32_t i8xx_primary_formats[] = {
>>> @@ -11330,6 +11331,8 @@ static bool use_mmio_flip(struct intel_engine_cs *ring,
>>>  		return true;
>>>  	else if (i915.enable_execlists)
>>>  		return true;
>>> +	else if (i915_scheduler_is_enabled(ring->dev))
>>> +		return true;
>>>  	else if (obj->base.dma_buf &&
>>>  		 !reservation_object_test_signaled_rcu(obj->base.dma_buf->resv,
>>>  						       false))
>>>
>>
>> Why can't we use mmio flips unconditionally?  Maarten or Ville?
> 
> We do when execlists are used, which is always on gen9+. So I guess I'm
> missing the point of this patch. For gen5+ we could also do it trivially.

didn't check if the scheduler is also enabled for gen8 (I guess it would be nice, that would cover BDW and BSW).

> 
> For older platforms it'd require a bit of work since we'd need to
> complete the flips from the vblank interrupt. Well, we actually do
> that already even with CS flips on those platforms, but we do look
> at the flip pending interrupt to figure out if CS already issued
> the flip or not. So that part would need changing.
> 
> I also think we should switch to using the vblank interrupt for this
> stuff on all platforms, mainly since the flip done interrupt is somewhat
> broken on at least BDW (no idea if it got fixed in SKL or later), and
> doing things in more than one way certainly won't decrease our bug count.

Yeah that's probably the way to go; I haven't checked the behavior on SKL either.
Chris Wilson Feb. 20, 2016, 9:22 a.m. UTC | #4
On Fri, Feb 19, 2016 at 11:28:05AM -0800, Jesse Barnes wrote:
> On 02/18/2016 06:26 AM, John.C.Harrison@Intel.com wrote:
> > From: John Harrison <John.C.Harrison@Intel.com>
> > 
> > MMIO flips are the preferred mechanism now

Because introducing variable latency in waking up a big core is a good
idea?

> but more importantly, pipe
> > based flips cause issues for the scheduler. Specifically, submitting
> > work to the rings around the side of the scheduler could cause that
> > work to be lost if the scheduler generates a pre-emption event on that
> > ring.

No. That is just incredibily bad design.

> Why can't we use mmio flips unconditionally?  Maarten or Ville?

Why would we want to? CS flips work just fine in execlists and no reason
was ever given as to why they were not enabled, just laziness.
-Chris
Lankhorst, Maarten Feb. 22, 2016, 9:41 a.m. UTC | #5
Hey,


Jesse Barnes schreef op vr 19-02-2016 om 12:01 [-0800]:
> On 02/19/2016 11:53 AM, Ville Syrjälä wrote:

> > On Fri, Feb 19, 2016 at 11:28:05AM -0800, Jesse Barnes wrote:

> >> On 02/18/2016 06:26 AM, John.C.Harrison@Intel.com wrote:

> >>> From: John Harrison <John.C.Harrison@Intel.com>

> >>>

> >>> MMIO flips are the preferred mechanism now but more importantly, pipe

> >>> based flips cause issues for the scheduler. Specifically, submitting

> >>> work to the rings around the side of the scheduler could cause that

> >>> work to be lost if the scheduler generates a pre-emption event on that

> >>> ring.

> >>>

> >>> For: VIZ-1587

> >>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>

> >>> ---

> >>>  drivers/gpu/drm/i915/intel_display.c | 3 +++

> >>>  1 file changed, 3 insertions(+)

> >>>

> >>> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c

> >>> index 6e12ed7..731d20a 100644

> >>> --- a/drivers/gpu/drm/i915/intel_display.c

> >>> +++ b/drivers/gpu/drm/i915/intel_display.c

> >>> @@ -46,6 +46,7 @@

> >>>  #include <linux/dma_remapping.h>

> >>>  #include <linux/reservation.h>

> >>>  #include <linux/dma-buf.h>

> >>> +#include "i915_scheduler.h"

> >>>  

> >>>  /* Primary plane formats for gen <= 3 */

> >>>  static const uint32_t i8xx_primary_formats[] = {

> >>> @@ -11330,6 +11331,8 @@ static bool use_mmio_flip(struct intel_engine_cs *ring,

> >>>  		return true;

> >>>  	else if (i915.enable_execlists)

> >>>  		return true;

> >>> +	else if (i915_scheduler_is_enabled(ring->dev))

> >>> +		return true;

> >>>  	else if (obj->base.dma_buf &&

> >>>  		 !reservation_object_test_signaled_rcu(obj->base.dma_buf->resv,

> >>>  						       false))

> >>>

> >>

> >> Why can't we use mmio flips unconditionally?  Maarten or Ville?

> > 

> > We do when execlists are used, which is always on gen9+. So I guess I'm

> > missing the point of this patch. For gen5+ we could also do it trivially.

> 

> didn't check if the scheduler is also enabled for gen8 (I guess it would be nice, that would cover BDW and BSW).


I have a patch that would enable support for mmio flips on all
platforms, but because of Chris Wilson's objections I still didn't force
mmio flips by default.

> > 

> > For older platforms it'd require a bit of work since we'd need to

> > complete the flips from the vblank interrupt. Well, we actually do

> > that already even with CS flips on those platforms, but we do look

> > at the flip pending interrupt to figure out if CS already issued

> > the flip or not. So that part would need changing.

It was easy to fix in a way similar to that.

> > I also think we should switch to using the vblank interrupt for this

> > stuff on all platforms, mainly since the flip done interrupt is somewhat

> > broken on at least BDW (no idea if it got fixed in SKL or later), and

> > doing things in more than one way certainly won't decrease our bug count.

> 

> Yeah that's probably the way to go; I haven't checked the behavior on SKL either.


---------------------------------------------------------------------
Intel International B.V.
Registered in The Netherlands under number 34098535
Statutory seat: Haarlemmermeer
Registered address: Capronilaan 37, 1119NG Schiphol-Rijk

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.
John Harrison Feb. 22, 2016, 12:53 p.m. UTC | #6
So is the summary that currently MMIO flips do not work on some platforms?

That could be a problem because the scheduler is intended to be enabled 
for everything and thus will be forcing MMIO flips on everything. So 
should I reverse the logic here - only enable the scheduler if MMIO 
flips are already enabled?

Thanks,
John.

On 22/02/2016 09:41, Lankhorst, Maarten wrote:
> Hey,
>
>
> Jesse Barnes schreef op vr 19-02-2016 om 12:01 [-0800]:
>> On 02/19/2016 11:53 AM, Ville Syrjälä wrote:
>>> On Fri, Feb 19, 2016 at 11:28:05AM -0800, Jesse Barnes wrote:
>>>> On 02/18/2016 06:26 AM, John.C.Harrison@Intel.com wrote:
>>>>> From: John Harrison <John.C.Harrison@Intel.com>
>>>>>
>>>>> MMIO flips are the preferred mechanism now but more importantly, pipe
>>>>> based flips cause issues for the scheduler. Specifically, submitting
>>>>> work to the rings around the side of the scheduler could cause that
>>>>> work to be lost if the scheduler generates a pre-emption event on that
>>>>> ring.
>>>>>
>>>>> For: VIZ-1587
>>>>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
>>>>> ---
>>>>>   drivers/gpu/drm/i915/intel_display.c | 3 +++
>>>>>   1 file changed, 3 insertions(+)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
>>>>> index 6e12ed7..731d20a 100644
>>>>> --- a/drivers/gpu/drm/i915/intel_display.c
>>>>> +++ b/drivers/gpu/drm/i915/intel_display.c
>>>>> @@ -46,6 +46,7 @@
>>>>>   #include <linux/dma_remapping.h>
>>>>>   #include <linux/reservation.h>
>>>>>   #include <linux/dma-buf.h>
>>>>> +#include "i915_scheduler.h"
>>>>>   
>>>>>   /* Primary plane formats for gen <= 3 */
>>>>>   static const uint32_t i8xx_primary_formats[] = {
>>>>> @@ -11330,6 +11331,8 @@ static bool use_mmio_flip(struct intel_engine_cs *ring,
>>>>>   		return true;
>>>>>   	else if (i915.enable_execlists)
>>>>>   		return true;
>>>>> +	else if (i915_scheduler_is_enabled(ring->dev))
>>>>> +		return true;
>>>>>   	else if (obj->base.dma_buf &&
>>>>>   		 !reservation_object_test_signaled_rcu(obj->base.dma_buf->resv,
>>>>>   						       false))
>>>>>
>>>> Why can't we use mmio flips unconditionally?  Maarten or Ville?
>>> We do when execlists are used, which is always on gen9+. So I guess I'm
>>> missing the point of this patch. For gen5+ we could also do it trivially.
>> didn't check if the scheduler is also enabled for gen8 (I guess it would be nice, that would cover BDW and BSW).
> I have a patch that would enable support for mmio flips on all
> platforms, but because of Chris Wilson's objections I still didn't force
> mmio flips by default.
>
>>> For older platforms it'd require a bit of work since we'd need to
>>> complete the flips from the vblank interrupt. Well, we actually do
>>> that already even with CS flips on those platforms, but we do look
>>> at the flip pending interrupt to figure out if CS already issued
>>> the flip or not. So that part would need changing.
> It was easy to fix in a way similar to that.
>
>>> I also think we should switch to using the vblank interrupt for this
>>> stuff on all platforms, mainly since the flip done interrupt is somewhat
>>> broken on at least BDW (no idea if it got fixed in SKL or later), and
>>> doing things in more than one way certainly won't decrease our bug count.
>> Yeah that's probably the way to go; I haven't checked the behavior on SKL either.
Jesse Barnes Feb. 22, 2016, 8:42 p.m. UTC | #7
On 02/20/2016 01:22 AM, Chris Wilson wrote:
> On Fri, Feb 19, 2016 at 11:28:05AM -0800, Jesse Barnes wrote:
>> On 02/18/2016 06:26 AM, John.C.Harrison@Intel.com wrote:
>>> From: John Harrison <John.C.Harrison@Intel.com>
>>>
>>> MMIO flips are the preferred mechanism now
> 
> Because introducing variable latency in waking up a big core is a good
> idea?

Is the latency variability really that bad?  I'm not too concerned about waking up the core either, I think it's going to be hit up quite a bit in these situations anyway, and vblanks should be driving the rendering as well.

>> but more importantly, pipe
>>> based flips cause issues for the scheduler. Specifically, submitting
>>> work to the rings around the side of the scheduler could cause that
>>> work to be lost if the scheduler generates a pre-emption event on that
>>> ring.
> 
> No. That is just incredibily bad design.
> 
>> Why can't we use mmio flips unconditionally?  Maarten or Ville?
> 
> Why would we want to? CS flips work just fine in execlists and no reason
> was ever given as to why they were not enabled, just laziness.

I'm not saying it can't be done, but I thought with atomic we decided to go with mmio because it makes things a lot simpler (iirc Ville's design did that way back?).

I'm fine with either, but it seems like we should settle on one rather than trying to maintain two different ways of flipping.  We'll have work to do either way.

Jesse
Chris Wilson Feb. 23, 2016, 11:16 a.m. UTC | #8
On Mon, Feb 22, 2016 at 12:42:48PM -0800, Jesse Barnes wrote:
> On 02/20/2016 01:22 AM, Chris Wilson wrote:
> > On Fri, Feb 19, 2016 at 11:28:05AM -0800, Jesse Barnes wrote:
> >> On 02/18/2016 06:26 AM, John.C.Harrison@Intel.com wrote:
> >>> From: John Harrison <John.C.Harrison@Intel.com>
> >>>
> >>> MMIO flips are the preferred mechanism now
> > 
> > Because introducing variable latency in waking up a big core is a good
> > idea?
> 
> Is the latency variability really that bad?  I'm not too concerned about waking up the core either, I think it's going to be hit up quite a bit in these situations anyway, and vblanks should be driving the rendering as well.

The workqueue we use is subject to being preempted by the user, so the
latency can be made arbitrarily large. Also, we don't control what tasks
are ahead of the flip.
 
> >> but more importantly, pipe
> >>> based flips cause issues for the scheduler. Specifically, submitting
> >>> work to the rings around the side of the scheduler could cause that
> >>> work to be lost if the scheduler generates a pre-emption event on that
> >>> ring.
> > 
> > No. That is just incredibily bad design.
> > 
> >> Why can't we use mmio flips unconditionally?  Maarten or Ville?
> > 
> > Why would we want to? CS flips work just fine in execlists and no reason
> > was ever given as to why they were not enabled, just laziness.
> 
> I'm not saying it can't be done, but I thought with atomic we decided to go with mmio because it makes things a lot simpler (iirc Ville's design did that way back?).

My point is that the interface to the scheduler excludes even the
possibility of doing so because it is not cleanly segrated into a tool
for determining the next request to execute but instead subsumes half of
GEM and half of the engine backend becoming another unwanted midlayer
incorporating its own policy. The scheduler doesn't even deal in
requests, which is what prevents us from constructing a request here
with the known dependency and passing it to the scheduler for execution.

As I saw it, CS flips would be one of the best demonstrations for a
scheduler - because to execute the flip within the deadline should
cause a priority boost for all the prerequisites, and then ideally
reduce the number of missed target msc under load. (Though we can
equally promote the last_write_request through another scheduler
interface.)
-Chris
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 6e12ed7..731d20a 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -46,6 +46,7 @@ 
 #include <linux/dma_remapping.h>
 #include <linux/reservation.h>
 #include <linux/dma-buf.h>
+#include "i915_scheduler.h"
 
 /* Primary plane formats for gen <= 3 */
 static const uint32_t i8xx_primary_formats[] = {
@@ -11330,6 +11331,8 @@  static bool use_mmio_flip(struct intel_engine_cs *ring,
 		return true;
 	else if (i915.enable_execlists)
 		return true;
+	else if (i915_scheduler_is_enabled(ring->dev))
+		return true;
 	else if (obj->base.dma_buf &&
 		 !reservation_object_test_signaled_rcu(obj->base.dma_buf->resv,
 						       false))