diff mbox

[v2] drm/i915: Fallback to using unmappable memory for scanout

Message ID 1426764580-26197-1-git-send-email-chris@chris-wilson.co.uk (mailing list archive)
State New, archived
Headers show

Commit Message

Chris Wilson March 19, 2015, 11:29 a.m. UTC
The existing ABI says that scanouts are pinned into the mappable region
so that legacy clients (e.g. old Xorg or plymouthd) can write directly
into the scanout through a GTT mapping. However if the surface does not
fit into the mappable region, we are better off just trying to fit it
anywhere and hoping for the best. (Any userspace that is cappable of
using ginormous scanouts is also likely not to rely on pure GTT
updates.) In the future, there may even be a kernel mediated method for
the legacy clients.

v2: Skip fence pinning when not mappable.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Satyanantha, Rama Gopal M <rama.gopal.m.satyanantha@intel.com>
Cc: Deepak S <deepak.s@linux.intel.com>
Cc: Damien Lespiau <damien.lespiau@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_gem.c      |  7 ++++++-
 drivers/gpu/drm/i915/intel_display.c | 23 +++++++++++++----------
 2 files changed, 19 insertions(+), 11 deletions(-)

Comments

deepak.s@linux.intel.com March 19, 2015, 1:01 p.m. UTC | #1
On Thursday 19 March 2015 04:59 PM, Chris Wilson wrote:
> The existing ABI says that scanouts are pinned into the mappable region
> so that legacy clients (e.g. old Xorg or plymouthd) can write directly
> into the scanout through a GTT mapping. However if the surface does not
> fit into the mappable region, we are better off just trying to fit it
> anywhere and hoping for the best. (Any userspace that is cappable of
> using ginormous scanouts is also likely not to rely on pure GTT
> updates.) In the future, there may even be a kernel mediated method for
> the legacy clients.
>
> v2: Skip fence pinning when not mappable.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Satyanantha, Rama Gopal M <rama.gopal.m.satyanantha@intel.com>
> Cc: Deepak S <deepak.s@linux.intel.com>
> Cc: Damien Lespiau <damien.lespiau@intel.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> ---
>   drivers/gpu/drm/i915/i915_gem.c      |  7 ++++++-
>   drivers/gpu/drm/i915/intel_display.c | 23 +++++++++++++----------
>   2 files changed, 19 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 9e498e0bbf22..9a1de848e450 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -4034,10 +4034,15 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
>   
>   	/* As the user may map the buffer once pinned in the display plane
>   	 * (e.g. libkms for the bootup splash), we have to ensure that we
> -	 * always use map_and_fenceable for all scanout buffers.
> +	 * always use map_and_fenceable for all scanout buffers. However,
> +	 * it may simply be too big to fit into mappable, in which case
> +	 * put it anyway and hope that userspace can cope (but always first
> +	 * try to preserve the existing ABI).
>   	 */
>   	ret = i915_gem_obj_ggtt_pin(obj, alignment, PIN_MAPPABLE);
>   	if (ret)
> +		ret = i915_gem_obj_ggtt_pin(obj, alignment, 0);
> +	if (ret)
>   		goto err_unpin_display;
>   
>   	i915_gem_object_flush_cpu_write_domain(obj);
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index d621ebecd33e..628aace63b43 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -2308,16 +2308,18 @@ intel_pin_and_fence_fb_obj(struct drm_plane *plane,
>   	if (ret)
>   		goto err_interruptible;
>   
> -	/* Install a fence for tiled scan-out. Pre-i965 always needs a
> -	 * fence, whereas 965+ only requires a fence if using
> -	 * framebuffer compression.  For simplicity, we always install
> -	 * a fence as the cost is not that onerous.
> -	 */
> -	ret = i915_gem_object_get_fence(obj);
> -	if (ret)
> -		goto err_unpin;
> +	if (obj->map_and_fenceable) {
> +		/* Install a fence for tiled scan-out. Pre-i965 always needs a
> +		 * fence, whereas 965+ only requires a fence if using
> +		 * framebuffer compression.  For simplicity, we always, when
> +		 * possible, install a fence as the cost is not that onerous.
> +		 */
> +		ret = i915_gem_object_get_fence(obj);
> +		if (ret)
> +			goto err_unpin;
>   
> -	i915_gem_object_pin_fence(obj);
> +		i915_gem_object_pin_fence(obj);
> +	}
>   
>   	dev_priv->mm.interruptible = true;
>   	intel_runtime_pm_put(dev_priv);
> @@ -2335,7 +2337,8 @@ static void intel_unpin_fb_obj(struct drm_i915_gem_object *obj)
>   {
>   	WARN_ON(!mutex_is_locked(&obj->base.dev->struct_mutex));
>   
> -	i915_gem_object_unpin_fence(obj);
> +	if (obj->map_and_fenceable)
> +		i915_gem_object_unpin_fence(obj);
>   	i915_gem_object_unpin_from_display_plane(obj);
>   }
>   

should we skip put_fence in overlay_do_put_image ?
Chris Wilson March 19, 2015, 1:10 p.m. UTC | #2
On Thu, Mar 19, 2015 at 06:31:04PM +0530, Deepak S wrote:
> should we skip put_fence in overlay_do_put_image ?

Ah interesting point you raise there. That is buggy code fullstop.
We should not be call put_fence if pin_to_display_plane pins the fence.
Techinically the overlay could use a fence, the restriction is more to
do with an artificial limitation on the overlay API, and so the actual
call to i915_gem_object_put_fence() can be removed without any ill
side-effects. Something for the next time someone considers gen2-4 code.
-Chris
deepak.s@linux.intel.com March 19, 2015, 1:13 p.m. UTC | #3
On Thursday 19 March 2015 06:40 PM, Chris Wilson wrote:
> On Thu, Mar 19, 2015 at 06:31:04PM +0530, Deepak S wrote:
>> should we skip put_fence in overlay_do_put_image ?
> Ah interesting point you raise there. That is buggy code fullstop.
> We should not be call put_fence if pin_to_display_plane pins the fence.
> Techinically the overlay could use a fence, the restriction is more to
> do with an artificial limitation on the overlay API, and so the actual
> call to i915_gem_object_put_fence() can be removed without any ill
> side-effects. Something for the next time someone considers gen2-4 code.
> -Chris
>
:) Ok got it
Reviewed-by: Deepak S<deepak.s@linux.intel.com>
Daniel Vetter March 19, 2015, 4:34 p.m. UTC | #4
On Thu, Mar 19, 2015 at 01:10:13PM +0000, Chris Wilson wrote:
> On Thu, Mar 19, 2015 at 06:31:04PM +0530, Deepak S wrote:
> > should we skip put_fence in overlay_do_put_image ?
> 
> Ah interesting point you raise there. That is buggy code fullstop.
> We should not be call put_fence if pin_to_display_plane pins the fence.
> Techinically the overlay could use a fence, the restriction is more to
> do with an artificial limitation on the overlay API, and so the actual
> call to i915_gem_object_put_fence() can be removed without any ill
> side-effects. Something for the next time someone considers gen2-4 code.

The put_fence in there is iirc to synchronize with the lazy tiling, just
in case. Or at least that's the story I remember.
-Daniel
Daniel Vetter March 19, 2015, 4:35 p.m. UTC | #5
On Thu, Mar 19, 2015 at 11:29:40AM +0000, Chris Wilson wrote:
> The existing ABI says that scanouts are pinned into the mappable region
> so that legacy clients (e.g. old Xorg or plymouthd) can write directly
> into the scanout through a GTT mapping. However if the surface does not
> fit into the mappable region, we are better off just trying to fit it
> anywhere and hoping for the best. (Any userspace that is cappable of
> using ginormous scanouts is also likely not to rely on pure GTT
> updates.) In the future, there may even be a kernel mediated method for
> the legacy clients.
> 
> v2: Skip fence pinning when not mappable.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Satyanantha, Rama Gopal M <rama.gopal.m.satyanantha@intel.com>
> Cc: Deepak S <deepak.s@linux.intel.com>
> Cc: Damien Lespiau <damien.lespiau@intel.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> ---
>  drivers/gpu/drm/i915/i915_gem.c      |  7 ++++++-
>  drivers/gpu/drm/i915/intel_display.c | 23 +++++++++++++----------
>  2 files changed, 19 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 9e498e0bbf22..9a1de848e450 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -4034,10 +4034,15 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
>  
>  	/* As the user may map the buffer once pinned in the display plane
>  	 * (e.g. libkms for the bootup splash), we have to ensure that we
> -	 * always use map_and_fenceable for all scanout buffers.
> +	 * always use map_and_fenceable for all scanout buffers. However,
> +	 * it may simply be too big to fit into mappable, in which case
> +	 * put it anyway and hope that userspace can cope (but always first
> +	 * try to preserve the existing ABI).
>  	 */
>  	ret = i915_gem_obj_ggtt_pin(obj, alignment, PIN_MAPPABLE);
>  	if (ret)
> +		ret = i915_gem_obj_ggtt_pin(obj, alignment, 0);
> +	if (ret)
>  		goto err_unpin_display;
>  
>  	i915_gem_object_flush_cpu_write_domain(obj);
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index d621ebecd33e..628aace63b43 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -2308,16 +2308,18 @@ intel_pin_and_fence_fb_obj(struct drm_plane *plane,
>  	if (ret)
>  		goto err_interruptible;
>  
> -	/* Install a fence for tiled scan-out. Pre-i965 always needs a
> -	 * fence, whereas 965+ only requires a fence if using
> -	 * framebuffer compression.  For simplicity, we always install
> -	 * a fence as the cost is not that onerous.
> -	 */
> -	ret = i915_gem_object_get_fence(obj);
> -	if (ret)
> -		goto err_unpin;
> +	if (obj->map_and_fenceable) {
> +		/* Install a fence for tiled scan-out. Pre-i965 always needs a
> +		 * fence, whereas 965+ only requires a fence if using
> +		 * framebuffer compression.  For simplicity, we always, when
> +		 * possible, install a fence as the cost is not that onerous.
> +		 */
> +		ret = i915_gem_object_get_fence(obj);
> +		if (ret)
> +			goto err_unpin;

FBC still assumes that a fence is there (and with Paulo's recent rework
that's made even more explicit). I think we need a change in the fbc
frontbuffer tracking integration to not filter out GTT invalidates if the
buffer isn't mappable. Paulo?
-Daniel
Lespiau, Damien March 19, 2015, 4:39 p.m. UTC | #6
On Thu, Mar 19, 2015 at 11:29:40AM +0000, Chris Wilson wrote:
> The existing ABI says that scanouts are pinned into the mappable region
> so that legacy clients (e.g. old Xorg or plymouthd) can write directly
> into the scanout through a GTT mapping. However if the surface does not
> fit into the mappable region, we are better off just trying to fit it
> anywhere and hoping for the best. (Any userspace that is cappable of
> using ginormous scanouts is also likely not to rely on pure GTT
> updates.) In the future, there may even be a kernel mediated method for
> the legacy clients.
> 
> v2: Skip fence pinning when not mappable.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Satyanantha, Rama Gopal M <rama.gopal.m.satyanantha@intel.com>
> Cc: Deepak S <deepak.s@linux.intel.com>
> Cc: Damien Lespiau <damien.lespiau@intel.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> ---

Hum, isn't that still overlooking that we can't put the framebuffers at
the end of the GGTT?
Chris Wilson March 19, 2015, 4:50 p.m. UTC | #7
On Thu, Mar 19, 2015 at 05:35:17PM +0100, Daniel Vetter wrote:
> On Thu, Mar 19, 2015 at 11:29:40AM +0000, Chris Wilson wrote:
> > The existing ABI says that scanouts are pinned into the mappable region
> > so that legacy clients (e.g. old Xorg or plymouthd) can write directly
> > into the scanout through a GTT mapping. However if the surface does not
> > fit into the mappable region, we are better off just trying to fit it
> > anywhere and hoping for the best. (Any userspace that is cappable of
> > using ginormous scanouts is also likely not to rely on pure GTT
> > updates.) In the future, there may even be a kernel mediated method for
> > the legacy clients.
> > 
> > v2: Skip fence pinning when not mappable.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Satyanantha, Rama Gopal M <rama.gopal.m.satyanantha@intel.com>
> > Cc: Deepak S <deepak.s@linux.intel.com>
> > Cc: Damien Lespiau <damien.lespiau@intel.com>
> > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > ---
> >  drivers/gpu/drm/i915/i915_gem.c      |  7 ++++++-
> >  drivers/gpu/drm/i915/intel_display.c | 23 +++++++++++++----------
> >  2 files changed, 19 insertions(+), 11 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 9e498e0bbf22..9a1de848e450 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -4034,10 +4034,15 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
> >  
> >  	/* As the user may map the buffer once pinned in the display plane
> >  	 * (e.g. libkms for the bootup splash), we have to ensure that we
> > -	 * always use map_and_fenceable for all scanout buffers.
> > +	 * always use map_and_fenceable for all scanout buffers. However,
> > +	 * it may simply be too big to fit into mappable, in which case
> > +	 * put it anyway and hope that userspace can cope (but always first
> > +	 * try to preserve the existing ABI).
> >  	 */
> >  	ret = i915_gem_obj_ggtt_pin(obj, alignment, PIN_MAPPABLE);
> >  	if (ret)
> > +		ret = i915_gem_obj_ggtt_pin(obj, alignment, 0);
> > +	if (ret)
> >  		goto err_unpin_display;
> >  
> >  	i915_gem_object_flush_cpu_write_domain(obj);
> > diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> > index d621ebecd33e..628aace63b43 100644
> > --- a/drivers/gpu/drm/i915/intel_display.c
> > +++ b/drivers/gpu/drm/i915/intel_display.c
> > @@ -2308,16 +2308,18 @@ intel_pin_and_fence_fb_obj(struct drm_plane *plane,
> >  	if (ret)
> >  		goto err_interruptible;
> >  
> > -	/* Install a fence for tiled scan-out. Pre-i965 always needs a
> > -	 * fence, whereas 965+ only requires a fence if using
> > -	 * framebuffer compression.  For simplicity, we always install
> > -	 * a fence as the cost is not that onerous.
> > -	 */
> > -	ret = i915_gem_object_get_fence(obj);
> > -	if (ret)
> > -		goto err_unpin;
> > +	if (obj->map_and_fenceable) {
> > +		/* Install a fence for tiled scan-out. Pre-i965 always needs a
> > +		 * fence, whereas 965+ only requires a fence if using
> > +		 * framebuffer compression.  For simplicity, we always, when
> > +		 * possible, install a fence as the cost is not that onerous.
> > +		 */
> > +		ret = i915_gem_object_get_fence(obj);
> > +		if (ret)
> > +			goto err_unpin;
> 
> FBC still assumes that a fence is there (and with Paulo's recent rework
> that's made even more explicit). I think we need a change in the fbc
> frontbuffer tracking integration to not filter out GTT invalidates if the
> buffer isn't mappable. Paulo?

I checked that we have such a check in the fbc enable code. I think if
we have a framebuffer that won't fit in the GTT, we are reasonably sure
it won't be FBC compatible. On the other hand, if we have 4
framebuffers...

if (obj->tiling_mode != I915_TILING_X ||
    obj->fence_reg == I915_FENCE_REG_NONE) {
	if (set_no_fbc_reason(dev_priv, FBC_NOT_TILED))
		DRM_DEBUG_KMS("framebuffer not tiled or fenced, disabling compression\n");

I think it is preferrable that the system continues to run in a degraded
mode in such circumstances than fail entirely.
-Chris
Chris Wilson March 19, 2015, 4:51 p.m. UTC | #8
On Thu, Mar 19, 2015 at 05:34:09PM +0100, Daniel Vetter wrote:
> On Thu, Mar 19, 2015 at 01:10:13PM +0000, Chris Wilson wrote:
> > On Thu, Mar 19, 2015 at 06:31:04PM +0530, Deepak S wrote:
> > > should we skip put_fence in overlay_do_put_image ?
> > 
> > Ah interesting point you raise there. That is buggy code fullstop.
> > We should not be call put_fence if pin_to_display_plane pins the fence.
> > Techinically the overlay could use a fence, the restriction is more to
> > do with an artificial limitation on the overlay API, and so the actual
> > call to i915_gem_object_put_fence() can be removed without any ill
> > side-effects. Something for the next time someone considers gen2-4 code.
> 
> The put_fence in there is iirc to synchronize with the lazy tiling, just
> in case. Or at least that's the story I remember.

Right, that's before we then did the pin_fence. We should have caught
the conflict then. Fortunately, it should just come out in the wash now.
-Chris
Chris Wilson March 19, 2015, 4:54 p.m. UTC | #9
On Thu, Mar 19, 2015 at 04:39:06PM +0000, Damien Lespiau wrote:
> On Thu, Mar 19, 2015 at 11:29:40AM +0000, Chris Wilson wrote:
> > The existing ABI says that scanouts are pinned into the mappable region
> > so that legacy clients (e.g. old Xorg or plymouthd) can write directly
> > into the scanout through a GTT mapping. However if the surface does not
> > fit into the mappable region, we are better off just trying to fit it
> > anywhere and hoping for the best. (Any userspace that is cappable of
> > using ginormous scanouts is also likely not to rely on pure GTT
> > updates.) In the future, there may even be a kernel mediated method for
> > the legacy clients.
> > 
> > v2: Skip fence pinning when not mappable.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Satyanantha, Rama Gopal M <rama.gopal.m.satyanantha@intel.com>
> > Cc: Deepak S <deepak.s@linux.intel.com>
> > Cc: Damien Lespiau <damien.lespiau@intel.com>
> > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > ---
> 
> Hum, isn't that still overlooking that we can't put the framebuffers at
> the end of the GGTT?

Yes. We don't have the interface yet. When we do we can apply the vtd
requirements as well..
-Chris
Shuang He March 19, 2015, 9:57 p.m. UTC | #10
Tested-By: PRC QA PRTS (Patch Regression Test System Contact: shuang.he@intel.com)
Task id: 6006
-------------------------------------Summary-------------------------------------
Platform          Delta          drm-intel-nightly          Series Applied
PNV                 -1              272/272              271/272
ILK                                  301/301              301/301
SNB                                  303/303              303/303
IVB                 -1              342/342              341/342
BYT                                  287/287              287/287
HSW                                  362/362              362/362
BDW                                  308/308              308/308
-------------------------------------Detailed-------------------------------------
Platform  Test                                drm-intel-nightly          Series Applied
*PNV  igt_gem_userptr_blits_minor-unsync-interruptible      PASS(3)      DMESG_WARN(1)PASS(1)
*IVB  igt_gem_storedw_batches_loop_normal      PASS(2)      DMESG_WARN(1)PASS(1)
Note: You need to pay more attention to line start with '*'
Daniel Vetter March 20, 2015, 10:29 a.m. UTC | #11
On Thu, Mar 19, 2015 at 04:50:22PM +0000, Chris Wilson wrote:
> On Thu, Mar 19, 2015 at 05:35:17PM +0100, Daniel Vetter wrote:
> > On Thu, Mar 19, 2015 at 11:29:40AM +0000, Chris Wilson wrote:
> > > The existing ABI says that scanouts are pinned into the mappable region
> > > so that legacy clients (e.g. old Xorg or plymouthd) can write directly
> > > into the scanout through a GTT mapping. However if the surface does not
> > > fit into the mappable region, we are better off just trying to fit it
> > > anywhere and hoping for the best. (Any userspace that is cappable of
> > > using ginormous scanouts is also likely not to rely on pure GTT
> > > updates.) In the future, there may even be a kernel mediated method for
> > > the legacy clients.
> > > 
> > > v2: Skip fence pinning when not mappable.
> > > 
> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > Cc: Satyanantha, Rama Gopal M <rama.gopal.m.satyanantha@intel.com>
> > > Cc: Deepak S <deepak.s@linux.intel.com>
> > > Cc: Damien Lespiau <damien.lespiau@intel.com>
> > > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > > ---
> > >  drivers/gpu/drm/i915/i915_gem.c      |  7 ++++++-
> > >  drivers/gpu/drm/i915/intel_display.c | 23 +++++++++++++----------
> > >  2 files changed, 19 insertions(+), 11 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > > index 9e498e0bbf22..9a1de848e450 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > @@ -4034,10 +4034,15 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
> > >  
> > >  	/* As the user may map the buffer once pinned in the display plane
> > >  	 * (e.g. libkms for the bootup splash), we have to ensure that we
> > > -	 * always use map_and_fenceable for all scanout buffers.
> > > +	 * always use map_and_fenceable for all scanout buffers. However,
> > > +	 * it may simply be too big to fit into mappable, in which case
> > > +	 * put it anyway and hope that userspace can cope (but always first
> > > +	 * try to preserve the existing ABI).
> > >  	 */
> > >  	ret = i915_gem_obj_ggtt_pin(obj, alignment, PIN_MAPPABLE);
> > >  	if (ret)
> > > +		ret = i915_gem_obj_ggtt_pin(obj, alignment, 0);
> > > +	if (ret)
> > >  		goto err_unpin_display;
> > >  
> > >  	i915_gem_object_flush_cpu_write_domain(obj);
> > > diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> > > index d621ebecd33e..628aace63b43 100644
> > > --- a/drivers/gpu/drm/i915/intel_display.c
> > > +++ b/drivers/gpu/drm/i915/intel_display.c
> > > @@ -2308,16 +2308,18 @@ intel_pin_and_fence_fb_obj(struct drm_plane *plane,
> > >  	if (ret)
> > >  		goto err_interruptible;
> > >  
> > > -	/* Install a fence for tiled scan-out. Pre-i965 always needs a
> > > -	 * fence, whereas 965+ only requires a fence if using
> > > -	 * framebuffer compression.  For simplicity, we always install
> > > -	 * a fence as the cost is not that onerous.
> > > -	 */
> > > -	ret = i915_gem_object_get_fence(obj);
> > > -	if (ret)
> > > -		goto err_unpin;
> > > +	if (obj->map_and_fenceable) {
> > > +		/* Install a fence for tiled scan-out. Pre-i965 always needs a
> > > +		 * fence, whereas 965+ only requires a fence if using
> > > +		 * framebuffer compression.  For simplicity, we always, when
> > > +		 * possible, install a fence as the cost is not that onerous.
> > > +		 */
> > > +		ret = i915_gem_object_get_fence(obj);
> > > +		if (ret)
> > > +			goto err_unpin;
> > 
> > FBC still assumes that a fence is there (and with Paulo's recent rework
> > that's made even more explicit). I think we need a change in the fbc
> > frontbuffer tracking integration to not filter out GTT invalidates if the
> > buffer isn't mappable. Paulo?
> 
> I checked that we have such a check in the fbc enable code. I think if
> we have a framebuffer that won't fit in the GTT, we are reasonably sure
> it won't be FBC compatible. On the other hand, if we have 4
> framebuffers...
> 
> if (obj->tiling_mode != I915_TILING_X ||
>     obj->fence_reg == I915_FENCE_REG_NONE) {
> 	if (set_no_fbc_reason(dev_priv, FBC_NOT_TILED))
> 		DRM_DEBUG_KMS("framebuffer not tiled or fenced, disabling compression\n");
> 
> I think it is preferrable that the system continues to run in a degraded
> mode in such circumstances than fail entirely.

Oh right I've forgotten that fbc hw only works with X tiled and that we
use the fence_reg as a proxy. Adding a comment would be useful though.
-Daniel
Chris Wilson March 20, 2015, 10:49 a.m. UTC | #12
On Fri, Mar 20, 2015 at 11:29:25AM +0100, Daniel Vetter wrote:
> On Thu, Mar 19, 2015 at 04:50:22PM +0000, Chris Wilson wrote:
> > On Thu, Mar 19, 2015 at 05:35:17PM +0100, Daniel Vetter wrote:
> > > On Thu, Mar 19, 2015 at 11:29:40AM +0000, Chris Wilson wrote:
> > > > +	if (obj->map_and_fenceable) {
> > > > +		/* Install a fence for tiled scan-out. Pre-i965 always needs a
> > > > +		 * fence, whereas 965+ only requires a fence if using
> > > > +		 * framebuffer compression.  For simplicity, we always, when
> > > > +		 * possible, install a fence as the cost is not that onerous.

> Oh right I've forgotten that fbc hw only works with X tiled and that we
> use the fence_reg as a proxy. Adding a comment would be useful though.

* If we fail to fence the tiled scanout, then either the modeset will
* reject the change (which is highly unlikely as the affected systems,
* all but one, do not have unmappable space) or we will not be able to
* enable full powersaving techniques (also likely not to apply due to
* various limits FBC and the like impose on the size of the buffer,
* which presumably we violated anyway with this unmappable buffer).
* Anyway, it is presumably better to stumble onwards with something and
* try to run the system in a "less than optimal" mode that matches the
* user configuration.

> > > > +		 */
> > > > +		ret = i915_gem_object_get_fence(obj);
> > > > +		if (ret)
> > > > +			goto err_unpin;
Daniel Vetter March 20, 2015, 2:36 p.m. UTC | #13
On Fri, Mar 20, 2015 at 10:49:19AM +0000, Chris Wilson wrote:
> On Fri, Mar 20, 2015 at 11:29:25AM +0100, Daniel Vetter wrote:
> > On Thu, Mar 19, 2015 at 04:50:22PM +0000, Chris Wilson wrote:
> > > On Thu, Mar 19, 2015 at 05:35:17PM +0100, Daniel Vetter wrote:
> > > > On Thu, Mar 19, 2015 at 11:29:40AM +0000, Chris Wilson wrote:
> > > > > +	if (obj->map_and_fenceable) {
> > > > > +		/* Install a fence for tiled scan-out. Pre-i965 always needs a
> > > > > +		 * fence, whereas 965+ only requires a fence if using
> > > > > +		 * framebuffer compression.  For simplicity, we always, when
> > > > > +		 * possible, install a fence as the cost is not that onerous.
> 
> > Oh right I've forgotten that fbc hw only works with X tiled and that we
> > use the fence_reg as a proxy. Adding a comment would be useful though.
> 
> * If we fail to fence the tiled scanout, then either the modeset will
> * reject the change (which is highly unlikely as the affected systems,
> * all but one, do not have unmappable space) or we will not be able to
> * enable full powersaving techniques (also likely not to apply due to
> * various limits FBC and the like impose on the size of the buffer,
> * which presumably we violated anyway with this unmappable buffer).
> * Anyway, it is presumably better to stumble onwards with something and
> * try to run the system in a "less than optimal" mode that matches the
> * user configuration.

I actually thought of a comment in the obj->fence_reg check in the fbc
code that we now can have frontbuffers lacking fences. And that the check
isn't just a proxy check for x-tiled anymore.

Just to avoid someone replacing the obj->fence_reg check with a
obj->tiling_mode == X_TILING check somewhen in the future.

Not fencing here is imo clear, since iirc on gen3 fences in the unmappable
part are not allowed.
-Daniel
Ville Syrjala March 20, 2015, 2:52 p.m. UTC | #14
On Fri, Mar 20, 2015 at 11:29:25AM +0100, Daniel Vetter wrote:
> On Thu, Mar 19, 2015 at 04:50:22PM +0000, Chris Wilson wrote:
> > On Thu, Mar 19, 2015 at 05:35:17PM +0100, Daniel Vetter wrote:
> > > On Thu, Mar 19, 2015 at 11:29:40AM +0000, Chris Wilson wrote:
> > > > The existing ABI says that scanouts are pinned into the mappable region
> > > > so that legacy clients (e.g. old Xorg or plymouthd) can write directly
> > > > into the scanout through a GTT mapping. However if the surface does not
> > > > fit into the mappable region, we are better off just trying to fit it
> > > > anywhere and hoping for the best. (Any userspace that is cappable of
> > > > using ginormous scanouts is also likely not to rely on pure GTT
> > > > updates.) In the future, there may even be a kernel mediated method for
> > > > the legacy clients.
> > > > 
> > > > v2: Skip fence pinning when not mappable.
> > > > 
> > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > > Cc: Satyanantha, Rama Gopal M <rama.gopal.m.satyanantha@intel.com>
> > > > Cc: Deepak S <deepak.s@linux.intel.com>
> > > > Cc: Damien Lespiau <damien.lespiau@intel.com>
> > > > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > > > ---
> > > >  drivers/gpu/drm/i915/i915_gem.c      |  7 ++++++-
> > > >  drivers/gpu/drm/i915/intel_display.c | 23 +++++++++++++----------
> > > >  2 files changed, 19 insertions(+), 11 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > > > index 9e498e0bbf22..9a1de848e450 100644
> > > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > > @@ -4034,10 +4034,15 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
> > > >  
> > > >  	/* As the user may map the buffer once pinned in the display plane
> > > >  	 * (e.g. libkms for the bootup splash), we have to ensure that we
> > > > -	 * always use map_and_fenceable for all scanout buffers.
> > > > +	 * always use map_and_fenceable for all scanout buffers. However,
> > > > +	 * it may simply be too big to fit into mappable, in which case
> > > > +	 * put it anyway and hope that userspace can cope (but always first
> > > > +	 * try to preserve the existing ABI).
> > > >  	 */
> > > >  	ret = i915_gem_obj_ggtt_pin(obj, alignment, PIN_MAPPABLE);
> > > >  	if (ret)
> > > > +		ret = i915_gem_obj_ggtt_pin(obj, alignment, 0);
> > > > +	if (ret)
> > > >  		goto err_unpin_display;
> > > >  
> > > >  	i915_gem_object_flush_cpu_write_domain(obj);
> > > > diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> > > > index d621ebecd33e..628aace63b43 100644
> > > > --- a/drivers/gpu/drm/i915/intel_display.c
> > > > +++ b/drivers/gpu/drm/i915/intel_display.c
> > > > @@ -2308,16 +2308,18 @@ intel_pin_and_fence_fb_obj(struct drm_plane *plane,
> > > >  	if (ret)
> > > >  		goto err_interruptible;
> > > >  
> > > > -	/* Install a fence for tiled scan-out. Pre-i965 always needs a
> > > > -	 * fence, whereas 965+ only requires a fence if using
> > > > -	 * framebuffer compression.  For simplicity, we always install
> > > > -	 * a fence as the cost is not that onerous.
> > > > -	 */
> > > > -	ret = i915_gem_object_get_fence(obj);
> > > > -	if (ret)
> > > > -		goto err_unpin;
> > > > +	if (obj->map_and_fenceable) {
> > > > +		/* Install a fence for tiled scan-out. Pre-i965 always needs a
> > > > +		 * fence, whereas 965+ only requires a fence if using
> > > > +		 * framebuffer compression.  For simplicity, we always, when
> > > > +		 * possible, install a fence as the cost is not that onerous.
> > > > +		 */
> > > > +		ret = i915_gem_object_get_fence(obj);
> > > > +		if (ret)
> > > > +			goto err_unpin;
> > > 
> > > FBC still assumes that a fence is there (and with Paulo's recent rework
> > > that's made even more explicit). I think we need a change in the fbc
> > > frontbuffer tracking integration to not filter out GTT invalidates if the
> > > buffer isn't mappable. Paulo?
> > 
> > I checked that we have such a check in the fbc enable code. I think if
> > we have a framebuffer that won't fit in the GTT, we are reasonably sure
> > it won't be FBC compatible. On the other hand, if we have 4
> > framebuffers...
> > 
> > if (obj->tiling_mode != I915_TILING_X ||
> >     obj->fence_reg == I915_FENCE_REG_NONE) {
> > 	if (set_no_fbc_reason(dev_priv, FBC_NOT_TILED))
> > 		DRM_DEBUG_KMS("framebuffer not tiled or fenced, disabling compression\n");
> > 
> > I think it is preferrable that the system continues to run in a degraded
> > mode in such circumstances than fail entirely.
> 
> Oh right I've forgotten that fbc hw only works with X tiled and that we
> use the fence_reg as a proxy. Adding a comment would be useful though.

SKL supports FBC with Y tiled as well.
Jani Nikula March 25, 2015, 12:21 p.m. UTC | #15
On Thu, 19 Mar 2015, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> The existing ABI says that scanouts are pinned into the mappable region
> so that legacy clients (e.g. old Xorg or plymouthd) can write directly
> into the scanout through a GTT mapping. However if the surface does not
> fit into the mappable region, we are better off just trying to fit it
> anywhere and hoping for the best. (Any userspace that is cappable of
> using ginormous scanouts is also likely not to rely on pure GTT
> updates.) In the future, there may even be a kernel mediated method for
> the legacy clients.
>
> v2: Skip fence pinning when not mappable.

Chris, this patch conflicts with current nightly in a way that I'm not
comfortable resolving.

BR,
Jani.


>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Satyanantha, Rama Gopal M <rama.gopal.m.satyanantha@intel.com>
> Cc: Deepak S <deepak.s@linux.intel.com>
> Cc: Damien Lespiau <damien.lespiau@intel.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> ---
>  drivers/gpu/drm/i915/i915_gem.c      |  7 ++++++-
>  drivers/gpu/drm/i915/intel_display.c | 23 +++++++++++++----------
>  2 files changed, 19 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 9e498e0bbf22..9a1de848e450 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -4034,10 +4034,15 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
>  
>  	/* As the user may map the buffer once pinned in the display plane
>  	 * (e.g. libkms for the bootup splash), we have to ensure that we
> -	 * always use map_and_fenceable for all scanout buffers.
> +	 * always use map_and_fenceable for all scanout buffers. However,
> +	 * it may simply be too big to fit into mappable, in which case
> +	 * put it anyway and hope that userspace can cope (but always first
> +	 * try to preserve the existing ABI).
>  	 */
>  	ret = i915_gem_obj_ggtt_pin(obj, alignment, PIN_MAPPABLE);
>  	if (ret)
> +		ret = i915_gem_obj_ggtt_pin(obj, alignment, 0);
> +	if (ret)
>  		goto err_unpin_display;
>  
>  	i915_gem_object_flush_cpu_write_domain(obj);
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index d621ebecd33e..628aace63b43 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -2308,16 +2308,18 @@ intel_pin_and_fence_fb_obj(struct drm_plane *plane,
>  	if (ret)
>  		goto err_interruptible;
>  
> -	/* Install a fence for tiled scan-out. Pre-i965 always needs a
> -	 * fence, whereas 965+ only requires a fence if using
> -	 * framebuffer compression.  For simplicity, we always install
> -	 * a fence as the cost is not that onerous.
> -	 */
> -	ret = i915_gem_object_get_fence(obj);
> -	if (ret)
> -		goto err_unpin;
> +	if (obj->map_and_fenceable) {
> +		/* Install a fence for tiled scan-out. Pre-i965 always needs a
> +		 * fence, whereas 965+ only requires a fence if using
> +		 * framebuffer compression.  For simplicity, we always, when
> +		 * possible, install a fence as the cost is not that onerous.
> +		 */
> +		ret = i915_gem_object_get_fence(obj);
> +		if (ret)
> +			goto err_unpin;
>  
> -	i915_gem_object_pin_fence(obj);
> +		i915_gem_object_pin_fence(obj);
> +	}
>  
>  	dev_priv->mm.interruptible = true;
>  	intel_runtime_pm_put(dev_priv);
> @@ -2335,7 +2337,8 @@ static void intel_unpin_fb_obj(struct drm_i915_gem_object *obj)
>  {
>  	WARN_ON(!mutex_is_locked(&obj->base.dev->struct_mutex));
>  
> -	i915_gem_object_unpin_fence(obj);
> +	if (obj->map_and_fenceable)
> +		i915_gem_object_unpin_fence(obj);
>  	i915_gem_object_unpin_from_display_plane(obj);
>  }
>  
> -- 
> 2.1.4
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 9e498e0bbf22..9a1de848e450 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4034,10 +4034,15 @@  i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
 
 	/* As the user may map the buffer once pinned in the display plane
 	 * (e.g. libkms for the bootup splash), we have to ensure that we
-	 * always use map_and_fenceable for all scanout buffers.
+	 * always use map_and_fenceable for all scanout buffers. However,
+	 * it may simply be too big to fit into mappable, in which case
+	 * put it anyway and hope that userspace can cope (but always first
+	 * try to preserve the existing ABI).
 	 */
 	ret = i915_gem_obj_ggtt_pin(obj, alignment, PIN_MAPPABLE);
 	if (ret)
+		ret = i915_gem_obj_ggtt_pin(obj, alignment, 0);
+	if (ret)
 		goto err_unpin_display;
 
 	i915_gem_object_flush_cpu_write_domain(obj);
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index d621ebecd33e..628aace63b43 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -2308,16 +2308,18 @@  intel_pin_and_fence_fb_obj(struct drm_plane *plane,
 	if (ret)
 		goto err_interruptible;
 
-	/* Install a fence for tiled scan-out. Pre-i965 always needs a
-	 * fence, whereas 965+ only requires a fence if using
-	 * framebuffer compression.  For simplicity, we always install
-	 * a fence as the cost is not that onerous.
-	 */
-	ret = i915_gem_object_get_fence(obj);
-	if (ret)
-		goto err_unpin;
+	if (obj->map_and_fenceable) {
+		/* Install a fence for tiled scan-out. Pre-i965 always needs a
+		 * fence, whereas 965+ only requires a fence if using
+		 * framebuffer compression.  For simplicity, we always, when
+		 * possible, install a fence as the cost is not that onerous.
+		 */
+		ret = i915_gem_object_get_fence(obj);
+		if (ret)
+			goto err_unpin;
 
-	i915_gem_object_pin_fence(obj);
+		i915_gem_object_pin_fence(obj);
+	}
 
 	dev_priv->mm.interruptible = true;
 	intel_runtime_pm_put(dev_priv);
@@ -2335,7 +2337,8 @@  static void intel_unpin_fb_obj(struct drm_i915_gem_object *obj)
 {
 	WARN_ON(!mutex_is_locked(&obj->base.dev->struct_mutex));
 
-	i915_gem_object_unpin_fence(obj);
+	if (obj->map_and_fenceable)
+		i915_gem_object_unpin_fence(obj);
 	i915_gem_object_unpin_from_display_plane(obj);
 }