
[v2,1/5] drm/rockchip: fix fb references in async update

Message ID 20190312022204.2775-2-helen.koike@collabora.com (mailing list archive)
State New, archived
Series drm: Fix fb changes for async updates

Commit Message

Helen Koike March 12, 2019, 2:21 a.m. UTC
In the case of async update, modifications are done in place, i.e. in the
current plane state, so the new_state is prepared and the new_state is
cleaned up (instead of the old_state, unlike what happens in a
normal sync update).
To clean up the old_fb properly, it needs to be placed in the new_state
at the end of async_update, so the cleanup call will unreference the old_fb
correctly.

Also, the previous code had a:

	plane_state = plane->funcs->atomic_duplicate_state(plane);
	...
	swap(plane_state, plane->state);

	if (plane->state->fb && plane->state->fb != new_state->fb) {
	...
	}

Which was wrong, as the fbs had just been assigned to be equal, so this if
statement never evaluates to true.
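
A minimal sketch of why that check is dead. It replays the order of
operations of the removed code on hypothetical stand-in types
(sketch_fb, sketch_state); none of these names exist in the driver, and
reference counting is left out on purpose.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; only the fb pointer matters for this check. */
struct sketch_fb { int id; };
struct sketch_state { struct sketch_fb *fb; };

/*
 * Mirrors the order of operations in the removed code and returns whether
 * the "queue an unref for the old fb" branch would have been taken.
 */
bool sketch_old_branch_taken(struct sketch_fb *current_fb,
			     struct sketch_fb *requested_fb)
{
	struct sketch_state live = { .fb = current_fb };       /* plane->state */
	struct sketch_state incoming = { .fb = requested_fb }; /* new_state */
	struct sketch_state copy = live;                       /* duplicated state */
	struct sketch_state *plane_state = &copy;
	struct sketch_state *plane_live = &live;
	struct sketch_state *tmp;

	/* The copy is given the requested fb ... */
	if (plane_state->fb != incoming.fb)
		plane_state->fb = incoming.fb;

	/* ... and then becomes the live state (the swap() in the old code). */
	tmp = plane_state;
	plane_state = plane_live;
	plane_live = tmp;

	/* Both sides now hold requested_fb, so this can never be true. */
	return plane_live->fb && plane_live->fb != incoming.fb;
}
```

Whatever pair of framebuffers is passed in, the function returns false,
which is the behaviour described above.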

Another detail is that the function drm_crtc_vblank_get() can only be
called when vop->is_enabled is true, otherwise it has no effect and
throws a WARN_ON().

Calling drm_atomic_set_fb_for_plane() (which gets a reference to the new
fb and puts the old fb) is not required, as it is taken care of by
drm_mode_cursor_universal() when calling
drm_atomic_helper_update_plane().

Signed-off-by: Helen Koike <helen.koike@collabora.com>

---
Hello,

I tested on the rockchip ficus v1.1 using igt plane_cursor_legacy and
kms_cursor_legacy and I didn't see any regressions.

Changes in v2: None

 drivers/gpu/drm/rockchip/rockchip_drm_vop.c | 42 ++++++++++++---------
 1 file changed, 24 insertions(+), 18 deletions(-)
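
The reference hand-off described in the commit message can be illustrated
with a small user-space model. This is only a sketch: model_fb,
model_plane_state and the model_* helpers are hypothetical stand-ins, and
it assumes that the core takes one reference on the new fb when building
new_state and drops one reference on whatever fb new_state holds when that
state is destroyed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the framebuffer and plane state objects. */
struct model_fb { int refcount; };
struct model_plane_state { struct model_fb *fb; int crtc_x; };

static void model_fb_get(struct model_fb *fb)
{
	if (fb)
		fb->refcount++;
}

static void model_fb_put(struct model_fb *fb)
{
	if (fb) {
		assert(fb->refcount > 0);
		fb->refcount--;
	}
}

/* In-place update; optionally hands the old fb back through new_state. */
static void model_async_update(struct model_plane_state *cur,
			       struct model_plane_state *new_state,
			       int hand_back_old_fb)
{
	struct model_fb *old_fb = cur->fb;

	cur->crtc_x = new_state->crtc_x;
	cur->fb = new_state->fb;
	if (hand_back_old_fb)
		new_state->fb = old_fb;
}

/* Runs one flip and reports the reference counts left on both fbs. */
void model_run_flip(int hand_back_old_fb, int *old_refs, int *new_refs)
{
	struct model_fb old_fb = { .refcount = 1 }; /* held by the live state */
	struct model_fb new_fb = { .refcount = 0 };
	struct model_plane_state cur = { .fb = &old_fb, .crtc_x = 0 };
	struct model_plane_state new_state = { .fb = NULL, .crtc_x = 5 };

	/* Building new_state takes a reference on the new fb. */
	model_fb_get(&new_fb);
	new_state.fb = &new_fb;

	model_async_update(&cur, &new_state, hand_back_old_fb);

	/* Destroying new_state drops a reference on the fb it holds. */
	model_fb_put(new_state.fb);

	*old_refs = old_fb.refcount;
	*new_refs = new_fb.refcount;
}
```

With the hand-back, the old fb ends at zero references and the new fb keeps
the one now owned by the live state. Without it, the model drops the
reference the live state relies on and leaves the old fb with a stale one.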

Comments

Boris Brezillon March 12, 2019, 6:34 a.m. UTC | #1
On Mon, 11 Mar 2019 23:21:59 -0300
Helen Koike <helen.koike@collabora.com> wrote:

> In the case of async update, modifications are done in place, i.e. in the
> current plane state, so the new_state is prepared and the new_state is
> cleanup up (instead of the old_state, diferrently on what happen in a

  ^ cleaned up				^ differently (but maybe
"unlike what happens" is more appropriate here).

> normal sync update).
> To clean up the old_fb properly, it needs to be placed in the new_state
> at the end of async_update, so the cleanup call will unreference the old_fb
> correctly.
> 
> Also, the previous code had a:
> 
> 	plane_state = plane->funcs->atomic_duplicate_state(plane);
> 	...
> 	swap(plane_state, plane->state);
> 
> 	if (plane->state->fb && plane->state->fb != new_state->fb) {
> 	...
> 	}
> 
> Which was wrong, as the fbs had just been assigned to be equal, so this if
> statement never evaluates to true.
> 
> Another detail is that the function drm_crtc_vblank_get() can only be
> called when vop->is_enabled is true, otherwise it has no effect and
> throws a WARN_ON().
> 
> Calling drm_atomic_set_fb_for_plane() (which gets a reference to the new
> fb and puts the old fb) is not required, as it is taken care of by
> drm_mode_cursor_universal() when calling
> drm_atomic_helper_update_plane().
> 
> Signed-off-by: Helen Koike <helen.koike@collabora.com>
> 
> ---
> Hello,
> 
> I tested on the rockchip ficus v1.1 using igt plane_cursor_legacy and
> kms_cursor_legacy and I didn't see any regressions.
> 
> Changes in v2: None
> 
>  drivers/gpu/drm/rockchip/rockchip_drm_vop.c | 42 ++++++++++++---------
>  1 file changed, 24 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
> index c7d4c6073ea5..a1ee8c156a7b 100644
> --- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
> +++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
> @@ -912,30 +912,31 @@ static void vop_plane_atomic_async_update(struct drm_plane *plane,
>  					  struct drm_plane_state *new_state)
>  {
>  	struct vop *vop = to_vop(plane->state->crtc);
> -	struct drm_plane_state *plane_state;
> +	struct drm_framebuffer *old_fb = plane->state->fb;
>  
> -	plane_state = plane->funcs->atomic_duplicate_state(plane);
> -	plane_state->crtc_x = new_state->crtc_x;
> -	plane_state->crtc_y = new_state->crtc_y;
> -	plane_state->crtc_h = new_state->crtc_h;
> -	plane_state->crtc_w = new_state->crtc_w;
> -	plane_state->src_x = new_state->src_x;
> -	plane_state->src_y = new_state->src_y;
> -	plane_state->src_h = new_state->src_h;
> -	plane_state->src_w = new_state->src_w;
> -
> -	if (plane_state->fb != new_state->fb)
> -		drm_atomic_set_fb_for_plane(plane_state, new_state->fb);
> -
> -	swap(plane_state, plane->state);
> -
> -	if (plane->state->fb && plane->state->fb != new_state->fb) {
> +	/*
> +	 * A scanout can still be occurring, so we can't drop the reference to
> +	 * the old framebuffer. To solve this we get a reference to old_fb and
> +	 * set a worker to release it later.

Hm, doesn't look like an async update to me if we have to wait for the
next VBLANK to happen to get the new content on the screen. Maybe we
should reject async updates when old_fb != new_fb in the rk
->async_check() hook. 

> +	 */
> +	if (vop->is_enabled &&
> +	    plane->state->fb && plane->state->fb != new_state->fb) {
>  		drm_framebuffer_get(plane->state->fb);
>  		WARN_ON(drm_crtc_vblank_get(plane->state->crtc) != 0);
>  		drm_flip_work_queue(&vop->fb_unref_work, plane->state->fb);
>  		set_bit(VOP_PENDING_FB_UNREF, &vop->pending);
>  	}

In any case, I think this should be called after
vop_plane_atomic_update() to prevent the situation where the VBLANK
event happens between this point and the following
vop_plane_atomic_update() call.

>  
> +	plane->state->crtc_x = new_state->crtc_x;
> +	plane->state->crtc_y = new_state->crtc_y;
> +	plane->state->crtc_h = new_state->crtc_h;
> +	plane->state->crtc_w = new_state->crtc_w;
> +	plane->state->src_x = new_state->src_x;
> +	plane->state->src_y = new_state->src_y;
> +	plane->state->src_h = new_state->src_h;
> +	plane->state->src_w = new_state->src_w;
> +	plane->state->fb = new_state->fb;

Any reason not to use swap() here and reference plane->state->fb
instead of new_state->fb after this point?

> +
>  	if (vop->is_enabled) {
>  		rockchip_drm_psr_inhibit_get_state(new_state->state);
>  		vop_plane_atomic_update(plane, plane->state);
> @@ -945,7 +946,12 @@ static void vop_plane_atomic_async_update(struct drm_plane *plane,
>  		rockchip_drm_psr_inhibit_put_state(new_state->state);
>  	}
>  
> -	plane->funcs->atomic_destroy_state(plane, plane_state);
> +	/*
> +	 * In async update we perform inplace modifications and release the
> +	 * new_state. The following is required so we release the reference of
> +	 * the old framebuffer.
> +	 */
> +	new_state->fb = old_fb;
>  }
>  
>  static const struct drm_plane_helper_funcs plane_helper_funcs = {
Daniel Vetter March 12, 2019, 11:04 a.m. UTC | #2
On Tue, Mar 12, 2019 at 07:34:38AM +0100, Boris Brezillon wrote:
> On Mon, 11 Mar 2019 23:21:59 -0300
> Helen Koike <helen.koike@collabora.com> wrote:
> 
> > +	/*
> > +	 * A scanout can still be occurring, so we can't drop the reference to
> > +	 * the old framebuffer. To solve this we get a reference to old_fb and
> > +	 * set a worker to release it later.
> 
> Hm, doesn't look like an async update to me if we have to wait for the
> next VBLANK to happen to get the new content on the screen. Maybe we
> should reject async updates when old_fb != new_fb in the rk
> ->async_check() hook. 

Scanning out garbage because the old buffer is unpinned too quickly is one
of the long-term "features" of async updates. At least for features.

It's another one of these things we need to fix. Which might become easier
if we switch to usual state switching, since then we can punt the
cleanup_plane phase to a worker.

Note that depending upon the gpu this might not just result in garbage but
hangs, usually when there's an iommu and the chip dies if it accesses an
unmapped page.

Probably something to fix later on in async framework.
-Daniel
Helen Koike March 12, 2019, 3:34 p.m. UTC | #3
On 3/12/19 3:34 AM, Boris Brezillon wrote:
> On Mon, 11 Mar 2019 23:21:59 -0300
> Helen Koike <helen.koike@collabora.com> wrote:
> 
>> +	/*
>> +	 * A scanout can still be occurring, so we can't drop the reference to
>> +	 * the old framebuffer. To solve this we get a reference to old_fb and
>> +	 * set a worker to release it later.
> 
> Hm, doesn't look like an async update to me if we have to wait for the
> next VBLANK to happen to get the new content on the screen. Maybe we
> should reject async updates when old_fb != new_fb in the rk
> ->async_check() hook.

Unless I am misunderstanding this, we don't wait here, we just grab a
reference to the fb in case it is still being used by the hw, so it
doesn't get released prematurely.

> 
>> +	 */
>> +	if (vop->is_enabled &&
>> +	    plane->state->fb && plane->state->fb != new_state->fb) {
>>  		drm_framebuffer_get(plane->state->fb);
>>  		WARN_ON(drm_crtc_vblank_get(plane->state->crtc) != 0);
>>  		drm_flip_work_queue(&vop->fb_unref_work, plane->state->fb);
>>  		set_bit(VOP_PENDING_FB_UNREF, &vop->pending);
>>  	}
> 
> In any case, I think this should be called after
> vop_plane_atomic_update() to prevent the situation where the VBLANK
> event happens between this point and the following
> vop_plane_atomic_update() call.

ack, I'll update it in the next version.

> 
>>  
>> +	plane->state->crtc_x = new_state->crtc_x;
>> +	plane->state->crtc_y = new_state->crtc_y;
>> +	plane->state->crtc_h = new_state->crtc_h;
>> +	plane->state->crtc_w = new_state->crtc_w;
>> +	plane->state->src_x = new_state->src_x;
>> +	plane->state->src_y = new_state->src_y;
>> +	plane->state->src_h = new_state->src_h;
>> +	plane->state->src_w = new_state->src_w;
>> +	plane->state->fb = new_state->fb;
> 
> Any reason not to use swap() here and reference plane->state->fb
> instead of new_state->fb after this point?

I had the impression I had to do this in one of my tests, but re-testing
now and re-looking at the code this doesn't seem to be necessary. I'll
use a swap() in the next version.

Thanks for your feedback.
Helen

Boris Brezillon March 12, 2019, 3:52 p.m. UTC | #4
On Tue, 12 Mar 2019 12:34:45 -0300
Helen Koike <helen.koike@collabora.com> wrote:

> On 3/12/19 3:34 AM, Boris Brezillon wrote:
> > On Mon, 11 Mar 2019 23:21:59 -0300
> > Helen Koike <helen.koike@collabora.com> wrote:
> >   
> >> +	/*
> >> +	 * A scanout can still be occurring, so we can't drop the reference to
> >> +	 * the old framebuffer. To solve this we get a reference to old_fb and
> >> +	 * set a worker to release it later.  
> > 
> > Hm, doesn't look like an async update to me if we have to wait for the
> > next VBLANK to happen to get the new content on the screen. Maybe we
> > should reject async updates when old_fb != new_fb in the rk  
> > ->async_check() hook.  
> 
> Unless I am misunderstanding this, we don't wait here, we just grab a
> reference to the fb in case it is still being used by the hw, so it
> doesn't get released prematurely.

I was just reacting to the comment that says the new FB should stay
around until the next VBLANK event happens. If the FB must stay around
that probably means the HW is still using it, which made me wonder if this
HW actually supports async update (where async means "update now and
don't care about tearing"). Or maybe it takes some time to switch
to the new FB and waiting for the next VBLANK to release the old FB was
an easy solution to not wait for the flip to actually happen in
->async_update() (which is kind of a combination of async+non-blocking).

Anyway, let's keep it like that.
Tomasz Figa March 13, 2019, 3:42 a.m. UTC | #5
On Wed, Mar 13, 2019 at 12:52 AM Boris Brezillon
<boris.brezillon@collabora.com> wrote:
>
> I was just reacting to the comment that says the new FB should stay
> around until the next VBLANK event happens. If the FB must stay around
> that probably means the HW is still using it, which made me wonder if this
> HW actually supports async update (where async means "update now and
> don't care about tearing"). Or maybe it takes some time to switch
> to the new FB and waiting for the next VBLANK to release the old FB was
> an easy solution to not wait for the flip to actually happen in
> ->async_update() (which is kind of a combination of async+non-blocking).

The hardware switches framebuffers on vblank, so whatever framebuffer
is currently being scanned out from needs to stay there until the
hardware switches to the new one in shadow registers. If that doesn't
happen, you get IOMMU faults and the display controller stops working
since we don't have any fault handling currently, just printing a
message.

Best regards,
Tomasz
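
The latching behaviour described in this reply can be modelled in a few
lines of user-space C. This is a sketch under stated assumptions, not
driver code: latch_fb, latch_plane and the latch_* helpers are hypothetical
names, "mapped" stands for the buffer still being reachable through the
IOMMU, and a fault is modelled simply as the active buffer being unmapped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct latch_fb { bool mapped; };

struct latch_plane {
	struct latch_fb *shadow; /* value last written by the driver */
	struct latch_fb *active; /* value the scanout engine reads */
};

/* The driver only writes the shadow copy; scanout is unaffected for now. */
static void latch_write(struct latch_plane *p, struct latch_fb *fb)
{
	p->shadow = fb;
}

/* At vblank the hardware copies the shadow value into the active one. */
static void latch_vblank(struct latch_plane *p)
{
	p->active = p->shadow;
}

/* True if scanout would touch a buffer that is no longer mapped. */
static bool latch_would_fault(const struct latch_plane *p)
{
	return p->active && !p->active->mapped;
}

/* Flips from old to new, unmapping old before or after the vblank. */
bool latch_flip_faults(bool unmap_before_vblank)
{
	struct latch_fb old_fb = { .mapped = true };
	struct latch_fb new_fb = { .mapped = true };
	struct latch_plane p = { .shadow = &old_fb, .active = &old_fb };
	bool faulted = false;

	latch_write(&p, &new_fb);
	if (unmap_before_vblank) {
		old_fb.mapped = false;
		faulted = latch_would_fault(&p); /* old fb is still active */
	}

	latch_vblank(&p);
	if (!unmap_before_vblank)
		old_fb.mapped = false;

	return faulted || latch_would_fault(&p);
}
```

In this model, releasing the old buffer before the vblank faults, while
deferring the release until after the latch does not, which is why the
driver keeps a reference until the next vblank.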
Michel Dänzer March 13, 2019, 9:58 a.m. UTC | #6
On 2019-03-13 4:42 a.m., Tomasz Figa wrote:
> 
> The hardware switches framebuffers on vblank, so whatever framebuffer
> is currently being scanned out from needs to stay there until the
> hardware switches to the new one in shadow registers. If that doesn't
> happen, you get IOMMU faults and the display controller stops working
> since we don't have any fault handling currently, just printing a
> message.

Sounds like your hardware doesn't actually support async flips. It's
probably better for the driver not to pretend otherwise.
Helen Koike March 13, 2019, 6:08 p.m. UTC | #7
On 3/13/19 6:58 AM, Michel Dänzer wrote:
> On 2019-03-13 4:42 a.m., Tomasz Figa wrote:
>> On Wed, Mar 13, 2019 at 12:52 AM Boris Brezillon
>> <boris.brezillon@collabora.com> wrote:
>>>
>>> On Tue, 12 Mar 2019 12:34:45 -0300
>>> Helen Koike <helen.koike@collabora.com> wrote:
>>>
>>>> On 3/12/19 3:34 AM, Boris Brezillon wrote:
>>>>> On Mon, 11 Mar 2019 23:21:59 -0300
>>>>> Helen Koike <helen.koike@collabora.com> wrote:
>>>>>
>>>>>> In the case of async update, modifications are done in place, i.e. in the
>>>>>> current plane state, so the new_state is prepared and the new_state is
>>>>>> cleanup up (instead of the old_state, diferrently on what happen in a
>>>>>
>>>>>   ^ cleaned up                              ^ differently (but maybe
>>>>> "unlike what happens" is more appropriate here).
>>>>>
>>>>>> normal sync update).
>>>>>> To cleanup the old_fb properly, it needs to be placed in the new_state
>>>>>> in the end of async_update, so cleanup call will unreference the old_fb
>>>>>> correctly.
>>>>>>
>>>>>> Also, the previous code had a:
>>>>>>
>>>>>>    plane_state = plane->funcs->atomic_duplicate_state(plane);
>>>>>>    ...
>>>>>>    swap(plane_state, plane->state);
>>>>>>
>>>>>>    if (plane->state->fb && plane->state->fb != new_state->fb) {
>>>>>>    ...
>>>>>>    }
>>>>>>
>>>>>> Which was wrong, as the fb were just assigned to be equal, so this if
>>>>>> statement nevers evaluates to true.
>>>>>>
>>>>>> Another details is that the function drm_crtc_vblank_get() can only be
>>>>>> called when vop->is_enabled is true, otherwise it has no effect and
>>>>>> trows a WARN_ON().
>>>>>>
>>>>>> Calling drm_atomic_set_fb_for_plane() (which get a referent of the new
>>>>>> fb and pus the old fb) is not required, as it is taken care by
>>>>>> drm_mode_cursor_universal() when calling
>>>>>> drm_atomic_helper_update_plane().
>>>>>>
>>>>>> Signed-off-by: Helen Koike <helen.koike@collabora.com>
>>>>>>
>>>>>> ---
>>>>>> Hello,
>>>>>>
>>>>>> I tested on the rockchip ficus v1.1 using igt plane_cursor_legacy and
>>>>>> kms_cursor_legacy and I didn't see any regressions.
>>>>>>
>>>>>> Changes in v2: None
>>>>>>
>>>>>>  drivers/gpu/drm/rockchip/rockchip_drm_vop.c | 42 ++++++++++++---------
>>>>>>  1 file changed, 24 insertions(+), 18 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
>>>>>> index c7d4c6073ea5..a1ee8c156a7b 100644
>>>>>> --- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
>>>>>> +++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
>>>>>> @@ -912,30 +912,31 @@ static void vop_plane_atomic_async_update(struct drm_plane *plane,
>>>>>>                                      struct drm_plane_state *new_state)
>>>>>>  {
>>>>>>    struct vop *vop = to_vop(plane->state->crtc);
>>>>>> -  struct drm_plane_state *plane_state;
>>>>>> +  struct drm_framebuffer *old_fb = plane->state->fb;
>>>>>>
>>>>>> -  plane_state = plane->funcs->atomic_duplicate_state(plane);
>>>>>> -  plane_state->crtc_x = new_state->crtc_x;
>>>>>> -  plane_state->crtc_y = new_state->crtc_y;
>>>>>> -  plane_state->crtc_h = new_state->crtc_h;
>>>>>> -  plane_state->crtc_w = new_state->crtc_w;
>>>>>> -  plane_state->src_x = new_state->src_x;
>>>>>> -  plane_state->src_y = new_state->src_y;
>>>>>> -  plane_state->src_h = new_state->src_h;
>>>>>> -  plane_state->src_w = new_state->src_w;
>>>>>> -
>>>>>> -  if (plane_state->fb != new_state->fb)
>>>>>> -          drm_atomic_set_fb_for_plane(plane_state, new_state->fb);
>>>>>> -
>>>>>> -  swap(plane_state, plane->state);
>>>>>> -
>>>>>> -  if (plane->state->fb && plane->state->fb != new_state->fb) {
>>>>>> +  /*
>>>>>> +   * A scanout can still be occurring, so we can't drop the reference to
>>>>>> +   * the old framebuffer. To solve this we get a reference to old_fb and
>>>>>> +   * set a worker to release it later.
>>>>>
>>>>> Hm, doesn't look like an async update to me if we have to wait for the
>>>>> next VBLANK to happen to get the new content on the screen. Maybe we
>>>>> should reject async updates when old_fb != new_fb in the rk
>>>>> ->async_check() hook.
>>>>
>>>> Unless I am misunderstanding this, we don't wait here, we just grab a
>>>> reference to the fb in case it is being still used by the hw, so it
>>>> doesn't get released prematurely.
>>>
>>> I was just reacting to the comment that says the new FB should stay
>>> around until the next VBLANK event happens. If the FB must stay around
>>> that probably means the HW is still using, which made me wonder if this
>>> HW actually supports async update (where async means "update now and
>>> don't care about about tearing"). Or maybe it takes some time to switch
>>> to the new FB and waiting for the next VBLANK to release the old FB was
>>> an easy solution to not wait for the flip to actually happen in
>>> ->async_update() (which is kind of a combination of async+non-blocking).
>>
>> The hardware switches framebuffers on vblank, so whatever framebuffer
>> is currently being scanned out from needs to stay there until the
>> hardware switches to the new one in shadow registers. If that doesn't
>> happen, you get IOMMU faults and the display controller stops working
>> since we don't have any fault handling currently, just printing a
>> message.
> 
> Sounds like your hardware doesn't actually support async flips. It's
> probably better for the driver not to pretend otherwise.
> 
> 

I think we need to clarify the meaning of the async_update callback
(and we should clarify it in the docs).

The way I understand what the async_update callback should do is: don't
block (i.e. don't wait for the next vblank), and update the hw state at
some point with the latest state from the last call to async_update.

Which means that any driver can implement the async_update callback,
regardless of whether it supports changing its state right away.
If the hw supports it, async_update can change the hw state right away; if
not, the changes will be applied in the next vblank (it can even amend the
pending commit if there is one).
With this, we can remove all the legacy cursor code to use the
async_update callback, since async_update can be called 100 times before
the next vblank, and the latest state will be set to the hw without
waiting 100 vblanks.
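
To illustrate the "latest state wins" behaviour described above, here is a
minimal userspace sketch. It is not kernel code: the sketch_* structs and
functions are hypothetical names made up for this example, and the vblank is
simulated by an explicit function call. It only shows that many non-blocking
updates before one vblank can collapse into a single hw write.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified plane state; not the real struct drm_plane_state. */
struct sketch_plane_state {
	int crtc_x;
	int crtc_y;
};

/* Hypothetical plane: what the hw currently shows, plus one pending slot. */
struct sketch_plane {
	struct sketch_plane_state hw;      /* state currently programmed in hw */
	struct sketch_plane_state pending; /* latest state requested so far */
	bool has_pending;
	int hw_writes;                     /* how many times hw was programmed */
};

/* Never blocks: just overwrite (amend) whatever is already pending. */
void sketch_async_update(struct sketch_plane *p,
			 const struct sketch_plane_state *new_state)
{
	p->pending = *new_state;
	p->has_pending = true;
}

/* Called once per simulated vblank: latch the most recent pending state. */
void sketch_vblank(struct sketch_plane *p)
{
	if (!p->has_pending)
		return;
	p->hw = p->pending;
	p->has_pending = false;
	p->hw_writes++;
}

/* Issue 100 updates before one vblank; return the x position hw ends up with. */
int sketch_demo(int *hw_writes)
{
	struct sketch_plane p = { 0 };
	int i;

	for (i = 1; i <= 100; i++) {
		struct sketch_plane_state s = { .crtc_x = i, .crtc_y = i };

		sketch_async_update(&p, &s);
	}
	sketch_vblank(&p);
	*hw_writes = p.hw_writes;
	return p.hw.crtc_x;
}
```

In this model the 100 calls cost a single hw write at the next vblank, which
is the property the legacy cursor path relies on.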

Please, let me know if this is your understanding as well. If not, then
we need to remodel things.

Thanks,
Helen
Michel Dänzer March 14, 2019, 9:15 a.m. UTC | #8
On 2019-03-13 7:08 p.m., Helen Koike wrote:
> On 3/13/19 6:58 AM, Michel Dänzer wrote:
>> On 2019-03-13 4:42 a.m., Tomasz Figa wrote:
>>> On Wed, Mar 13, 2019 at 12:52 AM Boris Brezillon
>>> <boris.brezillon@collabora.com> wrote:
>>>> On Tue, 12 Mar 2019 12:34:45 -0300
>>>> Helen Koike <helen.koike@collabora.com> wrote:
>>>>> On 3/12/19 3:34 AM, Boris Brezillon wrote:
>>>>>> On Mon, 11 Mar 2019 23:21:59 -0300
>>>>>> Helen Koike <helen.koike@collabora.com> wrote:
>>>>>>
>>>>>>> --- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
>>>>>>> +++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
>>>>>>> @@ -912,30 +912,31 @@ static void vop_plane_atomic_async_update(struct drm_plane *plane,
>>>>>>>                                      struct drm_plane_state *new_state)
>>>>>>>  {
>>>>>>>    struct vop *vop = to_vop(plane->state->crtc);
>>>>>>> -  struct drm_plane_state *plane_state;
>>>>>>> +  struct drm_framebuffer *old_fb = plane->state->fb;
>>>>>>>
>>>>>>> -  plane_state = plane->funcs->atomic_duplicate_state(plane);
>>>>>>> -  plane_state->crtc_x = new_state->crtc_x;
>>>>>>> -  plane_state->crtc_y = new_state->crtc_y;
>>>>>>> -  plane_state->crtc_h = new_state->crtc_h;
>>>>>>> -  plane_state->crtc_w = new_state->crtc_w;
>>>>>>> -  plane_state->src_x = new_state->src_x;
>>>>>>> -  plane_state->src_y = new_state->src_y;
>>>>>>> -  plane_state->src_h = new_state->src_h;
>>>>>>> -  plane_state->src_w = new_state->src_w;
>>>>>>> -
>>>>>>> -  if (plane_state->fb != new_state->fb)
>>>>>>> -          drm_atomic_set_fb_for_plane(plane_state, new_state->fb);
>>>>>>> -
>>>>>>> -  swap(plane_state, plane->state);
>>>>>>> -
>>>>>>> -  if (plane->state->fb && plane->state->fb != new_state->fb) {
>>>>>>> +  /*
>>>>>>> +   * A scanout can still be occurring, so we can't drop the reference to
>>>>>>> +   * the old framebuffer. To solve this we get a reference to old_fb and
>>>>>>> +   * set a worker to release it later.
>>>>>>
>>>>>> Hm, doesn't look like an async update to me if we have to wait for the
>>>>>> next VBLANK to happen to get the new content on the screen. Maybe we
>>>>>> should reject async updates when old_fb != new_fb in the rk
>>>>>> ->async_check() hook.
>>>>>
>>>>> Unless I am misunderstanding this, we don't wait here, we just grab a
>>>>> reference to the fb in case it is being still used by the hw, so it
>>>>> doesn't get released prematurely.
>>>>
>>>> I was just reacting to the comment that says the new FB should stay
>>>> around until the next VBLANK event happens. If the FB must stay around
>>>> that probably means the HW is still using, which made me wonder if this
>>>> HW actually supports async update (where async means "update now and
>>>> don't care about about tearing"). Or maybe it takes some time to switch
>>>> to the new FB and waiting for the next VBLANK to release the old FB was
>>>> an easy solution to not wait for the flip to actually happen in
>>>> ->async_update() (which is kind of a combination of async+non-blocking).
>>>
>>> The hardware switches framebuffers on vblank, so whatever framebuffer
>>> is currently being scanned out from needs to stay there until the
>>> hardware switches to the new one in shadow registers. If that doesn't
>>> happen, you get IOMMU faults and the display controller stops working
>>> since we don't have any fault handling currently, just printing a
>>> message.
>>
>> Sounds like your hardware doesn't actually support async flips. It's
>> probably better for the driver not to pretend otherwise.
> 
> I think we need to clarify the meaning of the async_update callback
> (and we should clarify it in the docs).
> 
> The way I understand what the async_update callback should do is: don't
> block (i.e. don't wait for the next vblank),

Note that those are two separate things. "Async flips" are about "don't
wait for vblank", not about "don't block".


> and update the hw state at some point with the latest state from the
> last call to async_update.
> 
> Which means that: any driver can implement the async_update callback,
> independently if it supports changing its state right away or not.
> If hw supports, async_update can change the hw state right away, if not,
> then changes will be applied in the next vblank (it can even amend the
> pending commit if there is one).
> With this, we can remove all the legacy cursor code to use the
> async_update callback, since async_update can be called 100 times before
> the next vblank, and the latest state will be set to the hw without
> waiting 100 vblanks.
> 
> Please, let me know if this is your understanding as well. If not, then
> we need to remodel things.

While this may make sense for cursor updates, I don't think it does for
async flips. If the flip only actually takes effect during the next
vblank, it doesn't really fit the definition and userspace expectation
of an async flip. It's better to clearly communicate to userspace that
the hardware cannot do async flips, than to pretend it can and fake
them. Userspace has to deal with this anyway, since async flips weren't
always supported in general.
Helen Koike March 14, 2019, 5:51 p.m. UTC | #9
On 3/14/19 6:15 AM, Michel Dänzer wrote:
> On 2019-03-13 7:08 p.m., Helen Koike wrote:
>> On 3/13/19 6:58 AM, Michel Dänzer wrote:
>>> On 2019-03-13 4:42 a.m., Tomasz Figa wrote:
>>>> On Wed, Mar 13, 2019 at 12:52 AM Boris Brezillon
>>>> <boris.brezillon@collabora.com> wrote:
>>>>> On Tue, 12 Mar 2019 12:34:45 -0300
>>>>> Helen Koike <helen.koike@collabora.com> wrote:
>>>>>> On 3/12/19 3:34 AM, Boris Brezillon wrote:
>>>>>>> On Mon, 11 Mar 2019 23:21:59 -0300
>>>>>>> Helen Koike <helen.koike@collabora.com> wrote:
>>>>>>>
>>>>>>>> --- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
>>>>>>>> +++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
>>>>>>>> @@ -912,30 +912,31 @@ static void vop_plane_atomic_async_update(struct drm_plane *plane,
>>>>>>>>                                      struct drm_plane_state *new_state)
>>>>>>>>  {
>>>>>>>>    struct vop *vop = to_vop(plane->state->crtc);
>>>>>>>> -  struct drm_plane_state *plane_state;
>>>>>>>> +  struct drm_framebuffer *old_fb = plane->state->fb;
>>>>>>>>
>>>>>>>> -  plane_state = plane->funcs->atomic_duplicate_state(plane);
>>>>>>>> -  plane_state->crtc_x = new_state->crtc_x;
>>>>>>>> -  plane_state->crtc_y = new_state->crtc_y;
>>>>>>>> -  plane_state->crtc_h = new_state->crtc_h;
>>>>>>>> -  plane_state->crtc_w = new_state->crtc_w;
>>>>>>>> -  plane_state->src_x = new_state->src_x;
>>>>>>>> -  plane_state->src_y = new_state->src_y;
>>>>>>>> -  plane_state->src_h = new_state->src_h;
>>>>>>>> -  plane_state->src_w = new_state->src_w;
>>>>>>>> -
>>>>>>>> -  if (plane_state->fb != new_state->fb)
>>>>>>>> -          drm_atomic_set_fb_for_plane(plane_state, new_state->fb);
>>>>>>>> -
>>>>>>>> -  swap(plane_state, plane->state);
>>>>>>>> -
>>>>>>>> -  if (plane->state->fb && plane->state->fb != new_state->fb) {
>>>>>>>> +  /*
>>>>>>>> +   * A scanout can still be occurring, so we can't drop the reference to
>>>>>>>> +   * the old framebuffer. To solve this we get a reference to old_fb and
>>>>>>>> +   * set a worker to release it later.
>>>>>>>
>>>>>>> Hm, doesn't look like an async update to me if we have to wait for the
>>>>>>> next VBLANK to happen to get the new content on the screen. Maybe we
>>>>>>> should reject async updates when old_fb != new_fb in the rk
>>>>>>> ->async_check() hook.
>>>>>>
>>>>>> Unless I am misunderstanding this, we don't wait here, we just grab a
>>>>>> reference to the fb in case it is being still used by the hw, so it
>>>>>> doesn't get released prematurely.
>>>>>
>>>>> I was just reacting to the comment that says the new FB should stay
>>>>> around until the next VBLANK event happens. If the FB must stay around
>>>>> that probably means the HW is still using, which made me wonder if this
>>>>> HW actually supports async update (where async means "update now and
>>>>> don't care about about tearing"). Or maybe it takes some time to switch
>>>>> to the new FB and waiting for the next VBLANK to release the old FB was
>>>>> an easy solution to not wait for the flip to actually happen in
>>>>> ->async_update() (which is kind of a combination of async+non-blocking).
>>>>
>>>> The hardware switches framebuffers on vblank, so whatever framebuffer
>>>> is currently being scanned out from needs to stay there until the
>>>> hardware switches to the new one in shadow registers. If that doesn't
>>>> happen, you get IOMMU faults and the display controller stops working
>>>> since we don't have any fault handling currently, just printing a
>>>> message.
>>>
>>> Sounds like your hardware doesn't actually support async flips. It's
>>> probably better for the driver not to pretend otherwise.
>>
>> I think we need to clarify the meaning of the async_update callback
>> (and we should clarify it in the docs).
>>
>> The way I understand what the async_update callback should do is: don't
>> block (i.e. don't wait for the next vblank),
> 
> Note that those are two separate things. "Async flips" are about "don't
> wait for vblank", not about "don't block".
> 
> 
>> and update the hw state at some point with the latest state from the
>> last call to async_update.
>>
>> Which means that: any driver can implement the async_update callback,
>> independently if it supports changing its state right away or not.
>> If hw supports, async_update can change the hw state right away, if not,
>> then changes will be applied in the next vblank (it can even amend the
>> pending commit if there is one).
>> With this, we can remove all the legacy cursor code to use the
>> async_update callback, since async_update can be called 100 times before
>> the next vblank, and the latest state will be set to the hw without
>> waiting 100 vblanks.
>>
>> Please, let me know if this is your understanding as well. If not, then
>> we need to remodel things.
> 
> While this may make sense for cursor updates, I don't think it does for
> async flips. If the flip only actually takes effect during the next
> vblank, it doesn't really fit the definition and userspace expectation
> of an async flip. It's better to clearly communicate to userspace that
> the hardware cannot do async flips, than to pretend it can and fake
> them. Userspace has to deal with this anyway, since async flips weren't
> always supported in general.
> 
> 

What do you think if we separate two concepts here:

- amend mode: works like cursor updates, i.e. update the hw state at
some point with the latest state from the last call to async_update. No
special hardware support is required.

- async update: update hw state immediately. This depends on whether the
hw supports it.

Every async update is an amend, but the opposite is not necessarily true.

What do you think if we rename the current async_update to amend_update,
and we add a parameter "force_async" to it? (or maybe
force_immediate_update?)
Then amend_check with force_async=1 would fail if the hardware doesn't
support it (we could also add flags in the capabilities to inform
userspace of the expected behaviour and of whether the hw supports
force_async).

Like this, we can implement the cursors using the amend_update (which is
now called async_update), and async_flips with amend_update with
force_async=1.
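
A rough sketch of what the check side of this proposal could look like. None
of this exists today: sketch_hw_caps, sketch_amend_check() and the choice of
-EINVAL as the error code are assumptions made purely for illustration, in
plain userspace C rather than against the real DRM structures.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical capability flag; the proposal only suggests exposing one. */
struct sketch_hw_caps {
	bool can_update_immediately;
};

/*
 * Hypothetical amend_check(): a plain amend is always accepted, while
 * force_async is rejected when the hw cannot update outside of vblank.
 * The -EINVAL return value is an assumption, not something agreed on.
 */
int sketch_amend_check(const struct sketch_hw_caps *caps, bool force_async)
{
	if (force_async && !caps->can_update_immediately)
		return -EINVAL;
	return 0;
}
```

With this split, cursor updates would call the check with force_async=0 and
always pass, while an async flip on vblank-only hw would be refused up front
instead of being silently deferred.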

If this sounds like a reasonable proposal I can try to work on a proof of
concept. What do you think? Let me know if you have any other ideas.

Thanks,
Helen
Michel Dänzer March 15, 2019, 10:11 a.m. UTC | #10
On 2019-03-14 6:51 p.m., Helen Koike wrote:
> On 3/14/19 6:15 AM, Michel Dänzer wrote:
>> On 2019-03-13 7:08 p.m., Helen Koike wrote:
>>> On 3/13/19 6:58 AM, Michel Dänzer wrote:
>>>> On 2019-03-13 4:42 a.m., Tomasz Figa wrote:
>>>>> On Wed, Mar 13, 2019 at 12:52 AM Boris Brezillon
>>>>> <boris.brezillon@collabora.com> wrote:
>>>>>> On Tue, 12 Mar 2019 12:34:45 -0300
>>>>>> Helen Koike <helen.koike@collabora.com> wrote:
>>>>>>> On 3/12/19 3:34 AM, Boris Brezillon wrote:
>>>>>>>> On Mon, 11 Mar 2019 23:21:59 -0300
>>>>>>>> Helen Koike <helen.koike@collabora.com> wrote:
>>>>>>>>
>>>>>>>>> --- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
>>>>>>>>> +++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
>>>>>>>>> @@ -912,30 +912,31 @@ static void vop_plane_atomic_async_update(struct drm_plane *plane,
>>>>>>>>>                                      struct drm_plane_state *new_state)
>>>>>>>>>  {
>>>>>>>>>    struct vop *vop = to_vop(plane->state->crtc);
>>>>>>>>> -  struct drm_plane_state *plane_state;
>>>>>>>>> +  struct drm_framebuffer *old_fb = plane->state->fb;
>>>>>>>>>
>>>>>>>>> -  plane_state = plane->funcs->atomic_duplicate_state(plane);
>>>>>>>>> -  plane_state->crtc_x = new_state->crtc_x;
>>>>>>>>> -  plane_state->crtc_y = new_state->crtc_y;
>>>>>>>>> -  plane_state->crtc_h = new_state->crtc_h;
>>>>>>>>> -  plane_state->crtc_w = new_state->crtc_w;
>>>>>>>>> -  plane_state->src_x = new_state->src_x;
>>>>>>>>> -  plane_state->src_y = new_state->src_y;
>>>>>>>>> -  plane_state->src_h = new_state->src_h;
>>>>>>>>> -  plane_state->src_w = new_state->src_w;
>>>>>>>>> -
>>>>>>>>> -  if (plane_state->fb != new_state->fb)
>>>>>>>>> -          drm_atomic_set_fb_for_plane(plane_state, new_state->fb);
>>>>>>>>> -
>>>>>>>>> -  swap(plane_state, plane->state);
>>>>>>>>> -
>>>>>>>>> -  if (plane->state->fb && plane->state->fb != new_state->fb) {
>>>>>>>>> +  /*
>>>>>>>>> +   * A scanout can still be occurring, so we can't drop the reference to
>>>>>>>>> +   * the old framebuffer. To solve this we get a reference to old_fb and
>>>>>>>>> +   * set a worker to release it later.
>>>>>>>>
>>>>>>>> Hm, doesn't look like an async update to me if we have to wait for the
>>>>>>>> next VBLANK to happen to get the new content on the screen. Maybe we
>>>>>>>> should reject async updates when old_fb != new_fb in the rk
>>>>>>>> ->async_check() hook.
>>>>>>>
>>>>>>> Unless I am misunderstanding this, we don't wait here, we just grab a
>>>>>>> reference to the fb in case it is being still used by the hw, so it
>>>>>>> doesn't get released prematurely.
>>>>>>
>>>>>> I was just reacting to the comment that says the new FB should stay
>>>>>> around until the next VBLANK event happens. If the FB must stay around
>>>>>> that probably means the HW is still using, which made me wonder if this
>>>>>> HW actually supports async update (where async means "update now and
>>>>>> don't care about about tearing"). Or maybe it takes some time to switch
>>>>>> to the new FB and waiting for the next VBLANK to release the old FB was
>>>>>> an easy solution to not wait for the flip to actually happen in
>>>>>> ->async_update() (which is kind of a combination of async+non-blocking).
>>>>>
>>>>> The hardware switches framebuffers on vblank, so whatever framebuffer
>>>>> is currently being scanned out from needs to stay there until the
>>>>> hardware switches to the new one in shadow registers. If that doesn't
>>>>> happen, you get IOMMU faults and the display controller stops working
>>>>> since we don't have any fault handling currently, just printing a
>>>>> message.
>>>>
>>>> Sounds like your hardware doesn't actually support async flips. It's
>>>> probably better for the driver not to pretend otherwise.
>>>
>>> I think we need to clarify the meaning of the async_update callback
>>> (and we should clarify it in the docs).
>>>
>>> The way I understand what the async_update callback should do is: don't
>>> block (i.e. don't wait for the next vblank),
>>
>> Note that those are two separate things. "Async flips" are about "don't
>> wait for vblank", not about "don't block".
>>
>>
>>> and update the hw state at some point with the latest state from the
>>> last call to async_update.
>>>
>>> Which means that: any driver can implement the async_update callback,
>>> independently if it supports changing its state right away or not.
>>> If hw supports, async_update can change the hw state right away, if not,
>>> then changes will be applied in the next vblank (it can even amend the
>>> pending commit if there is one).
>>> With this, we can remove all the legacy cursor code to use the
>>> async_update callback, since async_update can be called 100 times before
>>> the next vblank, and the latest state will be set to the hw without
>>> waiting 100 vblanks.
>>>
>>> Please, let me know if this is your understanding as well. If not, then
>>> we need to remodel things.
>>
>> While this may make sense for cursor updates, I don't think it does for
>> async flips. If the flip only actually takes effect during the next
>> vblank, it doesn't really fit the definition and userspace expectation
>> of an async flip. It's better to clearly communicate to userspace that
>> the hardware cannot do async flips, than to pretend it can and fake
>> them. Userspace has to deal with this anyway, since async flips weren't
>> always supported in general.
> 
> What do you think if we separate two concepts here:
> 
> - amend mode: works like cursor updates, i.e, update the hw state at
> some point with the latest state from the last call to async_update. No
> special hardware support is required.
> 
> - async update: update hw state immediately. This depends if the hw
> supports it or not.
> 
> Every async update is an amend, but the opposite is not necessarily true.
> 
> What do you think if we rename the current async_update to amend_update,
> and we add a parameter "force_async" to it? (or maybe
> force_immediate_update?)
> Then amend_check with force_async=1 would fail if the hardware doesn't
> support it (we could also add flags in the capabilities to inform
> userspace the expected behaviour of things and if the hw supports
> force_async).
> 
> Like this, we can implement the cursors using the amend_update (which is
> now called async_update), and async_flips with amend_update with
> force_async=1.

Might force_async make sense for cursor updates as well? I thought some
hardware supported HW cursor updates outside of vblank, but I'm not sure.

Without force_async, are cursor updates always applied to the hardware
on the next vblank, even if the pending commit is delayed further (e.g.
because a fence it depends on doesn't signal before vblank)? If cursor
updates can be delayed beyond the next vblank, that can result in bad
user experience.
Boris Brezillon March 15, 2019, 10:25 a.m. UTC | #11
On Fri, 15 Mar 2019 11:11:36 +0100
Michel Dänzer <michel@daenzer.net> wrote:

> On 2019-03-14 6:51 p.m., Helen Koike wrote:
> > On 3/14/19 6:15 AM, Michel Dänzer wrote:  
> >> On 2019-03-13 7:08 p.m., Helen Koike wrote:  
> >>> On 3/13/19 6:58 AM, Michel Dänzer wrote:  
> >>>> On 2019-03-13 4:42 a.m., Tomasz Figa wrote:  
> >>>>> On Wed, Mar 13, 2019 at 12:52 AM Boris Brezillon
> >>>>> <boris.brezillon@collabora.com> wrote:  
> >>>>>> On Tue, 12 Mar 2019 12:34:45 -0300
> >>>>>> Helen Koike <helen.koike@collabora.com> wrote:  
> >>>>>>> On 3/12/19 3:34 AM, Boris Brezillon wrote:  
> >>>>>>>> On Mon, 11 Mar 2019 23:21:59 -0300
> >>>>>>>> Helen Koike <helen.koike@collabora.com> wrote:
> >>>>>>>>  
> >>>>>>>>> --- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
> >>>>>>>>> +++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
> >>>>>>>>> @@ -912,30 +912,31 @@ static void vop_plane_atomic_async_update(struct drm_plane *plane,
> >>>>>>>>>                                      struct drm_plane_state *new_state)
> >>>>>>>>>  {
> >>>>>>>>>    struct vop *vop = to_vop(plane->state->crtc);
> >>>>>>>>> -  struct drm_plane_state *plane_state;
> >>>>>>>>> +  struct drm_framebuffer *old_fb = plane->state->fb;
> >>>>>>>>>
> >>>>>>>>> -  plane_state = plane->funcs->atomic_duplicate_state(plane);
> >>>>>>>>> -  plane_state->crtc_x = new_state->crtc_x;
> >>>>>>>>> -  plane_state->crtc_y = new_state->crtc_y;
> >>>>>>>>> -  plane_state->crtc_h = new_state->crtc_h;
> >>>>>>>>> -  plane_state->crtc_w = new_state->crtc_w;
> >>>>>>>>> -  plane_state->src_x = new_state->src_x;
> >>>>>>>>> -  plane_state->src_y = new_state->src_y;
> >>>>>>>>> -  plane_state->src_h = new_state->src_h;
> >>>>>>>>> -  plane_state->src_w = new_state->src_w;
> >>>>>>>>> -
> >>>>>>>>> -  if (plane_state->fb != new_state->fb)
> >>>>>>>>> -          drm_atomic_set_fb_for_plane(plane_state, new_state->fb);
> >>>>>>>>> -
> >>>>>>>>> -  swap(plane_state, plane->state);
> >>>>>>>>> -
> >>>>>>>>> -  if (plane->state->fb && plane->state->fb != new_state->fb) {
> >>>>>>>>> +  /*
> >>>>>>>>> +   * A scanout can still be occurring, so we can't drop the reference to
> >>>>>>>>> +   * the old framebuffer. To solve this we get a reference to old_fb and
> >>>>>>>>> +   * set a worker to release it later.  
> >>>>>>>>
> >>>>>>>> Hm, doesn't look like an async update to me if we have to wait for the
> >>>>>>>> next VBLANK to happen to get the new content on the screen. Maybe we
> >>>>>>>> should reject async updates when old_fb != new_fb in the rk  
> >>>>>>>> ->async_check() hook.  
> >>>>>>>
> >>>>>>> Unless I am misunderstanding this, we don't wait here, we just grab a
> >>>>>>> reference to the fb in case it is being still used by the hw, so it
> >>>>>>> doesn't get released prematurely.  
> >>>>>>
> >>>>>> I was just reacting to the comment that says the new FB should stay
> >>>>>> around until the next VBLANK event happens. If the FB must stay around
> >>>>>> that probably means the HW is still using, which made me wonder if this
> >>>>>> HW actually supports async update (where async means "update now and
> >>>>>> don't care about about tearing"). Or maybe it takes some time to switch
> >>>>>> to the new FB and waiting for the next VBLANK to release the old FB was
> >>>>>> an easy solution to not wait for the flip to actually happen in  
> >>>>>> ->async_update() (which is kind of a combination of async+non-blocking).  
> >>>>>
> >>>>> The hardware switches framebuffers on vblank, so whatever framebuffer
> >>>>> is currently being scanned out from needs to stay there until the
> >>>>> hardware switches to the new one in shadow registers. If that doesn't
> >>>>> happen, you get IOMMU faults and the display controller stops working
> >>>>> since we don't have any fault handling currently, just printing a
> >>>>> message.  
> >>>>
> >>>> Sounds like your hardware doesn't actually support async flips. It's
> >>>> probably better for the driver not to pretend otherwise.  
> >>>
> >>> I think we need to clarify the meaning of the async_update callback
> >>> (and we should clarify it in the docs).
> >>>
> >>> The way I understand what the async_update callback should do is: don't
> >>> block (i.e. don't wait for the next vblank),  
> >>
> >> Note that those are two separate things. "Async flips" are about "don't
> >> wait for vblank", not about "don't block".
> >>
> >>  
> >>> and update the hw state at some point with the latest state from the
> >>> last call to async_update.
> >>>
> >>> Which means that: any driver can implement the async_update callback,
> >>> independently if it supports changing its state right away or not.
> >>> If hw supports, async_update can change the hw state right away, if not,
> >>> then changes will be applied in the next vblank (it can even amend the
> >>> pending commit if there is one).
> >>> With this, we can remove all the legacy cursor code to use the
> >>> async_update callback, since async_update can be called 100 times before
> >>> the next vblank, and the latest state will be set to the hw without
> >>> waiting 100 vblanks.
> >>>
> >>> Please, let me know if this is your understanding as well. If not, then
> >>> we need to remodel things.  
> >>
> >> While this may make sense for cursor updates, I don't think it does for
> >> async flips. If the flip only actually takes effect during the next
> >> vblank, it doesn't really fit the definition and userspace expectation
> >> of an async flip. It's better to clearly communicate to userspace that
> >> the hardware cannot do async flips, than to pretend it can and fake
> >> them. Userspace has to deal with this anyway, since async flips weren't
> >> always supported in general.  
> > 
> > What do you think if we separate two concepts here:
> > 
> > - amend mode: works like cursor updates, i.e, update the hw state at
> > some point with the latest state from the last call to async_update. No
> > special hardware support is required.
> > 
> > - async update: update hw state immediately. This depends if the hw
> > supports it or not.
> > 
> > Every async update is an amend, but the opposite is not necessarily true.
> > 
> > What do you think if we rename the current async_update to amend_update,
> > and we add a parameter "force_async" to it? (or maybe
> > force_immediate_update?)
> > Then amend_check with force_async=1 would fail if the hardware doesn't
> > support it (we could also add flags in the capabilities to inform
> > userspace the expected behaviour of things and if the hw supports
> > force_sync).
> > 
> > Like this, we can implement the cursors using the amend_update (which is
> > now called async_update), and async_flips with amend_update with
> > force_async=1.  
> 
> Might force_async make sense for cursor updates as well? I thought some
> hardware supported HW cursor updates outside of vblank, but I'm not sure.
> 
> Without force_async, are cursor updates always applied to the hardware
> on the next vblank, even if the pending commit is delayed further (e.g.
> because a fence it depends on doesn't signal before vblank)? If cursor
> updates can be delayed beyond the next vblank, that can result in bad
> user experience.

You mean you have

1. sync/regular update pending (waiting on a fence)
2. async update on top of #1

?

In that case I'd expect async_update to either fail with -EBUSY or
fall back to a sync update, but #2 should never go before #1 because the
plane state in #2 has been constructed from the expected state after #1
has been applied.

Note that right now this situation cannot happen because we fall back to
a sync update when ->hw_done of the previous commit is not signaled.
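
To make the ordering rule concrete, here is a minimal sketch in plain C,
not actual DRM code. The struct, the enum and sketch_pick_path() are
hypothetical names made up for illustration; only the -EBUSY vs.
sync-fallback choice comes from the discussion above.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a commit; not a real DRM struct. */
struct sketch_commit {
	bool hw_done;	/* true once the commit has reached the hardware */
};

enum sketch_path {
	SKETCH_PATH_ASYNC,	/* apply the update in place, right away */
	SKETCH_PATH_SYNC,	/* queue the update behind the pending commit */
};

/*
 * Decide how to handle an async update request (#2) given the previous
 * commit (#1, NULL if there is none). Returns 0 and sets *path, or
 * -EBUSY when #1 is still pending and a sync fallback is not allowed.
 */
int sketch_pick_path(const struct sketch_commit *prev,
		     bool allow_sync_fallback,
		     enum sketch_path *path)
{
	if (prev == NULL || prev->hw_done) {
		*path = SKETCH_PATH_ASYNC;
		return 0;
	}

	/* #1 has not been applied yet, so #2 must never overtake it. */
	if (!allow_sync_fallback)
		return -EBUSY;

	*path = SKETCH_PATH_SYNC;
	return 0;
}
```

With allow_sync_fallback set, this matches the current behaviour described
above (fall back to a sync update while ->hw_done is not signaled).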
Michel Dänzer March 15, 2019, 11:29 a.m. UTC | #12
On 2019-03-15 11:25 a.m., Boris Brezillon wrote:
> On Fri, 15 Mar 2019 11:11:36 +0100
> Michel Dänzer <michel@daenzer.net> wrote:
> 
>> On 2019-03-14 6:51 p.m., Helen Koike wrote:
>>> On 3/14/19 6:15 AM, Michel Dänzer wrote:  
>>>> On 2019-03-13 7:08 p.m., Helen Koike wrote:  
>>>>> On 3/13/19 6:58 AM, Michel Dänzer wrote:  
>>>>>> On 2019-03-13 4:42 a.m., Tomasz Figa wrote:  
>>>>>>> On Wed, Mar 13, 2019 at 12:52 AM Boris Brezillon
>>>>>>> <boris.brezillon@collabora.com> wrote:  
>>>>>>>> On Tue, 12 Mar 2019 12:34:45 -0300
>>>>>>>> Helen Koike <helen.koike@collabora.com> wrote:  
>>>>>>>>> On 3/12/19 3:34 AM, Boris Brezillon wrote:  
>>>>>>>>>> On Mon, 11 Mar 2019 23:21:59 -0300
>>>>>>>>>> Helen Koike <helen.koike@collabora.com> wrote:
>>>>>>>>>>  
>>>>>>>>>>> --- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
>>>>>>>>>>> +++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
>>>>>>>>>>> @@ -912,30 +912,31 @@ static void vop_plane_atomic_async_update(struct drm_plane *plane,
>>>>>>>>>>>                                      struct drm_plane_state *new_state)
>>>>>>>>>>>  {
>>>>>>>>>>>    struct vop *vop = to_vop(plane->state->crtc);
>>>>>>>>>>> -  struct drm_plane_state *plane_state;
>>>>>>>>>>> +  struct drm_framebuffer *old_fb = plane->state->fb;
>>>>>>>>>>>
>>>>>>>>>>> -  plane_state = plane->funcs->atomic_duplicate_state(plane);
>>>>>>>>>>> -  plane_state->crtc_x = new_state->crtc_x;
>>>>>>>>>>> -  plane_state->crtc_y = new_state->crtc_y;
>>>>>>>>>>> -  plane_state->crtc_h = new_state->crtc_h;
>>>>>>>>>>> -  plane_state->crtc_w = new_state->crtc_w;
>>>>>>>>>>> -  plane_state->src_x = new_state->src_x;
>>>>>>>>>>> -  plane_state->src_y = new_state->src_y;
>>>>>>>>>>> -  plane_state->src_h = new_state->src_h;
>>>>>>>>>>> -  plane_state->src_w = new_state->src_w;
>>>>>>>>>>> -
>>>>>>>>>>> -  if (plane_state->fb != new_state->fb)
>>>>>>>>>>> -          drm_atomic_set_fb_for_plane(plane_state, new_state->fb);
>>>>>>>>>>> -
>>>>>>>>>>> -  swap(plane_state, plane->state);
>>>>>>>>>>> -
>>>>>>>>>>> -  if (plane->state->fb && plane->state->fb != new_state->fb) {
>>>>>>>>>>> +  /*
>>>>>>>>>>> +   * A scanout can still be occurring, so we can't drop the reference to
>>>>>>>>>>> +   * the old framebuffer. To solve this we get a reference to old_fb and
>>>>>>>>>>> +   * set a worker to release it later.  
>>>>>>>>>>
>>>>>>>>>> Hm, doesn't look like an async update to me if we have to wait for the
>>>>>>>>>> next VBLANK to happen to get the new content on the screen. Maybe we
>>>>>>>>>> should reject async updates when old_fb != new_fb in the rk  
>>>>>>>>>> ->async_check() hook.  
>>>>>>>>>
>>>>>>>>> Unless I am misunderstanding this, we don't wait here, we just grab a
>>>>>>>>> reference to the fb in case it is being still used by the hw, so it
>>>>>>>>> doesn't get released prematurely.  
>>>>>>>>
>>>>>>>> I was just reacting to the comment that says the new FB should stay
>>>>>>>> around until the next VBLANK event happens. If the FB must stay around
>>>>>>>> that probably means the HW is still using, which made me wonder if this
>>>>>>>> HW actually supports async update (where async means "update now and
>>>>>>>> don't care about tearing"). Or maybe it takes some time to switch
>>>>>>>> to the new FB and waiting for the next VBLANK to release the old FB was
>>>>>>>> an easy solution to not wait for the flip to actually happen in  
>>>>>>>> ->async_update() (which is kind of a combination of async+non-blocking).  
>>>>>>>
>>>>>>> The hardware switches framebuffers on vblank, so whatever framebuffer
>>>>>>> is currently being scanned out from needs to stay there until the
>>>>>>> hardware switches to the new one in shadow registers. If that doesn't
>>>>>>> happen, you get IOMMU faults and the display controller stops working
>>>>>>> since we don't have any fault handling currently, just printing a
>>>>>>> message.  
>>>>>>
>>>>>> Sounds like your hardware doesn't actually support async flips. It's
>>>>>> probably better for the driver not to pretend otherwise.  
>>>>>
>>>>> I think we need to clarify the meaning of the async_update callback
>>>>> (and we should clarify it in the docs).
>>>>>
>>>>> The way I understand what the async_update callback should do is: don't
>>>>> block (i.e. don't wait for the next vblank),  
>>>>
>>>> Note that those are two separate things. "Async flips" are about "don't
>>>> wait for vblank", not about "don't block".
>>>>
>>>>  
>>>>> and update the hw state at some point with the latest state from the
>>>>> last call to async_update.
>>>>>
>>>>> Which means that: any driver can implement the async_update callback,
>>>>> independently if it supports changing its state right away or not.
>>>>> If hw supports, async_update can change the hw state right away, if not,
>>>>> then changes will be applied in the next vblank (it can even amend the
>>>>> pending commit if there is one).
>>>>> With this, we can remove all the legacy cursor code to use the
>>>>> async_update callback, since async_update can be called 100 times before
>>>>> the next vblank, and the latest state will be set to the hw without
>>>>> waiting 100 vblanks.
>>>>>
>>>>> Please, let me know if this is your understanding as well. If not, then
>>>>> we need to remodel things.  
>>>>
>>>> While this may make sense for cursor updates, I don't think it does for
>>>> async flips. If the flip only actually takes effect during the next
>>>> vblank, it doesn't really fit the definition and userspace expectation
>>>> of an async flip. It's better to clearly communicate to userspace that
>>>> the hardware cannot do async flips, than to pretend it can and fake
>>>> them. Userspace has to deal with this anyway, since async flips weren't
>>>> always supported in general.  
>>>
>>> What do you think if we separate two concepts here:
>>>
>>> - amend mode: works like cursor updates, i.e, update the hw state at
>>> some point with the latest state from the last call to async_update. No
>>> special hardware support is required.
>>>
>>> - async update: update hw state immediately. This depends if the hw
>>> supports it or not.
>>>
>>> Every async update is an amend, but the opposite is not necessarily true.
>>>
>>> What do you think if we rename the current async_update to amend_update,
>>> and we add a parameter "force_async" to it? (or maybe
>>> force_immediate_update?)
>>> Then amend_check with force_async=1 would fail if the hardware doesn't
>>> support it (we could also add flags in the capabilities to inform
>>> userspace the expected behaviour of things and if the hw supports
>>> force_sync).
>>>
>>> Like this, we can implement the cursors using the amend_update (which is
>>> now called async_update), and async_flips with amend_update with
>>> force_async=1.  
>>
>> Might force_async make sense for cursor updates as well? I thought some
>> hardware supported HW cursor updates outside of vblank, but I'm not sure.
>>
>> Without force_async, are cursor updates always applied to the hardware
>> on the next vblank, even if the pending commit is delayed further (e.g.
>> because a fence it depends on doesn't signal before vblank)? If cursor
>> updates can be delayed beyond the next vblank, that can result in bad
>> user experience.
> 
> You mean you have
> 
> 1. sync/regular update pending (waiting on a fence)
> 2. async update on top of #1
> 
> ?

Yeah.
Helen Koike March 15, 2019, 4:54 p.m. UTC | #13
On 3/15/19 8:29 AM, Michel Dänzer wrote:
> On 2019-03-15 11:25 a.m., Boris Brezillon wrote:
>> On Fri, 15 Mar 2019 11:11:36 +0100
>> Michel Dänzer <michel@daenzer.net> wrote:
>>
>>> On 2019-03-14 6:51 p.m., Helen Koike wrote:
>>>> On 3/14/19 6:15 AM, Michel Dänzer wrote:  
>>>>> On 2019-03-13 7:08 p.m., Helen Koike wrote:  
>>>>>> On 3/13/19 6:58 AM, Michel Dänzer wrote:  
>>>>>>> On 2019-03-13 4:42 a.m., Tomasz Figa wrote:  
>>>>>>>> On Wed, Mar 13, 2019 at 12:52 AM Boris Brezillon
>>>>>>>> <boris.brezillon@collabora.com> wrote:  
>>>>>>>>> On Tue, 12 Mar 2019 12:34:45 -0300
>>>>>>>>> Helen Koike <helen.koike@collabora.com> wrote:  
>>>>>>>>>> On 3/12/19 3:34 AM, Boris Brezillon wrote:  
>>>>>>>>>>> On Mon, 11 Mar 2019 23:21:59 -0300
>>>>>>>>>>> Helen Koike <helen.koike@collabora.com> wrote:
>>>>>>>>>>>  
>>>>>>>>>>>> --- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
>>>>>>>>>>>> +++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
>>>>>>>>>>>> @@ -912,30 +912,31 @@ static void vop_plane_atomic_async_update(struct drm_plane *plane,
>>>>>>>>>>>>                                      struct drm_plane_state *new_state)
>>>>>>>>>>>>  {
>>>>>>>>>>>>    struct vop *vop = to_vop(plane->state->crtc);
>>>>>>>>>>>> -  struct drm_plane_state *plane_state;
>>>>>>>>>>>> +  struct drm_framebuffer *old_fb = plane->state->fb;
>>>>>>>>>>>>
>>>>>>>>>>>> -  plane_state = plane->funcs->atomic_duplicate_state(plane);
>>>>>>>>>>>> -  plane_state->crtc_x = new_state->crtc_x;
>>>>>>>>>>>> -  plane_state->crtc_y = new_state->crtc_y;
>>>>>>>>>>>> -  plane_state->crtc_h = new_state->crtc_h;
>>>>>>>>>>>> -  plane_state->crtc_w = new_state->crtc_w;
>>>>>>>>>>>> -  plane_state->src_x = new_state->src_x;
>>>>>>>>>>>> -  plane_state->src_y = new_state->src_y;
>>>>>>>>>>>> -  plane_state->src_h = new_state->src_h;
>>>>>>>>>>>> -  plane_state->src_w = new_state->src_w;
>>>>>>>>>>>> -
>>>>>>>>>>>> -  if (plane_state->fb != new_state->fb)
>>>>>>>>>>>> -          drm_atomic_set_fb_for_plane(plane_state, new_state->fb);
>>>>>>>>>>>> -
>>>>>>>>>>>> -  swap(plane_state, plane->state);
>>>>>>>>>>>> -
>>>>>>>>>>>> -  if (plane->state->fb && plane->state->fb != new_state->fb) {
>>>>>>>>>>>> +  /*
>>>>>>>>>>>> +   * A scanout can still be occurring, so we can't drop the reference to
>>>>>>>>>>>> +   * the old framebuffer. To solve this we get a reference to old_fb and
>>>>>>>>>>>> +   * set a worker to release it later.  
>>>>>>>>>>>
>>>>>>>>>>> Hm, doesn't look like an async update to me if we have to wait for the
>>>>>>>>>>> next VBLANK to happen to get the new content on the screen. Maybe we
>>>>>>>>>>> should reject async updates when old_fb != new_fb in the rk  
>>>>>>>>>>> ->async_check() hook.  
>>>>>>>>>>
>>>>>>>>>> Unless I am misunderstanding this, we don't wait here, we just grab a
>>>>>>>>>> reference to the fb in case it is being still used by the hw, so it
>>>>>>>>>> doesn't get released prematurely.  
>>>>>>>>>
>>>>>>>>> I was just reacting to the comment that says the new FB should stay
>>>>>>>>> around until the next VBLANK event happens. If the FB must stay around
>>>>>>>>> that probably means the HW is still using, which made me wonder if this
>>>>>>>>> HW actually supports async update (where async means "update now and
>>>>>>>>> don't care about tearing"). Or maybe it takes some time to switch
>>>>>>>>> to the new FB and waiting for the next VBLANK to release the old FB was
>>>>>>>>> an easy solution to not wait for the flip to actually happen in  
>>>>>>>>> ->async_update() (which is kind of a combination of async+non-blocking).  
>>>>>>>>
>>>>>>>> The hardware switches framebuffers on vblank, so whatever framebuffer
>>>>>>>> is currently being scanned out from needs to stay there until the
>>>>>>>> hardware switches to the new one in shadow registers. If that doesn't
>>>>>>>> happen, you get IOMMU faults and the display controller stops working
>>>>>>>> since we don't have any fault handling currently, just printing a
>>>>>>>> message.  
>>>>>>>
>>>>>>> Sounds like your hardware doesn't actually support async flips. It's
>>>>>>> probably better for the driver not to pretend otherwise.  
>>>>>>
>>>>>> I think we need to clarify the meaning of the async_update callback
>>>>>> (and we should clarify it in the docs).
>>>>>>
>>>>>> The way I understand what the async_update callback should do is: don't
>>>>>> block (i.e. don't wait for the next vblank),  
>>>>>
>>>>> Note that those are two separate things. "Async flips" are about "don't
>>>>> wait for vblank", not about "don't block".
>>>>>
>>>>>  
>>>>>> and update the hw state at some point with the latest state from the
>>>>>> last call to async_update.
>>>>>>
>>>>>> Which means that: any driver can implement the async_update callback,
>>>>>> independently if it supports changing its state right away or not.
>>>>>> If hw supports, async_update can change the hw state right away, if not,
>>>>>> then changes will be applied in the next vblank (it can even amend the
>>>>>> pending commit if there is one).
>>>>>> With this, we can remove all the legacy cursor code to use the
>>>>>> async_update callback, since async_update can be called 100 times before
>>>>>> the next vblank, and the latest state will be set to the hw without
>>>>>> waiting 100 vblanks.
>>>>>>
>>>>>> Please, let me know if this is your understanding as well. If not, then
>>>>>> we need to remodel things.  
>>>>>
>>>>> While this may make sense for cursor updates, I don't think it does for
>>>>> async flips. If the flip only actually takes effect during the next
>>>>> vblank, it doesn't really fit the definition and userspace expectation
>>>>> of an async flip. It's better to clearly communicate to userspace that
>>>>> the hardware cannot do async flips, than to pretend it can and fake
>>>>> them. Userspace has to deal with this anyway, since async flips weren't
>>>>> always supported in general.  
>>>>
>>>> What do you think if we separate two concepts here:
>>>>
>>>> - amend mode: works like cursor updates, i.e, update the hw state at
>>>> some point with the latest state from the last call to async_update. No
>>>> special hardware support is required.
>>>>
>>>> - async update: update hw state immediately. This depends if the hw
>>>> supports it or not.
>>>>
>>>> Every async update is an amend, but the opposite is not necessarily true.
>>>>
>>>> What do you think if we rename the current async_update to amend_update,
>>>> and we add a parameter "force_async" to it? (or maybe
>>>> force_immediate_update?)
>>>> Then amend_check with force_async=1 would fail if the hardware doesn't
>>>> support it (we could also add flags in the capabilities to inform
>>>> userspace the expected behaviour of things and if the hw supports
>>>> force_sync).
>>>>
>>>> Like this, we can implement the cursors using the amend_update (which is
>>>> now called async_update), and async_flips with amend_update with
>>>> force_async=1.  
>>>
>>> Might force_async make sense for cursor updates as well? I thought some
>>> hardware supported HW cursor updates outside of vblank, but I'm not sure.

What I had in mind was actually:
amend_update() -> may or may not do a real async update, depending on
the hw
force_async=1 -> amend_update will fail if the hw doesn't support a
real async update.
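
As a rough sketch of that check (plain C, purely illustrative:
sketch_amend_check() and its parameters are hypothetical names, and
-EINVAL is only an assumed error code, nothing in this thread fixes it):

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical check for the proposed amend_update(): without
 * force_async any driver may accept the request and apply it whenever
 * it can; with force_async the request is rejected unless the hardware
 * can really update its state immediately.
 */
int sketch_amend_check(bool hw_has_real_async, bool force_async)
{
	if (force_async && !hw_has_real_async)
		return -EINVAL;	/* assumed error code */

	return 0;
}
```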

>>>
>>> Without force_async, are cursor updates always applied to the hardware
>>> on the next vblank, even if the pending commit is delayed further (e.g.
>>> because a fence it depends on doesn't signal before vblank)? If cursor
>>> updates can be delayed beyond the next vblank, that can result in bad
>>> user experience.
>>
>> You mean you have
>>
>> 1. sync/regular update pending (waiting on a fence)
>> 2. async update on top of #1
>>
>> ?
> 
> Yeah.
> 
> 

Actually I was thinking of another solution (without this force_async flag).

Instead of having this force_async, we can have two capabilities:

CAP_ASYNC: means the hw supports real async
CAP_AMEND: means that the driver supports amending the in-flight update so
that the new one will take its place in the queue (i.e. the current
legacy cursor behavior).

If (!CAP_AMEND && !CAP_ASYNC)
	* use a sync update or update the FB content in place without flipping
	  buffers.
	* legacy cursor update will fall back to sync update.
	* async flip is not supported.

If (CAP_AMEND && !CAP_ASYNC)
	* legacy cursor update will amend in-flight pending updates (like how
	  rockchip does now) or it will fall back to a sync update if not possible.
	* async flip is not supported.

If (!CAP_AMEND && CAP_ASYNC)
	* not sure yet what this would mean.

If (CAP_AMEND && CAP_ASYNC)
	* legacy cursor update will perform real async update.
	* async flip is supported.
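
The matrix above could be sketched roughly like this (plain C, for
illustration only: CAP_AMEND and CAP_ASYNC are the names proposed here,
but the bit values, the enum and both helper functions are hypothetical
and not an existing DRM API):

```c
#include <stdbool.h>

/* Capability names follow the proposal; the bit values are made up. */
#define CAP_AMEND (1u << 0)
#define CAP_ASYNC (1u << 1)

enum cursor_path {
	CURSOR_SYNC,		/* plain sync update */
	CURSOR_AMEND,		/* amend the in-flight update (or sync if not possible) */
	CURSOR_ASYNC,		/* real async update */
	CURSOR_UNDEFINED,	/* !CAP_AMEND && CAP_ASYNC: meaning still open */
};

/* Which path a legacy cursor update would take for a given capability set. */
enum cursor_path legacy_cursor_path(unsigned int caps)
{
	bool amend = (caps & CAP_AMEND) != 0;
	bool async = (caps & CAP_ASYNC) != 0;

	if (amend && async)
		return CURSOR_ASYNC;
	if (amend)
		return CURSOR_AMEND;
	if (async)
		return CURSOR_UNDEFINED;
	return CURSOR_SYNC;
}

/*
 * Async flips are only listed as supported when both capabilities are
 * set. Returning false for CAP_ASYNC alone is an assumption of this
 * sketch, since that case is not decided yet.
 */
bool async_flip_supported(unsigned int caps)
{
	return (caps & (CAP_AMEND | CAP_ASYNC)) == (CAP_AMEND | CAP_ASYNC);
}
```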


What do you think?

Regards
Helen

Patch

diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
index c7d4c6073ea5..a1ee8c156a7b 100644
--- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
+++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
@@ -912,30 +912,31 @@  static void vop_plane_atomic_async_update(struct drm_plane *plane,
 					  struct drm_plane_state *new_state)
 {
 	struct vop *vop = to_vop(plane->state->crtc);
-	struct drm_plane_state *plane_state;
+	struct drm_framebuffer *old_fb = plane->state->fb;
 
-	plane_state = plane->funcs->atomic_duplicate_state(plane);
-	plane_state->crtc_x = new_state->crtc_x;
-	plane_state->crtc_y = new_state->crtc_y;
-	plane_state->crtc_h = new_state->crtc_h;
-	plane_state->crtc_w = new_state->crtc_w;
-	plane_state->src_x = new_state->src_x;
-	plane_state->src_y = new_state->src_y;
-	plane_state->src_h = new_state->src_h;
-	plane_state->src_w = new_state->src_w;
-
-	if (plane_state->fb != new_state->fb)
-		drm_atomic_set_fb_for_plane(plane_state, new_state->fb);
-
-	swap(plane_state, plane->state);
-
-	if (plane->state->fb && plane->state->fb != new_state->fb) {
+	/*
+	 * A scanout can still be occurring, so we can't drop the reference to
+	 * the old framebuffer. To solve this we get a reference to old_fb and
+	 * set a worker to release it later.
+	 */
+	if (vop->is_enabled &&
+	    plane->state->fb && plane->state->fb != new_state->fb) {
 		drm_framebuffer_get(plane->state->fb);
 		WARN_ON(drm_crtc_vblank_get(plane->state->crtc) != 0);
 		drm_flip_work_queue(&vop->fb_unref_work, plane->state->fb);
 		set_bit(VOP_PENDING_FB_UNREF, &vop->pending);
 	}
 
+	plane->state->crtc_x = new_state->crtc_x;
+	plane->state->crtc_y = new_state->crtc_y;
+	plane->state->crtc_h = new_state->crtc_h;
+	plane->state->crtc_w = new_state->crtc_w;
+	plane->state->src_x = new_state->src_x;
+	plane->state->src_y = new_state->src_y;
+	plane->state->src_h = new_state->src_h;
+	plane->state->src_w = new_state->src_w;
+	plane->state->fb = new_state->fb;
+
 	if (vop->is_enabled) {
 		rockchip_drm_psr_inhibit_get_state(new_state->state);
 		vop_plane_atomic_update(plane, plane->state);
@@ -945,7 +946,12 @@  static void vop_plane_atomic_async_update(struct drm_plane *plane,
 		rockchip_drm_psr_inhibit_put_state(new_state->state);
 	}
 
-	plane->funcs->atomic_destroy_state(plane, plane_state);
+	/*
+	 * In async update we perform inplace modifications and release the
+	 * new_state. The following is required so we release the reference of
+	 * the old framebuffer.
+	 */
+	new_state->fb = old_fb;
 }
 
 static const struct drm_plane_helper_funcs plane_helper_funcs = {