Message ID | 1452689614-6570-1-git-send-email-john@metanate.com (mailing list archive) |
---|---|
State | New, archived |
On Wed, Jan 13, 2016 at 12:53:34PM +0000, John Keeping wrote:
> As commented in drm_atomic_helper_wait_for_vblanks(), userspace relies
> on cursor ioctls being unsynced.  Converting the rockchip driver to
> atomic has significantly impacted cursor performance by making every
> cursor update wait for vblank.
>
> By skipping the vblank sync when the framebuffer has not changed (as is
> done in drm_atomic_helper_wait_for_vblanks()) we can avoid this for the
> common case of moving the cursor and only need to delay the cursor ioctl
> when the cursor icon changes.
>
> I originally inserted a check on legacy_cursor_update as well, but that
> caused a storm of iommu page faults.  I didn't investigate the cause of
> those since this change gives enough of a performance improvement for my
> use case.
>
> This is RFC because of that and because the framebuffer_changed()
> function is copied from drm_atomic_helper.c as a quick way to test the
> result.
>
> Signed-off-by: John Keeping <john@metanate.com>
> ---
>  drivers/gpu/drm/rockchip/rockchip_drm_fb.c | 27 +++++++++++++++++++++++++--
>  1 file changed, 25 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_fb.c b/drivers/gpu/drm/rockchip/rockchip_drm_fb.c
> index f784488..8fd9821 100644
> --- a/drivers/gpu/drm/rockchip/rockchip_drm_fb.c
> +++ b/drivers/gpu/drm/rockchip/rockchip_drm_fb.c
> @@ -177,8 +177,28 @@ static void rockchip_crtc_wait_for_update(struct drm_crtc *crtc)
>  		crtc_funcs->wait_for_update(crtc);
>  }
>
> +static bool framebuffer_changed(struct drm_device *dev,
> +				struct drm_atomic_state *old_state,
> +				struct drm_crtc *crtc)
> +{
> +	struct drm_plane *plane;
> +	struct drm_plane_state *old_plane_state;
> +	int i;
> +
> +	for_each_plane_in_state(old_state, plane, old_plane_state, i) {
> +		if (plane->state->crtc != crtc &&
> +		    old_plane_state->crtc != crtc)
> +			continue;
> +
> +		if (plane->state->fb != old_plane_state->fb)
> +			return true;
> +	}
> +
> +	return false;
> +}

Please don't hand-roll logic that affects semantics like this. Instead
please use drm_atomic_helper_wait_for_vblanks(), which should do this
correctly for you.

If that's not the case then we need to improve the generic helper, or
figure out what's different with rockchip.

Thanks, Daniel

> +
>  static void
> -rockchip_atomic_wait_for_complete(struct drm_atomic_state *old_state)
> +rockchip_atomic_wait_for_complete(struct drm_device *dev, struct drm_atomic_state *old_state)
>  {
>  	struct drm_crtc_state *old_crtc_state;
>  	struct drm_crtc *crtc;
> @@ -194,6 +214,9 @@ rockchip_atomic_wait_for_complete(struct drm_atomic_state *old_state)
>  		if (!crtc->state->active)
>  			continue;
>
> +		if (!framebuffer_changed(dev, old_state, crtc))
> +			continue;
> +
>  		ret = drm_crtc_vblank_get(crtc);
>  		if (ret != 0)
>  			continue;
> @@ -241,7 +264,7 @@ rockchip_atomic_commit_complete(struct rockchip_atomic_commit *commit)
>
>  	drm_atomic_helper_commit_planes(dev, state, true);
>
> -	rockchip_atomic_wait_for_complete(state);
> +	rockchip_atomic_wait_for_complete(dev, state);
>
>  	drm_atomic_helper_cleanup_planes(dev, state);
>
> --
> 2.7.0.rc3.140.g520a093
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel

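For readers without the kernel tree to hand, the check under discussion can be modelled in plain userspace C. This is only a sketch: the `model_*` types and function below are hypothetical stand-ins that carry just the two fields the check reads, and they are not the DRM structures or any kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a plane state: only the fields the check reads. */
struct model_plane_state {
	int crtc_id;	/* CRTC the plane is attached to, -1 for none */
	int fb_id;	/* framebuffer the plane scans out, -1 for none */
};

/* One plane touched by a commit: its state before and after. */
struct model_plane_update {
	struct model_plane_state old_state;
	struct model_plane_state new_state;
};

/*
 * Return true if any plane that was or is on crtc_id ends the commit
 * with a different framebuffer than it started with.
 */
static bool model_framebuffer_changed(const struct model_plane_update *updates,
				      size_t count, int crtc_id)
{
	size_t i;

	assert(updates != NULL || count == 0);

	for (i = 0; i < count; i++) {
		const struct model_plane_update *u = &updates[i];

		/* Planes that never touch this CRTC are irrelevant. */
		if (u->new_state.crtc_id != crtc_id &&
		    u->old_state.crtc_id != crtc_id)
			continue;

		if (u->new_state.fb_id != u->old_state.fb_id)
			return true;
	}

	return false;
}
```

In this model a cursor move leaves fb_id untouched, so the function returns false and the vblank wait can be skipped; a new cursor icon changes fb_id and forces the wait.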
On Wed, 13 Jan 2016 15:23:20 +0100, Daniel Vetter wrote:
[...]
> Please don't hand-roll logic that affects semantics like this. Instead
> please use drm_atomic_helper_wait_for_vblanks(), which should do this
> correctly for you.
>
> If that's not the case then we need to improve the generic helper, or
> figure out what's different with rockchip.

According to commit 63ebb9f (drm/rockchip: Convert to support atomic
API) it's because rockchip doesn't have a hardware vblank counter.

I'm not entirely clear on why this prevents the use of
drm_atomic_helper_wait_for_vblanks().

On Wed, Jan 13, 2016 at 02:34:25PM +0000, John Keeping wrote:
[...]
> According to commit 63ebb9f (drm/rockchip: Convert to support atomic
> API) it's because rockchip doesn't have a hardware vblank counter.
>
> I'm not entirely clear on why this prevents the use of
> drm_atomic_helper_wait_for_vblanks().

Hm, that commit isn't terribly helpful. If that's really needed then imo I
think we should extract a "drm_atomic_helper_plane_needs_vblank_wait()"
helper that's used by both. But since rockchip does vblank_get/put calls
I'd hope vblanks actually work correctly. And then the helper should work
too.
-Daniel

On Wed, 13 Jan 2016 16:40:05 +0100, Daniel Vetter wrote:
[...]
> Hm, that commit isn't terribly helpful. If that's really needed then imo I
> think we should extract a "drm_atomic_helper_plane_needs_vblank_wait()"
> helper that's used by both. But since rockchip does vblank_get/put calls
> I'd hope vblanks actually work correctly. And then the helper should work
> too.

I tried switching the call to rockchip_crtc_wait_for_update() to
drm_atomic_helper_wait_for_vblanks() and it works fine until I switch
the buffer associated with a cursor, at which point I get iommu page
faults, presumably because the GEM buffer is unreferenced too early.

AFAICT the buffer will be released via drm_atomic_state_free()
unconditionally, but I suspect I'm missing something since that would
mean every driver would hit a similar problem.

On Wed, Jan 13, 2016 at 03:55:29PM +0000, John Keeping wrote:
[...]
> I tried switching the call to rockchip_crtc_wait_for_update() to
> drm_atomic_helper_wait_for_vblanks() and it works fine until I switch
> the buffer associated with a cursor, at which point I get iommu page
> faults, presumably because the GEM buffer is unreferenced too early.
>
> AFAICT the buffer will be released via drm_atomic_state_free()
> unconditionally, but I suspect I'm missing something since that would
> mean every driver would hit a similar problem.

Yeah, with the helper we always skip, which means when the cursor bo
changes you indeed unmap too early. So can't even share the overall
condition, but we could definitely share the little framebuffer_changed
helper. Plus rockchip_crtc_wait_for_update should have a big comment
explaining why we have different rules than core helpers!

Cheers, Daniel

On Wed, 13 Jan 2016 17:21:56 +0100, Daniel Vetter wrote:
[...]
> Yeah, with the helper we always skip, which means when the cursor bo
> changes you indeed unmap too early. So can't even share the overall
> condition, but we could definitely share the little framebuffer_changed
> helper.

That leaves me with the question: why do other atomic drivers work?

If drm_atomic_helper_wait_for_vblanks() skipping vblanks results in the
cursor bo being unmapped too early for rockchip, why is it not unmapped
too early for all of the other drivers using that helper?

On Wed, Jan 13, 2016 at 04:40:38PM +0000, John Keeping wrote:
[...]
> That leaves me with the question: why do other atomic drivers work?
>
> If drm_atomic_helper_wait_for_vblanks() skipping vblanks results in the
> cursor bo being unmapped too early for rockchip, why is it not unmapped
> too early for all of the other drivers using that helper?

It's unmapped too early for everyone, it's just that normally that doesn't
result in a fireworks show. What we maybe could/should do is do the
unmapping asynchronously, but that runs into the overall "current atomic
helpers don't do async yet" problem. Might be a good point to start fixing
this up though.
-Daniel

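One way to picture the "do the unmapping asynchronously" idea is a queue of old framebuffers that is drained only once the vblank count has moved past the commit that replaced them. The userspace C model below is a sketch of that reading, not the approach the thread settles on: every `model_*` name is hypothetical, the fixed-size array is an arbitrary simplification, and nothing here is kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_PENDING 8

/* An old framebuffer waiting to be unmapped (hypothetical model type). */
struct model_pending_fb {
	int fb_id;
	unsigned int commit_vblank;	/* vblank count when it was replaced */
};

struct model_cleanup_queue {
	struct model_pending_fb pending[MODEL_MAX_PENDING];
	size_t count;
};

/* Queue an old framebuffer instead of unmapping it straight away. */
static bool model_defer_cleanup(struct model_cleanup_queue *q, int fb_id,
				unsigned int now)
{
	assert(q != NULL);

	if (q->count == MODEL_MAX_PENDING)
		return false;	/* caller would have to fall back to waiting */

	q->pending[q->count].fb_id = fb_id;
	q->pending[q->count].commit_vblank = now;
	q->count++;
	return true;
}

/*
 * Meant to run on each vblank: drop every entry whose commit happened
 * before the current vblank count and return how many were released.
 */
static size_t model_release_after_vblank(struct model_cleanup_queue *q,
					 unsigned int now)
{
	size_t i, kept = 0, released = 0;

	assert(q != NULL);

	for (i = 0; i < q->count; i++) {
		if (now != q->pending[i].commit_vblank) {
			released++;	/* a real driver would unmap here */
			continue;
		}
		/* Same vblank period as the commit: may still be scanned out. */
		q->pending[kept++] = q->pending[i];
	}

	q->count = kept;
	return released;
}
```

The point of the model is that the ioctl path only pays for an enqueue, while the buffer stays mapped until scanout of the old contents can no longer be in progress.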
On Wed, 13 Jan 2016 18:19:17 +0100, Daniel Vetter wrote: > On Wed, Jan 13, 2016 at 04:40:38PM +0000, John Keeping wrote: > > On Wed, 13 Jan 2016 17:21:56 +0100, Daniel Vetter wrote: > > > > > On Wed, Jan 13, 2016 at 03:55:29PM +0000, John Keeping wrote: > > > > On Wed, 13 Jan 2016 16:40:05 +0100, Daniel Vetter wrote: > > > > > > > > > On Wed, Jan 13, 2016 at 02:34:25PM +0000, John Keeping wrote: > > > > > > On Wed, 13 Jan 2016 15:23:20 +0100, Daniel Vetter wrote: > > > > > > > > > > > > > On Wed, Jan 13, 2016 at 12:53:34PM +0000, John Keeping wrote: > > > > > > > > As commented in drm_atomic_helper_wait_for_vblanks(), userspace > > > > > > > > relies on cursor ioctls being unsynced. Converting the rockchip > > > > > > > > driver to atomic has significantly impacted cursor performance by > > > > > > > > making every cursor update wait for vblank. > > > > > > > > > > > > > > > > By skipping the vblank sync when the framebuffer has not changed > > > > > > > > (as is done in drm_atomic_helper_wait_for_vblanks()) we can avoid > > > > > > > > this for the common case of moving the cursor and only need to > > > > > > > > delay the cursor ioctl when the cursor icon changes. > > > > > > > > > > > > > > > > I originally inserted a check on legacy_cursor_update as well, but > > > > > > > > that caused a storm of iommu page faults. I didn't investigate the > > > > > > > > cause of those since this change gives enough of a performance > > > > > > > > improvement for my use case. > > > > > > > > > > > > > > > > This is RFC because of that and because the framebuffer_changed() > > > > > > > > function is copied from drm_atomic_helper.c as a quick way to test > > > > > > > > the result. 
> > > > > > > > > > > > > > > > Signed-off-by: John Keeping <john@metanate.com> > > > > > > > > --- > > > > > > > > drivers/gpu/drm/rockchip/rockchip_drm_fb.c | 27 > > > > > > > > +++++++++++++++++++++++++-- 1 file changed, 25 insertions(+), 2 > > > > > > > > deletions(-) > > > > > > > > > > > > > > > > diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_fb.c > > > > > > > > b/drivers/gpu/drm/rockchip/rockchip_drm_fb.c index f784488..8fd9821 > > > > > > > > 100644 --- a/drivers/gpu/drm/rockchip/rockchip_drm_fb.c > > > > > > > > +++ b/drivers/gpu/drm/rockchip/rockchip_drm_fb.c > > > > > > > > @@ -177,8 +177,28 @@ static void > > > > > > > > rockchip_crtc_wait_for_update(struct drm_crtc *crtc) > > > > > > > > crtc_funcs->wait_for_update(crtc); } > > > > > > > > > > > > > > > > +static bool framebuffer_changed(struct drm_device *dev, > > > > > > > > + struct drm_atomic_state *old_state, > > > > > > > > + struct drm_crtc *crtc) > > > > > > > > +{ > > > > > > > > + struct drm_plane *plane; > > > > > > > > + struct drm_plane_state *old_plane_state; > > > > > > > > + int i; > > > > > > > > + > > > > > > > > + for_each_plane_in_state(old_state, plane, old_plane_state, > > > > > > > > i) { > > > > > > > > + if (plane->state->crtc != crtc && > > > > > > > > + old_plane_state->crtc != crtc) > > > > > > > > + continue; > > > > > > > > + > > > > > > > > + if (plane->state->fb != old_plane_state->fb) > > > > > > > > + return true; > > > > > > > > + } > > > > > > > > + > > > > > > > > + return false; > > > > > > > > +} > > > > > > > > > > > > > > Please don't hand-roll logic that affects semantics like this. Instead > > > > > > > please use drm_atomic_helper_wait_for_vblanks(), which should do this > > > > > > > correctly for you. > > > > > > > > > > > > > > If that's not the case then we need to improve the generic helper, or > > > > > > > figure out what's different with rockhip. 
> > > > > > > > > > > > According to commit 63ebb9f (drm/rockchip: Convert to support atomic > > > > > > API) it's because rockchip doesn't have a hardware vblank counter. > > > > > > > > > > > > I'm not entirely clear on why this prevents the use of > > > > > > drm_atomic_helper_wait_for_vblanks(). > > > > > > > > > > Hm, that commit isn't terribly helpful. If that's really needed then imo I > > > > > think we should extract a "drm_atomic_helper_plane_needs_vblank_wait()" > > > > > helper that's used by both. But since rockchip does vblank_get/put calls > > > > > I'd hope vblanks actually work correctly. And then the helper should work > > > > > too. > > > > > > > > I tried switching the call to rockchip_crtc_wait_for_update() to > > > > drm_atomic_helper_wait_for_vblanks() and it works fine until I switch > > > > the buffer associated with a cursor, at which point I get iommu page > > > > faults, presumably because the GEM buffer is unreferenced too early. > > > > > > > > AFAICT the buffer will be released via drm_atomic_state_free() > > > > unconditionally, but I suspect I'm missing something since that would > > > > mean every driver would hit a similar problem. > > > > > > Yeah, with the helper we always skip, which means when the cursor bo > > > changes you indeed unmap too early. So can't even share the overall > > > condition, but we could definitely share the little framebuffer_changed > > > helper. > > > > That leaves me with the question: why do other atomic drivers work? > > > > If drm_atomic_helper_wait_for_vblanks() skipping vblanks results in the > > cursor bo being unmapped too early for rockchip, why is it not unmapped > > too early for all of the other drivers using that helper? > > It's unmapped too early for everyone, it's just that normally that doesn't > result in a fireworks show. What we maybe could/should do is do the > unmapping asynchronously, but that runs into the overall "current atomic > helpers don't do async yet" problem. 
Might be a good point to start fixing > this up though.

OK, thanks, I think I'm beginning to understand how this all fits together.

It looks like there are two options for me to get reasonable cursor performance on rockchip in the short term:

1) Export the current framebuffer_changed() function as drm_atomic_helper_framebuffer_changed() and use it in rockchip_crtc_wait_for_update().

2) Add a mechanism to suppress the legacy_cursor_update check in drm_atomic_helper_wait_for_vblanks() and switch the rockchip driver over to it.

In both of these cases we're only restoring the unsynced cursor ioctls behaviour when the cursor is moved but it will still be expensive when the cursor bo changes. That gives sufficient performance in my testing.
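To make option 1 concrete, here is a minimal userspace sketch of the framebuffer_changed() check from the patch. It is not kernel code: the `model_*` types and function names are hypothetical stand-ins for `struct drm_plane` and `struct drm_plane_state`, and integer ids replace the real crtc and fb pointers. It only illustrates the decision "wait for the update only if some plane on this CRTC changed its framebuffer".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for struct drm_plane_state. */
struct model_plane_state {
	int crtc_id;	/* CRTC the plane is attached to, -1 if none */
	int fb_id;	/* framebuffer the plane scans out, -1 if none */
};

/* Hypothetical stand-in for a plane with its new and old state. */
struct model_plane {
	struct model_plane_state new_state;	/* state after the commit */
	struct model_plane_state old_state;	/* state before the commit */
};

/*
 * Mirrors the loop in the patch: skip planes that touch the CRTC in
 * neither the old nor the new state, and report a change as soon as one
 * relevant plane has a different framebuffer.
 */
static bool model_framebuffer_changed(const struct model_plane *planes,
				      size_t n_planes, int crtc_id)
{
	assert(planes != NULL || n_planes == 0);

	for (size_t i = 0; i < n_planes; i++) {
		const struct model_plane *p = &planes[i];

		if (p->new_state.crtc_id != crtc_id &&
		    p->old_state.crtc_id != crtc_id)
			continue;

		if (p->new_state.fb_id != p->old_state.fb_id)
			return true;
	}

	return false;
}
```

With this model, a cursor move (same fb id before and after) reports no change, so the wait can be skipped, while a cursor icon change (different fb id) still requires waiting before the old buffer is released.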
On 2016-01-14 01:39, John Keeping wrote:
> It looks like there are two options for me to get reasonable cursor
> performance on rockchip in the short term:
>
> 1) Export the current framebuffer_changed() function as
>    drm_atomic_helper_framebuffer_changed() and use it in
>    rockchip_crtc_wait_for_update().
>
> 2) Add a mechanism to suppress the legacy_cursor_update check in
>    drm_atomic_helper_wait_for_vblanks() and switch the rockchip driver
>    over to it.
>
> In both of these cases we're only restoring the unsynced cursor ioctls
> behaviour when the cursor is moved but it will still be expensive when
> the cursor bo changes. That gives sufficient performance in my testing.

Thanks for pointing that out.

Because rockchip does not support a hardware vblank counter, using drm_atomic_helper_wait_for_vblanks() has the following issue:

                             | <-- HW vsync irq and reg take effect
    plane_commit --->        |
    get_vblank and wait ->   |
                             | <-- handle_vblank, vblank->count + 1
    cleanup_fb --->          |
    iommu crash --->         |
                             | <-- HW vsync irq and reg take effect

There is no hardware vblank counter on the rockchip vop, so we can't ensure that the register taking effect and vblank->count stay consistent. If a plane commit lands in the window between the register taking effect and the vblank->count increment, cleanup_fb happens before old_fb is swapped out of the vop, and then the iommu crashes.

That is why I special-cased wait_for_vblanks: we need to check that the register has really taken effect before cleaning up the old fb. The vop_win_pending_is_complete() function checks the win enable bit and the win address to ensure that.

Rockchip drm is not the only driver that does this; exynos also checks the address before cleaning up the fb:

    if (start == start_s)
        exynos_drm_crtc_finish_update(ctx->crtc, plane);

Thanks.

-- 
Mark Yao
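The race in the diagram above can be reproduced with a small userspace model. This is only a sketch under stated assumptions: `struct model_vop` and all `model_*` functions are hypothetical, the hardware is reduced to one pending and one active scanout address, and the active address is assumed to be readable by software. It contrasts "the software vblank count advanced" with "the address the hardware is scanning from matches what was committed", which is the kind of check Mark describes for vop_win_pending_is_complete().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one display window; not the real vop layout. */
struct model_vop {
	unsigned int pending_addr;	/* written by the plane commit */
	unsigned int active_addr;	/* latched by hardware at vsync */
	unsigned int vblank_count;	/* software count, bumped in the irq handler */
};

/* Hardware side of vsync: the pending register takes effect. */
static void model_hw_vsync(struct model_vop *vop)
{
	assert(vop != NULL);
	vop->active_addr = vop->pending_addr;
}

/* Software side of vsync: runs some time after model_hw_vsync(). */
static void model_handle_vblank(struct model_vop *vop)
{
	assert(vop != NULL);
	vop->vblank_count++;
}

/* Plane commit: program the new scanout address. */
static void model_plane_commit(struct model_vop *vop, unsigned int addr)
{
	assert(vop != NULL);
	vop->pending_addr = addr;
}

/* What a count-based wait observes. */
static bool model_count_advanced(const struct model_vop *vop,
				 unsigned int seen)
{
	return vop->vblank_count != seen;
}

/* What an address-based check observes. */
static bool model_update_complete(const struct model_vop *vop)
{
	return vop->active_addr == vop->pending_addr;
}
```

If the commit lands after model_hw_vsync() but before model_handle_vblank(), the count-based wait is satisfied one handler call later even though the hardware still scans out the old address. The address comparison only becomes true after the next hardware vsync, which is when cleaning up the old fb is safe in this model.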
On Thu, Jan 14, 2016 at 2:16 AM, Mark yao <mark.yao@rock-chips.com> wrote:
> there is no hardware vblank counter on rockchip vop, we can't ensure the
> consistency of reg take effect and vblank->count,
> if plane commit hit into the period of reg take effect and vblank->count,
> cleanup_fb happen before old_fb swap out from vop,
> then iommu crash.
>
> That is why I special the wait_for_vblanks, we need check the reg really
> take effect before clean up old fb.
> at vop_win_pending_is_complete function, check win enable and win address,
> to ensure that.
>
> Not only rockchip drm do that thing:
>
> exynos also check address before cleanup fb
>     if (start == start_s)
>         exynos_drm_crtc_finish_update(ctx->crtc, plane);
>
> Thanks.

Do you have a scanline counter or something similar at least? Any other indication about how far along the chip is with scanning out? We use that in i915 to avoid races with the interrupt handler and detect this w/a scenario.

I think if you have a scanline counter then it should magically work, since the vblank code will realize that you're already past the last vblank interrupt and /should/ have incremented already. Or something like that.

Otherwise if this is common we might want to figure out how to solve this in a generic way. It's one of these problems that will make generic async support almost impossible.
-Daniel
On 2016-01-14 16:32, Daniel Vetter wrote:
> Do you have a scanline counter or something similar at least? Any
> other indication about how far along the chip is with scanning out? We
> use that in i915 to avoid races with the interrupt handler and detect
> this w/a scenario.
>
> I think if you have a scanline counter then it should magically work,
> since the vblank code will realize that you're already past the last
> vblank interrupt and /should/ have incremented already. Or something
> like that.
>
> Otherwise if this is common we might want to figure out how to solve
> this in a generic way. It's one of these problems that will make
> generic async support almost impossible.
> -Daniel

No, neither rk3288 nor rk3036 supports a hardware vblank counter or a scanline counter.

On the Android side we use the same approach: check the address and the enable bit to ensure the register has taken effect.

Future chips will support a scanline counter and a hardware counter, but not now.
On Thu, Jan 14, 2016 at 04:46:37PM +0800, Mark yao wrote: > On 2016?01?14? 16:32, Daniel Vetter wrote: > >On Thu, Jan 14, 2016 at 2:16 AM, Mark yao <mark.yao@rock-chips.com> wrote: > >>On 2016?01?14? 01:39, John Keeping wrote: > >>>On Wed, 13 Jan 2016 18:19:17 +0100, Daniel Vetter wrote: > >>> > >>>>On Wed, Jan 13, 2016 at 04:40:38PM +0000, John Keeping wrote: > >>>>>On Wed, 13 Jan 2016 17:21:56 +0100, Daniel Vetter wrote: > >>>>> > >>>>>>On Wed, Jan 13, 2016 at 03:55:29PM +0000, John Keeping wrote: > >>>>>>>On Wed, 13 Jan 2016 16:40:05 +0100, Daniel Vetter wrote: > >>>>>>> > >>>>>>>>On Wed, Jan 13, 2016 at 02:34:25PM +0000, John Keeping wrote: > >>>>>>>>>On Wed, 13 Jan 2016 15:23:20 +0100, Daniel Vetter wrote: > >>>>>>>>> > >>>>>>>>>>On Wed, Jan 13, 2016 at 12:53:34PM +0000, John Keeping wrote: > >>>>>>>>>>>As commented in drm_atomic_helper_wait_for_vblanks(), userspace > >>>>>>>>>>>relies on cursor ioctls being unsynced. Converting the rockchip > >>>>>>>>>>>driver to atomic has significantly impacted cursor performance by > >>>>>>>>>>>making every cursor update wait for vblank. > >>>>>>>>>>> > >>>>>>>>>>>By skipping the vblank sync when the framebuffer has not changed > >>>>>>>>>>>(as is done in drm_atomic_helper_wait_for_vblanks()) we can avoid > >>>>>>>>>>>this for the common case of moving the cursor and only need to > >>>>>>>>>>>delay the cursor ioctl when the cursor icon changes. > >>>>>>>>>>> > >>>>>>>>>>>I originally inserted a check on legacy_cursor_update as well, but > >>>>>>>>>>>that caused a storm of iommu page faults. I didn't investigate > >>>>>>>>>>>the > >>>>>>>>>>>cause of those since this change gives enough of a performance > >>>>>>>>>>>improvement for my use case. > >>>>>>>>>>> > >>>>>>>>>>>This is RFC because of that and because the framebuffer_changed() > >>>>>>>>>>>function is copied from drm_atomic_helper.c as a quick way to test > >>>>>>>>>>>the result. 
> >>>>>>>>>>> > >>>>>>>>>>>Signed-off-by: John Keeping <john@metanate.com> > >>>>>>>>>>>--- > >>>>>>>>>>> drivers/gpu/drm/rockchip/rockchip_drm_fb.c | 27 > >>>>>>>>>>>+++++++++++++++++++++++++-- 1 file changed, 25 insertions(+), 2 > >>>>>>>>>>>deletions(-) > >>>>>>>>>>> > >>>>>>>>>>>diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_fb.c > >>>>>>>>>>>b/drivers/gpu/drm/rockchip/rockchip_drm_fb.c index > >>>>>>>>>>>f784488..8fd9821 > >>>>>>>>>>>100644 --- a/drivers/gpu/drm/rockchip/rockchip_drm_fb.c > >>>>>>>>>>>+++ b/drivers/gpu/drm/rockchip/rockchip_drm_fb.c > >>>>>>>>>>>@@ -177,8 +177,28 @@ static void > >>>>>>>>>>>rockchip_crtc_wait_for_update(struct drm_crtc *crtc) > >>>>>>>>>>>crtc_funcs->wait_for_update(crtc); } > >>>>>>>>>>> +static bool framebuffer_changed(struct drm_device *dev, > >>>>>>>>>>>+ struct drm_atomic_state > >>>>>>>>>>>*old_state, > >>>>>>>>>>>+ struct drm_crtc *crtc) > >>>>>>>>>>>+{ > >>>>>>>>>>>+ struct drm_plane *plane; > >>>>>>>>>>>+ struct drm_plane_state *old_plane_state; > >>>>>>>>>>>+ int i; > >>>>>>>>>>>+ > >>>>>>>>>>>+ for_each_plane_in_state(old_state, plane, old_plane_state, > >>>>>>>>>>>i) { > >>>>>>>>>>>+ if (plane->state->crtc != crtc && > >>>>>>>>>>>+ old_plane_state->crtc != crtc) > >>>>>>>>>>>+ continue; > >>>>>>>>>>>+ > >>>>>>>>>>>+ if (plane->state->fb != old_plane_state->fb) > >>>>>>>>>>>+ return true; > >>>>>>>>>>>+ } > >>>>>>>>>>>+ > >>>>>>>>>>>+ return false; > >>>>>>>>>>>+} > >>>>>>>>>>Please don't hand-roll logic that affects semantics like this. > >>>>>>>>>>Instead > >>>>>>>>>>please use drm_atomic_helper_wait_for_vblanks(), which should do > >>>>>>>>>>this > >>>>>>>>>>correctly for you. > >>>>>>>>>> > >>>>>>>>>>If that's not the case then we need to improve the generic helper, > >>>>>>>>>>or > >>>>>>>>>>figure out what's different with rockhip. > >>>>>>>>>According to commit 63ebb9f (drm/rockchip: Convert to support atomic > >>>>>>>>>API) it's because rockchip doesn't have a hardware vblank counter. 
> >>>>>>>>> > >>>>>>>>>I'm not entirely clear on why this prevents the use of > >>>>>>>>>drm_atomic_helper_wait_for_vblanks(). > >>>>>>>>Hm, that commit isn't terribly helpful. If that's really needed then > >>>>>>>>imo I > >>>>>>>>think we should extract a > >>>>>>>>"drm_atomic_helper_plane_needs_vblank_wait()" > >>>>>>>>helper that's used by both. But since rockchip does vblank_get/put > >>>>>>>>calls > >>>>>>>>I'd hope vblanks actually work correctly. And then the helper should > >>>>>>>>work > >>>>>>>>too. > >>>>>>>I tried switching the call to rockchip_crtc_wait_for_update() to > >>>>>>>drm_atomic_helper_wait_for_vblanks() and it works fine until I switch > >>>>>>>the buffer associated with a cursor, at which point I get iommu page > >>>>>>>faults, presumably because the GEM buffer is unreferenced too early. > >>>>>>> > >>>>>>>AFAICT the buffer will be released via drm_atomic_state_free() > >>>>>>>unconditionally, but I suspect I'm missing something since that would > >>>>>>>mean every driver would hit a similar problem. > >>>>>>Yeah, with the helper we always skip, which means when the cursor bo > >>>>>>changes you indeed unmap too early. So can't even share the overall > >>>>>>condition, but we could definitely share the little framebuffer_changed > >>>>>>helper. > >>>>>That leaves me with the question: why do other atomic drivers work? > >>>>> > >>>>>If drm_atomic_helper_wait_for_vblanks() skipping vblanks results in the > >>>>>cursor bo being unmapped too early for rockchip, why is it not unmapped > >>>>>too early for all of the other drivers using that helper? > >>>>It's unmapped too early for everyone, it's just that normally that > >>>>doesn't > >>>>result in a fireworks show. What we maybe could/should do is do the > >>>>unmapping asynchronously, but that runs into the overall "current atomic > >>>>helpers don't do async yet" problem. Might be a good point to start > >>>>fixing > >>>>this up though. 
> >>>OK, thanks, I think I'm beginning to understand how this all fits
> >>>together.
> >>>
> >>>It looks like there are two options for me to get reasonable cursor
> >>>performance on rockchip in the short term:
> >>>
> >>>1) Export the current framebuffer_changed() function as
> >>>   drm_atomic_helper_framebuffer_changed() and use it in
> >>>   rockchip_crtc_wait_for_update().
> >>>
> >>>2) Add a mechanism to suppress the legacy_cursor_update check in
> >>>   drm_atomic_helper_wait_for_vblanks() and switch the rockchip driver
> >>>   over to it.
> >>>
> >>>In both of these cases we're only restoring the unsynced cursor ioctls
> >>>behaviour when the cursor is moved but it will still be expensive when
> >>>the cursor bo changes. That gives sufficient performance in my testing.
> >>>
> >>Thanks for pointing that out.
> >>
> >>Because rockchip does not support a hardware vblank counter, using
> >>drm_atomic_helper_wait_for_vblanks has the following issue:
> >>
> >>                          | <-- HW vsync irq and reg take effect
> >>   plane_commit --->      |
> >>   get_vblank and wait -> |
> >>                          | <-- handle_vblank, vblank->count + 1
> >>   cleanup_fb --->        |
> >>   iommu crash --->       |
> >>                          | <-- HW vsync irq and reg take effect
> >>
> >>There is no hardware vblank counter on the rockchip vop, so we can't
> >>ensure that "reg take effect" and vblank->count stay consistent. If a
> >>plane commit lands in the window between reg take effect and the
> >>vblank->count increment, cleanup_fb happens before the old fb is swapped
> >>out of the vop, and then the iommu crashes.
> >>
> >>That is why I special-cased wait_for_vblanks: we need to check that the
> >>reg has really taken effect before cleaning up the old fb. The
> >>vop_win_pending_is_complete function checks win enable and win address
> >>to ensure that.
> >>
> >>rockchip drm is not the only driver that does this; exynos also checks
> >>the address before cleaning up the fb:
> >>        if (start == start_s)
> >>                exynos_drm_crtc_finish_update(ctx->crtc, plane);
> >>
> >>Thanks.
> >Do you have a scanline counter or something similar at least? Any
> >other indication about how far along the chip is with scanning out? We
> >use that in i915 to avoid races with the interrupt handler and detect
> >this w/a scenario.
> >
> >I think if you have a scanline counter then it should magically work,
> >since the vblank code will realize that you're already past the last
> >vblank interrupt and /should/ have incremented already. Or something
> >like that.
> >
> >Otherwise if this is common we might want to figure out how to solve
> >this in a generic way. It's one of these problems that will make
> >generic async support almost impossible.
> >-Daniel
>
> No, neither rk3288 nor rk3036 supports a hardware vblank counter or a
> scanline counter.
>
> On the android side we do it the same way: check the address and the
> enable bit to ensure the register has taken effect.
>
> Future chips will support a scanline counter and a hardware counter,
> but not now.

Ugh. Oh well, there's not really anything we can do in core nor helpers
to make this easier for drivers. This really only can be fixed sensibly at
the hardware level.

So yeah I think exposing framebuffer_changed as a helper is the way to go
here.

Thanks, Daniel
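The "check win enable and win address" approach described in the quoted text can be sketched in user-space C roughly as follows. This is only a model under stated assumptions: the thread does not show the body of vop_win_pending_is_complete(), so struct model_win, model_hw_vsync() and model_win_pending_is_complete() are hypothetical names, and the fields stand in for whatever registers the real hardware exposes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one VOP window; not the real register layout. */
struct model_win {
	bool     want_enable; /* enable state requested by the last commit */
	uint32_t want_addr;   /* scanout address requested by the last commit */
	bool     hw_enable;   /* enable state the hardware is currently using */
	uint32_t hw_addr;     /* scanout address the hardware is currently using */
};

/* Model of a hardware vsync: the pending values are latched. */
void model_hw_vsync(struct model_win *win)
{
	win->hw_enable = win->want_enable;
	win->hw_addr = win->want_addr;
}

/*
 * Return true once the committed values have taken effect, i.e. the
 * hardware no longer scans out of the old framebuffer and it is safe
 * to clean that framebuffer up.
 */
bool model_win_pending_is_complete(const struct model_win *win)
{
	/* A window being disabled is complete once the hardware has it off. */
	if (!win->want_enable)
		return !win->hw_enable;

	/* Otherwise the hardware must be enabled and on the new address. */
	return win->hw_enable && win->hw_addr == win->want_addr;
}
```

The point of the sketch is that completion is judged from what the hardware is actually using rather than from a software vblank count, so a late-running interrupt handler cannot make the old buffer look free while it is still being scanned out.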
On Thu, 14 Jan 2016 15:20:47 +0100, Daniel Vetter wrote:
> Ugh. Oh well, there's not really anything we can do in core nor helpers
> to make this easier for drivers. This really only can be fixed sensibly at
> the hardware level.
>
> So yeah I think exposing framebuffer_changed as a helper is the way to go
> here.

OK, here's a series to do that. I also added a comment to
rockchip_atomic_wait_for_complete() explaining why we can't use
drm_atomic_helper_wait_for_vblanks().

John Keeping (3):
  drm/atomic-helper: Export framebuffer_changed()
  drm/rockchip: don't wait for vblank if fb hasn't changed
  drm/rockchip: explain why we can't wait_for_vblanks

 drivers/gpu/drm/drm_atomic_helper.c        | 24 ++++++++++++++++++++----
 drivers/gpu/drm/rockchip/rockchip_drm_fb.c | 14 ++++++++++++--
 include/drm/drm_atomic_helper.h            |  4 ++++
 3 files changed, 36 insertions(+), 6 deletions(-)
On Wednesday, 13 January 2016, 12:53:34, John Keeping wrote:
> As commented in drm_atomic_helper_wait_for_vblanks(), userspace relies
> on cursor ioctls being unsynced. Converting the rockchip driver to
> atomic has significantly impacted cursor performance by making every
> cursor update wait for vblank.
>
> By skipping the vblank sync when the framebuffer has not changed (as is
> done in drm_atomic_helper_wait_for_vblanks()) we can avoid this for the
> common case of moving the cursor and only need to delay the cursor ioctl
> when the cursor icon changes.
>
> I originally inserted a check on legacy_cursor_update as well, but that
> caused a storm of iommu page faults. I didn't investigate the cause of
> those since this change gives enough of a performance improvement for my
> use case.
>
> This is RFC because of that and because the framebuffer_changed()
> function is copied from drm_atomic_helper.c as a quick way to test the
> result.
>
> Signed-off-by: John Keeping <john@metanate.com>

I've seen the effects now as well after making the atomic parts work in
my devtree, i.e. sluggish cursor movements. This patch fixes that issue,
so at least:

Tested-by: Heiko Stuebner <heiko@sntech.de>

Right now I still see flickering on animated cursors though (like the ones
used by KDE), which wasn't present before.

Heiko
The first two patches are unchanged since v1 but the comment in the
third has been expanded following Thierry's comments.

John Keeping (3):
  drm/atomic-helper: Export framebuffer_changed()
  drm/rockchip: don't wait for vblank if fb hasn't changed
  drm/rockchip: explain why we can't wait_for_vblanks

 drivers/gpu/drm/drm_atomic_helper.c        | 24 ++++++++++++++++++++----
 drivers/gpu/drm/rockchip/rockchip_drm_fb.c | 23 +++++++++++++++++++++--
 include/drm/drm_atomic_helper.h            |  4 ++++
 3 files changed, 45 insertions(+), 6 deletions(-)
On 2016-01-19 18:46, John Keeping wrote:
> The first two patches are unchanged since v1 but the comment in the
> third has been expanded following Thierry's comments.
>
> John Keeping (3):
>   drm/atomic-helper: Export framebuffer_changed()
>   drm/rockchip: don't wait for vblank if fb hasn't changed
>   drm/rockchip: explain why we can't wait_for_vblanks
>
>  drivers/gpu/drm/drm_atomic_helper.c        | 24 ++++++++++++++++++++----
>  drivers/gpu/drm/rockchip/rockchip_drm_fb.c | 23 +++++++++++++++++++++--
>  include/drm/drm_atomic_helper.h            |  4 ++++
>  3 files changed, 45 insertions(+), 6 deletions(-)
>

Hi John

Thanks for your fix. I've applied these three patches to my drm-next. :-)
diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_fb.c b/drivers/gpu/drm/rockchip/rockchip_drm_fb.c
index f784488..8fd9821 100644
--- a/drivers/gpu/drm/rockchip/rockchip_drm_fb.c
+++ b/drivers/gpu/drm/rockchip/rockchip_drm_fb.c
@@ -177,8 +177,28 @@ static void rockchip_crtc_wait_for_update(struct drm_crtc *crtc)
 		crtc_funcs->wait_for_update(crtc);
 }
 
+static bool framebuffer_changed(struct drm_device *dev,
+				struct drm_atomic_state *old_state,
+				struct drm_crtc *crtc)
+{
+	struct drm_plane *plane;
+	struct drm_plane_state *old_plane_state;
+	int i;
+
+	for_each_plane_in_state(old_state, plane, old_plane_state, i) {
+		if (plane->state->crtc != crtc &&
+		    old_plane_state->crtc != crtc)
+			continue;
+
+		if (plane->state->fb != old_plane_state->fb)
+			return true;
+	}
+
+	return false;
+}
+
 static void
-rockchip_atomic_wait_for_complete(struct drm_atomic_state *old_state)
+rockchip_atomic_wait_for_complete(struct drm_device *dev, struct drm_atomic_state *old_state)
 {
 	struct drm_crtc_state *old_crtc_state;
 	struct drm_crtc *crtc;
@@ -194,6 +214,9 @@ rockchip_atomic_wait_for_complete(struct drm_atomic_state *old_state)
 		if (!crtc->state->active)
 			continue;
 
+		if (!framebuffer_changed(dev, old_state, crtc))
+			continue;
+
 		ret = drm_crtc_vblank_get(crtc);
 		if (ret != 0)
 			continue;
@@ -241,7 +264,7 @@ rockchip_atomic_commit_complete(struct rockchip_atomic_commit *commit)
 
 	drm_atomic_helper_commit_planes(dev, state, true);
 
-	rockchip_atomic_wait_for_complete(state);
+	rockchip_atomic_wait_for_complete(dev, state);
 
 	drm_atomic_helper_cleanup_planes(dev, state);
 
As commented in drm_atomic_helper_wait_for_vblanks(), userspace relies
on cursor ioctls being unsynced. Converting the rockchip driver to
atomic has significantly impacted cursor performance by making every
cursor update wait for vblank.

By skipping the vblank sync when the framebuffer has not changed (as is
done in drm_atomic_helper_wait_for_vblanks()) we can avoid this for the
common case of moving the cursor and only need to delay the cursor ioctl
when the cursor icon changes.

I originally inserted a check on legacy_cursor_update as well, but that
caused a storm of iommu page faults. I didn't investigate the cause of
those since this change gives enough of a performance improvement for my
use case.

This is RFC because of that and because the framebuffer_changed()
function is copied from drm_atomic_helper.c as a quick way to test the
result.

Signed-off-by: John Keeping <john@metanate.com>
---
 drivers/gpu/drm/rockchip/rockchip_drm_fb.c | 27 +++++++++++++++++++++++++--
 1 file changed, 25 insertions(+), 2 deletions(-)
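
The skip condition described in the commit message (wait only when a plane on the CRTC got a different framebuffer) can be tried outside the kernel with a small user-space model. This is a sketch, not the kernel code: the model_* types are hypothetical stand-ins that carry only the two fields the check reads, and a plain array replaces for_each_plane_in_state().

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the DRM objects; identity is all that matters. */
struct model_fb { int id; };
struct model_crtc { int id; };

struct model_plane_state {
	const struct model_crtc *crtc; /* CRTC the plane is bound to, or NULL */
	const struct model_fb *fb;     /* framebuffer being scanned out, or NULL */
};

/* One plane touched by a commit: its state before and after. */
struct model_plane_update {
	struct model_plane_state old_state;
	struct model_plane_state new_state;
};

/*
 * Return true if any plane that was or is on `crtc` got a different
 * framebuffer in this commit. A pure cursor move keeps the same fb,
 * so it returns false and the caller can skip the vblank wait.
 */
bool model_framebuffer_changed(const struct model_plane_update *updates,
			       size_t count,
			       const struct model_crtc *crtc)
{
	for (size_t i = 0; i < count; i++) {
		const struct model_plane_update *u = &updates[i];

		/* Ignore planes that never touched this CRTC. */
		if (u->new_state.crtc != crtc && u->old_state.crtc != crtc)
			continue;

		if (u->new_state.fb != u->old_state.fb)
			return true;
	}

	return false;
}
```

In this model a cursor move is an update whose old and new states point at the same framebuffer, while a cursor icon change points at a different one, which is the only case where the wait is still needed.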