Message ID | 20190107174146.2947-1-nicholas.kazlauskas@amd.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v2] drm: Block fb changes for async plane updates | expand |
Hi, [This is an automated email] This commit has been processed because it contains a "Fixes:" tag, fixing commit: fef9df8b5945 drm/atomic: initial support for asynchronous plane update. The bot has tested the following trees: v4.20.0, v4.19.13, v4.14.91. v4.20.0: Build OK! v4.19.13: Build OK! v4.14.91: Build failed! Errors: drivers/gpu/drm/drm_atomic_helper.c:1421:6: error: ‘old_plane_state’ undeclared (first use in this function); did you mean ‘drm_plane_state’? drivers/gpu/drm/drm_atomic_helper.c:1421:29: error: ‘new_plane_state’ undeclared (first use in this function); did you mean ‘drm_plane_state’? How should we proceed with this patch? -- Thanks, Sasha
On 2/13/19 5:18 PM, Nicholas Kazlauskas wrote: > The prepare_fb call always happens on new_plane_state. > > The drm_atomic_helper_cleanup_planes checks to see if > plane state pointer has changed when deciding to call cleanup_fb on > either the new_plane_state or the old_plane_state. > > For a non-async atomic commit the state pointer is swapped, so this > helper calls prepare_fb on the new_plane_state and cleanup_fb on the > old_plane_state. This makes sense, since we want to prepare the > framebuffer we are going to use and cleanup the the framebuffer we are > no longer using. > > For the async atomic update helpers this differs. The async atomic > update helpers perform in-place updates on the existing state. They call > drm_atomic_helper_cleanup_planes but the state pointer is not swapped. > This means that prepare_fb is called on the new_plane_state and > cleanup_fb is called on the new_plane_state (not the old). > > In the case where old_plane_state->fb == new_plane_state->fb then > there should be no behavioral difference between an async update > and a non-async commit. But there are issues that arise when > old_plane_state->fb != new_plane_state->fb. > > The first is that the new_plane_state->fb is immediately cleaned up > after it has been prepared, so we're using a fb that we shouldn't > be. > > The second occurs during a sequence of async atomic updates and > non-async regular atomic commits. Suppose there are two framebuffers > being interleaved in a double-buffering scenario, fb1 and fb2: > > - Async update, oldfb = NULL, newfb = fb1, prepare fb1, cleanup fb1 > - Async update, oldfb = fb1, newfb = fb2, prepare fb2, cleanup fb2 > - Non-async commit, oldfb = fb2, newfb = fb1, prepare fb1, cleanup fb2 > > We call cleanup_fb on fb2 twice in this example scenario, and any > further use will result in use-after-free. > > The simple fix to this problem is to block framebuffer changes > in the drm_atomic_helper_async_check function for now. > > Cc: Daniel Vetter <daniel.vetter@ffwll.ch> > Cc: Harry Wentland <harry.wentland@amd.com> > Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com> > Cc: <stable@vger.kernel.org> # v4.14+ > Fixes: fef9df8b5945 ("drm/atomic: initial support for asynchronous plane update") > Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> > --- > drivers/gpu/drm/drm_atomic_helper.c | 9 +++++++++ > 1 file changed, 9 insertions(+) > > diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c > index 54e2ae614dcc..f4290f6b0c38 100644 > --- a/drivers/gpu/drm/drm_atomic_helper.c > +++ b/drivers/gpu/drm/drm_atomic_helper.c > @@ -1602,6 +1602,15 @@ int drm_atomic_helper_async_check(struct drm_device *dev, > old_plane_state->crtc != new_plane_state->crtc) > return -EINVAL; > > + /* > + * FIXME: Since prepare_fb and cleanup_fb are always called on > + * the new_plane_state for async updates we need to block framebuffer > + * changes. This prevents use of a fb that's been cleaned up and > + * double cleanups from occuring. > + */ > + if (old_plane_state->fb != new_plane_state->fb) > + return -EINVAL; > + > funcs = plane->helper_private; > if (!funcs->atomic_async_update) > return -EINVAL; > Hi, Thank you for working on this. I have two points I would like to raise: 1) cursor update is too slow: I am working on the v8 of https://lore.kernel.org/patchwork/patch/949264/ "drm/i915: update cursors asynchronously through atomic". But with this patch applied, the following tests fails: cursor-vs-flip-atomic-transitions-varying-size cursor-vs-flip-toggle cursor-vs-flip-varying-size with errors of type: "CRITICAL: completed 97 cursor updated in a period of 30 flips, we expect to complete approximately 15360 updates, with the threshold set at 7680" I guess it's because what should be async updates are being turned into non-async updates (which is slower). Well, I guess we need to work on a proper solution for the FIXME above, this is just an observation/note. 2) I have another possibly related issue: From the tests I am making, unless I did some mistake or miss interpretation (which is possible since I am new to drm), it seems that there is a problem if an async update re-uses the same fb from a previous non-async commit, for example: - Non-async commit, oldfb = NULL, newfb = fb1, prepare fb1, cleanup fb2, - Async commit, oldfb = NULL, newfb = fb1, prepare fb1, cleanup fb1 # double prepare fb1 The cursor through atomic patch was "solving" this by bypassing the prepare call with a: @@ -13495,6 +13503,9 @@ intel_prepare_plane_fb(struct drm_plane *plane, struct drm_i915_gem_object *old_obj = intel_fb_obj(plane->state->fb); int ret; + if (new_state->state->async_update) + return 0; + if (old_obj) { struct drm_crtc_state *crtc_state = drm_atomic_get_new_crtc_state(new_state->state, Which looks really wrong to me, but without it, igt tests (plane_cursor_legacy, kms_cursor_legacy) fail, I get lots of I915_VMA_PIN_OVERFLOW, which suggest double pinning the frame buffers. I did this really ugly test to bypass the prepare call just if the buffer is already pinned: https://gitlab.collabora.com/koike/linux/commit/2d01d4af30347ff43fbac254d8ec305cd6fe4aeb With this hack igt tests pass. As I am new to DRM, I would appreciate if someone could confirm that this is a real bug or please clarify if I am miss interpreting anything. If this is a real bug, then if someone could give me pointers in how to properly fix this it would be great. Thanks a lot Helen
On 2/13/19 7:21 PM, Helen Koike wrote: > > > On 2/13/19 5:18 PM, Nicholas Kazlauskas wrote: >> The prepare_fb call always happens on new_plane_state. >> >> The drm_atomic_helper_cleanup_planes checks to see if >> plane state pointer has changed when deciding to call cleanup_fb on >> either the new_plane_state or the old_plane_state. >> >> For a non-async atomic commit the state pointer is swapped, so this >> helper calls prepare_fb on the new_plane_state and cleanup_fb on the >> old_plane_state. This makes sense, since we want to prepare the >> framebuffer we are going to use and cleanup the the framebuffer we are >> no longer using. >> >> For the async atomic update helpers this differs. The async atomic >> update helpers perform in-place updates on the existing state. They call >> drm_atomic_helper_cleanup_planes but the state pointer is not swapped. >> This means that prepare_fb is called on the new_plane_state and >> cleanup_fb is called on the new_plane_state (not the old). >> >> In the case where old_plane_state->fb == new_plane_state->fb then >> there should be no behavioral difference between an async update >> and a non-async commit. But there are issues that arise when >> old_plane_state->fb != new_plane_state->fb. >> >> The first is that the new_plane_state->fb is immediately cleaned up >> after it has been prepared, so we're using a fb that we shouldn't >> be. >> >> The second occurs during a sequence of async atomic updates and >> non-async regular atomic commits. Suppose there are two framebuffers >> being interleaved in a double-buffering scenario, fb1 and fb2: >> >> - Async update, oldfb = NULL, newfb = fb1, prepare fb1, cleanup fb1 >> - Async update, oldfb = fb1, newfb = fb2, prepare fb2, cleanup fb2 >> - Non-async commit, oldfb = fb2, newfb = fb1, prepare fb1, cleanup fb2 >> >> We call cleanup_fb on fb2 twice in this example scenario, and any >> further use will result in use-after-free. >> >> The simple fix to this problem is to block framebuffer changes >> in the drm_atomic_helper_async_check function for now. >> >> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> >> Cc: Harry Wentland <harry.wentland@amd.com> >> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com> >> Cc: <stable@vger.kernel.org> # v4.14+ >> Fixes: fef9df8b5945 ("drm/atomic: initial support for asynchronous plane update") >> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> >> --- >> drivers/gpu/drm/drm_atomic_helper.c | 9 +++++++++ >> 1 file changed, 9 insertions(+) >> >> diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c >> index 54e2ae614dcc..f4290f6b0c38 100644 >> --- a/drivers/gpu/drm/drm_atomic_helper.c >> +++ b/drivers/gpu/drm/drm_atomic_helper.c >> @@ -1602,6 +1602,15 @@ int drm_atomic_helper_async_check(struct drm_device *dev, >> old_plane_state->crtc != new_plane_state->crtc) >> return -EINVAL; >> >> + /* >> + * FIXME: Since prepare_fb and cleanup_fb are always called on >> + * the new_plane_state for async updates we need to block framebuffer >> + * changes. This prevents use of a fb that's been cleaned up and >> + * double cleanups from occuring. >> + */ >> + if (old_plane_state->fb != new_plane_state->fb) >> + return -EINVAL; >> + >> funcs = plane->helper_private; >> if (!funcs->atomic_async_update) >> return -EINVAL; >> > > Hi, > > Thank you for working on this. > > I have two points I would like to raise: > > 1) cursor update is too slow: > > I am working on the v8 of > https://lore.kernel.org/patchwork/patch/949264/ "drm/i915: update > cursors asynchronously through atomic". > > But with this patch applied, the following tests fails: > > cursor-vs-flip-atomic-transitions-varying-size > cursor-vs-flip-toggle > cursor-vs-flip-varying-size > > with errors of type: > > "CRITICAL: completed 97 cursor updated in a period of 30 flips, we > expect to complete approximately 15360 updates, with the threshold set > at 7680" > > I guess it's because what should be async updates are being turned into > non-async updates (which is slower). > > Well, I guess we need to work on a proper solution for the FIXME above, > this is just an observation/note. > > 2) I have another possibly related issue: > > From the tests I am making, unless I did some mistake or miss > interpretation (which is possible since I am new to drm), it seems that > there is a problem if an async update re-uses the same fb from a > previous non-async commit, for example: > > - Non-async commit, oldfb = NULL, newfb = fb1, prepare fb1, cleanup fb2, > - Async commit, oldfb = NULL, newfb = fb1, prepare fb1, cleanup fb1 # > double prepare fb1 Sorry, this is solved by this patch, I meant this: - Non-async commit, oldfb = NULL, newfb = fb1, prepare fb1, cleanup fb2, - Async commit, oldfb = fb1, newfb = fb1, prepare fb1, cleanup fb1 #double prepare fb1 But I was told that async commit should block if there is a pending commit, so this scenario shouldn't happen, but there is still something off (please see below). > > The cursor through atomic patch was "solving" this by bypassing the > prepare call with a: > > @@ -13495,6 +13503,9 @@ intel_prepare_plane_fb(struct drm_plane *plane, > struct drm_i915_gem_object *old_obj = > intel_fb_obj(plane->state->fb); > int ret; > > + if (new_state->state->async_update) > + return 0; > + > if (old_obj) { > struct drm_crtc_state *crtc_state = > drm_atomic_get_new_crtc_state(new_state->state, > > > Which looks really wrong to me, but without it, igt tests > (plane_cursor_legacy, kms_cursor_legacy) fail, I get lots of > I915_VMA_PIN_OVERFLOW, which suggest double pinning the frame buffers. Please, ignore this error above, when I ran the test this patch (Block fb changes) wasn't applied when I got this error about pin overflow, but I still have issues: With this patch (Block fb changes) applied I get several dmesg failures on WARN_ON(to_intel_plane_state(state)->vma) in the intel_plane_destroy_state() function. https://people.collabora.com/~koike/results-5.0.0-rc1+-v8-tmp/html/problems.html But I don't get this errors with the hack below (that bypass intel_prepare_plane_fb if the buffer is already pinned). I'm still not sure why, I would really appreciate if any one has any pointers. Thanks Helen > > I did this really ugly test to bypass the prepare call just if the > buffer is already pinned: > https://gitlab.collabora.com/koike/linux/commit/2d01d4af30347ff43fbac254d8ec305cd6fe4aeb > > With this hack igt tests pass. > > As I am new to DRM, I would appreciate if someone could confirm that > this is a real bug or please clarify if I am miss interpreting anything. > > If this is a real bug, then if someone could give me pointers in how to > properly fix this it would be great. > > Thanks a lot > Helen >
On Wed, Feb 13, 2019 at 09:32:43PM -0200, Helen Koike wrote: > > > On 2/13/19 7:21 PM, Helen Koike wrote: > > > > > > On 2/13/19 5:18 PM, Nicholas Kazlauskas wrote: > >> The prepare_fb call always happens on new_plane_state. > >> > >> The drm_atomic_helper_cleanup_planes checks to see if > >> plane state pointer has changed when deciding to call cleanup_fb on > >> either the new_plane_state or the old_plane_state. > >> > >> For a non-async atomic commit the state pointer is swapped, so this > >> helper calls prepare_fb on the new_plane_state and cleanup_fb on the > >> old_plane_state. This makes sense, since we want to prepare the > >> framebuffer we are going to use and cleanup the the framebuffer we are > >> no longer using. > >> > >> For the async atomic update helpers this differs. The async atomic > >> update helpers perform in-place updates on the existing state. They call > >> drm_atomic_helper_cleanup_planes but the state pointer is not swapped. > >> This means that prepare_fb is called on the new_plane_state and > >> cleanup_fb is called on the new_plane_state (not the old). > >> > >> In the case where old_plane_state->fb == new_plane_state->fb then > >> there should be no behavioral difference between an async update > >> and a non-async commit. But there are issues that arise when > >> old_plane_state->fb != new_plane_state->fb. > >> > >> The first is that the new_plane_state->fb is immediately cleaned up > >> after it has been prepared, so we're using a fb that we shouldn't > >> be. > >> > >> The second occurs during a sequence of async atomic updates and > >> non-async regular atomic commits. Suppose there are two framebuffers > >> being interleaved in a double-buffering scenario, fb1 and fb2: > >> > >> - Async update, oldfb = NULL, newfb = fb1, prepare fb1, cleanup fb1 > >> - Async update, oldfb = fb1, newfb = fb2, prepare fb2, cleanup fb2 > >> - Non-async commit, oldfb = fb2, newfb = fb1, prepare fb1, cleanup fb2 > >> > >> We call cleanup_fb on fb2 twice in this example scenario, and any > >> further use will result in use-after-free. > >> > >> The simple fix to this problem is to block framebuffer changes > >> in the drm_atomic_helper_async_check function for now. > >> > >> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> > >> Cc: Harry Wentland <harry.wentland@amd.com> > >> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com> > >> Cc: <stable@vger.kernel.org> # v4.14+ > >> Fixes: fef9df8b5945 ("drm/atomic: initial support for asynchronous plane update") > >> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> > >> --- > >> drivers/gpu/drm/drm_atomic_helper.c | 9 +++++++++ > >> 1 file changed, 9 insertions(+) > >> > >> diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c > >> index 54e2ae614dcc..f4290f6b0c38 100644 > >> --- a/drivers/gpu/drm/drm_atomic_helper.c > >> +++ b/drivers/gpu/drm/drm_atomic_helper.c > >> @@ -1602,6 +1602,15 @@ int drm_atomic_helper_async_check(struct drm_device *dev, > >> old_plane_state->crtc != new_plane_state->crtc) > >> return -EINVAL; > >> > >> + /* > >> + * FIXME: Since prepare_fb and cleanup_fb are always called on > >> + * the new_plane_state for async updates we need to block framebuffer > >> + * changes. This prevents use of a fb that's been cleaned up and > >> + * double cleanups from occuring. > >> + */ > >> + if (old_plane_state->fb != new_plane_state->fb) > >> + return -EINVAL; > >> + > >> funcs = plane->helper_private; > >> if (!funcs->atomic_async_update) > >> return -EINVAL; > >> > > > > Hi, > > > > Thank you for working on this. > > > > I have two points I would like to raise: > > > > 1) cursor update is too slow: > > > > I am working on the v8 of > > https://lore.kernel.org/patchwork/patch/949264/ "drm/i915: update > > cursors asynchronously through atomic". > > > > But with this patch applied, the following tests fails: > > > > cursor-vs-flip-atomic-transitions-varying-size > > cursor-vs-flip-toggle > > cursor-vs-flip-varying-size > > > > with errors of type: > > > > "CRITICAL: completed 97 cursor updated in a period of 30 flips, we > > expect to complete approximately 15360 updates, with the threshold set > > at 7680" > > > > I guess it's because what should be async updates are being turned into > > non-async updates (which is slower). > > > > Well, I guess we need to work on a proper solution for the FIXME above, > > this is just an observation/note. > > > > 2) I have another possibly related issue: > > > > From the tests I am making, unless I did some mistake or miss > > interpretation (which is possible since I am new to drm), it seems that > > there is a problem if an async update re-uses the same fb from a > > previous non-async commit, for example: > > > > - Non-async commit, oldfb = NULL, newfb = fb1, prepare fb1, cleanup fb2, > > - Async commit, oldfb = NULL, newfb = fb1, prepare fb1, cleanup fb1 # > > double prepare fb1 > > Sorry, this is solved by this patch, I meant this: > > - Non-async commit, oldfb = NULL, newfb = fb1, prepare fb1, cleanup fb2, > - Async commit, oldfb = fb1, newfb = fb1, prepare fb1, cleanup fb1 > #double prepare fb1 > > But I was told that async commit should block if there is a pending > commit, so this scenario shouldn't happen, but there is still something > off (please see below). > > > > > The cursor through atomic patch was "solving" this by bypassing the > > prepare call with a: > > > > @@ -13495,6 +13503,9 @@ intel_prepare_plane_fb(struct drm_plane *plane, > > struct drm_i915_gem_object *old_obj = > > intel_fb_obj(plane->state->fb); > > int ret; > > > > + if (new_state->state->async_update) > > + return 0; > > + > > if (old_obj) { > > struct drm_crtc_state *crtc_state = > > drm_atomic_get_new_crtc_state(new_state->state, > > > > > > Which looks really wrong to me, but without it, igt tests > > (plane_cursor_legacy, kms_cursor_legacy) fail, I get lots of > > I915_VMA_PIN_OVERFLOW, which suggest double pinning the frame buffers. > > Please, ignore this error above, when I ran the test this patch (Block > fb changes) wasn't applied when I got this error about pin overflow, but > I still have issues: > > With this patch (Block fb changes) applied I get several dmesg failures > on WARN_ON(to_intel_plane_state(state)->vma) in the > intel_plane_destroy_state() function. > > https://people.collabora.com/~koike/results-5.0.0-rc1+-v8-tmp/html/problems.html > > But I don't get this errors with the hack below (that bypass > intel_prepare_plane_fb if the buffer is already pinned). > > I'm still not sure why, I would really appreciate if any one has any > pointers. i915 largely uses its own modeset framework, so there's a good chance for differences in how we pin/unpin buffers, compared to common atomic helpers. That's about my best guess. -Daniel > > Thanks > Helen > > > > > I did this really ugly test to bypass the prepare call just if the > > buffer is already pinned: > > https://gitlab.collabora.com/koike/linux/commit/2d01d4af30347ff43fbac254d8ec305cd6fe4aeb > > > > With this hack igt tests pass. > > > > As I am new to DRM, I would appreciate if someone could confirm that > > this is a real bug or please clarify if I am miss interpreting anything. > > > > If this is a real bug, then if someone could give me pointers in how to > > properly fix this it would be great. > > > > Thanks a lot > > Helen > >
On 2/14/19 6:21 AM, Daniel Vetter wrote: > On Wed, Feb 13, 2019 at 09:32:43PM -0200, Helen Koike wrote: >> >> >> On 2/13/19 7:21 PM, Helen Koike wrote: >>> >>> >>> On 2/13/19 5:18 PM, Nicholas Kazlauskas wrote: >>>> The prepare_fb call always happens on new_plane_state. >>>> >>>> The drm_atomic_helper_cleanup_planes checks to see if >>>> plane state pointer has changed when deciding to call cleanup_fb on >>>> either the new_plane_state or the old_plane_state. >>>> >>>> For a non-async atomic commit the state pointer is swapped, so this >>>> helper calls prepare_fb on the new_plane_state and cleanup_fb on the >>>> old_plane_state. This makes sense, since we want to prepare the >>>> framebuffer we are going to use and cleanup the the framebuffer we are >>>> no longer using. >>>> >>>> For the async atomic update helpers this differs. The async atomic >>>> update helpers perform in-place updates on the existing state. They call >>>> drm_atomic_helper_cleanup_planes but the state pointer is not swapped. >>>> This means that prepare_fb is called on the new_plane_state and >>>> cleanup_fb is called on the new_plane_state (not the old). >>>> >>>> In the case where old_plane_state->fb == new_plane_state->fb then >>>> there should be no behavioral difference between an async update >>>> and a non-async commit. But there are issues that arise when >>>> old_plane_state->fb != new_plane_state->fb. >>>> >>>> The first is that the new_plane_state->fb is immediately cleaned up >>>> after it has been prepared, so we're using a fb that we shouldn't >>>> be. >>>> >>>> The second occurs during a sequence of async atomic updates and >>>> non-async regular atomic commits. Suppose there are two framebuffers >>>> being interleaved in a double-buffering scenario, fb1 and fb2: >>>> >>>> - Async update, oldfb = NULL, newfb = fb1, prepare fb1, cleanup fb1 >>>> - Async update, oldfb = fb1, newfb = fb2, prepare fb2, cleanup fb2 >>>> - Non-async commit, oldfb = fb2, newfb = fb1, prepare fb1, cleanup fb2 >>>> >>>> We call cleanup_fb on fb2 twice in this example scenario, and any >>>> further use will result in use-after-free. >>>> >>>> The simple fix to this problem is to block framebuffer changes >>>> in the drm_atomic_helper_async_check function for now. >>>> >>>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> >>>> Cc: Harry Wentland <harry.wentland@amd.com> >>>> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com> >>>> Cc: <stable@vger.kernel.org> # v4.14+ >>>> Fixes: fef9df8b5945 ("drm/atomic: initial support for asynchronous plane update") >>>> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> >>>> --- >>>> drivers/gpu/drm/drm_atomic_helper.c | 9 +++++++++ >>>> 1 file changed, 9 insertions(+) >>>> >>>> diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c >>>> index 54e2ae614dcc..f4290f6b0c38 100644 >>>> --- a/drivers/gpu/drm/drm_atomic_helper.c >>>> +++ b/drivers/gpu/drm/drm_atomic_helper.c >>>> @@ -1602,6 +1602,15 @@ int drm_atomic_helper_async_check(struct drm_device *dev, >>>> old_plane_state->crtc != new_plane_state->crtc) >>>> return -EINVAL; >>>> >>>> + /* >>>> + * FIXME: Since prepare_fb and cleanup_fb are always called on >>>> + * the new_plane_state for async updates we need to block framebuffer >>>> + * changes. This prevents use of a fb that's been cleaned up and >>>> + * double cleanups from occuring. >>>> + */ >>>> + if (old_plane_state->fb != new_plane_state->fb) >>>> + return -EINVAL; >>>> + >>>> funcs = plane->helper_private; >>>> if (!funcs->atomic_async_update) >>>> return -EINVAL; >>>> >>> >>> Hi, >>> >>> Thank you for working on this. >>> >>> I have two points I would like to raise: >>> >>> 1) cursor update is too slow: >>> >>> I am working on the v8 of >>> https://lore.kernel.org/patchwork/patch/949264/ "drm/i915: update >>> cursors asynchronously through atomic". >>> >>> But with this patch applied, the following tests fails: >>> >>> cursor-vs-flip-atomic-transitions-varying-size >>> cursor-vs-flip-toggle >>> cursor-vs-flip-varying-size >>> >>> with errors of type: >>> >>> "CRITICAL: completed 97 cursor updated in a period of 30 flips, we >>> expect to complete approximately 15360 updates, with the threshold set >>> at 7680" >>> >>> I guess it's because what should be async updates are being turned into >>> non-async updates (which is slower). >>> >>> Well, I guess we need to work on a proper solution for the FIXME above, >>> this is just an observation/note. >>> >>> 2) I have another possibly related issue: >>> >>> From the tests I am making, unless I did some mistake or miss >>> interpretation (which is possible since I am new to drm), it seems that >>> there is a problem if an async update re-uses the same fb from a >>> previous non-async commit, for example: >>> >>> - Non-async commit, oldfb = NULL, newfb = fb1, prepare fb1, cleanup fb2, >>> - Async commit, oldfb = NULL, newfb = fb1, prepare fb1, cleanup fb1 # >>> double prepare fb1 >> >> Sorry, this is solved by this patch, I meant this: >> >> - Non-async commit, oldfb = NULL, newfb = fb1, prepare fb1, cleanup fb2, >> - Async commit, oldfb = fb1, newfb = fb1, prepare fb1, cleanup fb1 >> #double prepare fb1 >> >> But I was told that async commit should block if there is a pending >> commit, so this scenario shouldn't happen, but there is still something >> off (please see below). >> >>> >>> The cursor through atomic patch was "solving" this by bypassing the >>> prepare call with a: >>> >>> @@ -13495,6 +13503,9 @@ intel_prepare_plane_fb(struct drm_plane *plane, >>> struct drm_i915_gem_object *old_obj = >>> intel_fb_obj(plane->state->fb); >>> int ret; >>> >>> + if (new_state->state->async_update) >>> + return 0; >>> + >>> if (old_obj) { >>> struct drm_crtc_state *crtc_state = >>> drm_atomic_get_new_crtc_state(new_state->state, >>> >>> >>> Which looks really wrong to me, but without it, igt tests >>> (plane_cursor_legacy, kms_cursor_legacy) fail, I get lots of >>> I915_VMA_PIN_OVERFLOW, which suggest double pinning the frame buffers. >> >> Please, ignore this error above, when I ran the test this patch (Block >> fb changes) wasn't applied when I got this error about pin overflow, but >> I still have issues: >> >> With this patch (Block fb changes) applied I get several dmesg failures >> on WARN_ON(to_intel_plane_state(state)->vma) in the >> intel_plane_destroy_state() function. >> >> https://people.collabora.com/~koike/results-5.0.0-rc1+-v8-tmp/html/problems.html >> >> But I don't get this errors with the hack below (that bypass >> intel_prepare_plane_fb if the buffer is already pinned). >> >> I'm still not sure why, I would really appreciate if any one has any >> pointers. > > i915 largely uses its own modeset framework, so there's a good chance for > differences in how we pin/unpin buffers, compared to common atomic > helpers. That's about my best guess. > -Daniel Indeed, there was a missing drm_atomic_helper_cleanup_planes() call in intel_atomic_commit(), now it is working fine. Thanks and sorry for the noise on this. Helen > >> >> Thanks >> Helen >> >>> >>> I did this really ugly test to bypass the prepare call just if the >>> buffer is already pinned: >>> https://gitlab.collabora.com/koike/linux/commit/2d01d4af30347ff43fbac254d8ec305cd6fe4aeb >>> >>> With this hack igt tests pass. >>> >>> As I am new to DRM, I would appreciate if someone could confirm that >>> this is a real bug or please clarify if I am miss interpreting anything. >>> >>> If this is a real bug, then if someone could give me pointers in how to >>> properly fix this it would be great. >>> >>> Thanks a lot >>> Helen >>> >
On 2019-01-07 12:41 p.m., Nicholas Kazlauskas wrote: > The prepare_fb call always happens on new_plane_state. > > The drm_atomic_helper_cleanup_planes checks to see if > plane state pointer has changed when deciding to call cleanup_fb on > either the new_plane_state or the old_plane_state. > > For a non-async atomic commit the state pointer is swapped, so this > helper calls prepare_fb on the new_plane_state and cleanup_fb on the > old_plane_state. This makes sense, since we want to prepare the > framebuffer we are going to use and cleanup the the framebuffer we are > no longer using. > > For the async atomic update helpers this differs. The async atomic > update helpers perform in-place updates on the existing state. They call > drm_atomic_helper_cleanup_planes but the state pointer is not swapped. > This means that prepare_fb is called on the new_plane_state and > cleanup_fb is called on the new_plane_state (not the old). > > In the case where old_plane_state->fb == new_plane_state->fb then > there should be no behavioral difference between an async update > and a non-async commit. But there are issues that arise when > old_plane_state->fb != new_plane_state->fb. > > The first is that the new_plane_state->fb is immediately cleaned up > after it has been prepared, so we're using a fb that we shouldn't > be. > > The second occurs during a sequence of async atomic updates and > non-async regular atomic commits. Suppose there are two framebuffers > being interleaved in a double-buffering scenario, fb1 and fb2: > > - Async update, oldfb = NULL, newfb = fb1, prepare fb1, cleanup fb1 > - Async update, oldfb = fb1, newfb = fb2, prepare fb2, cleanup fb2 > - Non-async commit, oldfb = fb2, newfb = fb1, prepare fb1, cleanup fb2 > > We call cleanup_fb on fb2 twice in this example scenario, and any > further use will result in use-after-free. > > The simple fix to this problem is to block framebuffer changes > in the drm_atomic_helper_async_check function for now. > > Cc: Daniel Vetter <daniel.vetter@ffwll.ch> > Cc: Harry Wentland <harry.wentland@amd.com> > Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com> > Cc: <stable@vger.kernel.org> # v4.14+ > Fixes: fef9df8b5945 ("drm/atomic: initial support for asynchronous plane update") > Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Patch is merged to drm-misc-fixes with Daniel's RB from IRC and Andrey's and my AB. Harry > --- > drivers/gpu/drm/drm_atomic_helper.c | 9 +++++++++ > 1 file changed, 9 insertions(+) > > diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c > index 54e2ae614dcc..f4290f6b0c38 100644 > --- a/drivers/gpu/drm/drm_atomic_helper.c > +++ b/drivers/gpu/drm/drm_atomic_helper.c > @@ -1602,6 +1602,15 @@ int drm_atomic_helper_async_check(struct drm_device *dev, > old_plane_state->crtc != new_plane_state->crtc) > return -EINVAL; > > + /* > + * FIXME: Since prepare_fb and cleanup_fb are always called on > + * the new_plane_state for async updates we need to block framebuffer > + * changes. This prevents use of a fb that's been cleaned up and > + * double cleanups from occuring. > + */ > + if (old_plane_state->fb != new_plane_state->fb) > + return -EINVAL; > + > funcs = plane->helper_private; > if (!funcs->atomic_async_update) > return -EINVAL; >
diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c index 54e2ae614dcc..f4290f6b0c38 100644 --- a/drivers/gpu/drm/drm_atomic_helper.c +++ b/drivers/gpu/drm/drm_atomic_helper.c @@ -1602,6 +1602,15 @@ int drm_atomic_helper_async_check(struct drm_device *dev, old_plane_state->crtc != new_plane_state->crtc) return -EINVAL; + /* + * FIXME: Since prepare_fb and cleanup_fb are always called on + * the new_plane_state for async updates we need to block framebuffer + * changes. This prevents use of a fb that's been cleaned up and + * double cleanups from occuring. + */ + if (old_plane_state->fb != new_plane_state->fb) + return -EINVAL; + funcs = plane->helper_private; if (!funcs->atomic_async_update) return -EINVAL;
The prepare_fb call always happens on new_plane_state. The drm_atomic_helper_cleanup_planes checks to see if plane state pointer has changed when deciding to call cleanup_fb on either the new_plane_state or the old_plane_state. For a non-async atomic commit the state pointer is swapped, so this helper calls prepare_fb on the new_plane_state and cleanup_fb on the old_plane_state. This makes sense, since we want to prepare the framebuffer we are going to use and cleanup the the framebuffer we are no longer using. For the async atomic update helpers this differs. The async atomic update helpers perform in-place updates on the existing state. They call drm_atomic_helper_cleanup_planes but the state pointer is not swapped. This means that prepare_fb is called on the new_plane_state and cleanup_fb is called on the new_plane_state (not the old). In the case where old_plane_state->fb == new_plane_state->fb then there should be no behavioral difference between an async update and a non-async commit. But there are issues that arise when old_plane_state->fb != new_plane_state->fb. The first is that the new_plane_state->fb is immediately cleaned up after it has been prepared, so we're using a fb that we shouldn't be. The second occurs during a sequence of async atomic updates and non-async regular atomic commits. Suppose there are two framebuffers being interleaved in a double-buffering scenario, fb1 and fb2: - Async update, oldfb = NULL, newfb = fb1, prepare fb1, cleanup fb1 - Async update, oldfb = fb1, newfb = fb2, prepare fb2, cleanup fb2 - Non-async commit, oldfb = fb2, newfb = fb1, prepare fb1, cleanup fb2 We call cleanup_fb on fb2 twice in this example scenario, and any further use will result in use-after-free. The simple fix to this problem is to block framebuffer changes in the drm_atomic_helper_async_check function for now. Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Cc: <stable@vger.kernel.org> # v4.14+ Fixes: fef9df8b5945 ("drm/atomic: initial support for asynchronous plane update") Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> --- drivers/gpu/drm/drm_atomic_helper.c | 9 +++++++++ 1 file changed, 9 insertions(+)