diff mbox series

drm: document and enforce rules around "spurious" EBUSY from atomic_commit

Message ID 20200922181834.2913552-1-daniel.vetter@ffwll.ch (mailing list archive)
State New, archived
Headers show
Series drm: document and enforce rules around "spurious" EBUSY from atomic_commit | expand

Commit Message

Daniel Vetter Sept. 22, 2020, 6:18 p.m. UTC
When doing an atomic modeset with ALLOW_MODESET drivers are allowed to
pull in arbitrary other resources, including CRTCs (e.g. when
reconfiguring global resources).

But in nonblocking mode userspace has then no idea this happened,
which can lead to spurious EBUSY calls, both:
- when that other CRTC is currently busy doing a page_flip the
  ALLOW_MODESET commit can fail with an EBUSY
- on the other CRTC a normal atomic flip can fail with EBUSY because
  of the additional commit inserted by the kernel without userspace's
  knowledge

For blocking commits this isn't a problem, because everyone else will
just block until all the CRTC are reconfigured. Only thing userspace
can notice is the dropped frames without any reason for why frames got
dropped.

Consensus is that we need new uapi to handle this properly, but no one
has any idea what exactly the new uapi should look like. Since this
has been shipping for years already compositors need to deal no matter
what, so as a first step just try to enforce this across drivers
better with some checks.

v2: Add comments and a WARN_ON to enforce this only when allowed - we
don't want to silently convert page flips into blocking plane updates
just because the driver is buggy.

v3: Fix inverted WARN_ON (Pekka).

v4: Drop the uapi changes, only add a WARN_ON for now to enforce some
rules for drivers.

References: https://lists.freedesktop.org/archives/dri-devel/2018-July/182281.html
Bugzilla: https://gitlab.freedesktop.org/wayland/weston/issues/24#note_9568
Cc: Daniel Stone <daniel@fooishbar.org>
Cc: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Cc: Simon Ser <contact@emersion.fr>
Cc: stable@vger.kernel.org
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
 drivers/gpu/drm/drm_atomic.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

Comments

Daniel Stone Sept. 22, 2020, 7:22 p.m. UTC | #1
Hi,

On Tue, 22 Sep 2020 at 19:18, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> +       for_each_new_crtc_in_state(state, crtc, old_crtc_state, i)
> +               affected_crtc |= drm_crtc_mask(crtc);
> +
> +       /*
> +        * For commits that allow modesets drivers can add other CRTCs to the
> +        * atomic commit, e.g. when they need to reallocate global resources.
> +        * This can cause spurious EBUSY, which robs compositors of a very
> +        * effective sanity check for their drawing loop. Therefor only allow
> +        * this for modeset commits.
> +        *
> +        * FIXME: Should add affected_crtc mask to the ATOMIC IOCTL as an output
> +        * so compositors know what's going on.
> +        */
> +       if (affected_crtc != requested_crtc) {
> +               /* adding other CRTC is only allowed for modeset commits */
> +               WARN_ON(!state->allow_modeset);
> +       }
> +

Can we please add a DRM_DEBUG() here, regardless of the WARN_ON(), to
let people know what's happening?

With that, R-b me.

Cheers,
Daniel
Pekka Paalanen Sept. 23, 2020, 8:17 a.m. UTC | #2
On Tue, 22 Sep 2020 20:18:34 +0200
Daniel Vetter <daniel.vetter@ffwll.ch> wrote:

> When doing an atomic modeset with ALLOW_MODESET drivers are allowed to
> pull in arbitrary other resources, including CRTCs (e.g. when
> reconfiguring global resources).
> 
> But in nonblocking mode userspace has then no idea this happened,
> which can lead to spurious EBUSY calls, both:
> - when that other CRTC is currently busy doing a page_flip the
>   ALLOW_MODESET commit can fail with an EBUSY
> - on the other CRTC a normal atomic flip can fail with EBUSY because
>   of the additional commit inserted by the kernel without userspace's
>   knowledge
> 
> For blocking commits this isn't a problem, because everyone else will
> just block until all the CRTC are reconfigured. Only thing userspace
> can notice is the dropped frames without any reason for why frames got
> dropped.
> 
> Consensus is that we need new uapi to handle this properly, but no one
> has any idea what exactly the new uapi should look like. Since this
> has been shipping for years already compositors need to deal no matter
> what, so as a first step just try to enforce this across drivers
> better with some checks.
> 
> v2: Add comments and a WARN_ON to enforce this only when allowed - we
> don't want to silently convert page flips into blocking plane updates
> just because the driver is buggy.
> 
> v3: Fix inverted WARN_ON (Pekka).
> 
> v4: Drop the uapi changes, only add a WARN_ON for now to enforce some
> rules for drivers.

Dropped all addresses, because gmail refused to send this email
otherwise.

> ---
>  drivers/gpu/drm/drm_atomic.c | 27 +++++++++++++++++++++++++++
>  1 file changed, 27 insertions(+)
> 
> diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c
> index 58527f151984..ef106e7153a6 100644
> --- a/drivers/gpu/drm/drm_atomic.c
> +++ b/drivers/gpu/drm/drm_atomic.c
> @@ -281,6 +281,10 @@ EXPORT_SYMBOL(__drm_atomic_state_free);
>   * needed. It will also grab the relevant CRTC lock to make sure that the state
>   * is consistent.
>   *
> + * WARNING: Drivers may only add new CRTC states to a @state if
> + * drm_atomic_state.allow_modeset is set, or if it's a driver-internal commit
> + * not created by userspace through an IOCTL call.
> + *
>   * Returns:
>   *
>   * Either the allocated state or the error code encoded into the pointer. When
> @@ -1262,10 +1266,15 @@ int drm_atomic_check_only(struct drm_atomic_state *state)
>  	struct drm_crtc_state *new_crtc_state;
>  	struct drm_connector *conn;
>  	struct drm_connector_state *conn_state;
> +	unsigned requested_crtc = 0;
> +	unsigned affected_crtc = 0;
>  	int i, ret = 0;
>  
>  	DRM_DEBUG_ATOMIC("checking %p\n", state);
>  
> +	for_each_new_crtc_in_state(state, crtc, old_crtc_state, i)
> +		requested_crtc |= drm_crtc_mask(crtc);
> +
>  	for_each_oldnew_plane_in_state(state, plane, old_plane_state, new_plane_state, i) {
>  		ret = drm_atomic_plane_check(old_plane_state, new_plane_state);
>  		if (ret) {
> @@ -1313,6 +1322,24 @@ int drm_atomic_check_only(struct drm_atomic_state *state)
>  		}
>  	}
>  
> +	for_each_new_crtc_in_state(state, crtc, old_crtc_state, i)
> +		affected_crtc |= drm_crtc_mask(crtc);
> +
> +	/*
> +	 * For commits that allow modesets drivers can add other CRTCs to the
> +	 * atomic commit, e.g. when they need to reallocate global resources.
> +	 * This can cause spurious EBUSY, which robs compositors of a very
> +	 * effective sanity check for their drawing loop. Therefor only allow
> +	 * this for modeset commits.
> +	 *
> +	 * FIXME: Should add affected_crtc mask to the ATOMIC IOCTL as an output
> +	 * so compositors know what's going on.

Hi,

I think telling userspace the affected_crtc mask would only solve half
of the problem: it would allow userspace to avoid attempting flips on
the other affected CRTCs until this modeset is done, but it doesn't
stop this non-blocking modeset from EBUSY'ing because other affected
CRTCs are busy flipping.

If the aim is to indicate userspace bugs with EBUSY, then EBUSY because
of other CRTCs needs to be differentiable from EBUSY due to a mistake
on this CRTC. Maybe the CRTC mask should instead be "conflicting/busy
CRTCs", not simply "affected CRTCS"?

Userspace might also be designed to always avoid modesets while any
CRTC is busy flipping. In that case any EBUSY would be an indication of
a (userspace) bug and a "busy CRTCs" mask could help pinpoint the issue.

If userspace does a TEST_ONLY commit with a modeset on one CRTC and the
driver pulls in another CRTC that is currently busy, will the test
commit return with EBUSY?

If yes, and *if* userspace is single-threaded wrt. to KMS updates,
that might offer a way to work around it in userspace. But if userspace
is flipping other CRTCs from other threads, TEST_ONLY commit does not
help because another thread may cut in and make a CRTC busy.


Thanks,
pq

> +	 */
> +	if (affected_crtc != requested_crtc) {
> +		/* adding other CRTC is only allowed for modeset commits */
> +		WARN_ON(!state->allow_modeset);
> +	}
> +
>  	return 0;
>  }
>  EXPORT_SYMBOL(drm_atomic_check_only);
Daniel Vetter Sept. 23, 2020, 9:16 a.m. UTC | #3
On Wed, Sep 23, 2020 at 10:17 AM Pekka Paalanen <ppaalanen@gmail.com> wrote:
>
> On Tue, 22 Sep 2020 20:18:34 +0200
> Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
> > When doing an atomic modeset with ALLOW_MODESET drivers are allowed to
> > pull in arbitrary other resources, including CRTCs (e.g. when
> > reconfiguring global resources).
> >
> > But in nonblocking mode userspace has then no idea this happened,
> > which can lead to spurious EBUSY calls, both:
> > - when that other CRTC is currently busy doing a page_flip the
> >   ALLOW_MODESET commit can fail with an EBUSY
> > - on the other CRTC a normal atomic flip can fail with EBUSY because
> >   of the additional commit inserted by the kernel without userspace's
> >   knowledge
> >
> > For blocking commits this isn't a problem, because everyone else will
> > just block until all the CRTC are reconfigured. Only thing userspace
> > can notice is the dropped frames without any reason for why frames got
> > dropped.
> >
> > Consensus is that we need new uapi to handle this properly, but no one
> > has any idea what exactly the new uapi should look like. Since this
> > has been shipping for years already compositors need to deal no matter
> > what, so as a first step just try to enforce this across drivers
> > better with some checks.
> >
> > v2: Add comments and a WARN_ON to enforce this only when allowed - we
> > don't want to silently convert page flips into blocking plane updates
> > just because the driver is buggy.
> >
> > v3: Fix inverted WARN_ON (Pekka).
> >
> > v4: Drop the uapi changes, only add a WARN_ON for now to enforce some
> > rules for drivers.
>
> Dropped all addresses, because gmail refused to send this email
> otherwise.
>
> > ---
> >  drivers/gpu/drm/drm_atomic.c | 27 +++++++++++++++++++++++++++
> >  1 file changed, 27 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c
> > index 58527f151984..ef106e7153a6 100644
> > --- a/drivers/gpu/drm/drm_atomic.c
> > +++ b/drivers/gpu/drm/drm_atomic.c
> > @@ -281,6 +281,10 @@ EXPORT_SYMBOL(__drm_atomic_state_free);
> >   * needed. It will also grab the relevant CRTC lock to make sure that the state
> >   * is consistent.
> >   *
> > + * WARNING: Drivers may only add new CRTC states to a @state if
> > + * drm_atomic_state.allow_modeset is set, or if it's a driver-internal commit
> > + * not created by userspace through an IOCTL call.
> > + *
> >   * Returns:
> >   *
> >   * Either the allocated state or the error code encoded into the pointer. When
> > @@ -1262,10 +1266,15 @@ int drm_atomic_check_only(struct drm_atomic_state *state)
> >       struct drm_crtc_state *new_crtc_state;
> >       struct drm_connector *conn;
> >       struct drm_connector_state *conn_state;
> > +     unsigned requested_crtc = 0;
> > +     unsigned affected_crtc = 0;
> >       int i, ret = 0;
> >
> >       DRM_DEBUG_ATOMIC("checking %p\n", state);
> >
> > +     for_each_new_crtc_in_state(state, crtc, old_crtc_state, i)
> > +             requested_crtc |= drm_crtc_mask(crtc);
> > +
> >       for_each_oldnew_plane_in_state(state, plane, old_plane_state, new_plane_state, i) {
> >               ret = drm_atomic_plane_check(old_plane_state, new_plane_state);
> >               if (ret) {
> > @@ -1313,6 +1322,24 @@ int drm_atomic_check_only(struct drm_atomic_state *state)
> >               }
> >       }
> >
> > +     for_each_new_crtc_in_state(state, crtc, old_crtc_state, i)
> > +             affected_crtc |= drm_crtc_mask(crtc);
> > +
> > +     /*
> > +      * For commits that allow modesets drivers can add other CRTCs to the
> > +      * atomic commit, e.g. when they need to reallocate global resources.
> > +      * This can cause spurious EBUSY, which robs compositors of a very
> > +      * effective sanity check for their drawing loop. Therefor only allow
> > +      * this for modeset commits.
> > +      *
> > +      * FIXME: Should add affected_crtc mask to the ATOMIC IOCTL as an output
> > +      * so compositors know what's going on.
>
> Hi,
>
> I think telling userspace the affected_crtc mask would only solve half
> of the problem: it would allow userspace to avoid attempting flips on
> the other affected CRTCs until this modeset is done, but it doesn't
> stop this non-blocking modeset from EBUSY'ing because other affected
> CRTCs are busy flipping.
>
> If the aim is to indicate userspace bugs with EBUSY, then EBUSY because
> of other CRTCs needs to be differentiable from EBUSY due to a mistake
> on this CRTC. Maybe the CRTC mask should instead be "conflicting/busy
> CRTCs", not simply "affected CRTCS"?
>
> Userspace might also be designed to always avoid modesets while any
> CRTC is busy flipping. In that case any EBUSY would be an indication of
> a (userspace) bug and a "busy CRTCs" mask could help pinpoint the issue.
>
> If userspace does a TEST_ONLY commit with a modeset on one CRTC and the
> driver pulls in another CRTC that is currently busy, will the test
> commit return with EBUSY?
>
> If yes, and *if* userspace is single-threaded wrt. to KMS updates,
> that might offer a way to work around it in userspace. But if userspace
> is flipping other CRTCs from other threads, TEST_ONLY commit does not
> help because another thread may cut in and make a CRTC busy.

You'd also get this from TEST_ONLY commit, so userspace can:
- wait for any affected crtc to complete out their pending updates,
which avoids the EBUSY
- do a blocking modeset (which avoids the EBUSY since it stalls either way)
- insert a note into it's per-crtc draw loop/tracking that there will
be another event inserted by the kernel that needs to be waited on

As long as this doesn't race with any other atomic commit or
processing the event queue, this should be race free. And allow you to
handle spurious EBUSY both here because of other pending flips, and in
the next atomic flips because of this commit here.

Also I don't think the kernel needs to tell you which of these crtc
are busy, since userspace knows that (by tracking outstanding events).
Plus that information is racy anyway, you'll just have to synchronize
with your drm event processing.
-Daniel

>
>
> Thanks,
> pq
>
> > +      */
> > +     if (affected_crtc != requested_crtc) {
> > +             /* adding other CRTC is only allowed for modeset commits */
> > +             WARN_ON(!state->allow_modeset);
> > +     }
> > +
> >       return 0;
> >  }
> >  EXPORT_SYMBOL(drm_atomic_check_only);
>
Daniel Vetter Sept. 23, 2020, 9:21 a.m. UTC | #4
On Wed, Sep 23, 2020 at 11:16 AM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
> On Wed, Sep 23, 2020 at 10:17 AM Pekka Paalanen <ppaalanen@gmail.com> wrote:
> >
> > On Tue, 22 Sep 2020 20:18:34 +0200
> > Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> >
> > > When doing an atomic modeset with ALLOW_MODESET drivers are allowed to
> > > pull in arbitrary other resources, including CRTCs (e.g. when
> > > reconfiguring global resources).
> > >
> > > But in nonblocking mode userspace has then no idea this happened,
> > > which can lead to spurious EBUSY calls, both:
> > > - when that other CRTC is currently busy doing a page_flip the
> > >   ALLOW_MODESET commit can fail with an EBUSY
> > > - on the other CRTC a normal atomic flip can fail with EBUSY because
> > >   of the additional commit inserted by the kernel without userspace's
> > >   knowledge
> > >
> > > For blocking commits this isn't a problem, because everyone else will
> > > just block until all the CRTC are reconfigured. Only thing userspace
> > > can notice is the dropped frames without any reason for why frames got
> > > dropped.
> > >
> > > Consensus is that we need new uapi to handle this properly, but no one
> > > has any idea what exactly the new uapi should look like. Since this
> > > has been shipping for years already compositors need to deal no matter
> > > what, so as a first step just try to enforce this across drivers
> > > better with some checks.
> > >
> > > v2: Add comments and a WARN_ON to enforce this only when allowed - we
> > > don't want to silently convert page flips into blocking plane updates
> > > just because the driver is buggy.
> > >
> > > v3: Fix inverted WARN_ON (Pekka).
> > >
> > > v4: Drop the uapi changes, only add a WARN_ON for now to enforce some
> > > rules for drivers.
> >
> > Dropped all addresses, because gmail refused to send this email
> > otherwise.
> >
> > > ---
> > >  drivers/gpu/drm/drm_atomic.c | 27 +++++++++++++++++++++++++++
> > >  1 file changed, 27 insertions(+)
> > >
> > > diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c
> > > index 58527f151984..ef106e7153a6 100644
> > > --- a/drivers/gpu/drm/drm_atomic.c
> > > +++ b/drivers/gpu/drm/drm_atomic.c
> > > @@ -281,6 +281,10 @@ EXPORT_SYMBOL(__drm_atomic_state_free);
> > >   * needed. It will also grab the relevant CRTC lock to make sure that the state
> > >   * is consistent.
> > >   *
> > > + * WARNING: Drivers may only add new CRTC states to a @state if
> > > + * drm_atomic_state.allow_modeset is set, or if it's a driver-internal commit
> > > + * not created by userspace through an IOCTL call.
> > > + *
> > >   * Returns:
> > >   *
> > >   * Either the allocated state or the error code encoded into the pointer. When
> > > @@ -1262,10 +1266,15 @@ int drm_atomic_check_only(struct drm_atomic_state *state)
> > >       struct drm_crtc_state *new_crtc_state;
> > >       struct drm_connector *conn;
> > >       struct drm_connector_state *conn_state;
> > > +     unsigned requested_crtc = 0;
> > > +     unsigned affected_crtc = 0;
> > >       int i, ret = 0;
> > >
> > >       DRM_DEBUG_ATOMIC("checking %p\n", state);
> > >
> > > +     for_each_new_crtc_in_state(state, crtc, old_crtc_state, i)
> > > +             requested_crtc |= drm_crtc_mask(crtc);
> > > +
> > >       for_each_oldnew_plane_in_state(state, plane, old_plane_state, new_plane_state, i) {
> > >               ret = drm_atomic_plane_check(old_plane_state, new_plane_state);
> > >               if (ret) {
> > > @@ -1313,6 +1322,24 @@ int drm_atomic_check_only(struct drm_atomic_state *state)
> > >               }
> > >       }
> > >
> > > +     for_each_new_crtc_in_state(state, crtc, old_crtc_state, i)
> > > +             affected_crtc |= drm_crtc_mask(crtc);
> > > +
> > > +     /*
> > > +      * For commits that allow modesets drivers can add other CRTCs to the
> > > +      * atomic commit, e.g. when they need to reallocate global resources.
> > > +      * This can cause spurious EBUSY, which robs compositors of a very
> > > +      * effective sanity check for their drawing loop. Therefor only allow
> > > +      * this for modeset commits.
> > > +      *
> > > +      * FIXME: Should add affected_crtc mask to the ATOMIC IOCTL as an output
> > > +      * so compositors know what's going on.
> >
> > Hi,
> >
> > I think telling userspace the affected_crtc mask would only solve half
> > of the problem: it would allow userspace to avoid attempting flips on
> > the other affected CRTCs until this modeset is done, but it doesn't
> > stop this non-blocking modeset from EBUSY'ing because other affected
> > CRTCs are busy flipping.
> >
> > If the aim is to indicate userspace bugs with EBUSY, then EBUSY because
> > of other CRTCs needs to be differentiable from EBUSY due to a mistake
> > on this CRTC. Maybe the CRTC mask should instead be "conflicting/busy
> > CRTCs", not simply "affected CRTCS"?
> >
> > Userspace might also be designed to always avoid modesets while any
> > CRTC is busy flipping. In that case any EBUSY would be an indication of
> > a (userspace) bug and a "busy CRTCs" mask could help pinpoint the issue.
> >
> > If userspace does a TEST_ONLY commit with a modeset on one CRTC and the
> > driver pulls in another CRTC that is currently busy, will the test
> > commit return with EBUSY?
> >
> > If yes, and *if* userspace is single-threaded wrt. to KMS updates,
> > that might offer a way to work around it in userspace. But if userspace
> > is flipping other CRTCs from other threads, TEST_ONLY commit does not
> > help because another thread may cut in and make a CRTC busy.

Forgot to explain this TEST_ONLY never returns EBUSY. That only
happens when you commit and failed to wait for the previous frame to
complete. Otherwise compositors would be somewhat screwed with being
blocked instead being able to prep the next frame.

Also note that you'd get this information for any atomic ioctl, so
also for direct commits. The _test_only suffix of this function here
is just internals, the code flow in the ioctl is to always call this
function first, and then optionally call the right commit function.
-Daniel

> You'd also get this from TEST_ONLY commit, so userspace can:
> - wait for any affected crtc to complete out their pending updates,
> which avoids the EBUSY
> - do a blocking modeset (which avoids the EBUSY since it stalls either way)
> - insert a note into it's per-crtc draw loop/tracking that there will
> be another event inserted by the kernel that needs to be waited on
>
> As long as this doesn't race with any other atomic commit or
> processing the event queue, this should be race free. And allow you to
> handle spurious EBUSY both here because of other pending flips, and in
> the next atomic flips because of this commit here.
>
> Also I don't think the kernel needs to tell you which of these crtc
> are busy, since userspace knows that (by tracking outstanding events).
> Plus that information is racy anyway, you'll just have to synchronize
> with your drm event processing.
> -Daniel
>
> >
> >
> > Thanks,
> > pq
> >
> > > +      */
> > > +     if (affected_crtc != requested_crtc) {
> > > +             /* adding other CRTC is only allowed for modeset commits */
> > > +             WARN_ON(!state->allow_modeset);
> > > +     }
> > > +
> > >       return 0;
> > >  }
> > >  EXPORT_SYMBOL(drm_atomic_check_only);
> >
>
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
Daniel Vetter Sept. 23, 2020, 9:26 a.m. UTC | #5
I'm really not awake yet ...

On Wed, Sep 23, 2020 at 10:17 AM Pekka Paalanen <ppaalanen@gmail.com> wrote:
>
> On Tue, 22 Sep 2020 20:18:34 +0200
> Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
> > When doing an atomic modeset with ALLOW_MODESET drivers are allowed to
> > pull in arbitrary other resources, including CRTCs (e.g. when
> > reconfiguring global resources).
> >
> > But in nonblocking mode userspace has then no idea this happened,
> > which can lead to spurious EBUSY calls, both:
> > - when that other CRTC is currently busy doing a page_flip the
> >   ALLOW_MODESET commit can fail with an EBUSY
> > - on the other CRTC a normal atomic flip can fail with EBUSY because
> >   of the additional commit inserted by the kernel without userspace's
> >   knowledge
> >
> > For blocking commits this isn't a problem, because everyone else will
> > just block until all the CRTC are reconfigured. Only thing userspace
> > can notice is the dropped frames without any reason for why frames got
> > dropped.
> >
> > Consensus is that we need new uapi to handle this properly, but no one
> > has any idea what exactly the new uapi should look like. Since this
> > has been shipping for years already compositors need to deal no matter
> > what, so as a first step just try to enforce this across drivers
> > better with some checks.
> >
> > v2: Add comments and a WARN_ON to enforce this only when allowed - we
> > don't want to silently convert page flips into blocking plane updates
> > just because the driver is buggy.
> >
> > v3: Fix inverted WARN_ON (Pekka).
> >
> > v4: Drop the uapi changes, only add a WARN_ON for now to enforce some
> > rules for drivers.
>
> Dropped all addresses, because gmail refused to send this email
> otherwise.
>
> > ---
> >  drivers/gpu/drm/drm_atomic.c | 27 +++++++++++++++++++++++++++
> >  1 file changed, 27 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c
> > index 58527f151984..ef106e7153a6 100644
> > --- a/drivers/gpu/drm/drm_atomic.c
> > +++ b/drivers/gpu/drm/drm_atomic.c
> > @@ -281,6 +281,10 @@ EXPORT_SYMBOL(__drm_atomic_state_free);
> >   * needed. It will also grab the relevant CRTC lock to make sure that the state
> >   * is consistent.
> >   *
> > + * WARNING: Drivers may only add new CRTC states to a @state if
> > + * drm_atomic_state.allow_modeset is set, or if it's a driver-internal commit
> > + * not created by userspace through an IOCTL call.
> > + *
> >   * Returns:
> >   *
> >   * Either the allocated state or the error code encoded into the pointer. When
> > @@ -1262,10 +1266,15 @@ int drm_atomic_check_only(struct drm_atomic_state *state)
> >       struct drm_crtc_state *new_crtc_state;
> >       struct drm_connector *conn;
> >       struct drm_connector_state *conn_state;
> > +     unsigned requested_crtc = 0;
> > +     unsigned affected_crtc = 0;
> >       int i, ret = 0;
> >
> >       DRM_DEBUG_ATOMIC("checking %p\n", state);
> >
> > +     for_each_new_crtc_in_state(state, crtc, old_crtc_state, i)
> > +             requested_crtc |= drm_crtc_mask(crtc);
> > +
> >       for_each_oldnew_plane_in_state(state, plane, old_plane_state, new_plane_state, i) {
> >               ret = drm_atomic_plane_check(old_plane_state, new_plane_state);
> >               if (ret) {
> > @@ -1313,6 +1322,24 @@ int drm_atomic_check_only(struct drm_atomic_state *state)
> >               }
> >       }
> >
> > +     for_each_new_crtc_in_state(state, crtc, old_crtc_state, i)
> > +             affected_crtc |= drm_crtc_mask(crtc);
> > +
> > +     /*
> > +      * For commits that allow modesets drivers can add other CRTCs to the
> > +      * atomic commit, e.g. when they need to reallocate global resources.
> > +      * This can cause spurious EBUSY, which robs compositors of a very
> > +      * effective sanity check for their drawing loop. Therefor only allow
> > +      * this for modeset commits.
> > +      *
> > +      * FIXME: Should add affected_crtc mask to the ATOMIC IOCTL as an output
> > +      * so compositors know what's going on.
>
> Hi,
>
> I think telling userspace the affected_crtc mask would only solve half
> of the problem: it would allow userspace to avoid attempting flips on
> the other affected CRTCs until this modeset is done, but it doesn't
> stop this non-blocking modeset from EBUSY'ing because other affected
> CRTCs are busy flipping.
>
> If the aim is to indicate userspace bugs with EBUSY, then EBUSY because
> of other CRTCs needs to be differentiable from EBUSY due to a mistake
> on this CRTC. Maybe the CRTC mask should instead be "conflicting/busy
> CRTCs", not simply "affected CRTCS"?
>
> Userspace might also be designed to always avoid modesets while any
> CRTC is busy flipping. In that case any EBUSY would be an indication of
> a (userspace) bug and a "busy CRTCs" mask could help pinpoint the issue.
>
> If userspace does a TEST_ONLY commit with a modeset on one CRTC and the
> driver pulls in another CRTC that is currently busy, will the test
> commit return with EBUSY?
>
> If yes, and *if* userspace is single-threaded wrt. to KMS updates,
> that might offer a way to work around it in userspace. But if userspace
> is flipping other CRTCs from other threads, TEST_ONLY commit does not
> help because another thread may cut in and make a CRTC busy.

This is not a legit programming model for atomic. An atomic commit is
always relative to the current state. If that state changes, then you
need to re-run your TEST_ONLY commit. So multiple threads changing
state in parallel isn't really a good idea anyway. Minimally we'd need
some kind of TEST_ONLY pile-up, so you can validate a change assuming
another commit has already happened. That's even harder than deep
queues on the commit side, we'd probably need full rollback of
commits.
-Daniel

>
>
> Thanks,
> pq
>
> > +      */
> > +     if (affected_crtc != requested_crtc) {
> > +             /* adding other CRTC is only allowed for modeset commits */
> > +             WARN_ON(!state->allow_modeset);
> > +     }
> > +
> >       return 0;
> >  }
> >  EXPORT_SYMBOL(drm_atomic_check_only);
>
Pekka Paalanen Sept. 23, 2020, 9:55 a.m. UTC | #6
On Wed, 23 Sep 2020 11:26:39 +0200
Daniel Vetter <daniel.vetter@ffwll.ch> wrote:

> I'm really not awake yet ...
> 
> On Wed, Sep 23, 2020 at 10:17 AM Pekka Paalanen <ppaalanen@gmail.com> wrote:
> >
> > On Tue, 22 Sep 2020 20:18:34 +0200
> > Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> >  
> > > When doing an atomic modeset with ALLOW_MODESET drivers are allowed to
> > > pull in arbitrary other resources, including CRTCs (e.g. when
> > > reconfiguring global resources).

> > If yes, and *if* userspace is single-threaded wrt. to KMS updates,
> > that might offer a way to work around it in userspace. But if userspace
> > is flipping other CRTCs from other threads, TEST_ONLY commit does not
> > help because another thread may cut in and make a CRTC busy.  
> 
> This is not a legit programming model for atomic. An atomic commit is
> always relative to the current state. If that state changes, then you
> need to re-run your TEST_ONLY commit. So multiple threads changing
> state in parallel isn't really a good idea anyway. Minimally we'd need
> some kind of TEST_ONLY pile-up, so you can validate a change assuming
> another commit has already happened. That's even harder than deep
> queues on the commit side, we'd probably need full rollback of
> commits.

Yes, very good point. I forgot that for a moment.


Thanks,
pq
Ville Syrjälä Sept. 23, 2020, 10:31 a.m. UTC | #7
On Tue, Sep 22, 2020 at 08:18:34PM +0200, Daniel Vetter wrote:
> When doing an atomic modeset with ALLOW_MODESET drivers are allowed to
> pull in arbitrary other resources, including CRTCs (e.g. when
> reconfiguring global resources).
> 
> But in nonblocking mode userspace has then no idea this happened,
> which can lead to spurious EBUSY calls, both:
> - when that other CRTC is currently busy doing a page_flip the
>   ALLOW_MODESET commit can fail with an EBUSY
> - on the other CRTC a normal atomic flip can fail with EBUSY because
>   of the additional commit inserted by the kernel without userspace's
>   knowledge
> 
> For blocking commits this isn't a problem, because everyone else will
> just block until all the CRTC are reconfigured. Only thing userspace
> can notice is the dropped frames without any reason for why frames got
> dropped.
> 
> Consensus is that we need new uapi to handle this properly, but no one
> has any idea what exactly the new uapi should look like. Since this
> has been shipping for years already compositors need to deal no matter
> what, so as a first step just try to enforce this across drivers
> better with some checks.
> 
> v2: Add comments and a WARN_ON to enforce this only when allowed - we
> don't want to silently convert page flips into blocking plane updates
> just because the driver is buggy.
> 
> v3: Fix inverted WARN_ON (Pekka).
> 
> v4: Drop the uapi changes, only add a WARN_ON for now to enforce some
> rules for drivers.
> 
> References: https://lists.freedesktop.org/archives/dri-devel/2018-July/182281.html
> Bugzilla: https://gitlab.freedesktop.org/wayland/weston/issues/24#note_9568
> Cc: Daniel Stone <daniel@fooishbar.org>
> Cc: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
> Cc: Simon Ser <contact@emersion.fr>
> Cc: stable@vger.kernel.org
> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> ---
>  drivers/gpu/drm/drm_atomic.c | 27 +++++++++++++++++++++++++++
>  1 file changed, 27 insertions(+)
> 
> diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c
> index 58527f151984..ef106e7153a6 100644
> --- a/drivers/gpu/drm/drm_atomic.c
> +++ b/drivers/gpu/drm/drm_atomic.c
> @@ -281,6 +281,10 @@ EXPORT_SYMBOL(__drm_atomic_state_free);
>   * needed. It will also grab the relevant CRTC lock to make sure that the state
>   * is consistent.
>   *
> + * WARNING: Drivers may only add new CRTC states to a @state if
> + * drm_atomic_state.allow_modeset is set, or if it's a driver-internal commit
> + * not created by userspace through an IOCTL call.
> + *
>   * Returns:
>   *
>   * Either the allocated state or the error code encoded into the pointer. When
> @@ -1262,10 +1266,15 @@ int drm_atomic_check_only(struct drm_atomic_state *state)
>  	struct drm_crtc_state *new_crtc_state;
>  	struct drm_connector *conn;
>  	struct drm_connector_state *conn_state;
> +	unsigned requested_crtc = 0;
> +	unsigned affected_crtc = 0;
>  	int i, ret = 0;
>  
>  	DRM_DEBUG_ATOMIC("checking %p\n", state);
>  
> +	for_each_new_crtc_in_state(state, crtc, old_crtc_state, i)
> +		requested_crtc |= drm_crtc_mask(crtc);
> +
>  	for_each_oldnew_plane_in_state(state, plane, old_plane_state, new_plane_state, i) {
>  		ret = drm_atomic_plane_check(old_plane_state, new_plane_state);
>  		if (ret) {
> @@ -1313,6 +1322,24 @@ int drm_atomic_check_only(struct drm_atomic_state *state)
>  		}
>  	}
>  
> +	for_each_new_crtc_in_state(state, crtc, old_crtc_state, i)

Inconsistent old vs. new.

> +		affected_crtc |= drm_crtc_mask(crtc);
> +
> +	/*
> +	 * For commits that allow modesets drivers can add other CRTCs to the
> +	 * atomic commit, e.g. when they need to reallocate global resources.
> +	 * This can cause spurious EBUSY, which robs compositors of a very
> +	 * effective sanity check for their drawing loop. Therefor only allow
> +	 * this for modeset commits.
> +	 *
> +	 * FIXME: Should add affected_crtc mask to the ATOMIC IOCTL as an output
> +	 * so compositors know what's going on.
> +	 */
> +	if (affected_crtc != requested_crtc) {
> +		/* adding other CRTC is only allowed for modeset commits */
> +		WARN_ON(!state->allow_modeset);
> +	}

I think this means pretty much all non-pageflip commits will
have to have allow_modeset==true on i915 or else we just can't
guarantee that we can anything (due to sagv and/or cdclk mainly).

Also a bit baffled that CI didn't hit this. I think it should be
totally possible to hit this now. To avoid that I guess we'd just
need to make intel_atomic_serialize_global_state() fail if it
has to add any new crtcs when allow_modeset==false. Hopefully
there aren't many other places that add crtcs to the state
without forcing a modeset on them.

> +
>  	return 0;
>  }
>  EXPORT_SYMBOL(drm_atomic_check_only);
> -- 
> 2.28.0
Daniel Vetter Sept. 23, 2020, 10:37 a.m. UTC | #8
On Wed, Sep 23, 2020 at 12:31 PM Ville Syrjälä
<ville.syrjala@linux.intel.com> wrote:
>
> On Tue, Sep 22, 2020 at 08:18:34PM +0200, Daniel Vetter wrote:
> > When doing an atomic modeset with ALLOW_MODESET drivers are allowed to
> > pull in arbitrary other resources, including CRTCs (e.g. when
> > reconfiguring global resources).
> >
> > But in nonblocking mode userspace has then no idea this happened,
> > which can lead to spurious EBUSY calls, both:
> > - when that other CRTC is currently busy doing a page_flip the
> >   ALLOW_MODESET commit can fail with an EBUSY
> > - on the other CRTC a normal atomic flip can fail with EBUSY because
> >   of the additional commit inserted by the kernel without userspace's
> >   knowledge
> >
> > For blocking commits this isn't a problem, because everyone else will
> > just block until all the CRTC are reconfigured. Only thing userspace
> > can notice is the dropped frames without any reason for why frames got
> > dropped.
> >
> > Consensus is that we need new uapi to handle this properly, but no one
> > has any idea what exactly the new uapi should look like. Since this
> > has been shipping for years already compositors need to deal no matter
> > what, so as a first step just try to enforce this across drivers
> > better with some checks.
> >
> > v2: Add comments and a WARN_ON to enforce this only when allowed - we
> > don't want to silently convert page flips into blocking plane updates
> > just because the driver is buggy.
> >
> > v3: Fix inverted WARN_ON (Pekka).
> >
> > v4: Drop the uapi changes, only add a WARN_ON for now to enforce some
> > rules for drivers.
> >
> > References: https://lists.freedesktop.org/archives/dri-devel/2018-July/182281.html
> > Bugzilla: https://gitlab.freedesktop.org/wayland/weston/issues/24#note_9568
> > Cc: Daniel Stone <daniel@fooishbar.org>
> > Cc: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
> > Cc: Simon Ser <contact@emersion.fr>
> > Cc: stable@vger.kernel.org
> > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > ---
> >  drivers/gpu/drm/drm_atomic.c | 27 +++++++++++++++++++++++++++
> >  1 file changed, 27 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c
> > index 58527f151984..ef106e7153a6 100644
> > --- a/drivers/gpu/drm/drm_atomic.c
> > +++ b/drivers/gpu/drm/drm_atomic.c
> > @@ -281,6 +281,10 @@ EXPORT_SYMBOL(__drm_atomic_state_free);
> >   * needed. It will also grab the relevant CRTC lock to make sure that the state
> >   * is consistent.
> >   *
> > + * WARNING: Drivers may only add new CRTC states to a @state if
> > + * drm_atomic_state.allow_modeset is set, or if it's a driver-internal commit
> > + * not created by userspace through an IOCTL call.
> > + *
> >   * Returns:
> >   *
> >   * Either the allocated state or the error code encoded into the pointer. When
> > @@ -1262,10 +1266,15 @@ int drm_atomic_check_only(struct drm_atomic_state *state)
> >       struct drm_crtc_state *new_crtc_state;
> >       struct drm_connector *conn;
> >       struct drm_connector_state *conn_state;
> > +     unsigned requested_crtc = 0;
> > +     unsigned affected_crtc = 0;
> >       int i, ret = 0;
> >
> >       DRM_DEBUG_ATOMIC("checking %p\n", state);
> >
> > +     for_each_new_crtc_in_state(state, crtc, old_crtc_state, i)
> > +             requested_crtc |= drm_crtc_mask(crtc);
> > +
> >       for_each_oldnew_plane_in_state(state, plane, old_plane_state, new_plane_state, i) {
> >               ret = drm_atomic_plane_check(old_plane_state, new_plane_state);
> >               if (ret) {
> > @@ -1313,6 +1322,24 @@ int drm_atomic_check_only(struct drm_atomic_state *state)
> >               }
> >       }
> >
> > +     for_each_new_crtc_in_state(state, crtc, old_crtc_state, i)
>
> Inconsistent old vs. new.

Will fix, but also doesn't matter since I don't care about the state,
just that it's in there.

> > +             affected_crtc |= drm_crtc_mask(crtc);
> > +
> > +     /*
> > +      * For commits that allow modesets drivers can add other CRTCs to the
> > +      * atomic commit, e.g. when they need to reallocate global resources.
> > +      * This can cause spurious EBUSY, which robs compositors of a very
> > +      * effective sanity check for their drawing loop. Therefor only allow
> > +      * this for modeset commits.
> > +      *
> > +      * FIXME: Should add affected_crtc mask to the ATOMIC IOCTL as an output
> > +      * so compositors know what's going on.
> > +      */
> > +     if (affected_crtc != requested_crtc) {
> > +             /* adding other CRTC is only allowed for modeset commits */
> > +             WARN_ON(!state->allow_modeset);
> > +     }
>
> I think this means pretty much all non-pageflip commits will
> have to have allow_modeset==true on i915 or else we just can't
> guarantee that we can anything (due to sagv and/or cdclk mainly).

I guess not enough machines with multiple outputs in the shards.

> Also a bit baffled that CI didn't hit this. I think it should be
> totally possible to hit this now. To avoid that I guess we'd just
> need to make intel_atomic_serialize_global_state() fail if it
> has to add any new crtcs when allow_modeset==false. Hopefully
> there aren't many other places that add crtcs to the state
> without forcing a modeset on them.

Oh we don't do that? That feels like a pretty bad bug ... Wacking
random other crtc without allow_modeset is pretty nasty.
-Daniel

>
> > +
> >       return 0;
> >  }
> >  EXPORT_SYMBOL(drm_atomic_check_only);
> > --
> > 2.28.0
>
> --
> Ville Syrjälä
> Intel
diff mbox series

Patch

diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c
index 58527f151984..ef106e7153a6 100644
--- a/drivers/gpu/drm/drm_atomic.c
+++ b/drivers/gpu/drm/drm_atomic.c
@@ -281,6 +281,10 @@  EXPORT_SYMBOL(__drm_atomic_state_free);
  * needed. It will also grab the relevant CRTC lock to make sure that the state
  * is consistent.
  *
+ * WARNING: Drivers may only add new CRTC states to a @state if
+ * drm_atomic_state.allow_modeset is set, or if it's a driver-internal commit
+ * not created by userspace through an IOCTL call.
+ *
  * Returns:
  *
  * Either the allocated state or the error code encoded into the pointer. When
@@ -1262,10 +1266,15 @@  int drm_atomic_check_only(struct drm_atomic_state *state)
 	struct drm_crtc_state *new_crtc_state;
 	struct drm_connector *conn;
 	struct drm_connector_state *conn_state;
+	unsigned requested_crtc = 0;
+	unsigned affected_crtc = 0;
 	int i, ret = 0;
 
 	DRM_DEBUG_ATOMIC("checking %p\n", state);
 
+	for_each_new_crtc_in_state(state, crtc, old_crtc_state, i)
+		requested_crtc |= drm_crtc_mask(crtc);
+
 	for_each_oldnew_plane_in_state(state, plane, old_plane_state, new_plane_state, i) {
 		ret = drm_atomic_plane_check(old_plane_state, new_plane_state);
 		if (ret) {
@@ -1313,6 +1322,24 @@  int drm_atomic_check_only(struct drm_atomic_state *state)
 		}
 	}
 
+	for_each_new_crtc_in_state(state, crtc, old_crtc_state, i)
+		affected_crtc |= drm_crtc_mask(crtc);
+
+	/*
+	 * For commits that allow modesets drivers can add other CRTCs to the
+	 * atomic commit, e.g. when they need to reallocate global resources.
+	 * This can cause spurious EBUSY, which robs compositors of a very
+	 * effective sanity check for their drawing loop. Therefor only allow
+	 * this for modeset commits.
+	 *
+	 * FIXME: Should add affected_crtc mask to the ATOMIC IOCTL as an output
+	 * so compositors know what's going on.
+	 */
+	if (affected_crtc != requested_crtc) {
+		/* adding other CRTC is only allowed for modeset commits */
+		WARN_ON(!state->allow_modeset);
+	}
+
 	return 0;
 }
 EXPORT_SYMBOL(drm_atomic_check_only);