drm/i915: Put all non-blocking modesets onto an ordered wq

Message ID	20171113133622.8593-1-ville.syrjala@linux.intel.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <intel-gfx-bounces@lists.freedesktop.org> From: Ville Syrjala <ville.syrjala@linux.intel.com> To: intel-gfx@lists.freedesktop.org Date: Mon, 13 Nov 2017 15:36:22 +0200 Message-Id: <20171113133622.8593-1-ville.syrjala@linux.intel.com> MIME-Version: 1.0 Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Subject: [Intel-gfx] [PATCH] drm/i915: Put all non-blocking modesets onto an ordered wq Precedence: list Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" <intel-gfx-bounces@lists.freedesktop.org>

Ville Syrjälä Nov. 13, 2017, 1:36 p.m. UTC

From: Ville Syrjälä <ville.syrjala@linux.intel.com>

We have plenty of global registers and whatnot programmed without
any further locking by the modeset code. Currently non-bocking
modesets are allowed to execute in parallel which could corrupt
said registers.

To avoid the problem let's run all non-blocking modesets on an
ordered workqueue. We still put page flips etc. to system_unbound_wq
allowing page flips on one pipe to execute in parallel with page flips
or a modeset on a another pipe (assuming no known state is shared
between them, at which point they would have been added to the same
atomic commit and serialized that way).

Blocking modesets are already serialized with each other by
connection_mutex, and thus are safe. To serialize them with
non-blocking modesets we just flush the workqueue before executing
blocking modesets.

Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Fixes: 94f050246b42 ("drm/i915: nonblocking commit")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h      |  3 +++
 drivers/gpu/drm/i915/intel_display.c | 14 +++++++++++---
 2 files changed, 14 insertions(+), 3 deletions(-)

Maarten Lankhorst Nov. 13, 2017, 2:36 p.m. UTC | #1

Op 13-11-17 om 14:36 schreef Ville Syrjala:
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
>
> We have plenty of global registers and whatnot programmed without
> any further locking by the modeset code. Currently non-bocking
> modesets are allowed to execute in parallel which could corrupt
> said registers.
>
> To avoid the problem let's run all non-blocking modesets on an
> ordered workqueue. We still put page flips etc. to system_unbound_wq
> allowing page flips on one pipe to execute in parallel with page flips
> or a modeset on a another pipe (assuming no known state is shared
> between them, at which point they would have been added to the same
> atomic commit and serialized that way).
>
> Blocking modesets are already serialized with each other by
> connection_mutex, and thus are safe. To serialize them with
> non-blocking modesets we just flush the workqueue before executing
> blocking modesets.
>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Fixes: 94f050246b42 ("drm/i915: nonblocking commit")
> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
This patch won't really fix it, you could still have a blocking modeset in parallel to a nonblocking one. What would really be needed to fix those instances here?

Ville Syrjälä Nov. 13, 2017, 2:44 p.m. UTC | #2

On Mon, Nov 13, 2017 at 03:36:58PM +0100, Maarten Lankhorst wrote:
> Op 13-11-17 om 14:36 schreef Ville Syrjala:
> > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> >
> > We have plenty of global registers and whatnot programmed without
> > any further locking by the modeset code. Currently non-bocking
> > modesets are allowed to execute in parallel which could corrupt
> > said registers.
> >
> > To avoid the problem let's run all non-blocking modesets on an
> > ordered workqueue. We still put page flips etc. to system_unbound_wq
> > allowing page flips on one pipe to execute in parallel with page flips
> > or a modeset on a another pipe (assuming no known state is shared
> > between them, at which point they would have been added to the same
> > atomic commit and serialized that way).
> >
> > Blocking modesets are already serialized with each other by
> > connection_mutex, and thus are safe. To serialize them with
> > non-blocking modesets we just flush the workqueue before executing
> > blocking modesets.
> >
> > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> > Fixes: 94f050246b42 ("drm/i915: nonblocking commit")
> > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> This patch won't really fix it, you could still have a blocking modeset in parallel to a nonblocking one.

Hence the flush_workqueue().

> What would really be needed to fix those instances here?

Someone needs to review the entire modeset code and come up with a fix
for all the cases where we are frobbing some global state.

Ville Syrjälä Nov. 21, 2017, 2:42 p.m. UTC | #3

On Mon, Nov 13, 2017 at 04:44:08PM +0200, Ville Syrjälä wrote:
> On Mon, Nov 13, 2017 at 03:36:58PM +0100, Maarten Lankhorst wrote:
> > Op 13-11-17 om 14:36 schreef Ville Syrjala:
> > > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > >
> > > We have plenty of global registers and whatnot programmed without
> > > any further locking by the modeset code. Currently non-bocking
> > > modesets are allowed to execute in parallel which could corrupt
> > > said registers.
> > >
> > > To avoid the problem let's run all non-blocking modesets on an
> > > ordered workqueue. We still put page flips etc. to system_unbound_wq
> > > allowing page flips on one pipe to execute in parallel with page flips
> > > or a modeset on a another pipe (assuming no known state is shared
> > > between them, at which point they would have been added to the same
> > > atomic commit and serialized that way).
> > >
> > > Blocking modesets are already serialized with each other by
> > > connection_mutex, and thus are safe. To serialize them with
> > > non-blocking modesets we just flush the workqueue before executing
> > > blocking modesets.
> > >
> > > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> > > Fixes: 94f050246b42 ("drm/i915: nonblocking commit")
> > > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > This patch won't really fix it, you could still have a blocking modeset in parallel to a nonblocking one.
> 
> Hence the flush_workqueue().
> 
> > What would really be needed to fix those instances here?
> 
> Someone needs to review the entire modeset code and come up with a fix
> for all the cases where we are frobbing some global state.

So what do we want to do here?

1) someone volunteering to find all the broken code and fix it?
2) go with my ordered wq?
3) someone want to come up with some kind of modeset mutex?
4) revert nonblocking modesets?

Chris Wilson Nov. 22, 2017, 11:10 a.m. UTC | #4

Quoting Ville Syrjala (2017-11-13 13:36:22)
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> 
> We have plenty of global registers and whatnot programmed without
> any further locking by the modeset code. Currently non-bocking
> modesets are allowed to execute in parallel which could corrupt
> said registers.
> 
> To avoid the problem let's run all non-blocking modesets on an
> ordered workqueue. We still put page flips etc. to system_unbound_wq
> allowing page flips on one pipe to execute in parallel with page flips
> or a modeset on a another pipe (assuming no known state is shared
> between them, at which point they would have been added to the same
> atomic commit and serialized that way).
> 
> Blocking modesets are already serialized with each other by
> connection_mutex, and thus are safe. To serialize them with
> non-blocking modesets we just flush the workqueue before executing
> blocking modesets.

I did something very similar in my patches for tracking concurrent
modesets using plane granularity. I used a fence to impose a barrier for
each level of registers (plane, crtc, global) and tracking the
dependencies/ordering via those barrier fences.

In effect, it gives the same barrier as switching between an ordered-wq,
but I feel using fences should be easier to extend down to finer
granularity. (What once was an easy patch is no more :-(
-Chris

Daniel Vetter Nov. 22, 2017, 12:25 p.m. UTC | #5

On Mon, Nov 13, 2017 at 3:36 PM, Maarten Lankhorst
<maarten.lankhorst@linux.intel.com> wrote:
> Op 13-11-17 om 14:36 schreef Ville Syrjala:
>> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
>>
>> We have plenty of global registers and whatnot programmed without
>> any further locking by the modeset code. Currently non-bocking
>> modesets are allowed to execute in parallel which could corrupt
>> said registers.
>>
>> To avoid the problem let's run all non-blocking modesets on an
>> ordered workqueue. We still put page flips etc. to system_unbound_wq
>> allowing page flips on one pipe to execute in parallel with page flips
>> or a modeset on a another pipe (assuming no known state is shared
>> between them, at which point they would have been added to the same
>> atomic commit and serialized that way).
>>
>> Blocking modesets are already serialized with each other by
>> connection_mutex, and thus are safe. To serialize them with
>> non-blocking modesets we just flush the workqueue before executing
>> blocking modesets.
>>
>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>> Fixes: 94f050246b42 ("drm/i915: nonblocking commit")
>> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> This patch won't really fix it, you could still have a blocking modeset in parallel to a nonblocking one. What would really be needed to fix those instances here?

The idea was that anything touching anything global would grap all
crtc state, and then we'd track those using the drm_crtc_commit stuff.
Putting everything onto one queue also doesn't work because they're
meant to somewhat overlap (plane cleanup is in the same work, but
should/can overlap with the next update).

Imo the right fix is to make sure we do add all the crtc states
everywhere we touch something global. And if that doesn't scale, then
modeset objects to track those bits.
-Daniel

Daniel Vetter Nov. 22, 2017, 12:35 p.m. UTC | #6

On Wed, Nov 22, 2017 at 1:25 PM, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> On Mon, Nov 13, 2017 at 3:36 PM, Maarten Lankhorst
> <maarten.lankhorst@linux.intel.com> wrote:
>> Op 13-11-17 om 14:36 schreef Ville Syrjala:
>>> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
>>>
>>> We have plenty of global registers and whatnot programmed without
>>> any further locking by the modeset code. Currently non-bocking
>>> modesets are allowed to execute in parallel which could corrupt
>>> said registers.
>>>
>>> To avoid the problem let's run all non-blocking modesets on an
>>> ordered workqueue. We still put page flips etc. to system_unbound_wq
>>> allowing page flips on one pipe to execute in parallel with page flips
>>> or a modeset on a another pipe (assuming no known state is shared
>>> between them, at which point they would have been added to the same
>>> atomic commit and serialized that way).
>>>
>>> Blocking modesets are already serialized with each other by
>>> connection_mutex, and thus are safe. To serialize them with
>>> non-blocking modesets we just flush the workqueue before executing
>>> blocking modesets.
>>>
>>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>>> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>>> Fixes: 94f050246b42 ("drm/i915: nonblocking commit")
>>> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
>> This patch won't really fix it, you could still have a blocking modeset in parallel to a nonblocking one. What would really be needed to fix those instances here?
>
> The idea was that anything touching anything global would grap all
> crtc state, and then we'd track those using the drm_crtc_commit stuff.
> Putting everything onto one queue also doesn't work because they're
> meant to somewhat overlap (plane cleanup is in the same work, but
> should/can overlap with the next update).
>
> Imo the right fix is to make sure we do add all the crtc states
> everywhere we touch something global. And if that doesn't scale, then
> modeset objects to track those bits.

Backtracking a lot here: What kind of global registers do you mean
here? I have a bit the feeling that in a bunch of cases the problem
would then also be that it's not really atomic when it matters how we
interleave the updates ...
-Daniel

Ville Syrjälä Nov. 22, 2017, 12:56 p.m. UTC | #7

On Wed, Nov 22, 2017 at 01:35:39PM +0100, Daniel Vetter wrote:
> On Wed, Nov 22, 2017 at 1:25 PM, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > On Mon, Nov 13, 2017 at 3:36 PM, Maarten Lankhorst
> > <maarten.lankhorst@linux.intel.com> wrote:
> >> Op 13-11-17 om 14:36 schreef Ville Syrjala:
> >>> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> >>>
> >>> We have plenty of global registers and whatnot programmed without
> >>> any further locking by the modeset code. Currently non-bocking
> >>> modesets are allowed to execute in parallel which could corrupt
> >>> said registers.
> >>>
> >>> To avoid the problem let's run all non-blocking modesets on an
> >>> ordered workqueue. We still put page flips etc. to system_unbound_wq
> >>> allowing page flips on one pipe to execute in parallel with page flips
> >>> or a modeset on a another pipe (assuming no known state is shared
> >>> between them, at which point they would have been added to the same
> >>> atomic commit and serialized that way).
> >>>
> >>> Blocking modesets are already serialized with each other by
> >>> connection_mutex, and thus are safe. To serialize them with
> >>> non-blocking modesets we just flush the workqueue before executing
> >>> blocking modesets.
> >>>
> >>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> >>> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> >>> Fixes: 94f050246b42 ("drm/i915: nonblocking commit")
> >>> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> >> This patch won't really fix it, you could still have a blocking modeset in parallel to a nonblocking one. What would really be needed to fix those instances here?
> >
> > The idea was that anything touching anything global would grap all
> > crtc state, and then we'd track those using the drm_crtc_commit stuff.
> > Putting everything onto one queue also doesn't work because they're
> > meant to somewhat overlap (plane cleanup is in the same work, but
> > should/can overlap with the next update).
> >
> > Imo the right fix is to make sure we do add all the crtc states
> > everywhere we touch something global. And if that doesn't scale, then
> > modeset objects to track those bits.
> 
> Backtracking a lot here: What kind of global registers do you mean
> here? I have a bit the feeling that in a bunch of cases the problem
> would then also be that it's not really atomic when it matters how we
> interleave the updates ...

There are at least various global clock related registers we're frobbing
in DDI and PCH paths. And those are just the ones I've come across by
accident. The problem is that no one has even tried to go throgh the
code looking for these sorts of problems.

Daniel Vetter Nov. 23, 2017, 8:11 a.m. UTC | #8

On Wed, Nov 22, 2017 at 02:56:10PM +0200, Ville Syrjälä wrote:
> On Wed, Nov 22, 2017 at 01:35:39PM +0100, Daniel Vetter wrote:
> > On Wed, Nov 22, 2017 at 1:25 PM, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > > On Mon, Nov 13, 2017 at 3:36 PM, Maarten Lankhorst
> > > <maarten.lankhorst@linux.intel.com> wrote:
> > >> Op 13-11-17 om 14:36 schreef Ville Syrjala:
> > >>> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > >>>
> > >>> We have plenty of global registers and whatnot programmed without
> > >>> any further locking by the modeset code. Currently non-bocking
> > >>> modesets are allowed to execute in parallel which could corrupt
> > >>> said registers.
> > >>>
> > >>> To avoid the problem let's run all non-blocking modesets on an
> > >>> ordered workqueue. We still put page flips etc. to system_unbound_wq
> > >>> allowing page flips on one pipe to execute in parallel with page flips
> > >>> or a modeset on a another pipe (assuming no known state is shared
> > >>> between them, at which point they would have been added to the same
> > >>> atomic commit and serialized that way).
> > >>>
> > >>> Blocking modesets are already serialized with each other by
> > >>> connection_mutex, and thus are safe. To serialize them with
> > >>> non-blocking modesets we just flush the workqueue before executing
> > >>> blocking modesets.
> > >>>
> > >>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > >>> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> > >>> Fixes: 94f050246b42 ("drm/i915: nonblocking commit")
> > >>> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > >> This patch won't really fix it, you could still have a blocking modeset in parallel to a nonblocking one. What would really be needed to fix those instances here?
> > >
> > > The idea was that anything touching anything global would grap all
> > > crtc state, and then we'd track those using the drm_crtc_commit stuff.
> > > Putting everything onto one queue also doesn't work because they're
> > > meant to somewhat overlap (plane cleanup is in the same work, but
> > > should/can overlap with the next update).
> > >
> > > Imo the right fix is to make sure we do add all the crtc states
> > > everywhere we touch something global. And if that doesn't scale, then
> > > modeset objects to track those bits.
> > 
> > Backtracking a lot here: What kind of global registers do you mean
> > here? I have a bit the feeling that in a bunch of cases the problem
> > would then also be that it's not really atomic when it matters how we
> > interleave the updates ...
> 
> There are at least various global clock related registers we're frobbing
> in DDI and PCH paths. And those are just the ones I've come across by
> accident. The problem is that no one has even tried to go throgh the
> code looking for these sorts of problems.

Hm ...

The thing I'm worried about is that we do have some global shared state
not yet extracted properly. But then even for the shared state we do track
we don't sync correctly on updates to it. I guess long term we need to
beef up the private_state stuff to automagically take care of that.

Either way needs some duct-tape interim, and I think the ordered queue is
probably the least invasive, and least likely to be a roadblock for
whatever we pick in the future.

Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>

Maarten Lankhorst Nov. 23, 2017, 9:33 a.m. UTC | #9

Op 23-11-17 om 09:11 schreef Daniel Vetter:
> On Wed, Nov 22, 2017 at 02:56:10PM +0200, Ville Syrjälä wrote:
>> On Wed, Nov 22, 2017 at 01:35:39PM +0100, Daniel Vetter wrote:
>>> On Wed, Nov 22, 2017 at 1:25 PM, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>>>> On Mon, Nov 13, 2017 at 3:36 PM, Maarten Lankhorst
>>>> <maarten.lankhorst@linux.intel.com> wrote:
>>>>> Op 13-11-17 om 14:36 schreef Ville Syrjala:
>>>>>> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
>>>>>>
>>>>>> We have plenty of global registers and whatnot programmed without
>>>>>> any further locking by the modeset code. Currently non-bocking
>>>>>> modesets are allowed to execute in parallel which could corrupt
>>>>>> said registers.
>>>>>>
>>>>>> To avoid the problem let's run all non-blocking modesets on an
>>>>>> ordered workqueue. We still put page flips etc. to system_unbound_wq
>>>>>> allowing page flips on one pipe to execute in parallel with page flips
>>>>>> or a modeset on a another pipe (assuming no known state is shared
>>>>>> between them, at which point they would have been added to the same
>>>>>> atomic commit and serialized that way).
>>>>>>
>>>>>> Blocking modesets are already serialized with each other by
>>>>>> connection_mutex, and thus are safe. To serialize them with
>>>>>> non-blocking modesets we just flush the workqueue before executing
>>>>>> blocking modesets.
>>>>>>
>>>>>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>>>>>> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>>>>>> Fixes: 94f050246b42 ("drm/i915: nonblocking commit")
>>>>>> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
>>>>> This patch won't really fix it, you could still have a blocking modeset in parallel to a nonblocking one. What would really be needed to fix those instances here?
>>>> The idea was that anything touching anything global would grap all
>>>> crtc state, and then we'd track those using the drm_crtc_commit stuff.
>>>> Putting everything onto one queue also doesn't work because they're
>>>> meant to somewhat overlap (plane cleanup is in the same work, but
>>>> should/can overlap with the next update).
>>>>
>>>> Imo the right fix is to make sure we do add all the crtc states
>>>> everywhere we touch something global. And if that doesn't scale, then
>>>> modeset objects to track those bits.
>>> Backtracking a lot here: What kind of global registers do you mean
>>> here? I have a bit the feeling that in a bunch of cases the problem
>>> would then also be that it's not really atomic when it matters how we
>>> interleave the updates ...
>> There are at least various global clock related registers we're frobbing
>> in DDI and PCH paths. And those are just the ones I've come across by
>> accident. The problem is that no one has even tried to go throgh the
>> code looking for these sorts of problems.
> Hm ...
>
> The thing I'm worried about is that we do have some global shared state
> not yet extracted properly. But then even for the shared state we do track
> we don't sync correctly on updates to it. I guess long term we need to
> beef up the private_state stuff to automagically take care of that.
>
> Either way needs some duct-tape interim, and I think the ordered queue is
> probably the least invasive, and least likely to be a roadblock for
> whatever we pick in the future.
>
> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>

I'm also worried about the opposite, we worry deeply about tracked shared
resources currently because they may be used concurrently, for new features
we should assume it's used in parallel and we might not care as much with
this workaround.

That said, on newer platforms watermark allocations always lock out all 
concurrent CRTC updates anyway, so this will mostly affect older platforms..

Meh,

Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

drm/i915: Put all non-blocking modesets onto an ordered wq

Commit Message

Comments

Patch