Message ID | 20180530103314.30565-2-chris@chris-wilson.co.uk (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 30/05/18 03:33, Chris Wilson wrote: > After hitting the SIGINT from execbuf, wait until the next timer signal > before trying again. This aligns the start of the ioctl to the timer, > hopefully maximising the amount of time we have for processing before > the next signal -- trying to prevent the case where we are scheduled out > in the middle of processing and so hit the timer signal too early. > > References: https://bugs.freedesktop.org/show_bug.cgi?id=106695 > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Not sure I understand what is the sequence of events, is the problem we get a signal in the middle of a 'good' execbuf and exit the while loop prematurely? If so maybe we can also think of making the timer 'VIRTUAL' so that it would decrement only when the process is executing. Thanks, Antonio > --- > lib/i915/gem_ring.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/lib/i915/gem_ring.c b/lib/i915/gem_ring.c > index 7d64165eb..0c061000c 100644 > --- a/lib/i915/gem_ring.c > +++ b/lib/i915/gem_ring.c > @@ -96,6 +96,8 @@ __gem_measure_ring_inflight(int fd, unsigned int engine, enum measure_ring_flags > if (last == count) > break; > > + /* sleep until the next timer interrupt (woken on signal) */ > + pause(); Does it cause any (sensible) slowdown? Thanks, Antonio > last = count; > } while (1); > >
Quoting Antonio Argenziano (2018-05-30 18:30:36) > > > On 30/05/18 03:33, Chris Wilson wrote: > > After hitting the SIGINT from execbuf, wait until the next timer signal > > before trying again. This aligns the start of the ioctl to the timer, > > hopefully maximising the amount of time we have for processing before > > the next signal -- trying to prevent the case where we are scheduled out > > in the middle of processing and so hit the timer signal too early. > > > > References: https://bugs.freedesktop.org/show_bug.cgi?id=106695 > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > > Not sure I understand what is the sequence of events, is the problem we > get a signal in the middle of a 'good' execbuf and exit the while loop > prematurely? If so maybe we can also think of making the timer 'VIRTUAL' > so that it would decrement only when the process is executing. If it's VIRTUAL it'll never fire when we wait for space (as being asleep no user/sys time is consumed). The only way I can explain 106695 would be with some very strange scheduler behaviour, but even then it requires us to hit a path where we actually check for a pending signal -- which should only happen when we run out of ring space for this setup. Not even the device being wedged (which it wasn't) would cause the ring to drain. Possibly going over 10s and the cork being unplugged? Very stange. > > --- > > lib/i915/gem_ring.c | 2 ++ > > 1 file changed, 2 insertions(+) > > > > diff --git a/lib/i915/gem_ring.c b/lib/i915/gem_ring.c > > index 7d64165eb..0c061000c 100644 > > --- a/lib/i915/gem_ring.c > > +++ b/lib/i915/gem_ring.c > > @@ -96,6 +96,8 @@ __gem_measure_ring_inflight(int fd, unsigned int engine, enum measure_ring_flags > > if (last == count) > > break; > > > > + /* sleep until the next timer interrupt (woken on signal) */ > > + pause(); > > Does it cause any (sensible) slowdown? Adds at most one timer interval, 10us. Ok, at a push 2 timer intervals if it takes longer than first to setup the sleep. -Chris
On 30/05/18 12:52, Chris Wilson wrote: > Quoting Antonio Argenziano (2018-05-30 18:30:36) >> >> >> On 30/05/18 03:33, Chris Wilson wrote: >>> After hitting the SIGINT from execbuf, wait until the next timer signal >>> before trying again. This aligns the start of the ioctl to the timer, >>> hopefully maximising the amount of time we have for processing before >>> the next signal -- trying to prevent the case where we are scheduled out >>> in the middle of processing and so hit the timer signal too early. >>> >>> References: https://bugs.freedesktop.org/show_bug.cgi?id=106695 >>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> >> >> Not sure I understand what is the sequence of events, is the problem we >> get a signal in the middle of a 'good' execbuf and exit the while loop >> prematurely? If so maybe we can also think of making the timer 'VIRTUAL' >> so that it would decrement only when the process is executing. > > If it's VIRTUAL it'll never fire when we wait for space (as being asleep > no user/sys time is consumed). > > The only way I can explain 106695 would be with some very strange > scheduler behaviour, but even then it requires us to hit a path where we > actually check for a pending signal -- which should only happen when we > run out of ring space for this setup. Not even the device being wedged > (which it wasn't) would cause the ring to drain. Possibly going over 10s > and the cork being unplugged? Very stange. Just a bit concerned that we might be covering up some weird corner case where we are sleeping where we shouldn't. But the patch does what advertised and seems sensible so: Acked-by: Antonio Argenziano <antonio.argenziano@intel.com> > >>> --- >>> lib/i915/gem_ring.c | 2 ++ >>> 1 file changed, 2 insertions(+) >>> >>> diff --git a/lib/i915/gem_ring.c b/lib/i915/gem_ring.c >>> index 7d64165eb..0c061000c 100644 >>> --- a/lib/i915/gem_ring.c >>> +++ b/lib/i915/gem_ring.c >>> @@ -96,6 +96,8 @@ __gem_measure_ring_inflight(int fd, unsigned int engine, enum measure_ring_flags >>> if (last == count) >>> break; >>> >>> + /* sleep until the next timer interrupt (woken on signal) */ >>> + pause(); >> >> Does it cause any (sensible) slowdown? > > Adds at most one timer interval, 10us. Ok, at a push 2 timer intervals > if it takes longer than first to setup the sleep. > -Chris >
Quoting Antonio Argenziano (2018-05-31 15:42:03) > > > On 30/05/18 12:52, Chris Wilson wrote: > > Quoting Antonio Argenziano (2018-05-30 18:30:36) > >> > >> > >> On 30/05/18 03:33, Chris Wilson wrote: > >>> After hitting the SIGINT from execbuf, wait until the next timer signal > >>> before trying again. This aligns the start of the ioctl to the timer, > >>> hopefully maximising the amount of time we have for processing before > >>> the next signal -- trying to prevent the case where we are scheduled out > >>> in the middle of processing and so hit the timer signal too early. > >>> > >>> References: https://bugs.freedesktop.org/show_bug.cgi?id=106695 > >>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > >> > >> Not sure I understand what is the sequence of events, is the problem we > >> get a signal in the middle of a 'good' execbuf and exit the while loop > >> prematurely? If so maybe we can also think of making the timer 'VIRTUAL' > >> so that it would decrement only when the process is executing. > > > > If it's VIRTUAL it'll never fire when we wait for space (as being asleep > > no user/sys time is consumed). > > > > The only way I can explain 106695 would be with some very strange > > scheduler behaviour, but even then it requires us to hit a path where we > > actually check for a pending signal -- which should only happen when we > > run out of ring space for this setup. Not even the device being wedged > > (which it wasn't) would cause the ring to drain. Possibly going over 10s > > and the cork being unplugged? Very stange. > > Just a bit concerned that we might be covering up some weird corner case > where we are sleeping where we shouldn't. The bugs are exactly the opposite, where there's a signal pending and we ignore it ;) And ignore them we do. If one day someone cares signal latency, we have a lot of explaining to do... It's just worrying because the only signal_pending() check we expect to hit is in wait-for-space (i915_request_wait to be precise); and that should be consistent between calls to execbuf. However, it's not meant to be a defining test of user behaviour, just exploiting the limitation of the implementation to report said limitation. All that we must do is to be sure that we don't over-report or else the callers will hang during their setup and fail. Under reporting is a nuisance, but not a huge issue. -Chris
diff --git a/lib/i915/gem_ring.c b/lib/i915/gem_ring.c index 7d64165eb..0c061000c 100644 --- a/lib/i915/gem_ring.c +++ b/lib/i915/gem_ring.c @@ -96,6 +96,8 @@ __gem_measure_ring_inflight(int fd, unsigned int engine, enum measure_ring_flags if (last == count) break; + /* sleep until the next timer interrupt (woken on signal) */ + pause(); last = count; } while (1);
After hitting the SIGINT from execbuf, wait until the next timer signal before trying again. This aligns the start of the ioctl to the timer, hopefully maximising the amount of time we have for processing before the next signal -- trying to prevent the case where we are scheduled out in the middle of processing and so hit the timer signal too early. References: https://bugs.freedesktop.org/show_bug.cgi?id=106695 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> --- lib/i915/gem_ring.c | 2 ++ 1 file changed, 2 insertions(+)