Message ID | 20180417170638.20550-1-chris@chris-wilson.co.uk (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Quoting Chris Wilson (2018-04-17 20:06:38) > The old wait_on_atomic_t used a custom callback to perform the > schedule(), which used my return semantics of reporting an error code on > timeout. wait_var_event_timeout() uses the schedule() return semantics > of reporting the remaining jiffies (1 if it timed out with 0 jiffies > remaining!) and 0 on failure. This semantic mismatch lead to us falsely > claiming a time out occurred. > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106085 > Fixes: d224985a5e31 ("sched/wait, drivers/drm: Convert wait_on_atomic_t() usage to the new wait_var_event() API") > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Regards, Joonas
Quoting Joonas Lahtinen (2018-04-18 10:10:17) > Quoting Chris Wilson (2018-04-17 20:06:38) > > The old wait_on_atomic_t used a custom callback to perform the > > schedule(), which used my return semantics of reporting an error code on > > timeout. wait_var_event_timeout() uses the schedule() return semantics > > of reporting the remaining jiffies (1 if it timed out with 0 jiffies > > remaining!) and 0 on failure. This semantic mismatch lead to us falsely > > claiming a time out occurred. > > > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106085 > > Fixes: d224985a5e31 ("sched/wait, drivers/drm: Convert wait_on_atomic_t() usage to the new wait_var_event() API") > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > > Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> How about a backmerge of rc1 onto drm-intel-next-queued so we can apply the fix? -Chris
Quoting Chris Wilson (2018-04-18 12:14:15) > Quoting Joonas Lahtinen (2018-04-18 10:10:17) > > Quoting Chris Wilson (2018-04-17 20:06:38) > > > The old wait_on_atomic_t used a custom callback to perform the > > > schedule(), which used my return semantics of reporting an error code on > > > timeout. wait_var_event_timeout() uses the schedule() return semantics > > > of reporting the remaining jiffies (1 if it timed out with 0 jiffies > > > remaining!) and 0 on failure. This semantic mismatch lead to us falsely > > > claiming a time out occurred. > > > > > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106085 > > > Fixes: d224985a5e31 ("sched/wait, drivers/drm: Convert wait_on_atomic_t() usage to the new wait_var_event() API") > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > > > > Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> > > How about a backmerge of rc1 onto drm-intel-next-queued so we can apply > the fix? + Jani for that Regards, Joonas > -Chris
Chris Wilson <chris@chris-wilson.co.uk> writes: > The old wait_on_atomic_t used a custom callback to perform the > schedule(), which used my return semantics of reporting an error code on > timeout. wait_var_event_timeout() uses the schedule() return semantics > of reporting the remaining jiffies (1 if it timed out with 0 jiffies > remaining!) and 0 on failure. This semantic mismatch lead to us falsely > claiming a time out occurred. I might have gone too far into the rabbit hole on this one. The more I abseiled, the more there was macro trickery and shadowing return values. Anxiety started to creep in. I am back but will need to recuperate. Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106085 > Fixes: d224985a5e31 ("sched/wait, drivers/drm: Convert wait_on_atomic_t() usage to the new wait_var_event() API") > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > --- > drivers/gpu/drm/i915/selftests/intel_breadcrumbs.c | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/i915/selftests/intel_breadcrumbs.c b/drivers/gpu/drm/i915/selftests/intel_breadcrumbs.c > index 46580026c7fc..d6926e7820e5 100644 > --- a/drivers/gpu/drm/i915/selftests/intel_breadcrumbs.c > +++ b/drivers/gpu/drm/i915/selftests/intel_breadcrumbs.c > @@ -412,10 +412,11 @@ static int igt_wakeup(void *arg) > * that they are ready for the next test. We wait until all > * threads are complete and waiting for us (i.e. not a seqno). > */ > - err = wait_var_event_timeout(&done, !atomic_read(&done), 10 * HZ); > - if (err) { > + if (!wait_var_event_timeout(&done, > + !atomic_read(&done), 10 * HZ)) { > pr_err("Timed out waiting for %d remaining waiters\n", > atomic_read(&done)); > + err = -ETIMEDOUT; > break; > } > > -- > 2.17.0 > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Quoting Patchwork (2018-04-18 14:38:56) > ==== Possible fixes ==== > > igt@drv_selftest@mock_breadcrumbs: > shard-hsw: DMESG-FAIL (fdo#106085) -> PASS > shard-snb: DMESG-FAIL (fdo#106085) -> PASS > shard-apl: DMESG-FAIL (fdo#106085) -> PASS > shard-kbl: DMESG-FAIL (fdo#106085) -> PASS And with dinq rolled forward, I've pushed the patch. Thanks for the review, -Chris
diff --git a/drivers/gpu/drm/i915/selftests/intel_breadcrumbs.c b/drivers/gpu/drm/i915/selftests/intel_breadcrumbs.c index 46580026c7fc..d6926e7820e5 100644 --- a/drivers/gpu/drm/i915/selftests/intel_breadcrumbs.c +++ b/drivers/gpu/drm/i915/selftests/intel_breadcrumbs.c @@ -412,10 +412,11 @@ static int igt_wakeup(void *arg) * that they are ready for the next test. We wait until all * threads are complete and waiting for us (i.e. not a seqno). */ - err = wait_var_event_timeout(&done, !atomic_read(&done), 10 * HZ); - if (err) { + if (!wait_var_event_timeout(&done, + !atomic_read(&done), 10 * HZ)) { pr_err("Timed out waiting for %d remaining waiters\n", atomic_read(&done)); + err = -ETIMEDOUT; break; }
The old wait_on_atomic_t used a custom callback to perform the schedule(), which used my return semantics of reporting an error code on timeout. wait_var_event_timeout() uses the schedule() return semantics of reporting the remaining jiffies (1 if it timed out with 0 jiffies remaining!) and 0 on failure. This semantic mismatch lead to us falsely claiming a time out occurred. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106085 Fixes: d224985a5e31 ("sched/wait, drivers/drm: Convert wait_on_atomic_t() usage to the new wait_var_event() API") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> --- drivers/gpu/drm/i915/selftests/intel_breadcrumbs.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)