Message ID | 20180420095426.1105-1-mika.kuoppala@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 4/20/2018 3:24 PM, Mika Kuoppala wrote: > We use jiffies to determine when wait expires. However > Imre did find out that jiffies can and will do a >1 > increments on certain situations [1]. When this happens > in a wait_for loop, we return timeout errorneously > much earlier than what the real wallclock would say. > > We can't afford our waits to timeout prematurely. > Discard jiffies and change to ktime to detect timeouts. > > Reported-by: Imre Deak <imre.deak@intel.com> > References: https://lkml.org/lkml/2018/4/18/798 [1] > Cc: Imre Deak <imre.deak@intel.com> > Cc: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> > Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> > --- > drivers/gpu/drm/i915/intel_drv.h | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h > index 8b20824e806e..ac7565220aa3 100644 > --- a/drivers/gpu/drm/i915/intel_drv.h > +++ b/drivers/gpu/drm/i915/intel_drv.h > @@ -49,12 +49,12 @@ > * check the condition before the timeout. > */ > #define __wait_for(OP, COND, US, Wmin, Wmax) ({ \ > - unsigned long timeout__ = jiffies + usecs_to_jiffies(US) + 1; \ > + const ktime_t end__ = ktime_add_ns(ktime_get_raw(), 1000ll * (US)); \ Is ktime_get_raw() monotonic? Thomas suggested ktime_get() > long wait__ = (Wmin); /* recommended min for usleep is 10 us */ \ > int ret__; \ > might_sleep(); \ > for (;;) { \ > - bool expired__ = time_after(jiffies, timeout__); \ > + const bool expired__ = ktime_after(ktime_get_raw(), end__); \ > OP; \ > if (COND) { \ > ret__ = 0; \
Quoting Mika Kuoppala (2018-04-20 10:54:26) > We use jiffies to determine when wait expires. However > Imre did find out that jiffies can and will do a >1 > increments on certain situations [1]. When this happens > in a wait_for loop, we return timeout errorneously > much earlier than what the real wallclock would say. > > We can't afford our waits to timeout prematurely. > Discard jiffies and change to ktime to detect timeouts. > > Reported-by: Imre Deak <imre.deak@intel.com> > References: https://lkml.org/lkml/2018/4/18/798 [1] > Cc: Imre Deak <imre.deak@intel.com> > Cc: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> > Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> The atomic variant is already jiffie-less (since it has to work in irq-off contexts). Maybe a bit tricky to suggest that the callers know if jiffie incremens are accurate or not. What is not clear from the link is whether our wait_for() is running across suspend, or whether it is just jiffie recalibration some time during resume that breaks. > drivers/gpu/drm/i915/intel_drv.h | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h > index 8b20824e806e..ac7565220aa3 100644 > --- a/drivers/gpu/drm/i915/intel_drv.h > +++ b/drivers/gpu/drm/i915/intel_drv.h > @@ -49,12 +49,12 @@ > * check the condition before the timeout. > */ > #define __wait_for(OP, COND, US, Wmin, Wmax) ({ \ > - unsigned long timeout__ = jiffies + usecs_to_jiffies(US) + 1; \ > + const ktime_t end__ = ktime_add_ns(ktime_get_raw(), 1000ll * (US)); \ > long wait__ = (Wmin); /* recommended min for usleep is 10 us */ \ > int ret__; \ > might_sleep(); \ > for (;;) { \ > - bool expired__ = time_after(jiffies, timeout__); \ > + const bool expired__ = ktime_after(ktime_get_raw(), end__); \ > OP; \ > if (COND) { \ > ret__ = 0; \ Nevertheless, the patch is ok and I don't have too much objection to adding another tsc (at best, hpet at worst!) read around every mmio+sleep, plus expanding the code for the function calls. Out of curiosity what is the size delta? How many wait_for() do we have left that we need to convert to a function call rather than macro expansion? Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Cc stable? -Chris
Quoting Sagar Arun Kamble (2018-04-20 11:23:50) > > > On 4/20/2018 3:24 PM, Mika Kuoppala wrote: > > We use jiffies to determine when wait expires. However > > Imre did find out that jiffies can and will do a >1 > > increments on certain situations [1]. When this happens > > in a wait_for loop, we return timeout errorneously > > much earlier than what the real wallclock would say. > > > > We can't afford our waits to timeout prematurely. > > Discard jiffies and change to ktime to detect timeouts. > > > > Reported-by: Imre Deak <imre.deak@intel.com> > > References: https://lkml.org/lkml/2018/4/18/798 [1] > > Cc: Imre Deak <imre.deak@intel.com> > > Cc: Chris Wilson <chris@chris-wilson.co.uk> > > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> > > Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> > > --- > > drivers/gpu/drm/i915/intel_drv.h | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h > > index 8b20824e806e..ac7565220aa3 100644 > > --- a/drivers/gpu/drm/i915/intel_drv.h > > +++ b/drivers/gpu/drm/i915/intel_drv.h > > @@ -49,12 +49,12 @@ > > * check the condition before the timeout. > > */ > > #define __wait_for(OP, COND, US, Wmin, Wmax) ({ \ > > - unsigned long timeout__ = jiffies + usecs_to_jiffies(US) + 1; \ > > + const ktime_t end__ = ktime_add_ns(ktime_get_raw(), 1000ll * (US)); \ > Is ktime_get_raw() monotonic? Thomas suggested ktime_get() It proclaims to be monotonic, without the clock drift calibration. For the milliseconds we should be sleeping at most, I hope that is immaterial. -Chris
On 4/20/2018 4:09 PM, Chris Wilson wrote: > Quoting Sagar Arun Kamble (2018-04-20 11:23:50) >> >> On 4/20/2018 3:24 PM, Mika Kuoppala wrote: >>> We use jiffies to determine when wait expires. However >>> Imre did find out that jiffies can and will do a >1 >>> increments on certain situations [1]. When this happens >>> in a wait_for loop, we return timeout errorneously >>> much earlier than what the real wallclock would say. >>> >>> We can't afford our waits to timeout prematurely. >>> Discard jiffies and change to ktime to detect timeouts. >>> >>> Reported-by: Imre Deak <imre.deak@intel.com> >>> References: https://lkml.org/lkml/2018/4/18/798 [1] >>> Cc: Imre Deak <imre.deak@intel.com> >>> Cc: Chris Wilson <chris@chris-wilson.co.uk> >>> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> >>> Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> >>> --- >>> drivers/gpu/drm/i915/intel_drv.h | 4 ++-- >>> 1 file changed, 2 insertions(+), 2 deletions(-) >>> >>> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h >>> index 8b20824e806e..ac7565220aa3 100644 >>> --- a/drivers/gpu/drm/i915/intel_drv.h >>> +++ b/drivers/gpu/drm/i915/intel_drv.h >>> @@ -49,12 +49,12 @@ >>> * check the condition before the timeout. >>> */ >>> #define __wait_for(OP, COND, US, Wmin, Wmax) ({ \ >>> - unsigned long timeout__ = jiffies + usecs_to_jiffies(US) + 1; \ >>> + const ktime_t end__ = ktime_add_ns(ktime_get_raw(), 1000ll * (US)); \ >> Is ktime_get_raw() monotonic? Thomas suggested ktime_get() > It proclaims to be monotonic, without the clock drift calibration. For > the milliseconds we should be sleeping at most, I hope that is > immaterial. Yes. I remembered from Imre's comment[1] that raw clock can jump and will not be calibrated. If jumps are are not > jiffies we should be good get_raw then. [1] https://lists.freedesktop.org/archives/dri-devel/2012-October/028878.html > -Chris
Chris Wilson <chris@chris-wilson.co.uk> writes: > Quoting Sagar Arun Kamble (2018-04-20 11:23:50) >> >> >> On 4/20/2018 3:24 PM, Mika Kuoppala wrote: >> > We use jiffies to determine when wait expires. However >> > Imre did find out that jiffies can and will do a >1 >> > increments on certain situations [1]. When this happens >> > in a wait_for loop, we return timeout errorneously >> > much earlier than what the real wallclock would say. >> > >> > We can't afford our waits to timeout prematurely. >> > Discard jiffies and change to ktime to detect timeouts. >> > >> > Reported-by: Imre Deak <imre.deak@intel.com> >> > References: https://lkml.org/lkml/2018/4/18/798 [1] >> > Cc: Imre Deak <imre.deak@intel.com> >> > Cc: Chris Wilson <chris@chris-wilson.co.uk> >> > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> >> > Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> >> > --- >> > drivers/gpu/drm/i915/intel_drv.h | 4 ++-- >> > 1 file changed, 2 insertions(+), 2 deletions(-) >> > >> > diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h >> > index 8b20824e806e..ac7565220aa3 100644 >> > --- a/drivers/gpu/drm/i915/intel_drv.h >> > +++ b/drivers/gpu/drm/i915/intel_drv.h >> > @@ -49,12 +49,12 @@ >> > * check the condition before the timeout. >> > */ >> > #define __wait_for(OP, COND, US, Wmin, Wmax) ({ \ >> > - unsigned long timeout__ = jiffies + usecs_to_jiffies(US) + 1; \ >> > + const ktime_t end__ = ktime_add_ns(ktime_get_raw(), 1000ll * (US)); \ >> Is ktime_get_raw() monotonic? Thomas suggested ktime_get() > > It proclaims to be monotonic, without the clock drift calibration. For > the milliseconds we should be sleeping at most, I hope that is > immaterial. Agreed. And even if we would be affected by drift, we would want to extend the end a little instead of accepting a drift calibration while we are waiting. Here is how the code size is affected: add/remove: 0/0 grow/shrink: 5/29 up/down: 77/-1099 (-1022) Function old new delta __intel_wait_for_register_fw 462 496 +34 swsci 576 596 +20 skl_pcode_request 488 507 +19 intel_hdcp_auth 3806 3809 +3 __intel_wait_for_register 436 437 +1 hsw_power_well_disable 548 539 -9 i915_gem_idle_work_handler 608 593 -15 guc_fw_xfer 875 854 -21 bxt_ddi_pll_disable 324 303 -21 chv_set_cdclk 357 334 -23 vlv_wm_get_hw_state 2474 2450 -24 gmbus_wait 564 538 -26 cnl_cdclk_pll_enable 286 259 -27 cnl_cdclk_pll_disable 247 220 -27 hsw_enable_pc8 1699 1671 -28 bdw_set_cdclk 1006 977 -29 wait_for_pipe_scanline_moving 388 358 -30 intel_enable_dsi_pll 1060 1028 -32 i915_gem_wait_for_idle 348 314 -34 chv_set_memory_dvfs 258 223 -35 vlv_set_power_well 321 282 -39 intel_hdmi_hdcp_check_link 485 446 -39 chv_set_pipe_power_well.constprop 359 320 -39 g33_do_reset 216 175 -41 vlv_wait_for_pw_status 210 168 -42 intel_engines_park 348 306 -42 vlv_set_cdclk 711 662 -49 lspcon_wait_mode 265 211 -54 lpt_init_pch_refclk 1542 1488 -54 g4x_do_reset 595 541 -54 bxt_ddi_pll_enable 2407 2353 -54 ironlake_crtc_enable 3279 3211 -68 i915_do_reset 443 375 -68 intel_guc_send_ct 1691 1616 -75 Total: Before=1232834, After=1231812, chg -0.08% -Mika
On Fri, Apr 20, 2018 at 11:27:55AM +0100, Chris Wilson wrote: > Quoting Mika Kuoppala (2018-04-20 10:54:26) > > We use jiffies to determine when wait expires. However > > Imre did find out that jiffies can and will do a >1 > > increments on certain situations [1]. When this happens > > in a wait_for loop, we return timeout errorneously > > much earlier than what the real wallclock would say. > > > > We can't afford our waits to timeout prematurely. > > Discard jiffies and change to ktime to detect timeouts. > > > > Reported-by: Imre Deak <imre.deak@intel.com> > > References: https://lkml.org/lkml/2018/4/18/798 [1] > > Cc: Imre Deak <imre.deak@intel.com> > > Cc: Chris Wilson <chris@chris-wilson.co.uk> > > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> > > Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> > > The atomic variant is already jiffie-less (since it has to work in > irq-off contexts). Maybe a bit tricky to suggest that the callers know > if jiffie incremens are accurate or not. > > What is not clear from the link is whether our wait_for() is running > across suspend, or whether it is just jiffie recalibration some time > during resume that breaks. The wait_for starts on the resume path, so the jump shouldn't be related to any of the timekeeping adjustments across suspend/resume (happening already during syscore resume). It looks like a delayed LAPIC timer interrupt on that GLK system, trying to get more details on that with irqsoff ftracing. > > > drivers/gpu/drm/i915/intel_drv.h | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h > > index 8b20824e806e..ac7565220aa3 100644 > > --- a/drivers/gpu/drm/i915/intel_drv.h > > +++ b/drivers/gpu/drm/i915/intel_drv.h > > @@ -49,12 +49,12 @@ > > * check the condition before the timeout. > > */ > > #define __wait_for(OP, COND, US, Wmin, Wmax) ({ \ > > - unsigned long timeout__ = jiffies + usecs_to_jiffies(US) + 1; \ > > + const ktime_t end__ = ktime_add_ns(ktime_get_raw(), 1000ll * (US)); \ > > long wait__ = (Wmin); /* recommended min for usleep is 10 us */ \ > > int ret__; \ > > might_sleep(); \ > > for (;;) { \ > > - bool expired__ = time_after(jiffies, timeout__); \ > > + const bool expired__ = ktime_after(ktime_get_raw(), end__); \ > > OP; \ > > if (COND) { \ > > ret__ = 0; \ > > Nevertheless, the patch is ok and I don't have too much objection to > adding another tsc (at best, hpet at worst!) read around every mmio+sleep, > plus expanding the code for the function calls. Out of curiosity what is > the size delta? How many wait_for() do we have left that we need to > convert to a function call rather than macro expansion? > > Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> > > Cc stable? > -Chris > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Imre Deak <imre.deak@intel.com> writes: > On Fri, Apr 20, 2018 at 11:27:55AM +0100, Chris Wilson wrote: >> Quoting Mika Kuoppala (2018-04-20 10:54:26) >> > We use jiffies to determine when wait expires. However >> > Imre did find out that jiffies can and will do a >1 >> > increments on certain situations [1]. When this happens >> > in a wait_for loop, we return timeout errorneously >> > much earlier than what the real wallclock would say. >> > >> > We can't afford our waits to timeout prematurely. >> > Discard jiffies and change to ktime to detect timeouts. >> > >> > Reported-by: Imre Deak <imre.deak@intel.com> >> > References: https://lkml.org/lkml/2018/4/18/798 [1] >> > Cc: Imre Deak <imre.deak@intel.com> >> > Cc: Chris Wilson <chris@chris-wilson.co.uk> >> > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> >> > Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> >> >> The atomic variant is already jiffie-less (since it has to work in >> irq-off contexts). Maybe a bit tricky to suggest that the callers know >> if jiffie incremens are accurate or not. >> >> What is not clear from the link is whether our wait_for() is running >> across suspend, or whether it is just jiffie recalibration some time >> during resume that breaks. > > The wait_for starts on the resume path, so the jump shouldn't be related > to any of the timekeeping adjustments across suspend/resume (happening > already during syscore resume). It looks like a delayed LAPIC timer > interrupt on that GLK system, trying to get more details on that with > irqsoff ftracing. Both patches pushed. Thanks for the report and reviews. -Mika > >> >> > drivers/gpu/drm/i915/intel_drv.h | 4 ++-- >> > 1 file changed, 2 insertions(+), 2 deletions(-) >> > >> > diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h >> > index 8b20824e806e..ac7565220aa3 100644 >> > --- a/drivers/gpu/drm/i915/intel_drv.h >> > +++ b/drivers/gpu/drm/i915/intel_drv.h >> > @@ -49,12 +49,12 @@ >> > * check the condition before the timeout. >> > */ >> > #define __wait_for(OP, COND, US, Wmin, Wmax) ({ \ >> > - unsigned long timeout__ = jiffies + usecs_to_jiffies(US) + 1; \ >> > + const ktime_t end__ = ktime_add_ns(ktime_get_raw(), 1000ll * (US)); \ >> > long wait__ = (Wmin); /* recommended min for usleep is 10 us */ \ >> > int ret__; \ >> > might_sleep(); \ >> > for (;;) { \ >> > - bool expired__ = time_after(jiffies, timeout__); \ >> > + const bool expired__ = ktime_after(ktime_get_raw(), end__); \ >> > OP; \ >> > if (COND) { \ >> > ret__ = 0; \ >> >> Nevertheless, the patch is ok and I don't have too much objection to >> adding another tsc (at best, hpet at worst!) read around every mmio+sleep, >> plus expanding the code for the function calls. Out of curiosity what is >> the size delta? How many wait_for() do we have left that we need to >> convert to a function call rather than macro expansion? >> >> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> >> >> Cc stable? >> -Chris >> _______________________________________________ >> Intel-gfx mailing list >> Intel-gfx@lists.freedesktop.org >> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h index 8b20824e806e..ac7565220aa3 100644 --- a/drivers/gpu/drm/i915/intel_drv.h +++ b/drivers/gpu/drm/i915/intel_drv.h @@ -49,12 +49,12 @@ * check the condition before the timeout. */ #define __wait_for(OP, COND, US, Wmin, Wmax) ({ \ - unsigned long timeout__ = jiffies + usecs_to_jiffies(US) + 1; \ + const ktime_t end__ = ktime_add_ns(ktime_get_raw(), 1000ll * (US)); \ long wait__ = (Wmin); /* recommended min for usleep is 10 us */ \ int ret__; \ might_sleep(); \ for (;;) { \ - bool expired__ = time_after(jiffies, timeout__); \ + const bool expired__ = ktime_after(ktime_get_raw(), end__); \ OP; \ if (COND) { \ ret__ = 0; \
We use jiffies to determine when wait expires. However Imre did find out that jiffies can and will do a >1 increments on certain situations [1]. When this happens in a wait_for loop, we return timeout errorneously much earlier than what the real wallclock would say. We can't afford our waits to timeout prematurely. Discard jiffies and change to ktime to detect timeouts. Reported-by: Imre Deak <imre.deak@intel.com> References: https://lkml.org/lkml/2018/4/18/798 [1] Cc: Imre Deak <imre.deak@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> --- drivers/gpu/drm/i915/intel_drv.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)