Message ID | 1481776147-23041-1-git-send-email-hofrat@osadl.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Thu, 15 Dec 2016, Nicholas Mc Guire <hofrat@osadl.org> wrote: > Even on fast systems a 2 microsecond delay is most likely more efficient > as a busy-wait loop. The overhead of a hrtimer does not seem warranted - > change this to a udelay(2). Similar concerns as in [1]. We don't need the accuracy of udelay() here, so this boils down to which is the better use of CPU. We could probably relax the max delay more if that was helpful. But I'm not immediately sold on "is most likely more efficient" which sounds like a gut feeling. I'm sorry it's not clear in my other reply that I do appreciate addressing incorrect/silly use of usleep_range(); I'm just not (yet) convinced udelay() is the answer. BR, Jani. [1] http://lkml.kernel.org/r/8737hpr32a.fsf@intel.com > > Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org> > --- > > Problem found by coccinelle: > > Patch was compile tested with: x86_64_defconfig (implies CONFIG_DRM_I915) > > Patch is against 4.9.0 (localversion-next is next-20161214) > > drivers/gpu/drm/i915/intel_dsi.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/i915/intel_dsi.c b/drivers/gpu/drm/i915/intel_dsi.c > index 5b72c50..19fe86b 100644 > --- a/drivers/gpu/drm/i915/intel_dsi.c > +++ b/drivers/gpu/drm/i915/intel_dsi.c > @@ -379,7 +379,7 @@ static void bxt_dsi_device_ready(struct intel_encoder *encoder) > val &= ~ULPS_STATE_MASK; > val |= (ULPS_STATE_ENTER | DEVICE_READY); > I915_WRITE(MIPI_DEVICE_READY(port), val); > - usleep_range(2, 3); > + udelay(2); > > /* 3. Exit ULPS */ > val = I915_READ(MIPI_DEVICE_READY(port));
On Thu, Dec 15, 2016 at 11:08:49AM +0200, Jani Nikula wrote: > On Thu, 15 Dec 2016, Nicholas Mc Guire <hofrat@osadl.org> wrote: > > Even on fast systems a 2 microsecond delay is most likely more efficient > > as a busy-wait loop. The overhead of a hrtimer does not seem warranted - > > change this to a udelay(2). > > Similar concerns as in [1]. We don't need the accuracy of udelay() here, > so this boils down to which is the better use of CPU. We could probably > relax the max delay more if that was helpful. But I'm not immediately > sold on "is most likely more efficient" which sounds like a gut feeling. > > I'm sorry it's not clear in my other reply that I do appreciate > addressing incorrect/silly use of usleep_range(); I'm just not (yet) > convinced udelay() is the answer. So one reason why we unconditionally use *sleep variants is the might_sleep check. Because in the past people have used udelay and mdelay, those delays had to be increased a lot because hw, and at the same time someone added users of these functions to our irq helper, resulting in irq handling times measures in multiple ms. That's not good. So until someone can demonstrate that there's a real benefit (which let's be honest, for modeset code, will never be the case) I very highly prefer to use *sleep* functions. They prevent some silly things from happening by accident. Micro-optimizing modeset code and hampering maitainability in the process is not good. -Daniel > > BR, > Jani. > > > [1] http://lkml.kernel.org/r/8737hpr32a.fsf@intel.com > > > > > > Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org> > > --- > > > > Problem found by coccinelle: > > > > Patch was compile tested with: x86_64_defconfig (implies CONFIG_DRM_I915) > > > > Patch is against 4.9.0 (localversion-next is next-20161214) > > > > drivers/gpu/drm/i915/intel_dsi.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/drivers/gpu/drm/i915/intel_dsi.c b/drivers/gpu/drm/i915/intel_dsi.c > > index 5b72c50..19fe86b 100644 > > --- a/drivers/gpu/drm/i915/intel_dsi.c > > +++ b/drivers/gpu/drm/i915/intel_dsi.c > > @@ -379,7 +379,7 @@ static void bxt_dsi_device_ready(struct intel_encoder *encoder) > > val &= ~ULPS_STATE_MASK; > > val |= (ULPS_STATE_ENTER | DEVICE_READY); > > I915_WRITE(MIPI_DEVICE_READY(port), val); > > - usleep_range(2, 3); > > + udelay(2); > > > > /* 3. Exit ULPS */ > > val = I915_READ(MIPI_DEVICE_READY(port)); > > -- > Jani Nikula, Intel Open Source Technology Center > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
On Thu, Dec 15, 2016 at 10:25:19AM +0100, Daniel Vetter wrote: > On Thu, Dec 15, 2016 at 11:08:49AM +0200, Jani Nikula wrote: > > On Thu, 15 Dec 2016, Nicholas Mc Guire <hofrat@osadl.org> wrote: > > > Even on fast systems a 2 microsecond delay is most likely more efficient > > > as a busy-wait loop. The overhead of a hrtimer does not seem warranted - > > > change this to a udelay(2). > > > > Similar concerns as in [1]. We don't need the accuracy of udelay() here, > > so this boils down to which is the better use of CPU. We could probably > > relax the max delay more if that was helpful. But I'm not immediately > > sold on "is most likely more efficient" which sounds like a gut feeling. > > > > I'm sorry it's not clear in my other reply that I do appreciate > > addressing incorrect/silly use of usleep_range(); I'm just not (yet) > > convinced udelay() is the answer. > > So one reason why we unconditionally use *sleep variants is the > might_sleep check. Because in the past people have used udelay and mdelay, > those delays had to be increased a lot because hw, and at the same time > someone added users of these functions to our irq helper, resulting in irq > handling times measures in multiple ms. That's not good. > > So until someone can demonstrate that there's a real benefit (which let's > be honest, for modeset code, will never be the case) I very highly prefer > to use *sleep* functions. They prevent some silly things from happening by > accident. Micro-optimizing modeset code and hampering maitainability in > the process is not good. Also, the entire premise seems backwards: usleep_range is inefficient for certain parameter ranges and better replaced with udelay. That makes sense. But why exactly do we not fix udelay_range then, but instead do a cocci job crawling through all the thousands of callers? Just fix usleep(_range) to use udelay for very small values (and still keep the might_sleep check ofc) if that's more efficient! -Daniel
On Thu, Dec 15, 2016 at 11:08:49AM +0200, Jani Nikula wrote: > On Thu, 15 Dec 2016, Nicholas Mc Guire <hofrat@osadl.org> wrote: > > Even on fast systems a 2 microsecond delay is most likely more efficient > > as a busy-wait loop. The overhead of a hrtimer does not seem warranted - > > change this to a udelay(2). > > Similar concerns as in [1]. We don't need the accuracy of udelay() here, > so this boils down to which is the better use of CPU. We could probably > relax the max delay more if that was helpful. But I'm not immediately > sold on "is most likely more efficient" which sounds like a gut feeling. > > I'm sorry it's not clear in my other reply that I do appreciate > addressing incorrect/silly use of usleep_range(); I'm just not (yet) > convinced udelay() is the answer. if the delay is not critical and all that is needed is an assurance that it is greater than X us then usleep_range is fine with a relaxed limit. So from what you wrote my patch proposal is wrong - the udelay() is not the way to got. My intent is to get rid of very small usleep_range() cases so if usleep_range(20,50) causes no issues with this driver and does not induce any performance penalty then that would be the way to go I think. thx! hofrat
On Thu, 15 Dec 2016, Nicholas Mc Guire <der.herr@hofr.at> wrote: > On Thu, Dec 15, 2016 at 11:08:49AM +0200, Jani Nikula wrote: >> On Thu, 15 Dec 2016, Nicholas Mc Guire <hofrat@osadl.org> wrote: >> > Even on fast systems a 2 microsecond delay is most likely more efficient >> > as a busy-wait loop. The overhead of a hrtimer does not seem warranted - >> > change this to a udelay(2). >> >> Similar concerns as in [1]. We don't need the accuracy of udelay() here, >> so this boils down to which is the better use of CPU. We could probably >> relax the max delay more if that was helpful. But I'm not immediately >> sold on "is most likely more efficient" which sounds like a gut feeling. >> >> I'm sorry it's not clear in my other reply that I do appreciate >> addressing incorrect/silly use of usleep_range(); I'm just not (yet) >> convinced udelay() is the answer. > > if the delay is not critical and all that is needed > is an assurance that it is greater than X us then > usleep_range is fine with a relaxed limit. > So from what you wrote my patch proposal is wrong - the > udelay() is not the way to got. > My intent is to get rid of very small usleep_range() cases > so if usleep_range(20,50) causes no issues with this driver > and does not induce any performance penalty then that would > be the way to go I think. Okay, so I looked at the code, and I looked at our spec, and I looked at the MIPI D-PHY spec, and I cried a little. Long story short, I think usleep_range(10, 50) will be fine. BR, Jani.
On Thu, Dec 15, 2016 at 11:52:34AM +0200, Jani Nikula wrote: > On Thu, 15 Dec 2016, Nicholas Mc Guire <der.herr@hofr.at> wrote: > > On Thu, Dec 15, 2016 at 11:08:49AM +0200, Jani Nikula wrote: > >> On Thu, 15 Dec 2016, Nicholas Mc Guire <hofrat@osadl.org> wrote: > >> > Even on fast systems a 2 microsecond delay is most likely more efficient > >> > as a busy-wait loop. The overhead of a hrtimer does not seem warranted - > >> > change this to a udelay(2). > >> > >> Similar concerns as in [1]. We don't need the accuracy of udelay() here, > >> so this boils down to which is the better use of CPU. We could probably > >> relax the max delay more if that was helpful. But I'm not immediately > >> sold on "is most likely more efficient" which sounds like a gut feeling. > >> > >> I'm sorry it's not clear in my other reply that I do appreciate > >> addressing incorrect/silly use of usleep_range(); I'm just not (yet) > >> convinced udelay() is the answer. > > > > if the delay is not critical and all that is needed > > is an assurance that it is greater than X us then > > usleep_range is fine with a relaxed limit. > > So from what you wrote my patch proposal is wrong - the > > udelay() is not the way to got. > > My intent is to get rid of very small usleep_range() cases > > so if usleep_range(20,50) causes no issues with this driver > > and does not induce any performance penalty then that would > > be the way to go I think. > > Okay, so I looked at the code, and I looked at our spec, and I looked at > the MIPI D-PHY spec, and I cried a little. > > Long story short, I think usleep_range(10, 50) will be fine. Note that I really want to see a comment next to every delay like this documenting the actual hardware requirement, if the delay used by the code doesn't 100% match it.
On Thu, 15 Dec 2016, Ville Syrjälä <ville.syrjala@linux.intel.com> wrote: > On Thu, Dec 15, 2016 at 11:52:34AM +0200, Jani Nikula wrote: >> On Thu, 15 Dec 2016, Nicholas Mc Guire <der.herr@hofr.at> wrote: >> > On Thu, Dec 15, 2016 at 11:08:49AM +0200, Jani Nikula wrote: >> >> On Thu, 15 Dec 2016, Nicholas Mc Guire <hofrat@osadl.org> wrote: >> >> > Even on fast systems a 2 microsecond delay is most likely more efficient >> >> > as a busy-wait loop. The overhead of a hrtimer does not seem warranted - >> >> > change this to a udelay(2). >> >> >> >> Similar concerns as in [1]. We don't need the accuracy of udelay() here, >> >> so this boils down to which is the better use of CPU. We could probably >> >> relax the max delay more if that was helpful. But I'm not immediately >> >> sold on "is most likely more efficient" which sounds like a gut feeling. >> >> >> >> I'm sorry it's not clear in my other reply that I do appreciate >> >> addressing incorrect/silly use of usleep_range(); I'm just not (yet) >> >> convinced udelay() is the answer. >> > >> > if the delay is not critical and all that is needed >> > is an assurance that it is greater than X us then >> > usleep_range is fine with a relaxed limit. >> > So from what you wrote my patch proposal is wrong - the >> > udelay() is not the way to got. >> > My intent is to get rid of very small usleep_range() cases >> > so if usleep_range(20,50) causes no issues with this driver >> > and does not induce any performance penalty then that would >> > be the way to go I think. >> >> Okay, so I looked at the code, and I looked at our spec, and I looked at >> the MIPI D-PHY spec, and I cried a little. >> >> Long story short, I think usleep_range(10, 50) will be fine. > > Note that I really want to see a comment next to every delay like this > documenting the actual hardware requirement, if the delay used by the > code doesn't 100% match it. Our spec says, "Wait for 2us for ULPS to complete". That's a simplistic view wrt D-PHY, and our code doesn't even match the spec. Hence the tears. Want to propose a wording for the comment so we can apply this change, without going for a full rewrite of the sequence? BR, Jani.
On Thu, Dec 15, 2016 at 12:20:01PM +0200, Jani Nikula wrote: > On Thu, 15 Dec 2016, Ville Syrjälä <ville.syrjala@linux.intel.com> wrote: > > On Thu, Dec 15, 2016 at 11:52:34AM +0200, Jani Nikula wrote: > >> On Thu, 15 Dec 2016, Nicholas Mc Guire <der.herr@hofr.at> wrote: > >> > On Thu, Dec 15, 2016 at 11:08:49AM +0200, Jani Nikula wrote: > >> >> On Thu, 15 Dec 2016, Nicholas Mc Guire <hofrat@osadl.org> wrote: > >> >> > Even on fast systems a 2 microsecond delay is most likely more efficient > >> >> > as a busy-wait loop. The overhead of a hrtimer does not seem warranted - > >> >> > change this to a udelay(2). > >> >> > >> >> Similar concerns as in [1]. We don't need the accuracy of udelay() here, > >> >> so this boils down to which is the better use of CPU. We could probably > >> >> relax the max delay more if that was helpful. But I'm not immediately > >> >> sold on "is most likely more efficient" which sounds like a gut feeling. > >> >> > >> >> I'm sorry it's not clear in my other reply that I do appreciate > >> >> addressing incorrect/silly use of usleep_range(); I'm just not (yet) > >> >> convinced udelay() is the answer. > >> > > >> > if the delay is not critical and all that is needed > >> > is an assurance that it is greater than X us then > >> > usleep_range is fine with a relaxed limit. > >> > So from what you wrote my patch proposal is wrong - the > >> > udelay() is not the way to got. > >> > My intent is to get rid of very small usleep_range() cases > >> > so if usleep_range(20,50) causes no issues with this driver > >> > and does not induce any performance penalty then that would > >> > be the way to go I think. > >> > >> Okay, so I looked at the code, and I looked at our spec, and I looked at > >> the MIPI D-PHY spec, and I cried a little. > >> > >> Long story short, I think usleep_range(10, 50) will be fine. > > > > Note that I really want to see a comment next to every delay like this > > documenting the actual hardware requirement, if the delay used by the > > code doesn't 100% match it. > > Our spec says, "Wait for 2us for ULPS to complete". That's a simplistic > view wrt D-PHY, and our code doesn't even match the spec. Hence the > tears. Want to propose a wording for the comment so we can apply this > change, without going for a full rewrite of the sequence? > is that suitable or am I overdoing it ? - usleep_range(2, 3); + /* delay for at least 2us - relaxed to 10-50 to allow + * hrtimer subsystem to optimize uncritical timer handling + */ + usleep_range(10, 50); thx! hofrat
On Thu, Dec 15, 2016 at 10:27:59AM +0100, Daniel Vetter wrote: > On Thu, Dec 15, 2016 at 10:25:19AM +0100, Daniel Vetter wrote: > > On Thu, Dec 15, 2016 at 11:08:49AM +0200, Jani Nikula wrote: > > > On Thu, 15 Dec 2016, Nicholas Mc Guire <hofrat@osadl.org> wrote: > > > > Even on fast systems a 2 microsecond delay is most likely more efficient > > > > as a busy-wait loop. The overhead of a hrtimer does not seem warranted - > > > > change this to a udelay(2). > > > > > > Similar concerns as in [1]. We don't need the accuracy of udelay() here, > > > so this boils down to which is the better use of CPU. We could probably > > > relax the max delay more if that was helpful. But I'm not immediately > > > sold on "is most likely more efficient" which sounds like a gut feeling. > > > > > > I'm sorry it's not clear in my other reply that I do appreciate > > > addressing incorrect/silly use of usleep_range(); I'm just not (yet) > > > convinced udelay() is the answer. > > > > So one reason why we unconditionally use *sleep variants is the > > might_sleep check. Because in the past people have used udelay and mdelay, > > those delays had to be increased a lot because hw, and at the same time > > someone added users of these functions to our irq helper, resulting in irq > > handling times measures in multiple ms. That's not good. > > > > So until someone can demonstrate that there's a real benefit (which let's > > be honest, for modeset code, will never be the case) I very highly prefer > > to use *sleep* functions. They prevent some silly things from happening by > > accident. Micro-optimizing modeset code and hampering maitainability in > > the process is not good. > > Also, the entire premise seems backwards: usleep_range is inefficient for > certain parameter ranges and better replaced with udelay. That makes > sense. > > But why exactly do we not fix udelay_range then, but instead do a cocci > job crawling through all the thousands of callers? Just fix usleep(_range) > to use udelay for very small values (and still keep the might_sleep check > ofc) if that's more efficient! its actually not thousands more like a few dozen: usleep_range(min,max) in linux-stable 4.9.0 1648 calls total 1488 pass numeric values only (90.29%) 27 min below 10us (1.81%) 40 min above 10ms (2.68%) min out of spec 4.50% 76 preprocessor constants (4.61%) 1 min below 10us (1.31%) 8 min above 10ms (10.52%) min out of spec 11.84% 85 expressions (5.15%) 1(0) min below 10us (1.50%)* 6(2) min above 10ms (7.50%)* min out of spec 9.0% Errors: 23 where min==max (1.39%) 0 where max < min (0.00%) Total: Bugs: 6.48%-10.70%* Crit: 3.09%-3.15%* (min < 10 and min==max and max < min) Detectable by coccinelle: Bugs: 74/103 (71.8%) Crit: 50/52 (96.1%) *based on estimats as runtime values not known. there is no usleep() as noted in Documentation/timers/timers-howto.txt - Why not usleep? On slower systems, (embedded, OR perhaps a speed- stepped PC!) the overhead of setting up the hrtimers for usleep *may* not be worth it. Such an evaluation will obviously depend on your specific situation, but it is something to be aware of. and it suggests to use different API for different ranges - sounds sane to me and seems to cover the needs of drivers. thx! hofrat
On Thu, Dec 15, 2016 at 11:51 AM, Nicholas Mc Guire <der.herr@hofr.at> wrote: > On Thu, Dec 15, 2016 at 10:27:59AM +0100, Daniel Vetter wrote: >> On Thu, Dec 15, 2016 at 10:25:19AM +0100, Daniel Vetter wrote: >> > On Thu, Dec 15, 2016 at 11:08:49AM +0200, Jani Nikula wrote: >> > > On Thu, 15 Dec 2016, Nicholas Mc Guire <hofrat@osadl.org> wrote: >> > > > Even on fast systems a 2 microsecond delay is most likely more efficient >> > > > as a busy-wait loop. The overhead of a hrtimer does not seem warranted - >> > > > change this to a udelay(2). >> > > >> > > Similar concerns as in [1]. We don't need the accuracy of udelay() here, >> > > so this boils down to which is the better use of CPU. We could probably >> > > relax the max delay more if that was helpful. But I'm not immediately >> > > sold on "is most likely more efficient" which sounds like a gut feeling. >> > > >> > > I'm sorry it's not clear in my other reply that I do appreciate >> > > addressing incorrect/silly use of usleep_range(); I'm just not (yet) >> > > convinced udelay() is the answer. >> > >> > So one reason why we unconditionally use *sleep variants is the >> > might_sleep check. Because in the past people have used udelay and mdelay, >> > those delays had to be increased a lot because hw, and at the same time >> > someone added users of these functions to our irq helper, resulting in irq >> > handling times measures in multiple ms. That's not good. >> > >> > So until someone can demonstrate that there's a real benefit (which let's >> > be honest, for modeset code, will never be the case) I very highly prefer >> > to use *sleep* functions. They prevent some silly things from happening by >> > accident. Micro-optimizing modeset code and hampering maitainability in >> > the process is not good. >> >> Also, the entire premise seems backwards: usleep_range is inefficient for >> certain parameter ranges and better replaced with udelay. That makes >> sense. >> >> But why exactly do we not fix udelay_range then, but instead do a cocci >> job crawling through all the thousands of callers? Just fix usleep(_range) >> to use udelay for very small values (and still keep the might_sleep check >> ofc) if that's more efficient! > > its actually not thousands more like a few dozen: There's 1k + usleep* calls you need to audit and teach people how to exactly use it. > usleep_range(min,max) in linux-stable 4.9.0 > > 1648 calls total > 1488 pass numeric values only (90.29%) > 27 min below 10us (1.81%) > 40 min above 10ms (2.68%) > min out of spec 4.50% > 76 preprocessor constants (4.61%) > 1 min below 10us (1.31%) > 8 min above 10ms (10.52%) > min out of spec 11.84% > 85 expressions (5.15%) > 1(0) min below 10us (1.50%)* > 6(2) min above 10ms (7.50%)* > min out of spec 9.0% > Errors: > 23 where min==max (1.39%) > 0 where max < min (0.00%) > > Total: > Bugs: 6.48%-10.70%* > Crit: 3.09%-3.15%* (min < 10 and min==max and max < min) > Detectable by coccinelle: > Bugs: 74/103 (71.8%) > Crit: 50/52 (96.1%) > > *based on estimats as runtime values not known. > > > there is no usleep() as noted in Documentation/timers/timers-howto.txt > - Why not usleep? > On slower systems, (embedded, OR perhaps a speed- > stepped PC!) the overhead of setting up the hrtimers > for usleep *may* not be worth it. Such an evaluation > will obviously depend on your specific situation, but > it is something to be aware of. > > and it suggests to use different API for different ranges - sounds sane > to me and seems to cover the needs of drivers. git grep usleep seems to disagree, at least I see a bunch of usleep() calls. And per Rusty's api usability scale the ultimate fucntion is dwim(). It just feels to me like pushing such trade-off decisions to drivers is bad design because a) the tradeoffs depend upon the cpu you're running on b) driver writers are pretty good at getting such tradeoffs wrong in general. Aiming for a more dwim()-like api for this here makes sense, instead of an eternal fight to correct drivers that get it wrong all the time. -Daniel
On Thu, 15 Dec 2016, Nicholas Mc Guire <der.herr@hofr.at> wrote: > On Thu, Dec 15, 2016 at 12:20:01PM +0200, Jani Nikula wrote: >> On Thu, 15 Dec 2016, Ville Syrjälä <ville.syrjala@linux.intel.com> wrote: >> > On Thu, Dec 15, 2016 at 11:52:34AM +0200, Jani Nikula wrote: >> >> On Thu, 15 Dec 2016, Nicholas Mc Guire <der.herr@hofr.at> wrote: >> >> > On Thu, Dec 15, 2016 at 11:08:49AM +0200, Jani Nikula wrote: >> >> >> On Thu, 15 Dec 2016, Nicholas Mc Guire <hofrat@osadl.org> wrote: >> >> >> > Even on fast systems a 2 microsecond delay is most likely more efficient >> >> >> > as a busy-wait loop. The overhead of a hrtimer does not seem warranted - >> >> >> > change this to a udelay(2). >> >> >> >> >> >> Similar concerns as in [1]. We don't need the accuracy of udelay() here, >> >> >> so this boils down to which is the better use of CPU. We could probably >> >> >> relax the max delay more if that was helpful. But I'm not immediately >> >> >> sold on "is most likely more efficient" which sounds like a gut feeling. >> >> >> >> >> >> I'm sorry it's not clear in my other reply that I do appreciate >> >> >> addressing incorrect/silly use of usleep_range(); I'm just not (yet) >> >> >> convinced udelay() is the answer. >> >> > >> >> > if the delay is not critical and all that is needed >> >> > is an assurance that it is greater than X us then >> >> > usleep_range is fine with a relaxed limit. >> >> > So from what you wrote my patch proposal is wrong - the >> >> > udelay() is not the way to got. >> >> > My intent is to get rid of very small usleep_range() cases >> >> > so if usleep_range(20,50) causes no issues with this driver >> >> > and does not induce any performance penalty then that would >> >> > be the way to go I think. >> >> >> >> Okay, so I looked at the code, and I looked at our spec, and I looked at >> >> the MIPI D-PHY spec, and I cried a little. >> >> >> >> Long story short, I think usleep_range(10, 50) will be fine. >> > >> > Note that I really want to see a comment next to every delay like this >> > documenting the actual hardware requirement, if the delay used by the >> > code doesn't 100% match it. >> >> Our spec says, "Wait for 2us for ULPS to complete". That's a simplistic >> view wrt D-PHY, and our code doesn't even match the spec. Hence the >> tears. Want to propose a wording for the comment so we can apply this >> change, without going for a full rewrite of the sequence? >> > is that suitable or am I overdoing it ? > > - usleep_range(2, 3); > + /* delay for at least 2us - relaxed to 10-50 to allow > + * hrtimer subsystem to optimize uncritical timer handling > + */ > + usleep_range(10, 50); I'm fine with that. Or maybe just make it "relaxed to allow" without the values. Jani. > > thx! > hofrat
On Thu, Dec 15, 2016 at 10:34:00AM +0000, Nicholas Mc Guire wrote: > On Thu, Dec 15, 2016 at 12:20:01PM +0200, Jani Nikula wrote: > > On Thu, 15 Dec 2016, Ville Syrjälä <ville.syrjala@linux.intel.com> wrote: > > > On Thu, Dec 15, 2016 at 11:52:34AM +0200, Jani Nikula wrote: > > >> On Thu, 15 Dec 2016, Nicholas Mc Guire <der.herr@hofr.at> wrote: > > >> > On Thu, Dec 15, 2016 at 11:08:49AM +0200, Jani Nikula wrote: > > >> >> On Thu, 15 Dec 2016, Nicholas Mc Guire <hofrat@osadl.org> wrote: > > >> >> > Even on fast systems a 2 microsecond delay is most likely more efficient > > >> >> > as a busy-wait loop. The overhead of a hrtimer does not seem warranted - > > >> >> > change this to a udelay(2). > > >> >> > > >> >> Similar concerns as in [1]. We don't need the accuracy of udelay() here, > > >> >> so this boils down to which is the better use of CPU. We could probably > > >> >> relax the max delay more if that was helpful. But I'm not immediately > > >> >> sold on "is most likely more efficient" which sounds like a gut feeling. > > >> >> > > >> >> I'm sorry it's not clear in my other reply that I do appreciate > > >> >> addressing incorrect/silly use of usleep_range(); I'm just not (yet) > > >> >> convinced udelay() is the answer. > > >> > > > >> > if the delay is not critical and all that is needed > > >> > is an assurance that it is greater than X us then > > >> > usleep_range is fine with a relaxed limit. > > >> > So from what you wrote my patch proposal is wrong - the > > >> > udelay() is not the way to got. > > >> > My intent is to get rid of very small usleep_range() cases > > >> > so if usleep_range(20,50) causes no issues with this driver > > >> > and does not induce any performance penalty then that would > > >> > be the way to go I think. > > >> > > >> Okay, so I looked at the code, and I looked at our spec, and I looked at > > >> the MIPI D-PHY spec, and I cried a little. > > >> > > >> Long story short, I think usleep_range(10, 50) will be fine. > > > > > > Note that I really want to see a comment next to every delay like this > > > documenting the actual hardware requirement, if the delay used by the > > > code doesn't 100% match it. > > > > Our spec says, "Wait for 2us for ULPS to complete". That's a simplistic > > view wrt D-PHY, and our code doesn't even match the spec. Hence the > > tears. Want to propose a wording for the comment so we can apply this > > change, without going for a full rewrite of the sequence? > > > is that suitable or am I overdoing it ? > > - usleep_range(2, 3); > + /* delay for at least 2us - relaxed to 10-50 to allow > + * hrtimer subsystem to optimize uncritical timer handling > + */ That's entirely too verbose IMO, and the reason for using usleep_range() is pretty obvious without spelling it out. All we really want to know is what the spec says is the minimum acceptable delay. > + usleep_range(10, 50); > > thx! > hofrat
diff --git a/drivers/gpu/drm/i915/intel_dsi.c b/drivers/gpu/drm/i915/intel_dsi.c index 5b72c50..19fe86b 100644 --- a/drivers/gpu/drm/i915/intel_dsi.c +++ b/drivers/gpu/drm/i915/intel_dsi.c @@ -379,7 +379,7 @@ static void bxt_dsi_device_ready(struct intel_encoder *encoder) val &= ~ULPS_STATE_MASK; val |= (ULPS_STATE_ENTER | DEVICE_READY); I915_WRITE(MIPI_DEVICE_READY(port), val); - usleep_range(2, 3); + udelay(2); /* 3. Exit ULPS */ val = I915_READ(MIPI_DEVICE_READY(port));
Even on fast systems a 2 microsecond delay is most likely more efficient as a busy-wait loop. The overhead of a hrtimer does not seem warranted - change this to a udelay(2). Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org> --- Problem found by coccinelle: Patch was compile tested with: x86_64_defconfig (implies CONFIG_DRM_I915) Patch is against 4.9.0 (localversion-next is next-20161214) drivers/gpu/drm/i915/intel_dsi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)