Message ID | 1412126893-15796-2-git-send-email-mitchelh@codeaurora.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Tuesday 30 September 2014 18:28:12 Mitchel Humpherys wrote: > + */ > +#define readl_poll_timeout(addr, val, cond, sleep_us, timeout_us) \ > +({ \ > + ktime_t timeout = ktime_add_us(ktime_get(), timeout_us); \ > + might_sleep_if(timeout_us); \ Does it make sense to call this with timeout_us = 0? > + for (;;) { \ > + (val) = readl(addr); \ > + if (cond) \ > + break; \ > + if (timeout_us && ktime_compare(ktime_get(), timeout) > 0) { \ > + (val) = readl(addr); \ > + break; \ > + } \ > + if (sleep_us) \ > + usleep_range(DIV_ROUND_UP(sleep_us, 4), sleep_us); \ > + } \ > + (cond) ? 0 : -ETIMEDOUT; \ > +}) I think it would be better to tie the 'range' argument to the timeout. Also doing a division seems expensive here. > +/** > + * readl_poll_timeout_atomic - Periodically poll an address until a condition is met or a timeout occurs > + * @addr: Address to poll > + * @val: Variable to read the value into > + * @cond: Break condition (usually involving @val) > + * @max_reads: Maximum number of reads before giving up > + * @time_between_us: Time to udelay() between successive reads > + * > + * Returns 0 on success and -ETIMEDOUT upon a timeout. > + */ > +#define readl_poll_timeout_atomic(addr, val, cond, max_reads, time_between_us) \ > +({ \ > + int count; \ > + for (count = (max_reads); count > 0; count--) { \ > + (val) = readl(addr); \ > + if (cond) \ > + break; \ > + udelay(time_between_us); \ > + } \ > + (cond) ? 0 : -ETIMEDOUT; \ > +}) udelay has a large variability, I think it would be better to also use ktime_compare here and make the interface the same as the other one. You might want to add a warning if someone tries to pass more than a few microseconds as the timeout. More generally speaking, using 'readl' seems fairly specific. I suspect that we'd have to add the entire range of accessors over time if this catches on: readb, readw, readq, readb_relaxed, readw_relaxed, readl_relaxed, readq_relaxed, ioread8, ioread16, ioread16be, ioread32, ioread32be, inb, inb_p, inw, inw_p, inw, inl, inl_p, and possibly more of those. Would it make sense to pass that operation as an argument? Arnd
On Wed, Oct 01 2014 at 01:25:33 AM, Arnd Bergmann <arnd@arndb.de> wrote: > On Tuesday 30 September 2014 18:28:12 Mitchel Humpherys wrote: >> + */ >> +#define readl_poll_timeout(addr, val, cond, sleep_us, timeout_us) \ >> +({ \ >> + ktime_t timeout = ktime_add_us(ktime_get(), timeout_us); \ >> + might_sleep_if(timeout_us); \ > > Does it make sense to call this with timeout_us = 0? Yes, the idea there being to "never timeout". That mode should, of course, be used with extreme caution since never timing out is not really "playing nice" with the system. > >> + for (;;) { \ >> + (val) = readl(addr); \ >> + if (cond) \ >> + break; \ >> + if (timeout_us && ktime_compare(ktime_get(), timeout) > 0) { \ >> + (val) = readl(addr); \ >> + break; \ >> + } \ >> + if (sleep_us) \ >> + usleep_range(DIV_ROUND_UP(sleep_us, 4), sleep_us); \ >> + } \ >> + (cond) ? 0 : -ETIMEDOUT; \ >> +}) > > I think it would be better to tie the 'range' argument to the timeout. Also > doing a division seems expensive here. We may have cases where the HW spec says something like "the foo widget response time is on average 5us, with a worst case of 100us." In such a case we may want to poll the bit very frequently to optimize for the common case of a very fast lock time, but we may not want to error out due to a timeout unless we've been waiting 100us. Regarding the division, for the overwhelmingly common case where the user of the API passes in a constant for sleep_us the compiler optimizes out this calculation altogether and just sticks the final result in (I verified this with gcc 4.9 and the kernel build system's built-in support for generating .s files). Conveying semantic meaning by using `DIV_ROUND_UP' is nice but if you feel strongly about it we can make this a shift instead. > >> +/** >> + * readl_poll_timeout_atomic - Periodically poll an address until a condition is met or a timeout occurs >> + * @addr: Address to poll >> + * @val: Variable to read the value into >> + * @cond: Break condition (usually involving @val) >> + * @max_reads: Maximum number of reads before giving up >> + * @time_between_us: Time to udelay() between successive reads >> + * >> + * Returns 0 on success and -ETIMEDOUT upon a timeout. >> + */ >> +#define readl_poll_timeout_atomic(addr, val, cond, max_reads, time_between_us) \ >> +({ \ >> + int count; \ >> + for (count = (max_reads); count > 0; count--) { \ >> + (val) = readl(addr); \ >> + if (cond) \ >> + break; \ >> + udelay(time_between_us); \ >> + } \ >> + (cond) ? 0 : -ETIMEDOUT; \ >> +}) > > udelay has a large variability, I think it would be better to also use > ktime_compare here and make the interface the same as the other one. > You might want to add a warning if someone tries to pass more than a few > microseconds as the timeout. Sounds good, will update in v5. > > More generally speaking, using 'readl' seems fairly specific. I suspect > that we'd have to add the entire range of accessors over time if this > catches on: readb, readw, readq, readb_relaxed, readw_relaxed, readl_relaxed, > readq_relaxed, ioread8, ioread16, ioread16be, ioread32, ioread32be, > inb, inb_p, inw, inw_p, inw, inl, inl_p, and possibly more of those. > > Would it make sense to pass that operation as an argument? Sure, we'll do that in v5 as well. -Mitch
On Tuesday 07 October 2014 18:47:59 Mitchel Humpherys wrote: > On Wed, Oct 01 2014 at 01:25:33 AM, Arnd Bergmann <arnd@arndb.de> wrote: > > On Tuesday 30 September 2014 18:28:12 Mitchel Humpherys wrote: > >> + */ > >> +#define readl_poll_timeout(addr, val, cond, sleep_us, timeout_us) \ > >> +({ \ > >> + ktime_t timeout = ktime_add_us(ktime_get(), timeout_us); \ > >> + might_sleep_if(timeout_us); \ > > > > Does it make sense to call this with timeout_us = 0? > > Yes, the idea there being to "never timeout". That mode should, of > course, be used with extreme caution since never timing out is not > really "playing nice" with the system. But then you certainly still 'might_sleep' here. The might_sleep_if(timeout_us) line suggests that it won't sleep, but that isn't the case. > > > >> + for (;;) { \ > >> + (val) = readl(addr); \ > >> + if (cond) \ > >> + break; \ > >> + if (timeout_us && ktime_compare(ktime_get(), timeout) > 0) { \ > >> + (val) = readl(addr); \ > >> + break; \ > >> + } \ > >> + if (sleep_us) \ > >> + usleep_range(DIV_ROUND_UP(sleep_us, 4), sleep_us); \ > >> + } \ > >> + (cond) ? 0 : -ETIMEDOUT; \ > >> +}) > > > > I think it would be better to tie the 'range' argument to the timeout. Also > > doing a division seems expensive here. > > We may have cases where the HW spec says something like "the foo widget > response time is on average 5us, with a worst case of 100us." In such a > case we may want to poll the bit very frequently to optimize for the > common case of a very fast lock time, but we may not want to error out > due to a timeout unless we've been waiting 100us. Ok. > Regarding the division, for the overwhelmingly common case where the > user of the API passes in a constant for sleep_us the compiler optimizes > out this calculation altogether and just sticks the final result in (I > verified this with gcc 4.9 and the kernel build system's built-in > support for generating .s files). Conveying semantic meaning by using > `DIV_ROUND_UP' is nice but if you feel strongly about it we can make > this a shift instead. The more important question is probably if you want to keep the _ROUND_UP part. If that's not significant, I think a shift would be better. Arnd
On Tue, Oct 07 2014 at 06:47:59 PM, Mitchel Humpherys <mitchelh@codeaurora.org> wrote: >>> +#define readl_poll_timeout_atomic(addr, val, cond, max_reads, time_between_us) \ >>> +({ \ >>> + int count; \ >>> + for (count = (max_reads); count > 0; count--) { \ >>> + (val) = readl(addr); \ >>> + if (cond) \ >>> + break; \ >>> + udelay(time_between_us); \ >>> + } \ >>> + (cond) ? 0 : -ETIMEDOUT; \ >>> +}) >> >> udelay has a large variability, I think it would be better to also use >> ktime_compare here and make the interface the same as the other one. >> You might want to add a warning if someone tries to pass more than a few >> microseconds as the timeout. > > Sounds good, will update in v5. Except I'll probably hold off on adding a warning about udelay since udelay already includes a "warning" (a compile error, actually) when exceedingly large delays are requested. -Mitch
On Wed, Oct 08 2014 at 06:40:46 AM, Arnd Bergmann <arnd@arndb.de> wrote: > On Tuesday 07 October 2014 18:47:59 Mitchel Humpherys wrote: >> On Wed, Oct 01 2014 at 01:25:33 AM, Arnd Bergmann <arnd@arndb.de> wrote: >> > On Tuesday 30 September 2014 18:28:12 Mitchel Humpherys wrote: >> >> + */ >> >> +#define readl_poll_timeout(addr, val, cond, sleep_us, timeout_us) \ >> >> +({ \ >> >> + ktime_t timeout = ktime_add_us(ktime_get(), timeout_us); \ >> >> + might_sleep_if(timeout_us); \ >> > >> > Does it make sense to call this with timeout_us = 0? >> >> Yes, the idea there being to "never timeout". That mode should, of >> course, be used with extreme caution since never timing out is not >> really "playing nice" with the system. > > But then you certainly still 'might_sleep' here. The > might_sleep_if(timeout_us) line suggests that it won't sleep, but > that isn't the case. Yes looks like that was actually a bug. Should have been might_sleep_if()'ing on sleep_us. This is fixed in the v5 I just sent out. [...] >> Regarding the division, for the overwhelmingly common case where the >> user of the API passes in a constant for sleep_us the compiler optimizes >> out this calculation altogether and just sticks the final result in (I >> verified this with gcc 4.9 and the kernel build system's built-in >> support for generating .s files). Conveying semantic meaning by using >> `DIV_ROUND_UP' is nice but if you feel strongly about it we can make >> this a shift instead. > > The more important question is probably if you want to keep the _ROUND_UP > part. If that's not significant, I think a shift would be better. If we drop the _ROUND_UP then passing a sleep_us <= 4 would result in a minimum sleep time of 0, so we'd be polling a lot faster than the user had expected. -Mitch
On Friday 10 October 2014 12:44:45 Mitchel Humpherys wrote: > >> Regarding the division, for the overwhelmingly common case where the > >> user of the API passes in a constant for sleep_us the compiler optimizes > >> out this calculation altogether and just sticks the final result in (I > >> verified this with gcc 4.9 and the kernel build system's built-in > >> support for generating .s files). Conveying semantic meaning by using > >> `DIV_ROUND_UP' is nice but if you feel strongly about it we can make > >> this a shift instead. > > > > The more important question is probably if you want to keep the _ROUND_UP > > part. If that's not significant, I think a shift would be better. > > If we drop the _ROUND_UP then passing a sleep_us <= 4 would result in a > minimum sleep time of 0, so we'd be polling a lot faster than the user > had expected. How about changing the semantics to sleep at least the sleep_us time, and at most four times that? This would turn the expensive division into a multiplication and avoid the need for rounding. If there are important reasons to keep doing the division, you could instead use '(sleep_us >> 4) + 1', which is also very cheap to compute and avoids the problem you mention. Arnd
On Fri, Oct 10 2014 at 12:50:23 PM, Arnd Bergmann <arnd@arndb.de> wrote: > On Friday 10 October 2014 12:44:45 Mitchel Humpherys wrote: >> >> Regarding the division, for the overwhelmingly common case where the >> >> user of the API passes in a constant for sleep_us the compiler optimizes >> >> out this calculation altogether and just sticks the final result in (I >> >> verified this with gcc 4.9 and the kernel build system's built-in >> >> support for generating .s files). Conveying semantic meaning by using >> >> `DIV_ROUND_UP' is nice but if you feel strongly about it we can make >> >> this a shift instead. >> > >> > The more important question is probably if you want to keep the _ROUND_UP >> > part. If that's not significant, I think a shift would be better. >> >> If we drop the _ROUND_UP then passing a sleep_us <= 4 would result in a >> minimum sleep time of 0, so we'd be polling a lot faster than the user >> had expected. > > How about changing the semantics to sleep at least the sleep_us time, > and at most four times that? This would turn the expensive division into > a multiplication and avoid the need for rounding. We already have a bunch of code using this and I'm not sure what the effect would be on the system by changing this. It would probably be negligible but saving a couple of instructions hardly seems like justification for a change... More importantly, I think users would rather poll their register a little quicker than they asked, rather than slower. > If there are important reasons to keep doing the division, you could > instead use '(sleep_us >> 4) + 1', which is also very cheap to compute > and avoids the problem you mention. But I think you meant `(sleep_us >> 2) + 1', right? Incidentally, that illustrates the benefit of the semantic clarity provided by explicitly dividing :). In any case, I'm happy to go with a shift here for v6. -Mitch
diff --git a/include/linux/iopoll.h b/include/linux/iopoll.h new file mode 100644 index 0000000000..594b0d4f03 --- /dev/null +++ b/include/linux/iopoll.h @@ -0,0 +1,77 @@ +/* + * Copyright (c) 2012-2014 The Linux Foundation. All rights reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 and + * only version 2 as published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + */ + +#ifndef _LINUX_IOPOLL_H +#define _LINUX_IOPOLL_H + +#include <linux/kernel.h> +#include <linux/types.h> +#include <linux/hrtimer.h> +#include <linux/delay.h> +#include <linux/errno.h> +#include <linux/io.h> + +/** + * readl_poll_timeout - Periodically poll an address until a condition is met or a timeout occurs + * @addr: Address to poll + * @val: Variable to read the value into + * @cond: Break condition (usually involving @val) + * @sleep_us: Maximum time to sleep between reads in us (0 tight-loops) + * @timeout_us: Timeout in us, 0 means never timeout + * + * Returns 0 on success and -ETIMEDOUT upon a timeout. In either + * case, the last read value at @addr is stored in @val. Must not + * be called from atomic context if sleep_us or timeout_us are used. + */ +#define readl_poll_timeout(addr, val, cond, sleep_us, timeout_us) \ +({ \ + ktime_t timeout = ktime_add_us(ktime_get(), timeout_us); \ + might_sleep_if(timeout_us); \ + for (;;) { \ + (val) = readl(addr); \ + if (cond) \ + break; \ + if (timeout_us && ktime_compare(ktime_get(), timeout) > 0) { \ + (val) = readl(addr); \ + break; \ + } \ + if (sleep_us) \ + usleep_range(DIV_ROUND_UP(sleep_us, 4), sleep_us); \ + } \ + (cond) ? 0 : -ETIMEDOUT; \ +}) + +/** + * readl_poll_timeout_atomic - Periodically poll an address until a condition is met or a timeout occurs + * @addr: Address to poll + * @val: Variable to read the value into + * @cond: Break condition (usually involving @val) + * @max_reads: Maximum number of reads before giving up + * @time_between_us: Time to udelay() between successive reads + * + * Returns 0 on success and -ETIMEDOUT upon a timeout. + */ +#define readl_poll_timeout_atomic(addr, val, cond, max_reads, time_between_us) \ +({ \ + int count; \ + for (count = (max_reads); count > 0; count--) { \ + (val) = readl(addr); \ + if (cond) \ + break; \ + udelay(time_between_us); \ + } \ + (cond) ? 0 : -ETIMEDOUT; \ +}) + +#endif /* _LINUX_IOPOLL_H */