Message ID | 20221029005400.2712577-1-linux@roeck-us.net (mailing list archive) |
---|---|
State | Handled Elsewhere |
Series | rtc: cros-ec: Limit RTC alarm range if needed |

On Fri, Oct 28, 2022 at 05:54:00PM -0700, Guenter Roeck wrote:
> RTC chips on some older Chromebooks can only handle alarms less than 24
> hours in the future. Attempts to set an alarm beyond that range fail.
> The most severe impact of this limitation is that suspend requests fail
> if alarmtimer_suspend() tries to set an alarm for more than 24 hours
> in the future.
>
> Try to set the real-time alarm to just below 24 hours if setting it to
> a larger value fails to work around the problem. While not perfect, it
> is better than just failing the call. A similar workaround is already
> implemented in the rtc-tps6586x driver.
>
> Drop error messages in cros_ec_rtc_get() and cros_ec_rtc_set() since the
> calling code also logs an error and to avoid spurious error messages if
> setting the alarm ultimately succeeds.
>
> Cc: Brian Norris <briannorris@chromium.org>
> Signed-off-by: Guenter Roeck <linux@roeck-us.net>

Reviewed-by: Brian Norris <briannorris@chromium.org>
Tested-by: Brian Norris <briannorris@chromium.org>

On Fri, Oct 28, 2022 at 05:54:00PM -0700, Guenter Roeck wrote:
> Drop error messages in cros_ec_rtc_get() and cros_ec_rtc_set() since the
> calling code also logs an error and to avoid spurious error messages if
> setting the alarm ultimately succeeds.

It only retries for cros_ec_rtc_set(). cros_ec_rtc_get() doesn't emit
spurious error messages. cros_ec_rtc_get() could preserve the error log;
cros_ec_rtc_set() could change from using dev_err() to dev_warn() since
cros_ec_rtc_set_alarm() calls dev_err() if cros_ec_rtc_set() fails.

But this is quite a nitpick anyway.

> Cc: Brian Norris <briannorris@chromium.org>
> Signed-off-by: Guenter Roeck <linux@roeck-us.net>

Reviewed-by: Tzung-Bi Shih <tzungbi@kernel.org>

On Mon, Oct 31, 2022 at 11:26:44AM +0800, Tzung-Bi Shih wrote:
> On Fri, Oct 28, 2022 at 05:54:00PM -0700, Guenter Roeck wrote:
> > Drop error messages in cros_ec_rtc_get() and cros_ec_rtc_set() since the
> > calling code also logs an error and to avoid spurious error messages if
> > setting the alarm ultimately succeeds.
>
> It only retries for cros_ec_rtc_set(). cros_ec_rtc_get() doesn't emit
> spurious error messages.

All of cros_ec_rtc_get()'s callers were also logging the same message.
So it was redundant.

I think the general strategy here was to log the error(s) in callers
(last point before we "exit" the driver), to have the best chance at
context-relevant error messages, or ignoring them where proper.

It's already a bit dubious to log kernel messages at all in response to
normal sysfs operations. We probably want them in some cases, when
things are particularly unexpected, but it shouldn't be a regular
occurrence, and we certainly don't need *two* log lines for each error.

Technically, if one wants to be super-nitpicky about one purpose per
patch, then maybe a patch to trim the logging, and a patch to fix the
alarm range issues...

...but I think that would be a little silly, and perhaps even harmful.
They are related concerns that should be patched (and probably
backported) together.

Brian

Hello,

On 28/10/2022 17:54:00-0700, Guenter Roeck wrote:
> RTC chips on some older Chromebooks can only handle alarms less than 24
> hours in the future. Attempts to set an alarm beyond that range fail.
> The most severe impact of this limitation is that suspend requests fail
> if alarmtimer_suspend() tries to set an alarm for more than 24 hours
> in the future.
>
> Try to set the real-time alarm to just below 24 hours if setting it to
> a larger value fails to work around the problem. While not perfect, it
> is better than just failing the call. A similar workaround is already
> implemented in the rtc-tps6586x driver.

I'm not super convinced this is actually better than failing the call
because you are implementing policy in the driver which is bad from a
user point of view. It would be way better to return -ERANGE and let
userspace select a better alarm time.

Do you have to know in advance which are the "older" chromebooks that
are affected?

>
> Drop error messages in cros_ec_rtc_get() and cros_ec_rtc_set() since the
> calling code also logs an error and to avoid spurious error messages if
> setting the alarm ultimately succeeds.
>
> Cc: Brian Norris <briannorris@chromium.org>
> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
> ---
>  drivers/rtc/rtc-cros-ec.c | 35 ++++++++++++++++++++---------------
>  1 file changed, 20 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/rtc/rtc-cros-ec.c b/drivers/rtc/rtc-cros-ec.c
> index 887f5193e253..a3ec066d8066 100644
> --- a/drivers/rtc/rtc-cros-ec.c
> +++ b/drivers/rtc/rtc-cros-ec.c
> @@ -14,6 +14,8 @@
>
>  #define DRV_NAME	"cros-ec-rtc"
>
> +#define SECS_PER_DAY	(24 * 60 * 60)
> +
>  /**
>   * struct cros_ec_rtc - Driver data for EC RTC
>   *
> @@ -43,13 +45,8 @@ static int cros_ec_rtc_get(struct cros_ec_device *cros_ec, u32 command,
>  	msg.msg.insize = sizeof(msg.data);
>
>  	ret = cros_ec_cmd_xfer_status(cros_ec, &msg.msg);
> -	if (ret < 0) {
> -		dev_err(cros_ec->dev,
> -			"error getting %s from EC: %d\n",
> -			command == EC_CMD_RTC_GET_VALUE ? "time" : "alarm",
> -			ret);
> +	if (ret < 0)
>  		return ret;
> -	}
>
>  	*response = msg.data.time;
>
> @@ -59,7 +56,7 @@ static int cros_ec_rtc_get(struct cros_ec_device *cros_ec, u32 command,
>  static int cros_ec_rtc_set(struct cros_ec_device *cros_ec, u32 command,
>  			   u32 param)
>  {
> -	int ret = 0;
> +	int ret;
>  	struct {
>  		struct cros_ec_command msg;
>  		struct ec_response_rtc data;
> @@ -71,13 +68,8 @@ static int cros_ec_rtc_set(struct cros_ec_device *cros_ec, u32 command,
>  	msg.data.time = param;
>
>  	ret = cros_ec_cmd_xfer_status(cros_ec, &msg.msg);
> -	if (ret < 0) {
> -		dev_err(cros_ec->dev, "error setting %s on EC: %d\n",
> -			command == EC_CMD_RTC_SET_VALUE ? "time" : "alarm",
> -			ret);
> +	if (ret < 0)
>  		return ret;
> -	}
> -
>  	return 0;
>  }
>
> @@ -190,8 +182,21 @@ static int cros_ec_rtc_set_alarm(struct device *dev, struct rtc_wkalrm *alrm)
>
>  	ret = cros_ec_rtc_set(cros_ec, EC_CMD_RTC_SET_ALARM, alarm_offset);
>  	if (ret < 0) {
> -		dev_err(dev, "error setting alarm: %d\n", ret);
> -		return ret;
> +		if (ret == -EINVAL && alarm_offset >= SECS_PER_DAY) {
> +			/*
> +			 * RTC chips on some older Chromebooks can only handle
> +			 * alarms up to 24h in the future. Try to set an alarm
> +			 * below that limit to avoid suspend failures.
> +			 */
> +			ret = cros_ec_rtc_set(cros_ec, EC_CMD_RTC_SET_ALARM,
> +					      SECS_PER_DAY - 1);
> +		}
> +
> +		if (ret < 0) {
> +			dev_err(dev, "error setting alarm in %u seconds: %d\n",
> +				alarm_offset, ret);
> +			return ret;
> +		}
>  	}
>
>  	return 0;
> --
> 2.36.2
>

CC kernel/time/alarmtimer.c maintainers

On Mon, Oct 31, 2022 at 06:10:53PM +0100, Alexandre Belloni wrote:
> On 28/10/2022 17:54:00-0700, Guenter Roeck wrote:
> > RTC chips on some older Chromebooks can only handle alarms less than 24
> > hours in the future. Attempts to set an alarm beyond that range fail.
> > The most severe impact of this limitation is that suspend requests fail
> > if alarmtimer_suspend() tries to set an alarm for more than 24 hours
> > in the future.
> >
> > Try to set the real-time alarm to just below 24 hours if setting it to
> > a larger value fails to work around the problem. While not perfect, it
> > is better than just failing the call. A similar workaround is already
> > implemented in the rtc-tps6586x driver.
>
> I'm not super convinced this is actually better than failing the call
> because you are implementing policy in the driver which is bad from a
> user point of view. It would be way better to return -ERANGE and let
> userspace select a better alarm time.

There is no way to signal user space. alarmtimer_suspend() is doing this
on behalf of CLOCK_BOOTTIME_ALARM or CLOCK_REALTIME_ALARM timers, which
were set long ago. We could possibly figure out some way to change the
clock API to signal some kind of error back to the timer handlers, but
that seems destined to be overly complex and not really help anyone
(stable ABI, etc.). The right answer for alarmtimer is to just wake up a
little early, IMO. (And failing alarmtimer_suspend() is Bad.)

I think Guenter considered some alternative change to teach
drivers/rtc/* and alarmtimer_suspend() to agree on an error code
(ERANGE? or EDOM?) to do some automatic backoff there. But given the
existing example (rtc-tps6586x) and the inconsistent use of error codes
in drivers/rtc/, this seemed just as good of an option to me.

But if we want to shave more yaks, then we'll have a more complex /
riskier patch set and a harder time backporting the fix. That's OK too.

Brian
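
The retry that the patch adds to cros_ec_rtc_set_alarm() can be tried out in isolation. What follows is a minimal userspace sketch, not the driver code: mock_ec_set_alarm() is a hypothetical stand-in for an EC that rejects any offset of 24 hours or more, and set_alarm_with_fallback() is an illustrative name that mirrors the retry condition from the patch.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

#define SECS_PER_DAY (24u * 60u * 60u)

/*
 * Hypothetical EC stand-in (an assumption for illustration): it rejects
 * alarm offsets of 24h or more with -EINVAL and records accepted ones.
 */
static int mock_ec_set_alarm(uint32_t offset, uint32_t *programmed)
{
	if (offset >= SECS_PER_DAY)
		return -EINVAL;
	*programmed = offset;
	return 0;
}

/*
 * Same shape as the patched cros_ec_rtc_set_alarm(): if a long alarm is
 * rejected with -EINVAL, retry once with an offset just below 24h, and
 * log only if the final attempt failed.
 */
static int set_alarm_with_fallback(uint32_t offset, uint32_t *programmed)
{
	int ret = mock_ec_set_alarm(offset, programmed);

	if (ret == -EINVAL && offset >= SECS_PER_DAY)
		ret = mock_ec_set_alarm(SECS_PER_DAY - 1, programmed);

	if (ret < 0)
		fprintf(stderr, "error setting alarm in %u seconds: %d\n",
			(unsigned int)offset, ret);
	return ret;
}
```

With this mock, a request for 100000 seconds ends up programmed as 86399 seconds, which is the "wake up a little early" behaviour discussed above; a request below the limit is passed through unchanged.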

On Mon, Oct 31, 2022 at 06:10:53PM +0100, Alexandre Belloni wrote:
> Hello,
>
> On 28/10/2022 17:54:00-0700, Guenter Roeck wrote:
> > RTC chips on some older Chromebooks can only handle alarms less than 24
> > hours in the future. Attempts to set an alarm beyond that range fail.
> > The most severe impact of this limitation is that suspend requests fail
> > if alarmtimer_suspend() tries to set an alarm for more than 24 hours
> > in the future.
> >
> > Try to set the real-time alarm to just below 24 hours if setting it to
> > a larger value fails to work around the problem. While not perfect, it
> > is better than just failing the call. A similar workaround is already
> > implemented in the rtc-tps6586x driver.
>
> I'm not super convinced this is actually better than failing the call
> because you are implementing policy in the driver which is bad from a
> user point of view. It would be way better to return -ERANGE and let
> userspace select a better alarm time.

The failing call is from alarmtimer_suspend() which is called during suspend.
It is not from userspace, and userspace has no chance to intervene.

It is also not just one userspace application which could request a large
timeout, it is a variety of userspace applications, and not all of them are
written by Google. Some are Android applications. I don't see how it would be
realistic to expect all such applications to fix their code (if that is even
possible - there might be an application which called sleep(100000) or
something equivalent, which works just fine as long as the system is not
suspended).

> Do you have to know in advance which are the "older" chromebooks that
> are affected?

Not sure I understand the question. Technically we know, but the cros_ec
rtc driver doesn't know because the EC doesn't have an API to report the
maximum timeout to the Linux driver. Even if that existed, it would not
help because the rtc API only supports absolute maximum clock values,
not clock offsets relative to the current time. So ultimately there is no
means for an RTC driver to tell the maximum possible alarm timer offset to
the RTC subsystem, and there is no means for a user such as
alarmtimer_suspend() to obtain the maximum time offset. Does that answer
your question ?

On a side note, I tried an alternate implementation by adding a retry into
alarmtimer_suspend(), where it would request a smaller timeout if the
requested timeout failed. I did not pursue/submit this since it seemed
hacky. To solve that problem, I'd rather discuss extending the RTC API
to provide a maximum offset to its users. Such a solution would probably
be desirable, but that is a longer-term effort and would not solve the
immediate problem.

If you see a better solution, please let me know. Again, the problem
is that alarmtimer_suspend() fails because the requested timeout is too
large.

Thanks,
Guenter

>
> >
> > Drop error messages in cros_ec_rtc_get() and cros_ec_rtc_set() since the
> > calling code also logs an error and to avoid spurious error messages if
> > setting the alarm ultimately succeeds.

On 31/10/2022 10:56:16-0700, Brian Norris wrote:
> CC kernel/time/alarmtimer.c maintainers
>
> On Mon, Oct 31, 2022 at 06:10:53PM +0100, Alexandre Belloni wrote:
> > On 28/10/2022 17:54:00-0700, Guenter Roeck wrote:
> > > RTC chips on some older Chromebooks can only handle alarms less than 24
> > > hours in the future. Attempts to set an alarm beyond that range fail.
> > > The most severe impact of this limitation is that suspend requests fail
> > > if alarmtimer_suspend() tries to set an alarm for more than 24 hours
> > > in the future.
> > >
> > > Try to set the real-time alarm to just below 24 hours if setting it to
> > > a larger value fails to work around the problem. While not perfect, it
> > > is better than just failing the call. A similar workaround is already
> > > implemented in the rtc-tps6586x driver.
> >
> > I'm not super convinced this is actually better than failing the call
> > because you are implementing policy in the driver which is bad from a
> > user point of view. It would be way better to return -ERANGE and let
> > userspace select a better alarm time.
>
> There is no way to signal user space. alarmtimer_suspend() is doing this
> on behalf of CLOCK_BOOTTIME_ALARM or CLOCK_REALTIME_ALARM timers, which
> were set long ago. We could possibly figure out some way to change the
> clock API to signal some kind of error back to the timer handlers, but
> that seems destined to be overly complex and not really help anyone
> (stable ABI, etc.). The right answer for alarmtimer is to just wake up a
> little early, IMO. (And failing alarmtimer_suspend() is Bad.)

But it is not the right answer from the RTC subsystem point of view,
because there are many use cases where you don't want to forcefully wake
up earlier: for example, you would unnecessarily deplete a battery, or
you may be able to select another RTC device which can wake you later
on.

> I think Guenter considered some alternative change to teach
> drivers/rtc/* and alarmtimer_suspend() to agree on an error code
> (ERANGE? or EDOM?) to do some automatic backoff there. But given the
> existing example (rtc-tps6586x) and the inconsistent use of error codes

The existing example predates actual maintenance of the subsystem. You
can't complain about inconsistent use of error codes (which I believe
has been cut down) and at the same time introduce inconsistent
behaviour.

> in drivers/rtc/, this seemed just as good of an option to me.
>
> But if we want to shave more yaks, then we'll have a more complex /
> riskier patch set and a harder time backporting the fix. That's OK too.
>

The issue with the current patch is that it forbids going for a better
solution because you will then take for granted that this driver can't
ever fail.

On 31/10/2022 11:19:13-0700, Guenter Roeck wrote:
> On Mon, Oct 31, 2022 at 06:10:53PM +0100, Alexandre Belloni wrote:
> > Hello,
> >
> > On 28/10/2022 17:54:00-0700, Guenter Roeck wrote:
> > > RTC chips on some older Chromebooks can only handle alarms less than 24
> > > hours in the future. Attempts to set an alarm beyond that range fail.
> > > The most severe impact of this limitation is that suspend requests fail
> > > if alarmtimer_suspend() tries to set an alarm for more than 24 hours
> > > in the future.
> > >
> > > Try to set the real-time alarm to just below 24 hours if setting it to
> > > a larger value fails to work around the problem. While not perfect, it
> > > is better than just failing the call. A similar workaround is already
> > > implemented in the rtc-tps6586x driver.
> >
> > I'm not super convinced this is actually better than failing the call
> > because you are implementing policy in the driver which is bad from a
> > user point of view. It would be way better to return -ERANGE and let
> > userspace select a better alarm time.
>
> The failing call is from alarmtimer_suspend() which is called during suspend.
> It is not from userspace, and userspace has no chance to intervene.
>
> It is also not just one userspace application which could request a large
> timeout, it is a variety of userspace applications, and not all of them are
> written by Google. Some are Android applications. I don't see how it would be
> realistic to expect all such applications to fix their code (if that is even
> possible - there might be an application which called sleep(100000) or
> something equivalent, which works just fine as long as the system is not
> suspended).
>
> > Do you have to know in advance which are the "older" chromebooks that
> > are affected?
>
> Not sure I understand the question. Technically we know, but the cros_ec
> rtc driver doesn't know because the EC doesn't have an API to report the
> maximum timeout to the Linux driver. Even if that existed, it would not
> help because the rtc API only supports absolute maximum clock values,
> not clock offsets relative to the current time. So ultimately there is no
> means for an RTC driver to tell the maximum possible alarm timer offset to
> the RTC subsystem, and there is no means for a user such as
> alarmtimer_suspend() to obtain the maximum time offset. Does that answer
> your question ?

Yes, my question was missing a few words, sorry. I wanted to know if you
had *a way* to know.

>
> On a side note, I tried an alternate implementation by adding a retry into
> alarmtimer_suspend(), where it would request a smaller timeout if the
> requested timeout failed. I did not pursue/submit this since it seemed
> hacky. To solve that problem, I'd rather discuss extending the RTC API
> to provide a maximum offset to its users. Such a solution would probably
> be desirable, but that is a longer-term effort and would not solve the
> immediate problem.

Yes, this is what I was aiming for. This is something that is indeed
missing in the RTC API and that I already thought about. But indeed, it
would be great to have a way to set the alarm range separately from the
time keeping range. This would indeed have to be a range relative to the
current time.

alarmtimer_suspend() can then get the allowed alarm range for the RTC,
and set the alarm to min(alarm range, timer value) and loop until the
timer has expired. Once we have this API, userspace can do the same.

I guess that ultimately, this doesn't help your driver unless you are
wanting to wakeup all the chromebooks at least once a day regardless of
their EC.

> If you see a better solution, please let me know. Again, the problem
> is that alarmtimer_suspend() fails because the requested timeout is too
> large.

On Mon, Oct 31, 2022 at 10:55:21PM +0100, Alexandre Belloni wrote:
>
> The issue with the current patch is that it forbids going for a better
> solution because you will then take for granted that this driver can't
> ever fail.
>

This is incorrect. My plan was to get this accepted first and then work
with those responsible on a cleaner solution (which is much more vague).
We can not wait for that cleaner solution now. There is nothing that
prevents us from taking our time to find a cleaner solution, and then to
change the code again to use it.

Guenter
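
The longer-term idea raised in this thread, where alarmtimer_suspend() learns the RTC's allowed alarm range, programs a bounded alarm and repeats until the timer has expired, can be sketched as plain arithmetic. This is only an illustration under stated assumptions: struct rtc_limits, next_alarm_offset() and wakeups_needed() are hypothetical names, no such RTC API existed at the time of the thread, and each alarm is clamped to the smaller of the remaining time and the limit on the assumption that this is what the proposal intends.

```c
#include <stdint.h>

/*
 * Hypothetical description of an RTC's alarm capability: the largest
 * alarm offset, in seconds from "now", that the hardware accepts.
 */
struct rtc_limits {
	uint64_t max_alarm_offset;
};

/* Offset to program next: the remaining time, clamped to the RTC limit. */
static uint64_t next_alarm_offset(const struct rtc_limits *lim,
				  uint64_t remaining)
{
	return remaining < lim->max_alarm_offset ? remaining
						 : lim->max_alarm_offset;
}

/*
 * Simulate the "set a bounded alarm, wake, repeat" loop and count how
 * many wakeups are needed before the timer really expires.
 */
static unsigned int wakeups_needed(const struct rtc_limits *lim,
				   uint64_t remaining)
{
	unsigned int wakeups = 0;

	if (lim->max_alarm_offset == 0)
		return 0;	/* no usable alarm range, nothing to program */

	while (remaining > 0) {
		remaining -= next_alarm_offset(lim, remaining);
		wakeups++;
	}
	return wakeups;
}
```

For an RTC limited to 86399 seconds, a 100000-second timer takes two wakeups (86399 seconds, then the remaining 13601), while any timer within the limit still takes exactly one. The extra intermediate wakeups are the battery cost that the thread weighs against failing the suspend.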

On Mon, Oct 31, 2022 at 11:14:23PM +0100, Alexandre Belloni wrote:
> On 31/10/2022 11:19:13-0700, Guenter Roeck wrote:
> > On Mon, Oct 31, 2022 at 06:10:53PM +0100, Alexandre Belloni wrote:
> > > Hello,
> > >
> > > On 28/10/2022 17:54:00-0700, Guenter Roeck wrote:
> > > > RTC chips on some older Chromebooks can only handle alarms less than 24
> > > > hours in the future. Attempts to set an alarm beyond that range fail.
> > > > The most severe impact of this limitation is that suspend requests fail
> > > > if alarmtimer_suspend() tries to set an alarm for more than 24 hours
> > > > in the future.
> > > >
> > > > Try to set the real-time alarm to just below 24 hours if setting it to
> > > > a larger value fails to work around the problem. While not perfect, it
> > > > is better than just failing the call. A similar workaround is already
> > > > implemented in the rtc-tps6586x driver.
> > >
> > > I'm not super convinced this is actually better than failing the call
> > > because you are implementing policy in the driver which is bad from a
> > > user point of view. It would be way better to return -ERANGE and let
> > > userspace select a better alarm time.
> >
> > The failing call is from alarmtimer_suspend() which is called during suspend.
> > It is not from userspace, and userspace has no chance to intervene.
> >
> > It is also not just one userspace application which could request a large
> > timeout, it is a variety of userspace applications, and not all of them are
> > written by Google. Some are Android applications. I don't see how it would be
> > realistic to expect all such applications to fix their code (if that is even
> > possible - there might be an application which called sleep(100000) or
> > something equivalent, which works just fine as long as the system is not
> > suspended).
> >
> > > Do you have to know in advance which are the "older" chromebooks that
> > > are affected?
> >
> > Not sure I understand the question. Technically we know, but the cros_ec
> > rtc driver doesn't know because the EC doesn't have an API to report the
> > maximum timeout to the Linux driver. Even if that existed, it would not
> > help because the rtc API only supports absolute maximum clock values,
> > not clock offsets relative to the current time. So ultimately there is no
> > means for an RTC driver to tell the maximum possible alarm timer offset to
> > the RTC subsystem, and there is no means for a user such as
> > alarmtimer_suspend() to obtain the maximum time offset. Does that answer
> > your question ?
>
> Yes, my question was missing a few words, sorry. I wanted to know if you
> had *a way* to know.
>

See below. It is doable, but there is no real good solution, or at least
I don't see one right now.

> >
> > On a side note, I tried an alternate implementation by adding a retry into
> > alarmtimer_suspend(), where it would request a smaller timeout if the
> > requested timeout failed. I did not pursue/submit this since it seemed
> > hacky. To solve that problem, I'd rather discuss extending the RTC API
> > to provide a maximum offset to its users. Such a solution would probably
> > be desirable, but that is a longer-term effort and would not solve the
> > immediate problem.
>
> Yes, this is what I was aiming for. This is something that is indeed
> missing in the RTC API and that I already thought about. But indeed, it
> would be great to have a way to set the alarm range separately from the
> time keeping range. This would indeed have to be a range relative to the
> current time.
>
> alarmtimer_suspend() can then get the allowed alarm range for the RTC,
> and set the alarm to min(alarm range, timer value) and loop until the
> timer has expired. Once we have this API, userspace can do the same.
>
> I guess that ultimately, this doesn't help your driver unless you are
> wanting to wakeup all the chromebooks at least once a day regardless of
> their EC.

That is a no-go. It would reduce battery lifetime on all Chromebooks,
including those not affected by the problem (that is, almost all of them).

To implement reporting the maximum supported offset, I'd probably either
try to identify affected Chromebooks using devicetree information,
or by sending an alarm request > 24h in the future in the probe function
and setting the maximum offset just below 24h if that request fails.
We'd have to discuss the best approach internally.

Either way, that doesn't help with the short term problem that we
have to solve now and that can be backported to older kernels. It also
won't help userspace - userspace alarm requests, as Brian has pointed out,
are separate from limits supported by the RTC hardware. We can not change
the API for CLOCK_xxx_ALARM to userspace, and doing so would not make
sense anyway since it works just fine as long as the system isn't
suspended. Besides, changing alarmtimer_suspend() as you suggest above
would solve the problem for userspace, so I don't see a need for a
userspace API/ABI change unless I am missing something.

Thanks,
Guenter
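
The second detection option mentioned above, trying an alarm more than 24 hours out during probe and recording a limit only if the EC rejects it, could look roughly like the sketch below. Everything here is an assumption made for illustration: set_alarm_fn, probe_max_alarm_offset() and the two mock ECs are hypothetical names, a return value of 0 is used to mean "no known limit", and clearing the alarm with an offset of 0 is assumed rather than taken from the thread.

```c
#include <errno.h>
#include <stdint.h>

#define SECS_PER_DAY (24u * 60u * 60u)

/* Hypothetical hook: program an alarm 'offset' seconds ahead, 0 on success. */
typedef int (*set_alarm_fn)(uint32_t offset);

/* Mock of an older EC that rejects alarms 24h or more in the future. */
static int limited_ec_set_alarm(uint32_t offset)
{
	return offset >= SECS_PER_DAY ? -EINVAL : 0;
}

/* Mock of an EC without that restriction. */
static int unlimited_ec_set_alarm(uint32_t offset)
{
	(void)offset;
	return 0;
}

/*
 * Probe-time check: request an alarm just over 24h away. If the EC
 * answers -EINVAL, report a maximum offset just below 24h; otherwise
 * report 0, meaning "no known limit" in this sketch.
 */
static uint32_t probe_max_alarm_offset(set_alarm_fn set_alarm)
{
	uint32_t limit = 0;

	if (set_alarm(SECS_PER_DAY + 1) == -EINVAL)
		limit = SECS_PER_DAY - 1;

	/* Assumption: offset 0 clears whatever test alarm was accepted. */
	set_alarm(0);

	return limit;
}
```

A real driver would also have to decide what to do when the probe request fails for an unrelated reason; this sketch treats every error other than -EINVAL as "no known limit", which is a simplification.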

Alexandre,

On Mon, Oct 31, 2022 at 04:07:51PM -0700, Guenter Roeck wrote:
[ ... ]
> > >
> > > On a side note, I tried an alternate implementation by adding a retry into
> > > alarmtimer_suspend(), where it would request a smaller timeout if the
> > > requested timeout failed. I did not pursue/submit this since it seemed
> > > hacky. To solve that problem, I'd rather discuss extending the RTC API
> > > to provide a maximum offset to its users. Such a solution would probably
> > > be desirable, but that is a longer-term effort and would not solve the
> > > immediate problem.
> >
> > Yes, this is what I was aiming for. This is something that is indeed
> > missing in the RTC API and that I already thought about. But indeed, it
> > would be great to have a way to set the alarm range separately from the
> > time keeping range. This would indeed have to be a range relative to the
> > current time.
> >
> > alarmtimer_suspend() can then get the allowed alarm range for the RTC,
> > and set the alarm to min(alarm range, timer value) and loop until the
> > timer has expired. Once we have this API, userspace can do the same.
> >
> > I guess that ultimately, this doesn't help your driver unless you are
> > wanting to wakeup all the chromebooks at least once a day regardless of
> > their EC.
>
> That is a no-go. It would reduce battery lifetime on all Chromebooks,
> including those not affected by the problem (that is, almost all of them).
>
> To implement reporting the maximum supported offset, I'd probably either
> try to identify affected Chromebooks using devicetree information,
> or by sending an alarm request > 24h in the future in the probe function
> and setting the maximum offset just below 24h if that request fails.
> We'd have to discuss the best approach internally.
>
> Either way, that doesn't help with the short term problem that we
> have to solve now and that can be backported to older kernels. It also
> won't help userspace - userspace alarm requests, as Brian has pointed out,
> are separate from limits supported by the RTC hardware. We can not change
> the API for CLOCK_xxx_ALARM to userspace, and doing so would not make
> sense anyway since it works just fine as long as the system isn't
> suspended. Besides, changing alarmtimer_suspend() as you suggest above
> would solve the problem for userspace, so I don't see a need for a
> userspace API/ABI change unless I am missing something.
>

Would you be open to accepting this patch, with me starting to work
on the necessary infrastructure changes as suggested above for a more
comprehensive solution ?

Thanks,
Guenter
Hi,

On 02/11/2022 11:48:04-0700, Guenter Roeck wrote:
> Alexandre,
>
> On Mon, Oct 31, 2022 at 04:07:51PM -0700, Guenter Roeck wrote:
> [ ... ]
> > > >
> > > > On a side note, I tried an alternate implementation by adding a retry into
> > > > alarmtimer_suspend(), where it would request a smaller timeout if the
> > > > requested timeout failed. I did not pursue/submit this since it seemed
> > > > hacky. To solve that problem, I'd rather discuss extending the RTC API
> > > > to provide a maximum offset to its users. Such a solution would probably
> > > > be desirable, but that is more long-term and would not solve the
> > > > immediate problem.
> > >
> > > Yes, this is what I was aiming for. This is something that is indeed
> > > missing in the RTC API and that I already thought about. But indeed, it
> > > would be great to have a way to set the alarm range separately from the
> > > time keeping range. This would indeed have to be a range relative to the
> > > current time.
> > >
> > > alarmtimer_suspend() can then get the allowed alarm range for the RTC,
> > > and set the alarm to max(alarm range, timer value) and loop until the
> > > timer has expired. Once we have this API, userspace can do the same.
> > >
> > > I guess that ultimately, this doesn't help your driver unless you are
> > > wanting to wakeup all the chromebooks at least once a day regardless of
> > > their EC.
> >
> > That is a no-go. It would reduce battery lifetime on all Chromebooks,
> > including those not affected by the problem (that is, almost all of them).
> >
> > To implement reporting the maximum supported offset, I'd probably either
> > try to identify affected Chromebooks using devicetree information,
> > or by sending an alarm request > 24h in the future in the probe function
> > and setting the maximum offset just below 24h if that request fails.
> > We'd have to discuss the best approach internally.
> >
> > Either case, that doesn't help with the short term problem that we
> > have to solve now and that can be backported to older kernels. It also
> > won't help userspace - userspace alarm requests, as Brian has pointed out,
> > are separate from limits supported by the RTC hardware. We cannot change
> > the API for CLOCK_xxx_ALARM to userspace, and doing so would not make
> > sense anyway since it works just fine as long as the system isn't
> > suspended. Besides, changing alarmtimer_suspend() as you suggest above
> > would solve the problem for userspace, so I don't see a need for a
> > userspace API/ABI change unless I am missing something.
> >
>
> Would you be open to accepting this patch, with me starting to work
> on the necessary infrastructure changes as suggested above for a more
> comprehensive solution ?
>

I'll take the patch as-is so you can backport it and have a solution.
I'll also work on the alarm range and I'll let you get the series once
this is ready so you can test.
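The "set the alarm, then loop until the timer has expired" scheme quoted above can be modeled in a few lines of C. This is only a sketch under stated assumptions: `next_alarm_offset()` and `wakeups_until_expiry()` are hypothetical helpers, not kernel functions, and the quoted "max(alarm range, timer value)" is read here as meaning the smaller of the two values, since the alarm cannot exceed what the RTC supports.

```c
#include <stdint.h>

/*
 * Offset (in seconds) to program next: the smaller of the time that
 * remains and the RTC's maximum alarm offset. A max_offset of 0 is
 * treated as "no limit" (an assumption of this sketch).
 */
static uint64_t next_alarm_offset(uint64_t remaining, uint64_t max_offset)
{
	if (max_offset && remaining > max_offset)
		return max_offset;
	return remaining;
}

/*
 * Count how many alarms have to be programmed, one after another,
 * until a timer that is timer_secs in the future has expired.
 */
static unsigned int wakeups_until_expiry(uint64_t timer_secs,
					 uint64_t max_offset)
{
	unsigned int wakeups = 0;

	while (timer_secs > 0) {
		timer_secs -= next_alarm_offset(timer_secs, max_offset);
		wakeups++;
	}
	return wakeups;
}
```

For a timer 60 hours out (216000 seconds) and a limit of 86399 seconds, the loop programs three alarms: two at the limit and a final one for the remainder. Without a limit, a single alarm suffices.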
Hi,

On Mon, Nov 07, 2022 at 11:52:50PM +0100, Alexandre Belloni wrote:
[ ... ]
>
> I'll take the patch as-is so you can backport it and have a solution.
> I'll also work on the alarm range and I'll let you get the series once
> this is ready so you can test.
>

Excellent, thanks a lot.

I also started looking into a poor-man's solution of range support.
I attached what I currently have below for your reference. It isn't
much, but it let me test follow-up changes in the cros-ec rtc driver.
Unfortunately I was not able to find a means to implement something
like "go back to sleep fast" in the alarm timer code.

In this context: Is there a standardized set of error codes for RTC
drivers ? I see -EINVAL, -ETIME, -EDOM, -ERANGE, but those are not
consistently used. I assumed -ETIME for "time expired" and -ERANGE for
"time too far in the future" below, but that was just a wild guess.

Thanks,
Guenter

---
commit 7918f162f947424ec0ad7a318c45febeaea51d2e
Author:     Guenter Roeck <linux@roeck-us.net>
AuthorDate: Wed Nov 2 19:35:09 2022 -0700
Commit:     Guenter Roeck <linux@roeck-us.net>
CommitDate: Fri Nov 4 09:54:06 2022 -0700

    rtc: Add support for limited alarm timer offsets

    Some alarm timers are based on time offsets, not on absolute times.
    In some situations, the amount of time that can be scheduled in the
    future is limited. This may result in a refusal to suspend the system,
    causing substantial battery drain.

    Some RTC alarm drivers remedy the situation by setting the alarm time
    to the maximum supported time if a request for an out-of-range timeout
    is made. This is not really desirable since it may result in unexpected
    early wakeups.

    To reduce the impact of this problem, let RTC drivers report the maximum
    supported alarm timer offset. The code setting alarm timers can then
    decide if it wants to reject setting alarm timers to a larger value, if it
    wants to implement recurring alarms until the actually requested alarm
    time is met, or if it wants to accept the limited alarm time.

    Signed-off-by: Guenter Roeck <linux@roeck-us.net>

diff --git a/drivers/rtc/interface.c b/drivers/rtc/interface.c
index 9edd662c69ac..05ec9afbb6ba 100644
--- a/drivers/rtc/interface.c
+++ b/drivers/rtc/interface.c
@@ -426,6 +426,10 @@ static int __rtc_set_alarm(struct rtc_device *rtc, struct rtc_wkalrm *alarm)
 
 	if (scheduled <= now)
 		return -ETIME;
+
+	if (rtc->range_max_offset && scheduled - now > rtc->range_max_offset)
+		return -ERANGE;
+
 	/*
 	 * XXX - We just checked to make sure the alarm time is not
 	 * in the past, but there is still a race window where if
diff --git a/include/linux/rtc.h b/include/linux/rtc.h
index 1fd9c6a21ebe..b6d000ab1e5e 100644
--- a/include/linux/rtc.h
+++ b/include/linux/rtc.h
@@ -146,6 +146,7 @@ struct rtc_device {
 
 	time64_t range_min;
 	timeu64_t range_max;
+	timeu64_t range_max_offset;
 	time64_t start_secs;
 	time64_t offset_secs;
 	bool set_start_time;
diff --git a/kernel/time/alarmtimer.c b/kernel/time/alarmtimer.c
index 5897828b9d7e..af8e0a9e0d63 100644
--- a/kernel/time/alarmtimer.c
+++ b/kernel/time/alarmtimer.c
@@ -291,6 +291,19 @@ static int alarmtimer_suspend(struct device *dev)
 	rtc_timer_cancel(rtc, &rtctimer);
 	rtc_read_time(rtc, &tm);
 	now = rtc_tm_to_ktime(tm);
+
+	/*
+	 * If the RTC alarm timer only supports a limited time offset, set
+	 * the alarm time to the maximum supported value.
+	 * The system will wake up earlier than necessary and is expected
+	 * to go back to sleep if it has nothing to do.
+	 * It would be desirable to handle such early wakeups without fully
+	 * waking up the system, but it is unknown if this is even possible.
+	 */
+	if (rtc->range_max_offset &&
+	    rtc->range_max_offset * NSEC_PER_SEC > ktime_to_ns(min))
+		min = ns_to_ktime(rtc->range_max_offset * NSEC_PER_SEC);
+
 	now = ktime_add(now, min);
 
 	/* Set alarm, if in the past reject suspend briefly to handle */
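The two checks in the draft above can be modeled outside the kernel as follows. This is a sketch under stated assumptions, not the posted code: `check_alarm_time()` and `bound_sleep_ns()` are hypothetical stand-ins that use plain integers instead of `ktime_t`, and a limit of 0 means "no limit" as in the draft. Note that the alarmtimer hunk as posted replaces `min` when the limit is larger than the requested time, which looks inverted relative to its own comment. The sketch instead clamps only when the request exceeds the limit, which is what the comment appears to intend.

```c
#include <errno.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

/*
 * Model of the range check added to __rtc_set_alarm(): -ETIME for an
 * alarm that is not in the future, -ERANGE for one beyond the limit.
 * All values are in seconds; max_offset == 0 means "no limit".
 */
static int check_alarm_time(int64_t now, int64_t scheduled,
			    uint64_t max_offset)
{
	if (scheduled <= now)
		return -ETIME;
	if (max_offset && (uint64_t)(scheduled - now) > max_offset)
		return -ERANGE;
	return 0;
}

/*
 * Model of the alarmtimer_suspend() clamp: limit the requested sleep
 * time (nanoseconds) to the RTC's maximum offset (seconds). Assumes
 * max_offset is small enough that the multiplication cannot overflow.
 */
static int64_t bound_sleep_ns(int64_t min_ns, uint64_t max_offset)
{
	int64_t max_ns = (int64_t)max_offset * NSEC_PER_SEC;

	if (max_offset && min_ns > max_ns)
		return max_ns;
	return min_ns;
}
```

With a limit of 86399 seconds, a 48 hour sleep request is bounded to 86399 seconds, while a one hour request passes through unchanged.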
On Fri, 28 Oct 2022 17:54:00 -0700, Guenter Roeck wrote:
> RTC chips on some older Chromebooks can only handle alarms less than 24
> hours in the future. Attempts to set an alarm beyond that range fails.
> The most severe impact of this limitation is that suspend requests fail
> if alarmtimer_suspend() tries to set an alarm for more than 24 hours
> in the future.
>
> Try to set the real-time alarm to just below 24 hours if setting it to
> a larger value fails to work around the problem. While not perfect, it
> is better than just failing the call. A similar workaround is already
> implemented in the rtc-tps6586x driver.
>
> [...]

Applied, thanks!

[1/1] rtc: cros-ec: Limit RTC alarm range if needed
      commit: a78590c82c501c53b6f30a5ee10e4261e8b377f7

Best regards,
diff --git a/drivers/rtc/rtc-cros-ec.c b/drivers/rtc/rtc-cros-ec.c
index 887f5193e253..a3ec066d8066 100644
--- a/drivers/rtc/rtc-cros-ec.c
+++ b/drivers/rtc/rtc-cros-ec.c
@@ -14,6 +14,8 @@
 
 #define DRV_NAME	"cros-ec-rtc"
 
+#define SECS_PER_DAY	(24 * 60 * 60)
+
 /**
  * struct cros_ec_rtc - Driver data for EC RTC
  *
@@ -43,13 +45,8 @@ static int cros_ec_rtc_get(struct cros_ec_device *cros_ec, u32 command,
 	msg.msg.insize = sizeof(msg.data);
 
 	ret = cros_ec_cmd_xfer_status(cros_ec, &msg.msg);
-	if (ret < 0) {
-		dev_err(cros_ec->dev,
-			"error getting %s from EC: %d\n",
-			command == EC_CMD_RTC_GET_VALUE ? "time" : "alarm",
-			ret);
+	if (ret < 0)
 		return ret;
-	}
 
 	*response = msg.data.time;
 
@@ -59,7 +56,7 @@ static int cros_ec_rtc_get(struct cros_ec_device *cros_ec, u32 command,
 static int cros_ec_rtc_set(struct cros_ec_device *cros_ec, u32 command,
 			   u32 param)
 {
-	int ret = 0;
+	int ret;
 	struct {
 		struct cros_ec_command msg;
 		struct ec_response_rtc data;
@@ -71,13 +68,8 @@ static int cros_ec_rtc_set(struct cros_ec_device *cros_ec, u32 command,
 	msg.data.time = param;
 
 	ret = cros_ec_cmd_xfer_status(cros_ec, &msg.msg);
-	if (ret < 0) {
-		dev_err(cros_ec->dev, "error setting %s on EC: %d\n",
-			command == EC_CMD_RTC_SET_VALUE ? "time" : "alarm",
-			ret);
+	if (ret < 0)
 		return ret;
-	}
-
 	return 0;
 }
 
@@ -190,8 +182,21 @@ static int cros_ec_rtc_set_alarm(struct device *dev, struct rtc_wkalrm *alrm)
 
 	ret = cros_ec_rtc_set(cros_ec, EC_CMD_RTC_SET_ALARM, alarm_offset);
 	if (ret < 0) {
-		dev_err(dev, "error setting alarm: %d\n", ret);
-		return ret;
+		if (ret == -EINVAL && alarm_offset >= SECS_PER_DAY) {
+			/*
+			 * RTC chips on some older Chromebooks can only handle
+			 * alarms up to 24h in the future. Try to set an alarm
+			 * below that limit to avoid suspend failures.
+			 */
+			ret = cros_ec_rtc_set(cros_ec, EC_CMD_RTC_SET_ALARM,
+					      SECS_PER_DAY - 1);
+		}
+
+		if (ret < 0) {
+			dev_err(dev, "error setting alarm in %u seconds: %d\n",
+				alarm_offset, ret);
+			return ret;
+		}
 	}
 
 	return 0;
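The retry logic in the last hunk can be exercised in isolation with a small userspace model. The sketch below is an illustration only: `mock_ec_set_alarm()` is a hypothetical stand-in for `cros_ec_rtc_set()` on an affected EC, `set_alarm_with_fallback()` mirrors the control flow of the patched `cros_ec_rtc_set_alarm()` without the logging, and `last_programmed` exists purely so the result can be inspected.

```c
#include <errno.h>
#include <stdint.h>

#define SECS_PER_DAY (24 * 60 * 60)

/* Last offset the mock EC accepted; for inspection only. */
static uint32_t last_programmed;

/* Mock of an affected EC: rejects offsets of 24h or more. */
static int mock_ec_set_alarm(uint32_t offset_secs)
{
	if (offset_secs >= SECS_PER_DAY)
		return -EINVAL;
	last_programmed = offset_secs;
	return 0;
}

/*
 * Try the requested offset first. If the EC rejects it with -EINVAL
 * and the offset was 24h or more, retry once with just under 24h.
 * Any other failure is returned to the caller unchanged.
 */
static int set_alarm_with_fallback(uint32_t alarm_offset)
{
	int ret = mock_ec_set_alarm(alarm_offset);

	if (ret == -EINVAL && alarm_offset >= SECS_PER_DAY)
		ret = mock_ec_set_alarm(SECS_PER_DAY - 1);

	return ret;
}
```

A request for one hour is programmed as-is. A request for three days succeeds as well, but the mock EC ends up with an alarm `SECS_PER_DAY - 1` seconds out, which corresponds to the early wakeup discussed in the thread.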
RTC chips on some older Chromebooks can only handle alarms less than 24
hours in the future. Attempts to set an alarm beyond that range fails.
The most severe impact of this limitation is that suspend requests fail
if alarmtimer_suspend() tries to set an alarm for more than 24 hours
in the future.

Try to set the real-time alarm to just below 24 hours if setting it to
a larger value fails to work around the problem. While not perfect, it
is better than just failing the call. A similar workaround is already
implemented in the rtc-tps6586x driver.

Drop error messages in cros_ec_rtc_get() and cros_ec_rtc_set() since the
calling code also logs an error and to avoid spurious error messages if
setting the alarm ultimately succeeds.

Cc: Brian Norris <briannorris@chromium.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
---
 drivers/rtc/rtc-cros-ec.c | 35 ++++++++++++++++++++---------------
 1 file changed, 20 insertions(+), 15 deletions(-)