Message ID | 87bk6pvzql.fsf@somnus (mailing list archive) |
---|---|
State | Superseded |
Series | timers/migration: Return early on deactivation |
On Thu, Apr 04, 2024 at 06:50:26PM +0200, Anna-Maria Behnsen wrote:
> Commit 4b6f4c5a67c0 ("timer/migration: Remove buggy early return on
> deactivation") removed the logic to return early in tmigr_update_events()
> on deactivation. With this the problem with a not properly updated first
> global event in a hierarchy containing only a single group was fixed.
>
> But when having a look at this code path with a hierarchy with more than a
> single level, now unnecessary work is done (example is partially copied
> from the message of the commit mentioned above):
>
>                             [GRP1:0]
>                          migrator = GRP0:0
>                          active   = GRP0:0
>                          nextevt  = T0:0i, T0:1
>                          /              \
>               [GRP0:0]                  [GRP0:1]
>            migrator = 0              migrator = NONE
>            active   = 0              active   = NONE
>            nextevt  = T0i, T1        nextevt  = T2
>            /         \                /         \
>           0 (T0i)     1 (T1)         2 (T2)      3
>       active         idle            idle       idle
>
> 0) CPU 0 is active thus its event is ignored (the letter 'i') and so are
> upper levels' events. CPU 1 is idle and has the timer T1 enqueued.
> CPU 2 also has a timer. The expiry order is T0 (ignored) < T1 < T2
>
>                             [GRP1:0]
>                          migrator = GRP0:0
>                          active   = GRP0:0
>                          nextevt  = T0:0i, T0:1
>                          /              \
>               [GRP0:0]                  [GRP0:1]
>            migrator = NONE           migrator = NONE
>            active   = NONE           active   = NONE
>            nextevt  = T1             nextevt  = T2
>            /         \                /         \
>           0 (T0i)     1 (T1)         2 (T2)      3
>       idle           idle            idle       idle
>
> 1) CPU 0 goes idle without global event queued. Therefore KTIME_MAX is
> pushed as its next expiry and its own event kept as "ignore". Without this
> early return the following steps happen in tmigr_update_events() when
> child = null and group = GRP0:0 :
>
>   lock(GRP0:0->lock);
>   timerqueue_del(GRP0:0, T0i);
>   unlock(GRP0:0->lock);
>
>
>                             [GRP1:0]
>                          migrator = NONE
>                          active   = NONE
>                          nextevt  = T0:0, T0:1
>                          /              \
>               [GRP0:0]                  [GRP0:1]
>            migrator = NONE           migrator = NONE
>            active   = NONE           active   = NONE
>            nextevt  = T1             nextevt  = T2
>            /         \                /         \
>           0 (T0i)     1 (T1)         2 (T2)      3
>       idle           idle            idle       idle
>
> 2) The change now propagates up to the top. Then tmigr_update_events()
> updates the group event of GRP0:0 and executes the following steps
> (child = GRP0:0 and group = GRP0:0):
>
>   lock(GRP0:0->lock);
>   lock(GRP1:0->lock);
>   evt = tmigr_next_groupevt(GRP0:0); -> this removes the ignored events
>                                         in GRP0:0
>   ... update GRP1:0 group event and timerqueue ...
>   unlock(GRP1:0->lock);
>   unlock(GRP0:0->lock);
>
> So the dance in 1) with locking the GRP0:0->lock and removing the T0i from
> the timerqueue is redundant as this is done nevertheless in 2) when
> tmigr_next_groupevt(GRP0:0) is executed.
>
> Revert commit 4b6f4c5a67c0 ("timer/migration: Remove buggy early return on
> deactivation") and add a condition into return path to skip the return
> only, when hierarchy contains a single group.
>
> Fixes: 4b6f4c5a67c0 ("timer/migration: Remove buggy early return on deactivation")
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>

Reviewed-by: Frederic Weisbecker <frederic@kernel.org>

Just some comment nits:

> ---
>  kernel/time/timer_migration.c | 25 +++++++++++++++++++++++++
>  1 file changed, 25 insertions(+)
>
> --- a/kernel/time/timer_migration.c
> +++ b/kernel/time/timer_migration.c
> @@ -751,6 +751,31 @@ bool tmigr_update_events(struct tmigr_gr
>
>  		first_childevt = evt = data->evt;
>
> +		/*
> +		 * Walking the hierarchy is required in any case when a
> +		 * remote expiry was done before. This ensures to not lose
> +		 * already queued events in non active groups (see section
> +		 * "Required event and timerqueue update after a remote
> +		 * expiry" in the documentation at the top).
> +		 *
> +		 * The two call sites which are executed without a remote expiry
> +		 * before, are not prevented from propagating changes through
> +		 * the hierarchy by the return:
> +		 *  - When entering this path by tmigr_new_timer(), @evt->ignore
> +		 *    is never set.
> +		 *  - tmigr_inactive_up() takes care of the propagation by
> +		 *    itself and ignores the return value. But an immediate
> +		 *    return is required because nothing has to be done in this
> +		 *    level as the event could be ignored.

It's not exactly required, it's an optimization. How about:

"""
But an immediate return is possible if there is a parent, sparing group
locking at this level, because the upper walking call to the parent will
take care of removing this event from within the group and updating
next_expiry accordingly.
"""

> +		 *
> +		 * But, if the hierarchy has only a single level so @group is
> +		 * the top level group, make sure first event information of the
> +		 * group is updated properly and also handled properly, so skip
> +		 * this fast return path.

"""
However if there is no parent, i.e. the hierarchy has only a single level
so @group is the top level group, make sure the first event information of
the group is updated properly and also handled properly, so skip this fast
return path.
"""

Thanks.

> +		 */
> +		if (evt->ignore && !remote && group->parent)
> +			return true;
> +
>  		raw_spin_lock(&group->lock);
>
>  		childstate.state = 0;
--- a/kernel/time/timer_migration.c
+++ b/kernel/time/timer_migration.c
@@ -751,6 +751,31 @@ bool tmigr_update_events(struct tmigr_gr
 
 		first_childevt = evt = data->evt;
 
+		/*
+		 * Walking the hierarchy is required in any case when a
+		 * remote expiry was done before. This ensures to not lose
+		 * already queued events in non active groups (see section
+		 * "Required event and timerqueue update after a remote
+		 * expiry" in the documentation at the top).
+		 *
+		 * The two call sites which are executed without a remote expiry
+		 * before, are not prevented from propagating changes through
+		 * the hierarchy by the return:
+		 *  - When entering this path by tmigr_new_timer(), @evt->ignore
+		 *    is never set.
+		 *  - tmigr_inactive_up() takes care of the propagation by
+		 *    itself and ignores the return value. But an immediate
+		 *    return is required because nothing has to be done in this
+		 *    level as the event could be ignored.
+		 *
+		 * But, if the hierarchy has only a single level so @group is
+		 * the top level group, make sure first event information of the
+		 * group is updated properly and also handled properly, so skip
+		 * this fast return path.
+		 */
+		if (evt->ignore && !remote && group->parent)
+			return true;
+
 		raw_spin_lock(&group->lock);
 
 		childstate.state = 0;