Message ID | 20200219183231.50985-1-balejs@google.com (mailing list archive) |
---|---|
State | Not Applicable, archived |
Series | cgroup-v1: freezer: optionally killable freezer |
Hi all,

did anyone have time to look into my proposal and, in case, are there
any suggestions, ideas or comments about it?

Marco

On Fri, Feb 28, 2020 at 04:51:31PM -0800, Marco Ballesio wrote:
> Hi all,
>
> did anyone have time to look into my proposal and, in case, are there
> any suggestions, ideas or comments about it?

Hello, Marco!

I'm sorry, somehow I missed the original letter.

In general the cgroup v1 interface is considered frozen. Are there any particular
reasons why you want to extend the v1 freezer rather than use the v2 version of it?

You don't even need to fully convert to cgroup v2 in order to do it, some v1
controllers can still be used.

Thanks!

Roman

On Sat, Feb 29, 2020 at 10:43:00AM -0800, Roman Gushchin wrote:
> On Fri, Feb 28, 2020 at 04:51:31PM -0800, Marco Ballesio wrote:
> > Hi all,
> >
> > did anyone have time to look into my proposal and, in case, are there
> > any suggestions, ideas or comments about it?
>
> Hello, Marco!
>
> I'm sorry, somehow I missed the original letter.
>
> In general the cgroup v1 interface is considered frozen. Are there any particular
> reasons why you want to extend the v1 freezer rather than use the v2 version of it?
>
> You don't even need to fully convert to cgroup v2 in order to do it, some v1
> controllers can still be used.
>
> Thanks!
>
> Roman

Hi Roman,

When compared with backports of v2 features and their dependency chains, this
patch would be easier to carry in Android common. The potential is to have
killability for frozen processes on hw currently in use.

Marco

On Sun, Mar 01, 2020 at 08:20:03AM -0800, Marco Ballesio wrote:
> On Sat, Feb 29, 2020 at 10:43:00AM -0800, Roman Gushchin wrote:
> > On Fri, Feb 28, 2020 at 04:51:31PM -0800, Marco Ballesio wrote:
> > > Hi all,
> > >
> > > did anyone have time to look into my proposal and, in case, are there
> > > any suggestions, ideas or comments about it?
> >
> > Hello, Marco!
> >
> > I'm sorry, somehow I missed the original letter.
> >
> > In general the cgroup v1 interface is considered frozen. Are there any particular
> > reasons why you want to extend the v1 freezer rather than use the v2 version of it?
> >
> > You don't even need to fully convert to cgroup v2 in order to do it, some v1
> > controllers can still be used.
> >
> > Thanks!
> >
> > Roman
>
> Hi Roman,
>
> When compared with backports of v2 features and their dependency chains, this
> patch would be easier to carry in Android common. The potential is to have
> killability for frozen processes on hw currently in use.

I see...

The implementation looks good to me, but I'm really not sure if adding new control files
to cgroup v1 is a good idea at this point. Are there any plans in the Android world
to move forward to cgroup v2? If not, why not?

If there are any specific issues/dependencies, let's discuss and resolve them.

Thanks!

Roman

On Mon, Mar 2, 2020 at 8:53 AM Roman Gushchin <guro@fb.com> wrote:
>
> On Sun, Mar 01, 2020 at 08:20:03AM -0800, Marco Ballesio wrote:
> > On Sat, Feb 29, 2020 at 10:43:00AM -0800, Roman Gushchin wrote:
> > > On Fri, Feb 28, 2020 at 04:51:31PM -0800, Marco Ballesio wrote:
> > > > Hi all,
> > > >
> > > > did anyone have time to look into my proposal and, in case, are there
> > > > any suggestions, ideas or comments about it?
> > >
> > > Hello, Marco!
> > >
> > > I'm sorry, somehow I missed the original letter.
> > >
> > > In general the cgroup v1 interface is considered frozen. Are there any particular
> > > reasons why you want to extend the v1 freezer rather than use the v2 version of it?
> > >
> > > You don't even need to fully convert to cgroup v2 in order to do it, some v1
> > > controllers can still be used.
> > >
> > > Thanks!
> > >
> > > Roman
> >
> > Hi Roman,
> >
> > When compared with backports of v2 features and their dependency chains, this
> > patch would be easier to carry in Android common. The potential is to have
> > killability for frozen processes on hw currently in use.
>

Hi Roman,

> I see...
>
> The implementation looks good to me, but I'm really not sure if adding new control files
> to cgroup v1 is a good idea at this point. Are there any plans in the Android world
> to move forward to cgroup v2? If not, why not?

There are plans to prototype that and gradually move from cgroups v1
to v2 at least for some cgroup controllers (the ones that can use
unified hierarchy). Creating an additional per-process cgroup v2
hierarchy only for freezer would be a high price to pay today. In the
future when we migrate some controllers to v2 the price will be
amortized and we will probably be able to do that.

> If there are any specific issues/dependencies, let's discuss and resolve them.
>
> Thanks!
>
> Roman

Thanks,
Suren.

On Mon, Mar 02, 2020 at 09:46:36AM -0800, Suren Baghdasaryan wrote:
> On Mon, Mar 2, 2020 at 8:53 AM Roman Gushchin <guro@fb.com> wrote:
> >
> > On Sun, Mar 01, 2020 at 08:20:03AM -0800, Marco Ballesio wrote:
> > > On Sat, Feb 29, 2020 at 10:43:00AM -0800, Roman Gushchin wrote:
> > > > On Fri, Feb 28, 2020 at 04:51:31PM -0800, Marco Ballesio wrote:
> > > > > Hi all,
> > > > >
> > > > > did anyone have time to look into my proposal and, in case, are there
> > > > > any suggestions, ideas or comments about it?
> > > >
> > > > Hello, Marco!
> > > >
> > > > I'm sorry, somehow I missed the original letter.
> > > >
> > > > In general the cgroup v1 interface is considered frozen. Are there any particular
> > > > reasons why you want to extend the v1 freezer rather than use the v2 version of it?
> > > >
> > > > You don't even need to fully convert to cgroup v2 in order to do it, some v1
> > > > controllers can still be used.
> > > >
> > > > Thanks!
> > > >
> > > > Roman
> > >
> > > Hi Roman,
> > >
> > > When compared with backports of v2 features and their dependency chains, this
> > > patch would be easier to carry in Android common. The potential is to have
> > > killability for frozen processes on hw currently in use.
> >
>
> Hi Roman,
>
> > I see...
> >
> > The implementation looks good to me, but I'm really not sure if adding new control files
> > to cgroup v1 is a good idea at this point. Are there any plans in the Android world
> > to move forward to cgroup v2? If not, why not?
>
> There are plans to prototype that and gradually move from cgroups v1
> to v2 at least for some cgroup controllers (the ones that can use
> unified hierarchy). Creating an additional per-process cgroup v2
> hierarchy only for freezer would be a high price to pay today. In the
> future when we migrate some controllers to v2 the price will be
> amortized and we will probably be able to do that.

I see... Thanks for the explanation, Suren!

Overall the idea of extending the frozen v1 interface looks dubious to me,
especially if it's only required during the transition to v2. But of course
the decision is up to the maintainers.

Thanks!

Hello,

On Wed, Feb 19, 2020 at 10:32:31AM -0800, Marco Ballesio wrote:
> @@ -94,6 +94,18 @@ The following cgroupfs files are created by cgroup freezer.
>    Shows the parent-state.  0 if none of the cgroup's ancestors is
>    frozen; otherwise, 1.
>
> +* freezer.killable: Read-write
> +
> +  When read, returns the killable state of a cgroup - "1" if frozen
> +  tasks will respond to fatal signals, or "0" if they won't.
> +
> +  When written, this property sets the killable state of the cgroup.
> +  A value equal to "1" will switch the state of all frozen tasks in
> +  the cgroup to TASK_INTERRUPTIBLE (similarly to cgroup v2) and will
> +  make them react to fatal signals. A value of "0" will switch the
> +  state of frozen tasks to TASK_UNINTERRUPTIBLE and they won't respond
> +  to signals unless thawed or unfrozen.

As Roman said, I'm not too sure about adding a new cgroup1 freezer
interface at this point. If we do this, *maybe* a mount option would
be more minimal?

> diff --git a/kernel/freezer.c b/kernel/freezer.c
> index dc520f01f99d..92de1bfe62cf 100644
> --- a/kernel/freezer.c
> +++ b/kernel/freezer.c
> @@ -42,6 +42,9 @@ bool freezing_slow_path(struct task_struct *p)
>  	if (test_tsk_thread_flag(p, TIF_MEMDIE))
>  		return false;
>
> +	if (cgroup_freezer_killable(p) && fatal_signal_pending(p))
> +		return false;
> +
>  	if (pm_nosig_freezing || cgroup_freezing(p))
>  		return true;
>
> @@ -63,7 +66,12 @@ bool __refrigerator(bool check_kthr_stop)
>  	pr_debug("%s entered refrigerator\n", current->comm);
>
>  	for (;;) {
> -		set_current_state(TASK_UNINTERRUPTIBLE);
> +		bool killable = cgroup_freezer_killable(current);
> +
> +		if (killable)
> +			set_current_state(TASK_INTERRUPTIBLE);
> +		else
> +			set_current_state(TASK_UNINTERRUPTIBLE);
>
>  		spin_lock_irq(&freezer_lock);
>  		current->flags |= PF_FROZEN;
> @@ -75,6 +83,16 @@ bool __refrigerator(bool check_kthr_stop)
>  		if (!(current->flags & PF_FROZEN))
>  			break;
>  		was_frozen = true;
> +
> +		/*
> +		 * Now we're sure that there is no pending fatal signal.
> +		 * Clear TIF_SIGPENDING to not get out of schedule()
> +		 * immediately (if there is a non-fatal signal pending), and
> +		 * put the task into sleep.
> +		 */

and this looks really racy to me. What happens if this task gets a
fatal signal here? We clear TIF_SIGPENDING and go to sleep?

> +		if (killable)
> +			clear_thread_flag(TIF_SIGPENDING);
> +
>  		schedule();
>  	}

Thanks.

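The window questioned in the review above can be illustrated with a small userspace toy model. This is only a sketch of the concern as it reads in the thread, not kernel code: `struct toy_task` and all `toy_*` helpers are hypothetical names invented for illustration. The one kernel fact it relies on is that `fatal_signal_pending()` only reports a fatal signal while the pending flag (`TIF_SIGPENDING`) is set.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-task signal state (not a kernel type). */
struct toy_task {
	bool sigpending;	/* models TIF_SIGPENDING */
	bool fatal_queued;	/* models a fatal signal sitting in the queue */
};

/* Models fatal_signal_pending(): needs the pending flag AND a fatal signal. */
static bool toy_fatal_signal_pending(const struct toy_task *t)
{
	return t->sigpending && t->fatal_queued;
}

/* Models another CPU delivering SIGKILL to the task. */
static void toy_send_fatal(struct toy_task *t)
{
	t->fatal_queued = true;
	t->sigpending = true;
}

/*
 * One pass of the patched loop body, reduced to the three steps under
 * discussion: check, clear, sleep. When signal_in_window is true, the
 * fatal signal arrives after the check but before the clear.
 *
 * Returns true if the task would go to sleep with a fatal signal queued
 * that it can no longer see, which is the outcome the review asks about.
 */
static bool toy_refrigerator_pass(struct toy_task *t, bool signal_in_window)
{
	if (toy_fatal_signal_pending(t))
		return false;		/* task leaves the loop and can die */

	if (signal_in_window)
		toy_send_fatal(t);	/* arrives right after the check */

	t->sigpending = false;		/* models clear_thread_flag(TIF_SIGPENDING) */

	/* schedule() would be called here */
	return t->fatal_queued && !toy_fatal_signal_pending(t);
}
```

Calling `toy_refrigerator_pass()` on a fresh task with `signal_in_window` set to true returns true: the queued fatal signal is hidden by the cleared flag. With the signal delivered before the pass, or not delivered at all, it returns false.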
On Tue, Mar 3, 2020 at 5:48 AM Tejun Heo <tj@kernel.org> wrote:
>
> Hello,
>
> On Wed, Feb 19, 2020 at 10:32:31AM -0800, Marco Ballesio wrote:
> > @@ -94,6 +94,18 @@ The following cgroupfs files are created by cgroup freezer.
> >    Shows the parent-state.  0 if none of the cgroup's ancestors is
> >    frozen; otherwise, 1.
> >
> > +* freezer.killable: Read-write
> > +
> > +  When read, returns the killable state of a cgroup - "1" if frozen
> > +  tasks will respond to fatal signals, or "0" if they won't.
> > +
> > +  When written, this property sets the killable state of the cgroup.
> > +  A value equal to "1" will switch the state of all frozen tasks in
> > +  the cgroup to TASK_INTERRUPTIBLE (similarly to cgroup v2) and will
> > +  make them react to fatal signals. A value of "0" will switch the
> > +  state of frozen tasks to TASK_UNINTERRUPTIBLE and they won't respond
> > +  to signals unless thawed or unfrozen.
>
> As Roman said, I'm not too sure about adding a new cgroup1 freezer
> interface at this point. If we do this, *maybe* a mount option would
> be more minimal?

I'd still prefer a cgroup flag. A mount option is a bigger
compatibility risk and isn't really any simpler than another cgroup
flag. A mount option will affect anything using the cgroup mount
point, potentially turning non-killable frozen processes into killable
ones unexpectedly. (Sure, you could mount multiple times, but only one
location is canonical, and that's the one that's going to get the flag
flipped.) A per-cgroup flag allows people to opt into the new behavior
only in specific contexts, so it's safer.

On Wed, Mar 11, 2020 at 10:46:15AM -0700, Daniel Colascione wrote:
> On Tue, Mar 3, 2020 at 5:48 AM Tejun Heo <tj@kernel.org> wrote:
> >
> > Hello,
> >
> > On Wed, Feb 19, 2020 at 10:32:31AM -0800, Marco Ballesio wrote:
> > > @@ -94,6 +94,18 @@ The following cgroupfs files are created by cgroup freezer.
> > >    Shows the parent-state.  0 if none of the cgroup's ancestors is
> > >    frozen; otherwise, 1.
> > >
> > > +* freezer.killable: Read-write
> > > +
> > > +  When read, returns the killable state of a cgroup - "1" if frozen
> > > +  tasks will respond to fatal signals, or "0" if they won't.
> > > +
> > > +  When written, this property sets the killable state of the cgroup.
> > > +  A value equal to "1" will switch the state of all frozen tasks in
> > > +  the cgroup to TASK_INTERRUPTIBLE (similarly to cgroup v2) and will
> > > +  make them react to fatal signals. A value of "0" will switch the
> > > +  state of frozen tasks to TASK_UNINTERRUPTIBLE and they won't respond
> > > +  to signals unless thawed or unfrozen.
> >
> > As Roman said, I'm not too sure about adding a new cgroup1 freezer
> > interface at this point. If we do this, *maybe* a mount option would
> > be more minimal?
>
> I'd still prefer a cgroup flag. A mount option is a bigger
> compatibility risk and isn't really any simpler than another cgroup
> flag. A mount option will affect anything using the cgroup mount
> point, potentially turning non-killable frozen processes into killable
> ones unexpectedly. (Sure, you could mount multiple times, but only one
> location is canonical, and that's the one that's going to get the flag
> flipped.) A per-cgroup flag allows people to opt into the new behavior
> only in specific contexts, so it's safer.

It might also be desirable for userland to have a way to modify the behavior of
an already mounted v1 freezer.

Tejun, would it be acceptable to have a flag but disable it by default, hiding
it behind a kernel configuration option?

Hello,

On Fri, Mar 20, 2020 at 01:10:38PM -0700, Marco Ballesio wrote:
> It might also be desirable for userland to have a way to modify the behavior of
> an already mounted v1 freezer.
>
> Tejun, would it be acceptable to have a flag but disable it by default, hiding
> it behind a kernel configuration option?

Given how dead-end this is, I'm not sure this needs to be upstream.
Can you give me some rationales?

Thanks.

diff --git a/Documentation/admin-guide/cgroup-v1/freezer-subsystem.rst b/Documentation/admin-guide/cgroup-v1/freezer-subsystem.rst
index 582d3427de3f..06485ae9dccd 100644
--- a/Documentation/admin-guide/cgroup-v1/freezer-subsystem.rst
+++ b/Documentation/admin-guide/cgroup-v1/freezer-subsystem.rst
@@ -94,6 +94,18 @@ The following cgroupfs files are created by cgroup freezer.
   Shows the parent-state.  0 if none of the cgroup's ancestors is
   frozen; otherwise, 1.
 
+* freezer.killable: Read-write
+
+  When read, returns the killable state of a cgroup - "1" if frozen
+  tasks will respond to fatal signals, or "0" if they won't.
+
+  When written, this property sets the killable state of the cgroup.
+  A value equal to "1" will switch the state of all frozen tasks in
+  the cgroup to TASK_INTERRUPTIBLE (similarly to cgroup v2) and will
+  make them react to fatal signals. A value of "0" will switch the
+  state of frozen tasks to TASK_UNINTERRUPTIBLE and they won't respond
+  to signals unless thawed or unfrozen.
+
 The root cgroup is non-freezable and the above interface files don't
 exist.
 
diff --git a/include/linux/freezer.h b/include/linux/freezer.h
index 21f5aa0b217f..1443810ac2bf 100644
--- a/include/linux/freezer.h
+++ b/include/linux/freezer.h
@@ -72,6 +72,7 @@ extern bool set_freezable(void);
 
 #ifdef CONFIG_CGROUP_FREEZER
 extern bool cgroup_freezing(struct task_struct *task);
+extern bool cgroup_freezer_killable(struct task_struct *task);
 #else /* !CONFIG_CGROUP_FREEZER */
 static inline bool cgroup_freezing(struct task_struct *task)
 {
diff --git a/kernel/cgroup/legacy_freezer.c b/kernel/cgroup/legacy_freezer.c
index 08236798d173..5bbc26c4b822 100644
--- a/kernel/cgroup/legacy_freezer.c
+++ b/kernel/cgroup/legacy_freezer.c
@@ -35,6 +35,7 @@ enum freezer_state_flags {
 	CGROUP_FREEZING_SELF	= (1 << 1), /* this freezer is freezing */
 	CGROUP_FREEZING_PARENT	= (1 << 2), /* the parent freezer is freezing */
 	CGROUP_FROZEN		= (1 << 3), /* this and its descendants frozen */
+	CGROUP_FREEZER_KILLABLE = (1 << 4), /* frozen processes can be killed */
 
 	/* mask for all FREEZING flags */
 	CGROUP_FREEZING		= CGROUP_FREEZING_SELF | CGROUP_FREEZING_PARENT,
@@ -73,6 +74,17 @@ bool cgroup_freezing(struct task_struct *task)
 	return ret;
 }
 
+bool cgroup_freezer_killable(struct task_struct *task)
+{
+	bool ret;
+
+	rcu_read_lock();
+	ret = task_freezer(task)->state & CGROUP_FREEZER_KILLABLE;
+	rcu_read_unlock();
+
+	return ret;
+}
+
 static const char *freezer_state_strs(unsigned int state)
 {
 	if (state & CGROUP_FROZEN)
@@ -111,9 +123,15 @@ static int freezer_css_online(struct cgroup_subsys_state *css)
 
 	freezer->state |= CGROUP_FREEZER_ONLINE;
 
-	if (parent && (parent->state & CGROUP_FREEZING)) {
-		freezer->state |= CGROUP_FREEZING_PARENT | CGROUP_FROZEN;
-		atomic_inc(&system_freezing_cnt);
+	if (parent) {
+		if (parent->state & CGROUP_FREEZER_KILLABLE)
+			freezer->state |= CGROUP_FREEZER_KILLABLE;
+
+		if (parent->state & CGROUP_FREEZING) {
+			freezer->state |= CGROUP_FREEZING_PARENT |
+				CGROUP_FROZEN;
+			atomic_inc(&system_freezing_cnt);
+		}
 	}
 
 	mutex_unlock(&freezer_mutex);
@@ -450,6 +468,45 @@ static u64 freezer_parent_freezing_read(struct cgroup_subsys_state *css,
 	return (bool)(freezer->state & CGROUP_FREEZING_PARENT);
 }
 
+static u64 freezer_killable_read(struct cgroup_subsys_state *css,
+				 struct cftype *cft)
+{
+	struct freezer *freezer = css_freezer(css);
+
+	return (bool)(freezer->state & CGROUP_FREEZER_KILLABLE);
+}
+
+static int freezer_killable_write(struct cgroup_subsys_state *css,
+				  struct cftype *cft, u64 val)
+{
+	struct freezer *freezer = css_freezer(css);
+
+	if (val > 1)
+		return -EINVAL;
+
+	mutex_lock(&freezer_mutex);
+
+	if (val == !!(freezer->state & CGROUP_FREEZER_KILLABLE))
+		goto out;
+
+	if (val)
+		freezer->state |= CGROUP_FREEZER_KILLABLE;
+	else
+		freezer->state &= ~CGROUP_FREEZER_KILLABLE;
+
+
+	/*
+	 * Let __refrigerator spin once for each task to set it into the
+	 * appropriate state.
+	 */
+	unfreeze_cgroup(freezer);
+
+out:
+	mutex_unlock(&freezer_mutex);
+
+	return 0;
+}
+
 static struct cftype files[] = {
 	{
 		.name = "state",
@@ -467,6 +524,12 @@ static struct cftype files[] = {
 		.flags = CFTYPE_NOT_ON_ROOT,
 		.read_u64 = freezer_parent_freezing_read,
 	},
+	{
+		.name = "killable",
+		.flags = CFTYPE_NOT_ON_ROOT,
+		.write_u64 = freezer_killable_write,
+		.read_u64 = freezer_killable_read,
+	},
 	{ }	/* terminate */
 };
 
diff --git a/kernel/freezer.c b/kernel/freezer.c
index dc520f01f99d..92de1bfe62cf 100644
--- a/kernel/freezer.c
+++ b/kernel/freezer.c
@@ -42,6 +42,9 @@ bool freezing_slow_path(struct task_struct *p)
 	if (test_tsk_thread_flag(p, TIF_MEMDIE))
 		return false;
 
+	if (cgroup_freezer_killable(p) && fatal_signal_pending(p))
+		return false;
+
 	if (pm_nosig_freezing || cgroup_freezing(p))
 		return true;
 
@@ -63,7 +66,12 @@ bool __refrigerator(bool check_kthr_stop)
 	pr_debug("%s entered refrigerator\n", current->comm);
 
 	for (;;) {
-		set_current_state(TASK_UNINTERRUPTIBLE);
+		bool killable = cgroup_freezer_killable(current);
+
+		if (killable)
+			set_current_state(TASK_INTERRUPTIBLE);
+		else
+			set_current_state(TASK_UNINTERRUPTIBLE);
 
 		spin_lock_irq(&freezer_lock);
 		current->flags |= PF_FROZEN;
@@ -75,6 +83,16 @@ bool __refrigerator(bool check_kthr_stop)
 		if (!(current->flags & PF_FROZEN))
 			break;
 		was_frozen = true;
+
+		/*
+		 * Now we're sure that there is no pending fatal signal.
+		 * Clear TIF_SIGPENDING to not get out of schedule()
+		 * immediately (if there is a non-fatal signal pending), and
+		 * put the task into sleep.
+		 */
+		if (killable)
+			clear_thread_flag(TIF_SIGPENDING);
+
 		schedule();
 	}

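The flag handling in the patch above (validation in `freezer_killable_write()`, the no-op on an unchanged value, and inheritance in `freezer_css_online()`) can be summarized in a userspace toy model. This is a sketch for reading the patch, not the kernel implementation: `struct toy_freezer`, the `TOY_*` flags and the `requeues` counter are hypothetical names, and the counter merely stands in for the call to `unfreeze_cgroup()`. Locking is left out entirely.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the flag bits used by the patch. */
enum toy_freezer_flags {
	TOY_FREEZING_SELF	= 1 << 1,
	TOY_FREEZING_PARENT	= 1 << 2,
	TOY_FROZEN		= 1 << 3,
	TOY_KILLABLE		= 1 << 4,
};

struct toy_freezer {
	unsigned int state;
	unsigned int requeues;	/* stands in for unfreeze_cgroup() calls */
};

/* Models the write handler: only 0 and 1 are valid, same value is a no-op. */
static int toy_killable_write(struct toy_freezer *f, uint64_t val)
{
	if (val > 1)
		return -EINVAL;

	if (val == !!(f->state & TOY_KILLABLE))
		return 0;

	if (val)
		f->state |= TOY_KILLABLE;
	else
		f->state &= ~(unsigned int)TOY_KILLABLE;

	/* The patch pokes the tasks here so they re-enter the refrigerator. */
	f->requeues++;
	return 0;
}

/* Models the css_online change: a child inherits the parent's killable bit. */
static void toy_css_online(struct toy_freezer *child,
			   const struct toy_freezer *parent)
{
	if (parent == NULL)
		return;

	if (parent->state & TOY_KILLABLE)
		child->state |= TOY_KILLABLE;

	if (parent->state & (TOY_FREEZING_SELF | TOY_FREEZING_PARENT))
		child->state |= TOY_FREEZING_PARENT | TOY_FROZEN;
}
```

In this model, writing 2 yields `-EINVAL`, writing 1 twice bumps `requeues` only once, and a child brought online under a killable parent starts out killable. Note that the model, like the patch as posted, copies the bit only at creation time; later changes on the parent are not propagated.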
The cgroup v2 freezer allows killing frozen processes without the need to
unfreeze them first. This is not possible with the v1 freezer, where processes
must be unfrozen before any pending kill signal can take effect.

Add a configurable option to allow killing frozen tasks in a way similar to
cgroups v2. Change the status of frozen tasks to TASK_INTERRUPTIBLE and reset
their PF_FROZEN flag on pending fatal signals. Use the run-time configurable
option freezer.killable to enable killability, preserving the pre-existing
behavior by default.

Signed-off-by: Marco Ballesio <balejs@google.com>
---
 .../cgroup-v1/freezer-subsystem.rst | 12 ++++
 include/linux/freezer.h             |  1 +
 kernel/cgroup/legacy_freezer.c      | 69 ++++++++++++++++++-
 kernel/freezer.c                    | 20 +++++-
 4 files changed, 98 insertions(+), 4 deletions(-)
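
From userspace, the option described in this commit message would be driven by plain writes to cgroupfs files. The following is a minimal sketch under stated assumptions: `freezer.killable` exists only on a kernel carrying this patch, the cgroup directory is passed in by the caller because the v1 freezer mount point varies between systems, and the helper names `cg_write()`, `cg_write_in()` and `freeze_killable()` are hypothetical. `freezer.state` and its `FROZEN` value are the existing v1 interface.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Write a short string to an existing control file. Returns 0 or -errno. */
static int cg_write(const char *path, const char *val)
{
	size_t len = strlen(val);
	ssize_t n;
	int err = 0;
	int fd = open(path, O_WRONLY);

	if (fd < 0)
		return -errno;

	n = write(fd, val, len);
	if (n < 0)
		err = -errno;
	else if ((size_t)n != len)
		err = -EIO;	/* short write, treated as an error here */

	close(fd);
	return err;
}

/* Build "<cgdir>/<file>" and write val to it. */
static int cg_write_in(const char *cgdir, const char *file, const char *val)
{
	char path[4096];
	int n = snprintf(path, sizeof(path), "%s/%s", cgdir, file);

	if (n < 0 || (size_t)n >= sizeof(path))
		return -ENAMETOOLONG;

	return cg_write(path, val);
}

/*
 * Opt the cgroup into killable freezing, then freeze it.
 * Assumes a patched kernel; on an unpatched one the first write
 * fails with -ENOENT because freezer.killable does not exist.
 */
static int freeze_killable(const char *cgdir)
{
	int err = cg_write_in(cgdir, "freezer.killable", "1");

	if (err)
		return err;

	return cg_write_in(cgdir, "freezer.state", "FROZEN");
}
```

Setting the flag before freezing keeps the order simple; per the patch, writing the flag on an already frozen cgroup would also work, since the write handler makes the tasks re-enter the refrigerator.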