Message ID | 20250201163106.28912-4-mjguzik@gmail.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | reduce tasklist_lock hold time on exit and do some | expand |
On 02/01, Mateusz Guzik wrote: > > Instead of smuggling the tty pointer directly, use a struct so that more > things can be added later. I am not sure this particular change worth the effort, but I won't argue. I'd like to know what Eric thinks. OTOH, if we do this, then perhaps we can do more "call tty_kref_put() lockless" changes later. And perhaps even add the new void tty_kref_put_sync(struct tty_struct *tty) { if (tty) kref_put(&tty->kref, release_one_tty); } helper. With this change release_task() doesn't need to abuse schedule_work(), and this helper can have more users. Nevermind, this is almost off-topic. Oleg.
On Mon, Feb 3, 2025 at 7:07 PM Oleg Nesterov <oleg@redhat.com> wrote: > > On 02/01, Mateusz Guzik wrote: > > > > Instead of smuggling the tty pointer directly, use a struct so that more > > things can be added later. > > I am not sure this particular change worth the effort, but I won't argue. > I'd like to know what Eric thinks. > it trivially whacks an atomic from an area protected by a global lock > OTOH, if we do this, then perhaps we can do more "call tty_kref_put() > lockless" changes later. And perhaps even add the new > > void tty_kref_put_sync(struct tty_struct *tty) > { > if (tty) > kref_put(&tty->kref, release_one_tty); > } > > helper. With this change release_task() doesn't need to abuse > schedule_work(), and this helper can have more users. > > Nevermind, this is almost off-topic. > I have no interest in messing with ttys, regardless I agree this is definitely something to consider *after* this patch is sorted out.
On 02/03, Mateusz Guzik wrote: > > On Mon, Feb 3, 2025 at 7:07 PM Oleg Nesterov <oleg@redhat.com> wrote: > > > > On 02/01, Mateusz Guzik wrote: > > > > > > Instead of smuggling the tty pointer directly, use a struct so that more > > > things can be added later. > > > > I am not sure this particular change worth the effort, but I won't argue. > > I'd like to know what Eric thinks. > > > > it trivially whacks an atomic from an area protected by a global lock Yes, but I am not sure this can make a noticeable difference... Let me repeat, I am not really arguing, I leave this to you and Eric. The patch looks correct and I have already acked it. Just it looks "less obvious" to me than other changes in this series. And even if we do this, I am not sure we need the new "struct release_task_post", it seems we can simply move tty_kref_put() from __exit_signal() to release_task(), see the patch at the end. Nobody can touch signal->tty after the group leader passes __exit_signal(), if nothing else lock_task_sighand() can't succeed. But again, feel free to ignore. Oleg. --- x/kernel/exit.c +++ x/kernel/exit.c @@ -146,7 +146,6 @@ static void __exit_signal(struct task_st struct signal_struct *sig = tsk->signal; bool group_dead = thread_group_leader(tsk); struct sighand_struct *sighand; - struct tty_struct *tty; u64 utime, stime; sighand = rcu_dereference_check(tsk->sighand, @@ -159,10 +158,7 @@ static void __exit_signal(struct task_st posix_cpu_timers_exit_group(tsk); #endif - if (group_dead) { - tty = sig->tty; - sig->tty = NULL; - } else { + if (!group_dead) { /* * If there is any task waiting for the group exit * then notify it: @@ -210,10 +206,8 @@ static void __exit_signal(struct task_st __cleanup_sighand(sighand); clear_tsk_thread_flag(tsk, TIF_SIGPENDING); - if (group_dead) { + if (group_dead) flush_sigqueue(&sig->shared_pending); - tty_kref_put(tty); - } } static void delayed_put_task_struct(struct rcu_head *rhp) @@ -279,6 +273,10 @@ repeat: proc_flush_pid(thread_pid); put_pid(thread_pid); release_thread(p); + + if (p == leader) + tty_kref_put(p->signal->tty); + put_task_struct_rcu_user(p); p = leader;
On Tue, Feb 4, 2025 at 12:23 PM Oleg Nesterov <oleg@redhat.com> wrote: > > On 02/03, Mateusz Guzik wrote: > > > > On Mon, Feb 3, 2025 at 7:07 PM Oleg Nesterov <oleg@redhat.com> wrote: > > > > > > On 02/01, Mateusz Guzik wrote: > > > > > > > > Instead of smuggling the tty pointer directly, use a struct so that more > > > > things can be added later. > > > > > > I am not sure this particular change worth the effort, but I won't argue. > > > I'd like to know what Eric thinks. > > > > > > > it trivially whacks an atomic from an area protected by a global lock > > Yes, but I am not sure this can make a noticeable difference... Let > me repeat, I am not really arguing, I leave this to you and Eric. > The patch looks correct and I have already acked it. Just it looks > "less obvious" to me than other changes in this series. Reducing lock hold time is basic multicore hygiene, doubly so for global locks. Atomics are notoriously slow even if cache-hot. I agree this change in isolation is not a big deal, but there are several other "not a big deal"s in the area which will add up if sorted out. Given the triviality of the change as far as complexity goes, I would argue that's a routine patch not warranting much of a discussion (modulo the new struct, for that see below). > > And even if we do this, I am not sure we need the new > "struct release_task_post", it seems we can simply move > tty_kref_put() from __exit_signal() to release_task(), see the > patch at the end. > > Nobody can touch signal->tty after the group leader passes __exit_signal(), > if nothing else lock_task_sighand() can't succeed. > I wanted to keep "whack the tty" logic intact in order reduce any discussion. ;) This brings me to the release_task_post struct. Suppose the tty thing does not get patched at all or gets patched the way you are proposing here. I still need a way to handle the pidmap stuff. Suppose to that end a pid array gets passed directly. Some time later another thing might pop up which gets determined within the lock and wants to return the state to clean up after the unlock. Then someone will need to add a mechanism of the same sort I'm adding here. iow I'm future-proofing this and tty just happens to be the first (even if not necessary) consumer. iow preferably the struct (or an equivalent) would still be there without tty stuff. Given the nature of the change there is a lot of opinionated handwaving possible and "happy to change" vs "feel free to ignore" is not a productive discussion, thus I would like to clarify: my primary interest is to whack the pidmap thing out of tasklist_lock-protected area and this bit I'm going to argue about. Everything else is minor touch ups which I can drop without much resistance (I'll note though there is more I can submit in the area later :P), but if you genuinely want something changed, it would be best explicitly say it. As is, given the prior ack, I intend to submit v4 with minor touch ups. > But again, feel free to ignore. > > Oleg. > > --- x/kernel/exit.c > +++ x/kernel/exit.c > @@ -146,7 +146,6 @@ static void __exit_signal(struct task_st > struct signal_struct *sig = tsk->signal; > bool group_dead = thread_group_leader(tsk); > struct sighand_struct *sighand; > - struct tty_struct *tty; > u64 utime, stime; > > sighand = rcu_dereference_check(tsk->sighand, > @@ -159,10 +158,7 @@ static void __exit_signal(struct task_st > posix_cpu_timers_exit_group(tsk); > #endif > > - if (group_dead) { > - tty = sig->tty; > - sig->tty = NULL; > - } else { > + if (!group_dead) { > /* > * If there is any task waiting for the group exit > * then notify it: > @@ -210,10 +206,8 @@ static void __exit_signal(struct task_st > > __cleanup_sighand(sighand); > clear_tsk_thread_flag(tsk, TIF_SIGPENDING); > - if (group_dead) { > + if (group_dead) > flush_sigqueue(&sig->shared_pending); > - tty_kref_put(tty); > - } > } > > static void delayed_put_task_struct(struct rcu_head *rhp) > @@ -279,6 +273,10 @@ repeat: > proc_flush_pid(thread_pid); > put_pid(thread_pid); > release_thread(p); > + > + if (p == leader) > + tty_kref_put(p->signal->tty); > + > put_task_struct_rcu_user(p); > > p = leader; >
diff --git a/kernel/exit.c b/kernel/exit.c index 257dd8ed45ea..d2c74f93f7d2 100644 --- a/kernel/exit.c +++ b/kernel/exit.c @@ -122,6 +122,13 @@ static __init int kernel_exit_sysfs_init(void) late_initcall(kernel_exit_sysfs_init); #endif +/* + * For things release_task() would like to do *after* tasklist_lock is released. + */ +struct release_task_post { + struct tty_struct *tty; +}; + static void __unhash_process(struct task_struct *p, bool group_dead) { nr_threads--; @@ -141,12 +148,11 @@ static void __unhash_process(struct task_struct *p, bool group_dead) /* * This function expects the tasklist_lock write-locked. */ -static void __exit_signal(struct task_struct *tsk) +static void __exit_signal(struct release_task_post *post, struct task_struct *tsk) { struct signal_struct *sig = tsk->signal; bool group_dead = thread_group_leader(tsk); struct sighand_struct *sighand; - struct tty_struct *tty; u64 utime, stime; sighand = rcu_dereference_check(tsk->sighand, @@ -160,7 +166,7 @@ static void __exit_signal(struct task_struct *tsk) #endif if (group_dead) { - tty = sig->tty; + post->tty = sig->tty; sig->tty = NULL; } else { /* @@ -207,10 +213,8 @@ static void __exit_signal(struct task_struct *tsk) __cleanup_sighand(sighand); clear_tsk_thread_flag(tsk, TIF_SIGPENDING); - if (group_dead) { + if (group_dead) flush_sigqueue(&sig->shared_pending); - tty_kref_put(tty); - } } static void delayed_put_task_struct(struct rcu_head *rhp) @@ -236,10 +240,13 @@ void __weak release_thread(struct task_struct *dead_task) void release_task(struct task_struct *p) { + struct release_task_post post; struct task_struct *leader; struct pid *thread_pid; int zap_leader; repeat: + memset(&post, 0, sizeof(post)); + /* don't need to get the RCU readlock here - the process is dead and * can't be modifying its own credentials. But shut RCU-lockdep up */ rcu_read_lock(); @@ -252,7 +259,7 @@ void release_task(struct task_struct *p) write_lock_irq(&tasklist_lock); ptrace_release_task(p); - __exit_signal(p); + __exit_signal(&post, p); /* * If we are the last non-leader member of the thread @@ -280,6 +287,7 @@ void release_task(struct task_struct *p) sizeof(unsigned long long)); release_thread(p); put_task_struct_rcu_user(p); + tty_kref_put(post.tty); p = leader; if (unlikely(zap_leader))
Instead of smuggling the tty pointer directly, use a struct so that more things can be added later. Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> --- kernel/exit.c | 22 +++++++++++++++------- 1 file changed, 15 insertions(+), 7 deletions(-)