Message ID | 20241130044909.work.541-kees@kernel.org (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | exec: Make sure task->comm is always NUL-terminated | expand |
Edited down to just the end result: On Fri, 29 Nov 2024 at 20:49, Kees Cook <kees@kernel.org> wrote: > > void __set_task_comm(struct task_struct *tsk, const char *buf, bool exec) > { > size_t len = min(strlen(buf), sizeof(tsk->comm) - 1); > > trace_task_rename(tsk, buf); > memcpy(tsk->comm, buf, len); > memset(&tsk->comm[len], 0, sizeof(tsk->comm) - len); > perf_event_comm(tsk, exec); > } I actually don't think that's super-safe either. Yeah, it works in practice, and the last byte is certainly always going to be 0, but it might not be reliably padded. Why? It walks over the source twice. First at strlen() time, then at memcpy. So if the source isn't stable, the end result might have odd results with NUL characters in the middle. And strscpy() really was *supposed* to be safe even in this case, and I thought it was until I looked closer. But I think strscpy() can be saved. Something (UNTESTED!) like the attached I think does the right thing. I added a couple of "READ_ONCE()" things to make it really super-clear that strscpy() reads the source exactly once, and to not allow any compiler re-materialization of the reads (although I think that when I asked people, it turns out neither gcc nor clang rematerialize memory accesses, so that READ_ONCE is likely more a documentation ad theoretical thing than a real thing). And yes, we could make the word-at-a-time case also know about masking the last word, but it's kind of annoying and depends on byte ordering. Hmm? I don't think your version is wrong, but I also think we'd be better off making our 'strscpy()' infrastructure explicitly safe wrt unstable source strings. Linus
diff --git a/fs/exec.c b/fs/exec.c index e0435b31a811..5f16500ac325 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -1200,16 +1200,16 @@ char *__get_task_comm(char *buf, size_t buf_size, struct task_struct *tsk) EXPORT_SYMBOL_GPL(__get_task_comm); /* - * These functions flushes out all traces of the currently running executable - * so that a new one can be started + * This is unlocked -- the string will always be NUL-terminated, but + * may show overlapping contents if racing concurrent reads. */ - void __set_task_comm(struct task_struct *tsk, const char *buf, bool exec) { - task_lock(tsk); + size_t len = min(strlen(buf), sizeof(tsk->comm) - 1); + trace_task_rename(tsk, buf); - strscpy_pad(tsk->comm, buf, sizeof(tsk->comm)); - task_unlock(tsk); + memcpy(tsk->comm, buf, len); + memset(&tsk->comm[len], 0, sizeof(tsk->comm) - len); perf_event_comm(tsk, exec); } diff --git a/include/linux/sched.h b/include/linux/sched.h index e6ee4258169a..ac9f429ddc17 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -1932,11 +1932,10 @@ static inline void kick_process(struct task_struct *tsk) { } #endif extern void __set_task_comm(struct task_struct *tsk, const char *from, bool exec); - -static inline void set_task_comm(struct task_struct *tsk, const char *from) -{ - __set_task_comm(tsk, from, false); -} +#define set_task_comm(tsk, from) ({ \ + BUILD_BUG_ON(sizeof(from) != TASK_COMM_LEN); \ + __set_task_comm(tsk, from, false); \ +}) extern char *__get_task_comm(char *to, size_t len, struct task_struct *tsk); #define get_task_comm(buf, tsk) ({ \ diff --git a/io_uring/io-wq.c b/io_uring/io-wq.c index a38f36b68060..5d0928f37471 100644 --- a/io_uring/io-wq.c +++ b/io_uring/io-wq.c @@ -634,7 +634,7 @@ static int io_wq_worker(void *data) struct io_wq_acct *acct = io_wq_get_acct(worker); struct io_wq *wq = worker->wq; bool exit_mask = false, last_timeout = false; - char buf[TASK_COMM_LEN]; + char buf[TASK_COMM_LEN] = {}; set_mask_bits(&worker->flags, 0, BIT(IO_WORKER_F_UP) | BIT(IO_WORKER_F_RUNNING)); diff --git a/io_uring/sqpoll.c b/io_uring/sqpoll.c index a26593979887..90011f06c7fb 100644 --- a/io_uring/sqpoll.c +++ b/io_uring/sqpoll.c @@ -271,7 +271,7 @@ static int io_sq_thread(void *data) struct io_ring_ctx *ctx; struct rusage start; unsigned long timeout = 0; - char buf[TASK_COMM_LEN]; + char buf[TASK_COMM_LEN] = {}; DEFINE_WAIT(wait); /* offload context creation failed, just exit */ diff --git a/kernel/kthread.c b/kernel/kthread.c index db4ceb0f503c..162d55811744 100644 --- a/kernel/kthread.c +++ b/kernel/kthread.c @@ -736,10 +736,11 @@ EXPORT_SYMBOL(kthread_stop_put); int kthreadd(void *unused) { + static const char comm[TASK_COMM_LEN] = "kthreadd"; struct task_struct *tsk = current; /* Setup a clean context for our children to inherit. */ - set_task_comm(tsk, "kthreadd"); + set_task_comm(tsk, comm); ignore_signals(tsk); set_cpus_allowed_ptr(tsk, housekeeping_cpumask(HK_TYPE_KTHREAD)); set_mems_allowed(node_states[N_MEMORY]);
Using strscpy() meant that the final character in task->comm may be non-NUL for a moment before the "string too long" truncation happens. Instead of adding a new use of the ambiguous strncpy(), we'd want to use memtostr_pad() which enforces being able to check at compile time that sizes are sensible, but this requires being able to see string buffer lengths. Instead of trying to inline __set_task_comm() (which needs to call trace and perf functions), just open-code it. But to make sure we're always safe, add compile-time checking like we already do for get_task_comm(). Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Suggested-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Kees Cook <kees@kernel.org> --- Cc: Eric Biederman <ebiederm@xmission.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: linux-mm@kvack.org Cc: linux-fsdevel@vger.kernel.org Here's what I'd prefer to use to clean up set_task_comm(). I merged Linus and Eric's suggestions and open-coded memtostr_pad(). --- fs/exec.c | 12 ++++++------ include/linux/sched.h | 9 ++++----- io_uring/io-wq.c | 2 +- io_uring/sqpoll.c | 2 +- kernel/kthread.c | 3 ++- 5 files changed, 14 insertions(+), 14 deletions(-)