ocfs2: use get_task_comm
diff mbox

Message ID 20171205152110.2050975-1-arnd@arndb.de
State New
Headers show

Commit Message

Arnd Bergmann Dec. 5, 2017, 3:20 p.m. UTC
While reviewing all callers of get_task_comm(), I stumbled
over this one that claimed it was not exported, when in fact
it is. Accessing task->comm directly is not safe, so better
convert this one to using get_task_comm as well.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 fs/ocfs2/cluster/netdebug.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

Comments

Arnd Bergmann Dec. 5, 2017, 8:27 p.m. UTC | #1
On Tue, Dec 5, 2017 at 8:19 PM, Kees Cook <keescook@chromium.org> wrote:
> On Tue, Dec 5, 2017 at 7:20 AM, Arnd Bergmann <arnd@arndb.de> wrote:
>> While reviewing all callers of get_task_comm(), I stumbled
>> over this one that claimed it was not exported, when in fact
>> it is. Accessing task->comm directly is not safe, so better
>> convert this one to using get_task_comm as well.
>
> Using get_task_comm() in cases like this is actually overkill (i.e.
> using up stack space), since there's (currently) no benefit. Nothing
> protects getting a "correct" view of task->comm (i.e. it could get
> updated in the middle of a copy), but it _is_ always NULL terminated,
> so it's safe to use with %s like this. While it does make me slightly
> uncomfortable to _depend_ on this NULL termination, but there are lots
> of open-coded %s users of task->comm. When we're trying to save a
> _copy_ of task->comm, then we want get_task_comm(), just to make sure
> we're doing it right.
>
> So, while I don't oppose this patch, it might be seen as a wasteful
> use of stack space.

It's only a few bytes of stack space in a leaf function, I'd not be
worried about that.

More generally speaking though, how exactly do we guarantee that
there is NUL-termination on tsk->comm during a concurrent update?
Could we ever get into a situation where overwrite the NUL byte
while setting tsk->comm to a longer string, and read the new start
of the string together with an unterminated end, or do we strictly
guarantee that the last byte is still NUL? I assume the latter is
true, just haven't found exactly where that guarantee is made.

      Arnd
Arnd Bergmann Dec. 5, 2017, 9:44 p.m. UTC | #2
On Tue, Dec 5, 2017 at 9:32 PM, Kees Cook <keescook@chromium.org> wrote:
> On Tue, Dec 5, 2017 at 12:27 PM, Arnd Bergmann <arnd@arndb.de> wrote:
>>
>> More generally speaking though, how exactly do we guarantee that
>> there is NUL-termination on tsk->comm during a concurrent update?
>> Could we ever get into a situation where overwrite the NUL byte
>> while setting tsk->comm to a longer string, and read the new start
>> of the string together with an unterminated end, or do we strictly
>> guarantee that the last byte is still NUL? I assume the latter is
>> true, just haven't found exactly where that guarantee is made.
>
> strncpy will zero pad with the trailing NULL, so it's supposed to
> always be safe... still gives me the creeps, though.

But set_task_comm uses strlcpy(), not strncpy(), so you might
get some of the old data back, the question is just whether it could
leak uninitialized data or part of the task_struct up to the next
NUL byte. I could not come up with any code path that would leave
a non-NUL byte in at the end of task->comm though, so it's
probably still safe.

       Arnd
Peter Zijlstra Dec. 5, 2017, 10:01 p.m. UTC | #3
On Tue, Dec 05, 2017 at 10:44:17PM +0100, Arnd Bergmann wrote:
> On Tue, Dec 5, 2017 at 9:32 PM, Kees Cook <keescook@chromium.org> wrote:
> > On Tue, Dec 5, 2017 at 12:27 PM, Arnd Bergmann <arnd@arndb.de> wrote:
> >>
> >> More generally speaking though, how exactly do we guarantee that
> >> there is NUL-termination on tsk->comm during a concurrent update?
> >> Could we ever get into a situation where overwrite the NUL byte
> >> while setting tsk->comm to a longer string, and read the new start
> >> of the string together with an unterminated end, or do we strictly
> >> guarantee that the last byte is still NUL? I assume the latter is
> >> true, just haven't found exactly where that guarantee is made.
> >
> > strncpy will zero pad with the trailing NULL, so it's supposed to
> > always be safe... still gives me the creeps, though.
> 
> But set_task_comm uses strlcpy(), not strncpy(), so you might
> get some of the old data back, the question is just whether it could
> leak uninitialized data or part of the task_struct up to the next
> NUL byte. I could not come up with any code path that would leave
> a non-NUL byte in at the end of task->comm though, so it's
> probably still safe.

So we used to have some magic code set_task_comm() which even included a
memory barrier etc.. But since none of the reading sites include a
memory barrier its all pointless.

There is no guarantee that a tsk->comm user reads the bytes in string
order.

The only thing that ensures we never run over, is the hard guarantee
that ->comm[TSK_COMM_LEN-1] == 0 at all times. If we don't trust
str*cpy() to do the right thing here, we could simply open code the
thing.

Patch
diff mbox

diff --git a/fs/ocfs2/cluster/netdebug.c b/fs/ocfs2/cluster/netdebug.c
index 74a21f6695c8..a51d001b89b1 100644
--- a/fs/ocfs2/cluster/netdebug.c
+++ b/fs/ocfs2/cluster/netdebug.c
@@ -130,6 +130,7 @@  static void *nst_seq_next(struct seq_file *seq, void *v, loff_t *pos)
 static int nst_seq_show(struct seq_file *seq, void *v)
 {
 	struct o2net_send_tracking *nst, *dummy_nst = seq->private;
+	char comm[TASK_COMM_LEN];
 	ktime_t now;
 	s64 sock, send, status;
 
@@ -142,8 +143,8 @@  static int nst_seq_show(struct seq_file *seq, void *v)
 	sock = ktime_to_us(ktime_sub(now, nst->st_sock_time));
 	send = ktime_to_us(ktime_sub(now, nst->st_send_time));
 	status = ktime_to_us(ktime_sub(now, nst->st_status_time));
+	get_task_comm(comm, nst->st_task);
 
-	/* get_task_comm isn't exported.  oh well. */
 	seq_printf(seq, "%p:\n"
 		   "  pid:          %lu\n"
 		   "  tgid:         %lu\n"
@@ -158,7 +159,7 @@  static int nst_seq_show(struct seq_file *seq, void *v)
 		   "  wait start:   %lld usecs ago\n",
 		   nst, (unsigned long)task_pid_nr(nst->st_task),
 		   (unsigned long)nst->st_task->tgid,
-		   nst->st_task->comm, nst->st_node,
+		   comm, nst->st_node,
 		   nst->st_sc, nst->st_id, nst->st_msg_type,
 		   nst->st_msg_key,
 		   (long long)sock,