Message ID | 20231202030410.61047-1-inwardvessel@gmail.com (mailing list archive)
---|---
State | Superseded
Series | 9p: prevent read overrun in protocol dump tracepoint

Context | Check | Description
---|---|---
netdev/tree_selection | success | Not a local patch
JP Kobryn wrote on Fri, Dec 01, 2023 at 07:04:10PM -0800:
> An out of bounds read can occur within the tracepoint 9p_protocol_dump().
> In the fast assign, there is a memcpy that uses a constant size of 32
> (macro definition as P9_PROTO_DUMP_SZ). When the copy is invoked, the
> source buffer is not guaranteed to match this size. It was found that in
> some cases the source buffer size is less than 32, resulting in a read
> that overruns.
>
> The size of the source buffer is known at the time the tracepoint is
> invoked. The allocations happen within p9_fcall_init(), where the
> capacity field is set to the allocated size of the payload buffer. This
> patch fixes the overrun by using the minimum of that field (the size of
> the source buffer) and the size of the destination buffer when
> performing the copy.

Good catch; this is a regression due to a semi-recent optimization in
commit 60ece0833b6c ("net/9p: allocate appropriate reduced message
buffers").

For some reason I thought we rounded up small message allocations to 4k,
but I've just confirmed we don't, so these overruns are quite frequent.
I'll add the Fixes tag and cc stable if there's no other comment.

Thanks!

> A repro can be performed in different ways. The simplest might be
> mounting a shared filesystem (between host and guest VM) using the Plan
> 9 protocol while the tracepoint is enabled.
>
>   mount -t 9p -o trans=virtio <mount_tag> <mount_path>
>
> The bpftrace program below can be used to show the out of bounds read.
> Note that a recent version of bpftrace is needed for the raw tracepoint
> support. The script was tested using v0.19.0.
>
> /* from include/net/9p/9p.h */
> struct p9_fcall {
> 	u32 size;
> 	u8 id;
> 	u16 tag;
>
> 	size_t offset;
> 	size_t capacity;
>
> 	struct kmem_cache *cache;
> 	u8 *sdata;
> 	bool zc;
> };
>
> tracepoint:9p:9p_protocol_dump
> {
> 	/* out of bounds read can happen when this tracepoint is enabled */
> }
>
> rawtracepoint:9p_protocol_dump
> {
> 	$pdu = (struct p9_fcall *)arg1;
> 	$dump_sz = (uint64)32;
>
> 	if ($dump_sz > $pdu->capacity) {
> 		printf("reading %zu bytes from src buffer of %zu bytes\n",
> 			$dump_sz, $pdu->capacity);
> 	}
> }
>
> Signed-off-by: JP Kobryn <inwardvessel@gmail.com>
> ---
>  include/trace/events/9p.h | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/include/trace/events/9p.h b/include/trace/events/9p.h
> index 4dfa6d7f83ba..8690a7086252 100644
> --- a/include/trace/events/9p.h
> +++ b/include/trace/events/9p.h
> @@ -185,7 +185,8 @@ TRACE_EVENT(9p_protocol_dump,
>  		__entry->clnt = clnt;
>  		__entry->type = pdu->id;
>  		__entry->tag = pdu->tag;
> -		memcpy(__entry->line, pdu->sdata, P9_PROTO_DUMP_SZ);
> +		memcpy(__entry->line, pdu->sdata,
> +		       min(pdu->capacity, P9_PROTO_DUMP_SZ));
>  	),
>  	TP_printk("clnt %lu %s(tag = %d)\n%.3x: %16ph\n%.3x: %16ph\n",
>  		  (unsigned long)__entry->clnt, show_9p_op(__entry->type),
asmadeus@codewreck.org wrote on Sat, Dec 02, 2023 at 01:35:18PM +0900:
> > diff --git a/include/trace/events/9p.h b/include/trace/events/9p.h
> > index 4dfa6d7f83ba..8690a7086252 100644
> > --- a/include/trace/events/9p.h
> > +++ b/include/trace/events/9p.h
> > @@ -185,7 +185,8 @@ TRACE_EVENT(9p_protocol_dump,
> >  		__entry->clnt = clnt;
> >  		__entry->type = pdu->id;
> >  		__entry->tag = pdu->tag;
> > -		memcpy(__entry->line, pdu->sdata, P9_PROTO_DUMP_SZ);
> > +		memcpy(__entry->line, pdu->sdata,
> > +		       min(pdu->capacity, P9_PROTO_DUMP_SZ));

Building with W=1 yields a warning:

./include/linux/minmax.h:21:35: warning: comparison of distinct pointer types lacks a cast
...
./include/trace/events/9p.h:189:33: note: in expansion of macro ‘min’
  189 |                min(pdu->capacity, P9_PROTO_DUMP_SZ));

I've updated the patch to:

+		       min_t(size_t, pdu->capacity, P9_PROTO_DUMP_SZ));

and pushed to my -next branch:
https://github.com/martinetd/linux/commits/9p-next
On Saturday, December 2, 2023 5:35:18 AM CET asmadeus@codewreck.org wrote:
> JP Kobryn wrote on Fri, Dec 01, 2023 at 07:04:10PM -0800:
> > An out of bounds read can occur within the tracepoint 9p_protocol_dump().
> > [...]
>
> Good catch; this is a regression due to a semi-recent optimization in
> commit 60ece0833b6c ("net/9p: allocate appropriate reduced message
> buffers")

Indeed, didn't have this one on screen! Thanks!

> For some reason I thought we rounded up small messages alloc to 4k but
> I've just confirmed we don't, so these overruns are quite frequent.
> I'll add the fixes tag and cc to stable if there's no other comment.

Yeah, in p9_msg_buf_size() [net/9p/protocol.c] the smallest allocation
size for message types known (at compile time) to be small is hard-coded
to 4k. However, for all variable-size message types the size is computed
at runtime, exactly as needed for the particular message being sent. So
those 9p message types can trigger this case (<32); they are currently
never rounded up.

[...]
> > diff --git a/include/trace/events/9p.h b/include/trace/events/9p.h
> > index 4dfa6d7f83ba..8690a7086252 100644
> > --- a/include/trace/events/9p.h
> > +++ b/include/trace/events/9p.h
> > @@ -185,7 +185,8 @@ TRACE_EVENT(9p_protocol_dump,
> >  		__entry->clnt = clnt;
> >  		__entry->type = pdu->id;
> >  		__entry->tag = pdu->tag;
> > -		memcpy(__entry->line, pdu->sdata, P9_PROTO_DUMP_SZ);
> > +		memcpy(__entry->line, pdu->sdata,
> > +		       min(pdu->capacity, P9_PROTO_DUMP_SZ));
> >  	),
> >  	TP_printk("clnt %lu %s(tag = %d)\n%.3x: %16ph\n%.3x: %16ph\n",
> >  		  (unsigned long)__entry->clnt, show_9p_op(__entry->type),

AFAICS __entry is a local variable on the stack, and the array
__entry->line is not initialized with zeros, i.e. the dump would contain
trash at the end. Maybe prepend a memset() before the memcpy()?

/Christian
On Sat, 02 Dec 2023 14:05:24 +0100
Christian Schoenebeck <linux_oss@crudebyte.com> wrote:

> > > --- a/include/trace/events/9p.h
> > > +++ b/include/trace/events/9p.h
> > > @@ -185,7 +185,8 @@ TRACE_EVENT(9p_protocol_dump,
> > >  		__entry->clnt = clnt;
> > >  		__entry->type = pdu->id;
> > >  		__entry->tag = pdu->tag;
> > > -		memcpy(__entry->line, pdu->sdata, P9_PROTO_DUMP_SZ);
> > > +		memcpy(__entry->line, pdu->sdata,
> > > +		       min(pdu->capacity, P9_PROTO_DUMP_SZ));
> > >  	),
> > >  	TP_printk("clnt %lu %s(tag = %d)\n%.3x: %16ph\n%.3x: %16ph\n",
> > >  		  (unsigned long)__entry->clnt, show_9p_op(__entry->type),
>
> AFAICS __entry is a local variable on the stack, and the array
> __entry->line is not initialized with zeros, i.e. the dump would contain
> trash at the end. Maybe prepend a memset() before the memcpy()?

__entry is a macro that points into the ring buffer that gets allocated
before this is called. TRACE_EVENT() has a __dynamic_array() field that
can handle variable length arrays. What you can do is turn this into
something like:

TRACE_EVENT(9p_protocol_dump,
	    TP_PROTO(struct p9_client *clnt, struct p9_fcall *pdu),

	    TP_ARGS(clnt, pdu),

	    TP_STRUCT__entry(
			     __field(	void *,		clnt	)
			     __field(	__u8,		type	)
			     __field(	__u16,		tag	)
			     __dynamic_array(unsigned char, line,
					     min(pdu->capacity, P9_PROTO_DUMP_SZ))
			     ),

	    TP_fast_assign(
			   __entry->clnt = clnt;
			   __entry->type = pdu->id;
			   __entry->tag = pdu->tag;
			   memcpy(__get_dynamic_array(line), pdu->sdata,
				  min(pdu->capacity, P9_PROTO_DUMP_SZ));
			   ),
	    TP_printk("clnt %lu %s(tag = %d)\n%.3x: %16ph\n%.3x: %16ph\n",
		      (unsigned long)__entry->clnt, show_9p_op(__entry->type),
		      __entry->tag, 0, __get_dynamic_array(line), 16,
		      __get_dynamic_array(line) + 16)
);

-- Steve
Steven Rostedt wrote on Sat, Dec 02, 2023 at 08:14:09PM -0500:
> > AFAICS __entry is a local variable on the stack, and the array
> > __entry->line is not initialized with zeros, i.e. the dump would contain
> > trash at the end. Maybe prepend a memset() before the memcpy()?

Well spotted! Now that I think about it, we weren't initializing the
source buffer either back when we had (>32) msize allocations, so these
were already printing garbage, but we might as well get this sorted out
while we're here.

> __entry is a macro that points into the ring buffer that gets allocated
> before this is called. TRACE_EVENT() has a __dynamic_array() field that
> can handle variable length arrays. What you can do is turn this into
> something like:
> [...]
> 	    TP_printk("clnt %lu %s(tag = %d)\n%.3x: %16ph\n%.3x: %16ph\n",
> 		      (unsigned long)__entry->clnt, show_9p_op(__entry->type),
> 		      __entry->tag, 0, __get_dynamic_array(line), 16,
> 		      __get_dynamic_array(line) + 16)

This was just printing garbage in the previous version, but %16ph with a
dynamic alloc would be out of range (even the start of the second half,
__get_dynamic_array(line) + 16, can be out of range).

Also, for custom tracepoints, e.g. bpftrace, the program needs to know
how many bytes can be read safely, even if it's just for dumping --
unless a dynamic_array is a "fat pointer" that conveys its own size?
(Sorry, didn't take the time to check.)

So I see two ways forward:
- We can give up on the 16-byte split here, add the size in one of the
  fields, and print with %*ph using that size.
- Or just give up and zero the tail; I'm surprised there's no "memcpy up
  to x bytes and zero up to y bytes if required" helper, but Christian's
  suggestion of always doing a memset first is probably not that bad
  performance-wise if someone is dumping these out already.

I don't have a hard preference here; what do you think?
On Sun, 3 Dec 2023 10:33:32 +0900
Dominique Martinet <asmadeus@codewreck.org> wrote:

> > 	    TP_printk("clnt %lu %s(tag = %d)\n%.3x: %16ph\n%.3x: %16ph\n",
> > 		      (unsigned long)__entry->clnt, show_9p_op(__entry->type),
> > 		      __entry->tag, 0, __get_dynamic_array(line), 16,
> > 		      __get_dynamic_array(line) + 16)
>
> This was just printing garbage in the previous version but %16ph with a
> dynamic alloc would be out of range (even the start of the next buffer,
> __get_dynamic_array(line) + 16, can be out of range)
>
> Also, for custom tracepoints e.g. bpftrace the program needs to know how
> many bytes can be read safely even if it's just for dumping -- unless
> dynamic_array is a "fat pointer" that conveys its own size?
> (Sorry didn't take the time to check)

Yes, there's also a __get_dynamic_array_len(line) that will return the
allocated length of the line. Is that what you need?

-- Steve
Steven Rostedt wrote on Sat, Dec 02, 2023 at 11:15:24PM -0500:
> > Also, for custom tracepoints e.g. bpftrace the program needs to know how
> > many bytes can be read safely even if it's just for dumping -- unless
> > dynamic_array is a "fat pointer" that conveys its own size?
> > (Sorry didn't take the time to check)
>
> Yes, there's also a __get_dynamic_array_len(line) that will return the
> allocated length of the line. Is that what you need?

Yes, thanks! So the lower two bytes of the field are its position in the
entry and the higher two bytes its size; ok.
It doesn't look like bpftrace has any helper for it, but that can
probably be sorted out if someone wants to dump data there.

Let's update the event to use a dynamic array and have the printk format
use %*ph with that length.

JP Kobryn, does that sound good to you? I'm not sure what you were
trying to do in the first place.
Do you want to send a v2, or shall I?
On Sun, Dec 03, 2023 at 02:32:15PM +0900, Dominique Martinet wrote:
> Steven Rostedt wrote on Sat, Dec 02, 2023 at 11:15:24PM -0500:
> > Yes, there's also a __get_dynamic_array_len(line) that will return the
> > allocated length of the line. Is that what you need?
>
> [...]
>
> Let's update the event to use a dynamic array and have the printk format
> use %*ph with that length.
>
> JP Kobryn, does that sound good to you? I'm not sure what you were
> trying to do in the first place.
> Do you want to send a v2, or shall I?

Sounds good. I'll send out a v2. Thanks Steve for recommending the
dynamic array macros.

JP

> --
> Dominique Martinet | Asmadeus
diff --git a/include/trace/events/9p.h b/include/trace/events/9p.h
index 4dfa6d7f83ba..8690a7086252 100644
--- a/include/trace/events/9p.h
+++ b/include/trace/events/9p.h
@@ -185,7 +185,8 @@ TRACE_EVENT(9p_protocol_dump,
 		__entry->clnt = clnt;
 		__entry->type = pdu->id;
 		__entry->tag = pdu->tag;
-		memcpy(__entry->line, pdu->sdata, P9_PROTO_DUMP_SZ);
+		memcpy(__entry->line, pdu->sdata,
+		       min(pdu->capacity, P9_PROTO_DUMP_SZ));
 	),
 	TP_printk("clnt %lu %s(tag = %d)\n%.3x: %16ph\n%.3x: %16ph\n",
 		  (unsigned long)__entry->clnt, show_9p_op(__entry->type),
An out of bounds read can occur within the tracepoint 9p_protocol_dump().
In the fast assign, there is a memcpy that uses a constant size of 32
(macro definition as P9_PROTO_DUMP_SZ). When the copy is invoked, the
source buffer is not guaranteed to match this size. It was found that in
some cases the source buffer size is less than 32, resulting in a read
that overruns.

The size of the source buffer is known at the time the tracepoint is
invoked. The allocations happen within p9_fcall_init(), where the
capacity field is set to the allocated size of the payload buffer. This
patch fixes the overrun by using the minimum of that field (the size of
the source buffer) and the size of the destination buffer when
performing the copy.

A repro can be performed in different ways. The simplest might be
mounting a shared filesystem (between host and guest VM) using the Plan
9 protocol while the tracepoint is enabled.

  mount -t 9p -o trans=virtio <mount_tag> <mount_path>

The bpftrace program below can be used to show the out of bounds read.
Note that a recent version of bpftrace is needed for the raw tracepoint
support. The script was tested using v0.19.0.

/* from include/net/9p/9p.h */
struct p9_fcall {
	u32 size;
	u8 id;
	u16 tag;

	size_t offset;
	size_t capacity;

	struct kmem_cache *cache;
	u8 *sdata;
	bool zc;
};

tracepoint:9p:9p_protocol_dump
{
	/* out of bounds read can happen when this tracepoint is enabled */
}

rawtracepoint:9p_protocol_dump
{
	$pdu = (struct p9_fcall *)arg1;
	$dump_sz = (uint64)32;

	if ($dump_sz > $pdu->capacity) {
		printf("reading %zu bytes from src buffer of %zu bytes\n",
			$dump_sz, $pdu->capacity);
	}
}

Signed-off-by: JP Kobryn <inwardvessel@gmail.com>
---
 include/trace/events/9p.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)