
[bpf,2/4] libbpf: Handle size overflow for ringbuf mmap

Message ID 20221111092642.2333724-3-houtao@huaweicloud.com (mailing list archive)
State Superseded
Delegated to: BPF
Series libbpf: Fixes for ring buffer

Checks

Context Check Description
netdev/tree_selection success Clearly marked for bpf
netdev/fixes_present success Fixes tag present in non-next series
netdev/subject_prefix success Link
netdev/cover_letter success Series has a cover letter
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers success CCed 12 of 12 maintainers
netdev/build_clang success Errors and warnings before: 0 this patch: 0
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 23 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-VM_Test-3 fail Logs for build for aarch64 with gcc
bpf/vmtest-bpf-VM_Test-4 fail Logs for build for aarch64 with llvm-16
bpf/vmtest-bpf-VM_Test-5 success Logs for build for s390x with gcc
bpf/vmtest-bpf-VM_Test-6 success Logs for build for x86_64 with gcc
bpf/vmtest-bpf-VM_Test-7 success Logs for build for x86_64 with llvm-16
bpf/vmtest-bpf-VM_Test-8 success Logs for llvm-toolchain
bpf/vmtest-bpf-VM_Test-9 success Logs for set-matrix
bpf/vmtest-bpf-PR success PR summary
bpf/vmtest-bpf-VM_Test-2 success Logs for llvm-toolchain
bpf/vmtest-bpf-VM_Test-1 success Logs for ShellCheck

Commit Message

Hou Tao Nov. 11, 2022, 9:26 a.m. UTC
From: Hou Tao <houtao1@huawei.com>

The maximum size of a ringbuf is 2GB on an x86-64 host, so 2 * max_entries
will overflow u32 when mapping the producer page and data pages. Casting
max_entries to size_t alone is not enough, because for a 32-bit
application on a 64-bit kernel the size of the read-only mmap region
could also overflow size_t.

Fix it by casting the size of the read-only mmap region to __u64 and
checking whether it would overflow size_t before doing the mmap.

Fixes: bf99c936f947 ("libbpf: Add BPF ring buffer support")
Signed-off-by: Hou Tao <houtao1@huawei.com>
---
 tools/lib/bpf/ringbuf.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)
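
A minimal standalone sketch (not part of the patch) makes the wraparound
concrete: both operands of the old length expression are 32-bit, so with
max_entries = 2GB the addition wraps before mmap() ever sees it. The
constants below are illustrative.

#include <stdio.h>
#include <linux/types.h>

int main(void)
{
	__u32 page_size = 4096;
	__u32 max_entries = 1U << 31;	/* 2GB ring buffer */
	/* old expression: evaluated in 32 bits, 2 * 2GB wraps to 0 */
	__u32 bad = page_size + 2 * max_entries;
	/* fixed expression: widened to 64 bits before the multiply */
	__u64 good = page_size + 2 * (__u64)max_entries;

	printf("u32 length: %u\n", bad);	/* prints 4096 */
	printf("u64 length: %llu\n", (unsigned long long)good);
	return 0;
}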

Comments

Stanislav Fomichev Nov. 11, 2022, 5:54 p.m. UTC | #1
On 11/11, Hou Tao wrote:
> From: Hou Tao <houtao1@huawei.com>

> The maximum size of a ringbuf is 2GB on an x86-64 host, so 2 * max_entries
> will overflow u32 when mapping the producer page and data pages. Casting
> max_entries to size_t alone is not enough, because for a 32-bit
> application on a 64-bit kernel the size of the read-only mmap region
> could also overflow size_t.

> Fix it by casting the size of the read-only mmap region to __u64 and
> checking whether it would overflow size_t before doing the mmap.

> Fixes: bf99c936f947 ("libbpf: Add BPF ring buffer support")
> Signed-off-by: Hou Tao <houtao1@huawei.com>
> ---
>   tools/lib/bpf/ringbuf.c | 11 +++++++++--
>   1 file changed, 9 insertions(+), 2 deletions(-)

> diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
> index d285171d4b69..c4bdc88af672 100644
> --- a/tools/lib/bpf/ringbuf.c
> +++ b/tools/lib/bpf/ringbuf.c
> @@ -77,6 +77,7 @@ int ring_buffer__add(struct ring_buffer *rb, int map_fd,
>   	__u32 len = sizeof(info);
>   	struct epoll_event *e;
>   	struct ring *r;
> +	__u64 ro_size;
>   	void *tmp;
>   	int err;

> @@ -129,8 +130,14 @@ int ring_buffer__add(struct ring_buffer *rb, int map_fd,
>   	 * data size to allow simple reading of samples that wrap around the
>   	 * end of a ring buffer. See kernel implementation for details.
>   	 * */
> -	tmp = mmap(NULL, rb->page_size + 2 * info.max_entries, PROT_READ,
> -		   MAP_SHARED, map_fd, rb->page_size);
> +	ro_size = rb->page_size + 2 * (__u64)info.max_entries;

[..]

> +	if (ro_size != (__u64)(size_t)ro_size) {
> +		pr_warn("ringbuf: ring buffer size (%u) is too big\n",
> +			info.max_entries);
> +		return libbpf_err(-E2BIG);
> +	}

Why do we need this check at all? IIUC, the problem is that the expression
"rb->page_size + 2 * info.max_entries" is evaluated as u32 and can
overflow. So why isn't doing just this part enough?

size_t mmap_size = rb->page_size + 2 * (size_t)info.max_entries;
mmap(NULL, mmap_size, PROT_READ, MAP_SHARED, map_fd, ...);

sizeof(size_t) should be 8, so no overflow is possible?


> +	tmp = mmap(NULL, (size_t)ro_size, PROT_READ, MAP_SHARED, map_fd,
> +		   rb->page_size);
>   	if (tmp == MAP_FAILED) {
>   		err = -errno;
>   		ringbuf_unmap_ring(rb, r);
> --
> 2.29.2
Andrii Nakryiko Nov. 11, 2022, 8:56 p.m. UTC | #2
On Fri, Nov 11, 2022 at 9:54 AM <sdf@google.com> wrote:
>
> On 11/11, Hou Tao wrote:
> > From: Hou Tao <houtao1@huawei.com>
>
> > The maximum size of a ringbuf is 2GB on an x86-64 host, so 2 * max_entries
> > will overflow u32 when mapping the producer page and data pages. Casting
> > max_entries to size_t alone is not enough, because for a 32-bit
> > application on a 64-bit kernel the size of the read-only mmap region
> > could also overflow size_t.
>
> > Fix it by casting the size of the read-only mmap region to __u64 and
> > checking whether it would overflow size_t before doing the mmap.
>
> > Fixes: bf99c936f947 ("libbpf: Add BPF ring buffer support")
> > Signed-off-by: Hou Tao <houtao1@huawei.com>
> > ---
> >   tools/lib/bpf/ringbuf.c | 11 +++++++++--
> >   1 file changed, 9 insertions(+), 2 deletions(-)
>
> > diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
> > index d285171d4b69..c4bdc88af672 100644
> > --- a/tools/lib/bpf/ringbuf.c
> > +++ b/tools/lib/bpf/ringbuf.c
> > @@ -77,6 +77,7 @@ int ring_buffer__add(struct ring_buffer *rb, int map_fd,
> >       __u32 len = sizeof(info);
> >       struct epoll_event *e;
> >       struct ring *r;
> > +     __u64 ro_size;

I found ro_size quite a confusing name, let's call it mmap_sz?

> >       void *tmp;
> >       int err;
>
> > @@ -129,8 +130,14 @@ int ring_buffer__add(struct ring_buffer *rb, int map_fd,
> >        * data size to allow simple reading of samples that wrap around the
> >        * end of a ring buffer. See kernel implementation for details.
> >        * */
> > -     tmp = mmap(NULL, rb->page_size + 2 * info.max_entries, PROT_READ,
> > -                MAP_SHARED, map_fd, rb->page_size);
> > +     ro_size = rb->page_size + 2 * (__u64)info.max_entries;
>
> [..]
>
> > +     if (ro_size != (__u64)(size_t)ro_size) {
> > +             pr_warn("ringbuf: ring buffer size (%u) is too big\n",
> > +                     info.max_entries);
> > +             return libbpf_err(-E2BIG);
> > +     }
>
> Why do we need this check at all? IIUC, the problem is that the expression
> "rb->page_size + 2 * info.max_entries" is evaluated as u32 and can
> overflow. So why isn't doing just this part enough?
>
> size_t mmap_size = rb->page_size + 2 * (size_t)info.max_entries;
> mmap(NULL, mmap_size, PROT_READ, MAP_SHARED, map_fd, ...);
>
> sizeof(size_t) should be 8, so no overflow is possible?

not on 32-bit arches, presumably?



>
>
> > +     tmp = mmap(NULL, (size_t)ro_size, PROT_READ, MAP_SHARED, map_fd,
> > +                rb->page_size);

should we split this mmap into two mmaps -- one for the producer_pos page,
another for the data area? That would presumably allow mmaping a ringbuf
with max_entries = 1GB?

> >       if (tmp == MAP_FAILED) {
> >               err = -errno;
> >               ringbuf_unmap_ring(rb, r);
> > --
> > 2.29.2
>
Stanislav Fomichev Nov. 11, 2022, 9:24 p.m. UTC | #3
On Fri, Nov 11, 2022 at 12:56 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Fri, Nov 11, 2022 at 9:54 AM <sdf@google.com> wrote:
> >
> > On 11/11, Hou Tao wrote:
> > > From: Hou Tao <houtao1@huawei.com>
> >
> > > The maximum size of a ringbuf is 2GB on an x86-64 host, so 2 * max_entries
> > > will overflow u32 when mapping the producer page and data pages. Casting
> > > max_entries to size_t alone is not enough, because for a 32-bit
> > > application on a 64-bit kernel the size of the read-only mmap region
> > > could also overflow size_t.
> >
> > > Fix it by casting the size of the read-only mmap region to __u64 and
> > > checking whether it would overflow size_t before doing the mmap.
> >
> > > Fixes: bf99c936f947 ("libbpf: Add BPF ring buffer support")
> > > Signed-off-by: Hou Tao <houtao1@huawei.com>
> > > ---
> > >   tools/lib/bpf/ringbuf.c | 11 +++++++++--
> > >   1 file changed, 9 insertions(+), 2 deletions(-)
> >
> > > diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
> > > index d285171d4b69..c4bdc88af672 100644
> > > --- a/tools/lib/bpf/ringbuf.c
> > > +++ b/tools/lib/bpf/ringbuf.c
> > > @@ -77,6 +77,7 @@ int ring_buffer__add(struct ring_buffer *rb, int map_fd,
> > >       __u32 len = sizeof(info);
> > >       struct epoll_event *e;
> > >       struct ring *r;
> > > +     __u64 ro_size;
>
> I found ro_size quite a confusing name, let's call it mmap_sz?
>
> > >       void *tmp;
> > >       int err;
> >
> > > @@ -129,8 +130,14 @@ int ring_buffer__add(struct ring_buffer *rb, int map_fd,
> > >        * data size to allow simple reading of samples that wrap around the
> > >        * end of a ring buffer. See kernel implementation for details.
> > >        * */
> > > -     tmp = mmap(NULL, rb->page_size + 2 * info.max_entries, PROT_READ,
> > > -                MAP_SHARED, map_fd, rb->page_size);
> > > +     ro_size = rb->page_size + 2 * (__u64)info.max_entries;
> >
> > [..]
> >
> > > +     if (ro_size != (__u64)(size_t)ro_size) {
> > > +             pr_warn("ringbuf: ring buffer size (%u) is too big\n",
> > > +                     info.max_entries);
> > > +             return libbpf_err(-E2BIG);
> > > +     }
> >
> > Why do we need this check at all? IIUC, the problem is that the expression
> > "rb->page_size + 2 * info.max_entries" is evaluated as u32 and can
> > overflow. So why isn't doing just this part enough?
> >
> > size_t mmap_size = rb->page_size + 2 * (size_t)info.max_entries;
> > mmap(NULL, mmap_size, PROT_READ, MAP_SHARED, map_fd, ...);
> >
> > sizeof(size_t) should be 8, so no overflow is possible?
>
> not on 32-bit arches, presumably?

Good point, he even mentions it in the description, I can't read apparently :-/

"Only casting max_entries to size_t is not enough"

>
>
> >
> >
> > > +     tmp = mmap(NULL, (size_t)ro_size, PROT_READ, MAP_SHARED, map_fd,
> > > +                rb->page_size);
>
> should we split this mmap into two mmaps -- one for the producer_pos page,
> another for the data area? That would presumably allow mmaping a ringbuf
> with max_entries = 1GB?
>
> > >       if (tmp == MAP_FAILED) {
> > >               err = -errno;
> > >               ringbuf_unmap_ring(rb, r);
> > > --
> > > 2.29.2
> >
Hou Tao Nov. 12, 2022, 3:34 a.m. UTC | #4
Hi,

On 11/12/2022 4:56 AM, Andrii Nakryiko wrote:
> On Fri, Nov 11, 2022 at 9:54 AM <sdf@google.com> wrote:
>> On 11/11, Hou Tao wrote:
>>> From: Hou Tao <houtao1@huawei.com>
>>> The maximum size of a ringbuf is 2GB on an x86-64 host, so 2 * max_entries
>>> will overflow u32 when mapping the producer page and data pages. Casting
>>> max_entries to size_t alone is not enough, because for a 32-bit
>>> application on a 64-bit kernel the size of the read-only mmap region
>>> could also overflow size_t.
>>> Fixes: bf99c936f947 ("libbpf: Add BPF ring buffer support")
>>> Signed-off-by: Hou Tao <houtao1@huawei.com>
>>> ---
>>>   tools/lib/bpf/ringbuf.c | 11 +++++++++--
>>>   1 file changed, 9 insertions(+), 2 deletions(-)
>>> diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
>>> index d285171d4b69..c4bdc88af672 100644
>>> --- a/tools/lib/bpf/ringbuf.c
>>> +++ b/tools/lib/bpf/ringbuf.c
>>> @@ -77,6 +77,7 @@ int ring_buffer__add(struct ring_buffer *rb, int map_fd,
>>>       __u32 len = sizeof(info);
>>>       struct epoll_event *e;
>>>       struct ring *r;
>>> +     __u64 ro_size;
> I found ro_size quite a confusing name, let's call it mmap_sz?
OK.
>
>>>       void *tmp;
>>>       int err;
>>> @@ -129,8 +130,14 @@ int ring_buffer__add(struct ring_buffer *rb, int map_fd,
>>>        * data size to allow simple reading of samples that wrap around the
>>>        * end of a ring buffer. See kernel implementation for details.
>>>        * */
>>> -     tmp = mmap(NULL, rb->page_size + 2 * info.max_entries, PROT_READ,
>>> -                MAP_SHARED, map_fd, rb->page_size);
>>> +     ro_size = rb->page_size + 2 * (__u64)info.max_entries;
>> [..]
>>
>>> +     if (ro_size != (__u64)(size_t)ro_size) {
>>> +             pr_warn("ringbuf: ring buffer size (%u) is too big\n",
>>> +                     info.max_entries);
>>> +             return libbpf_err(-E2BIG);
>>> +     }
>> Why do we need this check at all? IIUC, the problem is that the expression
>> "rb->page_size + 2 * info.max_entries" is evaluated as u32 and can
>> overflow. So why isn't doing just this part enough?
>>
>> size_t mmap_size = rb->page_size + 2 * (size_t)info.max_entries;
>> mmap(NULL, mmap_size, PROT_READ, MAP_SHARED, map_fd, ...);
>>
>> sizeof(size_t) should be 8, so no overflow is possible?
> not on 32-bit arches, presumably?
Yes. On a 32-bit kernel the total virtual address space for user space
and kernel space is 4GB, so when max_entries is 2GB the needed virtual
address space will be 2GB + 4GB, and the mapping of the ring buffer will
fail either in the kernel or in userspace. An extreme case is a 32-bit
userspace on a 64-bit kernel: mapping a 2GB ring buffer in the kernel is
OK, but the 4GB read-only region will overflow size_t in the 32-bit
userspace.
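
To see the guard in isolation, here is a minimal standalone sketch
(illustrative, not part of the patch): on a 32-bit userspace, where
sizeof(size_t) == 4, the round trip through size_t drops the high bits,
so the comparison fails and the oversized mmap is refused up front.

#include <stdio.h>
#include <stddef.h>
#include <linux/types.h>

static int fits_in_size_t(__u64 mmap_sz)
{
	return mmap_sz == (__u64)(size_t)mmap_sz;
}

int main(void)
{
	/* 2GB ring buffer: 4KB producer page + 2 * 2GB of data pages */
	__u64 mmap_sz = 4096 + 2 * (__u64)(1U << 31);

	/* prints 0 when built with -m32, 1 on a 64-bit userspace */
	printf("fits: %d\n", fits_in_size_t(mmap_sz));
	return 0;
}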
>


>
>
>>
>>> +     tmp = mmap(NULL, (size_t)ro_size, PROT_READ, MAP_SHARED, map_fd,
>>> +                rb->page_size);
> should we split this mmap into two mmaps -- one for producer_pos page,
> another for data area. That will presumably allow to mmap ringbuf with
> max_entries = 1GB?
I don't understand the reason for the splitting. Even without it, a
ring buffer with max_entries = 1GB will in theory be OK on a 32-bit
kernel, although in practice mapping a 1GB ring buffer on a 32-bit
kernel will fail because the most common size of the kernel virtual
address space is 1GB (although ARM can use VMSPLIT_1G to increase the
kernel virtual address space to 3GB).
>
>>>       if (tmp == MAP_FAILED) {
>>>               err = -errno;
>>>               ringbuf_unmap_ring(rb, r);
>>> --
>>> 2.29.2
Andrii Nakryiko Nov. 14, 2022, 7:51 p.m. UTC | #5
On Fri, Nov 11, 2022 at 7:34 PM Hou Tao <houtao@huaweicloud.com> wrote:
>
> Hi,
>
> On 11/12/2022 4:56 AM, Andrii Nakryiko wrote:
> > On Fri, Nov 11, 2022 at 9:54 AM <sdf@google.com> wrote:
> >> On 11/11, Hou Tao wrote:
> >>> From: Hou Tao <houtao1@huawei.com>
> >>> The maximum size of a ringbuf is 2GB on an x86-64 host, so 2 * max_entries
> >>> will overflow u32 when mapping the producer page and data pages. Casting
> >>> max_entries to size_t alone is not enough, because for a 32-bit
> >>> application on a 64-bit kernel the size of the read-only mmap region
> >>> could also overflow size_t.
> >>> Fixes: bf99c936f947 ("libbpf: Add BPF ring buffer support")
> >>> Signed-off-by: Hou Tao <houtao1@huawei.com>
> >>> ---
> >>>   tools/lib/bpf/ringbuf.c | 11 +++++++++--
> >>>   1 file changed, 9 insertions(+), 2 deletions(-)
> >>> diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
> >>> index d285171d4b69..c4bdc88af672 100644
> >>> --- a/tools/lib/bpf/ringbuf.c
> >>> +++ b/tools/lib/bpf/ringbuf.c
> >>> @@ -77,6 +77,7 @@ int ring_buffer__add(struct ring_buffer *rb, int map_fd,
> >>>       __u32 len = sizeof(info);
> >>>       struct epoll_event *e;
> >>>       struct ring *r;
> >>> +     __u64 ro_size;
> > I found ro_size quite a confusing name, let's call it mmap_sz?
> OK.
> >
> >>>       void *tmp;
> >>>       int err;
> >>> @@ -129,8 +130,14 @@ int ring_buffer__add(struct ring_buffer *rb, int map_fd,
> >>>        * data size to allow simple reading of samples that wrap around the
> >>>        * end of a ring buffer. See kernel implementation for details.
> >>>        * */
> >>> -     tmp = mmap(NULL, rb->page_size + 2 * info.max_entries, PROT_READ,
> >>> -                MAP_SHARED, map_fd, rb->page_size);
> >>> +     ro_size = rb->page_size + 2 * (__u64)info.max_entries;
> >> [..]
> >>
> >>> +     if (ro_size != (__u64)(size_t)ro_size) {
> >>> +             pr_warn("ringbuf: ring buffer size (%u) is too big\n",
> >>> +                     info.max_entries);
> >>> +             return libbpf_err(-E2BIG);
> >>> +     }
> >> Why do we need this check at all? IIUC, the problem is that the expression
> >> "rb->page_size + 2 * info.max_entries" is evaluated as u32 and can
> >> overflow. So why isn't doing just this part enough?
> >>
> >> size_t mmap_size = rb->page_size + 2 * (size_t)info.max_entries;
> >> mmap(NULL, mmap_size, PROT_READ, MAP_SHARED, map_fd, ...);
> >>
> >> sizeof(size_t) should be 8, so no overflow is possible?
> > not on 32-bit arches, presumably?
> Yes. On a 32-bit kernel the total virtual address space for user space
> and kernel space is 4GB, so when max_entries is 2GB the needed virtual
> address space will be 2GB + 4GB, and the mapping of the ring buffer will
> fail either in the kernel or in userspace. An extreme case is a 32-bit
> userspace on a 64-bit kernel: mapping a 2GB ring buffer in the kernel is
> OK, but the 4GB read-only region will overflow size_t in the 32-bit
> userspace.
> >
>
>
> >
> >
> >>
> >>> +     tmp = mmap(NULL, (size_t)ro_size, PROT_READ, MAP_SHARED, map_fd,
> >>> +                rb->page_size);
> > should we split this mmap into two mmaps -- one for the producer_pos page,
> > another for the data area? That would presumably allow mmaping a ringbuf
> > with max_entries = 1GB?
> I don't understand the reason for the splitting. Even without it, a
> ring buffer with max_entries = 1GB will in theory be OK on a 32-bit
> kernel, although in practice mapping a 1GB ring buffer on a 32-bit
> kernel will fail because the most common size of the kernel virtual
> address space is 1GB (although ARM can use VMSPLIT_1G to increase the
> kernel virtual address space to 3GB).

Yep, never mind. size_t is unsigned, so it can express up to 4GB, so
2GB + 4KB is fine as is already (even though it will most probably
fail).

> >
> >>>       if (tmp == MAP_FAILED) {
> >>>               err = -errno;
> >>>               ringbuf_unmap_ring(rb, r);
> >>> --
> >>> 2.29.2
>

Patch

diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
index d285171d4b69..c4bdc88af672 100644
--- a/tools/lib/bpf/ringbuf.c
+++ b/tools/lib/bpf/ringbuf.c
@@ -77,6 +77,7 @@  int ring_buffer__add(struct ring_buffer *rb, int map_fd,
 	__u32 len = sizeof(info);
 	struct epoll_event *e;
 	struct ring *r;
+	__u64 ro_size;
 	void *tmp;
 	int err;
 
@@ -129,8 +130,14 @@  int ring_buffer__add(struct ring_buffer *rb, int map_fd,
 	 * data size to allow simple reading of samples that wrap around the
 	 * end of a ring buffer. See kernel implementation for details.
 	 * */
-	tmp = mmap(NULL, rb->page_size + 2 * info.max_entries, PROT_READ,
-		   MAP_SHARED, map_fd, rb->page_size);
+	ro_size = rb->page_size + 2 * (__u64)info.max_entries;
+	if (ro_size != (__u64)(size_t)ro_size) {
+		pr_warn("ringbuf: ring buffer size (%u) is too big\n",
+			info.max_entries);
+		return libbpf_err(-E2BIG);
+	}
+	tmp = mmap(NULL, (size_t)ro_size, PROT_READ, MAP_SHARED, map_fd,
+		   rb->page_size);
 	if (tmp == MAP_FAILED) {
 		err = -errno;
 		ringbuf_unmap_ring(rb, r);
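
With the fix in place, the failure becomes visible at setup time. A
minimal usage sketch (assumptions: map_fd refers to an already-created
BPF_MAP_TYPE_RINGBUF map -- obtaining it is out of scope here -- and the
callback is a stub):

#include <stdio.h>
#include <bpf/libbpf.h>

static int handle_sample(void *ctx, void *data, size_t size)
{
	return 0;	/* consume and discard */
}

int main(void)
{
	struct ring_buffer *rb;
	int map_fd = -1;	/* placeholder: use a real ringbuf map fd */

	rb = ring_buffer__new(map_fd, handle_sample, NULL, NULL);
	if (!rb) {
		/* in a 32-bit process with a huge max_entries this now
		 * fails with E2BIG instead of mmap()ing a wrapped length
		 */
		perror("ring_buffer__new");
		return 1;
	}
	ring_buffer__free(rb);
	return 0;
}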