diff mbox series

blktrace: put bounds on BLKTRACESETUP buf_size and buf_nr

Message ID 20200604054434.216698-1-harshadshirwadkar@gmail.com (mailing list archive)
State New, archived
Headers show
Series blktrace: put bounds on BLKTRACESETUP buf_size and buf_nr | expand

Commit Message

harshad shirwadkar June 4, 2020, 5:44 a.m. UTC
Make sure that user requested memory via BLKTRACESETUP is within
bounds. This can be easily exploited by setting really large values
for buf_size and buf_nr in BLKTRACESETUP ioctl.

blktrace program has following hardcoded values for bufsize and bufnr:
BUF_SIZE=(512 * 1024)
BUF_NR=(4)

We add buffer to this and define the upper bound to be as follows:
BUF_SIZE=(1024 * 1024)
BUF_NR=(16)

This is very easy to exploit. Setting buf_size / buf_nr in userspace
program to big values make kernel go oom.  Verified that the fix makes
BLKTRACESETUP return -E2BIG if the buf_size * buf_nr crosses the upper
bound.

Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
---
 include/uapi/linux/blktrace_api.h | 3 +++
 kernel/trace/blktrace.c           | 3 +++
 2 files changed, 6 insertions(+)

Comments

Chaitanya Kulkarni June 4, 2020, 7:10 a.m. UTC | #1
On 6/3/20 10:45 PM, Harshad Shirwadkar wrote:
> Make sure that user requested memory via BLKTRACESETUP is within
> bounds. This can be easily exploited by setting really large values
> for buf_size and buf_nr in BLKTRACESETUP ioctl.
> 
> blktrace program has following hardcoded values for bufsize and bufnr:
> BUF_SIZE=(512 * 1024)
> BUF_NR=(4)
> 
> We add buffer to this and define the upper bound to be as follows:
> BUF_SIZE=(1024 * 1024)
> BUF_NR=(16)
> 
Aren't these values should be system dependent to start with?
I wonder what are the side effects of having hard-coded values ?
> This is very easy to exploit. Setting buf_size / buf_nr in userspace
> program to big values make kernel go oom.  Verified that the fix makes
> BLKTRACESETUP return -E2BIG if the buf_size * buf_nr crosses the upper
> bound.
> 
> Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
> ---
>   include/uapi/linux/blktrace_api.h | 3 +++
>   kernel/trace/blktrace.c           | 3 +++
>   2 files changed, 6 insertions(+)
> 
> diff --git a/include/uapi/linux/blktrace_api.h b/include/uapi/linux/blktrace_api.h
> index 690621b610e5..4d9dc44a83f9 100644
> --- a/include/uapi/linux/blktrace_api.h
> +++ b/include/uapi/linux/blktrace_api.h
> @@ -129,6 +129,9 @@ enum {
>   };
>   
>   #define BLKTRACE_BDEV_SIZE	32
> +#define BLKTRACE_MAX_BUFSIZ	(1024 * 1024)
> +#define BLKTRACE_MAX_BUFNR	16
> +#define BLKTRACE_MAX_ALLOC	((BLKTRACE_MAX_BUFNR) * (BLKTRACE_MAX_BUFNR))
>   
This is an API change and should be reflected to kernel & userspace 
tools. One way of doing it is to remove BUF_SIZE and BUF_NR and sync
kernel header with userspace blktrace_api.h.

>   /*
>    * User setup structure passed with BLKTRACESETUP
> diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
> index ea47f2084087..b3b0a8164c05 100644
> --- a/kernel/trace/blktrace.c
> +++ b/kernel/trace/blktrace.c
> @@ -482,6 +482,9 @@ static int do_blk_trace_setup(struct request_queue *q, char *name, dev_t dev,
>   	if (!buts->buf_size || !buts->buf_nr)
>   		return -EINVAL;
>   
> +	if (buts->buf_size * buts->buf_nr > BLKTRACE_MAX_ALLOC)
> +		return -E2BIG;
> +

On the other hand we can easily get rid of the kernel part by moving 
this check into user space tools, this will minimize the change diff and 
we will still have sane behavior.

What about something like this (completely untested/not compiled) :-

diff --git a/blktrace.c b/blktrace.c
index d0d271f..d6a9f39 100644
--- a/blktrace.c
+++ b/blktrace.c
@@ -2247,6 +2247,9 @@ static int handle_args(int argc, char *argv[])
         if (net_mode == Net_client && net_setup_addr())
                 return 1;

+       /* Check for unsually large numbers for buffers */
+       if (buf_size * buf_nr > BLKTRACE_MAX_ALLOC)
+               return -E2BIG;
         /*
          * Set up for appropriate PFD handler based upon output name.
          */

>   	if (!blk_debugfs_root)
>   		return -ENOENT;
>   
>
harshad shirwadkar June 4, 2020, 3:26 p.m. UTC | #2
On Thu, Jun 4, 2020 at 12:10 AM Chaitanya Kulkarni
<Chaitanya.Kulkarni@wdc.com> wrote:
>
> On 6/3/20 10:45 PM, Harshad Shirwadkar wrote:
> > Make sure that user requested memory via BLKTRACESETUP is within
> > bounds. This can be easily exploited by setting really large values
> > for buf_size and buf_nr in BLKTRACESETUP ioctl.
> >
> > blktrace program has following hardcoded values for bufsize and bufnr:
> > BUF_SIZE=(512 * 1024)
> > BUF_NR=(4)
> >
> > We add buffer to this and define the upper bound to be as follows:
> > BUF_SIZE=(1024 * 1024)
> > BUF_NR=(16)
> >
> Aren't these values should be system dependent to start with?
> I wonder what are the side effects of having hard-coded values ?
Sure we can have system dependent bounds. But, the goal here is just
only to ensure that the caller doesn't call BLKTRACESETUP with
unreasonable values and potentially crash the kernel. Given that, do
we really need to have upper-bound to be system dependent? wouldn't a
hard-coded but reasonable upper-bound be sufficient? As of now
blktrace also uses hardcoded values.
> > This is very easy to exploit. Setting buf_size / buf_nr in userspace
> > program to big values make kernel go oom.  Verified that the fix makes
> > BLKTRACESETUP return -E2BIG if the buf_size * buf_nr crosses the upper
> > bound.
> >
> > Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
> > ---
> >   include/uapi/linux/blktrace_api.h | 3 +++
> >   kernel/trace/blktrace.c           | 3 +++
> >   2 files changed, 6 insertions(+)
> >
> > diff --git a/include/uapi/linux/blktrace_api.h b/include/uapi/linux/blktrace_api.h
> > index 690621b610e5..4d9dc44a83f9 100644
> > --- a/include/uapi/linux/blktrace_api.h
> > +++ b/include/uapi/linux/blktrace_api.h
> > @@ -129,6 +129,9 @@ enum {
> >   };
> >
> >   #define BLKTRACE_BDEV_SIZE  32
> > +#define BLKTRACE_MAX_BUFSIZ  (1024 * 1024)
> > +#define BLKTRACE_MAX_BUFNR   16
> > +#define BLKTRACE_MAX_ALLOC   ((BLKTRACE_MAX_BUFNR) * (BLKTRACE_MAX_BUFNR))
> >
> This is an API change and should be reflected to kernel & userspace
> tools. One way of doing it is to remove BUF_SIZE and BUF_NR and sync
> kernel header with userspace blktrace_api.h.
Thanks, so we should make a change in userspace header file too.
> >   /*
> >    * User setup structure passed with BLKTRACESETUP
> > diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
> > index ea47f2084087..b3b0a8164c05 100644
> > --- a/kernel/trace/blktrace.c
> > +++ b/kernel/trace/blktrace.c
> > @@ -482,6 +482,9 @@ static int do_blk_trace_setup(struct request_queue *q, char *name, dev_t dev,
> >       if (!buts->buf_size || !buts->buf_nr)
> >               return -EINVAL;
> >
> > +     if (buts->buf_size * buts->buf_nr > BLKTRACE_MAX_ALLOC)
> > +             return -E2BIG;
> > +
>
> On the other hand we can easily get rid of the kernel part by moving
> this check into user space tools, this will minimize the change diff and
> we will still have sane behavior.
>
> What about something like this (completely untested/not compiled) :-
>
> diff --git a/blktrace.c b/blktrace.c
> index d0d271f..d6a9f39 100644
> --- a/blktrace.c
> +++ b/blktrace.c
> @@ -2247,6 +2247,9 @@ static int handle_args(int argc, char *argv[])
>          if (net_mode == Net_client && net_setup_addr())
>                  return 1;
>
> +       /* Check for unsually large numbers for buffers */
> +       if (buf_size * buf_nr > BLKTRACE_MAX_ALLOC)
> +               return -E2BIG;
>          /*
>           * Set up for appropriate PFD handler based upon output name.
>           */
>
> >       if (!blk_debugfs_root)
> >               return -ENOENT;
> >
> >
The user-space fix would fix blktrace binary but it's still easy to
exploit this by writing a really quick program that calls
BLKTRACESETUP ioctl with unreasonably high values for buf_size and
buf_nr.

Thanks,
Harshad
>
Bart Van Assche June 5, 2020, 4:31 a.m. UTC | #3
On 2020-06-03 22:44, Harshad Shirwadkar wrote:
> Make sure that user requested memory via BLKTRACESETUP is within
> bounds. This can be easily exploited by setting really large values
> for buf_size and buf_nr in BLKTRACESETUP ioctl.
> 
> blktrace program has following hardcoded values for bufsize and bufnr:
> BUF_SIZE=(512 * 1024)
> BUF_NR=(4)

Where is the code that imposes these limits? I haven't found it. Did I
perhaps overlook something?

> diff --git a/include/uapi/linux/blktrace_api.h b/include/uapi/linux/blktrace_api.h
> index 690621b610e5..4d9dc44a83f9 100644
> --- a/include/uapi/linux/blktrace_api.h
> +++ b/include/uapi/linux/blktrace_api.h
> @@ -129,6 +129,9 @@ enum {
>  };
>  
>  #define BLKTRACE_BDEV_SIZE	32
> +#define BLKTRACE_MAX_BUFSIZ	(1024 * 1024)
> +#define BLKTRACE_MAX_BUFNR	16
> +#define BLKTRACE_MAX_ALLOC	((BLKTRACE_MAX_BUFNR) * (BLKTRACE_MAX_BUFNR))

Adding these constants to the user space API is a very inflexible
approach. There must be better approaches to export constants like these
to user space, e.g. through sysfs attributes.

It probably will be easier to get a patch like this one upstream if the
uapi headers are not modified.

>  /*
>   * User setup structure passed with BLKTRACESETUP
> diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
> index ea47f2084087..b3b0a8164c05 100644
> --- a/kernel/trace/blktrace.c
> +++ b/kernel/trace/blktrace.c
> @@ -482,6 +482,9 @@ static int do_blk_trace_setup(struct request_queue *q, char *name, dev_t dev,
>  	if (!buts->buf_size || !buts->buf_nr)
>  		return -EINVAL;
>  
> +	if (buts->buf_size * buts->buf_nr > BLKTRACE_MAX_ALLOC)
> +		return -E2BIG;
> +
>  	if (!blk_debugfs_root)
>  		return -ENOENT;

Where do the overflow(s) happen if buts->buf_size and buts->buf_nr are
large? Is the following code from relay_open() the only code that can
overflow?

	chan->alloc_size = PAGE_ALIGN(subbuf_size * n_subbufs);

Thanks,

Bart.
harshad shirwadkar June 5, 2020, 5:02 a.m. UTC | #4
Hi Bart,

On Thu, Jun 4, 2020 at 9:31 PM Bart Van Assche <bvanassche@acm.org> wrote:
>
> On 2020-06-03 22:44, Harshad Shirwadkar wrote:
> > Make sure that user requested memory via BLKTRACESETUP is within
> > bounds. This can be easily exploited by setting really large values
> > for buf_size and buf_nr in BLKTRACESETUP ioctl.
> >
> > blktrace program has following hardcoded values for bufsize and bufnr:
> > BUF_SIZE=(512 * 1024)
> > BUF_NR=(4)
>
> Where is the code that imposes these limits? I haven't found it. Did I
> perhaps overlook something?

From what I understand, these are not limits. These are hardcoded
values used by blktrace userspace when it calls BLKTRACESETUP ioctl.
Inside the kernel, before this patch there is no checking in the linux
kernel.

>
> > diff --git a/include/uapi/linux/blktrace_api.h b/include/uapi/linux/blktrace_api.h
> > index 690621b610e5..4d9dc44a83f9 100644
> > --- a/include/uapi/linux/blktrace_api.h
> > +++ b/include/uapi/linux/blktrace_api.h
> > @@ -129,6 +129,9 @@ enum {
> >  };
> >
> >  #define BLKTRACE_BDEV_SIZE   32
> > +#define BLKTRACE_MAX_BUFSIZ  (1024 * 1024)
> > +#define BLKTRACE_MAX_BUFNR   16
> > +#define BLKTRACE_MAX_ALLOC   ((BLKTRACE_MAX_BUFNR) * (BLKTRACE_MAX_BUFNR))
>
> Adding these constants to the user space API is a very inflexible
> approach. There must be better approaches to export constants like these
> to user space, e.g. through sysfs attributes.
>
> It probably will be easier to get a patch like this one upstream if the
> uapi headers are not modified.

Thanks, makes sense. I'll do this in the V2. If we do that, we still
need to define the default values for these limits. Do the values
chosen for MAX_BUFSIZE and MAX_ALLOC look reasonable?

>
> >  /*
> >   * User setup structure passed with BLKTRACESETUP
> > diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
> > index ea47f2084087..b3b0a8164c05 100644
> > --- a/kernel/trace/blktrace.c
> > +++ b/kernel/trace/blktrace.c
> > @@ -482,6 +482,9 @@ static int do_blk_trace_setup(struct request_queue *q, char *name, dev_t dev,
> >       if (!buts->buf_size || !buts->buf_nr)
> >               return -EINVAL;
> >
> > +     if (buts->buf_size * buts->buf_nr > BLKTRACE_MAX_ALLOC)
> > +             return -E2BIG;
> > +
> >       if (!blk_debugfs_root)
> >               return -ENOENT;
>
> Where do the overflow(s) happen if buts->buf_size and buts->buf_nr are
> large? Is the following code from relay_open() the only code that can
> overflow?
>
>         chan->alloc_size = PAGE_ALIGN(subbuf_size * n_subbufs);

Good catch, I'm not sure if the overflows are checked for, but I maybe
wrong. Given the possibility of overflows, perhaps we should be
checking for individual upper bounds instead of the product?

Thanks,
Harshad

>
> Thanks,
>
> Bart.
>
Bart Van Assche June 5, 2020, 1:43 p.m. UTC | #5
On 2020-06-04 22:02, harshad shirwadkar wrote:
> Hi Bart,
> 
> On Thu, Jun 4, 2020 at 9:31 PM Bart Van Assche <bvanassche@acm.org> wrote:
>>
>> On 2020-06-03 22:44, Harshad Shirwadkar wrote:
>>> Make sure that user requested memory via BLKTRACESETUP is within
>>> bounds. This can be easily exploited by setting really large values
>>> for buf_size and buf_nr in BLKTRACESETUP ioctl.
>>>
>>> blktrace program has following hardcoded values for bufsize and bufnr:
>>> BUF_SIZE=(512 * 1024)
>>> BUF_NR=(4)
>>
>> Where is the code that imposes these limits? I haven't found it. Did I
>> perhaps overlook something?
> 
> From what I understand, these are not limits. These are hardcoded
> values used by blktrace userspace when it calls BLKTRACESETUP ioctl.
> Inside the kernel, before this patch there is no checking in the linux
> kernel.

Please do not move limits from userspace tools into the kernel. Doing so
would break other user space software (not sure if there is any) that
uses the same ioctl but different limits.

>>> diff --git a/include/uapi/linux/blktrace_api.h b/include/uapi/linux/blktrace_api.h
>>> index 690621b610e5..4d9dc44a83f9 100644
>>> --- a/include/uapi/linux/blktrace_api.h
>>> +++ b/include/uapi/linux/blktrace_api.h
>>> @@ -129,6 +129,9 @@ enum {
>>>  };
>>>
>>>  #define BLKTRACE_BDEV_SIZE   32
>>> +#define BLKTRACE_MAX_BUFSIZ  (1024 * 1024)
>>> +#define BLKTRACE_MAX_BUFNR   16
>>> +#define BLKTRACE_MAX_ALLOC   ((BLKTRACE_MAX_BUFNR) * (BLKTRACE_MAX_BUFNR))
>>
>> Adding these constants to the user space API is a very inflexible
>> approach. There must be better approaches to export constants like these
>> to user space, e.g. through sysfs attributes.
>>
>> It probably will be easier to get a patch like this one upstream if the
>> uapi headers are not modified.
> 
> Thanks, makes sense. I'll do this in the V2. If we do that, we still
> need to define the default values for these limits. Do the values
> chosen for MAX_BUFSIZE and MAX_ALLOC look reasonable?

We typically do not implement arbitrary limits in the kernel. So I'd
prefer not to introduce any artificial limits.

>>
>>>  /*
>>>   * User setup structure passed with BLKTRACESETUP
>>> diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
>>> index ea47f2084087..b3b0a8164c05 100644
>>> --- a/kernel/trace/blktrace.c
>>> +++ b/kernel/trace/blktrace.c
>>> @@ -482,6 +482,9 @@ static int do_blk_trace_setup(struct request_queue *q, char *name, dev_t dev,
>>>       if (!buts->buf_size || !buts->buf_nr)
>>>               return -EINVAL;
>>>
>>> +     if (buts->buf_size * buts->buf_nr > BLKTRACE_MAX_ALLOC)
>>> +             return -E2BIG;
>>> +
>>>       if (!blk_debugfs_root)
>>>               return -ENOENT;
>>
>> Where do the overflow(s) happen if buts->buf_size and buts->buf_nr are
>> large? Is the following code from relay_open() the only code that can
>> overflow?
>>
>>         chan->alloc_size = PAGE_ALIGN(subbuf_size * n_subbufs);
> 
> Good catch, I'm not sure if the overflows are checked for, but I maybe
> wrong. Given the possibility of overflows, perhaps we should be
> checking for individual upper bounds instead of the product?

How about using check_mul_overflow() to catch overflows of this
multiplication? There may be other statements that need protection
against overflow.

Thanks,

Bart.
harshad shirwadkar June 5, 2020, 5:39 p.m. UTC | #6
On Fri, Jun 5, 2020 at 6:43 AM Bart Van Assche <bvanassche@acm.org> wrote:
>
> On 2020-06-04 22:02, harshad shirwadkar wrote:
> > Hi Bart,
> >
> > On Thu, Jun 4, 2020 at 9:31 PM Bart Van Assche <bvanassche@acm.org> wrote:
> >>
> >> On 2020-06-03 22:44, Harshad Shirwadkar wrote:
> >>> Make sure that user requested memory via BLKTRACESETUP is within
> >>> bounds. This can be easily exploited by setting really large values
> >>> for buf_size and buf_nr in BLKTRACESETUP ioctl.
> >>>
> >>> blktrace program has following hardcoded values for bufsize and bufnr:
> >>> BUF_SIZE=(512 * 1024)
> >>> BUF_NR=(4)
> >>
> >> Where is the code that imposes these limits? I haven't found it. Did I
> >> perhaps overlook something?
> >
> > From what I understand, these are not limits. These are hardcoded
> > values used by blktrace userspace when it calls BLKTRACESETUP ioctl.
> > Inside the kernel, before this patch there is no checking in the linux
> > kernel.
>
> Please do not move limits from userspace tools into the kernel. Doing so
> would break other user space software (not sure if there is any) that
> uses the same ioctl but different limits.
Ack.
>
> >>> diff --git a/include/uapi/linux/blktrace_api.h b/include/uapi/linux/blktrace_api.h
> >>> index 690621b610e5..4d9dc44a83f9 100644
> >>> --- a/include/uapi/linux/blktrace_api.h
> >>> +++ b/include/uapi/linux/blktrace_api.h
> >>> @@ -129,6 +129,9 @@ enum {
> >>>  };
> >>>
> >>>  #define BLKTRACE_BDEV_SIZE   32
> >>> +#define BLKTRACE_MAX_BUFSIZ  (1024 * 1024)
> >>> +#define BLKTRACE_MAX_BUFNR   16
> >>> +#define BLKTRACE_MAX_ALLOC   ((BLKTRACE_MAX_BUFNR) * (BLKTRACE_MAX_BUFNR))
> >>
> >> Adding these constants to the user space API is a very inflexible
> >> approach. There must be better approaches to export constants like these
> >> to user space, e.g. through sysfs attributes.
> >>
> >> It probably will be easier to get a patch like this one upstream if the
> >> uapi headers are not modified.
> >
> > Thanks, makes sense. I'll do this in the V2. If we do that, we still
> > need to define the default values for these limits. Do the values
> > chosen for MAX_BUFSIZE and MAX_ALLOC look reasonable?
>
> We typically do not implement arbitrary limits in the kernel. So I'd
> prefer not to introduce any artificial limits.
I see. But my worry is that if we don't check for bounds in the kernel
in this case, a non-root user who has access to any block device (even
a simple loop device) can make the kernel go out of memory.
---
...
int main(int argc, char *argv[])
{
        struct blk_user_trace_setup buts;
        int fd;
        int ret;

        fd = open(argv[1], O_RDONLY|O_NONBLOCK);

        memset(&buts, 0, sizeof(buts));
        buts.buf_size = ~0;
        buts.buf_nr = 1;
        buts.act_mask = 65535;
        return ioctl(fd, BLKTRACESETUP, &buts);
}
---

I understand that introducing artificial hard-coded limits is probably
not the best approach. However, not having a reasonable sanity
checking on the IOCTL arguments, especially when the IOCTL itself
doesn't require any special capability checks and the kernel uses
those arguments almost as-is for allocating memory seems like a
vulnerability that attackers can exploit.

>
> >>
> >>>  /*
> >>>   * User setup structure passed with BLKTRACESETUP
> >>> diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
> >>> index ea47f2084087..b3b0a8164c05 100644
> >>> --- a/kernel/trace/blktrace.c
> >>> +++ b/kernel/trace/blktrace.c
> >>> @@ -482,6 +482,9 @@ static int do_blk_trace_setup(struct request_queue *q, char *name, dev_t dev,
> >>>       if (!buts->buf_size || !buts->buf_nr)
> >>>               return -EINVAL;
> >>>
> >>> +     if (buts->buf_size * buts->buf_nr > BLKTRACE_MAX_ALLOC)
> >>> +             return -E2BIG;
> >>> +
> >>>       if (!blk_debugfs_root)
> >>>               return -ENOENT;
> >>
> >> Where do the overflow(s) happen if buts->buf_size and buts->buf_nr are
> >> large? Is the following code from relay_open() the only code that can
> >> overflow?
> >>
> >>         chan->alloc_size = PAGE_ALIGN(subbuf_size * n_subbufs);
> >
> > Good catch, I'm not sure if the overflows are checked for, but I maybe
> > wrong. Given the possibility of overflows, perhaps we should be
> > checking for individual upper bounds instead of the product?
>
> How about using check_mul_overflow() to catch overflows of this
> multiplication? There may be other statements that need protection
> against overflow.

Makes sense. I can add that check.

Thanks,
Harshad.

>
> Thanks,
>
> Bart.
Chaitanya Kulkarni June 8, 2020, 6:34 a.m. UTC | #7
Bart,

On 6/4/20 9:31 PM, Bart Van Assche wrote:
> Where is the code that imposes these limits? I haven't found it. Did I
> perhaps overlook something?It is in blktrace.c user-space tool.

   55
   56 /*
   57  * You may want to increase this even more, if you are logging at 
a high
   58  * rate and see skipped/missed events
   59  */
   60 #define BUF_SIZE                (512 * 1024)
   61 #define BUF_NR                  (4)
   62
   63 #define FILE_VBUF_SIZE          (128 * 1024)
   64
   65 #define DEBUGFS_TYPE            (0x64626720)
   66 #define TRACE_NET_PORT          (8462)
   67
Chaitanya Kulkarni June 8, 2020, 6:40 a.m. UTC | #8
Bart,
On 6/5/20 6:43 AM, Bart Van Assche wrote:
> We typically do not implement arbitrary limits in the kernel. So I'd
> prefer not to introduce any artificial limits.

That is what I mentioned in [1] that we can add a check suggested in 
[1]. That way we will not enforce any limits in the kernel and keep
the backward compatibility.

Do you see any problem with the approach suggested in [1].

[1] https://www.spinics.net/lists/linux-block/msg54754.html
Chaitanya Kulkarni June 8, 2020, 6:42 a.m. UTC | #9
Harshad,

On 6/5/20 10:40 AM, harshad shirwadkar wrote:
> I see. But my worry is that if we don't check for bounds in the kernel
> in this case, a non-root user who has access to any block device (even
> a simple loop device) can make the kernel go out of memory.
> ---
> ...
> int main(int argc, char *argv[])
> {
>          struct blk_user_trace_setup buts;
>          int fd;
>          int ret;
> 
>          fd = open(argv[1], O_RDONLY|O_NONBLOCK);
> 
>          memset(&buts, 0, sizeof(buts));
>          buts.buf_size = ~0;
>          buts.buf_nr = 1;
>          buts.act_mask = 65535;
>          return ioctl(fd, BLKTRACESETUP, &buts);
> }
> ---
> 
> I understand that introducing artificial hard-coded limits is probably
> not the best approach. However, not having a reasonable sanity
> checking on the IOCTL arguments, especially when the IOCTL itself
> doesn't require any special capability checks and the kernel uses
> those arguments almost as-is for allocating memory seems like a
> vulnerability that attackers can exploit.

Yes this is a problem and needs to be fixed with keeping the backward
compatibility as per pointed out by Bart.

I'm not aware of the applications (apart from blktrace-blkparse 
userspace tools) which implement block trace userspace interface,
can you please point me to the application ?

If so those needs to be fixed also than imposing hard limits in kernel.
Bart Van Assche June 8, 2020, 2:13 p.m. UTC | #10
On 2020-06-07 23:34, Chaitanya Kulkarni wrote:
> Bart,
> 
> On 6/4/20 9:31 PM, Bart Van Assche wrote:
>> Where is the code that imposes these limits? I haven't found it. Did I
>> perhaps overlook something?
>
> It is in blktrace.c user-space tool.
> 
>    55
>    56 /*
>    57  * You may want to increase this even more, if you are logging at 
> a high
>    58  * rate and see skipped/missed events
>    59  */
>    60 #define BUF_SIZE                (512 * 1024)
>    61 #define BUF_NR                  (4)
>    62
>    63 #define FILE_VBUF_SIZE          (128 * 1024)
>    64
>    65 #define DEBUGFS_TYPE            (0x64626720)
>    66 #define TRACE_NET_PORT          (8462)
>    67

Hi Chaitanya,

With my question I was referring to kernel code. I think Harshad
understood that part of my question.

Thanks,

Bart.
Bart Van Assche June 8, 2020, 2:20 p.m. UTC | #11
On 2020-06-07 23:40, Chaitanya Kulkarni wrote:
> Bart,
> On 6/5/20 6:43 AM, Bart Van Assche wrote:
>> We typically do not implement arbitrary limits in the kernel. So I'd
>> prefer not to introduce any artificial limits.
> 
> That is what I mentioned in [1] that we can add a check suggested in 
> [1]. That way we will not enforce any limits in the kernel and keep
> the backward compatibility.
> 
> Do you see any problem with the approach suggested in [1].
> 
> [1] https://www.spinics.net/lists/linux-block/msg54754.html

Please take another look at Harshad's patch description. My
understanding is that Harshad wants to protect the kernel against
malicious user space software. Modifying the user space blktrace
software as proposed in [1] doesn't help at all towards the goal of
hardening the kernel.

Thanks,

Bart.
Chaitanya Kulkarni June 8, 2020, 9:59 p.m. UTC | #12
Bart,
On 6/8/20 7:20 AM, Bart Van Assche wrote:
> On 2020-06-07 23:40, Chaitanya Kulkarni wrote:
>> Bart,
>> On 6/5/20 6:43 AM, Bart Van Assche wrote:
>>> We typically do not implement arbitrary limits in the kernel. So I'd
>>> prefer not to introduce any artificial limits.
>> That is what I mentioned in [1] that we can add a check suggested in
>> [1]. That way we will not enforce any limits in the kernel and keep
>> the backward compatibility.
>>
>> Do you see any problem with the approach suggested in [1].
>>
>> [1]https://www.spinics.net/lists/linux-block/msg54754.html
> Please take another look at Harshad's patch description. My
> understanding is that Harshad wants to protect the kernel against
> malicious user space software. Modifying the user space blktrace
> software as proposed in [1] doesn't help at all towards the goal of
> hardening the kernel.
> 
> Thanks,
> 
> Bart.
> 

Hmmm, I agree that we need fix for that. What I did't understand that 
why we don't need userspace fix ?

Also, what is a right way to impose these limits without having any 
bounds in kernel ?

Either I did not understand your comment(s) or I'm confuse.

Can you please elaborate ?
harshad shirwadkar June 8, 2020, 11:40 p.m. UTC | #13
Given that this is a kernel bug and can be exploited by any user-space
application, the fix in the kernel is a must have. But we don't want
to break any existing user-space applications. So, we do need to go
make sure that this patch doesn't break any existing applications. I
originally sent this patch assuming that blktrace is the only user of
this IOCTL. So, if anyone knows any other callers that we need to
investigate, please let me know. I have verified that these limits
don't break blktrace.

On Mon, Jun 8, 2020 at 2:59 PM Chaitanya Kulkarni
<Chaitanya.Kulkarni@wdc.com> wrote:
>
> Bart,
> On 6/8/20 7:20 AM, Bart Van Assche wrote:
> > On 2020-06-07 23:40, Chaitanya Kulkarni wrote:
> >> Bart,
> >> On 6/5/20 6:43 AM, Bart Van Assche wrote:
> >>> We typically do not implement arbitrary limits in the kernel. So I'd
> >>> prefer not to introduce any artificial limits.
> >> That is what I mentioned in [1] that we can add a check suggested in
> >> [1]. That way we will not enforce any limits in the kernel and keep
> >> the backward compatibility.
> >>
> >> Do you see any problem with the approach suggested in [1].
> >>
> >> [1]https://www.spinics.net/lists/linux-block/msg54754.html
> > Please take another look at Harshad's patch description. My
> > understanding is that Harshad wants to protect the kernel against
> > malicious user space software. Modifying the user space blktrace
> > software as proposed in [1] doesn't help at all towards the goal of
> > hardening the kernel.
> >
> > Thanks,
> >
> > Bart.
> >
>
> Hmmm, I agree that we need fix for that. What I did't understand that
> why we don't need userspace fix ?
>
> Also, what is a right way to impose these limits without having any
> bounds in kernel ?
From what I understand, there's no alternative to having a fix in the
kernel. That's because, if the kernel is not fixed and only the
commonly used user-space apps are fixed, I can always write a new
program to break the kernel. So, as mentioned above, maybe we can make
these limits configurable via sysfs but we'll need these bound checks
in the kernel.

Thanks,
Harshad
>
> Either I did not understand your comment(s) or I'm confuse.
>
> Can you please elaborate ?
Chaitanya Kulkarni June 9, 2020, midnight UTC | #14
On 6/8/20 4:40 PM, harshad shirwadkar wrote:
>  From what I understand, there's no alternative to having a fix in the
> kernel. That's because, if the kernel is not fixed and only the
> commonly used user-space apps are fixed, I can always write a new
> program to break the kernel. So, as mentioned above, maybe we can make
> these limits configurable via sysfs but we'll need these bound checks
> in the kernel.

Okay, thanks for the explanation.
diff mbox series

Patch

diff --git a/include/uapi/linux/blktrace_api.h b/include/uapi/linux/blktrace_api.h
index 690621b610e5..4d9dc44a83f9 100644
--- a/include/uapi/linux/blktrace_api.h
+++ b/include/uapi/linux/blktrace_api.h
@@ -129,6 +129,9 @@  enum {
 };
 
 #define BLKTRACE_BDEV_SIZE	32
+#define BLKTRACE_MAX_BUFSIZ	(1024 * 1024)
+#define BLKTRACE_MAX_BUFNR	16
+#define BLKTRACE_MAX_ALLOC	((BLKTRACE_MAX_BUFNR) * (BLKTRACE_MAX_BUFNR))
 
 /*
  * User setup structure passed with BLKTRACESETUP
diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
index ea47f2084087..b3b0a8164c05 100644
--- a/kernel/trace/blktrace.c
+++ b/kernel/trace/blktrace.c
@@ -482,6 +482,9 @@  static int do_blk_trace_setup(struct request_queue *q, char *name, dev_t dev,
 	if (!buts->buf_size || !buts->buf_nr)
 		return -EINVAL;
 
+	if (buts->buf_size * buts->buf_nr > BLKTRACE_MAX_ALLOC)
+		return -E2BIG;
+
 	if (!blk_debugfs_root)
 		return -ENOENT;