Message ID | 20200604054434.216698-1-harshadshirwadkar@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | blktrace: put bounds on BLKTRACESETUP buf_size and buf_nr | expand |
On 6/3/20 10:45 PM, Harshad Shirwadkar wrote: > Make sure that user requested memory via BLKTRACESETUP is within > bounds. This can be easily exploited by setting really large values > for buf_size and buf_nr in BLKTRACESETUP ioctl. > > blktrace program has following hardcoded values for bufsize and bufnr: > BUF_SIZE=(512 * 1024) > BUF_NR=(4) > > We add buffer to this and define the upper bound to be as follows: > BUF_SIZE=(1024 * 1024) > BUF_NR=(16) > Aren't these values should be system dependent to start with? I wonder what are the side effects of having hard-coded values ? > This is very easy to exploit. Setting buf_size / buf_nr in userspace > program to big values make kernel go oom. Verified that the fix makes > BLKTRACESETUP return -E2BIG if the buf_size * buf_nr crosses the upper > bound. > > Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> > --- > include/uapi/linux/blktrace_api.h | 3 +++ > kernel/trace/blktrace.c | 3 +++ > 2 files changed, 6 insertions(+) > > diff --git a/include/uapi/linux/blktrace_api.h b/include/uapi/linux/blktrace_api.h > index 690621b610e5..4d9dc44a83f9 100644 > --- a/include/uapi/linux/blktrace_api.h > +++ b/include/uapi/linux/blktrace_api.h > @@ -129,6 +129,9 @@ enum { > }; > > #define BLKTRACE_BDEV_SIZE 32 > +#define BLKTRACE_MAX_BUFSIZ (1024 * 1024) > +#define BLKTRACE_MAX_BUFNR 16 > +#define BLKTRACE_MAX_ALLOC ((BLKTRACE_MAX_BUFNR) * (BLKTRACE_MAX_BUFNR)) > This is an API change and should be reflected to kernel & userspace tools. One way of doing it is to remove BUF_SIZE and BUF_NR and sync kernel header with userspace blktrace_api.h. > /* > * User setup structure passed with BLKTRACESETUP > diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c > index ea47f2084087..b3b0a8164c05 100644 > --- a/kernel/trace/blktrace.c > +++ b/kernel/trace/blktrace.c > @@ -482,6 +482,9 @@ static int do_blk_trace_setup(struct request_queue *q, char *name, dev_t dev, > if (!buts->buf_size || !buts->buf_nr) > return -EINVAL; > > + if (buts->buf_size * buts->buf_nr > BLKTRACE_MAX_ALLOC) > + return -E2BIG; > + On the other hand we can easily get rid of the kernel part by moving this check into user space tools, this will minimize the change diff and we will still have sane behavior. What about something like this (completely untested/not compiled) :- diff --git a/blktrace.c b/blktrace.c index d0d271f..d6a9f39 100644 --- a/blktrace.c +++ b/blktrace.c @@ -2247,6 +2247,9 @@ static int handle_args(int argc, char *argv[]) if (net_mode == Net_client && net_setup_addr()) return 1; + /* Check for unsually large numbers for buffers */ + if (buf_size * buf_nr > BLKTRACE_MAX_ALLOC) + return -E2BIG; /* * Set up for appropriate PFD handler based upon output name. */ > if (!blk_debugfs_root) > return -ENOENT; > >
On Thu, Jun 4, 2020 at 12:10 AM Chaitanya Kulkarni <Chaitanya.Kulkarni@wdc.com> wrote: > > On 6/3/20 10:45 PM, Harshad Shirwadkar wrote: > > Make sure that user requested memory via BLKTRACESETUP is within > > bounds. This can be easily exploited by setting really large values > > for buf_size and buf_nr in BLKTRACESETUP ioctl. > > > > blktrace program has following hardcoded values for bufsize and bufnr: > > BUF_SIZE=(512 * 1024) > > BUF_NR=(4) > > > > We add buffer to this and define the upper bound to be as follows: > > BUF_SIZE=(1024 * 1024) > > BUF_NR=(16) > > > Aren't these values should be system dependent to start with? > I wonder what are the side effects of having hard-coded values ? Sure we can have system dependent bounds. But, the goal here is just only to ensure that the caller doesn't call BLKTRACESETUP with unreasonable values and potentially crash the kernel. Given that, do we really need to have upper-bound to be system dependent? wouldn't a hard-coded but reasonable upper-bound be sufficient? As of now blktrace also uses hardcoded values. > > This is very easy to exploit. Setting buf_size / buf_nr in userspace > > program to big values make kernel go oom. Verified that the fix makes > > BLKTRACESETUP return -E2BIG if the buf_size * buf_nr crosses the upper > > bound. > > > > Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> > > --- > > include/uapi/linux/blktrace_api.h | 3 +++ > > kernel/trace/blktrace.c | 3 +++ > > 2 files changed, 6 insertions(+) > > > > diff --git a/include/uapi/linux/blktrace_api.h b/include/uapi/linux/blktrace_api.h > > index 690621b610e5..4d9dc44a83f9 100644 > > --- a/include/uapi/linux/blktrace_api.h > > +++ b/include/uapi/linux/blktrace_api.h > > @@ -129,6 +129,9 @@ enum { > > }; > > > > #define BLKTRACE_BDEV_SIZE 32 > > +#define BLKTRACE_MAX_BUFSIZ (1024 * 1024) > > +#define BLKTRACE_MAX_BUFNR 16 > > +#define BLKTRACE_MAX_ALLOC ((BLKTRACE_MAX_BUFNR) * (BLKTRACE_MAX_BUFNR)) > > > This is an API change and should be reflected to kernel & userspace > tools. One way of doing it is to remove BUF_SIZE and BUF_NR and sync > kernel header with userspace blktrace_api.h. Thanks, so we should make a change in userspace header file too. > > /* > > * User setup structure passed with BLKTRACESETUP > > diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c > > index ea47f2084087..b3b0a8164c05 100644 > > --- a/kernel/trace/blktrace.c > > +++ b/kernel/trace/blktrace.c > > @@ -482,6 +482,9 @@ static int do_blk_trace_setup(struct request_queue *q, char *name, dev_t dev, > > if (!buts->buf_size || !buts->buf_nr) > > return -EINVAL; > > > > + if (buts->buf_size * buts->buf_nr > BLKTRACE_MAX_ALLOC) > > + return -E2BIG; > > + > > On the other hand we can easily get rid of the kernel part by moving > this check into user space tools, this will minimize the change diff and > we will still have sane behavior. > > What about something like this (completely untested/not compiled) :- > > diff --git a/blktrace.c b/blktrace.c > index d0d271f..d6a9f39 100644 > --- a/blktrace.c > +++ b/blktrace.c > @@ -2247,6 +2247,9 @@ static int handle_args(int argc, char *argv[]) > if (net_mode == Net_client && net_setup_addr()) > return 1; > > + /* Check for unsually large numbers for buffers */ > + if (buf_size * buf_nr > BLKTRACE_MAX_ALLOC) > + return -E2BIG; > /* > * Set up for appropriate PFD handler based upon output name. > */ > > > if (!blk_debugfs_root) > > return -ENOENT; > > > > The user-space fix would fix blktrace binary but it's still easy to exploit this by writing a really quick program that calls BLKTRACESETUP ioctl with unreasonably high values for buf_size and buf_nr. Thanks, Harshad >
On 2020-06-03 22:44, Harshad Shirwadkar wrote: > Make sure that user requested memory via BLKTRACESETUP is within > bounds. This can be easily exploited by setting really large values > for buf_size and buf_nr in BLKTRACESETUP ioctl. > > blktrace program has following hardcoded values for bufsize and bufnr: > BUF_SIZE=(512 * 1024) > BUF_NR=(4) Where is the code that imposes these limits? I haven't found it. Did I perhaps overlook something? > diff --git a/include/uapi/linux/blktrace_api.h b/include/uapi/linux/blktrace_api.h > index 690621b610e5..4d9dc44a83f9 100644 > --- a/include/uapi/linux/blktrace_api.h > +++ b/include/uapi/linux/blktrace_api.h > @@ -129,6 +129,9 @@ enum { > }; > > #define BLKTRACE_BDEV_SIZE 32 > +#define BLKTRACE_MAX_BUFSIZ (1024 * 1024) > +#define BLKTRACE_MAX_BUFNR 16 > +#define BLKTRACE_MAX_ALLOC ((BLKTRACE_MAX_BUFNR) * (BLKTRACE_MAX_BUFNR)) Adding these constants to the user space API is a very inflexible approach. There must be better approaches to export constants like these to user space, e.g. through sysfs attributes. It probably will be easier to get a patch like this one upstream if the uapi headers are not modified. > /* > * User setup structure passed with BLKTRACESETUP > diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c > index ea47f2084087..b3b0a8164c05 100644 > --- a/kernel/trace/blktrace.c > +++ b/kernel/trace/blktrace.c > @@ -482,6 +482,9 @@ static int do_blk_trace_setup(struct request_queue *q, char *name, dev_t dev, > if (!buts->buf_size || !buts->buf_nr) > return -EINVAL; > > + if (buts->buf_size * buts->buf_nr > BLKTRACE_MAX_ALLOC) > + return -E2BIG; > + > if (!blk_debugfs_root) > return -ENOENT; Where do the overflow(s) happen if buts->buf_size and buts->buf_nr are large? Is the following code from relay_open() the only code that can overflow? chan->alloc_size = PAGE_ALIGN(subbuf_size * n_subbufs); Thanks, Bart.
Hi Bart, On Thu, Jun 4, 2020 at 9:31 PM Bart Van Assche <bvanassche@acm.org> wrote: > > On 2020-06-03 22:44, Harshad Shirwadkar wrote: > > Make sure that user requested memory via BLKTRACESETUP is within > > bounds. This can be easily exploited by setting really large values > > for buf_size and buf_nr in BLKTRACESETUP ioctl. > > > > blktrace program has following hardcoded values for bufsize and bufnr: > > BUF_SIZE=(512 * 1024) > > BUF_NR=(4) > > Where is the code that imposes these limits? I haven't found it. Did I > perhaps overlook something? From what I understand, these are not limits. These are hardcoded values used by blktrace userspace when it calls BLKTRACESETUP ioctl. Inside the kernel, before this patch there is no checking in the linux kernel. > > > diff --git a/include/uapi/linux/blktrace_api.h b/include/uapi/linux/blktrace_api.h > > index 690621b610e5..4d9dc44a83f9 100644 > > --- a/include/uapi/linux/blktrace_api.h > > +++ b/include/uapi/linux/blktrace_api.h > > @@ -129,6 +129,9 @@ enum { > > }; > > > > #define BLKTRACE_BDEV_SIZE 32 > > +#define BLKTRACE_MAX_BUFSIZ (1024 * 1024) > > +#define BLKTRACE_MAX_BUFNR 16 > > +#define BLKTRACE_MAX_ALLOC ((BLKTRACE_MAX_BUFNR) * (BLKTRACE_MAX_BUFNR)) > > Adding these constants to the user space API is a very inflexible > approach. There must be better approaches to export constants like these > to user space, e.g. through sysfs attributes. > > It probably will be easier to get a patch like this one upstream if the > uapi headers are not modified. Thanks, makes sense. I'll do this in the V2. If we do that, we still need to define the default values for these limits. Do the values chosen for MAX_BUFSIZE and MAX_ALLOC look reasonable? > > > /* > > * User setup structure passed with BLKTRACESETUP > > diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c > > index ea47f2084087..b3b0a8164c05 100644 > > --- a/kernel/trace/blktrace.c > > +++ b/kernel/trace/blktrace.c > > @@ -482,6 +482,9 @@ static int do_blk_trace_setup(struct request_queue *q, char *name, dev_t dev, > > if (!buts->buf_size || !buts->buf_nr) > > return -EINVAL; > > > > + if (buts->buf_size * buts->buf_nr > BLKTRACE_MAX_ALLOC) > > + return -E2BIG; > > + > > if (!blk_debugfs_root) > > return -ENOENT; > > Where do the overflow(s) happen if buts->buf_size and buts->buf_nr are > large? Is the following code from relay_open() the only code that can > overflow? > > chan->alloc_size = PAGE_ALIGN(subbuf_size * n_subbufs); Good catch, I'm not sure if the overflows are checked for, but I maybe wrong. Given the possibility of overflows, perhaps we should be checking for individual upper bounds instead of the product? Thanks, Harshad > > Thanks, > > Bart. >
On 2020-06-04 22:02, harshad shirwadkar wrote: > Hi Bart, > > On Thu, Jun 4, 2020 at 9:31 PM Bart Van Assche <bvanassche@acm.org> wrote: >> >> On 2020-06-03 22:44, Harshad Shirwadkar wrote: >>> Make sure that user requested memory via BLKTRACESETUP is within >>> bounds. This can be easily exploited by setting really large values >>> for buf_size and buf_nr in BLKTRACESETUP ioctl. >>> >>> blktrace program has following hardcoded values for bufsize and bufnr: >>> BUF_SIZE=(512 * 1024) >>> BUF_NR=(4) >> >> Where is the code that imposes these limits? I haven't found it. Did I >> perhaps overlook something? > > From what I understand, these are not limits. These are hardcoded > values used by blktrace userspace when it calls BLKTRACESETUP ioctl. > Inside the kernel, before this patch there is no checking in the linux > kernel. Please do not move limits from userspace tools into the kernel. Doing so would break other user space software (not sure if there is any) that uses the same ioctl but different limits. >>> diff --git a/include/uapi/linux/blktrace_api.h b/include/uapi/linux/blktrace_api.h >>> index 690621b610e5..4d9dc44a83f9 100644 >>> --- a/include/uapi/linux/blktrace_api.h >>> +++ b/include/uapi/linux/blktrace_api.h >>> @@ -129,6 +129,9 @@ enum { >>> }; >>> >>> #define BLKTRACE_BDEV_SIZE 32 >>> +#define BLKTRACE_MAX_BUFSIZ (1024 * 1024) >>> +#define BLKTRACE_MAX_BUFNR 16 >>> +#define BLKTRACE_MAX_ALLOC ((BLKTRACE_MAX_BUFNR) * (BLKTRACE_MAX_BUFNR)) >> >> Adding these constants to the user space API is a very inflexible >> approach. There must be better approaches to export constants like these >> to user space, e.g. through sysfs attributes. >> >> It probably will be easier to get a patch like this one upstream if the >> uapi headers are not modified. > > Thanks, makes sense. I'll do this in the V2. If we do that, we still > need to define the default values for these limits. Do the values > chosen for MAX_BUFSIZE and MAX_ALLOC look reasonable? We typically do not implement arbitrary limits in the kernel. So I'd prefer not to introduce any artificial limits. >> >>> /* >>> * User setup structure passed with BLKTRACESETUP >>> diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c >>> index ea47f2084087..b3b0a8164c05 100644 >>> --- a/kernel/trace/blktrace.c >>> +++ b/kernel/trace/blktrace.c >>> @@ -482,6 +482,9 @@ static int do_blk_trace_setup(struct request_queue *q, char *name, dev_t dev, >>> if (!buts->buf_size || !buts->buf_nr) >>> return -EINVAL; >>> >>> + if (buts->buf_size * buts->buf_nr > BLKTRACE_MAX_ALLOC) >>> + return -E2BIG; >>> + >>> if (!blk_debugfs_root) >>> return -ENOENT; >> >> Where do the overflow(s) happen if buts->buf_size and buts->buf_nr are >> large? Is the following code from relay_open() the only code that can >> overflow? >> >> chan->alloc_size = PAGE_ALIGN(subbuf_size * n_subbufs); > > Good catch, I'm not sure if the overflows are checked for, but I maybe > wrong. Given the possibility of overflows, perhaps we should be > checking for individual upper bounds instead of the product? How about using check_mul_overflow() to catch overflows of this multiplication? There may be other statements that need protection against overflow. Thanks, Bart.
On Fri, Jun 5, 2020 at 6:43 AM Bart Van Assche <bvanassche@acm.org> wrote: > > On 2020-06-04 22:02, harshad shirwadkar wrote: > > Hi Bart, > > > > On Thu, Jun 4, 2020 at 9:31 PM Bart Van Assche <bvanassche@acm.org> wrote: > >> > >> On 2020-06-03 22:44, Harshad Shirwadkar wrote: > >>> Make sure that user requested memory via BLKTRACESETUP is within > >>> bounds. This can be easily exploited by setting really large values > >>> for buf_size and buf_nr in BLKTRACESETUP ioctl. > >>> > >>> blktrace program has following hardcoded values for bufsize and bufnr: > >>> BUF_SIZE=(512 * 1024) > >>> BUF_NR=(4) > >> > >> Where is the code that imposes these limits? I haven't found it. Did I > >> perhaps overlook something? > > > > From what I understand, these are not limits. These are hardcoded > > values used by blktrace userspace when it calls BLKTRACESETUP ioctl. > > Inside the kernel, before this patch there is no checking in the linux > > kernel. > > Please do not move limits from userspace tools into the kernel. Doing so > would break other user space software (not sure if there is any) that > uses the same ioctl but different limits. Ack. > > >>> diff --git a/include/uapi/linux/blktrace_api.h b/include/uapi/linux/blktrace_api.h > >>> index 690621b610e5..4d9dc44a83f9 100644 > >>> --- a/include/uapi/linux/blktrace_api.h > >>> +++ b/include/uapi/linux/blktrace_api.h > >>> @@ -129,6 +129,9 @@ enum { > >>> }; > >>> > >>> #define BLKTRACE_BDEV_SIZE 32 > >>> +#define BLKTRACE_MAX_BUFSIZ (1024 * 1024) > >>> +#define BLKTRACE_MAX_BUFNR 16 > >>> +#define BLKTRACE_MAX_ALLOC ((BLKTRACE_MAX_BUFNR) * (BLKTRACE_MAX_BUFNR)) > >> > >> Adding these constants to the user space API is a very inflexible > >> approach. There must be better approaches to export constants like these > >> to user space, e.g. through sysfs attributes. > >> > >> It probably will be easier to get a patch like this one upstream if the > >> uapi headers are not modified. > > > > Thanks, makes sense. I'll do this in the V2. If we do that, we still > > need to define the default values for these limits. Do the values > > chosen for MAX_BUFSIZE and MAX_ALLOC look reasonable? > > We typically do not implement arbitrary limits in the kernel. So I'd > prefer not to introduce any artificial limits. I see. But my worry is that if we don't check for bounds in the kernel in this case, a non-root user who has access to any block device (even a simple loop device) can make the kernel go out of memory. --- ... int main(int argc, char *argv[]) { struct blk_user_trace_setup buts; int fd; int ret; fd = open(argv[1], O_RDONLY|O_NONBLOCK); memset(&buts, 0, sizeof(buts)); buts.buf_size = ~0; buts.buf_nr = 1; buts.act_mask = 65535; return ioctl(fd, BLKTRACESETUP, &buts); } --- I understand that introducing artificial hard-coded limits is probably not the best approach. However, not having a reasonable sanity checking on the IOCTL arguments, especially when the IOCTL itself doesn't require any special capability checks and the kernel uses those arguments almost as-is for allocating memory seems like a vulnerability that attackers can exploit. > > >> > >>> /* > >>> * User setup structure passed with BLKTRACESETUP > >>> diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c > >>> index ea47f2084087..b3b0a8164c05 100644 > >>> --- a/kernel/trace/blktrace.c > >>> +++ b/kernel/trace/blktrace.c > >>> @@ -482,6 +482,9 @@ static int do_blk_trace_setup(struct request_queue *q, char *name, dev_t dev, > >>> if (!buts->buf_size || !buts->buf_nr) > >>> return -EINVAL; > >>> > >>> + if (buts->buf_size * buts->buf_nr > BLKTRACE_MAX_ALLOC) > >>> + return -E2BIG; > >>> + > >>> if (!blk_debugfs_root) > >>> return -ENOENT; > >> > >> Where do the overflow(s) happen if buts->buf_size and buts->buf_nr are > >> large? Is the following code from relay_open() the only code that can > >> overflow? > >> > >> chan->alloc_size = PAGE_ALIGN(subbuf_size * n_subbufs); > > > > Good catch, I'm not sure if the overflows are checked for, but I maybe > > wrong. Given the possibility of overflows, perhaps we should be > > checking for individual upper bounds instead of the product? > > How about using check_mul_overflow() to catch overflows of this > multiplication? There may be other statements that need protection > against overflow. Makes sense. I can add that check. Thanks, Harshad. > > Thanks, > > Bart.
Bart, On 6/4/20 9:31 PM, Bart Van Assche wrote: > Where is the code that imposes these limits? I haven't found it. Did I > perhaps overlook something?It is in blktrace.c user-space tool. 55 56 /* 57 * You may want to increase this even more, if you are logging at a high 58 * rate and see skipped/missed events 59 */ 60 #define BUF_SIZE (512 * 1024) 61 #define BUF_NR (4) 62 63 #define FILE_VBUF_SIZE (128 * 1024) 64 65 #define DEBUGFS_TYPE (0x64626720) 66 #define TRACE_NET_PORT (8462) 67
Bart, On 6/5/20 6:43 AM, Bart Van Assche wrote: > We typically do not implement arbitrary limits in the kernel. So I'd > prefer not to introduce any artificial limits. That is what I mentioned in [1] that we can add a check suggested in [1]. That way we will not enforce any limits in the kernel and keep the backward compatibility. Do you see any problem with the approach suggested in [1]. [1] https://www.spinics.net/lists/linux-block/msg54754.html
Harshad, On 6/5/20 10:40 AM, harshad shirwadkar wrote: > I see. But my worry is that if we don't check for bounds in the kernel > in this case, a non-root user who has access to any block device (even > a simple loop device) can make the kernel go out of memory. > --- > ... > int main(int argc, char *argv[]) > { > struct blk_user_trace_setup buts; > int fd; > int ret; > > fd = open(argv[1], O_RDONLY|O_NONBLOCK); > > memset(&buts, 0, sizeof(buts)); > buts.buf_size = ~0; > buts.buf_nr = 1; > buts.act_mask = 65535; > return ioctl(fd, BLKTRACESETUP, &buts); > } > --- > > I understand that introducing artificial hard-coded limits is probably > not the best approach. However, not having a reasonable sanity > checking on the IOCTL arguments, especially when the IOCTL itself > doesn't require any special capability checks and the kernel uses > those arguments almost as-is for allocating memory seems like a > vulnerability that attackers can exploit. Yes this is a problem and needs to be fixed with keeping the backward compatibility as per pointed out by Bart. I'm not aware of the applications (apart from blktrace-blkparse userspace tools) which implement block trace userspace interface, can you please point me to the application ? If so those needs to be fixed also than imposing hard limits in kernel.
On 2020-06-07 23:34, Chaitanya Kulkarni wrote: > Bart, > > On 6/4/20 9:31 PM, Bart Van Assche wrote: >> Where is the code that imposes these limits? I haven't found it. Did I >> perhaps overlook something? > > It is in blktrace.c user-space tool. > > 55 > 56 /* > 57 * You may want to increase this even more, if you are logging at > a high > 58 * rate and see skipped/missed events > 59 */ > 60 #define BUF_SIZE (512 * 1024) > 61 #define BUF_NR (4) > 62 > 63 #define FILE_VBUF_SIZE (128 * 1024) > 64 > 65 #define DEBUGFS_TYPE (0x64626720) > 66 #define TRACE_NET_PORT (8462) > 67 Hi Chaitanya, With my question I was referring to kernel code. I think Harshad understood that part of my question. Thanks, Bart.
On 2020-06-07 23:40, Chaitanya Kulkarni wrote: > Bart, > On 6/5/20 6:43 AM, Bart Van Assche wrote: >> We typically do not implement arbitrary limits in the kernel. So I'd >> prefer not to introduce any artificial limits. > > That is what I mentioned in [1] that we can add a check suggested in > [1]. That way we will not enforce any limits in the kernel and keep > the backward compatibility. > > Do you see any problem with the approach suggested in [1]. > > [1] https://www.spinics.net/lists/linux-block/msg54754.html Please take another look at Harshad's patch description. My understanding is that Harshad wants to protect the kernel against malicious user space software. Modifying the user space blktrace software as proposed in [1] doesn't help at all towards the goal of hardening the kernel. Thanks, Bart.
Bart, On 6/8/20 7:20 AM, Bart Van Assche wrote: > On 2020-06-07 23:40, Chaitanya Kulkarni wrote: >> Bart, >> On 6/5/20 6:43 AM, Bart Van Assche wrote: >>> We typically do not implement arbitrary limits in the kernel. So I'd >>> prefer not to introduce any artificial limits. >> That is what I mentioned in [1] that we can add a check suggested in >> [1]. That way we will not enforce any limits in the kernel and keep >> the backward compatibility. >> >> Do you see any problem with the approach suggested in [1]. >> >> [1]https://www.spinics.net/lists/linux-block/msg54754.html > Please take another look at Harshad's patch description. My > understanding is that Harshad wants to protect the kernel against > malicious user space software. Modifying the user space blktrace > software as proposed in [1] doesn't help at all towards the goal of > hardening the kernel. > > Thanks, > > Bart. > Hmmm, I agree that we need fix for that. What I did't understand that why we don't need userspace fix ? Also, what is a right way to impose these limits without having any bounds in kernel ? Either I did not understand your comment(s) or I'm confuse. Can you please elaborate ?
Given that this is a kernel bug and can be exploited by any user-space application, the fix in the kernel is a must have. But we don't want to break any existing user-space applications. So, we do need to go make sure that this patch doesn't break any existing applications. I originally sent this patch assuming that blktrace is the only user of this IOCTL. So, if anyone knows any other callers that we need to investigate, please let me know. I have verified that these limits don't break blktrace. On Mon, Jun 8, 2020 at 2:59 PM Chaitanya Kulkarni <Chaitanya.Kulkarni@wdc.com> wrote: > > Bart, > On 6/8/20 7:20 AM, Bart Van Assche wrote: > > On 2020-06-07 23:40, Chaitanya Kulkarni wrote: > >> Bart, > >> On 6/5/20 6:43 AM, Bart Van Assche wrote: > >>> We typically do not implement arbitrary limits in the kernel. So I'd > >>> prefer not to introduce any artificial limits. > >> That is what I mentioned in [1] that we can add a check suggested in > >> [1]. That way we will not enforce any limits in the kernel and keep > >> the backward compatibility. > >> > >> Do you see any problem with the approach suggested in [1]. > >> > >> [1]https://www.spinics.net/lists/linux-block/msg54754.html > > Please take another look at Harshad's patch description. My > > understanding is that Harshad wants to protect the kernel against > > malicious user space software. Modifying the user space blktrace > > software as proposed in [1] doesn't help at all towards the goal of > > hardening the kernel. > > > > Thanks, > > > > Bart. > > > > Hmmm, I agree that we need fix for that. What I did't understand that > why we don't need userspace fix ? > > Also, what is a right way to impose these limits without having any > bounds in kernel ? From what I understand, there's no alternative to having a fix in the kernel. That's because, if the kernel is not fixed and only the commonly used user-space apps are fixed, I can always write a new program to break the kernel. So, as mentioned above, maybe we can make these limits configurable via sysfs but we'll need these bound checks in the kernel. Thanks, Harshad > > Either I did not understand your comment(s) or I'm confuse. > > Can you please elaborate ?
On 6/8/20 4:40 PM, harshad shirwadkar wrote: > From what I understand, there's no alternative to having a fix in the > kernel. That's because, if the kernel is not fixed and only the > commonly used user-space apps are fixed, I can always write a new > program to break the kernel. So, as mentioned above, maybe we can make > these limits configurable via sysfs but we'll need these bound checks > in the kernel. Okay, thanks for the explanation.
diff --git a/include/uapi/linux/blktrace_api.h b/include/uapi/linux/blktrace_api.h index 690621b610e5..4d9dc44a83f9 100644 --- a/include/uapi/linux/blktrace_api.h +++ b/include/uapi/linux/blktrace_api.h @@ -129,6 +129,9 @@ enum { }; #define BLKTRACE_BDEV_SIZE 32 +#define BLKTRACE_MAX_BUFSIZ (1024 * 1024) +#define BLKTRACE_MAX_BUFNR 16 +#define BLKTRACE_MAX_ALLOC ((BLKTRACE_MAX_BUFNR) * (BLKTRACE_MAX_BUFNR)) /* * User setup structure passed with BLKTRACESETUP diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c index ea47f2084087..b3b0a8164c05 100644 --- a/kernel/trace/blktrace.c +++ b/kernel/trace/blktrace.c @@ -482,6 +482,9 @@ static int do_blk_trace_setup(struct request_queue *q, char *name, dev_t dev, if (!buts->buf_size || !buts->buf_nr) return -EINVAL; + if (buts->buf_size * buts->buf_nr > BLKTRACE_MAX_ALLOC) + return -E2BIG; + if (!blk_debugfs_root) return -ENOENT;
Make sure that user requested memory via BLKTRACESETUP is within bounds. This can be easily exploited by setting really large values for buf_size and buf_nr in BLKTRACESETUP ioctl. blktrace program has following hardcoded values for bufsize and bufnr: BUF_SIZE=(512 * 1024) BUF_NR=(4) We add buffer to this and define the upper bound to be as follows: BUF_SIZE=(1024 * 1024) BUF_NR=(16) This is very easy to exploit. Setting buf_size / buf_nr in userspace program to big values make kernel go oom. Verified that the fix makes BLKTRACESETUP return -E2BIG if the buf_size * buf_nr crosses the upper bound. Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> --- include/uapi/linux/blktrace_api.h | 3 +++ kernel/trace/blktrace.c | 3 +++ 2 files changed, 6 insertions(+)