diff mbox series

block: check for page size in queue_logical_block_size()

Message ID 20200601005520.420719-1-mfo@canonical.com (mailing list archive)
State New, archived
Headers show
Series block: check for page size in queue_logical_block_size() | expand

Commit Message

Mauricio Faria de Oliveira June 1, 2020, 12:55 a.m. UTC
It's possible for a block driver to set logical block size to
a value greater than page size incorrectly; e.g. bcache takes
the value from the superblock, set by the user w/ make-bcache.

This causes a BUG/NULL pointer dereference in the path:

  __blkdev_get()
  -> set_init_blocksize() // set i_blkbits based on ...
     -> bdev_logical_block_size()
        -> queue_logical_block_size() // ... this value
  -> bdev_disk_changed()
     ...
     -> blkdev_readpage()
        -> block_read_full_page()
           -> create_page_buffers() // size = 1 << i_blkbits
              -> create_empty_buffers() // give size/take pointer
                 -> alloc_page_buffers() // return NULL
                 .. BUG!

Because alloc_page_buffers() is called with size > PAGE_SIZE,
thus it initializes head = NULL, skips the loop, return head;
then create_empty_buffers() gets (and uses) the NULL pointer.

This has been around longer than commit ad6bf88a6c19 ("block:
fix an integer overflow in logical block size"); however, it
increased the range of values that can trigger the issue.

Previously only 8k/16k/32k (on x86/4k page size) would do it,
as greater values overflow unsigned short to zero, and queue_
logical_block_size() would then use the default of 512.

Now the range with unsigned int is much larger, and one user
with an (incorrect) 512k value, which happened to be zero'ed
previously and work fine, hits the issue -- the zero is gone,
and queue_logical_block_size() does return 512k (> PAGE_SIZE)

Fix this for the general case in queue_logical_block_size(),
regardless of the driver-side fault/fix, to prevent current
and future issues, while drivers adjust to the right values.

Test-case:

    # IMG=/root/disk.img
    # dd if=/dev/zero of=$IMG bs=1 count=0 seek=1G
    # DEV=$(losetup --find --show $IMG)
    # make-bcache --bdev $DEV --block 8k
      < see dmesg >

Before:

    # uname -rv
    5.7.0-rc7

    [   55.944046] BUG: kernel NULL pointer dereference, address: 0000000000000000
    ...
    [   55.949742] CPU: 3 PID: 610 Comm: bcache-register Not tainted 5.7.0-rc7 #4
    ...
    [   55.952281] RIP: 0010:create_empty_buffers+0x1a/0x100
    ...
    [   55.966434] Call Trace:
    [   55.967021]  create_page_buffers+0x48/0x50
    [   55.967834]  block_read_full_page+0x49/0x380
    [   55.972181]  do_read_cache_page+0x494/0x610
    [   55.974780]  read_part_sector+0x2d/0xaa
    [   55.975558]  read_lba+0x10e/0x1e0
    [   55.977904]  efi_partition+0x120/0x5a6
    [   55.980227]  blk_add_partitions+0x161/0x390
    [   55.982177]  bdev_disk_changed+0x61/0xd0
    [   55.982961]  __blkdev_get+0x350/0x490
    [   55.983715]  __device_add_disk+0x318/0x480
    [   55.984539]  bch_cached_dev_run+0xc5/0x270
    [   55.986010]  register_bcache.cold+0x122/0x179
    [   55.987628]  kernfs_fop_write+0xbc/0x1a0
    [   55.988416]  vfs_write+0xb1/0x1a0
    [   55.989134]  ksys_write+0x5a/0xd0
    [   55.989825]  do_syscall_64+0x43/0x140
    [   55.990563]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
    [   55.991519] RIP: 0033:0x7f7d60ba3154
    ...

After:

    # uname -rv
    5.7.0-rc7.chkpgsz1

    [   46.313306] bcache: register_bdev() registered backing device loop0

    # grep ^ /sys/block/bcache0/queue/*_block_size
    /sys/block/bcache0/queue/logical_block_size:512
    /sys/block/bcache0/queue/physical_block_size:8192

Reported-by: Ryan Finnie <ryan@finnie.org>
Reported-by: Sebastian Marsching <sebastian@marsching.com>
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
---
 include/linux/blkdev.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Ming Lei June 1, 2020, 7:34 a.m. UTC | #1
On Sun, May 31, 2020 at 09:55:20PM -0300, Mauricio Faria de Oliveira wrote:
> It's possible for a block driver to set logical block size to
> a value greater than page size incorrectly; e.g. bcache takes
> the value from the superblock, set by the user w/ make-bcache.
> 
> This causes a BUG/NULL pointer dereference in the path:
> 
>   __blkdev_get()
>   -> set_init_blocksize() // set i_blkbits based on ...
>      -> bdev_logical_block_size()
>         -> queue_logical_block_size() // ... this value
>   -> bdev_disk_changed()
>      ...
>      -> blkdev_readpage()
>         -> block_read_full_page()
>            -> create_page_buffers() // size = 1 << i_blkbits
>               -> create_empty_buffers() // give size/take pointer
>                  -> alloc_page_buffers() // return NULL
>                  .. BUG!
> 
> Because alloc_page_buffers() is called with size > PAGE_SIZE,
> thus it initializes head = NULL, skips the loop, return head;
> then create_empty_buffers() gets (and uses) the NULL pointer.
> 
> This has been around longer than commit ad6bf88a6c19 ("block:
> fix an integer overflow in logical block size"); however, it
> increased the range of values that can trigger the issue.
> 
> Previously only 8k/16k/32k (on x86/4k page size) would do it,
> as greater values overflow unsigned short to zero, and queue_
> logical_block_size() would then use the default of 512.
> 
> Now the range with unsigned int is much larger, and one user
> with an (incorrect) 512k value, which happened to be zero'ed
> previously and work fine, hits the issue -- the zero is gone,
> and queue_logical_block_size() does return 512k (> PAGE_SIZE)

There is only very limited such potential users(loop, virtio-blk,
xen-blkfront), so could you fix the user instead of working around
queue_logical_block_size()?


thanks, 
Ming
Mauricio Faria de Oliveira June 1, 2020, 1:47 p.m. UTC | #2
Hi Ming,

(sorry, re-sending in plain text; previous reply had HTML by mistake,
and bounced in linux-block.)

On Mon, Jun 1, 2020 at 4:34 AM Ming Lei <ming.lei@redhat.com> wrote:
>
> On Sun, May 31, 2020 at 09:55:20PM -0300, Mauricio Faria de Oliveira wrote:
> > It's possible for a block driver to set logical block size to
> > a value greater than page size incorrectly; e.g. bcache takes
> > the value from the superblock, set by the user w/ make-bcache.
> >
> > This causes a BUG/NULL pointer dereference in the path:
> >
> >   __blkdev_get()
> >   -> set_init_blocksize() // set i_blkbits based on ...
> >      -> bdev_logical_block_size()
> >         -> queue_logical_block_size() // ... this value
> >   -> bdev_disk_changed()
> >      ...
> >      -> blkdev_readpage()
> >         -> block_read_full_page()
> >            -> create_page_buffers() // size = 1 << i_blkbits
> >               -> create_empty_buffers() // give size/take pointer
> >                  -> alloc_page_buffers() // return NULL
> >                  .. BUG!
> >
> > Because alloc_page_buffers() is called with size > PAGE_SIZE,
> > thus it initializes head = NULL, skips the loop, return head;
> > then create_empty_buffers() gets (and uses) the NULL pointer.
> >
> > This has been around longer than commit ad6bf88a6c19 ("block:
> > fix an integer overflow in logical block size"); however, it
> > increased the range of values that can trigger the issue.
> >
> > Previously only 8k/16k/32k (on x86/4k page size) would do it,
> > as greater values overflow unsigned short to zero, and queue_
> > logical_block_size() would then use the default of 512.
> >
> > Now the range with unsigned int is much larger, and one user
> > with an (incorrect) 512k value, which happened to be zero'ed
> > previously and work fine, hits the issue -- the zero is gone,
> > and queue_logical_block_size() does return 512k (> PAGE_SIZE)
>
> There is only very limited such potential users(loop, virtio-blk,
> xen-blkfront), so could you fix the user instead of working around
> queue_logical_block_size()?
>

Thanks for reviewing.

I can take a look at that, sure, but think the current approach may
still be useful? as it prevents the current, and future potential
users too.

Cheers,

> thanks,
> Ming
>
Mauricio Faria de Oliveira June 3, 2020, 11:33 a.m. UTC | #3
On Mon, Jun 1, 2020 at 10:47 AM Mauricio Faria de Oliveira
<mfo@canonical.com> wrote:
>
> Hi Ming,
>
> (sorry, re-sending in plain text; previous reply had HTML by mistake,
> and bounced in linux-block.)
>
> On Mon, Jun 1, 2020 at 4:34 AM Ming Lei <ming.lei@redhat.com> wrote:
> >
> > On Sun, May 31, 2020 at 09:55:20PM -0300, Mauricio Faria de Oliveira wrote:
> > > It's possible for a block driver to set logical block size to
> > > a value greater than page size incorrectly; e.g. bcache takes
> > > the value from the superblock, set by the user w/ make-bcache.
> > >
> > > This causes a BUG/NULL pointer dereference in the path:
> > >
> > >   __blkdev_get()
> > >   -> set_init_blocksize() // set i_blkbits based on ...
> > >      -> bdev_logical_block_size()
> > >         -> queue_logical_block_size() // ... this value
> > >   -> bdev_disk_changed()
> > >      ...
> > >      -> blkdev_readpage()
> > >         -> block_read_full_page()
> > >            -> create_page_buffers() // size = 1 << i_blkbits
> > >               -> create_empty_buffers() // give size/take pointer
> > >                  -> alloc_page_buffers() // return NULL
> > >                  .. BUG!
> > >
> > > Because alloc_page_buffers() is called with size > PAGE_SIZE,
> > > thus it initializes head = NULL, skips the loop, return head;
> > > then create_empty_buffers() gets (and uses) the NULL pointer.
> > >
> > > This has been around longer than commit ad6bf88a6c19 ("block:
> > > fix an integer overflow in logical block size"); however, it
> > > increased the range of values that can trigger the issue.
> > >
> > > Previously only 8k/16k/32k (on x86/4k page size) would do it,
> > > as greater values overflow unsigned short to zero, and queue_
> > > logical_block_size() would then use the default of 512.
> > >
> > > Now the range with unsigned int is much larger, and one user
> > > with an (incorrect) 512k value, which happened to be zero'ed
> > > previously and work fine, hits the issue -- the zero is gone,
> > > and queue_logical_block_size() does return 512k (> PAGE_SIZE)
> >
> > There is only very limited such potential users(loop, virtio-blk,
> > xen-blkfront), so could you fix the user instead of working around
> > queue_logical_block_size()?
> >
>
> Thanks for reviewing.
>
> I can take a look at that, sure, but think the current approach may
> still be useful? as it prevents the current, and future potential
> users too.
>

Please disregard this patch.

Giving this more thought, it's not a good idea to "prevent" any issues
here -- that would actually mask them.
It's probably better to let current issues break, to identify and fix
them (e.g., this), and especially future issues, to hit/fix before
landing.

Thanks,

> Cheers,
>
> > thanks,
> > Ming
> >
>
>
> --
> Mauricio Faria de Oliveira
diff mbox series

Patch

diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 32868fbedc9e..fb9dfc8c7e68 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1297,7 +1297,8 @@  static inline unsigned queue_logical_block_size(const struct request_queue *q)
 {
 	int retval = 512;
 
-	if (q && q->limits.logical_block_size)
+	if (q && q->limits.logical_block_size &&
+		 q->limits.logical_block_size <= PAGE_SIZE)
 		retval = q->limits.logical_block_size;
 
 	return retval;