diff mbox series

[V2] fs: fix guard_bio_eod to check for real EOD errors

Message ID 20190226105150.10717-1-cmaiolino@redhat.com (mailing list archive)
State New, archived
Headers show
Series [V2] fs: fix guard_bio_eod to check for real EOD errors | expand

Commit Message

Carlos Maiolino Feb. 26, 2019, 10:51 a.m. UTC
guard_bio_eod() can truncate a segment in bio to allow it to do IO on
odd last sectors of a device.

It already checks if the IO starts past EOD, but it does not consider
the possibility of an IO request starting within device boundaries can
contain more than one segment past EOD.

In such cases, truncated_bytes can be bigger than PAGE_SIZE, and will
underflow bvec->bv_len.

Fix this by checking if truncated_bytes is lower than PAGE_SIZE.

This situation has been found on filesystems such as isofs and vfat,
which doesn't check the device size before mount, if the device is
smaller than the filesystem itself, a readahead on such filesystem,
which spans EOD, can trigger this situation, leading a call to
zero_user() with a wrong size possibly corrupting memory.

I didn't see any crash, or didn't let the system run long enough to
check if memory corruption will be hit somewhere, but adding
instrumentation to guard_bio_end() to check truncated_bytes size, was
enough to see the error.

The following script can trigger the error.

MNT=/mnt
IMG=./DISK.img
DEV=/dev/loop0

mkfs.vfat $IMG
mount $IMG $MNT
cp -R /etc $MNT &> /dev/null
umount $MNT

losetup -D

losetup --find --show --sizelimit 16247280 $IMG
mount $DEV $MNT

find $MNT -type f -exec cat {} + >/dev/null

Kudos to Eric Sandeen for coming up with the reproducer above

Changelog:

	V2: Compare truncated_bytes agains bvec->bv_len instead of
	    PAGE_SIZE

Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
---
 fs/buffer.c | 7 +++++++
 1 file changed, 7 insertions(+)

Comments

Ming Lei Feb. 26, 2019, 10:55 a.m. UTC | #1
On Tue, Feb 26, 2019 at 6:52 PM Carlos Maiolino <cmaiolino@redhat.com> wrote:
>
> guard_bio_eod() can truncate a segment in bio to allow it to do IO on
> odd last sectors of a device.
>
> It already checks if the IO starts past EOD, but it does not consider
> the possibility of an IO request starting within device boundaries can
> contain more than one segment past EOD.
>
> In such cases, truncated_bytes can be bigger than PAGE_SIZE, and will
> underflow bvec->bv_len.
>
> Fix this by checking if truncated_bytes is lower than PAGE_SIZE.
>
> This situation has been found on filesystems such as isofs and vfat,
> which doesn't check the device size before mount, if the device is
> smaller than the filesystem itself, a readahead on such filesystem,
> which spans EOD, can trigger this situation, leading a call to
> zero_user() with a wrong size possibly corrupting memory.
>
> I didn't see any crash, or didn't let the system run long enough to
> check if memory corruption will be hit somewhere, but adding
> instrumentation to guard_bio_end() to check truncated_bytes size, was
> enough to see the error.
>
> The following script can trigger the error.
>
> MNT=/mnt
> IMG=./DISK.img
> DEV=/dev/loop0
>
> mkfs.vfat $IMG
> mount $IMG $MNT
> cp -R /etc $MNT &> /dev/null
> umount $MNT
>
> losetup -D
>
> losetup --find --show --sizelimit 16247280 $IMG
> mount $DEV $MNT
>
> find $MNT -type f -exec cat {} + >/dev/null
>
> Kudos to Eric Sandeen for coming up with the reproducer above
>
> Changelog:
>
>         V2: Compare truncated_bytes agains bvec->bv_len instead of
>             PAGE_SIZE
>
> Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
> ---
>  fs/buffer.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/fs/buffer.c b/fs/buffer.c
> index 784de3dbcf28..b43347896ce0 100644
> --- a/fs/buffer.c
> +++ b/fs/buffer.c
> @@ -3063,6 +3063,13 @@ void guard_bio_eod(int op, struct bio *bio)
>         /* Uhhuh. We've got a bio that straddles the device size! */
>         truncated_bytes = bio->bi_iter.bi_size - (maxsector << 9);
>
> +       /*
> +        * The bio contains more than one segment which spans EOD, just return
> +        * and let IO layer turn it into an EIO
> +        */
> +       if (truncated_bytes > bvec->bv_len)
> +               return;
> +
>         /* Truncate the bio.. */
>         bio->bi_iter.bi_size -= truncated_bytes;
>         bvec->bv_len -= truncated_bytes;

Reviewed-by: Ming Lei <ming.lei@redhat.com>


Thanks,
Ming Lei
Jens Axboe Feb. 28, 2019, 9 p.m. UTC | #2
On 2/26/19 3:51 AM, Carlos Maiolino wrote:
> guard_bio_eod() can truncate a segment in bio to allow it to do IO on
> odd last sectors of a device.
> 
> It already checks if the IO starts past EOD, but it does not consider
> the possibility of an IO request starting within device boundaries can
> contain more than one segment past EOD.
> 
> In such cases, truncated_bytes can be bigger than PAGE_SIZE, and will
> underflow bvec->bv_len.
> 
> Fix this by checking if truncated_bytes is lower than PAGE_SIZE.
> 
> This situation has been found on filesystems such as isofs and vfat,
> which doesn't check the device size before mount, if the device is
> smaller than the filesystem itself, a readahead on such filesystem,
> which spans EOD, can trigger this situation, leading a call to
> zero_user() with a wrong size possibly corrupting memory.
> 
> I didn't see any crash, or didn't let the system run long enough to
> check if memory corruption will be hit somewhere, but adding
> instrumentation to guard_bio_end() to check truncated_bytes size, was
> enough to see the error.
> 
> The following script can trigger the error.
> 
> MNT=/mnt
> IMG=./DISK.img
> DEV=/dev/loop0
> 
> mkfs.vfat $IMG
> mount $IMG $MNT
> cp -R /etc $MNT &> /dev/null
> umount $MNT
> 
> losetup -D
> 
> losetup --find --show --sizelimit 16247280 $IMG
> mount $DEV $MNT
> 
> find $MNT -type f -exec cat {} + >/dev/null
> 
> Kudos to Eric Sandeen for coming up with the reproducer above
> 
> Changelog:
> 
> 	V2: Compare truncated_bytes agains bvec->bv_len instead of
> 	    PAGE_SIZE

Applied - note I snipped your changelog, that should go below the ---
lines to not end up in the commit message.
Carlos Maiolino March 1, 2019, 10:07 a.m. UTC | #3
On Thu, Feb 28, 2019 at 02:00:17PM -0700, Jens Axboe wrote:
> On 2/26/19 3:51 AM, Carlos Maiolino wrote:
> > guard_bio_eod() can truncate a segment in bio to allow it to do IO on
> > odd last sectors of a device.
> > 
> > It already checks if the IO starts past EOD, but it does not consider
> > the possibility of an IO request starting within device boundaries can
> > contain more than one segment past EOD.
> > 
> > In such cases, truncated_bytes can be bigger than PAGE_SIZE, and will
> > underflow bvec->bv_len.
> > 
> > Fix this by checking if truncated_bytes is lower than PAGE_SIZE.
> > 
> > This situation has been found on filesystems such as isofs and vfat,
> > which doesn't check the device size before mount, if the device is
> > smaller than the filesystem itself, a readahead on such filesystem,
> > which spans EOD, can trigger this situation, leading a call to
> > zero_user() with a wrong size possibly corrupting memory.
> > 
> > I didn't see any crash, or didn't let the system run long enough to
> > check if memory corruption will be hit somewhere, but adding
> > instrumentation to guard_bio_end() to check truncated_bytes size, was
> > enough to see the error.
> > 
> > The following script can trigger the error.
> > 
> > MNT=/mnt
> > IMG=./DISK.img
> > DEV=/dev/loop0
> > 
> > mkfs.vfat $IMG
> > mount $IMG $MNT
> > cp -R /etc $MNT &> /dev/null
> > umount $MNT
> > 
> > losetup -D
> > 
> > losetup --find --show --sizelimit 16247280 $IMG
> > mount $DEV $MNT
> > 
> > find $MNT -type f -exec cat {} + >/dev/null
> > 
> > Kudos to Eric Sandeen for coming up with the reproducer above
> > 
> > Changelog:
> > 
> > 	V2: Compare truncated_bytes agains bvec->bv_len instead of
> > 	    PAGE_SIZE
> 
> Applied - note I snipped your changelog, that should go below the ---
> lines to not end up in the commit message.
> 

Thanks Jens, and my apologies, I'll make sure to add the Changelog below the
cutting line next time.

> -- 
> Jens Axboe
>
diff mbox series

Patch

diff --git a/fs/buffer.c b/fs/buffer.c
index 784de3dbcf28..b43347896ce0 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -3063,6 +3063,13 @@  void guard_bio_eod(int op, struct bio *bio)
 	/* Uhhuh. We've got a bio that straddles the device size! */
 	truncated_bytes = bio->bi_iter.bi_size - (maxsector << 9);
 
+	/*
+	 * The bio contains more than one segment which spans EOD, just return
+	 * and let IO layer turn it into an EIO
+	 */
+	if (truncated_bytes > bvec->bv_len)
+		return;
+
 	/* Truncate the bio.. */
 	bio->bi_iter.bi_size -= truncated_bytes;
 	bvec->bv_len -= truncated_bytes;