
block: Deny writable memory mapping if block is read-only

Message ID 20230510074223.991297-1-loic.poulain@linaro.org (mailing list archive)
State New, archived

Commit Message

Loic Poulain May 10, 2023, 7:42 a.m. UTC
Users should not be able to write to a block device if it is read-only
at the block level (e.g. via the force_ro attribute). This is enforced in
the regular fops write operation (blkdev_write_iter) but not when writing
via a user mapping (mmap), which allows a user to actually write to a
read-only block device through a PROT_WRITE mapping.

Example: This can lead to integrity issues on an eMMC boot partition
(e.g. mmcblk0boot0), which is read-only by default.

To fix this issue, simply deny shared writable mappings if the block
device is read-only.

Note: The block device remains writable if the switch to read-only is
performed after the initial mapping, but this is expected behavior
according to commit a32e236eb93e ("Partially revert "block: fail
op_is_write() requests to read-only partitions"").

Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
---
 block/fops.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)
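
As a rough illustration of the check this patch introduces, the sketch
below models the decision in user-space C. It assumes the read-only path
behaves like generic_file_readonly_mmap(), i.e. it rejects a mapping whose
flags include both VM_SHARED and VM_MAYWRITE with -EINVAL. The
struct model_bdev type, the model_blkdev_mmap() helper and the MODEL_VM_*
constants are hypothetical names used only for this sketch, not kernel
symbols.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative bit values standing in for VM_SHARED and VM_MAYWRITE. */
#define MODEL_VM_SHARED   0x00000008UL
#define MODEL_VM_MAYWRITE 0x00000020UL

/* Hypothetical stand-in for struct block_device: only the read-only flag. */
struct model_bdev {
	bool read_only;
};

/*
 * Models the decision made by the patched blkdev_mmap(): a read-only
 * device takes the read-only mmap path, which is assumed to reject any
 * mapping that is shared and may become writable.
 * Returns 0 when the mapping would be allowed, -EINVAL otherwise.
 */
static int model_blkdev_mmap(const struct model_bdev *bdev,
			     unsigned long vm_flags)
{
	assert(bdev != NULL);

	if (bdev->read_only &&
	    (vm_flags & MODEL_VM_SHARED) && (vm_flags & MODEL_VM_MAYWRITE))
		return -EINVAL;

	return 0;
}
```

Under these assumptions, a shared mapping that may become writable is
refused on a read-only device, while private mappings and mappings without
VM_MAYWRITE still go through. The check runs only when the mapping is
created, which is why a mapping established before the switch to read-only
stays writable.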

Comments

Christoph Hellwig May 10, 2023, 1:27 p.m. UTC | #1
On Wed, May 10, 2023 at 09:42:23AM +0200, Loic Poulain wrote:
> Users should not be able to write to a block device if it is read-only
> at the block level (e.g. via the force_ro attribute). This is enforced in
> the regular fops write operation (blkdev_write_iter) but not when writing
> via a user mapping (mmap), which allows a user to actually write to a
> read-only block device through a PROT_WRITE mapping.
>
> Example: This can lead to integrity issues on an eMMC boot partition
> (e.g. mmcblk0boot0), which is read-only by default.
>
> To fix this issue, simply deny shared writable mappings if the block
> device is read-only.
>
> Note: The block device remains writable if the switch to read-only is
> performed after the initial mapping, but this is expected behavior
> according to commit a32e236eb93e ("Partially revert "block: fail
> op_is_write() requests to read-only partitions"").

We should never be able to mmap something (shared-)writable if the
file descriptor is not writable.
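
For reference, the descriptor-level rule can be observed from user space:
mmap(2) documents EACCES for a MAP_SHARED mapping with PROT_WRITE on a
descriptor that is not open for writing. The sketch below is a small probe
for that behavior; try_shared_writable_mmap() is a hypothetical helper
name, and the one-page mapping length is an arbitrary choice.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Open `path` with `open_flags` and try a one-page shared writable mapping.
 * Returns 0 if mmap() succeeds, otherwise the errno of the failing call.
 */
static int try_shared_writable_mmap(const char *path, int open_flags)
{
	assert(path != NULL);

	int fd = open(path, open_flags);
	if (fd < 0)
		return errno;

	long page = sysconf(_SC_PAGESIZE);
	void *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
		       MAP_SHARED, fd, 0);
	/* Save errno before munmap()/close() can overwrite it. */
	int err = (p == MAP_FAILED) ? errno : 0;

	if (p != MAP_FAILED)
		munmap(p, (size_t)page);
	close(fd);
	return err;
}
```

Calling the helper with O_RDONLY should yield EACCES regardless of the
device state. The case discussed in this thread is the other one: a
read-only block device opened with O_RDWR, where the call would be expected
to return 0 before the patch and EINVAL with it applied.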

And ... we don't fail writable opens for block devices ?!?

Something like this is what we would need, but I really need to look
into the history of this whole thing a bit more:


diff --git a/block/bdev.c b/block/bdev.c
index 1795c7d4b99efa..6dd6045672b8bf 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -724,6 +724,11 @@ struct block_device *blkdev_get_by_dev(dev_t dev, fmode_t mode, void *holder)
 		return ERR_PTR(-ENXIO);
 	disk = bdev->bd_disk;
 
+	if ((mode & FMODE_WRITE) && bdev_read_only(bdev)) {
+		ret = -EACCES;
+		goto put_blkdev;
+	}
+
 	if (mode & FMODE_EXCL) {
 		ret = bd_prepare_to_claim(bdev, holder);
 		if (ret)
@@ -798,7 +803,6 @@ EXPORT_SYMBOL(blkdev_get_by_dev);
 struct block_device *blkdev_get_by_path(const char *path, fmode_t mode,
 					void *holder)
 {
-	struct block_device *bdev;
 	dev_t dev;
 	int error;
 
@@ -806,13 +810,7 @@ struct block_device *blkdev_get_by_path(const char *path, fmode_t mode,
 	if (error)
 		return ERR_PTR(error);
 
-	bdev = blkdev_get_by_dev(dev, mode, holder);
-	if (!IS_ERR(bdev) && (mode & FMODE_WRITE) && bdev_read_only(bdev)) {
-		blkdev_put(bdev, mode);
-		return ERR_PTR(-EACCES);
-	}
-
-	return bdev;
+	return blkdev_get_by_dev(dev, mode, holder);
 }
 EXPORT_SYMBOL(blkdev_get_by_path);
Loic Poulain May 10, 2023, 2:17 p.m. UTC | #2
On Wed, 10 May 2023 at 15:27, Christoph Hellwig <hch@infradead.org> wrote:
>
> On Wed, May 10, 2023 at 09:42:23AM +0200, Loic Poulain wrote:
> > Users should not be able to write to a block device if it is read-only
> > at the block level (e.g. via the force_ro attribute). This is enforced in
> > the regular fops write operation (blkdev_write_iter) but not when writing
> > via a user mapping (mmap), which allows a user to actually write to a
> > read-only block device through a PROT_WRITE mapping.
> >
> > Example: This can lead to integrity issues on an eMMC boot partition
> > (e.g. mmcblk0boot0), which is read-only by default.
> >
> > To fix this issue, simply deny shared writable mappings if the block
> > device is read-only.
> >
> > Note: The block device remains writable if the switch to read-only is
> > performed after the initial mapping, but this is expected behavior
> > according to commit a32e236eb93e ("Partially revert "block: fail
> > op_is_write() requests to read-only partitions"").
>
> We should never be able to mmap something (shared-)writable if the
> file descriptor is not writable.
>
> And ... we don't fail writable opens for block devices ?!?

No, because the file itself is writable, but not the underlying block.
I agree, it would make more sense to simply deny the open in the block
fops instead... but it could be considered a uapi breakage, as we may
have some existing applications opening the device RW and simply
ignoring/discarding the write syscall errors for ro devices... but if
it's acceptable let's do it. For sure, we could argue that making the
mmap fail is also a change in uapi behavior, but short of reconsidering
a32e236eb93e, which may be obsolete today, I don't see a better
solution to prevent unwanted writing.



>
> Something like this is what we would need, but I really need to look
> into the history of this whole thing a bit more:
>
>
> diff --git a/block/bdev.c b/block/bdev.c
> index 1795c7d4b99efa..6dd6045672b8bf 100644
> --- a/block/bdev.c
> +++ b/block/bdev.c
> @@ -724,6 +724,11 @@ struct block_device *blkdev_get_by_dev(dev_t dev, fmode_t mode, void *holder)
>                 return ERR_PTR(-ENXIO);
>         disk = bdev->bd_disk;
>
> +       if ((mode & FMODE_WRITE) && bdev_read_only(bdev)) {
> +               ret = -EACCES;
> +               goto put_blkdev;
> +       }
> +
>         if (mode & FMODE_EXCL) {
>                 ret = bd_prepare_to_claim(bdev, holder);
>                 if (ret)
> @@ -798,7 +803,6 @@ EXPORT_SYMBOL(blkdev_get_by_dev);
>  struct block_device *blkdev_get_by_path(const char *path, fmode_t mode,
>                                         void *holder)
>  {
> -       struct block_device *bdev;
>         dev_t dev;
>         int error;
>
> @@ -806,13 +810,7 @@ struct block_device *blkdev_get_by_path(const char *path, fmode_t mode,
>         if (error)
>                 return ERR_PTR(error);
>
> -       bdev = blkdev_get_by_dev(dev, mode, holder);
> -       if (!IS_ERR(bdev) && (mode & FMODE_WRITE) && bdev_read_only(bdev)) {
> -               blkdev_put(bdev, mode);
> -               return ERR_PTR(-EACCES);
> -       }
> -
> -       return bdev;
> +       return blkdev_get_by_dev(dev, mode, holder);
>  }
>  EXPORT_SYMBOL(blkdev_get_by_path);
>
Christoph Hellwig May 15, 2023, 9:26 a.m. UTC | #3
On Wed, May 10, 2023 at 04:17:01PM +0200, Loic Poulain wrote:
> No, because the file itself is writable, but not the underlying block.

Eww.

> I agree, it would make more sense to simply deny the open in the block
> fops instead... but it could be considered a uapi breakage, as we may
> have some existing applications opening the device RW and simply
> ignoring/discarding the write syscall errors for ro devices...

True.  I suspect the right thing might be to still only open the device
read-only.  Which brings us to the next mess that ->open for block
devices is only called once, but different openers might have different
flags (with write and nodelay being the ones that matter).

> but if it's
> acceptable let's do it. For sure, we could argue that making the mmap
> fail is also a change in uapi behavior,

We really should fail it.  Unlike the device open, where allowing it
feels wrong to me but at least has use cases, being able to create a
shared writable mmap doesn't have any point.

> but short of reconsidering
> a32e236eb93e, which may be obsolete today, I don't see a better
> solution to prevent unwanted writing.

I think we need to absolutely reconsider it (in addition to your patch),
especially as I just got another report related to it.  I'll need to
talk to the DM folks to figure out if we can do a workaround in dm
somehow.
Christoph Hellwig May 15, 2023, 9:30 a.m. UTC | #4
Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>
Jens Axboe May 20, 2023, 2:17 a.m. UTC | #5
On Wed, 10 May 2023 09:42:23 +0200, Loic Poulain wrote:
> Users should not be able to write to a block device if it is read-only
> at the block level (e.g. via the force_ro attribute). This is enforced in
> the regular fops write operation (blkdev_write_iter) but not when writing
> via a user mapping (mmap), which allows a user to actually write to a
> read-only block device through a PROT_WRITE mapping.
>
> Example: This can lead to integrity issues on an eMMC boot partition
> (e.g. mmcblk0boot0), which is read-only by default.
> 
> [...]

Applied, thanks!

[1/1] block: Deny writable memory mapping if block is read-only
      commit: 69baa3a623fd2e58624f24f2f23d46f87b817c93

Best regards,

Patch

diff --git a/block/fops.c b/block/fops.c
index d2e6be4e3d1c..58d0aebc7313 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -678,6 +678,16 @@  static long blkdev_fallocate(struct file *file, int mode, loff_t start,
 	return error;
 }
 
+static int blkdev_mmap(struct file *file, struct vm_area_struct *vma)
+{
+	struct inode *bd_inode = bdev_file_inode(file);
+
+	if (bdev_read_only(I_BDEV(bd_inode)))
+		return generic_file_readonly_mmap(file, vma);
+
+	return generic_file_mmap(file, vma);
+}
+
 const struct file_operations def_blk_fops = {
 	.open		= blkdev_open,
 	.release	= blkdev_close,
@@ -685,7 +695,7 @@  const struct file_operations def_blk_fops = {
 	.read_iter	= blkdev_read_iter,
 	.write_iter	= blkdev_write_iter,
 	.iopoll		= iocb_bio_iopoll,
-	.mmap		= generic_file_mmap,
+	.mmap		= blkdev_mmap,
 	.fsync		= blkdev_fsync,
 	.unlocked_ioctl	= blkdev_ioctl,
 #ifdef CONFIG_COMPAT