block_dev: fix crash on chained bios with O_DIRECT

Message ID 20190320081253.129688-1-hare@suse.de (mailing list archive)
State New, archived

Commit Message

Hannes Reinecke March 20, 2019, 8:12 a.m. UTC
__blkdev_direct_IO_simple() is allocating a bio on the stack.
When that bio needs to be split, bio_chain_endio() invokes bio_put()
on this bio, causing the kernel to crash in mempool_free() as the
bio was never allocated from a mempool in the first place.
So call bio_get() before submitting to avoid this problem.

Signed-off-by: Hannes Reinecke <hare@suse.com>
---
 fs/block_dev.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Johannes Thumshirn March 20, 2019, 8:45 a.m. UTC | #1
On 20/03/2019 09:12, Hannes Reinecke wrote:
> __blkdev_direct_IO_simple() is allocating a bio on the stack.
> When that bio needs to be split bio_chain_endio() invokes bio_put()
> on this bio, causing the kernel to crash in mempool_free() as the
> bio was never allocated from a mempool in the first place.
> So call bio_get() before submitting to avoid this problem.

Hmm this sounds as if we're just papering over the real issue here,
which is calling bio_free() for bios not allocated using bio_alloc_bioset().

How about the following untested patch:

From 9c8434e5bf81595e97ea5647437d12bfce0e37b6 Mon Sep 17 00:00:00 2001
From: Johannes Thumshirn <jthumshirn@suse.de>
Date: Wed, 20 Mar 2019 09:40:18 +0100
Subject: [PATCH] bio: Introduce BIO_ALLOCED flag and check it in bio_free

When we're submitting a bio from stack and this ends up being split, we
call bio_put(). bio_put() will eventually call bio_free() if the reference
count drops to 0. But freeing the bio is wrong, as it was never allocated
out of the bio's mempool.

Flag each normally allocated bio as 'BIO_ALLOCED' and skip freeing if the
flag isn't set.

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
---
 block/bio.c               | 4 ++++
 include/linux/blk_types.h | 1 +
 2 files changed, 5 insertions(+)

diff --git a/block/bio.c b/block/bio.c
index 4db1008309ed..caa8bc076377 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -253,6 +253,9 @@ static void bio_free(struct bio *bio)
 	struct bio_set *bs = bio->bi_pool;
 	void *p;

+	if (!bio_flagged(bio, BIO_ALLOCED))
+		return;
+
 	bio_uninit(bio);

 	if (bs) {
@@ -521,6 +524,7 @@ struct bio *bio_alloc_bioset(gfp_t gfp_mask, unsigned int nr_iovecs,
 		bvl = bio->bi_inline_vecs;
 	}

+	bio_set_flag(bio, BIO_ALLOCED);
 	bio->bi_pool = bs;
 	bio->bi_max_vecs = nr_iovecs;
 	bio->bi_io_vec = bvl;
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index d66bf5f32610..14b4f87a1eab 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -229,6 +229,7 @@ struct bio {
 				 * of this bio. */
 #define BIO_QUEUE_ENTERED 11	/* can use blk_queue_enter_live() */
 #define BIO_TRACKED 12		/* set if bio goes through the rq_qos path */
+#define BIO_ALLOCED 13		/* set if the bio was allocated by bio_alloc_bioset */

 /* See BVEC_POOL_OFFSET below before adding new flags */
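
The idea behind the flag can be tried out in userspace. The following is only a minimal sketch under stated assumptions: struct model_bio, MODEL_BIO_ALLOCED and all model_* functions are hypothetical stand-ins invented for illustration, not kernel code, and the plain integer refcount ignores the atomics the kernel would use. It shows a put on caller-owned storage being skipped by the free path while a heap-allocated object is released as usual.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical flag value for this model; not the kernel's bit number. */
#define MODEL_BIO_ALLOCED 0x1u

/* Hypothetical userspace stand-in for struct bio. */
struct model_bio {
	unsigned int flags;
	int refcount;
};

static int model_frees;	/* counts how often memory was really released */

/* Stands in for bio_alloc_bioset(): heap-allocate and set the flag. */
static struct model_bio *model_bio_alloc(void)
{
	struct model_bio *bio = calloc(1, sizeof(*bio));

	if (!bio)
		return NULL;
	bio->flags |= MODEL_BIO_ALLOCED;
	bio->refcount = 1;
	return bio;
}

/* Stands in for initialising caller-owned (for example on-stack) storage. */
static void model_bio_init(struct model_bio *bio)
{
	bio->flags = 0;
	bio->refcount = 1;
}

/* Stands in for bio_free() with the proposed flag check. */
static void model_bio_free(struct model_bio *bio)
{
	if (!(bio->flags & MODEL_BIO_ALLOCED))
		return;		/* caller owns the storage, nothing to free */
	model_frees++;
	free(bio);
}

/* Stands in for bio_put(): drop a reference, free on the last one. */
static void model_bio_put(struct model_bio *bio)
{
	assert(bio->refcount > 0);
	if (--bio->refcount == 0)
		model_bio_free(bio);
}

/* Puts one stack object and one heap object; returns the real free count. */
static int model_flag_demo(void)
{
	struct model_bio on_stack;
	struct model_bio *on_heap = model_bio_alloc();

	model_frees = 0;
	model_bio_init(&on_stack);
	model_bio_put(&on_stack);	/* refcount hits 0, free is skipped */
	if (on_heap)
		model_bio_put(on_heap);	/* refcount hits 0, memory is freed */
	return model_frees;
}
```

Calling model_flag_demo() exercises both paths; only the heap object is counted as freed.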
Hannes Reinecke March 20, 2019, 8:51 a.m. UTC | #2
On 3/20/19 9:45 AM, Johannes Thumshirn wrote:
> On 20/03/2019 09:12, Hannes Reinecke wrote:
>> __blkdev_direct_IO_simple() is allocating a bio on the stack.
>> When that bio needs to be split bio_chain_endio() invokes bio_put()
>> on this bio, causing the kernel to crash in mempool_free() as the
>> bio was never allocated from a mempool in the first place.
>> So call bio_get() before submitting to avoid this problem.
> 
> Hmm this sounds as if we're just papering over the real issue here,
> which is calling bio_free() for bios not allocated using bio_alloc_bioset().
> 
> How about the following untested patch:
> 
>  From 9c8434e5bf81595e97ea5647437d12bfce0e37b6 Mon Sep 17 00:00:00 2001
> From: Johannes Thumshirn <jthumshirn@suse.de>
> Date: Wed, 20 Mar 2019 09:40:18 +0100
> Subject: [PATCH] bio: Introduce BIO_ALLOCED flag and check it in bio_free
> 
> When we're submitting a bio from stack and this ends up being split, we
> call bio_put(). bio_put() will eventually call bio_free() if the reference
> count drops to 0. But freeing the bio is wrong, as it was never allocated
> out of the bio's mempool.
> 
> Flag each normally allocated bio as 'BIO_ALLOCED' and skip freeing if the
> flag isn't set.
> 
> Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
> ---
>   block/bio.c               | 4 ++++
>   include/linux/blk_types.h | 1 +
>   2 files changed, 5 insertions(+)
> 
> diff --git a/block/bio.c b/block/bio.c
> index 4db1008309ed..caa8bc076377 100644
> --- a/block/bio.c
> +++ b/block/bio.c
> @@ -253,6 +253,9 @@ static void bio_free(struct bio *bio)
>   	struct bio_set *bs = bio->bi_pool;
>   	void *p;
> 
> +	if (!bio_flagged(bio, BIO_ALLOCED))
> +		return;
> +
>   	bio_uninit(bio);
> 
>   	if (bs) {
> @@ -521,6 +524,7 @@ struct bio *bio_alloc_bioset(gfp_t gfp_mask, unsigned int nr_iovecs,
>   		bvl = bio->bi_inline_vecs;
>   	}
> 
> +	bio_set_flag(bio, BIO_ALLOCED);
>   	bio->bi_pool = bs;
>   	bio->bi_max_vecs = nr_iovecs;
>   	bio->bi_io_vec = bvl;
> diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
> index d66bf5f32610..14b4f87a1eab 100644
> --- a/include/linux/blk_types.h
> +++ b/include/linux/blk_types.h
> @@ -229,6 +229,7 @@ struct bio {
>   				 * of this bio. */
>   #define BIO_QUEUE_ENTERED 11	/* can use blk_queue_enter_live() */
>   #define BIO_TRACKED 12		/* set if bio goes through the rq_qos path */
> +#define BIO_ALLOCED 13		/* set if the bio was allocated by bio_alloc_bioset */
> 
>   /* See BVEC_POOL_OFFSET below before adding new flags */
> 
Yeah, should work, too.
But we should be calling bio_uninit() for all bios.

Will you be sending an updated patch?

Cheers,

Hannes
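
The point about bio_uninit() amounts to an ordering question: tear down per-bio state first, then decide whether the memory may be released. A minimal userspace sketch of that ordering follows. The names struct model_bio, MODEL_BIO_ALLOCED and the model_* functions are hypothetical and exist only for this illustration; how an updated patch would actually arrange this is not shown in the thread.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical flag value for this model; not the kernel's bit number. */
#define MODEL_BIO_ALLOCED 0x1u

/* Hypothetical userspace stand-in for struct bio. */
struct model_bio {
	unsigned int flags;
	int uninit_done;	/* set once per-bio state has been torn down */
};

/* Stands in for bio_uninit(): tear down state attached to the bio. */
static void model_bio_uninit(struct model_bio *bio)
{
	bio->uninit_done = 1;
}

/*
 * Stands in for bio_free() with the flag check moved below the uninit
 * step. Returns 1 if the memory was released, 0 if the caller still
 * owns it.
 */
static int model_bio_free(struct model_bio *bio)
{
	model_bio_uninit(bio);		/* runs for every bio */
	if (!(bio->flags & MODEL_BIO_ALLOCED))
		return 0;		/* caller-owned storage: do not free */
	free(bio);
	return 1;
}

/* Frees an on-stack object and reports whether it was still torn down. */
static int model_stack_bio_uninit_demo(void)
{
	struct model_bio on_stack = { 0, 0 };
	int released = model_bio_free(&on_stack);

	assert(released == 0);
	return on_stack.uninit_done;
}
```

With this ordering the on-stack object is uninitialised but never handed to free().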
Johannes Thumshirn March 20, 2019, 8:53 a.m. UTC | #3
On 20/03/2019 09:51, Hannes Reinecke wrote:
> Yeah, should work, too.
> But we should be calling bio_uninit() for all bios.

Yup, probably.

> Will you be sending an updated patch?

Let's wait and see what others think first.
Jan Kara March 20, 2019, 11:47 a.m. UTC | #4
On Wed 20-03-19 09:53:10, Johannes Thumshirn wrote:
> On 20/03/2019 09:51, Hannes Reinecke wrote:
> > Yeah, should work, too.
> > But we should be calling bio_uninit() for all bios.
> 
> Yup, probably.
> 
> > Will you be sending an updated patch?
> 
> Let's wait and see what others think first.

FWIW I'm OK with either solution. Yours seems a bit more future-proof so I
like it a bit more.

								Honza
Johannes Thumshirn March 20, 2019, 1:19 p.m. UTC | #5
On 20/03/2019 12:47, Jan Kara wrote:
> On Wed 20-03-19 09:53:10, Johannes Thumshirn wrote:
>> On 20/03/2019 09:51, Hannes Reinecke wrote:
>>> Yeah, should work, too.
>>> But we should be calling bio_uninit() for all bios.
>>
>> Yup, probably.
>>
>>> Will you be sending an updated patch?
>>
>> Let's wait and see what others think first.
> 
> FWIW I'm OK with either solution. Yours seems a bit more future-proof so I
> like it a bit more.

FWIW Bit 13 for the Flag doesn't work, need to find a free one before
doing a proper submission.
Jens Axboe March 20, 2019, 7:57 p.m. UTC | #6
On 3/20/19 7:19 AM, Johannes Thumshirn wrote:
> On 20/03/2019 12:47, Jan Kara wrote:
>> On Wed 20-03-19 09:53:10, Johannes Thumshirn wrote:
>>> On 20/03/2019 09:51, Hannes Reinecke wrote:
>>>> Yeah, should work, too.
>>>> But we should be calling bio_uninit() for all bios.
>>>
>>> Yup, probably.
>>>
>>>> Will you be sending an updated patch?
>>>
>>> Let's wait and see what others think first.
>>
>> FWIW I'm OK with either solution. Yours seems a bit more future-proof so I
>> like it a bit more.
> 
> FWIW Bit 13 for the Flag doesn't work, need to find a free one before
> doing a proper submission.

Yeah, you're going to overlap and crash... We really should have a build
bug on for that.

We don't have any free ones. I've got a patch in io_uring-next that
uses the last one.

That said, I do greatly prefer your approach to solving the issue.
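
A build-time check of the kind mentioned above could look roughly like the following C11 sketch. It is an illustration under assumptions, not a proposed kernel patch: the MODEL_* names are hypothetical, and the 16-bit flags word with its top 3 bits reserved for a pool index is an assumed layout chosen to match the overlap described for bit 13.

```c
#include <assert.h>

/*
 * Assumed layout for illustration: a 16-bit flags word whose top
 * MODEL_POOL_BITS bits hold a pool index, so flag numbers must stay
 * below MODEL_POOL_OFFSET.
 */
#define MODEL_FLAG_WORD_BITS	16
#define MODEL_POOL_BITS		3
#define MODEL_POOL_OFFSET	(MODEL_FLAG_WORD_BITS - MODEL_POOL_BITS)

/* Hypothetical flag numbers; only the last two mirror the quoted diff. */
enum model_bio_flag {
	MODEL_BIO_QUEUE_ENTERED = 11,
	MODEL_BIO_TRACKED = 12,
	MODEL_BIO_FLAG_LAST	/* one past the highest flag number in use */
};

/* Fails the build as soon as a flag number reaches the pool index bits. */
static_assert(MODEL_BIO_FLAG_LAST <= MODEL_POOL_OFFSET,
	      "bio flag numbers overlap the pool index bits");

/* Runtime view of the same rule, usable for ad hoc checks. */
static int model_flag_overlaps_pool(unsigned int flag_nr)
{
	return flag_nr >= MODEL_POOL_OFFSET;
}
```

Adding a flag numbered 13 before MODEL_BIO_FLAG_LAST in this model would trip the static assertion at compile time instead of corrupting the pool index at run time.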
Johannes Thumshirn March 21, 2019, 8:28 a.m. UTC | #7
On 20/03/2019 20:57, Jens Axboe wrote:
> Yeah, you're going to overlap and crash... We really should have a build
> bug on for that.
> 
> We don't have any free ones. I've got a patch in io_uring-next that
> uses the last one.

Damn it, I have updated the patch to use 0 as well.

> That said, I do greatly prefer your approach to solving the issue.

Any ideas on how to proceed from here?

Patch

diff --git a/fs/block_dev.c b/fs/block_dev.c
index c546cdce77e6..4b3a04c3b8bd 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -235,6 +235,7 @@  __blkdev_direct_IO_simple(struct kiocb *iocb, struct iov_iter *iter,
 	if (iocb->ki_flags & IOCB_HIPRI)
 		bio.bi_opf |= REQ_HIPRI;
 
+	bio_get(&bio);
 	qc = submit_bio(&bio);
 	for (;;) {
 		set_current_state(TASK_UNINTERRUPTIBLE);
@@ -254,7 +255,7 @@  __blkdev_direct_IO_simple(struct kiocb *iocb, struct iov_iter *iter,
 
 	if (unlikely(bio.bi_status))
 		ret = blk_status_to_errno(bio.bi_status);
-
+	bio_put(&bio);
 out:
 	if (vecs != inline_vecs)
 		kfree(vecs);