diff mbox series

[1/1] blk/core: Gracefully handle unset make_request_fn

Message ID 20200123091713.12623-2-stefan.bader@canonical.com (mailing list archive)
State Changes Requested, archived
Delegated to: Mike Snitzer
Series Handle NULL make_request_fn in generic_make_request()

Commit Message

Stefan Bader Jan. 23, 2020, 9:17 a.m. UTC
When device-mapper was adapted for multi-queue functionality, the way
the make-request function is set was also reorganized. Before, this
happened when the device-mapper logical device was created. Now it is
done once the mapping table gets loaded for the first time (this also
decides whether the block device is request or bio based).

However, generic_make_request() uses the request function without
further checks, so it dereferences a NULL pointer if one tries to
mount such a partially set up device.

This can easily be reproduced with the following steps:
 - dmsetup create -n test
 - mount /dev/dm-<#> /mnt

This may be something that should also be fixed in device-mapper.
But given that there is already a check for an unset queue pointer,
and that other drivers do or might do the same, it seems like a good
move to add another check to generic_make_request_checks() and to
bail out if the request function has not been set yet.

BugLink: https://bugs.launchpad.net/bugs/1860231
Fixes: ff36ab34583a ("dm: remove request-based logic from make_request_fn wrapper")
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
---
 block/blk-core.c | 7 +++++++
 1 file changed, 7 insertions(+)
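
As a rough illustration of the check described above, here is a minimal
userspace sketch in C. It is not kernel code: the struct layouts, the names
checks_pass(), submit_bio_model() and ok_handler(), and the use of -EIO as
the failure status are assumptions made for the example. Only the shape of
the check (refuse the bio when the queue's make_request_fn is unset) is
taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Userspace stand-ins for the kernel types; not the real definitions. */
struct bio {
	long sector;
	int status;
};

struct request_queue;
typedef int (*make_request_fn_t)(struct request_queue *q, struct bio *bio);

struct request_queue {
	make_request_fn_t make_request_fn;	/* NULL until a driver sets it */
};

/* Model of the checks: returns 1 if the bio may be passed on, 0 if not. */
static int checks_pass(const struct request_queue *q, const struct bio *bio)
{
	assert(bio != NULL);

	if (q == NULL)
		return 0;	/* already-existing check: no queue at all */

	if (q->make_request_fn == NULL) {	/* the newly proposed check */
		fprintf(stderr,
			"block device without request function (sector %ld)\n",
			bio->sector);
		return 0;
	}
	return 1;
}

/* Fail the bio instead of calling through a NULL function pointer. */
static int submit_bio_model(struct request_queue *q, struct bio *bio)
{
	if (!checks_pass(q, bio)) {
		bio->status = -EIO;	/* assumed error code for this sketch */
		return bio->status;
	}
	bio->status = q->make_request_fn(q, bio);
	return bio->status;
}

/* Hypothetical handler used only to exercise the success path. */
static int ok_handler(struct request_queue *q, struct bio *bio)
{
	(void)q;
	(void)bio;
	return 0;
}
```

Submitting a bio to a queue whose make_request_fn is still NULL fails with
an error in this model rather than crashing; a queue set up with
ok_handler() completes the bio normally.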

Comments

Tyler Hicks Jan. 23, 2020, 10:23 a.m. UTC | #1
On 2020-01-23 11:17:13, Stefan Bader wrote:
> When device-mapper adapted for multi-queue functionality, they
> also re-organized the way the make-request function was set.
> Before, this happened when the device-mapper logical device was
> created. Now it is done once the mapping table gets loaded the
> first time (this also decides whether the block device is request
> or bio based).
> 
> However in generic_make_request(), the request function gets used
> without further checks and this happens if one tries to mount such
> a partially set up device.
> 
> This can easily be reproduced with the following steps:
>  - dmsetup create -n test
>  - mount /dev/dm-<#> /mnt
> 
> This maybe is something which also should be fixed up in device-
> mapper. But given there is already a check for an unset queue
> pointer and potentially there could be other drivers which do or
> might do the same, it sounds like a good move to add another check
> to generic_make_request_checks() and to bail out if the request
> function has not been set, yet.
> 
> BugLink: https://bugs.launchpad.net/bugs/1860231
> Fixes: ff36ab34583a ("dm: remove request-based logic from make_request_fn wrapper")
> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

I helped debug the crash with Stefan and I think this is the most
straightforward fix (and is trivial to backport for stable kernels). I
looked at delaying the queue allocation in the dm code until the table
load ioctl but I decided that was risky and doesn't help the general
case of preventing other subsystems from making this same mistake.

Tested-by: Tyler Hicks <tyhicks@canonical.com>
Reviewed-by: Tyler Hicks <tyhicks@canonical.com>

Tyler

> ---
>  block/blk-core.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 1075aaff606d..adcd042edd2d 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -884,6 +884,13 @@ generic_make_request_checks(struct bio *bio)
>  			bio_devname(bio, b), (long long)bio->bi_iter.bi_sector);
>  		goto end_io;
>  	}
> +	if (unlikely(!q->make_request_fn)) {
> +		printk(KERN_ERR
> +		       "generic_make_request: Trying to access "
> +		       "block-device without request function: %s\n",
> +		       bio_devname(bio, b));
> +		goto end_io;
> +	}
>  
>  	/*
>  	 * Non-mq queues do not honor REQ_NOWAIT, so complete a bio
> -- 
> 2.17.1
> 


--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Mike Snitzer Jan. 23, 2020, 5:28 p.m. UTC | #2
On Thu, Jan 23 2020 at  5:35am -0500,
Mike Snitzer <snitzer@redhat.com> wrote:

> On Thu, Jan 23 2020 at  4:17am -0500,
> Stefan Bader <stefan.bader@canonical.com> wrote:
> 
> > When device-mapper adapted for multi-queue functionality, they
> > also re-organized the way the make-request function was set.
> > Before, this happened when the device-mapper logical device was
> > created. Now it is done once the mapping table gets loaded the
> > first time (this also decides whether the block device is request
> > or bio based).
> > 
> > However in generic_make_request(), the request function gets used
> > without further checks and this happens if one tries to mount such
> > a partially set up device.
> > 
> > This can easily be reproduced with the following steps:
> >  - dmsetup create -n test
> >  - mount /dev/dm-<#> /mnt
> > 
> > This maybe is something which also should be fixed up in device-
> > mapper.
> 
> I'll look closer at other options.
> 
> > But given there is already a check for an unset queue
> > pointer and potentially there could be other drivers which do or
> > might do the same, it sounds like a good move to add another check
> > to generic_make_request_checks() and to bail out if the request
> > function has not been set, yet.
> > 
> > BugLink: https://bugs.launchpad.net/bugs/1860231
> 
> From that bug:
> "The currently proposed fix introduces no chance of stability
> regressions. There is a chance of a very small performance regression
> since an additional pointer comparison is performed on each block layer
> request but this is unlikely to be noticeable."
> 
> This captures my immediate concern: slowing down everyone for this DM
> edge-case isn't desirable.

So I had a look and there isn't anything easier than adding the proposed
NULL check in generic_make_request_checks().  Given the many
conditionals in that function... what's one more? ;)

I looked at marking the queue frozen to prevent IO via
blk_queue_enter()'s existing check -- but that quickly felt like an
abuse, especially in that there isn't a queue unfreeze for bio-based.

Jens, I'll defer to you to judge this patch further.  If you're OK with
it: cool.  If not, I'm open to suggestions for how to proceed.  

Jens Axboe Jan. 23, 2020, 6:52 p.m. UTC | #3
On 1/23/20 10:28 AM, Mike Snitzer wrote:
> On Thu, Jan 23 2020 at  5:35am -0500,
> Mike Snitzer <snitzer@redhat.com> wrote:
> 
>> On Thu, Jan 23 2020 at  4:17am -0500,
>> Stefan Bader <stefan.bader@canonical.com> wrote:
>>
>>> When device-mapper adapted for multi-queue functionality, they
>>> also re-organized the way the make-request function was set.
>>> Before, this happened when the device-mapper logical device was
>>> created. Now it is done once the mapping table gets loaded the
>>> first time (this also decides whether the block device is request
>>> or bio based).
>>>
>>> However in generic_make_request(), the request function gets used
>>> without further checks and this happens if one tries to mount such
>>> a partially set up device.
>>>
>>> This can easily be reproduced with the following steps:
>>>  - dmsetup create -n test
>>>  - mount /dev/dm-<#> /mnt
>>>
>>> This maybe is something which also should be fixed up in device-
>>> mapper.
>>
>> I'll look closer at other options.
>>
>>> But given there is already a check for an unset queue
>>> pointer and potentially there could be other drivers which do or
>>> might do the same, it sounds like a good move to add another check
>>> to generic_make_request_checks() and to bail out if the request
>>> function has not been set, yet.
>>>
>>> BugLink: https://bugs.launchpad.net/bugs/1860231
>>
>> From that bug:
>> "The currently proposed fix introduces no chance of stability
>> regressions. There is a chance of a very small performance regression
>> since an additional pointer comparison is performed on each block layer
>> request but this is unlikely to be noticeable."
>>
>> This captures my immediate concern: slowing down everyone for this DM
>> edge-case isn't desirable.
> 
> So I had a look and there isn't anything easier than adding the proposed
> NULL check in generic_make_request_checks().  Given the many
> conditionals in that function... what's one more? ;)
> 
> I looked at marking the queue frozen to prevent IO via
> blk_queue_enter()'s existing check -- but that quickly felt like an
> abuse, especially in that there isn't a queue unfreeze for bio-based.
> 
> Jens, I'll defer to you to judge this patch further.  If you're OK with
> it: cool.  If not, I'm open to suggestions for how to proceed.  
> 

It does kinda suck... The generic_make_request_checks() is a mess, and
this doesn't make it any better. Any reason why we can't solve this
two step setup in a clean fashion instead of patching around it like
this? Feels like a pretty bad hack, tbh.
Stefan Bader Jan. 24, 2020, 6:04 a.m. UTC | #4
On 23.01.20 20:52, Jens Axboe wrote:
> On 1/23/20 10:28 AM, Mike Snitzer wrote:
>> On Thu, Jan 23 2020 at  5:35am -0500,
>> Mike Snitzer <snitzer@redhat.com> wrote:
>>
>>> On Thu, Jan 23 2020 at  4:17am -0500,
>>> Stefan Bader <stefan.bader@canonical.com> wrote:
>>>
>>>> When device-mapper adapted for multi-queue functionality, they
>>>> also re-organized the way the make-request function was set.
>>>> Before, this happened when the device-mapper logical device was
>>>> created. Now it is done once the mapping table gets loaded the
>>>> first time (this also decides whether the block device is request
>>>> or bio based).
>>>>
>>>> However in generic_make_request(), the request function gets used
>>>> without further checks and this happens if one tries to mount such
>>>> a partially set up device.
>>>>
>>>> This can easily be reproduced with the following steps:
>>>>  - dmsetup create -n test
>>>>  - mount /dev/dm-<#> /mnt
>>>>
>>>> This maybe is something which also should be fixed up in device-
>>>> mapper.
>>>
>>> I'll look closer at other options.
>>>
>>>> But given there is already a check for an unset queue
>>>> pointer and potentially there could be other drivers which do or
>>>> might do the same, it sounds like a good move to add another check
>>>> to generic_make_request_checks() and to bail out if the request
>>>> function has not been set, yet.
>>>>
>>>> BugLink: https://bugs.launchpad.net/bugs/1860231
>>>
>>> From that bug:
>>> "The currently proposed fix introduces no chance of stability
>>> regressions. There is a chance of a very small performance regression
>>> since an additional pointer comparison is performed on each block layer
>>> request but this is unlikely to be noticeable."
>>>
>>> This captures my immediate concern: slowing down everyone for this DM
>>> edge-case isn't desirable.
>>
>> So I had a look and there isn't anything easier than adding the proposed
>> NULL check in generic_make_request_checks().  Given the many
>> conditionals in that function... what's one more? ;)
>>
>> I looked at marking the queue frozen to prevent IO via
>> blk_queue_enter()'s existing check -- but that quickly felt like an
>> abuse, especially in that there isn't a queue unfreeze for bio-based.
>>
>> Jens, I'll defer to you to judge this patch further.  If you're OK with
>> it: cool.  If not, I'm open to suggestions for how to proceed.  
>>
> 
> It does kinda suck... The generic_make_request_checks() is a mess, and
> this doesn't make it any better. Any reason why we can't solve this
> two step setup in a clean fashion instead of patching around it like
> this? Feels like a pretty bad hack, tbh.
> 

Tyler spent some time thinking about delaying the allocation of the queue
structure until later but that seemed rather dangerous. IIRC there are places
during registration of the (generic) block device which expect this to be done.

Not sure whether it would be feasible to start with one kind of dummy
make_request_fn and then switch that over to the proper one once that decision
can be made...
Mike Snitzer Jan. 27, 2020, 7:32 p.m. UTC | #5
On Thu, Jan 23 2020 at  1:52pm -0500,
Jens Axboe <axboe@kernel.dk> wrote:

> On 1/23/20 10:28 AM, Mike Snitzer wrote:
> > On Thu, Jan 23 2020 at  5:35am -0500,
> > Mike Snitzer <snitzer@redhat.com> wrote:
> > 
> >> On Thu, Jan 23 2020 at  4:17am -0500,
> >> Stefan Bader <stefan.bader@canonical.com> wrote:
> >>
> >>> When device-mapper adapted for multi-queue functionality, they
> >>> also re-organized the way the make-request function was set.
> >>> Before, this happened when the device-mapper logical device was
> >>> created. Now it is done once the mapping table gets loaded the
> >>> first time (this also decides whether the block device is request
> >>> or bio based).
> >>>
> >>> However in generic_make_request(), the request function gets used
> >>> without further checks and this happens if one tries to mount such
> >>> a partially set up device.
> >>>
> >>> This can easily be reproduced with the following steps:
> >>>  - dmsetup create -n test
> >>>  - mount /dev/dm-<#> /mnt
> >>>
> >>> This maybe is something which also should be fixed up in device-
> >>> mapper.
> >>
> >> I'll look closer at other options.
> >>
> >>> But given there is already a check for an unset queue
> >>> pointer and potentially there could be other drivers which do or
> >>> might do the same, it sounds like a good move to add another check
> >>> to generic_make_request_checks() and to bail out if the request
> >>> function has not been set, yet.
> >>>
> >>> BugLink: https://bugs.launchpad.net/bugs/1860231
> >>
> >> From that bug:
> >> "The currently proposed fix introduces no chance of stability
> >> regressions. There is a chance of a very small performance regression
> >> since an additional pointer comparison is performed on each block layer
> >> request but this is unlikely to be noticeable."
> >>
> >> This captures my immediate concern: slowing down everyone for this DM
> >> edge-case isn't desirable.
> > 
> > So I had a look and there isn't anything easier than adding the proposed
> > NULL check in generic_make_request_checks().  Given the many
> > conditionals in that function... what's one more? ;)
> > 
> > I looked at marking the queue frozen to prevent IO via
> > blk_queue_enter()'s existing check -- but that quickly felt like an
> > abuse, especially in that there isn't a queue unfreeze for bio-based.
> > 
> > Jens, I'll defer to you to judge this patch further.  If you're OK with
> > it: cool.  If not, I'm open to suggestions for how to proceed.  
> > 
> 
> It does kinda suck... The generic_make_request_checks() is a mess, and
> this doesn't make it any better. Any reason why we can't solve this
> two step setup in a clean fashion instead of patching around it like
> this? Feels like a pretty bad hack, tbh.

I just staged the following DM fix:
https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-5.6&id=28a101d6b344f5a38d482a686d18b1205bc92333

From: Mike Snitzer <snitzer@redhat.com>
Date: Mon, 27 Jan 2020 14:07:23 -0500
Subject: [PATCH] dm: fix potential for q->make_request_fn NULL pointer

Move blk_queue_make_request() to dm.c:alloc_dev() so that
q->make_request_fn is never NULL during the lifetime of a DM device
(even one that is created without a DM table).

Otherwise generic_make_request() will crash simply by doing:
  dmsetup create -n test
  mount /dev/dm-N /mnt

While at it, move ->congested_data initialization out of
dm.c:alloc_dev() and into the bio-based specific init method.

Reported-by: Stefan Bader <stefan.bader@canonical.com>
BugLink: https://bugs.launchpad.net/bugs/1860231
Fixes: ff36ab34583a ("dm: remove request-based logic from make_request_fn wrapper")
Depends-on: c12c9a3c3860c ("dm: various cleanups to md->queue initialization code")
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 drivers/md/dm.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index e8f9661a10a1..b89f07ee2eff 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1859,6 +1859,7 @@ static void dm_init_normal_md_queue(struct mapped_device *md)
 	/*
 	 * Initialize aspects of queue that aren't relevant for blk-mq
 	 */
+	md->queue->backing_dev_info->congested_data = md;
 	md->queue->backing_dev_info->congested_fn = dm_any_congested;
 }
 
@@ -1949,7 +1950,12 @@ static struct mapped_device *alloc_dev(int minor)
 	if (!md->queue)
 		goto bad;
 	md->queue->queuedata = md;
-	md->queue->backing_dev_info->congested_data = md;
+	/*
+	 * default to bio-based required ->make_request_fn until DM
+	 * table is loaded and md->type established. If request-based
+	 * table is loaded: blk-mq will override accordingly.
+	 */
+	blk_queue_make_request(md->queue, dm_make_request);
 
 	md->disk = alloc_disk_node(1, md->numa_node_id);
 	if (!md->disk)
@@ -2264,7 +2270,6 @@ int dm_setup_md_queue(struct mapped_device *md, struct dm_table *t)
 	case DM_TYPE_DAX_BIO_BASED:
 	case DM_TYPE_NVME_BIO_BASED:
 		dm_init_normal_md_queue(md);
-		blk_queue_make_request(md->queue, dm_make_request);
 		break;
 	case DM_TYPE_NONE:
 		WARN_ON_ONCE(true);
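
The idea behind the staged fix, installing a handler at allocation time so
the function pointer is never NULL, can be modeled in userspace with the
minimal C sketch below. This is not the dm.c code: alloc_dev_model(),
dm_make_request_model(), load_table_model() and the struct layouts are
hypothetical stand-ins, and returning -EIO for a device without a table is
an assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-ins; the real kernel structures look different. */
struct bio {
	long sector;
};

struct table {
	long start;	/* placeholder for a loaded mapping table */
};

struct request_queue;
typedef int (*make_request_fn_t)(struct request_queue *q, struct bio *bio);

struct request_queue {
	make_request_fn_t make_request_fn;
	void *queuedata;
};

struct mapped_device {
	struct request_queue queue;
	struct table *map;	/* NULL until a table is loaded */
};

/* Bio-based handler: with no table there is nowhere to send the bio. */
static int dm_make_request_model(struct request_queue *q, struct bio *bio)
{
	struct mapped_device *md = q->queuedata;

	assert(md != NULL);
	(void)bio;
	if (md->map == NULL)
		return -EIO;	/* assumed error code for this sketch */
	return 0;
}

/* The handler is installed here, before any table exists. */
static struct mapped_device *alloc_dev_model(void)
{
	struct mapped_device *md = calloc(1, sizeof(*md));

	if (md == NULL)
		return NULL;
	md->queue.queuedata = md;
	md->queue.make_request_fn = dm_make_request_model;
	return md;
}

/* Loading a table only changes where bios go, not whether a handler exists. */
static void load_table_model(struct mapped_device *md, struct table *t)
{
	md->map = t;
}
```

In this model a bio submitted right after alloc_dev_model() reaches a valid
handler and fails cleanly; once load_table_model() has run, the same
handler accepts it.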
Jens Axboe Jan. 27, 2020, 7:39 p.m. UTC | #6
On 1/27/20 12:32 PM, Mike Snitzer wrote:
> On Thu, Jan 23 2020 at  1:52pm -0500,
> Jens Axboe <axboe@kernel.dk> wrote:
> 
>> On 1/23/20 10:28 AM, Mike Snitzer wrote:
>>> On Thu, Jan 23 2020 at  5:35am -0500,
>>> Mike Snitzer <snitzer@redhat.com> wrote:
>>>
>>>> On Thu, Jan 23 2020 at  4:17am -0500,
>>>> Stefan Bader <stefan.bader@canonical.com> wrote:
>>>>
>>>>> When device-mapper adapted for multi-queue functionality, they
>>>>> also re-organized the way the make-request function was set.
>>>>> Before, this happened when the device-mapper logical device was
>>>>> created. Now it is done once the mapping table gets loaded the
>>>>> first time (this also decides whether the block device is request
>>>>> or bio based).
>>>>>
>>>>> However in generic_make_request(), the request function gets used
>>>>> without further checks and this happens if one tries to mount such
>>>>> a partially set up device.
>>>>>
>>>>> This can easily be reproduced with the following steps:
>>>>>  - dmsetup create -n test
>>>>>  - mount /dev/dm-<#> /mnt
>>>>>
>>>>> This maybe is something which also should be fixed up in device-
>>>>> mapper.
>>>>
>>>> I'll look closer at other options.
>>>>
>>>>> But given there is already a check for an unset queue
>>>>> pointer and potentially there could be other drivers which do or
>>>>> might do the same, it sounds like a good move to add another check
>>>>> to generic_make_request_checks() and to bail out if the request
>>>>> function has not been set, yet.
>>>>>
>>>>> BugLink: https://bugs.launchpad.net/bugs/1860231
>>>>
>>>> From that bug:
>>>> "The currently proposed fix introduces no chance of stability
>>>> regressions. There is a chance of a very small performance regression
>>>> since an additional pointer comparison is performed on each block layer
>>>> request but this is unlikely to be noticeable."
>>>>
>>>> This captures my immediate concern: slowing down everyone for this DM
>>>> edge-case isn't desirable.
>>>
>>> So I had a look and there isn't anything easier than adding the proposed
>>> NULL check in generic_make_request_checks().  Given the many
>>> conditionals in that function... what's one more? ;)
>>>
>>> I looked at marking the queue frozen to prevent IO via
>>> blk_queue_enter()'s existing check -- but that quickly felt like an
>>> abuse, especially in that there isn't a queue unfreeze for bio-based.
>>>
>>> Jens, I'll defer to you to judge this patch further.  If you're OK with
>>> it: cool.  If not, I'm open to suggestions for how to proceed.  
>>>
>>
>> It does kinda suck... The generic_make_request_checks() is a mess, and
>> this doesn't make it any better. Any reason why we can't solve this
>> two step setup in a clean fashion instead of patching around it like
>> this? Feels like a pretty bad hack, tbh.
> 
> I just staged the following DM fix:
> https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-5.6&id=28a101d6b344f5a38d482a686d18b1205bc92333

I like that a lot more than the NULL check in the core.
Stefan Bader Jan. 28, 2020, 2:32 p.m. UTC | #7
On 27.01.20 20:32, Mike Snitzer wrote:
> On Thu, Jan 23 2020 at  1:52pm -0500,
> Jens Axboe <axboe@kernel.dk> wrote:
> 
>> On 1/23/20 10:28 AM, Mike Snitzer wrote:
>>> On Thu, Jan 23 2020 at  5:35am -0500,
>>> Mike Snitzer <snitzer@redhat.com> wrote:
>>>
>>>> On Thu, Jan 23 2020 at  4:17am -0500,
>>>> Stefan Bader <stefan.bader@canonical.com> wrote:
>>>>
>>>>> When device-mapper adapted for multi-queue functionality, they
>>>>> also re-organized the way the make-request function was set.
>>>>> Before, this happened when the device-mapper logical device was
>>>>> created. Now it is done once the mapping table gets loaded the
>>>>> first time (this also decides whether the block device is request
>>>>> or bio based).
>>>>>
>>>>> However in generic_make_request(), the request function gets used
>>>>> without further checks and this happens if one tries to mount such
>>>>> a partially set up device.
>>>>>
>>>>> This can easily be reproduced with the following steps:
>>>>>  - dmsetup create -n test
>>>>>  - mount /dev/dm-<#> /mnt
>>>>>
>>>>> This maybe is something which also should be fixed up in device-
>>>>> mapper.
>>>>
>>>> I'll look closer at other options.
>>>>
>>>>> But given there is already a check for an unset queue
>>>>> pointer and potentially there could be other drivers which do or
>>>>> might do the same, it sounds like a good move to add another check
>>>>> to generic_make_request_checks() and to bail out if the request
>>>>> function has not been set, yet.
>>>>>
>>>>> BugLink: https://bugs.launchpad.net/bugs/1860231
>>>>
>>>> From that bug:
>>>> "The currently proposed fix introduces no chance of stability
>>>> regressions. There is a chance of a very small performance regression
>>>> since an additional pointer comparison is performed on each block layer
>>>> request but this is unlikely to be noticeable."
>>>>
>>>> This captures my immediate concern: slowing down everyone for this DM
>>>> edge-case isn't desirable.
>>>
>>> So I had a look and there isn't anything easier than adding the proposed
>>> NULL check in generic_make_request_checks().  Given the many
>>> conditionals in that function... what's one more? ;)
>>>
>>> I looked at marking the queue frozen to prevent IO via
>>> blk_queue_enter()'s existing check -- but that quickly felt like an
>>> abuse, especially in that there isn't a queue unfreeze for bio-based.
>>>
>>> Jens, I'll defer to you to judge this patch further.  If you're OK with
>>> it: cool.  If not, I'm open to suggestions for how to proceed.  
>>>
>>
>> It does kinda suck... The generic_make_request_checks() is a mess, and
>> this doesn't make it any better. Any reason why we can't solve this
>> two step setup in a clean fashion instead of patching around it like
>> this? Feels like a pretty bad hack, tbh.
> 
> I just staged the following DM fix:
> https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-5.6&id=28a101d6b344f5a38d482a686d18b1205bc92333

Thanks Mike,

yeah, this looks like it resolves the problem without adding any impact on the
generic I/O path. We certainly had thought about that but were unsure whether
it would open up other risks, like something adding requests just before the
table load. Could this cause some I/O to be handled by one function and the rest
by another? And would that really matter?

The other thing that was a bit strange but maybe someone else's problem is that
mount generated I/O requests to start with. The device size should be 0 still.


> 
> From: Mike Snitzer <snitzer@redhat.com>
> Date: Mon, 27 Jan 2020 14:07:23 -0500
> Subject: [PATCH] dm: fix potential for q->make_request_fn NULL pointer
> 
> Move blk_queue_make_request() to dm.c:alloc_dev() so that
> q->make_request_fn is never NULL during the lifetime of a DM device
> (even one that is created without a DM table).
> 
> Otherwise generic_make_request() will crash simply by doing:
>   dmsetup create -n test
>   mount /dev/dm-N /mnt
> 
> While at it, move ->congested_data initialization out of
> dm.c:alloc_dev() and into the bio-based specific init method.
> 
> Reported-by: Stefan Bader <stefan.bader@canonical.com>
> BugLink: https://bugs.launchpad.net/bugs/1860231
> Fixes: ff36ab34583a ("dm: remove request-based logic from make_request_fn wrapper")
> Depends-on: c12c9a3c3860c ("dm: various cleanups to md->queue initialization code")
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> ---
>  drivers/md/dm.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index e8f9661a10a1..b89f07ee2eff 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -1859,6 +1859,7 @@ static void dm_init_normal_md_queue(struct mapped_device *md)
>  	/*
>  	 * Initialize aspects of queue that aren't relevant for blk-mq
>  	 */
> +	md->queue->backing_dev_info->congested_data = md;
>  	md->queue->backing_dev_info->congested_fn = dm_any_congested;
>  }
>  
> @@ -1949,7 +1950,12 @@ static struct mapped_device *alloc_dev(int minor)
>  	if (!md->queue)
>  		goto bad;
>  	md->queue->queuedata = md;
> -	md->queue->backing_dev_info->congested_data = md;
> +	/*
> +	 * default to bio-based required ->make_request_fn until DM
> +	 * table is loaded and md->type established. If request-based
> +	 * table is loaded: blk-mq will override accordingly.
> +	 */
> +	blk_queue_make_request(md->queue, dm_make_request);
>  
>  	md->disk = alloc_disk_node(1, md->numa_node_id);
>  	if (!md->disk)
> @@ -2264,7 +2270,6 @@ int dm_setup_md_queue(struct mapped_device *md, struct dm_table *t)
>  	case DM_TYPE_DAX_BIO_BASED:
>  	case DM_TYPE_NVME_BIO_BASED:
>  		dm_init_normal_md_queue(md);
> -		blk_queue_make_request(md->queue, dm_make_request);
>  		break;
>  	case DM_TYPE_NONE:
>  		WARN_ON_ONCE(true);
>
Mike Snitzer Jan. 28, 2020, 4:26 p.m. UTC | #8
On Tue, Jan 28 2020 at  9:32am -0500,
Stefan Bader <stefan.bader@canonical.com> wrote:

> On 27.01.20 20:32, Mike Snitzer wrote:
> > 
> > I just staged the following DM fix:
> > https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-5.6&id=28a101d6b344f5a38d482a686d18b1205bc92333
> 
> Thanks Mike,
> 
> yeah, this looks like it resolves the problem without adding any impact on the
> generic I/O path. We certainly had thought about that but were unsure whether
> it would open up other risks, like something adding requests just before the
> table load. Could this cause some I/O to be handled by one function and the rest
> by another? And would that really matter?

I considered this too.  Any IO issued to the device before it is "ready"
won't matter anyway: there is nowhere to send the IO without a DM table,
so such IO should result in an error (from dm.c:dm_process_bio's !map
check).  But given the device has no size, a simple write will hit
-ENOSPC before it gets that far.

And the only way to get the DM device to have a proper destination for
its IO is to load a table, which requires a sequence like:

# dmsetup create -n test
# dmsetup table
test:
# echo "0 20971520 linear 259:0 2048" | dmsetup load test
# dmsetup table --inactive
test: 0 20971520 linear 259:0 2048
# dmsetup suspend test
# dmsetup resume test
# dmsetup table
test: 0 20971520 linear 259:0 2048

And once a table is loaded there will be accompanying change
uevents that trigger udev, blkid, etc.

(NOTE: the suspend phase implies a flush of all outstanding IO, but even
if 'dmsetup suspend --noflush test' were used the IO would just get
pushed onto a list in DM core and it would be issued after the new table
is in place).
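
The deferral behaviour described in the note above can be pictured with the
following minimal userspace sketch in C. It is only a model under stated
assumptions: the fixed-size array standing in for the list in DM core, the
table_id field used to record which table served a bio, and the names
md_submit(), md_suspend_noflush() and md_resume() are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEFERRED 16

struct bio {
	long sector;
	int table_id;	/* table that served the bio; 0 means not issued yet */
};

struct mapped_device {
	int suspended;
	int table_id;	/* currently active table */
	struct bio *deferred[MAX_DEFERRED];
	size_t nr_deferred;
};

/* While suspended, bios are parked; otherwise the active table serves them. */
static int md_submit(struct mapped_device *md, struct bio *bio)
{
	assert(md != NULL && bio != NULL);

	if (md->suspended) {
		if (md->nr_deferred == MAX_DEFERRED)
			return -1;	/* model limit, not a kernel behaviour */
		md->deferred[md->nr_deferred++] = bio;
		return 0;
	}
	bio->table_id = md->table_id;
	return 0;
}

/* Suspend without flushing: outstanding IO is not waited for. */
static void md_suspend_noflush(struct mapped_device *md)
{
	md->suspended = 1;
}

/* Swap in the new table first, then issue everything that was parked. */
static void md_resume(struct mapped_device *md, int new_table_id)
{
	size_t i;

	md->table_id = new_table_id;
	md->suspended = 0;
	for (i = 0; i < md->nr_deferred; i++)
		md->deferred[i]->table_id = md->table_id;
	md->nr_deferred = 0;
}
```

In this model a bio submitted between md_suspend_noflush() and md_resume()
is never served by the old table; it is issued only once the new table is
in place.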

> The other thing that was a bit strange but maybe someone else's problem is that
> mount generated I/O requests to start with. The device size should be 0 still.

That's just mount not having a negative check for device size being 0.

Mike


Patch

diff --git a/block/blk-core.c b/block/blk-core.c
index 1075aaff606d..adcd042edd2d 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -884,6 +884,13 @@  generic_make_request_checks(struct bio *bio)
 			bio_devname(bio, b), (long long)bio->bi_iter.bi_sector);
 		goto end_io;
 	}
+	if (unlikely(!q->make_request_fn)) {
+		printk(KERN_ERR
+		       "generic_make_request: Trying to access "
+		       "block-device without request function: %s\n",
+		       bio_devname(bio, b));
+		goto end_io;
+	}
 
 	/*
 	 * Non-mq queues do not honor REQ_NOWAIT, so complete a bio