diff mbox

[BUGFIX] block: make elevator_get robust against cross blk/blk-mq choice

Message ID 20170213210107.4848-2-paolo.valente@linaro.org (mailing list archive)
State New, archived
Headers show

Commit Message

Paolo Valente Feb. 13, 2017, 9:01 p.m. UTC
If, at boot, a legacy I/O scheduler is chosen for a device using blk-mq,
or, viceversa, a blk-mq scheduler is chosen for a device using blk, then
that scheduler is set and initialized without any check, driving the
system into an inconsistent state. This commit addresses this issue by
letting elevator_get fail for these wrong cross choices.

Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
---
 block/elevator.c | 26 ++++++++++++++++++--------
 1 file changed, 18 insertions(+), 8 deletions(-)

Comments

Bart Van Assche Feb. 13, 2017, 9:12 p.m. UTC | #1
On Mon, 2017-02-13 at 22:01 +0100, Paolo Valente wrote:
> -static struct elevator_type *elevator_get(const char *name, bool try_loading)
> +static struct elevator_type *elevator_get(const char *name, bool try_loading,
> +					  bool mq_ops)

Please choose a better name for that argument, e.q. "mq". To me the name "mq_ops"
means "a pointer to a data structure with operation function pointers".

> +	if (e && (e->uses_mq != mq_ops)) {
> +		pr_err("ERROR: attempted to choose %s %s I/O scheduler in blk%s",
> +		       name, e->uses_mq ? "blk-mq" : "legacy", mq_ops ? "-mq" : "");
> +		e = NULL;
> +	}

How about changing the above into:

+       if (e && e->uses_mq != mq) {
+               pr_err("ERROR: attempt to configure %s as I/O scheduler for a %s queue\n",
+                      name, mq ? "blk-mq" : "legacy");
+               e = NULL;
+       }

Thanks,

Bart.
Omar Sandoval Feb. 13, 2017, 10:09 p.m. UTC | #2
On Mon, Feb 13, 2017 at 10:01:07PM +0100, Paolo Valente wrote:
> If, at boot, a legacy I/O scheduler is chosen for a device using blk-mq,
> or, viceversa, a blk-mq scheduler is chosen for a device using blk, then
> that scheduler is set and initialized without any check, driving the
> system into an inconsistent state. This commit addresses this issue by
> letting elevator_get fail for these wrong cross choices.
> 
> Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
> ---
>  block/elevator.c | 26 ++++++++++++++++++--------
>  1 file changed, 18 insertions(+), 8 deletions(-)

Hey, Paolo,

How exactly are you triggering this? In __elevator_change(), we do check
for mq or not mq:

	if (!e->uses_mq && q->mq_ops) {
		elevator_put(e);
		return -EINVAL;
	}
	if (e->uses_mq && !q->mq_ops) {
		elevator_put(e);
		return -EINVAL;
	}

We don't ever appear to call elevator_init() with a specific scheduler
name, and for the default we switch off of q->mq_ops and use the
defaults from Kconfig:

	if (q->mq_ops && q->nr_hw_queues == 1)
		e = elevator_get(CONFIG_DEFAULT_SQ_IOSCHED, false);
	else if (q->mq_ops)
		e = elevator_get(CONFIG_DEFAULT_MQ_IOSCHED, false);
	else
		e = elevator_get(CONFIG_DEFAULT_IOSCHED, false);

	if (!e) {
		printk(KERN_ERR
			"Default I/O scheduler not found. " \
			"Using noop/none.\n");
		e = elevator_get("noop", false);
	}

So I guess this could happen if someone manually changed those Kconfig
options, but I don't see what other case would make this happen, could
you please explain?
Jens Axboe Feb. 13, 2017, 10:28 p.m. UTC | #3
On 02/13/2017 03:09 PM, Omar Sandoval wrote:
> On Mon, Feb 13, 2017 at 10:01:07PM +0100, Paolo Valente wrote:
>> If, at boot, a legacy I/O scheduler is chosen for a device using blk-mq,
>> or, viceversa, a blk-mq scheduler is chosen for a device using blk, then
>> that scheduler is set and initialized without any check, driving the
>> system into an inconsistent state. This commit addresses this issue by
>> letting elevator_get fail for these wrong cross choices.
>>
>> Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
>> ---
>>  block/elevator.c | 26 ++++++++++++++++++--------
>>  1 file changed, 18 insertions(+), 8 deletions(-)
> 
> Hey, Paolo,
> 
> How exactly are you triggering this? In __elevator_change(), we do check
> for mq or not mq:
> 
> 	if (!e->uses_mq && q->mq_ops) {
> 		elevator_put(e);
> 		return -EINVAL;
> 	}
> 	if (e->uses_mq && !q->mq_ops) {
> 		elevator_put(e);
> 		return -EINVAL;
> 	}
> 
> We don't ever appear to call elevator_init() with a specific scheduler
> name, and for the default we switch off of q->mq_ops and use the
> defaults from Kconfig:
> 
> 	if (q->mq_ops && q->nr_hw_queues == 1)
> 		e = elevator_get(CONFIG_DEFAULT_SQ_IOSCHED, false);
> 	else if (q->mq_ops)
> 		e = elevator_get(CONFIG_DEFAULT_MQ_IOSCHED, false);
> 	else
> 		e = elevator_get(CONFIG_DEFAULT_IOSCHED, false);
> 
> 	if (!e) {
> 		printk(KERN_ERR
> 			"Default I/O scheduler not found. " \
> 			"Using noop/none.\n");
> 		e = elevator_get("noop", false);
> 	}
> 
> So I guess this could happen if someone manually changed those Kconfig
> options, but I don't see what other case would make this happen, could
> you please explain?

Was wondering the same - is it using the 'elevator=' boot parameter?
Didn't look at that path just now, but that's the only one I could
think of. If it is, I'd much prefer only using 'chosen_elevator' for
the non-mq stuff, and the fix should be just that instead.

So instead of:

	if (!e && *chosen_elevator) {

do

	if (!e && !q->mq_ops && && *chosen_elevator) {
Hannes Reinecke Feb. 14, 2017, 6:58 a.m. UTC | #4
On 02/13/2017 11:28 PM, Jens Axboe wrote:
> On 02/13/2017 03:09 PM, Omar Sandoval wrote:
>> On Mon, Feb 13, 2017 at 10:01:07PM +0100, Paolo Valente wrote:
>>> If, at boot, a legacy I/O scheduler is chosen for a device using blk-mq,
>>> or, viceversa, a blk-mq scheduler is chosen for a device using blk, then
>>> that scheduler is set and initialized without any check, driving the
>>> system into an inconsistent state. This commit addresses this issue by
>>> letting elevator_get fail for these wrong cross choices.
>>>
>>> Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
>>> ---
>>>  block/elevator.c | 26 ++++++++++++++++++--------
>>>  1 file changed, 18 insertions(+), 8 deletions(-)
>>
>> Hey, Paolo,
>>
>> How exactly are you triggering this? In __elevator_change(), we do check
>> for mq or not mq:
>>
>> 	if (!e->uses_mq && q->mq_ops) {
>> 		elevator_put(e);
>> 		return -EINVAL;
>> 	}
>> 	if (e->uses_mq && !q->mq_ops) {
>> 		elevator_put(e);
>> 		return -EINVAL;
>> 	}
>>
>> We don't ever appear to call elevator_init() with a specific scheduler
>> name, and for the default we switch off of q->mq_ops and use the
>> defaults from Kconfig:
>>
>> 	if (q->mq_ops && q->nr_hw_queues == 1)
>> 		e = elevator_get(CONFIG_DEFAULT_SQ_IOSCHED, false);
>> 	else if (q->mq_ops)
>> 		e = elevator_get(CONFIG_DEFAULT_MQ_IOSCHED, false);
>> 	else
>> 		e = elevator_get(CONFIG_DEFAULT_IOSCHED, false);
>>
>> 	if (!e) {
>> 		printk(KERN_ERR
>> 			"Default I/O scheduler not found. " \
>> 			"Using noop/none.\n");
>> 		e = elevator_get("noop", false);
>> 	}
>>
>> So I guess this could happen if someone manually changed those Kconfig
>> options, but I don't see what other case would make this happen, could
>> you please explain?
> 
> Was wondering the same - is it using the 'elevator=' boot parameter?
> Didn't look at that path just now, but that's the only one I could
> think of. If it is, I'd much prefer only using 'chosen_elevator' for
> the non-mq stuff, and the fix should be just that instead.
> 
[ .. ]
While we're at the topic:

Can't we use the same names for legacy and mq scheduler?
It's quite an unnecessary complication to have
'noop', 'deadline', and 'cfq' for legacy, but 'none' and 'mq-deadline'
for mq. If we could use 'noop' and 'deadline' for mq, too, the existing
settings or udev rules will continue to work and we wouldn't get any
annoying and pointless warnings here...

Cheers,

Hannes
Omar Sandoval Feb. 14, 2017, 7:07 a.m. UTC | #5
On Tue, Feb 14, 2017 at 07:58:22AM +0100, Hannes Reinecke wrote:
> While we're at the topic:
> 
> Can't we use the same names for legacy and mq scheduler?
> It's quite an unnecessary complication to have
> 'noop', 'deadline', and 'cfq' for legacy, but 'none' and 'mq-deadline'
> for mq. If we could use 'noop' and 'deadline' for mq, too, the existing
> settings or udev rules will continue to work and we wouldn't get any
> annoying and pointless warnings here...

I mentioned this to Jens a little while ago but I didn't feel strongly
enough to push the issue. I also like this idea -- it makes the
transition to blk-mq a little more transparent.
Hannes Reinecke Feb. 14, 2017, 7:11 a.m. UTC | #6
On 02/14/2017 08:07 AM, Omar Sandoval wrote:
> On Tue, Feb 14, 2017 at 07:58:22AM +0100, Hannes Reinecke wrote:
>> While we're at the topic:
>>
>> Can't we use the same names for legacy and mq scheduler?
>> It's quite an unnecessary complication to have
>> 'noop', 'deadline', and 'cfq' for legacy, but 'none' and 'mq-deadline'
>> for mq. If we could use 'noop' and 'deadline' for mq, too, the existing
>> settings or udev rules will continue to work and we wouldn't get any
>> annoying and pointless warnings here...
> 
> I mentioned this to Jens a little while ago but I didn't feel strongly
> enough to push the issue. I also like this idea -- it makes the
> transition to blk-mq a little more transparent.
> 
And saves us _a lot_ of support cases :-)

Cheers,

Hannes
Jens Axboe Feb. 14, 2017, 3:13 p.m. UTC | #7
On 02/13/2017 11:58 PM, Hannes Reinecke wrote:
> On 02/13/2017 11:28 PM, Jens Axboe wrote:
>> On 02/13/2017 03:09 PM, Omar Sandoval wrote:
>>> On Mon, Feb 13, 2017 at 10:01:07PM +0100, Paolo Valente wrote:
>>>> If, at boot, a legacy I/O scheduler is chosen for a device using blk-mq,
>>>> or, viceversa, a blk-mq scheduler is chosen for a device using blk, then
>>>> that scheduler is set and initialized without any check, driving the
>>>> system into an inconsistent state. This commit addresses this issue by
>>>> letting elevator_get fail for these wrong cross choices.
>>>>
>>>> Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
>>>> ---
>>>>  block/elevator.c | 26 ++++++++++++++++++--------
>>>>  1 file changed, 18 insertions(+), 8 deletions(-)
>>>
>>> Hey, Paolo,
>>>
>>> How exactly are you triggering this? In __elevator_change(), we do check
>>> for mq or not mq:
>>>
>>> 	if (!e->uses_mq && q->mq_ops) {
>>> 		elevator_put(e);
>>> 		return -EINVAL;
>>> 	}
>>> 	if (e->uses_mq && !q->mq_ops) {
>>> 		elevator_put(e);
>>> 		return -EINVAL;
>>> 	}
>>>
>>> We don't ever appear to call elevator_init() with a specific scheduler
>>> name, and for the default we switch off of q->mq_ops and use the
>>> defaults from Kconfig:
>>>
>>> 	if (q->mq_ops && q->nr_hw_queues == 1)
>>> 		e = elevator_get(CONFIG_DEFAULT_SQ_IOSCHED, false);
>>> 	else if (q->mq_ops)
>>> 		e = elevator_get(CONFIG_DEFAULT_MQ_IOSCHED, false);
>>> 	else
>>> 		e = elevator_get(CONFIG_DEFAULT_IOSCHED, false);
>>>
>>> 	if (!e) {
>>> 		printk(KERN_ERR
>>> 			"Default I/O scheduler not found. " \
>>> 			"Using noop/none.\n");
>>> 		e = elevator_get("noop", false);
>>> 	}
>>>
>>> So I guess this could happen if someone manually changed those Kconfig
>>> options, but I don't see what other case would make this happen, could
>>> you please explain?
>>
>> Was wondering the same - is it using the 'elevator=' boot parameter?
>> Didn't look at that path just now, but that's the only one I could
>> think of. If it is, I'd much prefer only using 'chosen_elevator' for
>> the non-mq stuff, and the fix should be just that instead.
>>
> [ .. ]
> While we're at the topic:
> 
> Can't we use the same names for legacy and mq scheduler?
> It's quite an unnecessary complication to have
> 'noop', 'deadline', and 'cfq' for legacy, but 'none' and 'mq-deadline'
> for mq. If we could use 'noop' and 'deadline' for mq, too, the existing
> settings or udev rules will continue to work and we wouldn't get any
> annoying and pointless warnings here...

I'm fine with potentially renaming mq-deadline to deadline, but I don't
want to mix up none and noop. One is an actual scheduler, the other is
not.
diff mbox

Patch

diff --git a/block/elevator.c b/block/elevator.c
index 27ff1ed..a25bdd9 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -99,7 +99,8 @@  static void elevator_put(struct elevator_type *e)
 	module_put(e->elevator_owner);
 }
 
-static struct elevator_type *elevator_get(const char *name, bool try_loading)
+static struct elevator_type *elevator_get(const char *name, bool try_loading,
+					  bool mq_ops)
 {
 	struct elevator_type *e;
 
@@ -113,6 +114,12 @@  static struct elevator_type *elevator_get(const char *name, bool try_loading)
 		e = elevator_find(name);
 	}
 
+	if (e && (e->uses_mq != mq_ops)) {
+		pr_err("ERROR: attempted to choose %s %s I/O scheduler in blk%s",
+		       name, e->uses_mq ? "blk-mq" : "legacy", mq_ops ? "-mq" : "");
+		e = NULL;
+	}
+
 	if (e && !try_module_get(e->elevator_owner))
 		e = NULL;
 
@@ -201,7 +208,7 @@  int elevator_init(struct request_queue *q, char *name)
 	q->boundary_rq = NULL;
 
 	if (name) {
-		e = elevator_get(name, true);
+		e = elevator_get(name, true, q->mq_ops);
 		if (!e)
 			return -EINVAL;
 	}
@@ -212,7 +219,7 @@  int elevator_init(struct request_queue *q, char *name)
 	 * off async and request_module() isn't allowed from async.
 	 */
 	if (!e && *chosen_elevator) {
-		e = elevator_get(chosen_elevator, false);
+		e = elevator_get(chosen_elevator, false, q->mq_ops);
 		if (!e)
 			printk(KERN_ERR "I/O scheduler %s not found\n",
 							chosen_elevator);
@@ -220,17 +227,20 @@  int elevator_init(struct request_queue *q, char *name)
 
 	if (!e) {
 		if (q->mq_ops && q->nr_hw_queues == 1)
-			e = elevator_get(CONFIG_DEFAULT_SQ_IOSCHED, false);
+			e = elevator_get(CONFIG_DEFAULT_SQ_IOSCHED, false,
+					 q->mq_ops);
 		else if (q->mq_ops)
-			e = elevator_get(CONFIG_DEFAULT_MQ_IOSCHED, false);
+			e = elevator_get(CONFIG_DEFAULT_MQ_IOSCHED, false,
+					 q->mq_ops);
 		else
-			e = elevator_get(CONFIG_DEFAULT_IOSCHED, false);
+			e = elevator_get(CONFIG_DEFAULT_IOSCHED, false,
+					 q->mq_ops);
 
 		if (!e) {
 			printk(KERN_ERR
 				"Default I/O scheduler not found. " \
 				"Using noop/none.\n");
-			e = elevator_get("noop", false);
+			e = elevator_get("noop", false, q->mq_ops);
 		}
 	}
 
@@ -1051,7 +1061,7 @@  static int __elevator_change(struct request_queue *q, const char *name)
 		return elevator_switch(q, NULL);
 
 	strlcpy(elevator_name, name, sizeof(elevator_name));
-	e = elevator_get(strstrip(elevator_name), true);
+	e = elevator_get(strstrip(elevator_name), true, q->mq_ops);
 	if (!e) {
 		printk(KERN_ERR "elevator: type %s not found\n", elevator_name);
 		return -EINVAL;