diff mbox

[BUGFIX] block: make elevator_get robust against cross blk/blk-mq choice

Message ID b1a1ad6d-6e22-7d55-bbad-40807d95f59b@kernel.dk (mailing list archive)
State New, archived
Headers show

Commit Message

Jens Axboe Feb. 13, 2017, 11:10 p.m. UTC
On 02/13/2017 03:28 PM, Jens Axboe wrote:
> On 02/13/2017 03:09 PM, Omar Sandoval wrote:
>> On Mon, Feb 13, 2017 at 10:01:07PM +0100, Paolo Valente wrote:
>>> If, at boot, a legacy I/O scheduler is chosen for a device using blk-mq,
>>> or, viceversa, a blk-mq scheduler is chosen for a device using blk, then
>>> that scheduler is set and initialized without any check, driving the
>>> system into an inconsistent state. This commit addresses this issue by
>>> letting elevator_get fail for these wrong cross choices.
>>>
>>> Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
>>> ---
>>>  block/elevator.c | 26 ++++++++++++++++++--------
>>>  1 file changed, 18 insertions(+), 8 deletions(-)
>>
>> Hey, Paolo,
>>
>> How exactly are you triggering this? In __elevator_change(), we do check
>> for mq or not mq:
>>
>> 	if (!e->uses_mq && q->mq_ops) {
>> 		elevator_put(e);
>> 		return -EINVAL;
>> 	}
>> 	if (e->uses_mq && !q->mq_ops) {
>> 		elevator_put(e);
>> 		return -EINVAL;
>> 	}
>>
>> We don't ever appear to call elevator_init() with a specific scheduler
>> name, and for the default we switch off of q->mq_ops and use the
>> defaults from Kconfig:
>>
>> 	if (q->mq_ops && q->nr_hw_queues == 1)
>> 		e = elevator_get(CONFIG_DEFAULT_SQ_IOSCHED, false);
>> 	else if (q->mq_ops)
>> 		e = elevator_get(CONFIG_DEFAULT_MQ_IOSCHED, false);
>> 	else
>> 		e = elevator_get(CONFIG_DEFAULT_IOSCHED, false);
>>
>> 	if (!e) {
>> 		printk(KERN_ERR
>> 			"Default I/O scheduler not found. " \
>> 			"Using noop/none.\n");
>> 		e = elevator_get("noop", false);
>> 	}
>>
>> So I guess this could happen if someone manually changed those Kconfig
>> options, but I don't see what other case would make this happen, could
>> you please explain?
> 
> Was wondering the same - is it using the 'elevator=' boot parameter?
> Didn't look at that path just now, but that's the only one I could
> think of. If it is, I'd much prefer only using 'chosen_elevator' for
> the non-mq stuff, and the fix should be just that instead.
> 
> So instead of:
> 
> 	if (!e && *chosen_elevator) {
> 
> do
> 
> 	if (!e && !q->mq_ops && && *chosen_elevator) {

Confirmed, that's what it seems to be, and here's a real diff of the
above example that works for me:

Comments

Paolo Valente Feb. 14, 2017, 8:14 a.m. UTC | #1
> Il giorno 14 feb 2017, alle ore 00:10, Jens Axboe <axboe@kernel.dk> ha scritto:
> 
> On 02/13/2017 03:28 PM, Jens Axboe wrote:
>> On 02/13/2017 03:09 PM, Omar Sandoval wrote:
>>> On Mon, Feb 13, 2017 at 10:01:07PM +0100, Paolo Valente wrote:
>>>> If, at boot, a legacy I/O scheduler is chosen for a device using blk-mq,
>>>> or, viceversa, a blk-mq scheduler is chosen for a device using blk, then
>>>> that scheduler is set and initialized without any check, driving the
>>>> system into an inconsistent state. This commit addresses this issue by
>>>> letting elevator_get fail for these wrong cross choices.
>>>> 
>>>> Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
>>>> ---
>>>> block/elevator.c | 26 ++++++++++++++++++--------
>>>> 1 file changed, 18 insertions(+), 8 deletions(-)
>>> 
>>> Hey, Paolo,
>>> 
>>> How exactly are you triggering this? In __elevator_change(), we do check
>>> for mq or not mq:
>>> 
>>> 	if (!e->uses_mq && q->mq_ops) {
>>> 		elevator_put(e);
>>> 		return -EINVAL;
>>> 	}
>>> 	if (e->uses_mq && !q->mq_ops) {
>>> 		elevator_put(e);
>>> 		return -EINVAL;
>>> 	}
>>> 
>>> We don't ever appear to call elevator_init() with a specific scheduler
>>> name, and for the default we switch off of q->mq_ops and use the
>>> defaults from Kconfig:
>>> 
>>> 	if (q->mq_ops && q->nr_hw_queues == 1)
>>> 		e = elevator_get(CONFIG_DEFAULT_SQ_IOSCHED, false);
>>> 	else if (q->mq_ops)
>>> 		e = elevator_get(CONFIG_DEFAULT_MQ_IOSCHED, false);
>>> 	else
>>> 		e = elevator_get(CONFIG_DEFAULT_IOSCHED, false);
>>> 
>>> 	if (!e) {
>>> 		printk(KERN_ERR
>>> 			"Default I/O scheduler not found. " \
>>> 			"Using noop/none.\n");
>>> 		e = elevator_get("noop", false);
>>> 	}
>>> 
>>> So I guess this could happen if someone manually changed those Kconfig
>>> options, but I don't see what other case would make this happen, could
>>> you please explain?
>> 
>> Was wondering the same - is it using the 'elevator=' boot parameter?
>> Didn't look at that path just now, but that's the only one I could
>> think of. If it is, I'd much prefer only using 'chosen_elevator' for
>> the non-mq stuff, and the fix should be just that instead.
>> 
>> So instead of:
>> 
>> 	if (!e && *chosen_elevator) {
>> 
>> do
>> 
>> 	if (!e && !q->mq_ops && && *chosen_elevator) {
> 
> Confirmed, that's what it seems to be, and here's a real diff of the
> above example that works for me:
> 
> diff --git a/block/elevator.c b/block/elevator.c
> index 27ff1ed5a6fa..699d10f71a2c 100644
> --- a/block/elevator.c
> +++ b/block/elevator.c
> @@ -207,11 +207,12 @@ int elevator_init(struct request_queue *q, char *name)
> 	}
> 
> 	/*
> -	 * Use the default elevator specified by config boot param or
> -	 * config option.  Don't try to load modules as we could be running
> -	 * off async and request_module() isn't allowed from async.
> +	 * Use the default elevator specified by config boot param for
> +	 * non-mq devices, or by config option.

I don't fully get this choice: being able to change the default I/O
scheduler through the command line has been rather useful for me,
saving me a lot of recompilations, and such a feature seems widespread
among (at least power) users.  However, mine is of course just an
opinion, and I may be missing the main point also in this case.

Thanks,
Paolo


> Don't try to load modules
> +	 * as we could be running off async and request_module() isn't
> +	 * allowed from async.
> 	 */
> -	if (!e && *chosen_elevator) {
> +	if (!e && !q->mq_ops && *chosen_elevator) {
> 		e = elevator_get(chosen_elevator, false);
> 		if (!e)
> 			printk(KERN_ERR "I/O scheduler %s not found\n",
> 
> -- 
> Jens Axboe
Jens Axboe Feb. 14, 2017, 3:16 p.m. UTC | #2
On 02/14/2017 01:14 AM, Paolo Valente wrote:
> 
>> Il giorno 14 feb 2017, alle ore 00:10, Jens Axboe <axboe@kernel.dk> ha scritto:
>>
>> On 02/13/2017 03:28 PM, Jens Axboe wrote:
>>> On 02/13/2017 03:09 PM, Omar Sandoval wrote:
>>>> On Mon, Feb 13, 2017 at 10:01:07PM +0100, Paolo Valente wrote:
>>>>> If, at boot, a legacy I/O scheduler is chosen for a device using blk-mq,
>>>>> or, viceversa, a blk-mq scheduler is chosen for a device using blk, then
>>>>> that scheduler is set and initialized without any check, driving the
>>>>> system into an inconsistent state. This commit addresses this issue by
>>>>> letting elevator_get fail for these wrong cross choices.
>>>>>
>>>>> Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
>>>>> ---
>>>>> block/elevator.c | 26 ++++++++++++++++++--------
>>>>> 1 file changed, 18 insertions(+), 8 deletions(-)
>>>>
>>>> Hey, Paolo,
>>>>
>>>> How exactly are you triggering this? In __elevator_change(), we do check
>>>> for mq or not mq:
>>>>
>>>> 	if (!e->uses_mq && q->mq_ops) {
>>>> 		elevator_put(e);
>>>> 		return -EINVAL;
>>>> 	}
>>>> 	if (e->uses_mq && !q->mq_ops) {
>>>> 		elevator_put(e);
>>>> 		return -EINVAL;
>>>> 	}
>>>>
>>>> We don't ever appear to call elevator_init() with a specific scheduler
>>>> name, and for the default we switch off of q->mq_ops and use the
>>>> defaults from Kconfig:
>>>>
>>>> 	if (q->mq_ops && q->nr_hw_queues == 1)
>>>> 		e = elevator_get(CONFIG_DEFAULT_SQ_IOSCHED, false);
>>>> 	else if (q->mq_ops)
>>>> 		e = elevator_get(CONFIG_DEFAULT_MQ_IOSCHED, false);
>>>> 	else
>>>> 		e = elevator_get(CONFIG_DEFAULT_IOSCHED, false);
>>>>
>>>> 	if (!e) {
>>>> 		printk(KERN_ERR
>>>> 			"Default I/O scheduler not found. " \
>>>> 			"Using noop/none.\n");
>>>> 		e = elevator_get("noop", false);
>>>> 	}
>>>>
>>>> So I guess this could happen if someone manually changed those Kconfig
>>>> options, but I don't see what other case would make this happen, could
>>>> you please explain?
>>>
>>> Was wondering the same - is it using the 'elevator=' boot parameter?
>>> Didn't look at that path just now, but that's the only one I could
>>> think of. If it is, I'd much prefer only using 'chosen_elevator' for
>>> the non-mq stuff, and the fix should be just that instead.
>>>
>>> So instead of:
>>>
>>> 	if (!e && *chosen_elevator) {
>>>
>>> do
>>>
>>> 	if (!e && !q->mq_ops && && *chosen_elevator) {
>>
>> Confirmed, that's what it seems to be, and here's a real diff of the
>> above example that works for me:
>>
>> diff --git a/block/elevator.c b/block/elevator.c
>> index 27ff1ed5a6fa..699d10f71a2c 100644
>> --- a/block/elevator.c
>> +++ b/block/elevator.c
>> @@ -207,11 +207,12 @@ int elevator_init(struct request_queue *q, char *name)
>> 	}
>>
>> 	/*
>> -	 * Use the default elevator specified by config boot param or
>> -	 * config option.  Don't try to load modules as we could be running
>> -	 * off async and request_module() isn't allowed from async.
>> +	 * Use the default elevator specified by config boot param for
>> +	 * non-mq devices, or by config option.
> 
> I don't fully get this choice: being able to change the default I/O
> scheduler through the command line has been rather useful for me,
> saving me a lot of recompilations, and such a feature seems widespread
> among (at least power) users.  However, mine is of course just an
> opinion, and I may be missing the main point also in this case.

The problem with the elevator= boot parameter is that it applies across
everything, which makes very little sense, since it's a per device
setting. In retrospect, it was a mistake to add this parameter, and I
don't want to continue down that path with blk-mq.

Why aren't you just using online switching through sysfs? For normal
users, typically this would be done through udev rules.
Paolo Valente Feb. 14, 2017, 3:48 p.m. UTC | #3
> Il giorno 14 feb 2017, alle ore 16:16, Jens Axboe <axboe@kernel.dk> ha scritto:
> 
> On 02/14/2017 01:14 AM, Paolo Valente wrote:
>> 
>>> Il giorno 14 feb 2017, alle ore 00:10, Jens Axboe <axboe@kernel.dk> ha scritto:
>>> 
>>> On 02/13/2017 03:28 PM, Jens Axboe wrote:
>>>> On 02/13/2017 03:09 PM, Omar Sandoval wrote:
>>>>> On Mon, Feb 13, 2017 at 10:01:07PM +0100, Paolo Valente wrote:
>>>>>> If, at boot, a legacy I/O scheduler is chosen for a device using blk-mq,
>>>>>> or, viceversa, a blk-mq scheduler is chosen for a device using blk, then
>>>>>> that scheduler is set and initialized without any check, driving the
>>>>>> system into an inconsistent state. This commit addresses this issue by
>>>>>> letting elevator_get fail for these wrong cross choices.
>>>>>> 
>>>>>> Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
>>>>>> ---
>>>>>> block/elevator.c | 26 ++++++++++++++++++--------
>>>>>> 1 file changed, 18 insertions(+), 8 deletions(-)
>>>>> 
>>>>> Hey, Paolo,
>>>>> 
>>>>> How exactly are you triggering this? In __elevator_change(), we do check
>>>>> for mq or not mq:
>>>>> 
>>>>> 	if (!e->uses_mq && q->mq_ops) {
>>>>> 		elevator_put(e);
>>>>> 		return -EINVAL;
>>>>> 	}
>>>>> 	if (e->uses_mq && !q->mq_ops) {
>>>>> 		elevator_put(e);
>>>>> 		return -EINVAL;
>>>>> 	}
>>>>> 
>>>>> We don't ever appear to call elevator_init() with a specific scheduler
>>>>> name, and for the default we switch off of q->mq_ops and use the
>>>>> defaults from Kconfig:
>>>>> 
>>>>> 	if (q->mq_ops && q->nr_hw_queues == 1)
>>>>> 		e = elevator_get(CONFIG_DEFAULT_SQ_IOSCHED, false);
>>>>> 	else if (q->mq_ops)
>>>>> 		e = elevator_get(CONFIG_DEFAULT_MQ_IOSCHED, false);
>>>>> 	else
>>>>> 		e = elevator_get(CONFIG_DEFAULT_IOSCHED, false);
>>>>> 
>>>>> 	if (!e) {
>>>>> 		printk(KERN_ERR
>>>>> 			"Default I/O scheduler not found. " \
>>>>> 			"Using noop/none.\n");
>>>>> 		e = elevator_get("noop", false);
>>>>> 	}
>>>>> 
>>>>> So I guess this could happen if someone manually changed those Kconfig
>>>>> options, but I don't see what other case would make this happen, could
>>>>> you please explain?
>>>> 
>>>> Was wondering the same - is it using the 'elevator=' boot parameter?
>>>> Didn't look at that path just now, but that's the only one I could
>>>> think of. If it is, I'd much prefer only using 'chosen_elevator' for
>>>> the non-mq stuff, and the fix should be just that instead.
>>>> 
>>>> So instead of:
>>>> 
>>>> 	if (!e && *chosen_elevator) {
>>>> 
>>>> do
>>>> 
>>>> 	if (!e && !q->mq_ops && && *chosen_elevator) {
>>> 
>>> Confirmed, that's what it seems to be, and here's a real diff of the
>>> above example that works for me:
>>> 
>>> diff --git a/block/elevator.c b/block/elevator.c
>>> index 27ff1ed5a6fa..699d10f71a2c 100644
>>> --- a/block/elevator.c
>>> +++ b/block/elevator.c
>>> @@ -207,11 +207,12 @@ int elevator_init(struct request_queue *q, char *name)
>>> 	}
>>> 
>>> 	/*
>>> -	 * Use the default elevator specified by config boot param or
>>> -	 * config option.  Don't try to load modules as we could be running
>>> -	 * off async and request_module() isn't allowed from async.
>>> +	 * Use the default elevator specified by config boot param for
>>> +	 * non-mq devices, or by config option.
>> 
>> I don't fully get this choice: being able to change the default I/O
>> scheduler through the command line has been rather useful for me,
>> saving me a lot of recompilations, and such a feature seems widespread
>> among (at least power) users.  However, mine is of course just an
>> opinion, and I may be missing the main point also in this case.
> 
> The problem with the elevator= boot parameter is that it applies across
> everything, which makes very little sense, since it's a per device
> setting. In retrospect, it was a mistake to add this parameter, and I
> don't want to continue down that path with blk-mq.
> 

ok, thanks

> Why aren't you just using online switching through syses?

To change the scheduler from the very beginning at boot.  Which maybe
can be done through udev rules as well, I'm just too ignorant.

Thanks,
Paolo

> For normal
> users, typically this would be done through udev rules.
> 
> -- 
> Jens Axboe
diff mbox

Patch

diff --git a/block/elevator.c b/block/elevator.c
index 27ff1ed5a6fa..699d10f71a2c 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -207,11 +207,12 @@  int elevator_init(struct request_queue *q, char *name)
 	}
 
 	/*
-	 * Use the default elevator specified by config boot param or
-	 * config option.  Don't try to load modules as we could be running
-	 * off async and request_module() isn't allowed from async.
+	 * Use the default elevator specified by config boot param for
+	 * non-mq devices, or by config option. Don't try to load modules
+	 * as we could be running off async and request_module() isn't
+	 * allowed from async.
 	 */
-	if (!e && *chosen_elevator) {
+	if (!e && !q->mq_ops && *chosen_elevator) {
 		e = elevator_get(chosen_elevator, false);
 		if (!e)
 			printk(KERN_ERR "I/O scheduler %s not found\n",