diff mbox series

[v2] blk-mq: put the reference of the io scheduler module after switching back

Message ID OS0P286MB0338E8D41770BFDE7B3A4EBBBE229@OS0P286MB0338.JPNP286.PROD.OUTLOOK.COM (mailing list archive)
State Not Applicable, archived
Headers show
Series [v2] blk-mq: put the reference of the io scheduler module after switching back | expand

Commit Message

Jinlong Chen Oct. 12, 2022, 12:35 a.m. UTC
We got a reference of the io scheduler module in
blk_mq_elv_switch_none to prevent the module from
being removed. We need to put that reference back
once we are done.

Signed-off-by: Jinlong Chen <chenjinlong2016@outlook.com>
---
---
Changes in v2:
 - reword the commit message because the patch is for blk-mq precisely 

 block/blk-mq.c | 7 +++++++
 1 file changed, 7 insertions(+)

Comments

Jinlong Chen Oct. 13, 2022, 1:03 p.m. UTC | #1
Sorry for the disturbance.

This patch is just wrong. elevator_switch_mq does not increase the reference
 count of the io scheduler module to which we are switching. Hence, we do not
 need to put the reference back manually.

Sincerely
Jinlong Chen
Yu Kuai Oct. 13, 2022, 1:22 p.m. UTC | #2
Hi,

在 2022/10/13 21:03, Jinlong Chen 写道:
> Sorry for the disturbance.
> 
> This patch is just wrong. elevator_switch_mq does not increase the reference
>   count of the io scheduler module to which we are switching. Hence, we do not
>   need to put the reference back manually.

I'm confused here, cause I do think this patch make sense.

blk_mq_update_nr_hw_queues
  // for each queue using the tagset
  blk_mq_freeze_queue
  // if current elevator is not none, swith to none
  blk_mq_elv_switch_none
   // elevator need to be switched back, got a reference to
   // prevent module to be removed.
   __module_get
   elevator_switch(q, NULL);

  // switch back from none elevator
  blk_mq_elv_switch_back
   -> should release the module reference here
  blk_mq_unfreeze_queue

By the way, I do not test yet, but I think problem can be reporduced:

1. modprobe bfq
2. switch elevator to bfq
3. trigger blk_mq_update_nr_hw_queues
4. switch elevator to none
5. rmmod bfq will fail

Thanks,
Kuai
> 
> Sincerely
> Jinlong Chen
> .
>
Jinlong Chen Oct. 13, 2022, 1:47 p.m. UTC | #3
Hi, Yu Kuai!

> 
> I'm confused here, cause I do think this patch make sense.
> 
> blk_mq_update_nr_hw_queues
>   // for each queue using the tagset
>   blk_mq_freeze_queue
>   // if current elevator is not none, swith to none
>   blk_mq_elv_switch_none
>    // elevator need to be switched back, got a reference to
>    // prevent module to be removed.
>    __module_get
>    elevator_switch(q, NULL);
> 
>   // switch back from none elevator
>   blk_mq_elv_switch_back
>    -> should release the module reference here
>   blk_mq_unfreeze_queue
> 

We need to release the reference only if blk_mq_elv_switch_back got its own
 reference. However, blk_mq_elv_switch_back (precisely elevator_switch_mq)
 does not increase the reference of the module it is switching to. 
 Hence, the reference got in blk_mq_elv_switch_none does not need to be released,
 or we'll have to re-increase the reference count manually.

> 
> By the way, I do not test yet, but I think problem can be reporduced:
> 
> 
> 1. modprobe bfq
> 2. switch elevator to bfq
> 3. trigger blk_mq_update_nr_hw_queues
> 4. switch elevator to none
> 5. rmmod bfq will fail
> 

I tried to reproduce the problem but failed, so I found my mistake.

Sincerely,
Jinlong Chen
Yu Kuai Oct. 13, 2022, 2:05 p.m. UTC | #4
在 2022/10/13 21:47, Jinlong Chen 写道:
> Hi, Yu Kuai!
> 
>>
>> I'm confused here, cause I do think this patch make sense.
>>
>> blk_mq_update_nr_hw_queues
>>    // for each queue using the tagset
>>    blk_mq_freeze_queue
>>    // if current elevator is not none, swith to none
>>    blk_mq_elv_switch_none
>>     // elevator need to be switched back, got a reference to
>>     // prevent module to be removed.
>>     __module_get
>>     elevator_switch(q, NULL);
>>
>>    // switch back from none elevator
>>    blk_mq_elv_switch_back
>>     -> should release the module reference here
>>    blk_mq_unfreeze_queue
>>
> 
> We need to release the reference only if blk_mq_elv_switch_back got its own
>   reference. However, blk_mq_elv_switch_back (precisely elevator_switch_mq)
>   does not increase the reference of the module it is switching to.
>   Hence, the reference got in blk_mq_elv_switch_none does not need to be released,
>   or we'll have to re-increase the reference count manually.'

But I don't see elevator_switch() release the referenct of the module
it is switching from. It's still not balance to me.

Thanks,
Kuai
> 
>>
>> By the way, I do not test yet, but I think problem can be reporduced:
>>
>>
>> 1. modprobe bfq
>> 2. switch elevator to bfq
>> 3. trigger blk_mq_update_nr_hw_queues
>> 4. switch elevator to none
>> 5. rmmod bfq will fail
>>
> 
> I tried to reproduce the problem but failed, so I found my mistake.
> 
> Sincerely,
> Jinlong Chen
> .
>
Jinlong Chen Oct. 13, 2022, 2:18 p.m. UTC | #5
> 
> But I don't see elevator_switch() release the referenct of the module
> it is switching from. It's still not balance to me.
> 

The reference count is released here:

elevator_switch_mq()
  --> elevator_exit()
    --> __elevator_exit()
      --> kobject_put()
        --> kobject_release()
          --> elevator_release()
            --> elevator_put()

What a deep call stack. :)

Sincerely,
Jinlong Chen
Yu Kuai Oct. 14, 2022, 7:27 a.m. UTC | #6
Hi!

在 2022/10/13 22:18, Jinlong Chen 写道:
>>
>> But I don't see elevator_switch() release the referenct of the module
>> it is switching from. It's still not balance to me.
>>
> 
> The reference count is released here:
> 
> elevator_switch_mq()
>    --> elevator_exit()
>      --> __elevator_exit()
>        --> kobject_put()
>          --> kobject_release()
>            --> elevator_release()
>              --> elevator_put()
> 
> What a deep call stack. :)

Yes, you're right.

Thanks,
Kuai
> 
> Sincerely,
> Jinlong Chen
> .
>
diff mbox series

Patch

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 8070b6c10e8d..8dfe3bf3e599 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -4595,6 +4595,13 @@  static void blk_mq_elv_switch_back(struct list_head *head,
 
 	mutex_lock(&q->sysfs_lock);
 	elevator_switch(q, t);
+	/**
+	 * We got a reference of the io scheduler module in
+	 * blk_mq_elv_switch_none to prevent the module from
+	 * being removed. We need to put that reference back
+	 * once we are done.
+	 */
+	module_put(t->elevator_owner);
 	mutex_unlock(&q->sysfs_lock);
 }