[<PATCH,v1>,4/9] mmc: core: fix SD card request queue refcount underflow during shutdown

Message ID afdbf5eff1918f4004f2418e90bd08400d40ed1b.1576540907.git.nguyenb@codeaurora.org (mailing list archive)
State New, archived
Series SD card bug fixes

Commit Message

Bao D. Nguyen Dec. 17, 2019, 2:50 a.m. UTC
From: Can Guo <cang@codeaurora.org>

When the system shuts down, the kernel calls the mmc shutdown function to
stop the request queue and clean it up, which puts the request queue's
kobject once. Normally, when the SD card is removed, the mmc_blk_remove
routine releases all the resources and kobjects related to the disk and
request queue by dropping their kref counts to 0. But if the SD card is
removed after its shutdown function has been called, the kref count
underflow error below is raised, because the shutdown function already
decremented the kref count once during request queue cleanup. This change
fixes it by skipping request queue cleanup in the mmc blk routine if the
queue has already been marked as dead.

[  166.187211] refcount_t: underflow; use-after-free.
[  166.187277] ------------[ cut here ]------------
[  166.187321] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[  166.187542] Workqueue: events_freezable mmc_rescan
[  166.187558] task: ffffffe673b96680 task.stack: ffffff8008418000
[  166.187579] pc : refcount_sub_and_test+0x64/0x78
[  166.187593] lr : refcount_sub_and_test+0x64/0x78
[  166.187605] sp : ffffff800841ba20 pstate : 60c00145
[  166.188319] Call trace:
[  166.188331]  refcount_sub_and_test+0x64/0x78
[  166.188343]  refcount_dec_and_test+0x18/0x24
[  166.188355]  kobject_put+0x5c/0x74
[  166.188374]  blk_put_queue+0x1c/0x28
[  166.188388]  disk_release+0x70/0x90
[  166.188402]  device_release+0x38/0x90
[  166.188429]  kobject_cleanup+0xc4/0x1c0
[  166.188441]  kobject_put+0x68/0x74
[  166.188455]  put_disk+0x20/0x2c
[  166.188467]  mmc_blk_put+0x9c/0xdc
[  166.188480]  mmc_blk_remove_req+0x110/0x120
[  166.188493]  mmc_blk_remove+0x14c/0x22c
[  166.188505]  mmc_bus_remove+0x24/0x34
[  166.188517]  device_release_driver_internal+0x13c/0x1e0
[  166.188528]  device_release_driver+0x24/0x30
[  166.188540]  bus_remove_device+0xdc/0x120
[  166.188552]  device_del+0x1e0/0x284
[  166.188564]  mmc_remove_card+0x68/0x7c
[  166.188577]  mmc_sd_remove+0x24/0x48
[  166.188588]  mmc_sd_detect+0x120/0x1a4
[  166.188600]  mmc_rescan+0xf4/0x384
[  166.188613]  process_one_work+0x1c0/0x3d4
[  166.188625]  worker_thread+0x224/0x344
[  166.188637]  kthread+0x120/0x130
[  166.188649]  ret_from_fork+0x10/0x18.

Signed-off-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Sayali Lokhande <sayalil@codeaurora.org>
Signed-off-by: Bao D. Nguyen <nguyenb@codeaurora.org>
---
 drivers/mmc/core/queue.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
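
The double put described in the commit message can be modelled in userspace.
This is only a sketch under stated assumptions: struct model_queue and every
function below are hypothetical stand-ins, not kernel code, and the starting
count of two references (one dropped by cleanup, one by the disk release) is
an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a request queue; not the kernel struct. */
struct model_queue {
	int refs;        /* models the kobject reference count */
	bool dead;       /* models the "queue is dead" flag */
	int underflows;  /* counts puts attempted with no references left */
};

/* Drop one reference; record an underflow instead of going negative. */
static void model_put(struct model_queue *q)
{
	assert(q != NULL);
	if (q->refs == 0) {
		q->underflows++;
		return;
	}
	q->refs--;
}

/* Models queue cleanup: mark the queue dead and drop one reference. */
static void model_cleanup(struct model_queue *q)
{
	q->dead = true;
	model_put(q);
}

/*
 * Replays "shutdown first, card removal afterwards" and returns the number
 * of underflows seen. skip_if_dead selects the behaviour the patch proposes.
 */
static int remove_after_shutdown(bool skip_if_dead)
{
	/* Assumed starting point: two references on the queue. */
	struct model_queue q = { .refs = 2, .dead = false, .underflows = 0 };

	model_cleanup(&q);              /* shutdown path: cleanup puts once */

	if (!(skip_if_dead && q.dead))  /* removal path: clean up again? */
		model_cleanup(&q);
	model_put(&q);                  /* disk release puts the queue */

	return q.underflows;
}
```

In this model, remove_after_shutdown(false) runs cleanup twice and the final
put finds no reference left, while remove_after_shutdown(true) skips the
second cleanup and the final put brings the count to zero cleanly.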

Comments

Greg KH Dec. 18, 2019, 8:33 a.m. UTC | #1
On Mon, Dec 16, 2019 at 06:50:37PM -0800, Bao D. Nguyen wrote:
> diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
> index 9edc086..846557b 100644
> --- a/drivers/mmc/core/queue.c
> +++ b/drivers/mmc/core/queue.c
> @@ -506,7 +506,8 @@ void mmc_cleanup_queue(struct mmc_queue *mq)
>  	if (blk_queue_quiesced(q))
>  		blk_mq_unquiesce_queue(q);
>  
> -	blk_cleanup_queue(q);
> +	if (likely(!blk_queue_dead(q)))
> +		blk_cleanup_queue(q);

Unless you can measure the performance impact, never use unlikely/likely
in kernel code.  The compiler and cpu will always do much better over
time than you can.

That being said, what will clean up the queue later on if it is not "dead"
at this point in time?  Isn't this a leak?

thanks,

greg k-h
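
On the likely()/unlikely() point raised in the reply, the following userspace
sketch shows that the macro is only a branch-prediction hint and does not
change the result of the condition. The macro body mirrors the usual kernel
definition and requires GCC or Clang; the helper functions are hypothetical
and exist only for illustration.

```c
#include <stdbool.h>

/* Userspace copy of the usual kernel definition; needs GCC or Clang. */
#define likely(x) __builtin_expect(!!(x), 1)

/* Hypothetical helper: decides whether cleanup should run, with the hint. */
static bool should_cleanup_hinted(bool queue_dead)
{
	if (likely(!queue_dead))
		return true;
	return false;
}

/* Same decision without the hint. */
static bool should_cleanup_plain(bool queue_dead)
{
	if (!queue_dead)
		return true;
	return false;
}

/* Checks that both variants agree for every possible input. */
static bool variants_agree(void)
{
	return should_cleanup_hinted(true) == should_cleanup_plain(true) &&
	       should_cleanup_hinted(false) == should_cleanup_plain(false);
}
```

Since both variants return the same value for every input, dropping the hint
changes nothing functionally; only a measurement could show whether it
affects performance.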

Patch

diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
index 9edc086..846557b 100644
--- a/drivers/mmc/core/queue.c
+++ b/drivers/mmc/core/queue.c
@@ -506,7 +506,8 @@  void mmc_cleanup_queue(struct mmc_queue *mq)
 	if (blk_queue_quiesced(q))
 		blk_mq_unquiesce_queue(q);
 
-	blk_cleanup_queue(q);
+	if (likely(!blk_queue_dead(q)))
+		blk_cleanup_queue(q);
 	blk_mq_free_tag_set(&mq->tag_set);
 
 	/*