| Message ID | 20241216080206.2850773-2-ming.lei@redhat.com (mailing list archive) |
| --- | --- |
| State | New |
| Series | block: fix deadlock caused by atomic limits update |
On Mon, Dec 16, 2024 at 04:02:03PM +0800, Ming Lei wrote:
> More importantly, queue_limits_start_update() returns one local copy of
> q->limits, then the API user overwrites the local copy, and commit the
> copy by queue_limits_commit_update() finally.
>
> So holding q->limits_lock protects nothing for the local copy, and not see
> any real help by preventing new update & commit from happening, cause
> what matters is that we do validation & commit atomically.

It protects against someone else changing the copy in the queue while
the current thread is updating the local copy, and thus incoherent
updates.  So no, we can't just remove it.

On Mon, Dec 16, 2024 at 04:49:01PM +0100, Christoph Hellwig wrote:
> On Mon, Dec 16, 2024 at 04:02:03PM +0800, Ming Lei wrote:
> > More importantly, queue_limits_start_update() returns one local copy of
> > q->limits, then the API user overwrites the local copy, and commit the
> > copy by queue_limits_commit_update() finally.
> >
> > So holding q->limits_lock protects nothing for the local copy, and not see
> > any real help by preventing new update & commit from happening, cause
> > what matters is that we do validation & commit atomically.
>
> It protects against someone else changing the copy in the queue while
> the current thread is updating the local copy, and thus incoherent
> updates.  So no, we can't just remove it.

The local copy can be updated in any way with any data, so does another
concurrent update on q->limits really matter?

thanks,
Ming

On Tue, Dec 17, 2024 at 09:52:51AM +0800, Ming Lei wrote:
> The local copy can be updated in any way with any data, so does another
> concurrent update on q->limits really matter?

Yes, because that means one of the updates get lost even if it is
for entirely separate fields.

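To make the lost-update concern concrete, here is a minimal userspace sketch (an editorial illustration, not kernel code; the struct is a simplified stand-in for struct queue_limits) of two updaters that each snapshot the whole limits structure, change different fields, and write the whole structure back:

    /*
     * Userspace sketch of the lost-update hazard: two updaters each
     * snapshot the whole limits struct, change different fields, and
     * write the whole struct back.  Whichever commit runs last
     * silently undoes the other one's change.
     */
    #include <stdio.h>

    struct limits {                 /* simplified stand-in for struct queue_limits */
            unsigned int max_sectors;
            unsigned int io_min;
    };

    static struct limits q_limits = { .max_sectors = 255, .io_min = 512 };

    int main(void)
    {
            struct limits a = q_limits;     /* updater A: start_update (snapshot) */
            struct limits b = q_limits;     /* updater B: start_update (snapshot) */

            a.max_sectors = 1024;           /* A changes only max_sectors */
            b.io_min = 4096;                /* B changes only io_min */

            q_limits = a;                   /* A: commit_update */
            q_limits = b;                   /* B: commit_update, writes back the
                                             * stale max_sectors = 255 */

            printf("max_sectors=%u io_min=%u\n",
                   q_limits.max_sectors, q_limits.io_min);
            /* prints max_sectors=255: A's update was lost */
            return 0;
    }

Holding q->limits_lock from start_update to commit_update serializes the two read-modify-write cycles and prevents exactly this interleaving.
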
On Tue, Dec 17, 2024 at 05:40:56AM +0100, Christoph Hellwig wrote:
> On Tue, Dec 17, 2024 at 09:52:51AM +0800, Ming Lei wrote:
> > The local copy can be updated in any way with any data, so does another
> > concurrent update on q->limits really matter?
>
> Yes, because that means one of the updates get lost even if it is
> for entirely separate fields.

Right, but the limits are still valid anytime.

Any suggestion for fixing this deadlock?

One idea is to add queue_limits_start_try_update() and apply it in
sysfs ->store(), in which update failure could be tolerable.

Thanks,
Ming

On Tue, Dec 17, 2024 at 03:05:48PM +0800, Ming Lei wrote:
> On Tue, Dec 17, 2024 at 05:40:56AM +0100, Christoph Hellwig wrote:
> > On Tue, Dec 17, 2024 at 09:52:51AM +0800, Ming Lei wrote:
> > > The local copy can be updated in any way with any data, so does another
> > > concurrent update on q->limits really matter?
> >
> > Yes, because that means one of the updates get lost even if it is
> > for entirely separate fields.
>
> Right, but the limits are still valid anytime.
>
> Any suggestion for fixing this deadlock?

What is "this deadlock"?

On Tue, Dec 17, 2024 at 08:19:28AM +0100, Christoph Hellwig wrote:
> On Tue, Dec 17, 2024 at 03:05:48PM +0800, Ming Lei wrote:
> > On Tue, Dec 17, 2024 at 05:40:56AM +0100, Christoph Hellwig wrote:
> > > On Tue, Dec 17, 2024 at 09:52:51AM +0800, Ming Lei wrote:
> > > > The local copy can be updated in any way with any data, so does another
> > > > concurrent update on q->limits really matter?
> > >
> > > Yes, because that means one of the updates get lost even if it is
> > > for entirely separate fields.
> >
> > Right, but the limits are still valid anytime.
> >
> > Any suggestion for fixing this deadlock?
>
> What is "this deadlock"?

The commit log provides two reports:

- lockdep warning

https://lore.kernel.org/linux-block/Z1A8fai9_fQFhs1s@hovoldconsulting.com/

- real deadlock report

https://lore.kernel.org/linux-scsi/ZxG38G9BuFdBpBHZ@fedora/

It is actually one simple ABBA lock:

1) queue_attr_store()

        blk_mq_freeze_queue(q);                 //queue freeze lock
        res = entry->store(disk, page, length);
                queue_limits_start_update       //->limits_lock
                ...
                queue_limits_commit_update
        blk_mq_unfreeze_queue(q);

2) sd_revalidate_disk()

        queue_limits_start_update               //->limits_lock
        sd_read_capacity()
                scsi_execute_cmd
                        scsi_alloc_request
                                blk_queue_enter //queue freeze lock

Thanks,
Ming

On 2024/12/16 23:30, Ming Lei wrote:
> On Tue, Dec 17, 2024 at 08:19:28AM +0100, Christoph Hellwig wrote:
>> On Tue, Dec 17, 2024 at 03:05:48PM +0800, Ming Lei wrote:
>>> On Tue, Dec 17, 2024 at 05:40:56AM +0100, Christoph Hellwig wrote:
>>>> On Tue, Dec 17, 2024 at 09:52:51AM +0800, Ming Lei wrote:
>>>>> The local copy can be updated in any way with any data, so does another
>>>>> concurrent update on q->limits really matter?
>>>>
>>>> Yes, because that means one of the updates get lost even if it is
>>>> for entirely separate fields.
>>>
>>> Right, but the limits are still valid anytime.
>>>
>>> Any suggestion for fixing this deadlock?
>>
>> What is "this deadlock"?
>
> The commit log provides two reports:
>
> - lockdep warning
>
> https://lore.kernel.org/linux-block/Z1A8fai9_fQFhs1s@hovoldconsulting.com/
>
> - real deadlock report
>
> https://lore.kernel.org/linux-scsi/ZxG38G9BuFdBpBHZ@fedora/
>
> It is actually one simple ABBA lock:
>
> 1) queue_attr_store()
>
>         blk_mq_freeze_queue(q);                 //queue freeze lock
>         res = entry->store(disk, page, length);
>                 queue_limits_start_update       //->limits_lock
>                 ...
>                 queue_limits_commit_update
>         blk_mq_unfreeze_queue(q);

The locking + freeze pattern should be:

        lim = queue_limits_start_update(q);
        ...
        blk_mq_freeze_queue(q);
        ret = queue_limits_commit_update(q, &lim);
        blk_mq_unfreeze_queue(q);

This pattern is used in most places and anything that does not use it is likely
susceptible to a similar ABBA deadlock. We should probably look into trying to
integrate the freeze/unfreeze calls directly into queue_limits_commit_update().

Fixing queue_attr_store() to use this pattern seems simpler than trying to fix
sd_revalidate_disk().

> 2) sd_revalidate_disk()
>
>         queue_limits_start_update               //->limits_lock
>         sd_read_capacity()
>                 scsi_execute_cmd
>                         scsi_alloc_request
>                                 blk_queue_enter //queue freeze lock
>
> Thanks,
> Ming
>

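As an illustration of the pattern Damien describes, a sysfs ->store() handler could be reworked along these lines. This is a hypothetical sketch: the handler name and the max_sectors_kb handling are invented for the example; only queue_limits_start_update(), queue_limits_commit_update(), blk_mq_freeze_queue() and blk_mq_unfreeze_queue() are the real APIs.

    /*
     * Hypothetical sysfs ->store() handler following the pattern above:
     * take q->limits_lock first via queue_limits_start_update(), and
     * freeze the queue only around the commit.
     */
    static ssize_t example_attr_store(struct gendisk *disk, const char *page,
                                      size_t count)
    {
            struct request_queue *q = disk->queue;
            struct queue_limits lim;
            unsigned long max_sectors_kb;
            int err;

            err = kstrtoul(page, 10, &max_sectors_kb);
            if (err)
                    return err;

            lim = queue_limits_start_update(q);     /* takes q->limits_lock */
            lim.max_user_sectors = max_sectors_kb << 1;

            blk_mq_freeze_queue(q);                 /* freeze only for the commit */
            err = queue_limits_commit_update(q, &lim); /* validates, drops the lock */
            blk_mq_unfreeze_queue(q);
            if (err)
                    return err;

            return count;
    }

Because no request is issued between taking limits_lock and freezing the queue, this ordering cannot collide with a path that freezes first and then wants limits_lock.
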
On Tue, Dec 17, 2024 at 08:07:06AM -0800, Damien Le Moal wrote:
> On 2024/12/16 23:30, Ming Lei wrote:
> > On Tue, Dec 17, 2024 at 08:19:28AM +0100, Christoph Hellwig wrote:
> >> On Tue, Dec 17, 2024 at 03:05:48PM +0800, Ming Lei wrote:
> >>> On Tue, Dec 17, 2024 at 05:40:56AM +0100, Christoph Hellwig wrote:
> >>>> On Tue, Dec 17, 2024 at 09:52:51AM +0800, Ming Lei wrote:
> >>>>> The local copy can be updated in any way with any data, so does another
> >>>>> concurrent update on q->limits really matter?
> >>>>
> >>>> Yes, because that means one of the updates get lost even if it is
> >>>> for entirely separate fields.
> >>>
> >>> Right, but the limits are still valid anytime.
> >>>
> >>> Any suggestion for fixing this deadlock?
> >>
> >> What is "this deadlock"?
> >
> > The commit log provides two reports:
> >
> > - lockdep warning
> >
> > https://lore.kernel.org/linux-block/Z1A8fai9_fQFhs1s@hovoldconsulting.com/
> >
> > - real deadlock report
> >
> > https://lore.kernel.org/linux-scsi/ZxG38G9BuFdBpBHZ@fedora/
> >
> > It is actually one simple ABBA lock:
> >
> > 1) queue_attr_store()
> >
> >         blk_mq_freeze_queue(q);                 //queue freeze lock
> >         res = entry->store(disk, page, length);
> >                 queue_limits_start_update       //->limits_lock
> >                 ...
> >                 queue_limits_commit_update
> >         blk_mq_unfreeze_queue(q);
>
> The locking + freeze pattern should be:
>
>         lim = queue_limits_start_update(q);
>         ...
>         blk_mq_freeze_queue(q);
>         ret = queue_limits_commit_update(q, &lim);
>         blk_mq_unfreeze_queue(q);
>
> This pattern is used in most places and anything that does not use it is likely
> susceptible to a similar ABBA deadlock. We should probably look into trying to
> integrate the freeze/unfreeze calls directly into queue_limits_commit_update().
>
> Fixing queue_attr_store() to use this pattern seems simpler than trying to fix
> sd_revalidate_disk().

This way looks good, just commit af2814149883 ("block: freeze the queue in
queue_attr_store") needs to be reverted, and freeze/unfreeze has to be
added to each queue attribute .store() handler.

Thanks,
Ming

On 12/18/24 07:39, Ming Lei wrote:
> On Tue, Dec 17, 2024 at 08:07:06AM -0800, Damien Le Moal wrote:
>> On 2024/12/16 23:30, Ming Lei wrote:
>>> On Tue, Dec 17, 2024 at 08:19:28AM +0100, Christoph Hellwig wrote:
>>>> On Tue, Dec 17, 2024 at 03:05:48PM +0800, Ming Lei wrote:
>>>>> On Tue, Dec 17, 2024 at 05:40:56AM +0100, Christoph Hellwig wrote:
>>>>>> On Tue, Dec 17, 2024 at 09:52:51AM +0800, Ming Lei wrote:
>>>>>>> The local copy can be updated in any way with any data, so does another
>>>>>>> concurrent update on q->limits really matter?
>>>>>>
>>>>>> Yes, because that means one of the updates get lost even if it is
>>>>>> for entirely separate fields.
>>>>>
>>>>> Right, but the limits are still valid anytime.
>>>>>
>>>>> Any suggestion for fixing this deadlock?
>>>>
>>>> What is "this deadlock"?
>>>
>>> The commit log provides two reports:
>>>
>>> - lockdep warning
>>>
>>> https://lore.kernel.org/linux-block/Z1A8fai9_fQFhs1s@hovoldconsulting.com/
>>>
>>> - real deadlock report
>>>
>>> https://lore.kernel.org/linux-scsi/ZxG38G9BuFdBpBHZ@fedora/
>>>
>>> It is actually one simple ABBA lock:
>>>
>>> 1) queue_attr_store()
>>>
>>>         blk_mq_freeze_queue(q);                 //queue freeze lock
>>>         res = entry->store(disk, page, length);
>>>                 queue_limits_start_update       //->limits_lock
>>>                 ...
>>>                 queue_limits_commit_update
>>>         blk_mq_unfreeze_queue(q);
>>
>> The locking + freeze pattern should be:
>>
>>         lim = queue_limits_start_update(q);
>>         ...
>>         blk_mq_freeze_queue(q);
>>         ret = queue_limits_commit_update(q, &lim);
>>         blk_mq_unfreeze_queue(q);
>>
>> This pattern is used in most places and anything that does not use it is likely
>> susceptible to a similar ABBA deadlock. We should probably look into trying to
>> integrate the freeze/unfreeze calls directly into queue_limits_commit_update().
>>
>> Fixing queue_attr_store() to use this pattern seems simpler than trying to fix
>> sd_revalidate_disk().
>
> This way looks good, just commit af2814149883 ("block: freeze the queue in
> queue_attr_store") needs to be reverted, and freeze/unfreeze has to be
> added to each queue attribute .store() handler.
>

Wouldn't it be feasible to add blk-mq freeze in queue_limits_start_update()
and blk-mq unfreeze in queue_limits_commit_update()? If we do this then the
pattern would be:

        queue_limits_start_update():  limit-lock + freeze
        queue_limits_commit_update(): unfreeze + limit-unlock

Then in queue_attr_store() we shall just remove freeze/unfreeze.

We also need to fix a few call sites where we have a code block like:

        {
                blk_mq_freeze_queue()
                ...
                queue_limits_start_update()
                ...
                queue_limits_commit_update()
                ...
                blk_mq_unfreeze_queue()
        }

In the above code block, we may then replace blk_mq_freeze_queue() with
queue_limits_start_update() and similarly replace blk_mq_unfreeze_queue()
with queue_limits_commit_update():

        {
                queue_limits_start_update()
                ...
                ...
                ...
                queue_limits_commit_update()
        }

Thanks,
--Nilay

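A rough sketch of Nilay's idea, assuming the freeze is simply folded into the existing helpers. This is an editorial illustration, not the actual implementation; the real queue_limits_commit_update() does more work after validation, and the current helpers do not freeze the queue at all.

    /*
     * Sketch only: fold queue freeze/unfreeze into the limits-update
     * helpers so that every caller gets the limits_lock -> freeze
     * ordering for free.
     */
    static inline struct queue_limits
    queue_limits_start_update(struct request_queue *q)
    {
            mutex_lock(&q->limits_lock);    /* limit-lock ... */
            blk_mq_freeze_queue(q);         /* ... then freeze */
            return q->limits;
    }

    int queue_limits_commit_update(struct request_queue *q,
                                   struct queue_limits *lim)
    {
            int error = blk_validate_limits(lim);

            if (!error)
                    q->limits = *lim;       /* simplified; the real helper
                                             * also applies the new limits */
            blk_mq_unfreeze_queue(q);       /* unfreeze ... */
            mutex_unlock(&q->limits_lock);  /* ... then limit-unlock */
            return error;
    }

This gives every caller the same lock-then-freeze ordering, but it also keeps the queue frozen for the whole update window rather than only around the commit, which is the trade-off the discussion above is weighing.
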
diff --git a/block/blk-settings.c b/block/blk-settings.c
index 8f09e33f41f6..b737428c6084 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -422,6 +422,7 @@ int queue_limits_commit_update(struct request_queue *q,
 {
 	int error;
 
+	mutex_lock(&q->limits_lock);
 	error = blk_validate_limits(lim);
 	if (error)
 		goto out_unlock;
@@ -456,7 +457,6 @@ EXPORT_SYMBOL_GPL(queue_limits_commit_update);
  */
 int queue_limits_set(struct request_queue *q, struct queue_limits *lim)
 {
-	mutex_lock(&q->limits_lock);
 	return queue_limits_commit_update(q, lim);
 }
 EXPORT_SYMBOL_GPL(queue_limits_set);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 378d3a1a22fc..6cc20ca79adc 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -944,8 +944,13 @@ static inline unsigned int blk_boundary_sectors_left(sector_t offset,
 static inline struct queue_limits
 queue_limits_start_update(struct request_queue *q)
 {
+	struct queue_limits lim;
+
 	mutex_lock(&q->limits_lock);
-	return q->limits;
+	lim = q->limits;
+	mutex_unlock(&q->limits_lock);
+
+	return lim;
 }
 int queue_limits_commit_update(struct request_queue *q,
 		struct queue_limits *lim);
@@ -962,7 +967,6 @@ int blk_validate_limits(struct queue_limits *lim);
  */
 static inline void queue_limits_cancel_update(struct request_queue *q)
 {
-	mutex_unlock(&q->limits_lock);
 }
 
 /*
Commit d690cb8ae14b ("block: add an API to atomically update queue limits")
adds APIs for updating queue limits atomically: q->limits_lock is grabbed
in queue_limits_start_update(), and released in queue_limits_commit_update().

This way is very fragile and easy to introduce deadlock[1][2].

More importantly, queue_limits_start_update() returns one local copy of
q->limits, then the API user overwrites the local copy, and commit the
copy by queue_limits_commit_update() finally.

So holding q->limits_lock protects nothing for the local copy, and not see
any real help by preventing new update & commit from happening, cause
what matters is that we do validation & commit atomically.

Change the API to not hold q->limits_lock across the atomic queue limits
update APIs, which fixes the deadlock and makes the API easier to use.

[1] https://lore.kernel.org/linux-block/Z1A8fai9_fQFhs1s@hovoldconsulting.com/
[2] https://lore.kernel.org/linux-scsi/ZxG38G9BuFdBpBHZ@fedora/

Fixes: d690cb8ae14b ("block: add an API to atomically update queue limits")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-settings.c   | 2 +-
 include/linux/blkdev.h | 8 ++++++--
 2 files changed, 7 insertions(+), 3 deletions(-)