mbox series

[GIT,PULL] Block fix for 6.11-final

Message ID c8f3fba4-9cc1-4e7c-baf3-afb10ab7605d@kernel.dk (mailing list archive)
State New
Headers show
Series [GIT,PULL] Block fix for 6.11-final | expand

Pull-request

git://git.kernel.dk/linux.git tags/block-6.11-20240912

Message

Jens Axboe Sept. 12, 2024, 10:44 p.m. UTC
Hi Linus,

Just a single fix for a deadlock issue that can happen if someone
attempts to change the root disk IO scheduler with a module that
requires loading from disk. Changing the scheduler freezes the queue
while that operation is happening, hence causing a deadlock.

Please pull!


The following changes since commit 4ba032bc71dad8d604d308afffaa16b81816c751:

  Merge tag 'nvme-6.11-2024-09-05' of git://git.infradead.org/nvme into block-6.11 (2024-09-05 08:45:54 -0600)

are available in the Git repository at:

  git://git.kernel.dk/linux.git tags/block-6.11-20240912

for you to fetch changes up to 734e1a8603128ac31526c477a39456be5f4092b6:

  block: Prevent deadlocks when switching elevators (2024-09-10 13:43:42 -0600)

----------------------------------------------------------------
block-6.11-20240912

----------------------------------------------------------------
Damien Le Moal (1):
      block: Prevent deadlocks when switching elevators

 block/blk-sysfs.c | 22 +++++++++++++++++++++-
 block/elevator.c  | 21 +++++++++++++++------
 block/elevator.h  |  2 ++
 3 files changed, 38 insertions(+), 7 deletions(-)

Comments

Linus Torvalds Sept. 12, 2024, 11:21 p.m. UTC | #1
On Thu, 12 Sept 2024 at 15:44, Jens Axboe <axboe@kernel.dk> wrote:
>
> Just a single fix for a deadlock issue that can happen if someone
> attempts to change the root disk IO scheduler with a module that
> requires loading from disk. Changing the scheduler freezes the queue
> while that operation is happening, hence causing a deadlock.

Side note: I do think that doing the blk_mq_freeze_queue() outside the
sysfs_lock mutex is also a mistake, and will deadlock if anybody then
needs to do any IO (like a user space access) inside the sysfs_lock
mutex somewhere else.

It wasn't what caused Jesper's problems, and maybe nothing actually
does that, but it still looks rather questionable in
queue_attr_store().

I mean, imagine holding q->sysfs_lock, and doing something as simple
as just a memory allocation that wants to do swapping, but somebody
else did that queue_attr_store(), which freezed the queues and is now
waiting for the lock and won't unfreeze them until it gets it...

Yeah, yeah, very very unlikely to hit in real life, but still. Seems very wrong.

             Linus
pr-tracker-bot@kernel.org Sept. 12, 2024, 11:24 p.m. UTC | #2
The pull request you sent on Thu, 12 Sep 2024 16:44:21 -0600:

> git://git.kernel.dk/linux.git tags/block-6.11-20240912

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/b8e7cd09ae543c1d384677b3d43e009a0e8647ca

Thank you!
Jens Axboe Sept. 13, 2024, 12:39 a.m. UTC | #3
On 9/12/24 5:21 PM, Linus Torvalds wrote:
> On Thu, 12 Sept 2024 at 15:44, Jens Axboe <axboe@kernel.dk> wrote:
>>
>> Just a single fix for a deadlock issue that can happen if someone
>> attempts to change the root disk IO scheduler with a module that
>> requires loading from disk. Changing the scheduler freezes the queue
>> while that operation is happening, hence causing a deadlock.
> 
> Side note: I do think that doing the blk_mq_freeze_queue() outside the
> sysfs_lock mutex is also a mistake, and will deadlock if anybody then
> needs to do any IO (like a user space access) inside the sysfs_lock
> mutex somewhere else.
> 
> It wasn't what caused Jesper's problems, and maybe nothing actually
> does that, but it still looks rather questionable in
> queue_attr_store().
> 
> I mean, imagine holding q->sysfs_lock, and doing something as simple
> as just a memory allocation that wants to do swapping, but somebody
> else did that queue_attr_store(), which freezed the queues and is now
> waiting for the lock and won't unfreeze them until it gets it...
> 
> Yeah, yeah, very very unlikely to hit in real life, but still. Seems
> very wrong.

Yep I agree, it does feel like the wrong thing and could deadlock under
reclaim. I didn't get around to replying to the other one, but I didn't
want to risk changing the ordering right before release. Will do a
separate patch for testing.