diff mbox series

block: aoe: fix page fault in freedev()

Message ID c274db07-9c7d-d857-33ad-4a762819bcdd@vrvis.at (mailing list archive)
State New, archived
Headers show
Series block: aoe: fix page fault in freedev() | expand

Commit Message

Valentin Kleibel March 10, 2022, 11:53 a.m. UTC
There is a bug in the aoe driver module where every forcible removal of 
an aoe device (eg. "rmmod aoe" with aoe devices available or "aoe-flush 
ex.x") leads to a page fault.
The code in freedev() calls blk_mq_free_tag_set() before running 
blk_cleanup_queue() which leads to this issue 
(drivers/block/aoe/aoedev.c L281ff).
This issue was fixed upstream in commit 6560ec9 (aoe: use 
blk_mq_alloc_disk and blk_cleanup_disk) with the introduction and use of 
the function blk_cleanup_disk().

This patch applies to kernels 5.4 and 5.10.

The function calls are reordered to match the behavior of 
blk_cleanup_disk() to mitigate this issue.

Fixes: 3582dd2 (aoe: convert aoeblk to blk-mq)
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=215647
Signed-off-by: Valentin Kleibel <valentin@vrvis.at>
---
  drivers/block/aoe/aoedev.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Valentin Kleibel March 10, 2022, 12:55 p.m. UTC | #1
On 10/03/2022 13:26, Greg Kroah-Hartman wrote:
> On Thu, Mar 10, 2022 at 01:24:38PM +0100, Valentin Kleibel wrote:
>> On 10/03/2022 13:03, Greg Kroah-Hartman wrote:
>>>> This patch applies to kernels 5.4 and 5.10.
>>>
>>> We need a fix for Linus's tree first before we can backport anything to
>>> older kernels.  Does this also work there?
>>
>> It is fixed in Linus' tree starting with 5.14.
> 
> What commit fixes it there?  Why not just backport that one only?

commit 6560ec961a08 (aoe: use blk_mq_alloc_disk and blk_cleanup_disk)
This commit uses the function blk_cleanup_disk() in freedev() in 
drivers/block/aoe/aoedev.c which fixes the issue.
The function was introduced in f525464a8000 (block: add blk_alloc_disk 
and blk_cleanup_disk APIs):
void blk_cleanup_disk(struct gendisk *disk)
{
	blk_cleanup_queue(disk->queue);
	put_disk(disk);
}
EXPORT_SYMBOL(blk_cleanup_disk);

I tried to backport the fix to the lts kernels without introducing a new 
API by just adjusting the order of the two function calls.
Is it preferable to introduce and use the function blk_cleanup_disk()?

cheers,
valentin
Greg KH March 14, 2022, 11:12 a.m. UTC | #2
On Thu, Mar 10, 2022 at 01:55:25PM +0100, Valentin Kleibel wrote:
> On 10/03/2022 13:26, Greg Kroah-Hartman wrote:
> > On Thu, Mar 10, 2022 at 01:24:38PM +0100, Valentin Kleibel wrote:
> > > On 10/03/2022 13:03, Greg Kroah-Hartman wrote:
> > > > > This patch applies to kernels 5.4 and 5.10.
> > > > 
> > > > We need a fix for Linus's tree first before we can backport anything to
> > > > older kernels.  Does this also work there?
> > > 
> > > It is fixed in Linus' tree starting with 5.14.
> > 
> > What commit fixes it there?  Why not just backport that one only?
> 
> commit 6560ec961a08 (aoe: use blk_mq_alloc_disk and blk_cleanup_disk)
> This commit uses the function blk_cleanup_disk() in freedev() in
> drivers/block/aoe/aoedev.c which fixes the issue.
> The function was introduced in f525464a8000 (block: add blk_alloc_disk and
> blk_cleanup_disk APIs):
> void blk_cleanup_disk(struct gendisk *disk)
> {
> 	blk_cleanup_queue(disk->queue);
> 	put_disk(disk);
> }
> EXPORT_SYMBOL(blk_cleanup_disk);
> 
> I tried to backport the fix to the lts kernels without introducing a new API
> by just adjusting the order of the two function calls.
> Is it preferable to introduce and use the function blk_cleanup_disk()?

I do not know, sorry.  But we prefer the upstream commits as much as
possible as one-off changes like this are almost always buggy and wrong
in the end.

Try doing the backports and see what that looks like please.

thanks,

greg k-h
Valentin Kleibel March 31, 2022, 9:58 a.m. UTC | #3
> I do not know, sorry.  But we prefer the upstream commits as much as
> possible as one-off changes like this are almost always buggy and wrong
> in the end.
> 
> Try doing the backports and see what that looks like please.

I did the backports now but Christoph Hellwig already pointed out:
> No idea what is hidden in bugzilla, but the blk_mq_alloc_disk changes
> aren't easily backportable.
[https://lore.kernel.org/all/20220308060916.GB23629@lst.de/]

Therefore I cherry-picked only the changes for blk_cleanup_disk as they 
are much simpler than the changes to blk_mq_alloc_disk.

I kept the original commit messages, please tell me if you feel they 
should be changed or do so yourself.

Please apply to 5.4 and 5.10.

Cheers,
Valentin

changelog:
v2:
* cherry pick from upstream commits instead of creating a new patch
diff mbox series

Patch

diff --git a/drivers/block/aoe/aoedev.c b/drivers/block/aoe/aoedev.c
index e2ea2356da06..08c98ea724ea 100644
--- a/drivers/block/aoe/aoedev.c
+++ b/drivers/block/aoe/aoedev.c
@@ -277,9 +277,9 @@  freedev(struct aoedev *d)
         if (d->gd) {
                 aoedisk_rm_debugfs(d);
                 del_gendisk(d->gd);
+               blk_cleanup_queue(d->blkq);
                 put_disk(d->gd);
                 blk_mq_free_tag_set(&d->tag_set);
-               blk_cleanup_queue(d->blkq);
         }
         t = d->targets;
         e = t + d->ntargets;