diff mbox series

[1/2,block/for-6.2] blk-iolatency: Fix memory leak on add_disk() failures

Message ID Y5TQ5gm3O4HXrXR3@slm.duckdns.org (mailing list archive)
State New, archived
Headers show
Series [1/2,block/for-6.2] blk-iolatency: Fix memory leak on add_disk() failures | expand

Commit Message

Tejun Heo Dec. 10, 2022, 6:33 p.m. UTC
When a gendisk is successfully initialized but add_disk() fails such as when
a loop device has invalid number of minor device numbers specified,
blkcg_init_disk() is called during init and then blkcg_exit_disk() during
error handling. Unfortunately, iolatency gets initialized in the former but
doesn't get cleaned up in the latter.

This is because, in non-error cases, the cleanup is performed by
del_gendisk() calling rq_qos_exit(), the assumption being that rq_qos
policies, iolatency being one of them, can only be activated once the disk
is fully registered and visible. That assumption is true for wbt and iocost,
but not so for iolatency as it gets initialized before add_disk() is called.

It is desirable to lazy-init rq_qos policies because they are optional
features and add to hot path overhead once initialized - each IO has to walk
all the registered rq_qos policies. So, we want to switch iolatency to lazy
init too. However, that's a bigger change. As a fix for the immediate
problem, let's just add an extra call to rq_qos_exit() in blkcg_exit_disk().
This is safe because duplicate calls to rq_qos_exit() become noop's.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: darklight2357@icloud.com
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Fixes: d70675121546 ("block: introduce blk-iolatency io controller")
Cc: stable@vger.kernel.org # v4.19+
---
Hello,

I'm posting two patches for the iolatency memory leak issue after add_disk()
failure. This one is the immediate fix and should be really safe. However,
any change has risks and given that the bug being address is not critical at
all, I still think it'd make sense to route it through 6.2-rc1 rather than
applying directly to master for 6.1 release. So, it's tagged for the 6.2
merge window.

Thanks.

 block/blk-cgroup.c |    2 ++
 1 file changed, 2 insertions(+)

Comments

Linus Torvalds Dec. 10, 2022, 6:54 p.m. UTC | #1
On Sat, Dec 10, 2022 at 10:33 AM Tejun Heo <tj@kernel.org> wrote:
>
> I'm posting two patches for the iolatency memory leak issue after add_disk()
> failure. This one is the immediate fix and should be really safe. However,
> any change has risks and given that the bug being address is not critical at
> all, I still think it'd make sense to route it through 6.2-rc1 rather than
> applying directly to master for 6.1 release. So, it's tagged for the 6.2
> merge window.

Ack. I'm archiving these patches, and expect I'll be getting them the
usual ways (ie pull request).

If people expect something else (like me applying them during the
merge window as patches), holler.

            Linus
Christoph Hellwig Dec. 12, 2022, 8:47 a.m. UTC | #2
On Sat, Dec 10, 2022 at 08:33:10AM -1000, Tejun Heo wrote:
> When a gendisk is successfully initialized but add_disk() fails such as when
> a loop device has invalid number of minor device numbers specified,
> blkcg_init_disk() is called during init and then blkcg_exit_disk() during
> error handling. Unfortunately, iolatency gets initialized in the former but
> doesn't get cleaned up in the latter.
> 
> This is because, in non-error cases, the cleanup is performed by
> del_gendisk() calling rq_qos_exit(), the assumption being that rq_qos
> policies, iolatency being one of them, can only be activated once the disk
> is fully registered and visible. That assumption is true for wbt and iocost,
> but not so for iolatency as it gets initialized before add_disk() is called.
> 
> It is desirable to lazy-init rq_qos policies because they are optional
> features and add to hot path overhead once initialized - each IO has to walk
> all the registered rq_qos policies. So, we want to switch iolatency to lazy
> init too. However, that's a bigger change. As a fix for the immediate
> problem, let's just add an extra call to rq_qos_exit() in blkcg_exit_disk().
> This is safe because duplicate calls to rq_qos_exit() become noop's.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Reported-by: darklight2357@icloud.com

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>
Jens Axboe Dec. 14, 2022, 7:43 p.m. UTC | #3
On Sat, 10 Dec 2022 08:33:10 -1000, Tejun Heo wrote:
> When a gendisk is successfully initialized but add_disk() fails such as when
> a loop device has invalid number of minor device numbers specified,
> blkcg_init_disk() is called during init and then blkcg_exit_disk() during
> error handling. Unfortunately, iolatency gets initialized in the former but
> doesn't get cleaned up in the latter.
> 
> This is because, in non-error cases, the cleanup is performed by
> del_gendisk() calling rq_qos_exit(), the assumption being that rq_qos
> policies, iolatency being one of them, can only be activated once the disk
> is fully registered and visible. That assumption is true for wbt and iocost,
> but not so for iolatency as it gets initialized before add_disk() is called.
> 
> [...]

Applied, thanks!

[1/2] blk-iolatency: Fix memory leak on add_disk() failures
      commit: 813e693023ba10da9e75067780f8378465bf27cc

Best regards,
diff mbox series

Patch

--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -33,6 +33,7 @@ 
 #include "blk-cgroup.h"
 #include "blk-ioprio.h"
 #include "blk-throttle.h"
+#include "blk-rq-qos.h"
 
 /*
  * blkcg_pol_mutex protects blkcg_policy[] and policy [de]activation.
@@ -1322,6 +1323,7 @@  err_unlock:
 void blkcg_exit_disk(struct gendisk *disk)
 {
 	blkg_destroy_all(disk);
+	rq_qos_exit(disk->queue);
 	blk_throtl_exit(disk);
 }