[RFC,v2,4/5] mm/swapfile: refcount block and queue before using blkcg_schedule_throttle()
diff mbox series

Message ID 20200409214530.2413-5-mcgrof@kernel.org
State New
Headers show
Series
  • blktrace: fix use after free
Related show

Commit Message

Luis Chamberlain April 9, 2020, 9:45 p.m. UTC
block device are refcounted so to ensure once its final user goes away it
can be cleaned up by the lower layers properly. The block device's
request_queue structure is also refcounted, however if the last
blk_put_queue() is called under atomic context the block layer has
to defer removal.

By refcounting the block device during the use of blkcg_schedule_throttle(),
we ensure ensure two things:

1) the block device remains available during the call
2) we ensure avoid having to deal with the fact we're using the
   request_queue structure in atomic context, since the last
   blk_put_queue() will be called upon disk_release(), *after*
   our own bdput().

This means this code path is *not* going to remove the request_queue
structure, as we are ensuring some later upper layer disk_release()
will be the one to release the request_queue structure for us.

Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Omar Sandoval <osandov@fb.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Nicolai Stange <nstange@suse.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: yu kuai <yukuai3@huawei.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 mm/swapfile.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

Comments

Bart Van Assche April 10, 2020, 3:02 a.m. UTC | #1
On 2020-04-09 14:45, Luis Chamberlain wrote:
>  		if (si->bdev) {
> +			bdev = bdgrab(si->bdev);
> +			if (!bdev)
> +				continue;
> +			/*
> +			 * By adding our own bdgrab() we ensure the queue
> +			 * sticks around until disk_release(), and so we ensure
> +			 * our release of the request_queue does not happen in
> +			 * atomic context.
> +			 */
>  			blkcg_schedule_throttle(bdev_get_queue(si->bdev),
>  						true);

How about changing the si->bdev argument of blkcg_schedule_throttle()
into bdev?

Thanks,

Bart.
Luis Chamberlain April 10, 2020, 2:34 p.m. UTC | #2
On Thu, Apr 09, 2020 at 08:02:59PM -0700, Bart Van Assche wrote:
> On 2020-04-09 14:45, Luis Chamberlain wrote:
> >  		if (si->bdev) {
> > +			bdev = bdgrab(si->bdev);
> > +			if (!bdev)
> > +				continue;
> > +			/*
> > +			 * By adding our own bdgrab() we ensure the queue
> > +			 * sticks around until disk_release(), and so we ensure
> > +			 * our release of the request_queue does not happen in
> > +			 * atomic context.
> > +			 */
> >  			blkcg_schedule_throttle(bdev_get_queue(si->bdev),
> >  						true);
> 
> How about changing the si->bdev argument of blkcg_schedule_throttle()
> into bdev?

Sure, thanks!

  Luis

Patch
diff mbox series

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 6659ab563448..5f6f3a61b5c0 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -3753,6 +3753,7 @@  static void free_swap_count_continuations(struct swap_info_struct *si)
 void mem_cgroup_throttle_swaprate(struct mem_cgroup *memcg, int node,
 				  gfp_t gfp_mask)
 {
+	struct block_device *bdev;
 	struct swap_info_struct *si, *next;
 	if (!(gfp_mask & __GFP_IO) || !memcg)
 		return;
@@ -3771,8 +3772,18 @@  void mem_cgroup_throttle_swaprate(struct mem_cgroup *memcg, int node,
 	plist_for_each_entry_safe(si, next, &swap_avail_heads[node],
 				  avail_lists[node]) {
 		if (si->bdev) {
+			bdev = bdgrab(si->bdev);
+			if (!bdev)
+				continue;
+			/*
+			 * By adding our own bdgrab() we ensure the queue
+			 * sticks around until disk_release(), and so we ensure
+			 * our release of the request_queue does not happen in
+			 * atomic context.
+			 */
 			blkcg_schedule_throttle(bdev_get_queue(si->bdev),
 						true);
+			bdput(bdev);
 			break;
 		}
 	}