diff mbox series

[RFC] block: set noio context in submit_bio_noacct_nocheck

Message ID 20240124093941.2259199-1-hch@lst.de (mailing list archive)
State New, archived
Series [RFC] block: set noio context in submit_bio_noacct_nocheck

Commit Message

Christoph Hellwig Jan. 24, 2024, 9:39 a.m. UTC
Make sure all in-line block layer submission runs in noio reclaim
context.  This is a big step towards allowing GFP_NOIO, the other
one would be to have noio (and nofs for that matter) workqueues for
kblockd and driver internal workqueues.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/blk-core.c | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Jens Axboe Jan. 24, 2024, 3:40 p.m. UTC | #1
On 1/24/24 2:39 AM, Christoph Hellwig wrote:
> Make sure all in-line block layer submission runs in noio reclaim
> context.  This is a big step towards allowing GFP_NOIO, the other
> one would be to have noio (and nofs for that matter) workqueues for
> kblockd and driver internal workqueues.

I really don't like adding this for no good reason. Who's doing non NOIO
allocations down from this path?

Christoph Hellwig Jan. 25, 2024, 8:10 a.m. UTC | #2
On Wed, Jan 24, 2024 at 08:40:28AM -0700, Jens Axboe wrote:
> On 1/24/24 2:39 AM, Christoph Hellwig wrote:
> > Make sure all in-line block layer submission runs in noio reclaim
> > context.  This is a big step towards allowing GFP_NOIO, the other
> > one would be to have noio (and nofs for that matter) workqueues for
> > kblockd and driver internal workqueues.
> 
> I really don't like adding this for no good reason. Who's doing non NOIO
> allocations down from this path?

If there is a non-NOIO allocation right now that would be a bug,
although I would not be surprised if we had a few of them.

The reason to add this is a different one:  The MM folks want to
get rid of GFP_NOIO and GFP_NOFS and replace them with these contexts.

And doing this in the submission path and kblockd will cover almost
all of the noio context, with the rest probably covered by other
workqueues.  And this feels a lot less error prone than requiring
every driver to annotate the context in their submission routines.

Jens Axboe Jan. 25, 2024, 4:09 p.m. UTC | #3
On 1/25/24 1:10 AM, Christoph Hellwig wrote:
> On Wed, Jan 24, 2024 at 08:40:28AM -0700, Jens Axboe wrote:
>> On 1/24/24 2:39 AM, Christoph Hellwig wrote:
>>> Make sure all in-line block layer submission runs in noio reclaim
>>> context.  This is a big step towards allowing GFP_NOIO, the other
>>> one would be to have noio (and nofs for that matter) workqueues for
>>> kblockd and driver internal workqueues.
>>
>> I really don't like adding this for no good reason. Who's doing non NOIO
>> allocations down from this path?
> 
> If there is a non-NOIO allocation right now that would be a bug,
> although I would not be surprised if we had a few of them.
> 
> The reason to add this is a different one:  The MM folks want to
> get rid of GFP_NOIO and GFP_NOFS and replace them with these contexts.
> 
> And doing this in the submission path and kblockd will cover almost
> all of the noio context, with the rest probably covered by other
> workqueues.  And this feels a lot less error prone than requiring
> every driver to annotate the context in their submission routines.

I think it'd be much better to add a DEBUG protected aid that checks for
violating allocations. Nothing that isn't buggy should trigger this,
right now, and then we could catch problems if there are any. If we do
the save/restore there and call it good, then we're going to be stuck
with that forever. Regardless of whether it's actually needed or not.

Matthew Wilcox Jan. 25, 2024, 4:11 p.m. UTC | #4
On Thu, Jan 25, 2024 at 09:09:44AM -0700, Jens Axboe wrote:
> On 1/25/24 1:10 AM, Christoph Hellwig wrote:
> > On Wed, Jan 24, 2024 at 08:40:28AM -0700, Jens Axboe wrote:
> >> On 1/24/24 2:39 AM, Christoph Hellwig wrote:
> >>> Make sure all in-line block layer submission runs in noio reclaim
> >>> context.  This is a big step towards allowing GFP_NOIO, the other
> >>> one would be to have noio (and nofs for that matter) workqueues for
> >>> kblockd and driver internal workqueues.
> >>
> >> I really don't like adding this for no good reason. Who's doing non NOIO
> >> allocations down from this path?
> > 
> > If there is a non-NOIO allocation right now that would be a bug,
> > although I would not be surprised if we had a few of them.
> > 
> > The reason to add this is a different one:  The MM folks want to
> > > get rid of GFP_NOIO and GFP_NOFS and replace them with these contexts.
> > 
> > And doing this in the submission path and kblockd will cover almost
> > all of the noio context, with the rest probably covered by other
> > workqueues.  And this feels a lot less error prone than requiring
> > every driver to annotate the context in their submission routines.
> 
> I think it'd be much better to add a DEBUG protected aid that checks for
> violating allocations. Nothing that isn't buggy should trigger this,
> right now, and then we could catch problems if there are any. If we do
> the save/restore there and call it good, then we're going to be stuck
> with that forever. Regardless of whether it's actually needed or not.

Nono, you don't understand.  The plan is to remove GFP_NOIO
entirely.  Allocations should be done with GFP_KERNEL while under a
memalloc_noio_save().
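
To illustrate the scoped approach described above, here is a minimal userspace model, not kernel code. It assumes the kernel's documented behavior that memalloc_noio_save() sets a per-task flag and that the allocator then strips the I/O and filesystem reclaim bits from any request made inside the scope. All MODEL_* constants, their bit values, the task_flags variable, and the model_* function names are hypothetical stand-ins chosen for this sketch.

```c
#include <assert.h>

/* Hypothetical bit values for this model; the kernel's real values differ. */
#define MODEL_GFP_IO           0x1u
#define MODEL_GFP_FS           0x2u
#define MODEL_GFP_RECLAIM      0x4u
#define MODEL_GFP_KERNEL       (MODEL_GFP_RECLAIM | MODEL_GFP_IO | MODEL_GFP_FS)
#define MODEL_GFP_NOIO         (MODEL_GFP_RECLAIM)
#define MODEL_PF_MEMALLOC_NOIO 0x1u

/* Stand-in for current->flags of the running task. */
static unsigned int task_flags;

/* Enter a noio scope; return the previous state of the noio bit. */
static unsigned int model_noio_save(void)
{
	unsigned int old = task_flags & MODEL_PF_MEMALLOC_NOIO;

	task_flags |= MODEL_PF_MEMALLOC_NOIO;
	return old;
}

/* Leave a noio scope by putting the noio bit back to its saved state. */
static void model_noio_restore(unsigned int old)
{
	task_flags = (task_flags & ~MODEL_PF_MEMALLOC_NOIO) | old;
}

/* What the allocator would actually honor for a request in this task. */
static unsigned int model_effective_gfp(unsigned int gfp)
{
	if (task_flags & MODEL_PF_MEMALLOC_NOIO)
		gfp &= ~(MODEL_GFP_IO | MODEL_GFP_FS);
	return gfp;
}
```

In this model, a caller that passes MODEL_GFP_KERNEL between model_noio_save() and model_noio_restore() gets the same effective mask as MODEL_GFP_NOIO, which is why the explicit flag becomes redundant once every relevant path runs inside a scope.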

Jens Axboe Jan. 25, 2024, 4:13 p.m. UTC | #5
On Thu, Jan 25, 2024 at 9:11 AM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Thu, Jan 25, 2024 at 09:09:44AM -0700, Jens Axboe wrote:
> > On 1/25/24 1:10 AM, Christoph Hellwig wrote:
> > > On Wed, Jan 24, 2024 at 08:40:28AM -0700, Jens Axboe wrote:
> > >> On 1/24/24 2:39 AM, Christoph Hellwig wrote:
> > >>> Make sure all in-line block layer submission runs in noio reclaim
> > >>> context.  This is a big step towards allowing GFP_NOIO, the other
> > >>> one would be to have noio (and nofs for that matter) workqueues for
> > >>> kblockd and driver internal workqueues.
> > >>
> > >> I really don't like adding this for no good reason. Who's doing non NOIO
> > >> allocations down from this path?
> > >
> > > If there is a non-NOIO allocation right now that would be a bug,
> > > although I would not be surprised if we had a few of them.
> > >
> > > The reason to add this is a different one:  The MM folks want to
> > > > get rid of GFP_NOIO and GFP_NOFS and replace them with these contexts.
> > >
> > > And doing this in the submission path and kblockd will cover almost
> > > all of the noio context, with the rest probably covered by other
> > > workqueues.  And this feels a lot less error prone than requiring
> > > every driver to annotate the context in their submission routines.
> >
> > I think it'd be much better to add a DEBUG protected aid that checks for
> > violating allocations. Nothing that isn't buggy should trigger this,
> > right now, and then we could catch problems if there are any. If we do
> > the save/restore there and call it good, then we're going to be stuck
> > with that forever. Regardless of whether it's actually needed or not.
>
> Nono, you don't understand.  The plan is to remove GFP_NOIO
> entirely.  Allocations should be done with GFP_KERNEL while under a
> memalloc_noio_save().

I do understand, but thanks for the vote of confidence. Place the
save/restore higher up, most likely actual IO submission isn't going to
be the only (or even major) allocation potentially needed for the IO.

Christoph Hellwig Jan. 26, 2024, 1:52 p.m. UTC | #6
On Thu, Jan 25, 2024 at 09:13:37AM -0700, Jens Axboe wrote:
> > Nono, you don't understand.  The plan is to remove GFP_NOIO
> > entirely.  Allocations should be done with GFP_KERNEL while under a
> > memalloc_noio_save().
> 
> I do understand, but thanks for the vote of confidence. Place the
> save/restore higher up, most likely actual IO submission isn't going to
> be the only (or even major) allocation potentially needed for the IO.

NOIO is defined as allocations that will not recurse into the I/O stack.
So for anything block based, entering the block layer is literally
the defined boundary where it should be used below.  So no, wrapping
every submit_bio into a context annotation doesn't make much sense.

Patch

diff --git a/block/blk-core.c b/block/blk-core.c
index 11342af420d0c4..b85ef8a0fdf6a0 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -691,6 +691,8 @@  static void __submit_bio_noacct_mq(struct bio *bio)
 
 void submit_bio_noacct_nocheck(struct bio *bio)
 {
+	unsigned int noio_flag = memalloc_noio_save();
+
 	blk_cgroup_bio_start(bio);
 	blkcg_bio_issue_init(bio);
 
@@ -715,6 +717,8 @@  void submit_bio_noacct_nocheck(struct bio *bio)
 		__submit_bio_noacct_mq(bio);
 	else
 		__submit_bio_noacct(bio);
+
+	memalloc_noio_restore(noio_flag);
 }
 
 /**
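
The save/restore pair in the patch returns and takes a flag value rather than simply setting and clearing a bit. The following userspace sketch, which is a model and not kernel code, shows why that matters when a caller higher up has already entered a noio scope: the inner restore must not end the outer scope. The names model_submit, model_in_noio_scope, task_flags, and MODEL_PF_MEMALLOC_NOIO are hypothetical and exist only in this example.

```c
#include <assert.h>

/* Hypothetical bit value for this model; the kernel's real value differs. */
#define MODEL_PF_MEMALLOC_NOIO 0x1u

/* Stand-in for current->flags of the running task. */
static unsigned int task_flags;

/* Non-zero if the noio scope was active inside the last model_submit(). */
static int seen_noio_in_submit;

static unsigned int model_noio_save(void)
{
	unsigned int old = task_flags & MODEL_PF_MEMALLOC_NOIO;

	task_flags |= MODEL_PF_MEMALLOC_NOIO;
	return old;
}

static void model_noio_restore(unsigned int old)
{
	task_flags = (task_flags & ~MODEL_PF_MEMALLOC_NOIO) | old;
}

static int model_in_noio_scope(void)
{
	return !!(task_flags & MODEL_PF_MEMALLOC_NOIO);
}

/* Mirrors the shape of the patched function: save, do the work, restore. */
static void model_submit(void)
{
	unsigned int noio_flag = model_noio_save();

	/* Everything called from here runs inside the noio scope. */
	seen_noio_in_submit = model_in_noio_scope();

	model_noio_restore(noio_flag);
}
```

Calling model_submit() from a task with no scope leaves the task outside the scope afterwards, while calling it from inside an outer model_noio_save() leaves the outer scope intact until its own restore.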