diff mbox series

[05/15] mm: allow read-ahead with IOCB_NOWAIT set

Message ID 20200618144355.17324-6-axboe@kernel.dk (mailing list archive)
State New, archived
Headers show
Series [01/15] block: provide plug based way of signaling forced no-wait semantics | expand

Commit Message

Jens Axboe June 18, 2020, 2:43 p.m. UTC
The read-ahead shouldn't block, so allow it to be done even if
IOCB_NOWAIT is set in the kiocb.

Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 mm/filemap.c | 2 --
 1 file changed, 2 deletions(-)

Comments

Dave Chinner June 24, 2020, 1:02 a.m. UTC | #1
On Thu, Jun 18, 2020 at 08:43:45AM -0600, Jens Axboe wrote:
> The read-ahead shouldn't block, so allow it to be done even if
> IOCB_NOWAIT is set in the kiocb.
> 
> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>  mm/filemap.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/mm/filemap.c b/mm/filemap.c
> index f0ae9a6308cb..3378d4fca883 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -2028,8 +2028,6 @@ ssize_t generic_file_buffered_read(struct kiocb *iocb,
>  
>  		page = find_get_page(mapping, index);
>  		if (!page) {
> -			if (iocb->ki_flags & IOCB_NOWAIT)
> -				goto would_block;
>  			page_cache_sync_readahead(mapping,
>  					ra, filp,
>  					index, last_index - index);

Doesn't think break preadv2(RWF_NOWAIT) semantics for on buffered
reads? i.e. this can now block on memory allocation for the page
cache, which is something RWF_NOWAIT IO should not do....

Cheers,

Dave.
Matthew Wilcox June 24, 2020, 1:46 a.m. UTC | #2
On Wed, Jun 24, 2020 at 11:02:53AM +1000, Dave Chinner wrote:
> On Thu, Jun 18, 2020 at 08:43:45AM -0600, Jens Axboe wrote:
> > The read-ahead shouldn't block, so allow it to be done even if
> > IOCB_NOWAIT is set in the kiocb.
> 
> Doesn't think break preadv2(RWF_NOWAIT) semantics for on buffered
> reads? i.e. this can now block on memory allocation for the page
> cache, which is something RWF_NOWAIT IO should not do....

Yes.  This eventually ends up in page_cache_readahead_unbounded()
which gets its gfp flags from readahead_gfp_mask(mapping).

I'd be quite happy to add a gfp_t to struct readahead_control.
The other thing I've been looking into for other reasons is adding
a memalloc_nowait_{save,restore}, which would avoid passing down
the gfp_t.
Dave Chinner June 24, 2020, 4:38 a.m. UTC | #3
On Thu, Jun 18, 2020 at 08:43:45AM -0600, Jens Axboe wrote:
> The read-ahead shouldn't block, so allow it to be done even if
> IOCB_NOWAIT is set in the kiocb.
> 
> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>

BTW, Jens, in case nobody had mentioned it, the Reply-To field for
the patches in this patchset is screwed up:

| Reply-To: Add@vger.kernel.org, support@vger.kernel.org, for@vger.kernel.org,
|         async@vger.kernel.org, buffered@vger.kernel.org,
| 	        reads@vger.kernel.org

Cheers,

Dave.
Jens Axboe June 24, 2020, 3 p.m. UTC | #4
On 6/23/20 7:46 PM, Matthew Wilcox wrote:
> On Wed, Jun 24, 2020 at 11:02:53AM +1000, Dave Chinner wrote:
>> On Thu, Jun 18, 2020 at 08:43:45AM -0600, Jens Axboe wrote:
>>> The read-ahead shouldn't block, so allow it to be done even if
>>> IOCB_NOWAIT is set in the kiocb.
>>
>> Doesn't think break preadv2(RWF_NOWAIT) semantics for on buffered
>> reads? i.e. this can now block on memory allocation for the page
>> cache, which is something RWF_NOWAIT IO should not do....
> 
> Yes.  This eventually ends up in page_cache_readahead_unbounded()
> which gets its gfp flags from readahead_gfp_mask(mapping).
> 
> I'd be quite happy to add a gfp_t to struct readahead_control.
> The other thing I've been looking into for other reasons is adding
> a memalloc_nowait_{save,restore}, which would avoid passing down
> the gfp_t.

That was my first thought, having the memalloc_foo_save/restore for
this. I don't think adding a gfp_t to readahead_control is going
to be super useful, seems like the kind of thing that should be
non-blocking by default.
Jens Axboe June 24, 2020, 3:01 p.m. UTC | #5
On 6/23/20 10:38 PM, Dave Chinner wrote:
> On Thu, Jun 18, 2020 at 08:43:45AM -0600, Jens Axboe wrote:
>> The read-ahead shouldn't block, so allow it to be done even if
>> IOCB_NOWAIT is set in the kiocb.
>>
>> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> 
> BTW, Jens, in case nobody had mentioned it, the Reply-To field for
> the patches in this patchset is screwed up:
> 
> | Reply-To: Add@vger.kernel.org, support@vger.kernel.org, for@vger.kernel.org,
> |         async@vger.kernel.org, buffered@vger.kernel.org,
> | 	        reads@vger.kernel.org

Yeah, I pasted the subject line into the wrong spot for git send-email,
hence the reply-to is boogered, and the subject line was empty for the
cover letter...
Jens Axboe June 24, 2020, 3:35 p.m. UTC | #6
On 6/24/20 9:00 AM, Jens Axboe wrote:
> On 6/23/20 7:46 PM, Matthew Wilcox wrote:
>> On Wed, Jun 24, 2020 at 11:02:53AM +1000, Dave Chinner wrote:
>>> On Thu, Jun 18, 2020 at 08:43:45AM -0600, Jens Axboe wrote:
>>>> The read-ahead shouldn't block, so allow it to be done even if
>>>> IOCB_NOWAIT is set in the kiocb.
>>>
>>> Doesn't think break preadv2(RWF_NOWAIT) semantics for on buffered
>>> reads? i.e. this can now block on memory allocation for the page
>>> cache, which is something RWF_NOWAIT IO should not do....
>>
>> Yes.  This eventually ends up in page_cache_readahead_unbounded()
>> which gets its gfp flags from readahead_gfp_mask(mapping).
>>
>> I'd be quite happy to add a gfp_t to struct readahead_control.
>> The other thing I've been looking into for other reasons is adding
>> a memalloc_nowait_{save,restore}, which would avoid passing down
>> the gfp_t.
> 
> That was my first thought, having the memalloc_foo_save/restore for
> this. I don't think adding a gfp_t to readahead_control is going
> to be super useful, seems like the kind of thing that should be
> non-blocking by default.

We're already doing memalloc_nofs_save/restore in
page_cache_readahead_unbounded(), so I think all we need is to just do a
noio dance in generic_file_buffered_read() and that should be enough.

diff --git a/mm/filemap.c b/mm/filemap.c
index a5b1fa8f7ce4..c29d4b310ed6 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -41,6 +41,7 @@
 #include <linux/delayacct.h>
 #include <linux/psi.h>
 #include <linux/ramfs.h>
+#include <linux/sched/mm.h>
 #include "internal.h"
 
 #define CREATE_TRACE_POINTS
@@ -2011,6 +2012,7 @@ ssize_t generic_file_buffered_read(struct kiocb *iocb,
 	struct address_space *mapping = filp->f_mapping;
 	struct inode *inode = mapping->host;
 	struct file_ra_state *ra = &filp->f_ra;
+	const bool nowait = (iocb->ki_flags & IOCB_NOWAIT) != 0;
 	loff_t *ppos = &iocb->ki_pos;
 	pgoff_t index;
 	pgoff_t last_index;
@@ -2044,9 +2046,15 @@ ssize_t generic_file_buffered_read(struct kiocb *iocb,
 
 		page = find_get_page(mapping, index);
 		if (!page) {
+			unsigned int flags;
+
+			if (nowait)
+				flags = memalloc_noio_save();
 			page_cache_sync_readahead(mapping,
 					ra, filp,
 					index, last_index - index);
+			if (nowait)
+				memalloc_noio_restore(flags);
 			page = find_get_page(mapping, index);
 			if (unlikely(page == NULL))
 				goto no_cached_page;
@@ -2070,7 +2078,7 @@ ssize_t generic_file_buffered_read(struct kiocb *iocb,
 				error = wait_on_page_locked_async(page,
 								iocb->ki_waitq);
 			} else {
-				if (iocb->ki_flags & IOCB_NOWAIT) {
+				if (nowait) {
 					put_page(page);
 					goto would_block;
 				}
@@ -2185,7 +2193,7 @@ ssize_t generic_file_buffered_read(struct kiocb *iocb,
 		}
 
 readpage:
-		if (iocb->ki_flags & IOCB_NOWAIT) {
+		if (nowait) {
 			unlock_page(page);
 			put_page(page);
 			goto would_block;
Matthew Wilcox June 24, 2020, 4:41 p.m. UTC | #7
On Wed, Jun 24, 2020 at 09:35:19AM -0600, Jens Axboe wrote:
> On 6/24/20 9:00 AM, Jens Axboe wrote:
> > On 6/23/20 7:46 PM, Matthew Wilcox wrote:
> >> I'd be quite happy to add a gfp_t to struct readahead_control.
> >> The other thing I've been looking into for other reasons is adding
> >> a memalloc_nowait_{save,restore}, which would avoid passing down
> >> the gfp_t.
> > 
> > That was my first thought, having the memalloc_foo_save/restore for
> > this. I don't think adding a gfp_t to readahead_control is going
> > to be super useful, seems like the kind of thing that should be
> > non-blocking by default.
> 
> We're already doing memalloc_nofs_save/restore in
> page_cache_readahead_unbounded(), so I think all we need is to just do a
> noio dance in generic_file_buffered_read() and that should be enough.

I think we can still sleep though, right?  I was thinking more
like this:

http://git.infradead.org/users/willy/linux.git/shortlog/refs/heads/memalloc
Jens Axboe June 24, 2020, 4:44 p.m. UTC | #8
On 6/24/20 10:41 AM, Matthew Wilcox wrote:
> On Wed, Jun 24, 2020 at 09:35:19AM -0600, Jens Axboe wrote:
>> On 6/24/20 9:00 AM, Jens Axboe wrote:
>>> On 6/23/20 7:46 PM, Matthew Wilcox wrote:
>>>> I'd be quite happy to add a gfp_t to struct readahead_control.
>>>> The other thing I've been looking into for other reasons is adding
>>>> a memalloc_nowait_{save,restore}, which would avoid passing down
>>>> the gfp_t.
>>>
>>> That was my first thought, having the memalloc_foo_save/restore for
>>> this. I don't think adding a gfp_t to readahead_control is going
>>> to be super useful, seems like the kind of thing that should be
>>> non-blocking by default.
>>
>> We're already doing memalloc_nofs_save/restore in
>> page_cache_readahead_unbounded(), so I think all we need is to just do a
>> noio dance in generic_file_buffered_read() and that should be enough.
> 
> I think we can still sleep though, right?  I was thinking more
> like this:
> 
> http://git.infradead.org/users/willy/linux.git/shortlog/refs/heads/memalloc

Yeah, that's probably better. How do we want to handle this? I've already
got the other bits queued up. I can either add them to the series, or
pull a branch that'll go into Linus as well.
Andreas Grünbacher July 7, 2020, 11:38 a.m. UTC | #9
Am Mi., 24. Juni 2020 um 18:48 Uhr schrieb Jens Axboe <axboe@kernel.dk>:
>
> On 6/24/20 10:41 AM, Matthew Wilcox wrote:
> > On Wed, Jun 24, 2020 at 09:35:19AM -0600, Jens Axboe wrote:
> >> On 6/24/20 9:00 AM, Jens Axboe wrote:
> >>> On 6/23/20 7:46 PM, Matthew Wilcox wrote:
> >>>> I'd be quite happy to add a gfp_t to struct readahead_control.
> >>>> The other thing I've been looking into for other reasons is adding
> >>>> a memalloc_nowait_{save,restore}, which would avoid passing down
> >>>> the gfp_t.
> >>>
> >>> That was my first thought, having the memalloc_foo_save/restore for
> >>> this. I don't think adding a gfp_t to readahead_control is going
> >>> to be super useful, seems like the kind of thing that should be
> >>> non-blocking by default.
> >>
> >> We're already doing memalloc_nofs_save/restore in
> >> page_cache_readahead_unbounded(), so I think all we need is to just do a
> >> noio dance in generic_file_buffered_read() and that should be enough.
> >
> > I think we can still sleep though, right?  I was thinking more
> > like this:
> >
> > http://git.infradead.org/users/willy/linux.git/shortlog/refs/heads/memalloc
>
> Yeah, that's probably better. How do we want to handle this? I've already
> got the other bits queued up. I can either add them to the series, or
> pull a branch that'll go into Linus as well.

Also note my conflicting patch that introduces a IOCB_NOIO flag for
fixing a gfs2 regression:

https://lore.kernel.org/linux-fsdevel/20200703095325.1491832-2-agruenba@redhat.com/

Thanks,
Andreas
Jens Axboe July 7, 2020, 2:31 p.m. UTC | #10
On 7/7/20 5:38 AM, Andreas Grünbacher wrote:
> Am Mi., 24. Juni 2020 um 18:48 Uhr schrieb Jens Axboe <axboe@kernel.dk>:
>>
>> On 6/24/20 10:41 AM, Matthew Wilcox wrote:
>>> On Wed, Jun 24, 2020 at 09:35:19AM -0600, Jens Axboe wrote:
>>>> On 6/24/20 9:00 AM, Jens Axboe wrote:
>>>>> On 6/23/20 7:46 PM, Matthew Wilcox wrote:
>>>>>> I'd be quite happy to add a gfp_t to struct readahead_control.
>>>>>> The other thing I've been looking into for other reasons is adding
>>>>>> a memalloc_nowait_{save,restore}, which would avoid passing down
>>>>>> the gfp_t.
>>>>>
>>>>> That was my first thought, having the memalloc_foo_save/restore for
>>>>> this. I don't think adding a gfp_t to readahead_control is going
>>>>> to be super useful, seems like the kind of thing that should be
>>>>> non-blocking by default.
>>>>
>>>> We're already doing memalloc_nofs_save/restore in
>>>> page_cache_readahead_unbounded(), so I think all we need is to just do a
>>>> noio dance in generic_file_buffered_read() and that should be enough.
>>>
>>> I think we can still sleep though, right?  I was thinking more
>>> like this:
>>>
>>> http://git.infradead.org/users/willy/linux.git/shortlog/refs/heads/memalloc
>>
>> Yeah, that's probably better. How do we want to handle this? I've already
>> got the other bits queued up. I can either add them to the series, or
>> pull a branch that'll go into Linus as well.
> 
> Also note my conflicting patch that introduces a IOCB_NOIO flag for
> fixing a gfs2 regression:
> 
> https://lore.kernel.org/linux-fsdevel/20200703095325.1491832-2-agruenba@redhat.com/

Yeah I noticed, pretty easy to resolve though.
Dave Chinner Aug. 10, 2020, 10:56 p.m. UTC | #11
On Wed, Jun 24, 2020 at 10:44:21AM -0600, Jens Axboe wrote:
> On 6/24/20 10:41 AM, Matthew Wilcox wrote:
> > On Wed, Jun 24, 2020 at 09:35:19AM -0600, Jens Axboe wrote:
> >> On 6/24/20 9:00 AM, Jens Axboe wrote:
> >>> On 6/23/20 7:46 PM, Matthew Wilcox wrote:
> >>>> I'd be quite happy to add a gfp_t to struct readahead_control.
> >>>> The other thing I've been looking into for other reasons is adding
> >>>> a memalloc_nowait_{save,restore}, which would avoid passing down
> >>>> the gfp_t.
> >>>
> >>> That was my first thought, having the memalloc_foo_save/restore for
> >>> this. I don't think adding a gfp_t to readahead_control is going
> >>> to be super useful, seems like the kind of thing that should be
> >>> non-blocking by default.
> >>
> >> We're already doing memalloc_nofs_save/restore in
> >> page_cache_readahead_unbounded(), so I think all we need is to just do a
> >> noio dance in generic_file_buffered_read() and that should be enough.
> > 
> > I think we can still sleep though, right?  I was thinking more
> > like this:
> > 
> > http://git.infradead.org/users/willy/linux.git/shortlog/refs/heads/memalloc
> 
> Yeah, that's probably better. How do we want to handle this? I've already
> got the other bits queued up. I can either add them to the series, or
> pull a branch that'll go into Linus as well.

Jens, Willy,

Now that this patch has been merged and IOCB_NOWAIT semantics ifor
buffered reads are broken in Linus' tree, what's the plan to get
this regression fixed before 5.9 releases?

Cheers,

Dave.
Jens Axboe Aug. 10, 2020, 11:03 p.m. UTC | #12
On 8/10/20 4:56 PM, Dave Chinner wrote:
> On Wed, Jun 24, 2020 at 10:44:21AM -0600, Jens Axboe wrote:
>> On 6/24/20 10:41 AM, Matthew Wilcox wrote:
>>> On Wed, Jun 24, 2020 at 09:35:19AM -0600, Jens Axboe wrote:
>>>> On 6/24/20 9:00 AM, Jens Axboe wrote:
>>>>> On 6/23/20 7:46 PM, Matthew Wilcox wrote:
>>>>>> I'd be quite happy to add a gfp_t to struct readahead_control.
>>>>>> The other thing I've been looking into for other reasons is adding
>>>>>> a memalloc_nowait_{save,restore}, which would avoid passing down
>>>>>> the gfp_t.
>>>>>
>>>>> That was my first thought, having the memalloc_foo_save/restore for
>>>>> this. I don't think adding a gfp_t to readahead_control is going
>>>>> to be super useful, seems like the kind of thing that should be
>>>>> non-blocking by default.
>>>>
>>>> We're already doing memalloc_nofs_save/restore in
>>>> page_cache_readahead_unbounded(), so I think all we need is to just do a
>>>> noio dance in generic_file_buffered_read() and that should be enough.
>>>
>>> I think we can still sleep though, right?  I was thinking more
>>> like this:
>>>
>>> http://git.infradead.org/users/willy/linux.git/shortlog/refs/heads/memalloc
>>
>> Yeah, that's probably better. How do we want to handle this? I've already
>> got the other bits queued up. I can either add them to the series, or
>> pull a branch that'll go into Linus as well.
> 
> Jens, Willy,
> 
> Now that this patch has been merged and IOCB_NOWAIT semantics ifor
> buffered reads are broken in Linus' tree, what's the plan to get
> this regression fixed before 5.9 releases?

Not sure where Willy's work went on this topic, but it is on my radar. But
I think we can do something truly simple now that we have IOCB_NOIO:


diff --git a/include/linux/fs.h b/include/linux/fs.h
index bd7ec3eaeed0..f1cca4bfdd7b 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3293,7 +3293,7 @@ static inline int kiocb_set_rw_flags(struct kiocb *ki, rwf_t flags)
 	if (flags & RWF_NOWAIT) {
 		if (!(ki->ki_filp->f_mode & FMODE_NOWAIT))
 			return -EOPNOTSUPP;
-		kiocb_flags |= IOCB_NOWAIT;
+		kiocb_flags |= IOCB_NOWAIT | IOCB_NOIO;
 	}
 	if (flags & RWF_HIPRI)
 		kiocb_flags |= IOCB_HIPRI;
diff mbox series

Patch

diff --git a/mm/filemap.c b/mm/filemap.c
index f0ae9a6308cb..3378d4fca883 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2028,8 +2028,6 @@  ssize_t generic_file_buffered_read(struct kiocb *iocb,
 
 		page = find_get_page(mapping, index);
 		if (!page) {
-			if (iocb->ki_flags & IOCB_NOWAIT)
-				goto would_block;
 			page_cache_sync_readahead(mapping,
 					ra, filp,
 					index, last_index - index);