Message ID | 20250224081328.18090-1-raphaelsc@scylladb.com (mailing list archive) |
---|---|
State | New |
Series | [v2] mm: Fix error handling in __filemap_get_folio() with FGP_NOWAIT |
On Mon, Feb 24, 2025 at 05:13:28AM -0300, Raphael S. Carvalho wrote:
> +		if (err) {
> +			/* Prevents -ENOMEM from escaping to user space with FGP_NOWAIT */
> +			if ((fgp_flags & FGP_NOWAIT) && err == -ENOMEM)
> +				err = -EAGAIN;
>  			return ERR_PTR(err);

I don't think the comment is all that useful. It's also overly long.

I'd suggest this instead:

			/*
			 * When NOWAIT I/O fails to allocate folios this could
			 * be due to a nonblocking memory allocation and not
			 * because the system actually is out of memory.
			 * Return -EAGAIN so that there caller retries in a
			 * blocking fashion instead of propagating -ENOMEM
			 * to the application.
			 */

On Mon, Feb 24, 2025 at 03:17:44PM +0100, Christoph Hellwig wrote:
> On Mon, Feb 24, 2025 at 05:13:28AM -0300, Raphael S. Carvalho wrote:
> > +		if (err) {
> > +			/* Prevents -ENOMEM from escaping to user space with FGP_NOWAIT */
> > +			if ((fgp_flags & FGP_NOWAIT) && err == -ENOMEM)
> > +				err = -EAGAIN;
> >  			return ERR_PTR(err);
>
> I don't think the comment is all that useful. It's also overly long.
>
> I'd suggest this instead:
>
> 			/*
> 			 * When NOWAIT I/O fails to allocate folios this could
> 			 * be due to a nonblocking memory allocation and not
> 			 * because the system actually is out of memory.
> 			 * Return -EAGAIN so that there caller retries in a
> 			 * blocking fashion instead of propagating -ENOMEM
> 			 * to the application.
> 			 */

I don't think it needs a comment at all, but the memory allocation
might be for something other than folios, so your suggested comment
is misleading.

On Mon, Feb 24, 2025 at 12:33 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Mon, Feb 24, 2025 at 03:17:44PM +0100, Christoph Hellwig wrote:
> > On Mon, Feb 24, 2025 at 05:13:28AM -0300, Raphael S. Carvalho wrote:
> > > +		if (err) {
> > > +			/* Prevents -ENOMEM from escaping to user space with FGP_NOWAIT */
> > > +			if ((fgp_flags & FGP_NOWAIT) && err == -ENOMEM)
> > > +				err = -EAGAIN;
> > >  			return ERR_PTR(err);
> >
> > I don't think the comment is all that useful. It's also overly long.
> >
> > I'd suggest this instead:
> >
> > 			/*
> > 			 * When NOWAIT I/O fails to allocate folios this could
> > 			 * be due to a nonblocking memory allocation and not
> > 			 * because the system actually is out of memory.
> > 			 * Return -EAGAIN so that there caller retries in a
> > 			 * blocking fashion instead of propagating -ENOMEM
> > 			 * to the application.
> > 			 */
>
> I don't think it needs a comment at all, but the memory allocation
> might be for something other than folios, so your suggested comment
> is misleading.

Isn't it all in the context of allocating or adding folio? The reason
behind a comment is to prevent movements in the future that could
cause a similar regression, and also to inform the poor reader that
might be left wondering why we're converting -ENOMEM into -EAGAIN with
FGP_NOWAIT. Can it be slightly adjusted to make it more correct? Or
you really think it's better to remove it completely?

On Mon, Feb 24, 2025 at 12:45:21PM -0300, Raphael S. Carvalho wrote:
> On Mon, Feb 24, 2025 at 12:33 PM Matthew Wilcox <willy@infradead.org> wrote:
> >
> > On Mon, Feb 24, 2025 at 03:17:44PM +0100, Christoph Hellwig wrote:
> > > On Mon, Feb 24, 2025 at 05:13:28AM -0300, Raphael S. Carvalho wrote:
> > > > +		if (err) {
> > > > +			/* Prevents -ENOMEM from escaping to user space with FGP_NOWAIT */
> > > > +			if ((fgp_flags & FGP_NOWAIT) && err == -ENOMEM)
> > > > +				err = -EAGAIN;
> > > >  			return ERR_PTR(err);
> > >
> > > I don't think the comment is all that useful. It's also overly long.
> > >
> > > I'd suggest this instead:
> > >
> > > 			/*
> > > 			 * When NOWAIT I/O fails to allocate folios this could
> > > 			 * be due to a nonblocking memory allocation and not
> > > 			 * because the system actually is out of memory.
> > > 			 * Return -EAGAIN so that there caller retries in a
> > > 			 * blocking fashion instead of propagating -ENOMEM
> > > 			 * to the application.
> > > 			 */
> >
> > I don't think it needs a comment at all, but the memory allocation
> > might be for something other than folios, so your suggested comment
> > is misleading.
>
> Isn't it all in the context of allocating or adding folio? The reason
> behind a comment is to prevent movements in the future that could
> cause a similar regression, and also to inform the poor reader that
> might be left wondering why we're converting -ENOMEM into -EAGAIN with
> FGP_NOWAIT. Can it be slightly adjusted to make it more correct? Or
> you really think it's better to remove it completely?

I really don't think the comment is needed. This is a common mistake
when fixing a bug.

On Mon, Feb 24, 2025 at 12:49 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Mon, Feb 24, 2025 at 12:45:21PM -0300, Raphael S. Carvalho wrote:
> > On Mon, Feb 24, 2025 at 12:33 PM Matthew Wilcox <willy@infradead.org> wrote:
> > >
> > > On Mon, Feb 24, 2025 at 03:17:44PM +0100, Christoph Hellwig wrote:
> > > > On Mon, Feb 24, 2025 at 05:13:28AM -0300, Raphael S. Carvalho wrote:
> > > > > +		if (err) {
> > > > > +			/* Prevents -ENOMEM from escaping to user space with FGP_NOWAIT */
> > > > > +			if ((fgp_flags & FGP_NOWAIT) && err == -ENOMEM)
> > > > > +				err = -EAGAIN;
> > > > >  			return ERR_PTR(err);
> > > >
> > > > I don't think the comment is all that useful. It's also overly long.
> > > >
> > > > I'd suggest this instead:
> > > >
> > > > 			/*
> > > > 			 * When NOWAIT I/O fails to allocate folios this could
> > > > 			 * be due to a nonblocking memory allocation and not
> > > > 			 * because the system actually is out of memory.
> > > > 			 * Return -EAGAIN so that there caller retries in a
> > > > 			 * blocking fashion instead of propagating -ENOMEM
> > > > 			 * to the application.
> > > > 			 */
> > >
> > > I don't think it needs a comment at all, but the memory allocation
> > > might be for something other than folios, so your suggested comment
> > > is misleading.
> >
> > Isn't it all in the context of allocating or adding folio? The reason
> > behind a comment is to prevent movements in the future that could
> > cause a similar regression, and also to inform the poor reader that
> > might be left wondering why we're converting -ENOMEM into -EAGAIN with
> > FGP_NOWAIT. Can it be slightly adjusted to make it more correct? Or
> > you really think it's better to remove it completely?
>
> I really don't think the comment is needed. This is a common mistake
> when fixing a bug.

Ok, so I will proceed with v4 now, removing the comment.

On Mon, Feb 24, 2025 at 12:50:48PM -0300, Raphael S. Carvalho wrote:
> Ok, so I will proceed with v4 now, removing the comment.
No. Give Christoph 24 hours to respond.
On Mon, Feb 24, 2025 at 12:51 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Mon, Feb 24, 2025 at 12:50:48PM -0300, Raphael S. Carvalho wrote:
> > Ok, so I will proceed with v4 now, removing the comment.
>
> No. Give Christoph 24 hours to respond.

I am still getting used to linux development rules / culture. Sure.

On Mon, Feb 24, 2025 at 03:51:58PM +0000, Matthew Wilcox wrote:
> On Mon, Feb 24, 2025 at 12:50:48PM -0300, Raphael S. Carvalho wrote:
> > Ok, so I will proceed with v4 now, removing the comment.
>
> No. Give Christoph 24 hours to respond.

I strongly disagree about not having a comment, because while
Matthew might remember the issue in his head others don't. But
in the end he is the maintainer for the relevant code, so his
opinion counts.

On Mon, Feb 24, 2025 at 03:33:29PM +0000, Matthew Wilcox wrote:
> I don't think it needs a comment at all, but the memory allocation
> might be for something other than folios, so your suggested comment
> is misleading.

Then s/folio/memory/

On Mon, Feb 24, 2025 at 1:02 PM Christoph Hellwig <hch@lst.de> wrote:
>
> On Mon, Feb 24, 2025 at 03:33:29PM +0000, Matthew Wilcox wrote:
> > I don't think it needs a comment at all, but the memory allocation
> > might be for something other than folios, so your suggested comment
> > is misleading.
>
> Then s/folio/memory/

The context of the comment is error handling. ENOMEM can come from
either folio allocation / addition (there's an allocation for xarray
node). So is it really wrong to say folios given the context of the
comment? It's not supposed to be a generic comment, but rather one
that applies to its context.

Maybe this change:
-			 * When NOWAIT I/O fails to allocate folios this could
+			 * When NOWAIT I/O fails to allocate memory for folio

Or perhaps just what hch suggested.

diff --git a/mm/filemap.c b/mm/filemap.c
index 804d7365680c..d7646e73f481 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1986,8 +1986,12 @@ struct folio *__filemap_get_folio(struct address_space *mapping, pgoff_t index,
 		if (err == -EEXIST)
 			goto repeat;
-		if (err)
+		if (err) {
+			/* Prevents -ENOMEM from escaping to user space with FGP_NOWAIT */
+			if ((fgp_flags & FGP_NOWAIT) && err == -ENOMEM)
+				err = -EAGAIN;
 			return ERR_PTR(err);
+		}
 		/*
 		 * filemap_add_folio locks the page, and for mmap
 		 * we expect an unlocked page.

original report:
https://lore.kernel.org/all/CAKhLTr1UL3ePTpYjXOx2AJfNk8Ku2EdcEfu+CH1sf3Asr=B-Dw@mail.gmail.com/T/

When doing buffered writes with FGP_NOWAIT, under memory pressure, the
system returned ENOMEM despite there being plenty of available memory,
to be reclaimed from page cache. The user space used io_uring interface,
which in turn submits I/O with FGP_NOWAIT (the fast path).

retsnoop pointed to iomap_get_folio:

00:34:16.180612 -> 00:34:16.180651 TID/PID 253786/253721 (reactor-1/combined_tests):

                    entry_SYSCALL_64_after_hwframe+0x76
                    do_syscall_64+0x82
                    __do_sys_io_uring_enter+0x265
                    io_submit_sqes+0x209
                    io_issue_sqe+0x5b
                    io_write+0xdd
                    xfs_file_buffered_write+0x84
                    iomap_file_buffered_write+0x1a6
    32us [-ENOMEM]  iomap_write_begin+0x408
iter=&{.inode=0xffff8c67aa031138,.len=4096,.flags=33,.iomap={.addr=0xffffffffffffffff,.length=4096,.type=1,.flags=3,.bdev=0x…
pos=0 len=4096 foliop=0xffffb32c296b7b80
!    4us [-ENOMEM]  iomap_get_folio
iter=&{.inode=0xffff8c67aa031138,.len=4096,.flags=33,.iomap={.addr=0xffffffffffffffff,.length=4096,.type=1,.flags=3,.bdev=0x…
pos=0 len=4096

This is likely a regression caused by 66dabbb65d67 ("mm: return an ERR_PTR
from __filemap_get_folio"), which moved error handling from
iomap_get_folio() to __filemap_get_folio(), but broke FGP_NOWAIT handling, so
ENOMEM is being escaped to user space. Had it correctly returned -EAGAIN
with NOWAIT, either io_uring or user space itself would be able to retry
the request.

It's not enough to patch io_uring since the iomap interface is the one
responsible for it, and pwritev2(RWF_NOWAIT) and AIO interfaces must
return the proper error too.

The patch was tested with scylladb test suite (its original reproducer), and
the tests all pass now when memory is pressured.

Fixes: 66dabbb65d67 ("mm: return an ERR_PTR from __filemap_get_folio")
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
---
 mm/filemap.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)
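
For readers who want to see the intended behaviour in isolation, here is a
small user-space model of the error fix-up and of the retry it enables. It is
only a sketch, not kernel code: MODEL_FGP_NOWAIT, model_fixup_err(),
model_get_folio() and model_write() are hypothetical names made up for this
illustration, and the "needs reclaim" flag is an assumed stand-in for memory
that could be freed from page cache by a blocking allocation.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's FGP_NOWAIT; the value is illustrative only. */
#define MODEL_FGP_NOWAIT 0x20u

/*
 * Models the fix-up from the patch: with NOWAIT set, -ENOMEM is reported
 * as -EAGAIN; every other combination passes through unchanged.
 */
static int model_fixup_err(unsigned int fgp_flags, int err)
{
	if ((fgp_flags & MODEL_FGP_NOWAIT) && err == -ENOMEM)
		return -EAGAIN;
	return err;
}

/*
 * Assumed allocation model: while reclaim is needed, a non-blocking
 * attempt fails with -ENOMEM, whereas a blocking attempt reclaims
 * and succeeds.
 */
static int model_get_folio(unsigned int fgp_flags, bool *needs_reclaim)
{
	int err = 0;

	if (*needs_reclaim) {
		if (fgp_flags & MODEL_FGP_NOWAIT)
			err = -ENOMEM;		/* cannot reclaim without blocking */
		else
			*needs_reclaim = false;	/* blocking path reclaims */
	}
	return model_fixup_err(fgp_flags, err);
}

/* Caller model: try the NOWAIT fast path first, retry blocking on -EAGAIN. */
static int model_write(bool *needs_reclaim)
{
	int err;

	assert(needs_reclaim != NULL);
	err = model_get_folio(MODEL_FGP_NOWAIT, needs_reclaim);
	if (err == -EAGAIN)
		err = model_get_folio(0, needs_reclaim);
	return err;
}
```

In this model the caller only ever sees -EAGAIN from the fast path, so it can
fall back to a blocking attempt; without the fix-up it would see -ENOMEM and
have no reason to retry, which is the failure described in the report.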