mbox series

[RFC,0/4] Fix gfs2 readahead deadlocks

Message ID 20200702165120.1469875-1-agruenba@redhat.com (mailing list archive)
Headers show
Series Fix gfs2 readahead deadlocks | expand

Message

Andreas Gruenbacher July 2, 2020, 4:51 p.m. UTC
Hi all,

commit d4388340ae0b ("fs: convert mpage_readpages to mpage_readahead")
converted gfs2 and other filesystems from the ->readpages to the
->readahead address space operation.  Due to gfs2 doing its locking in
the ->readpage and ->readahead address space operations rather than at a
higher level, this is leading to deadlocks.  Switching to a trylock
operation in ->readahead improves things but doesn't eliminate all
deadlocks; the only reasonable fix seems to be to lift gfs2's locking to
the ->fault vm operation and ->read_iter file operation.

However, gfs2 includes an optimization that allows reads to be served
out of the page cache without any filesystem locking.  This optimization
is important in concurrent read scenarios.  The best way we could find
to preserve this optimization is by introducing a new IOCB_NOIO flag for
generic_file_read_iter.

Introducing this new flag may be too big a change for 5.8.  So this
patch queue takes a different approach:  it first walks back gfs2's
conversion to ->readahead.  Then it introduces IOCB_NOIO, fixes the
locking, and re-applies the readahead conversion.

Of this patch queue, either only the first patch or all four patches can
be applied to fix gfs2's current issues in 5.8.  Please let me know what
you think.

Thanks,
Andreas

Andreas Gruenbacher (4):
  gfs2: Revert readahead conversion
  fs: Add IOCB_NOIO flag for generic_file_read_iter
  gfs2: Rework read and page fault locking
  gfs2: Reinstate readahead conversion

 fs/gfs2/aops.c     | 45 +------------------------------------
 fs/gfs2/file.c     | 55 ++++++++++++++++++++++++++++++++++++++++++++--
 include/linux/fs.h |  1 +
 mm/filemap.c       | 16 ++++++++++++--
 4 files changed, 69 insertions(+), 48 deletions(-)

Comments

Linus Torvalds July 2, 2020, 6:10 p.m. UTC | #1
On Thu, Jul 2, 2020 at 9:51 AM Andreas Gruenbacher <agruenba@redhat.com> wrote:
>
> Of this patch queue, either only the first patch or all four patches can
> be applied to fix gfs2's current issues in 5.8.  Please let me know what
> you think.

I think the IOCB_NOIO flag looks fine (apart from the nit I pointed
out), abnd we could do that.

However, is the "revert and reinstate" looks odd. Is the reinstate so
different front he original that it makes sense to do that way?

Or was it done that way only to give the choice of just doing the revert?

Because if so, I think I'd rather just see a "fix" rather than
"revert+reinstate".

But I didn't look that closely at the gfs2 code itself, maybe there's
some reason you did it that way.

              Linus
Andreas Gruenbacher July 2, 2020, 6:23 p.m. UTC | #2
On Thu, Jul 2, 2020 at 8:11 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Thu, Jul 2, 2020 at 9:51 AM Andreas Gruenbacher <agruenba@redhat.com> wrote:
> >
> > Of this patch queue, either only the first patch or all four patches can
> > be applied to fix gfs2's current issues in 5.8.  Please let me know what
> > you think.
>
> I think the IOCB_NOIO flag looks fine (apart from the nit I pointed
> out), and we could do that.

Ok, that's a step forward.

> However, is the "revert and reinstate" looks odd. Is the reinstate so
> different from the original that it makes sense to do that way?
>
> Or was it done that way only to give the choice of just doing the revert?
>
> Because if so, I think I'd rather just see a "fix" rather than
> "revert+reinstate".

I only did the "revert and reinstate" so that the revert alone will
give us a working gfs2 in 5.8. If there's agreement to add the
IOCB_NOIO flag, then we can just fix gfs2 (basically
https://lore.kernel.org/linux-fsdevel/20200619093916.1081129-3-agruenba@redhat.com/
with IOCB_CACHED renamed to IOCB_NOIO).

Thanks,
Andreas