diff mbox series

[1/4] gfs2: stop using generic_writepages in gfs2_ail1_start_one

Message ID 20220719041311.709250-2-hch@lst.de (mailing list archive)
State New, archived
Headers show
Series [1/4] gfs2: stop using generic_writepages in gfs2_ail1_start_one | expand

Commit Message

Christoph Hellwig July 19, 2022, 4:13 a.m. UTC
Use filemap_fdatawrite_wbc instead of generic_writepages in
gfs2_ail1_start_one so that the functin can also cope with address_space
operations that only implement ->writepages and to properly account
for cgroup writeback.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/log.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

Comments

Andreas Gruenbacher Jan. 18, 2023, 9:22 p.m. UTC | #1
Hi Christoph,

On Tue, Jul 19, 2022 at 7:07 AM Christoph Hellwig <hch@lst.de> wrote:
>
> Use filemap_fdatawrite_wbc instead of generic_writepages in
> gfs2_ail1_start_one so that the functin can also cope with address_space
> operations that only implement ->writepages and to properly account
> for cgroup writeback.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
> ---
>  fs/gfs2/log.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
> index f0ee3ff6f9a87..a66e3b1f6d178 100644
> --- a/fs/gfs2/log.c
> +++ b/fs/gfs2/log.c
> @@ -131,7 +131,7 @@ __acquires(&sdp->sd_ail_lock)
>                 if (!mapping)
>                         continue;
>                 spin_unlock(&sdp->sd_ail_lock);
> -               ret = generic_writepages(mapping, wbc);
> +               ret = filemap_fdatawrite_wbc(mapping, wbc);

this patch unfortunately breaks journaled data inodes.

We're in function gfs2_ail1_start_one() here, which is usually called
via gfs2_log_flush() -> empty_ail1_list() -> gfs2_ail1_start() ->
gfs2_ail1_flush() -> gfs2_ail1_start_one(), and we're going through
the list of buffer heads in the transaction to be flushed. We used to
submit each dirty buffer for I/O individually, but since commit
5ac048bb7ea6 ("GFS2: Use filemap_fdatawrite() to write back the AIL"),
we're submitting all the dirty pages in the metadata address space
this buffer head belongs to. That's slightly bizarre, but it happens
to catch exactly the same buffer heads that are in the transaction, so
we end up with the same result. From what I'm being told, this was
done as a performance optimization -- but nobody actually knows the
details anymore.

The above change means that instead of calling generic_writepages(),
we end up calling filemap_fdatawrite_wbc() -> do_writepages() ->
mapping->a_ops->writepages(). But that's something completely
different; the writepages address space operation operates is outward
facing, while we really only want to write out the dirty buffers /
pages in the underlying address space. In case of journaled data
inodes, gfs2_jdata_writepages() actually ends up trying to create a
filesystem transaction, which immediately hangs because we're in the
middle of a log flush.

So I'm tempted to revert the following two of your commits; luckily
that's independent from the iomap_writepage() removal:

  d3d71901b1ea ("gfs2: remove ->writepage")
  b2b0a5e97855 ("gfs2: stop using generic_writepages in gfs2_ail1_start_one")

I think we could go through iomap_writepages() instead of
generic_writepages() here as well,  but that's for another day.

As far as cgroup writeback goes, this is journal I/O and I don't have
the faintest idea how the accounting for that is even supposed to
work.

>                 if (need_resched()) {
>                         blk_finish_plug(plug);
>                         cond_resched();
> @@ -222,8 +222,7 @@ void gfs2_ail1_flush(struct gfs2_sbd *sdp, struct writeback_control *wbc)
>         spin_unlock(&sdp->sd_ail_lock);
>         blk_finish_plug(&plug);
>         if (ret) {
> -               gfs2_lm(sdp, "gfs2_ail1_start_one (generic_writepages) "
> -                       "returned: %d\n", ret);
> +               gfs2_lm(sdp, "gfs2_ail1_start_one returned: %d\n", ret);
>                 gfs2_withdraw(sdp);
>         }
>         trace_gfs2_ail_flush(sdp, wbc, 0);
> --
> 2.30.2
>

Thanks,
Andreas
Christoph Hellwig Jan. 19, 2023, 6:22 a.m. UTC | #2
On Wed, Jan 18, 2023 at 10:22:20PM +0100, Andreas Gruenbacher wrote:
> The above change means that instead of calling generic_writepages(),
> we end up calling filemap_fdatawrite_wbc() -> do_writepages() ->
> mapping->a_ops->writepages(). But that's something completely
> different; the writepages address space operation operates is outward
> facing, while we really only want to write out the dirty buffers /
> pages in the underlying address space. In case of journaled data
> inodes, gfs2_jdata_writepages() actually ends up trying to create a
> filesystem transaction, which immediately hangs because we're in the
> middle of a log flush.
> 
> So I'm tempted to revert the following two of your commits; luckily
> that's independent from the iomap_writepage() removal:
> 
>   d3d71901b1ea ("gfs2: remove ->writepage")
>   b2b0a5e97855 ("gfs2: stop using generic_writepages in gfs2_ail1_start_one")

generic_writepages is gone in linux-next, and I'd really like to keep
it that way.  So if you have to do this, please open code it
using write_cache_pages and a direct call to the writepage method of
choice.

> I think we could go through iomap_writepages() instead of
> generic_writepages() here as well,  but that's for another day.

Well, that would obviously be much better, and actually help with the
goal of removing ->writepage.
diff mbox series

Patch

diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index f0ee3ff6f9a87..a66e3b1f6d178 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -131,7 +131,7 @@  __acquires(&sdp->sd_ail_lock)
 		if (!mapping)
 			continue;
 		spin_unlock(&sdp->sd_ail_lock);
-		ret = generic_writepages(mapping, wbc);
+		ret = filemap_fdatawrite_wbc(mapping, wbc);
 		if (need_resched()) {
 			blk_finish_plug(plug);
 			cond_resched();
@@ -222,8 +222,7 @@  void gfs2_ail1_flush(struct gfs2_sbd *sdp, struct writeback_control *wbc)
 	spin_unlock(&sdp->sd_ail_lock);
 	blk_finish_plug(&plug);
 	if (ret) {
-		gfs2_lm(sdp, "gfs2_ail1_start_one (generic_writepages) "
-			"returned: %d\n", ret);
+		gfs2_lm(sdp, "gfs2_ail1_start_one returned: %d\n", ret);
 		gfs2_withdraw(sdp);
 	}
 	trace_gfs2_ail_flush(sdp, wbc, 0);