diff mbox series

[1/2] iomap: Add a page_prepare callback

Message ID 20190424171804.4305-1-agruenba@redhat.com (mailing list archive)
State New, archived
Headers show
Series [1/2] iomap: Add a page_prepare callback | expand

Commit Message

Andreas Gruenbacher April 24, 2019, 5:18 p.m. UTC
Add a page_prepare calback that's called before a page is written to.  This
will be used by gfs2 to start a transaction in page_prepare and end it in
page_done.  Other filesystems that implement data journaling will require the
same kind of mechanism.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/iomap.c            | 4 ++++
 include/linux/iomap.h | 9 ++++++---
 2 files changed, 10 insertions(+), 3 deletions(-)

Comments

Christoph Hellwig April 25, 2019, 7:59 a.m. UTC | #1
On Wed, Apr 24, 2019 at 07:18:03PM +0200, Andreas Gruenbacher wrote:
> Add a page_prepare calback that's called before a page is written to.  This
> will be used by gfs2 to start a transaction in page_prepare and end it in
> page_done.  Other filesystems that implement data journaling will require the
> same kind of mechanism.

This looks basically fine to me.  But I think it would be nicer to
add a iomap_page_ops structure so that we don't have to add more
pointers directly to the iomap.  We can make that struct pointer const
also to avoid runtime overwriting attacks.
Jan Kara April 25, 2019, 8:32 a.m. UTC | #2
On Wed 24-04-19 19:18:03, Andreas Gruenbacher wrote:
> Add a page_prepare calback that's called before a page is written to.  This
> will be used by gfs2 to start a transaction in page_prepare and end it in
> page_done.  Other filesystems that implement data journaling will require the
> same kind of mechanism.
> 
> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>

Thanks for the patch. Some comments below.

> diff --git a/fs/iomap.c b/fs/iomap.c
> index 97cb9d486a7d..abd9aa76dbd1 100644
> --- a/fs/iomap.c
> +++ b/fs/iomap.c
> @@ -684,6 +684,10 @@ iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, unsigned flags,
>  		status = __block_write_begin_int(page, pos, len, NULL, iomap);
>  	else
>  		status = __iomap_write_begin(inode, pos, len, page, iomap);
> +
> +	if (likely(!status) && iomap->page_prepare)
> +		status = iomap->page_prepare(inode, pos, len, page, iomap);
> +
>  	if (unlikely(status)) {
>  		unlock_page(page);
>  		put_page(page);

So this gets called after a page is locked. Is it OK for GFS2 to acquire
sd_log_flush_lock under page lock? Because e.g. gfs2_write_jdata_pagevec()
seems to acquire these locks the other way around so that could cause ABBA
deadlocks?

Also just looking at the code I was wondering about the following. E.g. in
iomap_write_end() we have code like:

        if (iomap->type == IOMAP_INLINE) {
		foo
	} else if (iomap->flags & IOMAP_F_BUFFER_HEAD) {
		bar
	} else {
		baz
	}

	if (iomap->page_done)
		iomap->page_done(...);

And now something very similar is in iomap_write_begin(). So won't it be
more natural to just mandate ->page_prepare() and ->page_done() callbacks
and each filesystem would set it to a helper function it needs? Probably we
could get rid of IOMAP_F_BUFFER_HEAD flag that way...

								Honza
Christoph Hellwig April 25, 2019, 3:03 p.m. UTC | #3
On Thu, Apr 25, 2019 at 10:32:52AM +0200, Jan Kara wrote:
> Also just looking at the code I was wondering about the following. E.g. in
> iomap_write_end() we have code like:
> 
>         if (iomap->type == IOMAP_INLINE) {
> 		foo
> 	} else if (iomap->flags & IOMAP_F_BUFFER_HEAD) {
> 		bar
> 	} else {
> 		baz
> 	}
> 
> 	if (iomap->page_done)
> 		iomap->page_done(...);
> 
> And now something very similar is in iomap_write_begin(). So won't it be
> more natural to just mandate ->page_prepare() and ->page_done() callbacks
> and each filesystem would set it to a helper function it needs? Probably we
> could get rid of IOMAP_F_BUFFER_HEAD flag that way...

I don't want pointless indirect calls for the default, non-buffer
head case.  Also inline really is a special case independent of
what the caller could pass in as flags or callbacks.  We could try to
hide the buffer_head stuff in there, but then again I'd rather kill
that off sooner than later.
Andreas Gruenbacher April 25, 2019, 3:26 p.m. UTC | #4
On Thu, 25 Apr 2019 at 10:32, Jan Kara <jack@suse.cz> wrote:
> On Wed 24-04-19 19:18:03, Andreas Gruenbacher wrote:
> > Add a page_prepare calback that's called before a page is written to.  This
> > will be used by gfs2 to start a transaction in page_prepare and end it in
> > page_done.  Other filesystems that implement data journaling will require the
> > same kind of mechanism.
> >
> > Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
>
> Thanks for the patch. Some comments below.
>
> > diff --git a/fs/iomap.c b/fs/iomap.c
> > index 97cb9d486a7d..abd9aa76dbd1 100644
> > --- a/fs/iomap.c
> > +++ b/fs/iomap.c
> > @@ -684,6 +684,10 @@ iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, unsigned flags,
> >               status = __block_write_begin_int(page, pos, len, NULL, iomap);
> >       else
> >               status = __iomap_write_begin(inode, pos, len, page, iomap);
> > +
> > +     if (likely(!status) && iomap->page_prepare)
> > +             status = iomap->page_prepare(inode, pos, len, page, iomap);
> > +
> >       if (unlikely(status)) {
> >               unlock_page(page);
> >               put_page(page);
>
> So this gets called after a page is locked. Is it OK for GFS2 to acquire
> sd_log_flush_lock under page lock? Because e.g. gfs2_write_jdata_pagevec()
> seems to acquire these locks the other way around so that could cause ABBA
> deadlocks?

Good catch, the callback indeed needs to happen earlier.

Thanks,
Andreas
diff mbox series

Patch

diff --git a/fs/iomap.c b/fs/iomap.c
index 97cb9d486a7d..abd9aa76dbd1 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -684,6 +684,10 @@  iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, unsigned flags,
 		status = __block_write_begin_int(page, pos, len, NULL, iomap);
 	else
 		status = __iomap_write_begin(inode, pos, len, page, iomap);
+
+	if (likely(!status) && iomap->page_prepare)
+		status = iomap->page_prepare(inode, pos, len, page, iomap);
+
 	if (unlikely(status)) {
 		unlock_page(page);
 		put_page(page);
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index 0fefb5455bda..0982f3e13e56 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -65,10 +65,13 @@  struct iomap {
 	void			*private; /* filesystem private */
 
 	/*
-	 * Called when finished processing a page in the mapping returned in
-	 * this iomap.  At least for now this is only supported in the buffered
-	 * write path.
+	 * Called before / after processing a page in the mapping returned in
+	 * this iomap.  At least for now, this is only supported in the
+	 * buffered write path.  When page_prepare returns 0 for a page,
+	 * page_done is called for that page as well.
 	 */
+	int (*page_prepare)(struct inode *inode, loff_t pos, unsigned len,
+			struct page *page, struct iomap *iomap);
 	void (*page_done)(struct inode *inode, loff_t pos, unsigned copied,
 			struct page *page, struct iomap *iomap);
 };