[02/17] iomap: Protect read_bytes_pending with the state_lock

Message ID 20230915183707.2707298-3-willy@infradead.org (mailing list archive)
State New, archived
Series Add folio_end_read

Commit Message

Matthew Wilcox Sept. 15, 2023, 6:36 p.m. UTC
Perform one atomic operation (acquiring the spinlock) instead of
two (spinlock & atomic_sub) per read completion.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/iomap/buffered-io.c | 35 ++++++++++++++++++++++++++---------
 1 file changed, 26 insertions(+), 9 deletions(-)

Comments

Linus Torvalds Sept. 16, 2023, 12:11 a.m. UTC | #1
On Fri, 15 Sept 2023 at 11:37, Matthew Wilcox (Oracle)
<willy@infradead.org> wrote:
>
> Perform one atomic operation (acquiring the spinlock) instead of
> two (spinlock & atomic_sub) per read completion.

I think this may be a worthwhile thing to do, but...

> -static void iomap_finish_folio_read(struct folio *folio, size_t offset,
> +static void iomap_finish_folio_read(struct folio *folio, size_t off,

this function really turns into a mess.

The diff is hard to read, and I'm not talking about the 'offset' ->
'off' part, but about how about half of the function now has various
'if (ifs)' tests spread out.

And I think it actually hides what is going on.

If you decide to combine all the "if (ifs)" parts on one side, and
then simplify the end result, you actually end up with a much
easier-to-read function.

I think it ends up looking like this:

  static void iomap_finish_folio_read(struct folio *folio, size_t off,
                  size_t len, int error)
  {
        struct iomap_folio_state *ifs = folio->private;
        bool uptodate = true;
        bool finished = true;

        if (ifs) {
                unsigned long flags;

                spin_lock_irqsave(&ifs->state_lock, flags);

                if (!error)
                        uptodate = ifs_set_range_uptodate(folio, ifs,
                                        off, len);

                ifs->read_bytes_pending -= len;
                finished = !ifs->read_bytes_pending;
                spin_unlock_irqrestore(&ifs->state_lock, flags);
        }

        if (unlikely(error))
                folio_set_error(folio);
        else if (uptodate)
                folio_mark_uptodate(folio);
        if (finished)
                folio_unlock(folio);
  }

but that was just a quick hack-work by me (the above does, for
example, depend on folio_mark_uptodate() not needing the
ifs->state_lock, so the shared parts then got moved out).

I think the above looks a *lot* more legible than having three
different versions of "if (ifs)" spread out in the function, and it
also makes the parts that are actually protected by ifs->state_lock a
lot more obvious.

But again: I looked at your patch, found it very hard to follow, and
then decided to quickly do a "what happens if I apply the patch and
then try to simplify the result".

I might have made some simplification error. But please give that a look, ok?

              Linus
Linus Torvalds Sept. 16, 2023, 12:15 a.m. UTC | #2
On Fri, 15 Sept 2023 at 17:11, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
[...]
>         if (unlikely(error))
>                 folio_set_error(folio);
>         else if (uptodate)
>                 folio_mark_uptodate(folio);
>         if (finished)
>                 folio_unlock(folio);
>   }

Note that this then becomes

        if (unlikely(error))
                folio_set_error(folio);
        if (finished)
                folio_end_read(folio, uptodate && !error);
  }

but that change would happen later, in patch 6/17.

             Linus
Matthew Wilcox Sept. 16, 2023, 3:50 p.m. UTC | #3
On Fri, Sep 15, 2023 at 05:11:55PM -0700, Linus Torvalds wrote:
> I think it ends up looking like this:
> 
>   static void iomap_finish_folio_read(struct folio *folio, size_t off,
>                   size_t len, int error)
>   {
>         struct iomap_folio_state *ifs = folio->private;
>         bool uptodate = true;
>         bool finished = true;
> 
>         if (ifs) {
>                 unsigned long flags;
> 
>                 spin_lock_irqsave(&ifs->state_lock, flags);
> 
>                 if (!error)
>                 uptodate = ifs_set_range_uptodate(folio, ifs,
>                                 off, len);
> 
>                 ifs->read_bytes_pending -= len;
>                 finished = !ifs->read_bytes_pending;
>                 spin_unlock_irqrestore(&ifs->state_lock, flags);
>         }
> 
>         if (unlikely(error))
>                 folio_set_error(folio);
>         else if (uptodate)
>                 folio_mark_uptodate(folio);
>         if (finished)
>                 folio_unlock(folio);
>   }
> 
> but that was just a quick hack-work by me (the above does, for
> example, depend on folio_mark_uptodate() not needing the
> ifs->state_lock, so the shared parts then got moved out).

I really like this.  One tweak compared to your version:

        bool uptodate = !error;
...
        if (error)
                folio_set_error(folio);
        if (uptodate)
                folio_mark_uptodate(folio);
        if (finished)
                folio_unlock(folio);

... and then the later patch becomes

	if (finished)
		folio_end_read(folio, uptodate);
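For reference, folio_end_read() (added later in this series) folds marking the folio uptodate and unlocking it into a single atomic update of the folio flags word. A rough userspace sketch of that idea (illustrative bit values and struct; waiter wakeup and the real memory-ordering details are omitted):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Illustrative flag bits; the real kernel values differ. */
#define PG_locked_bit	0
#define PG_uptodate_bit	1

struct folio_sketch {
	_Atomic unsigned long flags;
};

/* One atomic RMW both sets (or leaves clear) the uptodate bit and
 * clears the lock bit -- versus folio_mark_uptodate() followed by
 * folio_unlock(), which needs two atomic operations. */
static void folio_end_read_sketch(struct folio_sketch *folio, bool uptodate)
{
	unsigned long mask = 1UL << PG_locked_bit;

	if (uptodate) {
		/* Setting uptodate and clearing locked in a single xor is
		 * valid because locked is known set and uptodate known clear
		 * at read completion. */
		mask |= 1UL << PG_uptodate_bit;
		atomic_fetch_xor(&folio->flags, mask);
	} else {
		atomic_fetch_and(&folio->flags, ~mask);
	}
}
```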

Patch

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 4c05fd457ee7..cade15b70627 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -29,9 +29,9 @@  typedef int (*iomap_punch_t)(struct inode *inode, loff_t offset, loff_t length);
  * and I/O completions.
  */
 struct iomap_folio_state {
-	atomic_t		read_bytes_pending;
-	atomic_t		write_bytes_pending;
 	spinlock_t		state_lock;
+	unsigned int		read_bytes_pending;
+	atomic_t		write_bytes_pending;
 
 	/*
 	 * Each block has two bits in this bitmap:
@@ -183,7 +183,7 @@  static void ifs_free(struct folio *folio)
 
 	if (!ifs)
 		return;
-	WARN_ON_ONCE(atomic_read(&ifs->read_bytes_pending));
+	WARN_ON_ONCE(ifs->read_bytes_pending != 0);
 	WARN_ON_ONCE(atomic_read(&ifs->write_bytes_pending));
 	WARN_ON_ONCE(ifs_is_fully_uptodate(folio, ifs) !=
 			folio_test_uptodate(folio));
@@ -250,19 +250,33 @@  static void iomap_adjust_read_range(struct inode *inode, struct folio *folio,
 	*lenp = plen;
 }
 
-static void iomap_finish_folio_read(struct folio *folio, size_t offset,
+static void iomap_finish_folio_read(struct folio *folio, size_t off,
 		size_t len, int error)
 {
 	struct iomap_folio_state *ifs = folio->private;
+	unsigned long flags;
+	bool uptodate;
+	bool finished = true;
+
+	if (ifs)
+		spin_lock_irqsave(&ifs->state_lock, flags);
 
 	if (unlikely(error)) {
-		folio_clear_uptodate(folio);
+		uptodate = false;
 		folio_set_error(folio);
 	} else {
-		iomap_set_range_uptodate(folio, offset, len);
+		uptodate = !ifs || ifs_set_range_uptodate(folio, ifs, off, len);
 	}
 
-	if (!ifs || atomic_sub_and_test(len, &ifs->read_bytes_pending))
+	if (ifs) {
+		ifs->read_bytes_pending -= len;
+		finished = !ifs->read_bytes_pending;
+		spin_unlock_irqrestore(&ifs->state_lock, flags);
+	}
+
+	if (uptodate)
+		folio_mark_uptodate(folio);
+	if (finished)
 		folio_unlock(folio);
 }
 
@@ -360,8 +374,11 @@  static loff_t iomap_readpage_iter(const struct iomap_iter *iter,
 	}
 
 	ctx->cur_folio_in_bio = true;
-	if (ifs)
-		atomic_add(plen, &ifs->read_bytes_pending);
+	if (ifs) {
+		spin_lock_irq(&ifs->state_lock);
+		ifs->read_bytes_pending += plen;
+		spin_unlock_irq(&ifs->state_lock);
+	}
 
 	sector = iomap_sector(iomap, pos);
 	if (!ctx->bio ||