
[v4,5/9] iomap: Remove unnecessary test from iomap_release_folio()

Message ID: 20230710130253.3484695-6-willy@infradead.org
State: Deferred, archived
Series: Create large folios in iomap buffered write path

Commit Message

Matthew Wilcox July 10, 2023, 1:02 p.m. UTC
The check for the folio being under writeback is unnecessary; the caller
has checked this and the folio is locked, so the folio cannot be under
writeback at this point.

The comment is somewhat misleading in that it talks about one specific
situation in which we can see a dirty folio.  There are others, so change
the comment to explain why we can't release the iomap_page.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 fs/iomap/buffered-io.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)
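
For context, the caller referred to in the commit message is the mm side.
A paraphrased sketch of filemap_release_folio() (mm/filemap.c; details
vary by kernel version) shows why a folio handed to ->release_folio()
cannot be under writeback: the check happens with the folio already
locked, and writeback cannot begin on a locked folio.

	bool filemap_release_folio(struct folio *folio, gfp_t gfp)
	{
		struct address_space * const mapping = folio->mapping;

		/* Writeback cannot start while we hold the folio lock. */
		BUG_ON(!folio_test_locked(folio));
		if (folio_test_writeback(folio))
			return false;

		if (mapping && mapping->a_ops->release_folio)
			return mapping->a_ops->release_folio(folio, gfp);
		return try_to_free_buffers(folio);
	}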

Comments

Darrick J. Wong July 13, 2023, 4:45 a.m. UTC | #1
[add ritesh]

On Mon, Jul 10, 2023 at 02:02:49PM +0100, Matthew Wilcox (Oracle) wrote:
> The check for the folio being under writeback is unnecessary; the caller
> has checked this and the folio is locked, so the folio cannot be under
> writeback at this point.
> 
> The comment is somewhat misleading in that it talks about one specific
> situation in which we can see a dirty folio.  There are others, so change
> the comment to explain why we can't release the iomap_page.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> ---
>  fs/iomap/buffered-io.c | 9 ++++-----
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index 1cb905140528..7aa3009f907f 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -483,12 +483,11 @@ bool iomap_release_folio(struct folio *folio, gfp_t gfp_flags)
>  			folio_size(folio));
>  
>  	/*
> -	 * mm accommodates an old ext3 case where clean folios might
> -	 * not have had the dirty bit cleared.  Thus, it can send actual
> -	 * dirty folios to ->release_folio() via shrink_active_list();
> -	 * skip those here.
> +	 * If the folio is dirty, we refuse to release our metadata because
> +	 * it may be partially dirty.  Once we track per-block dirty state,
> +	 * we can release the metadata if every block is dirty.

Ritesh: I'm assuming that implementing this will be part of your v12 series?

Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D

>  	 */
> -	if (folio_test_dirty(folio) || folio_test_writeback(folio))
> +	if (folio_test_dirty(folio))
>  		return false;
>  	iomap_page_release(folio);
>  	return true;
> -- 
> 2.39.2
>
Ritesh Harjani (IBM) July 13, 2023, 5:25 a.m. UTC | #2
"Darrick J. Wong" <djwong@kernel.org> writes:

> [add ritesh]
>
> On Mon, Jul 10, 2023 at 02:02:49PM +0100, Matthew Wilcox (Oracle) wrote:
>> The check for the folio being under writeback is unnecessary; the caller
>> has checked this and the folio is locked, so the folio cannot be under
>> writeback at this point.
>> 
>> The comment is somewhat misleading in that it talks about one specific
>> situation in which we can see a dirty folio.  There are others, so change
>> the comment to explain why we can't release the iomap_page.
>> 
>> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
>> Reviewed-by: Christoph Hellwig <hch@lst.de>
>> ---
>>  fs/iomap/buffered-io.c | 9 ++++-----
>>  1 file changed, 4 insertions(+), 5 deletions(-)
>> 
>> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
>> index 1cb905140528..7aa3009f907f 100644
>> --- a/fs/iomap/buffered-io.c
>> +++ b/fs/iomap/buffered-io.c
>> @@ -483,12 +483,11 @@ bool iomap_release_folio(struct folio *folio, gfp_t gfp_flags)
>>  			folio_size(folio));
>>  
>>  	/*
>> -	 * mm accommodates an old ext3 case where clean folios might
>> -	 * not have had the dirty bit cleared.  Thus, it can send actual
>> -	 * dirty folios to ->release_folio() via shrink_active_list();
>> -	 * skip those here.
>> +	 * If the folio is dirty, we refuse to release our metadata because
>> +	 * it may be partially dirty.  Once we track per-block dirty state,
>> +	 * we can release the metadata if every block is dirty.
>
> Ritesh: I'm assuming that implementing this will be part of your v12 series?

No; if it's an optimization, then I think we can take it up later,
not in v12, please (I have been doing extensive testing of the current series).
Also, let me understand it a bit more.

@willy,
Is this what you are suggesting? So this is mainly to free up some
memory for the iomap_folio_state structure, right?
But then whenever we do writeback, we would be allocating the
iomap_folio_state again and marking all the bits dirty. Isn't that
sub-optimal?

@@ -489,8 +489,11 @@ bool iomap_release_folio(struct folio *folio, gfp_t gfp_flags)
         * it may be partially dirty.  Once we track per-block dirty state,
         * we can release the metadata if every block is dirty.
         */
-       if (folio_test_dirty(folio))
+       if (folio_test_dirty(folio)) {
+               if (ifs_is_fully_dirty(folio, ifs))
+                       iomap_page_release(folio);
                return false;
+       }
        iomap_page_release(folio);
        return true;
 }

(Ignore the old and new API naming above; it is just to convey the idea.)

-ritesh

>
> Reviewed-by: Darrick J. Wong <djwong@kernel.org>
>
> --D
>
>>  	 */
>> -	if (folio_test_dirty(folio) || folio_test_writeback(folio))
>> +	if (folio_test_dirty(folio))
>>  		return false;
>>  	iomap_page_release(folio);
>>  	return true;
>> -- 
>> 2.39.2
>>
Darrick J. Wong July 13, 2023, 5:33 a.m. UTC | #3
On Thu, Jul 13, 2023 at 10:55:20AM +0530, Ritesh Harjani wrote:
> "Darrick J. Wong" <djwong@kernel.org> writes:
> 
> > [add ritesh]
> >
> > On Mon, Jul 10, 2023 at 02:02:49PM +0100, Matthew Wilcox (Oracle) wrote:
> >> The check for the folio being under writeback is unnecessary; the caller
> >> has checked this and the folio is locked, so the folio cannot be under
> >> writeback at this point.
> >> 
> >> The comment is somewhat misleading in that it talks about one specific
> >> situation in which we can see a dirty folio.  There are others, so change
> >> the comment to explain why we can't release the iomap_page.
> >> 
> >> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> >> Reviewed-by: Christoph Hellwig <hch@lst.de>
> >> ---
> >>  fs/iomap/buffered-io.c | 9 ++++-----
> >>  1 file changed, 4 insertions(+), 5 deletions(-)
> >> 
> >> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> >> index 1cb905140528..7aa3009f907f 100644
> >> --- a/fs/iomap/buffered-io.c
> >> +++ b/fs/iomap/buffered-io.c
> >> @@ -483,12 +483,11 @@ bool iomap_release_folio(struct folio *folio, gfp_t gfp_flags)
> >>  			folio_size(folio));
> >>  
> >>  	/*
> >> -	 * mm accommodates an old ext3 case where clean folios might
> >> -	 * not have had the dirty bit cleared.  Thus, it can send actual
> >> -	 * dirty folios to ->release_folio() via shrink_active_list();
> >> -	 * skip those here.
> >> +	 * If the folio is dirty, we refuse to release our metadata because
> >> +	 * it may be partially dirty.  Once we track per-block dirty state,
> >> +	 * we can release the metadata if every block is dirty.
> >
> > Ritesh: I'm assuming that implementing this will be part of your v12 series?
> 
> No; if it's an optimization, then I think we can take it up later,

<nod>

> not in v12, please (I have been doing extensive testing of the current series).
> Also, let me understand it a bit more.
> 
> @willy,
> Is this what you are suggesting? So this is mainly to free up some
> memory for the iomap_folio_state structure, right?

I think it's also about breaking up compound folios to free base
pages, among other reasons.

https://lore.kernel.org/linux-xfs/20230713044326.GI108251@frogsfrogsfrogs/T/#mc83fe929d57e9aa3c1834232389cad0d62b66e7b

> But then whenever we do writeback, we would be allocating the
> iomap_folio_state again and marking all the bits dirty. Isn't that
> sub-optimal?
> 
> @@ -489,8 +489,11 @@ bool iomap_release_folio(struct folio *folio, gfp_t gfp_flags)
>          * it may be partially dirty.  Once we track per-block dirty state,
>          * we can release the metadata if every block is dirty.
>          */
> -       if (folio_test_dirty(folio))
> +       if (folio_test_dirty(folio)) {
> +               if (ifs_is_fully_dirty(folio, ifs))
> +                       iomap_page_release(folio);
>                 return false;
> +       }

I think it's more that we *don't* break up partially dirty folios:

	/*
	 * Folio is partially dirty, do not throw away the state or
	 * split the folio.
	 */
	if (folio_test_dirty(folio) && !ifs_is_fully_dirty(folio, ifs))
		return false;

	/* No more private state tracking, ok to split folio. */
	iomap_page_release(folio);
	return true;

But breaking up fully dirty folios is now possible, since the mm can
mark all the base pages dirty.
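
For illustration, here is a minimal sketch of what such a helper might
look like (ifs_is_fully_dirty() is a name from the snippets above, not
an existing API), assuming a layout where ifs->state keeps one uptodate
bit per block followed by one dirty bit per block:

	static bool ifs_is_fully_dirty(struct folio *folio,
			struct iomap_folio_state *ifs)
	{
		unsigned int blks = i_blocks_per_folio(folio->mapping->host,
				folio);
		unsigned int i;

		/* Dirty bits are assumed to follow the uptodate bits. */
		for (i = 0; i < blks; i++)
			if (!test_bit(blks + i, ifs->state))
				return false;
		return true;
	}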

--D

>         iomap_page_release(folio);
>         return true;
>  }
> 
> (Ignore the old and new API naming above; it is just to convey the idea.)
> 
> -ritesh
> 
> >
> > Reviewed-by: Darrick J. Wong <djwong@kernel.org>
> >
> > --D
> >
> >>  	 */
> >> -	if (folio_test_dirty(folio) || folio_test_writeback(folio))
> >> +	if (folio_test_dirty(folio))
> >>  		return false;
> >>  	iomap_page_release(folio);
> >>  	return true;
> >> -- 
> >> 2.39.2
> >>
Ritesh Harjani (IBM) July 13, 2023, 5:51 a.m. UTC | #4
"Darrick J. Wong" <djwong@kernel.org> writes:

> On Thu, Jul 13, 2023 at 10:55:20AM +0530, Ritesh Harjani wrote:
>> "Darrick J. Wong" <djwong@kernel.org> writes:
>> 
>> > [add ritesh]
>> >
>> > On Mon, Jul 10, 2023 at 02:02:49PM +0100, Matthew Wilcox (Oracle) wrote:
>> >> The check for the folio being under writeback is unnecessary; the caller
>> >> has checked this and the folio is locked, so the folio cannot be under
>> >> writeback at this point.
>> >> 
>> >> The comment is somewhat misleading in that it talks about one specific
>> >> situation in which we can see a dirty folio.  There are others, so change
>> >> the comment to explain why we can't release the iomap_page.
>> >> 
>> >> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
>> >> Reviewed-by: Christoph Hellwig <hch@lst.de>
>> >> ---
>> >>  fs/iomap/buffered-io.c | 9 ++++-----
>> >>  1 file changed, 4 insertions(+), 5 deletions(-)
>> >> 
>> >> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
>> >> index 1cb905140528..7aa3009f907f 100644
>> >> --- a/fs/iomap/buffered-io.c
>> >> +++ b/fs/iomap/buffered-io.c
>> >> @@ -483,12 +483,11 @@ bool iomap_release_folio(struct folio *folio, gfp_t gfp_flags)
>> >>  			folio_size(folio));
>> >>  
>> >>  	/*
>> >> -	 * mm accommodates an old ext3 case where clean folios might
>> >> -	 * not have had the dirty bit cleared.  Thus, it can send actual
>> >> -	 * dirty folios to ->release_folio() via shrink_active_list();
>> >> -	 * skip those here.
>> >> +	 * If the folio is dirty, we refuse to release our metadata because
>> >> +	 * it may be partially dirty.  Once we track per-block dirty state,
>> >> +	 * we can release the metadata if every block is dirty.
>> >
>> > Ritesh: I'm assuming that implementing this will be part of your v12 series?
>> 
>> No; if it's an optimization, then I think we can take it up later,
>
> <nod>

Thanks! 

>
>> not in v12, please (I have been doing extensive testing of the current series).
>> Also, let me understand it a bit more.
>> 
>> @willy,
>> Is this what you are suggesting? So this is mainly to free up some
>> memory for the iomap_folio_state structure, right?
>
> I think it's also about breaking up compound folios to free base
> pages, among other reasons.
>
> https://lore.kernel.org/linux-xfs/20230713044326.GI108251@frogsfrogsfrogs/T/#mc83fe929d57e9aa3c1834232389cad0d62b66e7b
>
>> But then whenever we do writeback, we would be allocating the
>> iomap_folio_state again and marking all the bits dirty. Isn't that
>> sub-optimal?
>> 
>> @@ -489,8 +489,11 @@ bool iomap_release_folio(struct folio *folio, gfp_t gfp_flags)
>>          * it may be partially dirty.  Once we track per-block dirty state,
>>          * we can release the metadata if every block is dirty.
>>          */
>> -       if (folio_test_dirty(folio))
>> +       if (folio_test_dirty(folio)) {
>> +               if (ifs_is_fully_dirty(folio, ifs))
>> +                       iomap_page_release(folio);
>>                 return false;
>> +       }
>
> I think it's more that we *don't* break up partially dirty folios:
>
> 	/*
> 	 * Folio is partially dirty, do not throw away the state or
> 	 * split the folio.
> 	 */
> 	if (folio_test_dirty(folio) && !ifs_is_fully_dirty(folio, ifs))
> 		return false;
>
> 	/* No more private state tracking, ok to split folio. */
> 	iomap_page_release(folio);
> 	return true;
>

Aah, got it. If the folio is dirty and all its blocks are also dirty,
then we can release the metadata and return true. This will allow the MM
to split the folio, right?

Let me test it, then. Currently ifs_is_fully_dirty() will walk and test
all the bits for dirtiness. However, splitting the folio is unlikely to
be a fast path, so I am assuming it shouldn't have any performance
implications.
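
If that walk ever showed up in profiles, it could presumably be collapsed
into a single bitmap scan, under the same layout assumption as the sketch
above:

	static bool ifs_is_fully_dirty(struct folio *folio,
			struct iomap_folio_state *ifs)
	{
		unsigned int blks = i_blocks_per_folio(folio->mapping->host,
				folio);

		/* True iff no clear bit in the dirty range [blks, 2 * blks). */
		return find_next_zero_bit(ifs->state, 2 * blks, blks) ==
				2 * blks;
	}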


> But breaking up fully dirty folios is now possible, since the mm can
> mark all the base pages dirty.

Thanks. Got it.

-ritesh

>
> --D
>

Patch

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 1cb905140528..7aa3009f907f 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -483,12 +483,11 @@  bool iomap_release_folio(struct folio *folio, gfp_t gfp_flags)
 			folio_size(folio));
 
 	/*
-	 * mm accommodates an old ext3 case where clean folios might
-	 * not have had the dirty bit cleared.  Thus, it can send actual
-	 * dirty folios to ->release_folio() via shrink_active_list();
-	 * skip those here.
+	 * If the folio is dirty, we refuse to release our metadata because
+	 * it may be partially dirty.  Once we track per-block dirty state,
+	 * we can release the metadata if every block is dirty.
 	 */
-	if (folio_test_dirty(folio) || folio_test_writeback(folio))
+	if (folio_test_dirty(folio))
 		return false;
 	iomap_page_release(folio);
 	return true;