[v3,04/11] readahead: rework loop in page_cache_ra_unbounded()

Message ID 20240313170253.2324812-5-kernel@pankajraghav.com (mailing list archive)
State New
Series enable bs > ps in XFS

Commit Message

Pankaj Raghav (Samsung) March 13, 2024, 5:02 p.m. UTC
From: Hannes Reinecke <hare@suse.de>

Rework the loop in page_cache_ra_unbounded() to advance with
the number of pages in a folio instead of just one page at a time.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Co-developed-by: Pankaj Raghav <p.raghav@samsung.com>
Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
---
 mm/readahead.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

Comments

Matthew Wilcox March 25, 2024, 6:41 p.m. UTC | #1
On Wed, Mar 13, 2024 at 06:02:46PM +0100, Pankaj Raghav (Samsung) wrote:
> @@ -239,8 +239,8 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
>  			 * not worth getting one just for that.
>  			 */
>  			read_pages(ractl);
> -			ractl->_index++;
> -			i = ractl->_index + ractl->_nr_pages - index - 1;
> +			ractl->_index += folio_nr_pages(folio);
> +			i = ractl->_index + ractl->_nr_pages - index;
>  			continue;
>  		}
>  
> @@ -252,13 +252,14 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
>  			folio_put(folio);
>  			read_pages(ractl);
>  			ractl->_index++;
> -			i = ractl->_index + ractl->_nr_pages - index - 1;
> +			i = ractl->_index + ractl->_nr_pages - index;
>  			continue;
>  		}

You changed index++ in the first hunk, but not the second hunk.  Is that
intentional?
Pankaj Raghav (Samsung) March 26, 2024, 8:56 a.m. UTC | #2
On Mon, Mar 25, 2024 at 06:41:01PM +0000, Matthew Wilcox wrote:
> On Wed, Mar 13, 2024 at 06:02:46PM +0100, Pankaj Raghav (Samsung) wrote:
> > @@ -239,8 +239,8 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
> >  			 * not worth getting one just for that.
> >  			 */
> >  			read_pages(ractl);
> > -			ractl->_index++;
> > -			i = ractl->_index + ractl->_nr_pages - index - 1;
> > +			ractl->_index += folio_nr_pages(folio);
> > +			i = ractl->_index + ractl->_nr_pages - index;
> >  			continue;
> >  		}
> >  
> > @@ -252,13 +252,14 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
> >  			folio_put(folio);
> >  			read_pages(ractl);
> >  			ractl->_index++;
> > -			i = ractl->_index + ractl->_nr_pages - index - 1;
> > +			i = ractl->_index + ractl->_nr_pages - index;
> >  			continue;
> >  		}
> 
> You changed index++ in the first hunk, but not the second hunk.  Is that
> intentional?

The reason I didn't use folio_nr_pages(folio) in the second hunk is
that we have already `put` the folio, and it is no longer valid to
use folio_nr_pages() on it, right? We take a reference in
filemap_alloc_folio() and drop it if the add fails.

Plus, in the second hunk, adding the order-0 folio failed at that index,
so we just move on to the next index. Once we have min order
support, if adding a min order folio fails, we move by min_order.

And your comment on the next patch:

> Hah, you changed this here.  Please move into previous patch.

We can't do that either because I am introducing the concept of min
order in the next patch.
Hannes Reinecke March 26, 2024, 9:39 a.m. UTC | #3
On 3/25/24 19:41, Matthew Wilcox wrote:
> On Wed, Mar 13, 2024 at 06:02:46PM +0100, Pankaj Raghav (Samsung) wrote:
>> @@ -239,8 +239,8 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
>>   			 * not worth getting one just for that.
>>   			 */
>>   			read_pages(ractl);
>> -			ractl->_index++;
>> -			i = ractl->_index + ractl->_nr_pages - index - 1;
>> +			ractl->_index += folio_nr_pages(folio);
>> +			i = ractl->_index + ractl->_nr_pages - index;
>>   			continue;
>>   		}
>>   
>> @@ -252,13 +252,14 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
>>   			folio_put(folio);
>>   			read_pages(ractl);
>>   			ractl->_index++;
>> -			i = ractl->_index + ractl->_nr_pages - index - 1;
>> +			i = ractl->_index + ractl->_nr_pages - index;
>>   			continue;
>>   		}
> 
> You changed index++ in the first hunk, but not the second hunk.  Is that
> intentional?

Hmm. Looks like you are right; it should be modified, too.
Will be fixing it up.

Cheers,

Hannes
Pankaj Raghav (Samsung) March 26, 2024, 9:44 a.m. UTC | #4
Hi Hannes,

On 26/03/2024 10:39, Hannes Reinecke wrote:
> On 3/25/24 19:41, Matthew Wilcox wrote:
>> On Wed, Mar 13, 2024 at 06:02:46PM +0100, Pankaj Raghav (Samsung) wrote:
>>> @@ -239,8 +239,8 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
>>>                * not worth getting one just for that.
>>>                */
>>>               read_pages(ractl);
>>> -            ractl->_index++;
>>> -            i = ractl->_index + ractl->_nr_pages - index - 1;
>>> +            ractl->_index += folio_nr_pages(folio);
>>> +            i = ractl->_index + ractl->_nr_pages - index;
>>>               continue;
>>>           }
>>>   @@ -252,13 +252,14 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
>>>               folio_put(folio);
>>>               read_pages(ractl);
>>>               ractl->_index++;
>>> -            i = ractl->_index + ractl->_nr_pages - index - 1;
>>> +            i = ractl->_index + ractl->_nr_pages - index;
>>>               continue;
>>>           }
>>
>> You changed index++ in the first hunk, but not the second hunk.  Is that
>> intentional?
> 
> Hmm. Looks like you are right; it should be modified, too.
> Will be fixing it up.
> 
You initially had also in the second hunk:
ractl->_index += folio_nr_pages(folio);

and I changed it to what it is now.

The reason is in my reply to willy:
https://lore.kernel.org/linux-xfs/s4jn4t4betknd3y4ltfccqxyfktzdljiz7klgbqsrccmv3rwrd@orlwjz77oyxo/

Let me know if you agree with it.

> Cheers,
> 
> Hannes
>
Hannes Reinecke March 26, 2024, 10 a.m. UTC | #5
On 3/26/24 10:44, Pankaj Raghav wrote:
> Hi Hannes,
> 
> On 26/03/2024 10:39, Hannes Reinecke wrote:
>> On 3/25/24 19:41, Matthew Wilcox wrote:
>>> On Wed, Mar 13, 2024 at 06:02:46PM +0100, Pankaj Raghav (Samsung) wrote:
>>>> @@ -239,8 +239,8 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
>>>>                 * not worth getting one just for that.
>>>>                 */
>>>>                read_pages(ractl);
>>>> -            ractl->_index++;
>>>> -            i = ractl->_index + ractl->_nr_pages - index - 1;
>>>> +            ractl->_index += folio_nr_pages(folio);
>>>> +            i = ractl->_index + ractl->_nr_pages - index;
>>>>                continue;
>>>>            }
>>>>    @@ -252,13 +252,14 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
>>>>                folio_put(folio);
>>>>                read_pages(ractl);
>>>>                ractl->_index++;
>>>> -            i = ractl->_index + ractl->_nr_pages - index - 1;
>>>> +            i = ractl->_index + ractl->_nr_pages - index;
>>>>                continue;
>>>>            }
>>>
>>> You changed index++ in the first hunk, but not the second hunk.  Is that
>>> intentional?
>>
>> Hmm. Looks like you are right; it should be modified, too.
>> Will be fixing it up.
>>
> You initially had also in the second hunk:
> ractl->_index += folio_nr_pages(folio);
> 
> and I changed it to what it is now.
> 
> The reason is in my reply to willy:
> https://lore.kernel.org/linux-xfs/s4jn4t4betknd3y4ltfccqxyfktzdljiz7klgbqsrccmv3rwrd@orlwjz77oyxo/
> 
> Let me know if you agree with it.
> 
Bah. That really is overly complicated. When we attempt a conversion 
that conversion should be stand-alone, not rely on some other patch 
modifications later on.
We definitely need to work on that to make it easier to review, even
without having to read the mail thread.

Cheers,

Hannes
Pankaj Raghav (Samsung) March 26, 2024, 10:06 a.m. UTC | #6
On 26/03/2024 11:00, Hannes Reinecke wrote:
> On 3/26/24 10:44, Pankaj Raghav wrote:
>> Hi Hannes,
>>
>> On 26/03/2024 10:39, Hannes Reinecke wrote:
>>> On 3/25/24 19:41, Matthew Wilcox wrote:
>>>> On Wed, Mar 13, 2024 at 06:02:46PM +0100, Pankaj Raghav (Samsung) wrote:
>>>>> @@ -239,8 +239,8 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
>>>>>                 * not worth getting one just for that.
>>>>>                 */
>>>>>                read_pages(ractl);
>>>>> -            ractl->_index++;
>>>>> -            i = ractl->_index + ractl->_nr_pages - index - 1;
>>>>> +            ractl->_index += folio_nr_pages(folio);
>>>>> +            i = ractl->_index + ractl->_nr_pages - index;
>>>>>                continue;
>>>>>            }
>>>>>    @@ -252,13 +252,14 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
>>>>>                folio_put(folio);
>>>>>                read_pages(ractl);
>>>>>                ractl->_index++;
>>>>> -            i = ractl->_index + ractl->_nr_pages - index - 1;
>>>>> +            i = ractl->_index + ractl->_nr_pages - index;
>>>>>                continue;
>>>>>            }
>>>>
>>>> You changed index++ in the first hunk, but not the second hunk.  Is that
>>>> intentional?
>>>
>>> Hmm. Looks like you are right; it should be modified, too.
>>> Will be fixing it up.
>>>
>> You initially had also in the second hunk:
>> ractl->_index += folio_nr_pages(folio);
>>
>> and I changed it to what it is now.
>>
>> The reason is in my reply to willy:
>> https://lore.kernel.org/linux-xfs/s4jn4t4betknd3y4ltfccqxyfktzdljiz7klgbqsrccmv3rwrd@orlwjz77oyxo/
>>
>> Let me know if you agree with it.
>>
> Bah. That really is overly complicated. When we attempt a conversion that conversion should be
> stand-alone, not rely on some other patch modifications later on.
> We definitely need to work on that to make it easier to review, even
> without having to read the mail thread.
> 

I don't understand what you mean by overly complicated. This conversion is standalone and it is
wrong to use folio_nr_pages after we `put` the folio. This patch just reworks the loop and in the
next patch I add min order support to readahead.

This patch doesn't depend on the next patch.

> Cheers,
> 
> Hannes
>
Hannes Reinecke March 26, 2024, 10:55 a.m. UTC | #7
On 3/26/24 11:06, Pankaj Raghav wrote:
> On 26/03/2024 11:00, Hannes Reinecke wrote:
>> On 3/26/24 10:44, Pankaj Raghav wrote:
>>> Hi Hannes,
>>>
>>> On 26/03/2024 10:39, Hannes Reinecke wrote:
>>>> On 3/25/24 19:41, Matthew Wilcox wrote:
>>>>> On Wed, Mar 13, 2024 at 06:02:46PM +0100, Pankaj Raghav (Samsung) wrote:
>>>>>> @@ -239,8 +239,8 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
>>>>>>                  * not worth getting one just for that.
>>>>>>                  */
>>>>>>                 read_pages(ractl);
>>>>>> -            ractl->_index++;
>>>>>> -            i = ractl->_index + ractl->_nr_pages - index - 1;
>>>>>> +            ractl->_index += folio_nr_pages(folio);
>>>>>> +            i = ractl->_index + ractl->_nr_pages - index;
>>>>>>                 continue;
>>>>>>             }
>>>>>>     @@ -252,13 +252,14 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
>>>>>>                 folio_put(folio);
>>>>>>                 read_pages(ractl);
>>>>>>                 ractl->_index++;
>>>>>> -            i = ractl->_index + ractl->_nr_pages - index - 1;
>>>>>> +            i = ractl->_index + ractl->_nr_pages - index;
>>>>>>                 continue;
>>>>>>             }
>>>>>
>>>>> You changed index++ in the first hunk, but not the second hunk.  Is that
>>>>> intentional?
>>>>
>>>> Hmm. Looks like you are right; it should be modified, too.
>>>> Will be fixing it up.
>>>>
>>> You initially had also in the second hunk:
>>> ractl->_index += folio_nr_pages(folio);
>>>
>>> and I changed it to what it is now.
>>>
>>> The reason is in my reply to willy:
>>> https://lore.kernel.org/linux-xfs/s4jn4t4betknd3y4ltfccqxyfktzdljiz7klgbqsrccmv3rwrd@orlwjz77oyxo/
>>>
>>> Let me know if you agree with it.
>>>
>> Bah. That really is overly complicated. When we attempt a conversion that conversion should be
>> stand-alone, not rely on some other patch modifications later on.
>> We definitely need to work on that to make it easier to review, even
>> without having to read the mail thread.
>>
> 
> I don't understand what you mean by overly complicated. This conversion is standalone and it is
> wrong to use folio_nr_pages after we `put` the folio. This patch just reworks the loop and in the
> next patch I add min order support to readahead.
> 
> This patch doesn't depend on the next patch.
> 

Let me rephrase: what does 'ractl->_index' signify?
From my understanding it should be the index of the
first folio/page in ractl, right?

If so I find it hard to understand how we _could_ increase it by one;
_index should _always_ be in units of the minimal pagemap size.
And if we don't have it here (as you suggested in the mailthread)
I'd rather move this patch _after_ the minimal pagesize is introduced
to ensure that _index is always incremented by the right amount.

Cheers,

Hannes
Pankaj Raghav (Samsung) March 26, 2024, 1:41 p.m. UTC | #8
On Tue, Mar 26, 2024 at 11:55:06AM +0100, Hannes Reinecke wrote:
> > > Bah. That really is overly complicated. When we attempt a conversion that conversion should be
> > > stand-alone, not rely on some other patch modifications later on.
> > > We definitely need to work on that to make it easier to review, even
> > > without having to read the mail thread.
> > > 
> > 
> > I don't understand what you mean by overly complicated. This conversion is standalone and it is
> > wrong to use folio_nr_pages after we `put` the folio. This patch just reworks the loop and in the
> > next patch I add min order support to readahead.
> > 
> > This patch doesn't depend on the next patch.
> > 
> 
> Let me rephrase: what does 'ractl->_index' signify?
> From my understanding it should be the index of the
> first folio/page in ractl, right?
> 
> If so I find it hard to understand how we _could_ increase it by one; _index
> should _always_ be in units of the minimal pagemap size.

I still have not introduced the minimal pagemap size concept here. That
comes in the next patch. This patch only reworks the loop and should not
have any functional changes. So the minimal pagemap size unit here is 1.

And to your next question, how we could increase it by only one here:

// We come here if we didn't find any folio at index + i
...
folio = filemap_alloc_folio(gfp_mask, 0); // order 0 => 1 page
if (!folio)
	break;
if (filemap_add_folio(mapping, folio, index + i,
			gfp_mask) < 0) {
	folio_put(folio);
	read_pages(ractl);
	ractl->_index++;
	...

If we failed to add a folio of order 0 at (index + i), we put the folio
and start a read_pages() on whatever pages we added so far (ractl->_index to
ractl->_index + ractl->_nr_pages).

read_pages() updates ractl->_index to ractl->_index + ractl->_nr_pages, so
after read_pages() ractl->_index should point to (index + i). As we had an
issue adding a folio of order 0, we skip that index by incrementing
ractl->_index by 1.

Does this clarify? In your original patch, you used folio_nr_pages()
here. As I said before, we already know the size of the folio we tried
to add was 1, so we can just increment by 1, and we should not use the
folio to deduce the size after folio_put(), as that would be a use-after-free.

> And if we don't have it here (as you suggested in the mailthread)
> I'd rather move this patch _after_ the minimal pagesize is introduced
> to ensure that _index is always incremented by the right amount.
> 

I intended to have it as two atomic changes, where a
non-functional change helps with the functional change that comes
later. If it is confusing, I could also combine this with the next
patch?

Or, I could have it as the first patch before I start adding the concept
of folio_min_order. Then it makes it clear that it is intended to be a
non-functional change?
--
Pankaj
Pankaj Raghav (Samsung) March 26, 2024, 3:11 p.m. UTC | #9
On Mon, Mar 25, 2024 at 06:41:01PM +0000, Matthew Wilcox wrote:
> On Wed, Mar 13, 2024 at 06:02:46PM +0100, Pankaj Raghav (Samsung) wrote:
> > @@ -239,8 +239,8 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
> >  			 * not worth getting one just for that.
> >  			 */
> >  			read_pages(ractl);
> > -			ractl->_index++;
> > -			i = ractl->_index + ractl->_nr_pages - index - 1;
> > +			ractl->_index += folio_nr_pages(folio);
> > +			i = ractl->_index + ractl->_nr_pages - index;
> >  			continue;
> >  		}
> >  
> > @@ -252,13 +252,14 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
> >  			folio_put(folio);
> >  			read_pages(ractl);
> >  			ractl->_index++;
> > -			i = ractl->_index + ractl->_nr_pages - index - 1;
> > +			i = ractl->_index + ractl->_nr_pages - index;
> >  			continue;
> >  		}
> 
> You changed index++ in the first hunk, but not the second hunk.  Is that
> intentional?
After having some back and forth with Hannes, I see where the confusion
is coming from.

I intended this to be a non-functional change that helps with adding 
min_order support later.

As this is a non-functional change, I will move this patch to the
start of the series as a preparation patch, before we start adding
min_order helpers and support.

--
Pankaj

Patch

diff --git a/mm/readahead.c b/mm/readahead.c
index 369c70e2be42..37b938f4b54f 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -208,7 +208,7 @@  void page_cache_ra_unbounded(struct readahead_control *ractl,
 	struct address_space *mapping = ractl->mapping;
 	unsigned long index = readahead_index(ractl);
 	gfp_t gfp_mask = readahead_gfp_mask(mapping);
-	unsigned long i;
+	unsigned long i = 0;
 
 	/*
 	 * Partway through the readahead operation, we will have added
@@ -226,7 +226,7 @@  void page_cache_ra_unbounded(struct readahead_control *ractl,
 	/*
 	 * Preallocate as many pages as we will need.
 	 */
-	for (i = 0; i < nr_to_read; i++) {
+	while (i < nr_to_read) {
 		struct folio *folio = xa_load(&mapping->i_pages, index + i);
 
 		if (folio && !xa_is_value(folio)) {
@@ -239,8 +239,8 @@  void page_cache_ra_unbounded(struct readahead_control *ractl,
 			 * not worth getting one just for that.
 			 */
 			read_pages(ractl);
-			ractl->_index++;
-			i = ractl->_index + ractl->_nr_pages - index - 1;
+			ractl->_index += folio_nr_pages(folio);
+			i = ractl->_index + ractl->_nr_pages - index;
 			continue;
 		}
 
@@ -252,13 +252,14 @@  void page_cache_ra_unbounded(struct readahead_control *ractl,
 			folio_put(folio);
 			read_pages(ractl);
 			ractl->_index++;
-			i = ractl->_index + ractl->_nr_pages - index - 1;
+			i = ractl->_index + ractl->_nr_pages - index;
 			continue;
 		}
 		if (i == nr_to_read - lookahead_size)
 			folio_set_readahead(folio);
 		ractl->_workingset |= folio_test_workingset(folio);
-		ractl->_nr_pages++;
+		ractl->_nr_pages += folio_nr_pages(folio);
+		i += folio_nr_pages(folio);
 	}
 
 	/*