diff mbox series

[01/18] mm/readahead: rework loop in page_cache_ra_unbounded()

Message ID 20230918110510.66470-2-hare@suse.de (mailing list archive)
State New, archived
Headers show
Series block: update buffer_head for Large-block I/O | expand

Commit Message

Hannes Reinecke Sept. 18, 2023, 11:04 a.m. UTC
Rework the loop in page_cache_ra_unbounded() to advance with
the number of pages in a folio instead of just one page at a time.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 mm/readahead.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

Comments

Pankaj Raghav Sept. 20, 2023, 11:56 a.m. UTC | #1
On Mon, Sep 18, 2023 at 01:04:53PM +0200, Hannes Reinecke wrote:
>  		if (folio && !xa_is_value(folio)) {
> @@ -239,8 +239,8 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
>  			 * not worth getting one just for that.
>  			 */
>  			read_pages(ractl);
> -			ractl->_index++;
> -			i = ractl->_index + ractl->_nr_pages - index - 1;
> +			ractl->_index += folio_nr_pages(folio);
> +			i = ractl->_index + ractl->_nr_pages - index;
I am not entirely sure if this is correct.

The above if condition only verifies if a folio is in the page cache but
doesn't tell if it is uptodate. But we are advancing the ractl->index
past this folio irrespective of that.

Am I missing something?
Hannes Reinecke Sept. 20, 2023, 2:13 p.m. UTC | #2
On 9/20/23 13:56, Pankaj Raghav wrote:
> On Mon, Sep 18, 2023 at 01:04:53PM +0200, Hannes Reinecke wrote:
>>   		if (folio && !xa_is_value(folio)) {
>> @@ -239,8 +239,8 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
>>   			 * not worth getting one just for that.
>>   			 */
>>   			read_pages(ractl);
>> -			ractl->_index++;
>> -			i = ractl->_index + ractl->_nr_pages - index - 1;
>> +			ractl->_index += folio_nr_pages(folio);
>> +			i = ractl->_index + ractl->_nr_pages - index;
> I am not entirely sure if this is correct.
> 
> The above if condition only verifies if a folio is in the page cache but
> doesn't tell if it is uptodate. But we are advancing the ractl->index
> past this folio irrespective of that.
> 
> Am I missing something?

Confused. Which condition?
I'm not changing the condition at all, just changing the way how the 'i' 
index is calculated during the loop (ie getting rid of the weird 
decrement and increment on 'i' during the loop).
Please clarify.

Cheers,

Hannes
Matthew Wilcox Sept. 20, 2023, 2:18 p.m. UTC | #3
On Wed, Sep 20, 2023 at 01:56:43PM +0200, Pankaj Raghav wrote:
> On Mon, Sep 18, 2023 at 01:04:53PM +0200, Hannes Reinecke wrote:
> >  		if (folio && !xa_is_value(folio)) {
> > @@ -239,8 +239,8 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
> >  			 * not worth getting one just for that.
> >  			 */
> >  			read_pages(ractl);
> > -			ractl->_index++;
> > -			i = ractl->_index + ractl->_nr_pages - index - 1;
> > +			ractl->_index += folio_nr_pages(folio);
> > +			i = ractl->_index + ractl->_nr_pages - index;
> I am not entirely sure if this is correct.
> 
> The above if condition only verifies if a folio is in the page cache but
> doesn't tell if it is uptodate. But we are advancing the ractl->index
> past this folio irrespective of that.
> 
> Am I missing something?

How readahead works?

Readahead is for the optimistic case where nothing has gone wrong;
we just don't have anything in the page cache yet.

If there's a !uptodate folio in the page cache, there are two
possibilities.  The most likely is that we have two threads accessing this
file at the same time.  If so, we should stop and allow the other thread
to do the readahead it has decided to do.  The less likely scenario is
that we had an error doing a previous readahead.  If that happened, we
should not try reading it again in readahead; we should let the thread
retry the read when it actually tries to access the folio.
Pankaj Raghav Sept. 21, 2023, 9:06 a.m. UTC | #4
On 2023-09-20 16:13, Hannes Reinecke wrote:
> On 9/20/23 13:56, Pankaj Raghav wrote:
>> On Mon, Sep 18, 2023 at 01:04:53PM +0200, Hannes Reinecke wrote:
>>>           if (folio && !xa_is_value(folio)) {
>>> @@ -239,8 +239,8 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
>>>                * not worth getting one just for that.
>>>                */
>>>               read_pages(ractl);
>>> -            ractl->_index++;
>>> -            i = ractl->_index + ractl->_nr_pages - index - 1;
>>> +            ractl->_index += folio_nr_pages(folio);
>>> +            i = ractl->_index + ractl->_nr_pages - index;
>> I am not entirely sure if this is correct.
>>
>> The above if condition only verifies if a folio is in the page cache but
>> doesn't tell if it is uptodate. But we are advancing the ractl->index
>> past this folio irrespective of that.
>>
>> Am I missing something?
> 
> Confused. Which condition?
> I'm not changing the condition at all, just changing the way how the 'i' index is calculated during
> the loop (ie getting rid of the weird decrement and increment on 'i' during the loop).
> Please clarify.
> 

I made a mistake of pointing out the wrong line in my reply. I was concerned about the increment to
the ractl->_index:

if (folio && !xa_is_value(folio)) {
....
  read_pages(ractl);
  ractl->_index += folio_nr_pages(folio); // This increment to the ractl->_index
...
}

But I think I was missing this point, as willy explained in his reply:

If there's a !uptodate folio in the page cache, <snip>. If that happened, we
should not try reading it **again in readahead**; we should let the thread
retry the read when it actually tries to access the folio.

Plus your changes optimizes the code path for a large folio where we increment the index by 1 each
time instead of moving the index directly to the next folio in the page cache.

Please ignore the noise!
diff mbox series

Patch

diff --git a/mm/readahead.c b/mm/readahead.c
index e815c114de21..40a5f1f65281 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -208,7 +208,7 @@  void page_cache_ra_unbounded(struct readahead_control *ractl,
 	struct address_space *mapping = ractl->mapping;
 	unsigned long index = readahead_index(ractl);
 	gfp_t gfp_mask = readahead_gfp_mask(mapping);
-	unsigned long i;
+	unsigned long i = 0;
 
 	/*
 	 * Partway through the readahead operation, we will have added
@@ -226,7 +226,7 @@  void page_cache_ra_unbounded(struct readahead_control *ractl,
 	/*
 	 * Preallocate as many pages as we will need.
 	 */
-	for (i = 0; i < nr_to_read; i++) {
+	do {
 		struct folio *folio = xa_load(&mapping->i_pages, index + i);
 
 		if (folio && !xa_is_value(folio)) {
@@ -239,8 +239,8 @@  void page_cache_ra_unbounded(struct readahead_control *ractl,
 			 * not worth getting one just for that.
 			 */
 			read_pages(ractl);
-			ractl->_index++;
-			i = ractl->_index + ractl->_nr_pages - index - 1;
+			ractl->_index += folio_nr_pages(folio);
+			i = ractl->_index + ractl->_nr_pages - index;
 			continue;
 		}
 
@@ -251,15 +251,16 @@  void page_cache_ra_unbounded(struct readahead_control *ractl,
 					gfp_mask) < 0) {
 			folio_put(folio);
 			read_pages(ractl);
-			ractl->_index++;
-			i = ractl->_index + ractl->_nr_pages - index - 1;
+			ractl->_index += folio_nr_pages(folio);
+			i = ractl->_index + ractl->_nr_pages - index;
 			continue;
 		}
 		if (i == nr_to_read - lookahead_size)
 			folio_set_readahead(folio);
 		ractl->_workingset |= folio_test_workingset(folio);
-		ractl->_nr_pages++;
-	}
+		ractl->_nr_pages += folio_nr_pages(folio);
+		i += folio_nr_pages(folio);
+	} while (i < nr_to_read);
 
 	/*
 	 * Now start the IO.  We ignore I/O errors - if the folio is not