
[v2] readahead: Correct the start and size in ondemand_readahead()

Message ID 20230628044303.1412624-1-fengwei.yin@intel.com (mailing list archive)
State New, archived

Commit Message

Yin Fengwei June 28, 2023, 4:43 a.m. UTC
Commit
9425c591e06a ("page cache: fix page_cache_next/prev_miss off by one")
changed page_cache_next_miss() to return an index beyond the
range when no gap is found.

That breaks the start/size of ra in ondemand_readahead(), because
the extra offset of one is accumulated into readahead_index. As a
consequence, a suboptimal readahead order is picked.

With 9425c591e06a9, tracing the order parameter of filemap_alloc_folio()
showed:
     page order    : count     distribution
        0          : 892073   |                                        |
        1          : 0        |                                        |
        2          : 65120457 |****************************************|
        3          : 32914005 |********************                    |
        4          : 33020991 |********************                    |

With parent commit:
     page order    : count     distribution
        0          : 3417288  |****                                    |
        1          : 0        |                                        |
        2          : 877012   |*                                       |
        3          : 288      |                                        |
        4          : 5607522  |*******                                 |
        5          : 29974228 |****************************************|

Fix the issue by removing the offset of one when page_cache_next_miss()
finds no gap in the range.

After the fix:
    page order     : count     distribution
        0          : 2598561  |***                                     |
        1          : 0        |                                        |
        2          : 687739   |                                        |
        3          : 288      |                                        |
        4          : 207210   |                                        |
        5          : 32628260 |****************************************|

Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202306211346.1e9ff03e-oliver.sang@intel.com
Fixes: 9425c591e06a ("page cache: fix page_cache_next/prev_miss off by one")
Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
---
Changes from v1:
  - Only remove the offset of one when no gap is found by
    page_cache_next_miss()
  - Update the commit message to include the histogram of page order
    after the fix

 mm/readahead.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

Comments

Mike Kravetz July 3, 2023, 6:49 p.m. UTC | #1
On 06/28/23 12:43, Yin Fengwei wrote:
> The commit
> 9425c591e06a ("page cache: fix page_cache_next/prev_miss off by one")
> updated the page_cache_next_miss() to return the index beyond
> range.
> 
> But it breaks the start/size of ra in ondemand_readahead() because
> the offset by one is accumulated to readahead_index. As a consequence,
> not best readahead order is picked.
> 
> Tracing of the order parameter of filemap_alloc_folio() showed:
>      page order    : count     distribution
>         0          : 892073   |                                        |
>         1          : 0        |                                        |
>         2          : 65120457 |****************************************|
>         3          : 32914005 |********************                    |
>         4          : 33020991 |********************                    |
> with 9425c591e06a9.
> 
> With parent commit:
>      page order    : count     distribution
>         0          : 3417288  |****                                    |
>         1          : 0        |                                        |
>         2          : 877012   |*                                       |
>         3          : 288      |                                        |
>         4          : 5607522  |*******                                 |
>         5          : 29974228 |****************************************|
> 
> Fix the issue by removing the offset by one when page_cache_next_miss()
> returns no gaps in the range.
> 
> After the fix:
>     page order     : count     distribution
>         0          : 2598561  |***                                     |
>         1          : 0        |                                        |
>         2          : 687739   |                                        |
>         3          : 288      |                                        |
>         4          : 207210   |                                        |
>         5          : 32628260 |****************************************|
> 

Thank you for your detailed analysis!

When the regression was initially discovered, I sent a patch to revert
commit 9425c591e06a.  Andrew has picked up this change.  And, Andrew has
also picked up this patch.

I have not verified yet, but I suspect that this patch is going to cause
a regression because it depends on the behavior of page_cache_next_miss
in 9425c591e06a which has been reverted.

Sorry for the delay in responding as I was traveling.

Yin Fengwei July 4, 2023, 1:41 a.m. UTC | #2
On 7/4/2023 2:49 AM, Mike Kravetz wrote:
> On 06/28/23 12:43, Yin Fengwei wrote:
>> The commit
>> 9425c591e06a ("page cache: fix page_cache_next/prev_miss off by one")
>> updated the page_cache_next_miss() to return the index beyond
>> range.
>>
>> But it breaks the start/size of ra in ondemand_readahead() because
>> the offset by one is accumulated to readahead_index. As a consequence,
>> not best readahead order is picked.
>>
>> Tracing of the order parameter of filemap_alloc_folio() showed:
>>      page order    : count     distribution
>>         0          : 892073   |                                        |
>>         1          : 0        |                                        |
>>         2          : 65120457 |****************************************|
>>         3          : 32914005 |********************                    |
>>         4          : 33020991 |********************                    |
>> with 9425c591e06a9.
>>
>> With parent commit:
>>      page order    : count     distribution
>>         0          : 3417288  |****                                    |
>>         1          : 0        |                                        |
>>         2          : 877012   |*                                       |
>>         3          : 288      |                                        |
>>         4          : 5607522  |*******                                 |
>>         5          : 29974228 |****************************************|
>>
>> Fix the issue by removing the offset by one when page_cache_next_miss()
>> returns no gaps in the range.
>>
>> After the fix:
>>     page order     : count     distribution
>>         0          : 2598561  |***                                     |
>>         1          : 0        |                                        |
>>         2          : 687739   |                                        |
>>         3          : 288      |                                        |
>>         4          : 207210   |                                        |
>>         5          : 32628260 |****************************************|
>>
> 
> Thank you for your detailed analysis!
> 
> When the regression was initially discovered, I sent a patch to revert
> commit 9425c591e06a.  Andrew has picked up this change.  And, Andrew has
> also picked up this patch.
Oh, I didn't notice that you had sent a revert patch. My understanding is that
commit 9425c591e06a is a good change.

> 
> I have not verified yet, but I suspect that this patch is going to cause
> a regression because it depends on the behavior of page_cache_next_miss
> in 9425c591e06a which has been reverted.
Yes. If 9425c591e06a is reverted, this patch could introduce a regression.
Which fix do you prefer: reverting 9425c591e06a, or this patch? Then we
can suggest that Andrew take it.


Regards
Yin, Fengwei

Mike Kravetz July 5, 2023, 4:52 p.m. UTC | #3
On 07/04/23 09:41, Yin, Fengwei wrote:
> On 7/4/2023 2:49 AM, Mike Kravetz wrote:
> > On 06/28/23 12:43, Yin Fengwei wrote:
> > 
> > Thank you for your detailed analysis!
> > 
> > When the regression was initially discovered, I sent a patch to revert
> > commit 9425c591e06a.  Andrew has picked up this change.  And, Andrew has
> > also picked up this patch.
> Oh. I didn't notice that you sent revert patch. My understanding is that
> commit 9425c591e06a is a good change.
> 
> > 
> > I have not verified yet, but I suspect that this patch is going to cause
> > a regression because it depends on the behavior of page_cache_next_miss
> > in 9425c591e06a which has been reverted.
> Yes. If the 9425c591e06a was reverted, this patch could introduce regression.
> Which fixing do you prefer? reverting 9425c591e06a or this patch? Then we
> can suggest to Andrew to take it.

For now, I suggest we go with the revert.  Why?
- The revert is already going into stable trees.
- I may not be remembering correctly, but I seem to recall Matthew
  mentioning plans to redo/redesign the page cache and possibly
  readahead code.  If this is the case, then better to keep the legacy
  behavior for now.  But, I am not sure if this is actually part of any
  plan or work in progress.

Yin Fengwei July 6, 2023, 1:32 a.m. UTC | #4
On 7/6/23 00:52, Mike Kravetz wrote:
> On 07/04/23 09:41, Yin, Fengwei wrote:
>> On 7/4/2023 2:49 AM, Mike Kravetz wrote:
>>> On 06/28/23 12:43, Yin Fengwei wrote:
>>>
>>> Thank you for your detailed analysis!
>>>
>>> When the regression was initially discovered, I sent a patch to revert
>>> commit 9425c591e06a.  Andrew has picked up this change.  And, Andrew has
>>> also picked up this patch.
>> Oh. I didn't notice that you sent revert patch. My understanding is that
>> commit 9425c591e06a is a good change.
>>
>>>
>>> I have not verified yet, but I suspect that this patch is going to cause
>>> a regression because it depends on the behavior of page_cache_next_miss
>>> in 9425c591e06a which has been reverted.
>> Yes. If the 9425c591e06a was reverted, this patch could introduce regression.
>> Which fixing do you prefer? reverting 9425c591e06a or this patch? Then we
>> can suggest to Andrew to take it.
> 
> For now, I suggest we go with the revert.  Why?
> - The revert is already going into stable trees.
> - I may not be remembering correctly, but I seem to recall Matthew
>   mentioning plans to redo/redesign the page cache and possibly
>   readahead code.  If this is the case, then better to keep the legacy
>   behavior for now.  But, I am not sure if this is actually part of any
>   plan or work in progress.
> 
That's fine with me, and thanks a lot for the detailed explanation.


Hi Andrew,
Could you please drop this patch? Thanks.


Regards
Yin, Fengwei

Patch

diff --git a/mm/readahead.c b/mm/readahead.c
index 47afbca1d122..a93af773686f 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -614,9 +614,17 @@  static void ondemand_readahead(struct readahead_control *ractl,
 				max_pages);
 		rcu_read_unlock();
 
-		if (!start || start - index > max_pages)
+		if (!start || start - index - 1 > max_pages)
 			return;
 
+		/*
+		 * If no gaps in the range, page_cache_next_miss() returns
+		 * index beyond range. Adjust it back to make sure
+		 * ractl->_index is updated correctly later.
+		 */
+		if ((start - index - 1) == max_pages)
+			start--;
+
 		ra->start = start;
 		ra->size = start - index;	/* old async_size */
 		ra->size += req_size;
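
For readers who want to check the arithmetic outside the kernel, the adjusted
check in the hunk above can be modelled in userspace. This is only a sketch:
struct ra_sketch and fixup_start() are hypothetical names that do not exist in
mm/readahead.c, and the model assumes that 'start' is the value returned by
page_cache_next_miss() for a scan beginning at index + 1 over max_pages pages,
so that "no gap" is reported as index + 1 + max_pages.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two readahead state fields the hunk sets. */
struct ra_sketch {
	unsigned long start;
	unsigned long size;
};

/*
 * Userspace model of the patched hunk (not kernel code).
 * Returns false where ondemand_readahead() would return early.
 */
static bool fixup_start(unsigned long index, unsigned long start,
			unsigned long max_pages, unsigned long req_size,
			struct ra_sketch *ra)
{
	/* Model assumption: the scan began at index + 1, so a non-zero
	 * result is always past index and the subtraction cannot wrap. */
	assert(start == 0 || start > index);

	if (!start || start - index - 1 > max_pages)
		return false;

	/* No gap in the range: the returned index is one past the range,
	 * so pull it back before it is accumulated into the ra state. */
	if (start - index - 1 == max_pages)
		start--;

	ra->start = start;
	ra->size = start - index;	/* old async_size */
	ra->size += req_size;
	return true;
}
```

With index = 100, max_pages = 32 and req_size = 4, a no-gap return value of
133 gives start = 132 and size = 36, while a gap reported at 110 gives
start = 110 and size = 14.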