Message ID | 20230628044303.1412624-1-fengwei.yin@intel.com (mailing list archive) |
---|---|
State | New, archived |
Series | [v2] readahead: Correct the start and size in ondemand_readahead() |
On 06/28/23 12:43, Yin Fengwei wrote:
> The commit
> 9425c591e06a ("page cache: fix page_cache_next/prev_miss off by one")
> updated the page_cache_next_miss() to return the index beyond
> range.
>
> But it breaks the start/size of ra in ondemand_readahead() because
> the offset by one is accumulated to readahead_index. As a consequence,
> not best readahead order is picked.
>
> Tracing of the order parameter of filemap_alloc_folio() showed:
>      page order    : count     distribution
>         0          : 892073   |                                        |
>         1          : 0        |                                        |
>         2          : 65120457 |****************************************|
>         3          : 32914005 |********************                    |
>         4          : 33020991 |********************                    |
> with 9425c591e06a9.
>
> With parent commit:
>      page order    : count     distribution
>         0          : 3417288  |****                                    |
>         1          : 0        |                                        |
>         2          : 877012   |*                                       |
>         3          : 288      |                                        |
>         4          : 5607522  |*******                                 |
>         5          : 29974228 |****************************************|
>
> Fix the issue by removing the offset by one when page_cache_next_miss()
> returns no gaps in the range.
>
> After the fix:
>      page order    : count     distribution
>         0          : 2598561  |***                                     |
>         1          : 0        |                                        |
>         2          : 687739   |                                        |
>         3          : 288      |                                        |
>         4          : 207210   |                                        |
>         5          : 32628260 |****************************************|
>

Thank you for your detailed analysis!

When the regression was initially discovered, I sent a patch to revert
commit 9425c591e06a. Andrew has picked up this change. And, Andrew has
also picked up this patch.

I have not verified yet, but I suspect that this patch is going to cause
a regression because it depends on the behavior of page_cache_next_miss
in 9425c591e06a which has been reverted.

Sorry for the delay in responding as I was traveling.

On 7/4/2023 2:49 AM, Mike Kravetz wrote:
> On 06/28/23 12:43, Yin Fengwei wrote:
>> The commit
>> 9425c591e06a ("page cache: fix page_cache_next/prev_miss off by one")
>> updated the page_cache_next_miss() to return the index beyond
>> range.
>>
>> But it breaks the start/size of ra in ondemand_readahead() because
>> the offset by one is accumulated to readahead_index. As a consequence,
>> not best readahead order is picked.
>>
>> Tracing of the order parameter of filemap_alloc_folio() showed:
>>      page order    : count     distribution
>>         0          : 892073   |                                        |
>>         1          : 0        |                                        |
>>         2          : 65120457 |****************************************|
>>         3          : 32914005 |********************                    |
>>         4          : 33020991 |********************                    |
>> with 9425c591e06a9.
>>
>> With parent commit:
>>      page order    : count     distribution
>>         0          : 3417288  |****                                    |
>>         1          : 0        |                                        |
>>         2          : 877012   |*                                       |
>>         3          : 288      |                                        |
>>         4          : 5607522  |*******                                 |
>>         5          : 29974228 |****************************************|
>>
>> Fix the issue by removing the offset by one when page_cache_next_miss()
>> returns no gaps in the range.
>>
>> After the fix:
>>      page order    : count     distribution
>>         0          : 2598561  |***                                     |
>>         1          : 0        |                                        |
>>         2          : 687739   |                                        |
>>         3          : 288      |                                        |
>>         4          : 207210   |                                        |
>>         5          : 32628260 |****************************************|
>>
>
> Thank you for your detailed analysis!
>
> When the regression was initially discovered, I sent a patch to revert
> commit 9425c591e06a. Andrew has picked up this change. And, Andrew has
> also picked up this patch.
Oh. I didn't notice that you sent revert patch. My understanding is that
commit 9425c591e06a is a good change.

>
> I have not verified yet, but I suspect that this patch is going to cause
> a regression because it depends on the behavior of page_cache_next_miss
> in 9425c591e06a which has been reverted.
Yes. If the 9425c591e06a was reverted, this patch could introduce regression.
Which fixing do you prefer? reverting 9425c591e06a or this patch? Then we
can suggest to Andrew to take it.

Regards
Yin, Fengwei

On 07/04/23 09:41, Yin, Fengwei wrote:
> On 7/4/2023 2:49 AM, Mike Kravetz wrote:
> > On 06/28/23 12:43, Yin Fengwei wrote:
> >
> > Thank you for your detailed analysis!
> >
> > When the regression was initially discovered, I sent a patch to revert
> > commit 9425c591e06a. Andrew has picked up this change. And, Andrew has
> > also picked up this patch.
> Oh. I didn't notice that you sent revert patch. My understanding is that
> commit 9425c591e06a is a good change.
>
> >
> > I have not verified yet, but I suspect that this patch is going to cause
> > a regression because it depends on the behavior of page_cache_next_miss
> > in 9425c591e06a which has been reverted.
> Yes. If the 9425c591e06a was reverted, this patch could introduce regression.
> Which fixing do you prefer? reverting 9425c591e06a or this patch? Then we
> can suggest to Andrew to take it.

For now, I suggest we go with the revert. Why?
- The revert is already going into stable trees.
- I may not be remembering correctly, but I seem to recall Matthew
  mentioning plans to redo/redesign the page cache and possibly
  readahead code. If this is the case, then better to keep the legacy
  behavior for now. But, I am not sure if this is actually part of any
  plan or work in progress.

On 7/6/23 00:52, Mike Kravetz wrote:
> On 07/04/23 09:41, Yin, Fengwei wrote:
>> On 7/4/2023 2:49 AM, Mike Kravetz wrote:
>>> On 06/28/23 12:43, Yin Fengwei wrote:
>>>
>>> Thank you for your detailed analysis!
>>>
>>> When the regression was initially discovered, I sent a patch to revert
>>> commit 9425c591e06a. Andrew has picked up this change. And, Andrew has
>>> also picked up this patch.
>> Oh. I didn't notice that you sent revert patch. My understanding is that
>> commit 9425c591e06a is a good change.
>>
>>>
>>> I have not verified yet, but I suspect that this patch is going to cause
>>> a regression because it depends on the behavior of page_cache_next_miss
>>> in 9425c591e06a which has been reverted.
>> Yes. If the 9425c591e06a was reverted, this patch could introduce regression.
>> Which fixing do you prefer? reverting 9425c591e06a or this patch? Then we
>> can suggest to Andrew to take it.
>
> For now, I suggest we go with the revert. Why?
> - The revert is already going into stable trees.
> - I may not be remembering correctly, but I seem to recall Matthew
>   mentioning plans to redo/redesign the page cache and possibly
>   readahead code. If this is the case, then better to keep the legacy
>   behavior for now. But, I am not sure if this is actually part of any
>   plan or work in progress.
>
It's fine to me and thanks a lot for detail explanations.

Hi Andrew,
Could you please help to drop this patch? Thanks.

Regards
Yin, Fengwei

diff --git a/mm/readahead.c b/mm/readahead.c
index 47afbca1d122..a93af773686f 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -614,9 +614,17 @@ static void ondemand_readahead(struct readahead_control *ractl,
 				max_pages);
 		rcu_read_unlock();
 
-		if (!start || start - index > max_pages)
+		if (!start || start - index - 1 > max_pages)
 			return;
 
+		/*
+		 * If no gaps in the range, page_cache_next_miss() returns
+		 * index beyond range. Adjust it back to make sure
+		 * ractl->_index is updated correctly later.
+		 */
+		if ((start - index - 1) == max_pages)
+			start--;
+
 		ra->start = start;
 		ra->size = start - index;	/* old async_size */
 		ra->size += req_size;

The commit
9425c591e06a ("page cache: fix page_cache_next/prev_miss off by one")
updated the page_cache_next_miss() to return the index beyond
range.

But it breaks the start/size of ra in ondemand_readahead() because
the offset by one is accumulated to readahead_index. As a consequence,
not best readahead order is picked.

Tracing of the order parameter of filemap_alloc_folio() showed:
     page order    : count     distribution
        0          : 892073   |                                        |
        1          : 0        |                                        |
        2          : 65120457 |****************************************|
        3          : 32914005 |********************                    |
        4          : 33020991 |********************                    |
with 9425c591e06a9.

With parent commit:
     page order    : count     distribution
        0          : 3417288  |****                                    |
        1          : 0        |                                        |
        2          : 877012   |*                                       |
        3          : 288      |                                        |
        4          : 5607522  |*******                                 |
        5          : 29974228 |****************************************|

Fix the issue by removing the offset by one when page_cache_next_miss()
returns no gaps in the range.

After the fix:
     page order    : count     distribution
        0          : 2598561  |***                                     |
        1          : 0        |                                        |
        2          : 687739   |                                        |
        3          : 288      |                                        |
        4          : 207210   |                                        |
        5          : 32628260 |****************************************|

Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202306211346.1e9ff03e-oliver.sang@intel.com
Fixes: 9425c591e06a ("page cache: fix page_cache_next/prev_miss off by one")
Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
---
Changes from v1:
  - only removing offset by one when there is no gaps found by
    page_cache_next_miss()
  - Update commit message to include the histogram of page order
    after fix

 mm/readahead.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)
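
For readers who want to check the arithmetic of the patched hunk outside the kernel, here is a minimal userspace sketch. It is not kernel code: `struct ra_sketch` and `ra_window_patched()` are hypothetical names introduced only for illustration, the value that page_cache_next_miss() would return is passed in as a plain `start` argument, and plain `unsigned long` stands in for the kernel's index type. It only mirrors the two conditions and the start/size assignments visible in the diff.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two readahead-state fields the hunk sets. */
struct ra_sketch {
	unsigned long start;
	unsigned long size;
};

/*
 * Mirrors the patched hunk. `start` is assumed to be the value returned
 * by page_cache_next_miss() for a search that began at index + 1.
 * Returns false where the kernel code would return early.
 */
static bool ra_window_patched(unsigned long index, unsigned long start,
			      unsigned long max_pages, unsigned long req_size,
			      struct ra_sketch *ra)
{
	/* 0 or a miss further away than max_pages: bail out. */
	if (!start || start - index - 1 > max_pages)
		return false;

	/*
	 * No gap in the range: the returned index lies one past the
	 * range, so pull it back by one before using it.
	 */
	if (start - index - 1 == max_pages)
		start--;

	ra->start = start;
	ra->size = start - index;	/* old async_size */
	ra->size += req_size;
	return true;
}
```

With index 100, max_pages 32 and req_size 4, a no-gap return of 133 is adjusted to a start of 132 and a size of 36, while a gap found at 110 is used unchanged and gives a size of 14.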