diff mbox series

mm: filemap: clear idle flag for writes

Message ID 1593020612-13051-1-git-send-email-yang.shi@linux.alibaba.com (mailing list archive)
State New, archived

Commit Message

Yang Shi June 24, 2020, 5:43 p.m. UTC
Since commit bbddabe2e436aa7869b3ac5248df5c14ddde0cbf ("mm: filemap:
only do access activations on reads"), mark_page_accessed() is called
for reads only.  But the idle flag is cleared by mark_page_accessed(),
so the idle flag won't get cleared if the page is only accessed by
writes.

Idle page tracking is used to estimate the working set size of a
workload, so a noticeable part of the working set might be missed if
the idle flag is not maintained correctly.

It seems good enough to just clear the idle flag for write operations.

Fixes: bbddabe2e436 ("mm: filemap: only do access activations on reads")
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Shakeel Butt <shakeelb@google.com>
Reported-by: Gang Deng <gavin.dg@linux.alibaba.com>
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
---
 mm/filemap.c | 6 ++++++
 1 file changed, 6 insertions(+)
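
Illustration (not part of the submitted patch): a minimal userspace model of
the behaviour described in the commit message, assuming a page reduced to two
booleans. The MODEL_FGP_* values, struct model_page and the model_* helpers
are hypothetical names invented for this sketch; they are not kernel
definitions, and the "referenced" field only loosely stands in for what
mark_page_accessed() does besides clearing the idle flag.

```c
#include <stdbool.h>

/* Hypothetical flag bits for this model only; not the kernel's FGP_* values. */
enum {
	MODEL_FGP_ACCESSED = 1 << 0,
	MODEL_FGP_WRITE    = 1 << 1,
};

/* A page reduced to the two pieces of state this discussion cares about. */
struct model_page {
	bool idle;       /* stands in for the page idle flag */
	bool referenced; /* stands in for access/activation state */
};

/* Models mark_page_accessed(): records the access and clears the idle flag. */
static void model_mark_accessed(struct model_page *page)
{
	page->referenced = true;
	page->idle = false;
}

/* Lookup tail before the patch: a write-only lookup leaves the idle flag set. */
static void model_lookup_before(struct model_page *page, int fgp_flags)
{
	if (fgp_flags & MODEL_FGP_ACCESSED)
		model_mark_accessed(page);
}

/* Lookup tail after the patch: a write clears idle without activating the page. */
static void model_lookup_after(struct model_page *page, int fgp_flags)
{
	if (fgp_flags & MODEL_FGP_ACCESSED)
		model_mark_accessed(page);
	else if (fgp_flags & MODEL_FGP_WRITE) {
		/* Clear the idle flag only; do not touch activation state */
		if (page->idle)
			page->idle = false;
	}
}
```

In this model, a page that starts with idle set and is looked up with only
MODEL_FGP_WRITE stays idle under model_lookup_before() but is no longer idle
under model_lookup_after(), while "referenced" stays false in both cases.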

Comments

Andrew Morton June 24, 2020, 6:53 p.m. UTC | #1
On Thu, 25 Jun 2020 01:43:32 +0800 Yang Shi <yang.shi@linux.alibaba.com> wrote:

> Since commit bbddabe2e436aa7869b3ac5248df5c14ddde0cbf ("mm: filemap:
> only do access activations on reads"), mark_page_accessed() is called
> for reads only.  But the idle flag is cleared by mark_page_accessed() so
> the idle flag won't get cleared if the page is write accessed only.
> 
> Basically idle page tracking is used to estimate workingset size of
> workload, noticeable size of workingset might be missed if the idle flag
> is not maintained correctly.
> 
> It seems good enough to just clear idle flag for write operations.
> 
> ...
>
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -41,6 +41,7 @@
>  #include <linux/delayacct.h>
>  #include <linux/psi.h>
>  #include <linux/ramfs.h>
> +#include <linux/page_idle.h>
>  #include "internal.h"
>  
>  #define CREATE_TRACE_POINTS
> @@ -1630,6 +1631,11 @@ struct page *pagecache_get_page(struct address_space *mapping, pgoff_t index,
>  
>  	if (fgp_flags & FGP_ACCESSED)
>  		mark_page_accessed(page);
> +	else if (fgp_flags & FGP_WRITE) {
> +		/* Clear idle flag for buffer write */
> +		if (page_is_idle(page))
> +			clear_page_idle(page);
> +	}
>  
>  no_page:
>  	if (!page && (fgp_flags & FGP_CREAT)) {

The kerneldoc comment for pagecache_get_page() could do with some
updating - it fails to mention FGP_WRITE, FGP_NOFS and FGP_NOWAIT.

This change seems correct but also will have runtime effects.  What are
they?
Yang Shi June 24, 2020, 7:18 p.m. UTC | #2
On 6/24/20 11:53 AM, Andrew Morton wrote:
> On Thu, 25 Jun 2020 01:43:32 +0800 Yang Shi <yang.shi@linux.alibaba.com> wrote:
>
>> Since commit bbddabe2e436aa7869b3ac5248df5c14ddde0cbf ("mm: filemap:
>> only do access activations on reads"), mark_page_accessed() is called
>> for reads only.  But the idle flag is cleared by mark_page_accessed() so
>> the idle flag won't get cleared if the page is write accessed only.
>>
>> Basically idle page tracking is used to estimate workingset size of
>> workload, noticeable size of workingset might be missed if the idle flag
>> is not maintained correctly.
>>
>> It seems good enough to just clear idle flag for write operations.
>>
>> ...
>>
>> --- a/mm/filemap.c
>> +++ b/mm/filemap.c
>> @@ -41,6 +41,7 @@
>>   #include <linux/delayacct.h>
>>   #include <linux/psi.h>
>>   #include <linux/ramfs.h>
>> +#include <linux/page_idle.h>
>>   #include "internal.h"
>>   
>>   #define CREATE_TRACE_POINTS
>> @@ -1630,6 +1631,11 @@ struct page *pagecache_get_page(struct address_space *mapping, pgoff_t index,
>>   
>>   	if (fgp_flags & FGP_ACCESSED)
>>   		mark_page_accessed(page);
>> +	else if (fgp_flags & FGP_WRITE) {
>> +		/* Clear idle flag for buffer write */
>> +		if (page_is_idle(page))
>> +			clear_page_idle(page);
>> +	}
>>   
>>   no_page:
>>   	if (!page && (fgp_flags & FGP_CREAT)) {
> The kerneldoc comment for pagecache_get_page() could do with some
> updating - it fails to mention FGP_WRITE, FGP_NOFS and FGP_NOWAIT.

Yes, will propose a separate patch later on.

>
> This change seems correct but also will have runtime effects.  What are
> they?

Other than a couple of extra cycles when idle page tracking is enabled,
I can't think of other effects. It should be negligible. The idle flag
doesn't play a role in the page reclaim algorithm, so it won't have any
impact on that.
Shakeel Butt June 24, 2020, 7:49 p.m. UTC | #3
On Wed, Jun 24, 2020 at 10:43 AM Yang Shi <yang.shi@linux.alibaba.com> wrote:
>
> Since commit bbddabe2e436aa7869b3ac5248df5c14ddde0cbf ("mm: filemap:
> only do access activations on reads"), mark_page_accessed() is called
> for reads only.  But the idle flag is cleared by mark_page_accessed() so
> the idle flag won't get cleared if the page is write accessed only.
>
> Basically idle page tracking is used to estimate workingset size of
> workload, noticeable size of workingset might be missed if the idle flag
> is not maintained correctly.
>
> It seems good enough to just clear idle flag for write operations.
>
> Fixes: bbddabe2e436 ("mm: filemap: only do access activations on reads")
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Rik van Riel <riel@surriel.com>
> Cc: Shakeel Butt <shakeelb@google.com>
> Reported-by: Gang Deng <gavin.dg@linux.alibaba.com>
> Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>

Reviewed-by: Shakeel Butt <shakeelb@google.com>
Shakeel Butt June 24, 2020, 7:50 p.m. UTC | #4
On Wed, Jun 24, 2020 at 12:18 PM Yang Shi <yang.shi@linux.alibaba.com> wrote:
>
>
>
> On 6/24/20 11:53 AM, Andrew Morton wrote:
> > On Thu, 25 Jun 2020 01:43:32 +0800 Yang Shi <yang.shi@linux.alibaba.com> wrote:
> >
> >> Since commit bbddabe2e436aa7869b3ac5248df5c14ddde0cbf ("mm: filemap:
> >> only do access activations on reads"), mark_page_accessed() is called
> >> for reads only.  But the idle flag is cleared by mark_page_accessed() so
> >> the idle flag won't get cleared if the page is write accessed only.
> >>
> >> Basically idle page tracking is used to estimate workingset size of
> >> workload, noticeable size of workingset might be missed if the idle flag
> >> is not maintained correctly.
> >>
> >> It seems good enough to just clear idle flag for write operations.
> >>
> >> ...
> >>
> >> --- a/mm/filemap.c
> >> +++ b/mm/filemap.c
> >> @@ -41,6 +41,7 @@
> >>   #include <linux/delayacct.h>
> >>   #include <linux/psi.h>
> >>   #include <linux/ramfs.h>
> >> +#include <linux/page_idle.h>
> >>   #include "internal.h"
> >>
> >>   #define CREATE_TRACE_POINTS
> >> @@ -1630,6 +1631,11 @@ struct page *pagecache_get_page(struct address_space *mapping, pgoff_t index,
> >>
> >>      if (fgp_flags & FGP_ACCESSED)
> >>              mark_page_accessed(page);
> >> +    else if (fgp_flags & FGP_WRITE) {
> >> +            /* Clear idle flag for buffer write */
> >> +            if (page_is_idle(page))
> >> +                    clear_page_idle(page);
> >> +    }
> >>
> >>   no_page:
> >>      if (!page && (fgp_flags & FGP_CREAT)) {
> > The kerneldoc comment for pagecache_get_page() could do with some
> > updating - it fails to mention FGP_WRITE, FGP_NOFS and FGP_NOWAIT.
>
> Yes, will propose a separate patch later on.
>
> >
> > This change seems correct but also will have runtime effects.  What are
> > they?
>
> Other than a couple of extra cycles when idle page tracking is enabled,
> I didn't think of other effects. It should be negligible. The idle flag
> doesn't play a role in page reclaim algorithm, so it won't have impact
> on that.
>
>

The only user-visible impact will be on idle page tracking users. They
will get more accurate data.
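
Illustration (not part of the thread): a minimal sketch of how an idle page
tracking user might locate a page's bit, assuming the layout described in the
kernel's idle page tracking documentation for /sys/kernel/mm/page_idle/bitmap,
where the page at PFN i maps to bit i % 64 of the 8-byte array element i / 64
in native byte order. The helper names are hypothetical, and actually reading
the file (which needs privileges and CONFIG_IDLE_PAGE_TRACKING) is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Byte offset in the bitmap file of the 8-byte word that covers this PFN. */
static uint64_t idle_bitmap_offset(uint64_t pfn)
{
	return (pfn / 64) * 8;
}

/* Bit position of this PFN inside its 8-byte word. */
static unsigned int idle_bitmap_bit(uint64_t pfn)
{
	return (unsigned int)(pfn % 64);
}

/* Given the word read at idle_bitmap_offset(pfn), report whether the PFN is idle. */
static bool pfn_is_idle(uint64_t word, uint64_t pfn)
{
	return (word >> idle_bitmap_bit(pfn)) & 1;
}
```

A user would read the word at idle_bitmap_offset(pfn) and test it with
pfn_is_idle(). With this patch, a page touched only by buffered writes would
be expected to read back as not idle, which is the accuracy improvement
described above.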
Yang Shi June 24, 2020, 8:24 p.m. UTC | #5
On 6/24/20 12:50 PM, Shakeel Butt wrote:
> On Wed, Jun 24, 2020 at 12:18 PM Yang Shi <yang.shi@linux.alibaba.com> wrote:
>>
>>
>> On 6/24/20 11:53 AM, Andrew Morton wrote:
>>> On Thu, 25 Jun 2020 01:43:32 +0800 Yang Shi <yang.shi@linux.alibaba.com> wrote:
>>>
>>>> Since commit bbddabe2e436aa7869b3ac5248df5c14ddde0cbf ("mm: filemap:
>>>> only do access activations on reads"), mark_page_accessed() is called
>>>> for reads only.  But the idle flag is cleared by mark_page_accessed() so
>>>> the idle flag won't get cleared if the page is write accessed only.
>>>>
>>>> Basically idle page tracking is used to estimate workingset size of
>>>> workload, noticeable size of workingset might be missed if the idle flag
>>>> is not maintained correctly.
>>>>
>>>> It seems good enough to just clear idle flag for write operations.
>>>>
>>>> ...
>>>>
>>>> --- a/mm/filemap.c
>>>> +++ b/mm/filemap.c
>>>> @@ -41,6 +41,7 @@
>>>>    #include <linux/delayacct.h>
>>>>    #include <linux/psi.h>
>>>>    #include <linux/ramfs.h>
>>>> +#include <linux/page_idle.h>
>>>>    #include "internal.h"
>>>>
>>>>    #define CREATE_TRACE_POINTS
>>>> @@ -1630,6 +1631,11 @@ struct page *pagecache_get_page(struct address_space *mapping, pgoff_t index,
>>>>
>>>>       if (fgp_flags & FGP_ACCESSED)
>>>>               mark_page_accessed(page);
>>>> +    else if (fgp_flags & FGP_WRITE) {
>>>> +            /* Clear idle flag for buffer write */
>>>> +            if (page_is_idle(page))
>>>> +                    clear_page_idle(page);
>>>> +    }
>>>>
>>>>    no_page:
>>>>       if (!page && (fgp_flags & FGP_CREAT)) {
>>> The kerneldoc comment for pagecache_get_page() could do with some
>>> updating - it fails to mention FGP_WRITE, FGP_NOFS and FGP_NOWAIT.
>> Yes, will propose a separate patch later on.
>>
>>> This change seems correct but also will have runtime effects.  What are
>>> they?
>> Other than a couple of extra cycles when idle page tracking is enabled,
>> I didn't think of other effects. It should be negligible. The idle flag
>> doesn't play a role in page reclaim algorithm, so it won't have impact
>> on that.
>>
>>
> The only user visible impact will be on idle page tracking users. They
> will get more accurate data.

Thanks for elaborating on this.

Patch

diff --git a/mm/filemap.c b/mm/filemap.c
index f0ae9a6..0589aef 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -41,6 +41,7 @@ 
 #include <linux/delayacct.h>
 #include <linux/psi.h>
 #include <linux/ramfs.h>
+#include <linux/page_idle.h>
 #include "internal.h"
 
 #define CREATE_TRACE_POINTS
@@ -1630,6 +1631,11 @@  struct page *pagecache_get_page(struct address_space *mapping, pgoff_t index,
 
 	if (fgp_flags & FGP_ACCESSED)
 		mark_page_accessed(page);
+	else if (fgp_flags & FGP_WRITE) {
+		/* Clear idle flag for buffer write */
+		if (page_is_idle(page))
+			clear_page_idle(page);
+	}
 
 no_page:
 	if (!page && (fgp_flags & FGP_CREAT)) {