mm:vmscan: fix shrink sc->nr counter values issue

Message ID 20231129130126.2130-1-justinjiang@vivo.com (mailing list archive)
State New

Commit Message

zhiguojiang Nov. 29, 2023, 1:01 p.m. UTC
Require sc->nr.unqueued_dirty > 0 before setting the PGDAT_DIRTY flag.
Otherwise the flag is also set when sc->nr.unqueued_dirty and
sc->nr.file_taken are both zero.

For the PGDAT_WRITEBACK flag, there is no guarantee that the evictable
LRUs hold only pages marked for immediate reclaim during the following
shrink passes of the same kswapd reclaim cycle. When a small number of
pages marked for immediate reclaim and a large number of pages not
marked for immediate reclaim are on the evictable LRUs at the same time,
kswapd is throttled to sleep as soon as at least one page marked for
immediate reclaim is found, which increases kswapd process consumption.

Fix this by throttling kswapd only when sc->nr.immediate is non-zero and
equal to sc->nr.file_taken.

Signed-off-by: Zhiguo Jiang <justinjiang@vivo.com>
---
 mm/vmscan.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
 mode change 100644 => 100755 mm/vmscan.c
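
For illustration only, the first change described above can be modelled
outside the kernel. The sketch below is not kernel code: struct nr_model
and the two helper names are hypothetical stand-ins for the sc->nr
counters and the condition in shrink_node(), and only the comparison
itself is taken from the patch.

```c
#include <stdbool.h>

/*
 * Simplified, hypothetical stand-in for the sc->nr counters read in
 * shrink_node(); this is not the kernel's struct scan_control.
 */
struct nr_model {
	unsigned int unqueued_dirty;
	unsigned int file_taken;
};

/*
 * Condition before the patch: true whenever the two counters are
 * equal, which includes the case where both are zero.
 */
static inline bool dirty_check_old(const struct nr_model *nr)
{
	return nr->unqueued_dirty == nr->file_taken;
}

/* Condition after the patch: additionally requires a non-zero count. */
static inline bool dirty_check_new(const struct nr_model *nr)
{
	return nr->unqueued_dirty && nr->unqueued_dirty == nr->file_taken;
}
```

With both counters at zero, dirty_check_old() returns true and
dirty_check_new() returns false. When the counters are equal and
non-zero, the two checks agree.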

Comments

Matthew Wilcox (Oracle) Nov. 29, 2023, 3:17 p.m. UTC | #1
On Wed, Nov 29, 2023 at 09:01:26PM +0800, Zhiguo Jiang wrote:
> It is needed to ensure sc->nr.unqueued_dirty > 0, which can avoid to
> set PGDAT_DIRTY flag when sc->nr.unqueued_dirty and sc->nr.file_taken
> are both zero at the same time.

Have you observed this happening, or is this from code review?

> It can't be guaranteed for the PGDAT_WRITEBACK flag that only pages
> marked for immediate reclaim are on evictable LRUs in other following
> shrink processes of the same kswapd shrink recycling. So when both a
> small amount of pages marked for immediate reclaim and a large amount
> of pages marked for non-immediate reclaim are on evictable LRUs at the
> same time, if it's only determined that there is at least a page marked
> for immediate reclaim on evictable LRUs, kswapd shrink is throttled to
> sleep, which will increase kswapd process consumption.
> 
> It can be fixed to throttle kswapd shrink when sc->nr.immediate is equal
> to sc->nr.file_taken.

So you're fixing two distinct things in the same patch?

> +++ b/mm/vmscan.c
> @@ -5915,17 +5915,17 @@ static void shrink_node(pg_data_t *pgdat, struct scan_control *sc)
>  			set_bit(PGDAT_WRITEBACK, &pgdat->flags);
>  
>  		/* Allow kswapd to start writing pages during reclaim.*/
> -		if (sc->nr.unqueued_dirty == sc->nr.file_taken)
> +		if (sc->nr.unqueued_dirty && sc->nr.unqueued_dirty == sc->nr.file_taken)
>  			set_bit(PGDAT_DIRTY, &pgdat->flags);
>  
>  		/*
> -		 * If kswapd scans pages marked for immediate
> +		 * If kswapd scans massive pages marked for immediate

I don't understand why you've added the word "massive".  Do you mean
that the pages are large, or that kswapd has scanned a lot of pages?

>  		 * reclaim and under writeback (nr_immediate), it
>  		 * implies that pages are cycling through the LRU
>  		 * faster than they are written so forcibly stall
>  		 * until some pages complete writeback.
>  		 */
> -		if (sc->nr.immediate)
> +		if (sc->nr.immediate && sc->nr.immediate == sc->nr.file_taken)
>  			reclaim_throttle(pgdat, VMSCAN_THROTTLE_WRITEBACK);
>  	}

zhiguojiang Nov. 30, 2023, 1:56 a.m. UTC | #2
On 2023/11/29 23:17, Matthew Wilcox wrote:
> On Wed, Nov 29, 2023 at 09:01:26PM +0800, Zhiguo Jiang wrote:
>> It is needed to ensure sc->nr.unqueued_dirty > 0, which can avoid to
>> set PGDAT_DIRTY flag when sc->nr.unqueued_dirty and sc->nr.file_taken
>> are both zero at the same time.
> Have you observed this happening, or is this from code review?

Found in code review. The other sc->nr counters are each checked for being non-zero first in shrink_node().

>
>> It can't be guaranteed for the PGDAT_WRITEBACK flag that only pages
>> marked for immediate reclaim are on evictable LRUs in other following
>> shrink processes of the same kswapd shrink recycling. So when both a
>> small amount of pages marked for immediate reclaim and a large amount
>> of pages marked for non-immediate reclaim are on evictable LRUs at the
>> same time, if it's only determined that there is at least a page marked
>> for immediate reclaim on evictable LRUs, kswapd shrink is throttled to
>> sleep, which will increase kswapd process consumption.
>>
>> It can be fixed to throttle kswapd shrink when sc->nr.immediate is equal
>> to sc->nr.file_taken.
> So you're fixing two distinct things in the same patch?
They can be treated as two issues; I will submit them separately.
>
>> +++ b/mm/vmscan.c
>> @@ -5915,17 +5915,17 @@ static void shrink_node(pg_data_t *pgdat, struct scan_control *sc)
>>   			set_bit(PGDAT_WRITEBACK, &pgdat->flags);
>>   
>>   		/* Allow kswapd to start writing pages during reclaim.*/
>> -		if (sc->nr.unqueued_dirty == sc->nr.file_taken)
>> +		if (sc->nr.unqueued_dirty && sc->nr.unqueued_dirty == sc->nr.file_taken)
>>   			set_bit(PGDAT_DIRTY, &pgdat->flags);
>>   
>>   		/*
>> -		 * If kswapd scans pages marked for immediate
>> +		 * If kswapd scans massive pages marked for immediate
> I don't understand why you've added the word "massive".  Do you mean
> that the pages are large, or that kswapd has scanned a lot of pages?
The added "massive" means that there are a large number of pages marked
for immediate reclaim on the evictable LRUs.

It is meant in contrast to the case where only a small number of pages,
or even a single page, marked for immediate reclaim is on the evictable
LRUs when kswapd is throttled. I think that case does not need
throttling, because there may be other types of pages on the evictable
LRUs.
>
>>   		 * reclaim and under writeback (nr_immediate), it
>>   		 * implies that pages are cycling through the LRU
>>   		 * faster than they are written so forcibly stall
>>   		 * until some pages complete writeback.
>>   		 */
>> -		if (sc->nr.immediate)
>> +		if (sc->nr.immediate && sc->nr.immediate == sc->nr.file_taken)
>>   			reclaim_throttle(pgdat, VMSCAN_THROTTLE_WRITEBACK);
>>   	}

Matthew Wilcox (Oracle) Nov. 30, 2023, 3:29 a.m. UTC | #3
On Thu, Nov 30, 2023 at 09:56:59AM +0800, zhiguojiang wrote:
> > > -		 * If kswapd scans pages marked for immediate
> > > +		 * If kswapd scans massive pages marked for immediate
> > I don't understand why you've added the word "massive".  Do you mean
> > that the pages are large, or that kswapd has scanned a lot of pages?
> The added "massive" means that there are a large number of pages marked for
> immediate reclaim on evictable LRUs.

Then the word "many" communicates your meaning better.  "massive" would
mean that each page is very big, while "many" means that there are a
lot of pages.

It was foolish to send out a v2 so swiftly.  Best wait for someone who's
familiar with this code to respond to it now that you've clarified what
you were doing.

zhiguojiang Nov. 30, 2023, 3:40 a.m. UTC | #4
On 2023/11/30 11:29, Matthew Wilcox wrote:
> On Thu, Nov 30, 2023 at 09:56:59AM +0800, zhiguojiang wrote:
>>>> -		 * If kswapd scans pages marked for immediate
>>>> +		 * If kswapd scans massive pages marked for immediate
>>> I don't understand why you've added the word "massive".  Do you mean
>>> that the pages are large, or that kswapd has scanned a lot of pages?
>> The added "massive" means that there are a large number of pages marked for
>> immediate reclaim on evictable LRUs.
> Then the word "many" communicates your meaning better.  "massive" would
> mean that each page is very big, while "many" means that there are a
> lot of pages.
>
> It was foolish to send out a v2 so swiftly.  Best wait for someone who's
> familiar with this code to respond to it now that you've clarified what
> you were doing.
Thanks for your suggestions, I will send an updated version of the patch later.

Patch

diff --git a/mm/vmscan.c b/mm/vmscan.c
index d8c3338fee0f..5723672bbdc2
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -5915,17 +5915,17 @@ static void shrink_node(pg_data_t *pgdat, struct scan_control *sc)
 			set_bit(PGDAT_WRITEBACK, &pgdat->flags);
 
 		/* Allow kswapd to start writing pages during reclaim.*/
-		if (sc->nr.unqueued_dirty == sc->nr.file_taken)
+		if (sc->nr.unqueued_dirty && sc->nr.unqueued_dirty == sc->nr.file_taken)
 			set_bit(PGDAT_DIRTY, &pgdat->flags);
 
 		/*
-		 * If kswapd scans pages marked for immediate
+		 * If kswapd scans massive pages marked for immediate
 		 * reclaim and under writeback (nr_immediate), it
 		 * implies that pages are cycling through the LRU
 		 * faster than they are written so forcibly stall
 		 * until some pages complete writeback.
 		 */
-		if (sc->nr.immediate)
+		if (sc->nr.immediate && sc->nr.immediate == sc->nr.file_taken)
 			reclaim_throttle(pgdat, VMSCAN_THROTTLE_WRITEBACK);
 	}
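
The effect of the second hunk can be illustrated with a small standalone
model. This is a sketch under stated assumptions, not kernel code:
struct reclaim_counts and the helper names are hypothetical, and the
counter values used in the note below are made up for the example. Only
the two conditions are copied from the diff.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the two sc->nr counters used by the
 * throttling check; not the kernel's struct scan_control.
 */
struct reclaim_counts {
	unsigned int immediate;   /* pages marked for immediate reclaim */
	unsigned int file_taken;  /* file pages taken off the LRU */
};

/* Before the patch: any page marked for immediate reclaim throttles. */
static inline bool throttle_old(const struct reclaim_counts *c)
{
	return c->immediate != 0;
}

/*
 * After the patch: throttle only when the count is non-zero and every
 * file page taken was marked for immediate reclaim.
 */
static inline bool throttle_new(const struct reclaim_counts *c)
{
	return c->immediate && c->immediate == c->file_taken;
}
```

For example, with immediate = 1 and file_taken = 32, throttle_old()
returns true while throttle_new() returns false; with both equal to 32,
both return true. Whether skipping the throttle in the first case is the
right behaviour is the question left open in the thread.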