[RFC] mm: check global free_list if there is ongoing reclaiming when pcp fail

Message ID 1663325892-9825-1-git-send-email-zhaoyang.huang@unisoc.com (mailing list archive)
State New
Series [RFC] mm: check global free_list if there is ongoing reclaiming when pcp fail

Commit Message

zhaoyang.huang Sept. 16, 2022, 10:58 a.m. UTC
From: Zhaoyang Huang <zhaoyang.huang@unisoc.com>

Check the global free list again even if rmqueue_bulk failed for pcp pages when
there is ongoing reclaiming, which could eliminate potential direct reclaim by
chance.

Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
---
 mm/page_alloc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Mel Gorman Sept. 19, 2022, 10:16 a.m. UTC | #1
On Fri, Sep 16, 2022 at 06:58:12PM +0800, zhaoyang.huang wrote:
> From: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
> 
> Check the global free list again even if rmqueue_bulk failed for pcp pages when
> there is ongoing reclaiming, which could eliminate potential direct reclaim by
> chance.
> 
> Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>

Patch does not apply and may be based on a custom kernel that introduced
a problem. There is no description of what problem this is trying to
fix. Checking the status of reclaim for a specific zone in this path would
be a little unexpected.  If allocation pressure is exceeding the ability
of reclaim to make progress then the caller likely needs to take action
like direct reclaim. If the allocation failure is due to a high-order
failure then it may need to enter direct compaction etc.

Zhaoyang Huang Sept. 20, 2022, 1:45 a.m. UTC | #2
On Mon, Sep 19, 2022 at 6:22 PM Mel Gorman <mgorman@techsingularity.net> wrote:
>
> On Fri, Sep 16, 2022 at 06:58:12PM +0800, zhaoyang.huang wrote:
> > From: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
> >
> > Check the global free list again even if rmqueue_bulk failed for pcp pages when
> > there is ongoing reclaiming, which could eliminate potential direct reclaim by
> > chance.
> >
> > Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
>
> Patch does not apply and may be based on a custom kernel that introduced
> a problem. There is no description of what problem this is trying to
> fix. Checking the status of reclaim for a specific zone in this path would
> be a little unexpected.  If allocation pressure is exceeding the ability
> of reclaim to make progress then the caller likely needs to take action
> like direct reclaim. If the allocation failure is due to a high-order
> failure then it may need to enter direct compaction etc.
Agreed with the above comment. This proposal aims to avoid direct
reclaim at minimal cost: about 5 CPU instructions, in exchange for
skipping function calls that contain several loops and may sleep
when throttled by IO congestion, etc.
>
> --
> Mel Gorman
> SUSE Labs

Mel Gorman Sept. 20, 2022, 8:45 a.m. UTC | #3
On Tue, Sep 20, 2022 at 09:45:35AM +0800, Zhaoyang Huang wrote:
> On Mon, Sep 19, 2022 at 6:22 PM Mel Gorman <mgorman@techsingularity.net> wrote:
> >
> > On Fri, Sep 16, 2022 at 06:58:12PM +0800, zhaoyang.huang wrote:
> > > From: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
> > >
> > > Check the global free list again even if rmqueue_bulk failed for pcp pages when
> > > there is ongoing reclaiming, which could eliminate potential direct reclaim by
> > > chance.
> > >
> > > Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
> >
> > Patch does not apply and may be based on a custom kernel that introduced
> > a problem. There is no description of what problem this is trying to
> > fix. Checking the status of reclaim for a specific zone in this path would
> > be a little unexpected.  If allocation pressure is exceeding the ability
> > of reclaim to make progress then the caller likely needs to take action
> > like direct reclaim. If the allocation failure is due to a high-order
> > failure then it may need to enter direct compaction etc.
>
> Agreed with the above comment. This proposal aims to avoid direct
> reclaim at minimal cost: about 5 CPU instructions, in exchange for
> skipping function calls that contain several loops and may sleep
> when throttled by IO congestion, etc.

If the refill fails and kswapd is failing to keep up then actions like
direct reclaim or compaction are inevitable. At best, this patch would
race to allocate pages in one context that are being freed in parallel by
another context.

Nak.

Zhaoyang Huang Sept. 20, 2022, 8:48 a.m. UTC | #4
On Tue, Sep 20, 2022 at 4:46 PM Mel Gorman <mgorman@techsingularity.net> wrote:
>
> On Tue, Sep 20, 2022 at 09:45:35AM +0800, Zhaoyang Huang wrote:
> > On Mon, Sep 19, 2022 at 6:22 PM Mel Gorman <mgorman@techsingularity.net> wrote:
> > >
> > > On Fri, Sep 16, 2022 at 06:58:12PM +0800, zhaoyang.huang wrote:
> > > > From: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
> > > >
> > > > Check the global free list again even if rmqueue_bulk failed for pcp pages when
> > > > there is ongoing reclaiming, which could eliminate potential direct reclaim by
> > > > chance.
> > > >
> > > > Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
> > >
> > > Patch does not apply and may be based on a custom kernel that introduced
> > > a problem. There is no description of what problem this is trying to
> > > fix. Checking the status of reclaim for a specific zone in this path would
> > > be a little unexpected.  If allocation pressure is exceeding the ability
> > > of reclaim to make progress then the caller likely needs to take action
> > > like direct reclaim. If the allocation failure is due to a high-order
> > > failure then it may need to enter direct compaction etc.
> >
> > Agreed with the above comment. This proposal aims to avoid direct
> > reclaim at minimal cost: about 5 CPU instructions, in exchange for
> > skipping function calls that contain several loops and may sleep
> > when throttled by IO congestion, etc.
>
> If the refill fails and kswapd is failing to keep up then actions like
> direct reclaim or compaction are inevitable. At best, this patch would
> race to allocate pages in one context that are being freed in parallel by
> another context.
>
> Nak.
OK. I have noticed that recent changes have modified this path.
Thanks for the comments.
>
> --
> Mel Gorman
> SUSE Labs

Patch

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e008a3d..7e99f7d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3729,7 +3729,8 @@  struct page *rmqueue(struct zone *preferred_zone,
 				migratetype != MIGRATE_MOVABLE) {
 			page = rmqueue_pcplist(preferred_zone, zone, order,
 					gfp_flags, migratetype, alloc_flags);
-			goto out;
+			if (page || !test_bit(ZONE_RECLAIM_ACTIVE, &zone->flags))
+				goto out;
 		}
 	}
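
The control-flow change in the hunk above can be illustrated with a small
userspace model. This is only a sketch under stated assumptions, not kernel
code: struct model_zone, model_rmqueue(), the batch parameter and the
freed_meanwhile knob are all hypothetical names invented for illustration.
freed_meanwhile stands in for pages that a parallel reclaimer happens to free
between the failed pcp refill and the second look at the global free list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; these are NOT the kernel's types. */
struct model_zone {
	int pcp_count;       /* pages cached on the per-CPU list */
	int free_count;      /* pages on the zone's global free list */
	bool reclaim_active; /* stands in for ZONE_RECLAIM_ACTIVE */
};

enum model_result { MODEL_FAIL, MODEL_FROM_PCP, MODEL_FROM_FREE_LIST };

/* Models the pcp path: refill from the global list when empty, then pop a page. */
static bool model_rmqueue_pcplist(struct model_zone *z, int batch)
{
	assert(z != NULL && batch > 0);

	if (z->pcp_count == 0) {
		int moved = z->free_count < batch ? z->free_count : batch;

		z->free_count -= moved;
		z->pcp_count += moved;
	}
	if (z->pcp_count == 0)
		return false;	/* the refill found nothing */
	z->pcp_count--;
	return true;
}

/* Models the fall-through: take one page straight from the global free list. */
static bool model_rmqueue_free_list(struct model_zone *z)
{
	assert(z != NULL);

	if (z->free_count == 0)
		return false;
	z->free_count--;
	return true;
}

/*
 * patched == false models the unconditional "goto out";
 * patched == true models the proposed conditional one.
 * freed_meanwhile is an assumed knob that makes the race deterministic.
 */
static enum model_result model_rmqueue(struct model_zone *z, int batch,
				       bool patched, int freed_meanwhile)
{
	if (model_rmqueue_pcplist(z, batch))
		return MODEL_FROM_PCP;

	/* Simulated parallel reclaim freeing pages after the refill failed. */
	z->free_count += freed_meanwhile;

	/* Leave unless the patch is applied and reclaim is active. */
	if (!patched || !z->reclaim_active)
		return MODEL_FAIL;

	return model_rmqueue_free_list(z) ? MODEL_FROM_FREE_LIST : MODEL_FAIL;
}
```

In this model the patched path only returns a page when freed_meanwhile is
non-zero, that is, when the allocating context wins a race against pages being
freed in parallel. With freed_meanwhile == 0 the result is the same failure as
before, and the caller still has to fall back to direct reclaim or compaction.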