mm: vmscan.c: fix OOM on swap stress test

Message ID 20240904-lru-flag-v1-1-36638d6a524c@kernel.org (mailing list archive)
State New
Series mm: vmscan.c: fix OOM on swap stress test

Commit Message

Chris Li Sept. 5, 2024, 6:21 a.m. UTC
I found a regression on mm-unstable during my swap stress test,
using tmpfs to compile Linux. The test hits OOM very soon after
make spawns many cc processes.

It bisects down to this change: 33dfe9204f29b415bbc0abb1a50642d1ba94f5e9
(mm/gup: clear the LRU flag of a page before adding to LRU batch)

Yu Zhao proposed the fix: "I think this is one of the potential side
effects -- Hugh mentioned earlier about isolate_lru_folios():"

I tested that with the fix applied the swap stress test no longer hits OOM.

Link: https://lore.kernel.org/r/CAOUHufYi9h0kz5uW3LHHS3ZrVwEq-kKp8S6N-MZUmErNAXoXmw@mail.gmail.com/
Fixes: 33dfe9204f29 ("mm/gup: clear the LRU flag of a page before adding to LRU batch")
Suggested-by: Yu Zhao <yuzhao@google.com>
Suggested-by: Hugh Dickins <hughd@google.com>
Tested-by: Chris Li <chrisl@kernel.org>
Signed-off-by: Chris Li <chrisl@kernel.org>
---
 mm/vmscan.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


---
base-commit: 756ca36d643324d028b325a170e73e392b9590cd
change-id: 20240904-lru-flag-2af2f955740e

Best regards,

Comments

Yu Zhao Sept. 5, 2024, 6:42 a.m. UTC | #1
On Thu, Sep 5, 2024 at 12:21 AM Chris Li <chrisl@kernel.org> wrote:
>
> I found a regression on mm-unstable during my swap stress test,
> using tmpfs to compile linux. The test OOM very soon after
> the make spawns many cc processes.
>
> It bisects down to this change: 33dfe9204f29b415bbc0abb1a50642d1ba94f5e9
> (mm/gup: clear the LRU flag of a page before adding to LRU batch)
>
> Yu Zhao propose the fix: "I think this is one of the potential side
> effects -- Huge mentioned earlier about isolate_lru_folios():"
>
> I test that with it the swap stress test no longer OOM.
>
> Link: https://lore.kernel.org/r/CAOUHufYi9h0kz5uW3LHHS3ZrVwEq-kKp8S6N-MZUmErNAXoXmw@mail.gmail.com/
> Fixes: 33dfe9204f29 ("mm/gup: clear the LRU flag of a page before adding to LRU batch")
> Suggested-by: Yu Zhao <yuzhao@google.com>
> Suggested-by: Hugh Dickins <hughd@google.com>
> Tested-by: Chris Li <chrisl@kernel.org>
> Signed-off-by: Chris Li <chrisl@kernel.org>

Closes: https://lore.kernel.org/56651be8-1466-475f-b1c5-4087995cc5ae@leemhuis.info/
Thorsten Leemhuis Sept. 5, 2024, 6:53 a.m. UTC | #2
On 05.09.24 08:42, Yu Zhao wrote:

Thx for taking care of this, Chris!

> Closes: https://lore.kernel.org/56651be8-1466-475f-b1c5-4087995cc5ae@leemhuis.info/

FWIW, no big deal, but that ideally should be (in general and for
regression tracking) the following instead, as the link above points
to the end of the thread with the report, not to the report itself --
and the report is what is often needed when someone has to look up the
backstory of this change sooner or later:

Closes:
https://lore.kernel.org/all/CAF8kJuNP5iTj2p07QgHSGOJsiUfYpJ2f4R1Q5-3BN9JiD9W_KA@mail.gmail.com/

Ciao, Thorsten
Chris Li Sept. 5, 2024, 8:19 a.m. UTC | #3
On Wed, Sep 4, 2024 at 11:54 PM Thorsten Leemhuis
<regressions@leemhuis.info> wrote:
>
> Thx for taking care of this, Chris!
>
> > Closes: https://lore.kernel.org/56651be8-1466-475f-b1c5-4087995cc5ae@leemhuis.info/
>
> FWIW, no big deal, but that ideally should be (in general and for
> regression tracking) the following instead, as the link above points
> to the end of the thread with the report, not to the report itself --
> and the report is what is often needed when someone has to look up the
> backstory of this change sooner or later:
>
> Closes:
> https://lore.kernel.org/all/CAF8kJuNP5iTj2p07QgHSGOJsiUfYpJ2f4R1Q5-3BN9JiD9W_KA@mail.gmail.com/

Thank you, Yu and Thorsten,

I just submitted v2 to include the Closes tag. Technically it is
past midnight here, so it is another day and I can submit another
version :-).

Chris
Patch

diff --git a/mm/vmscan.c b/mm/vmscan.c
index a9b6a8196f95..96abf4a52382 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -4323,7 +4323,7 @@  static bool sort_folio(struct lruvec *lruvec, struct folio *folio, struct scan_c
 	}
 
 	/* ineligible */
-	if (zone > sc->reclaim_idx) {
+	if (!folio_test_lru(folio) || zone > sc->reclaim_idx) {
 		gen = folio_inc_gen(lruvec, folio, false);
 		list_move_tail(&folio->lru, &lrugen->folios[gen][type][zone]);
 		return true;
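
For readers who want to see the one-line change in isolation, here is a minimal userspace sketch of the "ineligible" condition before and after the patch. It is not kernel code: struct model_folio, struct model_scan_control, ineligible_before(), ineligible_after() and model_self_check() are hypothetical names made up for illustration, and the lru field only stands in for what folio_test_lru() would report. The assumption, based on the commit named in the Fixes tag, is that a folio sitting in an LRU batch has its LRU flag cleared, so the old zone-only check would not skip it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel's structures. */
struct model_folio {
	bool lru;	/* stands in for the result of folio_test_lru() */
	int zone;	/* stands in for the folio's zone index */
};

struct model_scan_control {
	int reclaim_idx;	/* highest zone index eligible for reclaim */
};

/* Condition before the patch: only the zone is checked. */
bool ineligible_before(const struct model_folio *folio,
		       const struct model_scan_control *sc)
{
	return folio->zone > sc->reclaim_idx;
}

/* Condition after the patch: a folio with the LRU flag clear is skipped too. */
bool ineligible_after(const struct model_folio *folio,
		      const struct model_scan_control *sc)
{
	return !folio->lru || folio->zone > sc->reclaim_idx;
}

/* Self-check covering the three interesting cases. */
void model_self_check(void)
{
	struct model_scan_control sc = { .reclaim_idx = 2 };
	struct model_folio batched = { .lru = false, .zone = 1 };
	struct model_folio normal = { .lru = true, .zone = 1 };
	struct model_folio high_zone = { .lru = true, .zone = 3 };

	/* LRU flag clear, eligible zone: only the patched check skips it. */
	assert(!ineligible_before(&batched, &sc));
	assert(ineligible_after(&batched, &sc));

	/* LRU flag set, eligible zone: neither check skips it. */
	assert(!ineligible_before(&normal, &sc));
	assert(!ineligible_after(&normal, &sc));

	/* Zone above reclaim_idx: both checks skip it. */
	assert(ineligible_before(&high_zone, &sc));
	assert(ineligible_after(&high_zone, &sc));
}
```

In the real sort_folio(), a folio that matches the condition is moved to the next generation via folio_inc_gen() and list_move_tail(), as the context lines of the hunk show. The sketch models only the predicate, not that list handling.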