Message ID | CAN2Y7hxDdATNfb=R5J1as3pqA1RsP8c8LubC4QxojK5cJS9Q9w@mail.gmail.com (mailing list archive) |
---|---|
State | New |
Series | mm/vmscan: when the swappiness is set to 0, memory swapping should be prohibited during the global reclaim process |
On Thu, 27 Feb 2025 22:34:51 +0800 ying chen <yc1082463@gmail.com> wrote:

Hi Ying,

I hope you are having a great day! I wanted to share a few thoughts:

Previously, when the system is under a lot of memory pressure and is
facing OOMs, global reclaim can create space for the system and prevent
going out of memory by swapping, even when swappiness is 0. If this patch
removes that check, it would mean that global reclaim can no longer
"bypass" the swappiness == 0 condition.

I am also CCing Johannes, who is the original author of this section [1],
who clarified in the patch that swappiness == 0 has different meanings for
global reclaim and memory cgroup reclaim.

> When we use zram as swap disks, global reclaim may cause the memory in some
> cgroups with memory.swappiness set to 0 to be swapped into zram. This memory
> won't be swapped back immediately after the free memory increases. Instead,
> it will continue to occupy the zram space, which may result in no available
> zram space for the cgroups with swapping enabled. Therefore, I think that

IMHO, I think that even with zram, we would still want to allow the system
to reclaim memory & swap out, in case we are facing imminent OOMs. Even if
the memory isn't immediately swapped back in when we are able to manage the
memory spike and see free memory, I imagine that we might not even be able
to manage the spike if we prevent global reclaim from swapping.

These are just some thoughts that I had about the patch. However, my
understanding of zram and reclaim is limited; please feel free to
correct me if you see anything that I am not understanding correctly.

Thank you for your time, have a great day!
Joshua

[1] https://lore.kernel.org/linux-mm/1355767957-4913-4-git-send-email-hannes@cmpxchg.org/

> when the vm.swappiness is set to 0, global reclaim should also refrain
> from memory swapping, just like these cgroups.
>
> Signed-off-by: yc1082463 <yc1082463@gmail.com>
> ---
>  mm/vmscan.c | 9 +--------
>  1 file changed, 1 insertion(+), 8 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index c767d71c43d7..bdbb0fc03412 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2426,14 +2426,7 @@ static void get_scan_count(struct lruvec *lruvec, struct scan_control *sc,
>  		goto out;
>  	}
>
> -	/*
> -	 * Global reclaim will swap to prevent OOM even with no
> -	 * swappiness, but memcg users want to use this knob to
> -	 * disable swapping for individual groups completely when
> -	 * using the memory controller's swap limit feature would be
> -	 * too expensive.
> -	 */
> -	if (cgroup_reclaim(sc) && !swappiness) {
> +	if (!swappiness) {
>  		scan_balance = SCAN_FILE;
>  		goto out;
>  	}
> --
> 2.34.1

Sent using hkml (https://github.com/sjp38/hackermail)
Hello,

On Thu, Feb 27, 2025 at 07:54:27AM -0800, Joshua Hahn wrote:
> On Thu, 27 Feb 2025 22:34:51 +0800 ying chen <yc1082463@gmail.com> wrote:
> Previously, when the system is under a lot of memory pressure and is
> facing OOMs, global reclaim can create space for the system and prevent
> going out of memory by swapping, even when swappiness is 0. If this patch
> removes that check, it would mean that global reclaim can no longer
> "bypass" the swappiness == 0 condition.
>
> I am also CCing Johannes, who is the original author of this section [1],
> who clarified in the patch that swappiness == 0 has different meanings for
> global reclaim and memory cgroup reclaim.

Yes. It's been the behavior for decades that swappiness is merely a
preference, and that the VM *will* swap to avert OOM. You would break
users making this change.

If you want to hard-exempt cgroups, set memory.swap.max=0.

[ Yes, it's inconsistent. But it's really cgroup_reclaim() that is the
  oddball in this. Also for historical reasons... ]

> > when the vm.swappiness is set to 0, global reclaim should also refrain
> > from memory swapping, just like these cgroups.
> >
> > Signed-off-by: yc1082463 <yc1082463@gmail.com>

Nacked-by: Johannes Weiner <hannes@cmpxchg.org>
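
The difference between the current check and the posted patch can be sketched as a small user-space model. This is only an illustration of the one condition under discussion, not the kernel code: the enum and both function names are hypothetical, and the real get_scan_count() takes other factors into account (for example, whether any swap space is available at all) that are ignored here.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's scan_balance outcome. */
enum model_balance {
	MODEL_SCAN_ANON_AND_FILE,	/* anonymous pages may be swapped out */
	MODEL_SCAN_FILE_ONLY		/* only file pages are scanned: no swapping */
};

/*
 * Current behaviour: swappiness == 0 is a hard "no swap" only for
 * cgroup (limit) reclaim. Global reclaim may still swap to avert OOM.
 */
enum model_balance current_policy(bool is_cgroup_reclaim, int swappiness)
{
	if (is_cgroup_reclaim && swappiness == 0)
		return MODEL_SCAN_FILE_ONLY;
	return MODEL_SCAN_ANON_AND_FILE;
}

/*
 * Behaviour with the posted patch: swappiness == 0 forbids swapping
 * for every kind of reclaim, including global reclaim.
 */
enum model_balance patched_policy(bool is_cgroup_reclaim, int swappiness)
{
	(void)is_cgroup_reclaim;	/* no longer consulted */
	if (swappiness == 0)
		return MODEL_SCAN_FILE_ONLY;
	return MODEL_SCAN_ANON_AND_FILE;
}
```

The two models differ only for global reclaim with swappiness == 0, which is exactly the case where existing users rely on swapping as a last resort before OOM.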
On Thu, Feb 27, 2025 at 10:34:51PM +0800, ying chen wrote:
> When we use zram as swap disks, global reclaim may cause the memory in some
> cgroups with memory.swappiness set to 0 to be swapped into zram. This memory
> won't be swapped back immediately after the free memory increases. Instead,
> it will continue to occupy the zram space, which may result in no available
> zram space for the cgroups with swapping enabled. Therefore, I think that
> when the vm.swappiness is set to 0, global reclaim should also refrain
> from memory swapping, just like these cgroups.
>
> Signed-off-by: yc1082463 <yc1082463@gmail.com>

It seems like you are still on memcg-v1. What is stopping you to move to
memcg-v2 and use memory.swap.max = 0?
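
For readers unfamiliar with the memcg-v2 knob mentioned above, the following is a minimal sketch of how a management daemon might set it from C. The helper name set_swap_max() is hypothetical, and the cgroup directory is passed in by the caller; on a typical system the cgroup v2 hierarchy is mounted under /sys/fs/cgroup, but that location and the group name are assumptions about the deployment, not something stated in this thread.

```c
#include <stdio.h>

/*
 * Hypothetical helper: write `value` (for example "0" or "max") to
 * <cgroup_dir>/memory.swap.max.
 * Returns 0 on success and -1 on any failure.
 */
int set_swap_max(const char *cgroup_dir, const char *value)
{
	char path[4096];
	FILE *f;
	int n;

	if (!cgroup_dir || !value)
		return -1;

	/* Build the path and reject it if it does not fit the buffer. */
	n = snprintf(path, sizeof(path), "%s/memory.swap.max", cgroup_dir);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	f = fopen(path, "w");
	if (!f)
		return -1;

	if (fputs(value, f) == EOF) {
		fclose(f);
		return -1;
	}

	/* fclose() flushes the buffered write, so its result matters. */
	return fclose(f) == 0 ? 0 : -1;
}
```

A caller would use it as set_swap_max("/sys/fs/cgroup/<group>", "0"), where <group> is a placeholder for a non-root cgroup directory.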
Yes, I'm still using memcg-v1. But it's too expensive for us to
migrate the production environment to memcg-v2.

On Fri, Feb 28, 2025 at 3:12 AM Shakeel Butt <shakeel.butt@linux.dev> wrote:
>
> On Thu, Feb 27, 2025 at 10:34:51PM +0800, ying chen wrote:
> > When we use zram as swap disks, global reclaim may cause the memory in some
> > cgroups with memory.swappiness set to 0 to be swapped into zram. This memory
> > won't be swapped back immediately after the free memory increases. Instead,
> > it will continue to occupy the zram space, which may result in no available
> > zram space for the cgroups with swapping enabled. Therefore, I think that
> > when the vm.swappiness is set to 0, global reclaim should also refrain
> > from memory swapping, just like these cgroups.
> >
> > Signed-off-by: yc1082463 <yc1082463@gmail.com>
>
> It seems like you are still on memcg-v1. What is stopping you to move to
> memcg-v2 and use memory.swap.max = 0?
>
We only create a limited zram disk size for the cgroups that allow
swapping. If this part of the swap space is occupied by other cgroups
that don't allow swapping, the cgroups that allow swapping may not
have enough swap space available.

On Thu, Feb 27, 2025 at 11:54 PM Joshua Hahn <joshua.hahnjy@gmail.com> wrote:
>
> On Thu, 27 Feb 2025 22:34:51 +0800 ying chen <yc1082463@gmail.com> wrote:
>
> Hi Ying,
>
> I hope you are having a great day! I wanted to share a few thoughts:
>
> Previously, when the system is under a lot of memory pressure and is
> facing OOMs, global reclaim can create space for the system and prevent
> going out of memory by swapping, even when swappiness is 0. If this patch
> removes that check, it would mean that global reclaim can no longer
> "bypass" the swappiness == 0 condition.
>
> I am also CCing Johannes, who is the original author of this section [1],
> who clarified in the patch that swappiness == 0 has different meanings for
> global reclaim and memory cgroup reclaim.
>
> > When we use zram as swap disks, global reclaim may cause the memory in some
> > cgroups with memory.swappiness set to 0 to be swapped into zram. This memory
> > won't be swapped back immediately after the free memory increases. Instead,
> > it will continue to occupy the zram space, which may result in no available
> > zram space for the cgroups with swapping enabled. Therefore, I think that
>
> IMHO, I think that even with zram, we would still want to allow the system
> to reclaim memory & swap out, in case we are facing imminent OOMs. Even if
> the memory isn't immediately swapped back in when we are able to manage the
> memory spike and see free memory, I imagine that we might not even be able
> to manage the spike if we prevent global reclaim from swapping.
>
> These are just some thoughts that I had about the patch. However, my
> understanding of zram and reclaim is limited; please feel free to
> correct me if you see anything that I am not understanding correctly.
>
> Thank you for your time, have a great day!
> Joshua
>
> [1] https://lore.kernel.org/linux-mm/1355767957-4913-4-git-send-email-hannes@cmpxchg.org/
>
> > when the vm.swappiness is set to 0, global reclaim should also refrain
> > from memory swapping, just like these cgroups.
> >
> > Signed-off-by: yc1082463 <yc1082463@gmail.com>
> > ---
> >  mm/vmscan.c | 9 +--------
> >  1 file changed, 1 insertion(+), 8 deletions(-)
> >
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index c767d71c43d7..bdbb0fc03412 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -2426,14 +2426,7 @@ static void get_scan_count(struct lruvec *lruvec, struct scan_control *sc,
> >  		goto out;
> >  	}
> >
> > -	/*
> > -	 * Global reclaim will swap to prevent OOM even with no
> > -	 * swappiness, but memcg users want to use this knob to
> > -	 * disable swapping for individual groups completely when
> > -	 * using the memory controller's swap limit feature would be
> > -	 * too expensive.
> > -	 */
> > -	if (cgroup_reclaim(sc) && !swappiness) {
> > +	if (!swappiness) {
> >  		scan_balance = SCAN_FILE;
> >  		goto out;
> >  	}
> > --
> > 2.34.1
>
> Sent using hkml (https://github.com/sjp38/hackermail)
>
Got it. Thank you very much.

On Fri, Feb 28, 2025 at 12:19 AM Johannes Weiner <hannes@cmpxchg.org> wrote:
>
> Hello,
>
> On Thu, Feb 27, 2025 at 07:54:27AM -0800, Joshua Hahn wrote:
> > On Thu, 27 Feb 2025 22:34:51 +0800 ying chen <yc1082463@gmail.com> wrote:
> >
> > Previously, when the system is under a lot of memory pressure and is
> > facing OOMs, global reclaim can create space for the system and prevent
> > going out of memory by swapping, even when swappiness is 0. If this patch
> > removes that check, it would mean that global reclaim can no longer
> > "bypass" the swappiness == 0 condition.
> >
> > I am also CCing Johannes, who is the original author of this section [1],
> > who clarified in the patch that swappiness == 0 has different meanings for
> > global reclaim and memory cgroup reclaim.
>
> Yes. It's been the behavior for decades that swappiness is merely a
> preference, and that the VM *will* swap to avert OOM. You would break
> users making this change.
>
> If you want to hard-exempt cgroups, set memory.swap.max=0.
>
> [ Yes, it's inconsistent. But it's really cgroup_reclaim() that is the
>   oddball in this. Also for historical reasons... ]
>
> > > when the vm.swappiness is set to 0, global reclaim should also refrain
> > > from memory swapping, just like these cgroups.
> > >
> > > Signed-off-by: yc1082463 <yc1082463@gmail.com>
>
> Nacked-by: Johannes Weiner <hannes@cmpxchg.org>
On Fri, Feb 28, 2025 at 12:19 AM Johannes Weiner <hannes@cmpxchg.org> wrote:
>
> Hello,
>
> On Thu, Feb 27, 2025 at 07:54:27AM -0800, Joshua Hahn wrote:
> > On Thu, 27 Feb 2025 22:34:51 +0800 ying chen <yc1082463@gmail.com> wrote:
> >
> > Previously, when the system is under a lot of memory pressure and is
> > facing OOMs, global reclaim can create space for the system and prevent
> > going out of memory by swapping, even when swappiness is 0. If this patch
> > removes that check, it would mean that global reclaim can no longer
> > "bypass" the swappiness == 0 condition.
> >
> > I am also CCing Johannes, who is the original author of this section [1],
> > who clarified in the patch that swappiness == 0 has different meanings for
> > global reclaim and memory cgroup reclaim.
>
> Yes. It's been the behavior for decades that swappiness is merely a
> preference, and that the VM *will* swap to avert OOM. You would break
> users making this change.

Hello Johannes,

How about introducing a new value, vm.swappiness=-1, to disable
swapping for global reclaim?

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 76378bc257e3..4c22352c331c 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2387,13 +2387,19 @@ static void get_scan_count(struct lruvec *lruvec, struct scan_control *sc,
 	}

 	/*
-	 * Global reclaim will swap to prevent OOM even with no
-	 * swappiness, but memcg users want to use this knob to
-	 * disable swapping for individual groups completely when
-	 * using the memory controller's swap limit feature would be
-	 * too expensive.
+	 * swappiness > 0:
+	 * Swapping is enabled for both global reclaim and memcg reclaim.
+	 *
+	 * swappiness = 0:
+	 * Swapping is completely disabled for individual groups when using
+	 * the memory controller's swap limit feature would be too costly.
+	 *
+	 * swappiness = -1:
+	 * Swapping is disabled for both global reclaim and memcg reclaim.
+	 * This is useful when you want to enable swapping for certain
+	 * memory cgroups while disabling it for others.
 	 */
-	if (cgroup_reclaim(sc) && !swappiness) {
+	if ((cgroup_reclaim(sc) && !swappiness) || swappiness == -1) {
 		scan_balance = SCAN_FILE;
 		goto out;
 	}

Other parts of the code will also need to be updated to accommodate
this new swappiness value.

>
> If you want to hard-exempt cgroups, set memory.swap.max=0.

This does not apply to the root memcg.

>
> [ Yes, it's inconsistent. But it's really cgroup_reclaim() that is the
>   oddball in this. Also for historical reasons... ]
>
> > > when the vm.swappiness is set to 0, global reclaim should also refrain
> > > from memory swapping, just like these cgroups.
> > >
> > > Signed-off-by: yc1082463 <yc1082463@gmail.com>
>
> Nacked-by: Johannes Weiner <hannes@cmpxchg.org>
>
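
The three-way semantics proposed in this message can be modelled in user space as follows. This is a sketch of the proposal only: swappiness == -1 is not an accepted kernel value, the enum and function names are hypothetical, and every other input to get_scan_count() is left out.

```c
#include <stdbool.h>

/* Hypothetical outcome of the proposed check. */
enum proposed_balance {
	PROPOSED_SCAN_ANON_AND_FILE,	/* swapping allowed */
	PROPOSED_SCAN_FILE_ONLY		/* swapping disabled */
};

/*
 * Proposed semantics:
 *   swappiness  > 0: swapping enabled for global and memcg reclaim
 *   swappiness == 0: swapping disabled for memcg reclaim only
 *   swappiness == -1: swapping disabled for global and memcg reclaim
 */
enum proposed_balance proposed_policy(bool is_cgroup_reclaim, int swappiness)
{
	if (swappiness == -1)
		return PROPOSED_SCAN_FILE_ONLY;
	if (is_cgroup_reclaim && swappiness == 0)
		return PROPOSED_SCAN_FILE_ONLY;
	return PROPOSED_SCAN_ANON_AND_FILE;
}
```

Under this model the existing meaning of swappiness == 0 is unchanged, and only the new -1 value prevents global reclaim from swapping.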
On Thu 27-02-25 22:34:51, ying chen wrote:
> When we use zram as swap disks, global reclaim may cause the memory in some
> cgroups with memory.swappiness set to 0 to be swapped into zram. This memory
> won't be swapped back immediately after the free memory increases. Instead,
> it will continue to occupy the zram space, which may result in no available
> zram space for the cgroups with swapping enabled. Therefore, I think that
> when the vm.swappiness is set to 0, global reclaim should also refrain
> from memory swapping, just like these cgroups.

You are changing well established and understood semantic while working
around a problem that is not really clear to me. If the zram space is
limited then you should be using swap limits to control who can swap
out, no?

> Signed-off-by: yc1082463 <yc1082463@gmail.com>
> ---
>  mm/vmscan.c | 9 +--------
>  1 file changed, 1 insertion(+), 8 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index c767d71c43d7..bdbb0fc03412 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2426,14 +2426,7 @@ static void get_scan_count(struct lruvec *lruvec, struct scan_control *sc,
>  		goto out;
>  	}
>
> -	/*
> -	 * Global reclaim will swap to prevent OOM even with no
> -	 * swappiness, but memcg users want to use this knob to
> -	 * disable swapping for individual groups completely when
> -	 * using the memory controller's swap limit feature would be
> -	 * too expensive.
> -	 */
> -	if (cgroup_reclaim(sc) && !swappiness) {
> +	if (!swappiness) {
>  		scan_balance = SCAN_FILE;
>  		goto out;
>  	}
> --
> 2.34.1
When we use zram as swap disks, global reclaim may cause the memory in some
cgroups with memory.swappiness set to 0 to be swapped into zram. This memory
won't be swapped back immediately after the free memory increases. Instead,
it will continue to occupy the zram space, which may result in no available
zram space for the cgroups with swapping enabled. Therefore, I think that
when the vm.swappiness is set to 0, global reclaim should also refrain
from memory swapping, just like these cgroups.

Signed-off-by: yc1082463 <yc1082463@gmail.com>
---
 mm/vmscan.c | 9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index c767d71c43d7..bdbb0fc03412 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2426,14 +2426,7 @@ static void get_scan_count(struct lruvec *lruvec, struct scan_control *sc,
 		goto out;
 	}

-	/*
-	 * Global reclaim will swap to prevent OOM even with no
-	 * swappiness, but memcg users want to use this knob to
-	 * disable swapping for individual groups completely when
-	 * using the memory controller's swap limit feature would be
-	 * too expensive.
-	 */
-	if (cgroup_reclaim(sc) && !swappiness) {
+	if (!swappiness) {
 		scan_balance = SCAN_FILE;
 		goto out;
 	}
--
2.34.1