Message ID | 20220310113116.2068859-1-yaliang.wang@windriver.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 2bc5bab9a763d520937e4f3fe8df51c6a1eceb97 |
Headers | show |
Series | MIPS: pgalloc: fix memory leak caused by pgd_free() | expand |
On Thu, Mar 10, 2022 at 07:31:16PM +0800, yaliang.wang@windriver.com wrote: > From: Yaliang Wang <Yaliang.Wang@windriver.com> > > pgd page is freed by generic implementation pgd_free() since commit > f9cb654cb550 ("asm-generic: pgalloc: provide generic pgd_free()"), > however, there are scenarios that the system uses more than one page as > the pgd table, in such cases the generic implementation pgd_free() won't > be applicable anymore. For example, when PAGE_SIZE_4KB is enabled and > MIPS_VA_BITS_48 is not enabled in a 64bit system, the macro "PGD_ORDER" > will be set as "1", which will cause allocating two pages as the pgd > table. Well, at the same time, the generic implementation pgd_free() > just free one pgd page, which will result in the memory leak. > > The memory leak can be easily detected by executing shell command: > "while true; do ls > /dev/null; grep MemFree /proc/meminfo; done" > > Fixes: f9cb654cb550 ("asm-generic: pgalloc: provide generic pgd_free()") > Signed-off-by: Yaliang Wang <Yaliang.Wang@windriver.com> > --- > arch/mips/include/asm/pgalloc.h | 6 ++++++ > 1 file changed, 6 insertions(+) applied to mips-next. Thomas.
On Thu, 10 Mar 2022, yaliang.wang@windriver.com wrote: > pgd page is freed by generic implementation pgd_free() since commit > f9cb654cb550 ("asm-generic: pgalloc: provide generic pgd_free()"), > however, there are scenarios that the system uses more than one page as > the pgd table, in such cases the generic implementation pgd_free() won't > be applicable anymore. For example, when PAGE_SIZE_4KB is enabled and > MIPS_VA_BITS_48 is not enabled in a 64bit system, the macro "PGD_ORDER" > will be set as "1", which will cause allocating two pages as the pgd > table. Well, at the same time, the generic implementation pgd_free() > just free one pgd page, which will result in the memory leak. > > The memory leak can be easily detected by executing shell command: > "while true; do ls > /dev/null; grep MemFree /proc/meminfo; done" > > Fixes: f9cb654cb550 ("asm-generic: pgalloc: provide generic pgd_free()") > Signed-off-by: Yaliang Wang <Yaliang.Wang@windriver.com> As a critical regression shouldn't this have been marked for backporting to stable branches? Maciej
On 3/4/2022 12:48 am, Maciej W. Rozycki wrote: > On Thu, 10 Mar 2022, yaliang.wang@windriver.com wrote: > >> pgd page is freed by generic implementation pgd_free() since commit >> f9cb654cb550 ("asm-generic: pgalloc: provide generic pgd_free()"), >> however, there are scenarios that the system uses more than one page as >> the pgd table, in such cases the generic implementation pgd_free() won't >> be applicable anymore. For example, when PAGE_SIZE_4KB is enabled and >> MIPS_VA_BITS_48 is not enabled in a 64bit system, the macro "PGD_ORDER" >> will be set as "1", which will cause allocating two pages as the pgd >> table. Well, at the same time, the generic implementation pgd_free() >> just free one pgd page, which will result in the memory leak. >> >> The memory leak can be easily detected by executing shell command: >> "while true; do ls > /dev/null; grep MemFree /proc/meminfo; done" >> >> Fixes: f9cb654cb550 ("asm-generic: pgalloc: provide generic pgd_free()") >> Signed-off-by: Yaliang Wang <Yaliang.Wang@windriver.com> > > As a critical regression shouldn't this have been marked for backporting > to stable branches? Very yes please - this bug has been driving several of us at OpenWrt crazy for quite[1] some[2] time now, mostly on Octeon devices. We'd (wrongly) suspected the octeon-ethernet driver, but this morning finally bisected it down to f9cb654cb550 and can confirm this patch fixes the regression. MIPS64 has essentially been broken/unusable for 8 kernel releases, including two LTS kernels, since the original commit landed. Should there not have been CI/tests that caught this? It's pretty major! - Andrew [1] https://forum.openwrt.org/t/oom-killer-dnsmasq-when-physical-free-ram-remains/109351 [2] https://forum.openwrt.org/t/upstream-kernel-memleak-5-10-octeon-ethernet-ko/111827
Hi there, This fix should backported. Effectively, all mips64-based Cavium Octeon processors have been broken since the original commit; this has been a persistent and surreptitious issue for the OpenWrt community since nearly a year ago (referencing https://forum.openwrt.org/t/upstream-kernel-memleak-5-10-octeon-ethernet-ko/111827/), and nearly ended with OpenWrt marking the target as entirely unsupported and moving on. We understand that there are ultimately few users of mips64, but it's really important that our memory management work. While OpenWrt can backport until upstream inclusion, I am sure there are other users and software platforms unaware of this fix (and unable to take advantage of it), thus making those platforms on this arch completely unusable.
On Sun, 3 Apr 2022, Andrew Holmes wrote: > MIPS64 has essentially been broken/unusable for 8 kernel releases, > including two LTS kernels, since the original commit landed. Should > there not have been CI/tests that caught this? It's pretty major! AFAIK the MIPS port is only maintained on the best effort basis nowadays I'm afraid. I.e. it's enthusiasts investing their free time for the joy of fiddling with things. So things are bound to break from time to time and remain unnoticed for a while. We're doing our best, but our resources are limited. Taking these limitations into account I think Thomas has been doing a tremendous job maintaining the MIPS port, but he hasn't been cc-ed on the submission of the original change and it's very easy to miss stuff in the flood that has only been posted to a mailing list. Maciej
On 3/04/2022 8:37 pm, Maciej W. Rozycki wrote: > AFAIK the MIPS port is only maintained on the best effort basis > nowadays I'm afraid. I.e. it's enthusiasts investing their free time > for the joy of fiddling with things. So things are bound to break > from time to time and remain unnoticed for a while. We're doing our > best, but our resources are limited. > > Taking these limitations into account I think Thomas has been doing a > tremendous job maintaining the MIPS port, but he hasn't been cc-ed on > the submission of the original change and it's very easy to miss > stuff in the flood that has only been posted to a mailing list. > > Maciej Fair enough :) apologies, didn't mean to sound combative or ungrateful. I know there's far more work to go around than people to do it, everyone's doing the best they can, and I have nothing but appreciation for all the work the kernel community does. It's just surprising that this *could* go unnoticed for over a year - though I suppose most of the MIPS64 systems out there are running on one or another old vendor SDK kernel so won't have been affected... Would the best way to get this merged into 5.10/15 (and maybe .16 just for good measure) be to email the stable team (since it's already in Linus' tree)? Documentation/process/stable-kernel-rules seems to say yes, but I'd like to avoid stepping on anyone's toes given that it's not my patch. - Andrew
On 4/3/2022 06:37, Maciej W. Rozycki wrote: > On Sun, 3 Apr 2022, Andrew Holmes wrote: > >> MIPS64 has essentially been broken/unusable for 8 kernel releases, >> including two LTS kernels, since the original commit landed. Should >> there not have been CI/tests that caught this? It's pretty major! > > AFAIK the MIPS port is only maintained on the best effort basis nowadays > I'm afraid. I.e. it's enthusiasts investing their free time for the joy > of fiddling with things. So things are bound to break from time to time > and remain unnoticed for a while. We're doing our best, but our resources > are limited. > > Taking these limitations into account I think Thomas has been doing a > tremendous job maintaining the MIPS port, but he hasn't been cc-ed on the > submission of the original change and it's very easy to miss stuff in the > flood that has only been posted to a mailing list. > > Maciej > FWIW, hot off the presses is RFC9225: https://datatracker.ietf.org/doc/html/rfc9225 4. Best Current Practises 1. Authors MUST NOT implement bugs. 2. If bugs are introduced in code, they MUST be clearly documented. 3. When implementing specifications that are broken by design, it is RECOMMENDED to aggregate multiple smaller bugs into one larger bug. This will be easier to document: rather than having a lot of hard-to-track inconsequential bugs, there will be only a few easy-to-recognise significant bugs. 4. The aphorism "It's not a bug, it's a feature" is considered rude. 5. Assume all external input is the result of (a series of) bugs. (Especially in machine-to-machine applications such as implementations of network protocols.) 6. In fact, assume all internal inputs also are the result of bugs.
On Mon, 4 Apr 2022, Andrew Powers-Holmes wrote: > Fair enough :) apologies, didn't mean to sound combative or ungrateful. > I know there's far more work to go around than people to do it, > everyone's doing the best they can, and I have nothing but appreciation > for all the work the kernel community does. No offence taken; I just wanted to make it absolutely clear what the situation currently is. > It's just surprising that this *could* go unnoticed for over a year - > though I suppose most of the MIPS64 systems out there are running on one > or another old vendor SDK kernel so won't have been affected... That's subject to the probability theory and depending on what people's usage models are. > Would the best way to get this merged into 5.10/15 (and maybe .16 just > for good measure) be to email the stable team (since it's already in > Linus' tree)? Documentation/process/stable-kernel-rules seems to say > yes, but I'd like to avoid stepping on anyone's toes given that it's not > my patch. You seem the most severely affected so far, so why not act in your best interest? I think option #2 applies here and seems quite straightforward to follow, referring commit 2bc5bab9a763 and using your use case as the justification. It doesn't have to be the author to request a backport. NB I think it has to be backported to all the stable branches made since the original breakage; i.e. v5.9+ (I haven't kept track of what they are). Maciej
On Tue, Apr 05, 2022 at 10:42:21AM +0100, Maciej W. Rozycki wrote: > On Mon, 4 Apr 2022, Andrew Powers-Holmes wrote: > > > Would the best way to get this merged into 5.10/15 (and maybe .16 just > > for good measure) be to email the stable team (since it's already in > > Linus' tree)? Documentation/process/stable-kernel-rules seems to say > > yes, but I'd like to avoid stepping on anyone's toes given that it's not > > my patch. > > You seem the most severely affected so far, so why not act in your best > interest? I think option #2 applies here and seems quite straightforward > to follow, referring commit 2bc5bab9a763 and using your use case as the > justification. It doesn't have to be the author to request a backport. > > NB I think it has to be backported to all the stable branches made since > the original breakage; i.e. v5.9+ (I haven't kept track of what they are). the fix has a Fixes tag so it will usually ported to stable/longterm kernels. I already saw it in Greg's patch bombs. Thomas.
On Tue, 5 Apr 2022, Thomas Bogendoerfer wrote: > > NB I think it has to be backported to all the stable branches made since > > the original breakage; i.e. v5.9+ (I haven't kept track of what they are). > > the fix has a Fixes tag so it will usually ported to stable/longterm kernels. > I already saw it in Greg's patch bombs. Hmm, not all fixes qualify for or indeed are worth backporting and I'd expect those that have no stable annotation to remain on trunk only. I have been following this principle with my submissions anyway. Indeed I can see a backport to 5.17 has literally just been posted in this humongous patch set, but in this case I suspect Greg has just picked this up by hand (thanks, Greg!) having seen this discussion (though how he manages to escape alive through the flood of messages has been astonishing me). Maciej
diff --git a/arch/mips/include/asm/pgalloc.h b/arch/mips/include/asm/pgalloc.h index c7925d0e9874..867e9c3db76e 100644 --- a/arch/mips/include/asm/pgalloc.h +++ b/arch/mips/include/asm/pgalloc.h @@ -15,6 +15,7 @@ #define __HAVE_ARCH_PMD_ALLOC_ONE #define __HAVE_ARCH_PUD_ALLOC_ONE +#define __HAVE_ARCH_PGD_FREE #include <asm-generic/pgalloc.h> static inline void pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmd, @@ -48,6 +49,11 @@ static inline void pud_populate(struct mm_struct *mm, pud_t *pud, pmd_t *pmd) extern void pgd_init(unsigned long page); extern pgd_t *pgd_alloc(struct mm_struct *mm); +static inline void pgd_free(struct mm_struct *mm, pgd_t *pgd) +{ + free_pages((unsigned long)pgd, PGD_ORDER); +} + #define __pte_free_tlb(tlb,pte,address) \ do { \ pgtable_pte_page_dtor(pte); \