diff mbox series

MIPS: pgalloc: fix memory leak caused by pgd_free()

Message ID 20220310113116.2068859-1-yaliang.wang@windriver.com (mailing list archive)
State Accepted
Commit 2bc5bab9a763d520937e4f3fe8df51c6a1eceb97
Headers show
Series MIPS: pgalloc: fix memory leak caused by pgd_free() | expand

Commit Message

yaliang.wang@windriver.com March 10, 2022, 11:31 a.m. UTC
From: Yaliang Wang <Yaliang.Wang@windriver.com>

pgd page is freed by generic implementation pgd_free() since commit
f9cb654cb550 ("asm-generic: pgalloc: provide generic pgd_free()"),
however, there are scenarios that the system uses more than one page as
the pgd table, in such cases the generic implementation pgd_free() won't
be applicable anymore. For example, when PAGE_SIZE_4KB is enabled and
MIPS_VA_BITS_48 is not enabled in a 64bit system, the macro "PGD_ORDER"
will be set as "1", which will cause allocating two pages as the pgd
table. Well, at the same time, the generic implementation pgd_free()
just free one pgd page, which will result in the memory leak.

The memory leak can be easily detected by executing shell command:
"while true; do ls > /dev/null; grep MemFree /proc/meminfo; done"

Fixes: f9cb654cb550 ("asm-generic: pgalloc: provide generic pgd_free()")
Signed-off-by: Yaliang Wang <Yaliang.Wang@windriver.com>
---
 arch/mips/include/asm/pgalloc.h | 6 ++++++
 1 file changed, 6 insertions(+)

Comments

Thomas Bogendoerfer March 14, 2022, 2:51 p.m. UTC | #1
On Thu, Mar 10, 2022 at 07:31:16PM +0800, yaliang.wang@windriver.com wrote:
> From: Yaliang Wang <Yaliang.Wang@windriver.com>
> 
> pgd page is freed by generic implementation pgd_free() since commit
> f9cb654cb550 ("asm-generic: pgalloc: provide generic pgd_free()"),
> however, there are scenarios that the system uses more than one page as
> the pgd table, in such cases the generic implementation pgd_free() won't
> be applicable anymore. For example, when PAGE_SIZE_4KB is enabled and
> MIPS_VA_BITS_48 is not enabled in a 64bit system, the macro "PGD_ORDER"
> will be set as "1", which will cause allocating two pages as the pgd
> table. Well, at the same time, the generic implementation pgd_free()
> just free one pgd page, which will result in the memory leak.
> 
> The memory leak can be easily detected by executing shell command:
> "while true; do ls > /dev/null; grep MemFree /proc/meminfo; done"
> 
> Fixes: f9cb654cb550 ("asm-generic: pgalloc: provide generic pgd_free()")
> Signed-off-by: Yaliang Wang <Yaliang.Wang@windriver.com>
> ---
>  arch/mips/include/asm/pgalloc.h | 6 ++++++
>  1 file changed, 6 insertions(+)

applied to mips-next.

Thomas.
Maciej W. Rozycki April 2, 2022, 1:48 p.m. UTC | #2
On Thu, 10 Mar 2022, yaliang.wang@windriver.com wrote:

> pgd page is freed by generic implementation pgd_free() since commit
> f9cb654cb550 ("asm-generic: pgalloc: provide generic pgd_free()"),
> however, there are scenarios that the system uses more than one page as
> the pgd table, in such cases the generic implementation pgd_free() won't
> be applicable anymore. For example, when PAGE_SIZE_4KB is enabled and
> MIPS_VA_BITS_48 is not enabled in a 64bit system, the macro "PGD_ORDER"
> will be set as "1", which will cause allocating two pages as the pgd
> table. Well, at the same time, the generic implementation pgd_free()
> just free one pgd page, which will result in the memory leak.
> 
> The memory leak can be easily detected by executing shell command:
> "while true; do ls > /dev/null; grep MemFree /proc/meminfo; done"
> 
> Fixes: f9cb654cb550 ("asm-generic: pgalloc: provide generic pgd_free()")
> Signed-off-by: Yaliang Wang <Yaliang.Wang@windriver.com>

 As a critical regression shouldn't this have been marked for backporting 
to stable branches?

  Maciej
Andrew Powers-Holmes April 3, 2022, 3:34 a.m. UTC | #3
On 3/4/2022 12:48 am, Maciej W. Rozycki wrote:
> On Thu, 10 Mar 2022, yaliang.wang@windriver.com wrote:
> 
>> pgd page is freed by generic implementation pgd_free() since commit
>> f9cb654cb550 ("asm-generic: pgalloc: provide generic pgd_free()"),
>> however, there are scenarios that the system uses more than one page as
>> the pgd table, in such cases the generic implementation pgd_free() won't
>> be applicable anymore. For example, when PAGE_SIZE_4KB is enabled and
>> MIPS_VA_BITS_48 is not enabled in a 64bit system, the macro "PGD_ORDER"
>> will be set as "1", which will cause allocating two pages as the pgd
>> table. Well, at the same time, the generic implementation pgd_free()
>> just free one pgd page, which will result in the memory leak.
>>
>> The memory leak can be easily detected by executing shell command:
>> "while true; do ls > /dev/null; grep MemFree /proc/meminfo; done"
>>
>> Fixes: f9cb654cb550 ("asm-generic: pgalloc: provide generic pgd_free()")
>> Signed-off-by: Yaliang Wang <Yaliang.Wang@windriver.com>
> 
>   As a critical regression shouldn't this have been marked for backporting
> to stable branches?

Very yes please - this bug has been driving several of us at OpenWrt
crazy for quite[1] some[2] time now, mostly on Octeon devices. We'd
(wrongly) suspected the octeon-ethernet driver, but this morning finally
bisected it down to f9cb654cb550 and can confirm this patch fixes the
regression.

MIPS64 has essentially been broken/unusable for 8 kernel releases,
including two LTS kernels, since the original commit landed. Should
there not have been CI/tests that caught this? It's pretty major!

- Andrew

[1] 
https://forum.openwrt.org/t/oom-killer-dnsmasq-when-physical-free-ram-remains/109351
[2] 
https://forum.openwrt.org/t/upstream-kernel-memleak-5-10-octeon-ethernet-ko/111827
Donald Hoskins April 3, 2022, 4:15 a.m. UTC | #4
Hi there,

This fix should backported.  Effectively, all mips64-based Cavium Octeon 
processors have been broken since the original commit; this has been a 
persistent and surreptitious issue for the OpenWrt community since 
nearly a year ago (referencing 
https://forum.openwrt.org/t/upstream-kernel-memleak-5-10-octeon-ethernet-ko/111827/), 
and nearly ended with OpenWrt marking the target as entirely unsupported 
and moving on.

We understand that there are ultimately few users of mips64, but it's 
really important that our memory management work. While OpenWrt can 
backport until upstream inclusion, I am sure there are other users and 
software platforms unaware of this fix (and unable to take advantage of 
it), thus making those platforms on this arch completely unusable.
Maciej W. Rozycki April 3, 2022, 10:37 a.m. UTC | #5
On Sun, 3 Apr 2022, Andrew Holmes wrote:

> MIPS64 has essentially been broken/unusable for 8 kernel releases,
> including two LTS kernels, since the original commit landed. Should
> there not have been CI/tests that caught this? It's pretty major!

 AFAIK the MIPS port is only maintained on the best effort basis nowadays 
I'm afraid.  I.e. it's enthusiasts investing their free time for the joy 
of fiddling with things.  So things are bound to break from time to time 
and remain unnoticed for a while.  We're doing our best, but our resources 
are limited.

 Taking these limitations into account I think Thomas has been doing a 
tremendous job maintaining the MIPS port, but he hasn't been cc-ed on the 
submission of the original change and it's very easy to miss stuff in the 
flood that has only been posted to a mailing list.

  Maciej
Andrew Powers-Holmes April 4, 2022, 1:16 p.m. UTC | #6
On 3/04/2022 8:37 pm, Maciej W. Rozycki wrote:
> AFAIK the MIPS port is only maintained on the best effort basis 
> nowadays I'm afraid.  I.e. it's enthusiasts investing their free time
> for the joy of fiddling with things.  So things are bound to break
> from time to time and remain unnoticed for a while.  We're doing our
> best, but our resources are limited.
> 
> Taking these limitations into account I think Thomas has been doing a
> tremendous job maintaining the MIPS port, but he hasn't been cc-ed on
> the submission of the original change and it's very easy to miss 
> stuff in the flood that has only been posted to a mailing list.
> 
> Maciej

Fair enough :) apologies, didn't mean to sound combative or ungrateful.
I know there's far more work to go around than people to do it,
everyone's doing the best they can, and I have nothing but appreciation
for all the work the kernel community does.

It's just surprising that this *could* go unnoticed for over a year -
though I suppose most of the MIPS64 systems out there are running on one
or another old vendor SDK kernel so won't have been affected...

Would the best way to get this merged into 5.10/15 (and maybe .16 just
for good measure) be to email the stable team (since it's already in
Linus' tree)? Documentation/process/stable-kernel-rules seems to say
yes, but I'd like to avoid stepping on anyone's toes given that it's not
my patch.

- Andrew
Joshua Kinard April 4, 2022, 9:10 p.m. UTC | #7
On 4/3/2022 06:37, Maciej W. Rozycki wrote:
> On Sun, 3 Apr 2022, Andrew Holmes wrote:
> 
>> MIPS64 has essentially been broken/unusable for 8 kernel releases,
>> including two LTS kernels, since the original commit landed. Should
>> there not have been CI/tests that caught this? It's pretty major!
> 
>  AFAIK the MIPS port is only maintained on the best effort basis nowadays 
> I'm afraid.  I.e. it's enthusiasts investing their free time for the joy 
> of fiddling with things.  So things are bound to break from time to time 
> and remain unnoticed for a while.  We're doing our best, but our resources 
> are limited.
> 
>  Taking these limitations into account I think Thomas has been doing a 
> tremendous job maintaining the MIPS port, but he hasn't been cc-ed on the 
> submission of the original change and it's very easy to miss stuff in the 
> flood that has only been posted to a mailing list.
> 
>   Maciej
> 

FWIW, hot off the presses is RFC9225:
https://datatracker.ietf.org/doc/html/rfc9225

4.  Best Current Practises

   1.  Authors MUST NOT implement bugs.

   2.  If bugs are introduced in code, they MUST be clearly documented.

   3.  When implementing specifications that are broken by design, it is
       RECOMMENDED to aggregate multiple smaller bugs into one larger
       bug.  This will be easier to document: rather than having a lot
       of hard-to-track inconsequential bugs, there will be only a few
       easy-to-recognise significant bugs.

   4.  The aphorism "It's not a bug, it's a feature" is considered rude.

   5.  Assume all external input is the result of (a series of) bugs.
       (Especially in machine-to-machine applications such as
       implementations of network protocols.)

   6.  In fact, assume all internal inputs also are the result of bugs.
Maciej W. Rozycki April 5, 2022, 9:42 a.m. UTC | #8
On Mon, 4 Apr 2022, Andrew Powers-Holmes wrote:

> Fair enough :) apologies, didn't mean to sound combative or ungrateful.
> I know there's far more work to go around than people to do it,
> everyone's doing the best they can, and I have nothing but appreciation
> for all the work the kernel community does.

 No offence taken; I just wanted to make it absolutely clear what the 
situation currently is.

> It's just surprising that this *could* go unnoticed for over a year -
> though I suppose most of the MIPS64 systems out there are running on one
> or another old vendor SDK kernel so won't have been affected...

 That's subject to the probability theory and depending on what people's 
usage models are.

> Would the best way to get this merged into 5.10/15 (and maybe .16 just
> for good measure) be to email the stable team (since it's already in
> Linus' tree)? Documentation/process/stable-kernel-rules seems to say
> yes, but I'd like to avoid stepping on anyone's toes given that it's not
> my patch.

 You seem the most severely affected so far, so why not act in your best 
interest?  I think option #2 applies here and seems quite straightforward 
to follow, referring commit 2bc5bab9a763 and using your use case as the 
justification.  It doesn't have to be the author to request a backport.

 NB I think it has to be backported to all the stable branches made since 
the original breakage; i.e. v5.9+ (I haven't kept track of what they are).

  Maciej
Thomas Bogendoerfer April 5, 2022, 10:45 a.m. UTC | #9
On Tue, Apr 05, 2022 at 10:42:21AM +0100, Maciej W. Rozycki wrote:
> On Mon, 4 Apr 2022, Andrew Powers-Holmes wrote:
> 
> > Would the best way to get this merged into 5.10/15 (and maybe .16 just
> > for good measure) be to email the stable team (since it's already in
> > Linus' tree)? Documentation/process/stable-kernel-rules seems to say
> > yes, but I'd like to avoid stepping on anyone's toes given that it's not
> > my patch.
> 
>  You seem the most severely affected so far, so why not act in your best 
> interest?  I think option #2 applies here and seems quite straightforward 
> to follow, referring commit 2bc5bab9a763 and using your use case as the 
> justification.  It doesn't have to be the author to request a backport.
> 
>  NB I think it has to be backported to all the stable branches made since 
> the original breakage; i.e. v5.9+ (I haven't kept track of what they are).

the fix has a Fixes tag so it will usually ported to stable/longterm kernels.
I already saw it in Greg's patch bombs.

Thomas.
Maciej W. Rozycki April 5, 2022, 11:31 a.m. UTC | #10
On Tue, 5 Apr 2022, Thomas Bogendoerfer wrote:

> >  NB I think it has to be backported to all the stable branches made since 
> > the original breakage; i.e. v5.9+ (I haven't kept track of what they are).
> 
> the fix has a Fixes tag so it will usually ported to stable/longterm kernels.
> I already saw it in Greg's patch bombs.

 Hmm, not all fixes qualify for or indeed are worth backporting and I'd 
expect those that have no stable annotation to remain on trunk only.  I 
have been following this principle with my submissions anyway.

 Indeed I can see a backport to 5.17 has literally just been posted in 
this humongous patch set, but in this case I suspect Greg has just picked 
this up by hand (thanks, Greg!) having seen this discussion (though how he 
manages to escape alive through the flood of messages has been astonishing 
me).

  Maciej
diff mbox series

Patch

diff --git a/arch/mips/include/asm/pgalloc.h b/arch/mips/include/asm/pgalloc.h
index c7925d0e9874..867e9c3db76e 100644
--- a/arch/mips/include/asm/pgalloc.h
+++ b/arch/mips/include/asm/pgalloc.h
@@ -15,6 +15,7 @@ 
 
 #define __HAVE_ARCH_PMD_ALLOC_ONE
 #define __HAVE_ARCH_PUD_ALLOC_ONE
+#define __HAVE_ARCH_PGD_FREE
 #include <asm-generic/pgalloc.h>
 
 static inline void pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmd,
@@ -48,6 +49,11 @@  static inline void pud_populate(struct mm_struct *mm, pud_t *pud, pmd_t *pmd)
 extern void pgd_init(unsigned long page);
 extern pgd_t *pgd_alloc(struct mm_struct *mm);
 
+static inline void pgd_free(struct mm_struct *mm, pgd_t *pgd)
+{
+	free_pages((unsigned long)pgd, PGD_ORDER);
+}
+
 #define __pte_free_tlb(tlb,pte,address)			\
 do {							\
 	pgtable_pte_page_dtor(pte);			\