mm: Make kvmalloc refuse to allocate more than 2GB

Message ID

20210721184131.2264356-1-willy@infradead.org (mailing list archive)

State

New

Headers

DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 265AA6128A
From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
To: Al Viro <viro@zeniv.linux.org.uk>,
	Qualys Security Advisory <qsa@qualys.com>,
	Eric Sandeen <sandeen@redhat.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Subject: [PATCH] mm: Make kvmalloc refuse to allocate more than 2GB
Date: Wed, 21 Jul 2021 19:41:31 +0100
Message-Id: <20210721184131.2264356-1-willy@infradead.org>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Sender: owner-linux-mm@kvack.org
Precedence: bulk

Series

mm: Make kvmalloc refuse to allocate more than 2GB

Commit Message

Matthew Wilcox July 21, 2021, 6:41 p.m. UTC

It's generally dangerous to allocate such large quantities of memory
within the kernel owing to our propensity to use 'int' to represent
a length.  If somebody really needs it, we can add a kvmalloc_large()
later, but let's default to "You can't allocate that much memory".

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 fs/seq_file.c                                     | 3 ---
 mm/util.c                                         | 7 +++++++
 2 files changed, 7 insertions(+), 3 deletions(-)
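
The truncation hazard described in the commit message can be illustrated
with a minimal userspace sketch.  The helper names len_fits_in_int() and
legacy_len() are hypothetical and do not appear in the kernel;
legacy_len() stands in for any code path that stores a size_t length in
an 'int'.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if 'len' can be stored in an int without loss. */
static bool len_fits_in_int(size_t len)
{
	return len <= (size_t)INT_MAX;
}

/*
 * Hypothetical stand-in for code that keeps a length in an int.
 * For len > INT_MAX the conversion is implementation-defined in C;
 * common compilers wrap, so the result is typically negative or small.
 */
static int legacy_len(size_t len)
{
	return (int)len;
}
```

For a length of (size_t)INT_MAX + 1, len_fits_in_int() returns false,
while legacy_len() typically yields a negative value.  Refusing such
sizes in the allocator means no caller can receive a buffer whose length
it would then misrepresent.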

Comments

Linus Torvalds July 21, 2021, 8:46 p.m. UTC | #1

On Wed, Jul 21, 2021 at 11:42 AM Matthew Wilcox (Oracle)
<willy@infradead.org> wrote:
>
> It's generally dangerous to allocate such large quantities of memory
> within the kernel owing to our propensity to use 'int' to represent
> a length.  If somebody really needs it, we can add a kvmalloc_large()
> later, but let's default to "You can't allocate that much memory".

I really think that without the WARN_ON_ONCE(), this is just moving
that failure point from a known good place ("we know this must not
succeed") to a possibly bad place ("this might cause silent and
hard-to-understand failures elsewhere").

IOW, in seq_buf_alloc() there's no need to warn. It's clear that a
bigger allocation can never be valid.

But in kvmalloc(), it needs to warn, because if it ever triggers we
need to check what triggered it.

So this is not just moving code from one place to another equivalent one.

                 Linus

Matthew Wilcox July 22, 2021, 12:14 a.m. UTC | #2

On Wed, Jul 21, 2021 at 01:46:09PM -0700, Linus Torvalds wrote:
> On Wed, Jul 21, 2021 at 11:42 AM Matthew Wilcox (Oracle)
> <willy@infradead.org> wrote:
> >
> > It's generally dangerous to allocate such large quantities of memory
> > within the kernel owing to our propensity to use 'int' to represent
> > a length.  If somebody really needs it, we can add a kvmalloc_large()
> > later, but let's default to "You can't allocate that much memory".
> 
> I really think that without the WARN_ON_ONCE(), this is just moving
> that failure point from a known good place ("we know this must not
> succeed") to a possibly bad place ("this might cause silent and
> hard-to-understand failures elsewhere").

To a certain extent, yes.  On the other hand, if you don't have any
error handling on your kvmalloc of 2GB, Qualys seems to have a reliable
way to run you out of vmalloc space, and that's going to get exercised.

My initial thought was to leverage the existing __GFP_NOWARN code:

        if (size > PAGE_SIZE) {
-               kmalloc_flags |= __GFP_NOWARN;
+               if (size <= INT_MAX)
+                       kmalloc_flags |= __GFP_NOWARN;

because that dumps some interesting information (ratelimited), which
might help the sysadmin realise they're under attack.  A WARN_ON_ONCE
is one-and-done, so an attacker can hide their tracks.  Unfortunately,
we actually bail out before getting there:

        if (unlikely(order >= MAX_ORDER)) {
                WARN_ON_ONCE(!(gfp & __GFP_NOWARN));
                return NULL;
        }

... maybe that should call warn_alloc() too.

So I'm now thinking (relative to the earlier patch):

-       if (size > INT_MAX)
+       if (size > INT_MAX) {
+               warn_alloc(flags, NULL, "oversized allocation:%zu", size);
                return NULL;
+       }
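
The difference between a one-shot warning and a rate-limited report can
be sketched in userspace as follows.  This is an illustration only: the
names (report_once(), sketch_ratelimit, report_ratelimited()) are
hypothetical, the interval and burst values are left to the caller, and
time is passed in as a plain tick counter rather than read from a clock.
It is not how WARN_ON_ONCE() or warn_alloc() are implemented.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Number of reports each policy has emitted; kept only for observation. */
static unsigned int once_reports;
static unsigned int limited_reports;

/* One-and-done reporting, in the spirit of WARN_ON_ONCE(). */
static void report_once(size_t size)
{
	static bool warned;

	if (warned)
		return;
	warned = true;
	once_reports++;
	fprintf(stderr, "oversized allocation (once): %zu\n", size);
}

/* Hypothetical rate limiter: at most 'burst' reports per 'interval' ticks. */
struct sketch_ratelimit {
	unsigned long interval;
	unsigned int burst;
	unsigned long begin;	/* start of the current window */
	unsigned int printed;	/* reports emitted in this window */
};

static bool sketch_ratelimit_ok(struct sketch_ratelimit *rl, unsigned long now)
{
	if (now - rl->begin >= rl->interval) {
		/* A new window has started; reset the budget. */
		rl->begin = now;
		rl->printed = 0;
	}
	if (rl->printed >= rl->burst)
		return false;
	rl->printed++;
	return true;
}

static void report_ratelimited(struct sketch_ratelimit *rl,
			       unsigned long now, size_t size)
{
	if (!sketch_ratelimit_ok(rl, now))
		return;
	limited_reports++;
	fprintf(stderr, "oversized allocation (ratelimited): %zu\n", size);
}
```

With report_once(), only the first oversized request leaves a trace, so
later attempts are invisible.  With report_ratelimited(), a repeated
attempt in a later window is reported again, which is the property the
message above is after.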

Theodore Ts'o July 22, 2021, 2:43 p.m. UTC | #3

On Wed, Jul 21, 2021 at 07:41:31PM +0100, Matthew Wilcox (Oracle) wrote:
> It's generally dangerous to allocate such large quantities of memory
> within the kernel owing to our propensity to use 'int' to represent
> a length.  If somebody really needs it, we can add a kvmalloc_large()
> later, but let's default to "You can't allocate that much memory".

If we really need it, maybe we can add a GFP_LARGE_ALLOC to allow
allocations larger than 2GB later on?  I can't quite see why that
would ever be needed, but that's probably a failure of my imagination.  :-)

                                        - Ted

Michal Hocko July 27, 2021, 7:38 a.m. UTC | #4

On Wed 21-07-21 19:41:31, Matthew Wilcox wrote:
> It's generally dangerous to allocate such large quantities of memory
> within the kernel owing to our propensity to use 'int' to represent
> a length.  If somebody really needs it, we can add a kvmalloc_large()
> later, but let's default to "You can't allocate that much memory".

I do agree that limiting kvmalloc allocation size is a reasonable thing
to do but I do not really see why we should remove the check from
seq_buf_alloc. Implicitly relying on kvmalloc to work around a bug that
was in seq_buf code seems like a step backwards to me.
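
The layering argued for above can be sketched in userspace.  The
functions sketch_kvmalloc() and sketch_seq_buf_alloc() are hypothetical
stand-ins built on malloc(), and a 4KiB page size is assumed;
SKETCH_MAX_RW_COUNT mirrors the kernel's definition of MAX_RW_COUNT as
(INT_MAX & PAGE_MASK).

```c
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumption: 4KiB pages. */
#define SKETCH_PAGE_SIZE	4096UL
#define SKETCH_MAX_RW_COUNT	((size_t)INT_MAX & ~(SKETCH_PAGE_SIZE - 1))

/* Allocator-level cap: refuse anything that does not fit in an int. */
static void *sketch_kvmalloc(size_t size)
{
	if (size > (size_t)INT_MAX)
		return NULL;
	return malloc(size);
}

/* Caller-level cap: the seq_file buffer keeps its own, tighter limit. */
static void *sketch_seq_buf_alloc(size_t size)
{
	if (size > SKETCH_MAX_RW_COUNT)
		return NULL;
	return sketch_kvmalloc(size);
}
```

Note that the caller's limit is slightly lower than INT_MAX, so the two
checks are not interchangeable: with only the allocator-level cap, sizes
between MAX_RW_COUNT and INT_MAX would be accepted.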

Patch

diff --git a/fs/seq_file.c b/fs/seq_file.c
index 4a2cda04d3e2..b117b212ef28 100644
--- a/fs/seq_file.c
+++ b/fs/seq_file.c
@@ -32,9 +32,6 @@  static void seq_set_overflow(struct seq_file *m)
 
 static void *seq_buf_alloc(unsigned long size)
 {
-	if (unlikely(size > MAX_RW_COUNT))
-		return NULL;
-
 	return kvmalloc(size, GFP_KERNEL_ACCOUNT);
 }
 
diff --git a/mm/util.c b/mm/util.c
index 9043d03750a7..8ff2a8924d5f 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -593,6 +593,13 @@  void *kvmalloc_node(size_t size, gfp_t flags, int node)
 	if (ret || size <= PAGE_SIZE)
 		return ret;
 
+	/*
+	 * Succeeding for sizes above 2GiB can lead to truncation if
+	 * someone casts the size to an int.
+	 */
+	if (size > INT_MAX)
+		return NULL;
+
 	return __vmalloc_node(size, 1, flags, node,
 			__builtin_return_address(0));
 }
