drm/panfrost: Avoid user size passed to kvmalloc()

Message ID	20211216161603.983711-1-steven.price@arm.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <dri-devel-bounces@lists.freedesktop.org> From: Steven Price <steven.price@arm.com> To: Daniel Vetter <daniel@ffwll.ch>, David Airlie <airlied@linux.ie>, Rob Herring <robh@kernel.org>, Tomeu Vizoso <tomeu.vizoso@collabora.com>, Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>, Boris Brezillon <boris.brezillon@collabora.com> Subject: [PATCH] drm/panfrost: Avoid user size passed to kvmalloc() Date: Thu, 16 Dec 2021 16:16:03 +0000 Message-Id: <20211216161603.983711-1-steven.price@arm.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: list Cc: Dan Carpenter <dan.carpenter@oracle.com>, linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org, Steven Price <steven.price@arm.com> Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" <dri-devel-bounces@lists.freedesktop.org>
Series	drm/panfrost: Avoid user size passed to kvmalloc() \| expand drm/panfrost: Avoid user size passed to kvmalloc()

Steven Price Dec. 16, 2021, 4:16 p.m. UTC

panfrost_copy_in_sync() takes the number of fences from user space
(in_sync_count) and used to kvmalloc() an array to hold that number of
fences before processing them. This provides an easy method for user
space to trigger the OOM killer (by temporarily allocating large amounts
of kernel memory) or hit the WARN_ONCE() added by 7661809d493b ("mm:
don't allow oversized kvmalloc() calls").

Since we don't expect there to be a large number of fences we can
instead iterate over the fences one-by-one and avoid the temporary
allocation altogether. This also makes the code simpler.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Steven Price <steven.price@arm.com>
---
 drivers/gpu/drm/panfrost/panfrost_drv.c | 44 ++++++++-----------------
 1 file changed, 14 insertions(+), 30 deletions(-)

Rob Herring (Arm) Dec. 16, 2021, 5:12 p.m. UTC | #1

On Thu, Dec 16, 2021 at 10:16 AM Steven Price <steven.price@arm.com> wrote:
>
> panfrost_copy_in_sync() takes the number of fences from user space
> (in_sync_count) and used to kvmalloc() an array to hold that number of
> fences before processing them. This provides an easy method for user
> space to trigger the OOM killer (by temporarily allocating large amounts
> of kernel memory) or hit the WARN_ONCE() added by 7661809d493b ("mm:
> don't allow oversized kvmalloc() calls").
>
> Since we don't expect there to be a large number of fences we can
> instead iterate over the fences one-by-one and avoid the temporary
> allocation altogether. This also makes the code simpler.

Doesn't the BO lookup suffer from the same issue?

Rob

Alyssa Rosenzweig Dec. 16, 2021, 5:49 p.m. UTC | #2

> This provides an easy method for user
> space to trigger the OOM killer (by temporarily allocating large amounts
> of kernel memory)

panfrost user space has a lot of easy ways to trigger to the OOM killer
unfortunately .... if this is something we want to fix there are a lot
more patches coming :(

Dan Carpenter Dec. 17, 2021, 6:40 a.m. UTC | #3

On Thu, Dec 16, 2021 at 12:49:53PM -0500, Alyssa Rosenzweig wrote:
> > This provides an easy method for user
> > space to trigger the OOM killer (by temporarily allocating large amounts
> > of kernel memory)
> 
> panfrost user space has a lot of easy ways to trigger to the OOM killer
> unfortunately .... if this is something we want to fix there are a lot
> more patches coming :(

My static checker is only looking at the new rule that kvmalloc() can't
allocate more than 2GB.  But eventually syzbot will figure out how to
trigger the other OOMs so fixing that is going to be a requirement at
some point.

regards,
dan carpenter

Tomeu Vizoso Dec. 17, 2021, 7:38 a.m. UTC | #4

On 12/16/21 6:49 PM, Alyssa Rosenzweig wrote:
>> This provides an easy method for user
>> space to trigger the OOM killer (by temporarily allocating large amounts
>> of kernel memory)
> 
> panfrost user space has a lot of easy ways to trigger to the OOM killer
> unfortunately .... if this is something we want to fix there are a lot
> more patches coming :(

What are you thinking of, Alyssa?

My understanding is that the problem are kernel allocations that aren't 
accounted per userspace process. I would expect shmem-backed BOs to be 
taken into account by the OOM killer, so the offending process would be 
terminated without affecting the other processes in the system.

Cheers,

Tomeu

Steven Price Dec. 17, 2021, 8:55 a.m. UTC | #5

On 16/12/2021 17:12, Rob Herring wrote:
> On Thu, Dec 16, 2021 at 10:16 AM Steven Price <steven.price@arm.com> wrote:
>>
>> panfrost_copy_in_sync() takes the number of fences from user space
>> (in_sync_count) and used to kvmalloc() an array to hold that number of
>> fences before processing them. This provides an easy method for user
>> space to trigger the OOM killer (by temporarily allocating large amounts
>> of kernel memory) or hit the WARN_ONCE() added by 7661809d493b ("mm:
>> don't allow oversized kvmalloc() calls").
>>
>> Since we don't expect there to be a large number of fences we can
>> instead iterate over the fences one-by-one and avoid the temporary
>> allocation altogether. This also makes the code simpler.
> 
> Doesn't the BO lookup suffer from the same issue?

Ah! Yes it does - I'd looked at this and seen the call to
drm_gem_objects_lookup() and assumed that since it's a DRM function (not
Panfrost specific) it must already handle this gracefully. However
clearly it doesn't, and a quick grep confirms that Panfrost is actually
the only user of the function anyway.

However this one is harder to fix without setting an arbitrary cap on
the number of BOs during a sumbit. I'm not sure how other drivers handle
this - the ones I've looked at so far all have the same issue. There's
obviously the list that Dan already sent, but e.g. msm has the same bug
in msm_gem_submit.c:submit_create() with an amusing bug where the check
for (sz > SIZE_MAX) will never hit, although the call is to kzalloc() so
large allocations are going to fail anyway.

Has anyone got any views on how this should be handled generally? I'm
somewhat wary of setting arbitrary limits, especially as it's
effectively an ABI change.

Steve

Steven Price Dec. 17, 2021, 8:56 a.m. UTC | #6

On 16/12/2021 17:49, Alyssa Rosenzweig wrote:
>> This provides an easy method for user
>> space to trigger the OOM killer (by temporarily allocating large amounts
>> of kernel memory)
> 
> panfrost user space has a lot of easy ways to trigger to the OOM killer
> unfortunately .... if this is something we want to fix there are a lot
> more patches coming :(
> 

Sorry I should have been a bit clearer in my wording. The issue isn't
triggering the OOM killer as such - it's triggering the OOM killer
without allocating memory accounted to the process. So e.g. allocating
large numbers of BOs should be ok as the OOM killer will come for the
process that allocated those BOs (so this is not much different from
allocating normal user space memory). However in this path there's no
user space allocation - so we can have a tiny process that triggers a
large kernel allocation and the OOM killer will go after every other
process first.

As Dan points out syzbot can have fun with this sort of thing (if it can
figure out the ioctl).

Thanks,

Steve

Dan Carpenter Dec. 17, 2021, 9:10 a.m. UTC | #7

On Fri, Dec 17, 2021 at 08:55:50AM +0000, Steven Price wrote:
> However this one is harder to fix without setting an arbitrary cap on
> the number of BOs during a sumbit. I'm not sure how other drivers handle
> this - the ones I've looked at so far all have the same issue. There's
> obviously the list that Dan already sent, but e.g. msm has the same bug
> in msm_gem_submit.c:submit_create() with an amusing bug where the check
> for (sz > SIZE_MAX) will never hit, although the call is to kzalloc() so
> large allocations are going to fail anyway.

sz is u64 and SIZE_MAX is ULONG_MAX so the (sz > SIZE_MAX) condition
does work to prevent an integer overflow on 32bit systems.  But it's not
beautiful.

regards,
dan carpenter

Steven Price Dec. 17, 2021, 9:16 a.m. UTC | #8

On 17/12/2021 09:10, Dan Carpenter wrote:
> On Fri, Dec 17, 2021 at 08:55:50AM +0000, Steven Price wrote:
>> However this one is harder to fix without setting an arbitrary cap on
>> the number of BOs during a sumbit. I'm not sure how other drivers handle
>> this - the ones I've looked at so far all have the same issue. There's
>> obviously the list that Dan already sent, but e.g. msm has the same bug
>> in msm_gem_submit.c:submit_create() with an amusing bug where the check
>> for (sz > SIZE_MAX) will never hit, although the call is to kzalloc() so
>> large allocations are going to fail anyway.
> 
> sz is u64 and SIZE_MAX is ULONG_MAX so the (sz > SIZE_MAX) condition
> does work to prevent an integer overflow on 32bit systems.  But it's not
> beautiful.

sz is the result of struct_size() which returns a size_t, and SIZE_MAX
in case of an overflow. However the check is *greater than* SIZE_MAX
which will never occur even on 32 bit systems.

However the chances of kzalloc() allocating SIZE_MAX are 0 so I don't
see it's an exploitable bug.

Steve

Dan Carpenter Dec. 17, 2021, 9:28 a.m. UTC | #9

On Fri, Dec 17, 2021 at 09:16:19AM +0000, Steven Price wrote:
> On 17/12/2021 09:10, Dan Carpenter wrote:
> > On Fri, Dec 17, 2021 at 08:55:50AM +0000, Steven Price wrote:
> >> However this one is harder to fix without setting an arbitrary cap on
> >> the number of BOs during a sumbit. I'm not sure how other drivers handle
> >> this - the ones I've looked at so far all have the same issue. There's
> >> obviously the list that Dan already sent, but e.g. msm has the same bug
> >> in msm_gem_submit.c:submit_create() with an amusing bug where the check
> >> for (sz > SIZE_MAX) will never hit, although the call is to kzalloc() so
> >> large allocations are going to fail anyway.
> > 
> > sz is u64 and SIZE_MAX is ULONG_MAX so the (sz > SIZE_MAX) condition
> > does work to prevent an integer overflow on 32bit systems.  But it's not
> > beautiful.
> 
> sz is the result of struct_size() which returns a size_t, and SIZE_MAX
> in case of an overflow.

Correct.

> However the check is *greater than* SIZE_MAX
> which will never occur even on 32 bit systems.

You've missed a part.  We add ((u64)nr_cmds * sizeof(submit->cmd[0]))
to SIZE_MAX.  If nr_cmds is zero then, whatever, the kzmalloc() will
fail.  No big deal.  But if it's non-zero then "sz" is larger than
SIZE_MAX and we allocate a smaller buffer than expected leading to
memory corruption.

Btw, it turns out that I had a hand in writing that check so hooray for
me.  :)  #HoorayForMe

regards,
dan carpenter

Steven Price Dec. 17, 2021, 9:48 a.m. UTC | #10

On 17/12/2021 09:28, Dan Carpenter wrote:
> On Fri, Dec 17, 2021 at 09:16:19AM +0000, Steven Price wrote:
>> On 17/12/2021 09:10, Dan Carpenter wrote:
>>> On Fri, Dec 17, 2021 at 08:55:50AM +0000, Steven Price wrote:
>>>> However this one is harder to fix without setting an arbitrary cap on
>>>> the number of BOs during a sumbit. I'm not sure how other drivers handle
>>>> this - the ones I've looked at so far all have the same issue. There's
>>>> obviously the list that Dan already sent, but e.g. msm has the same bug
>>>> in msm_gem_submit.c:submit_create() with an amusing bug where the check
>>>> for (sz > SIZE_MAX) will never hit, although the call is to kzalloc() so
>>>> large allocations are going to fail anyway.
>>>
>>> sz is u64 and SIZE_MAX is ULONG_MAX so the (sz > SIZE_MAX) condition
>>> does work to prevent an integer overflow on 32bit systems.  But it's not
>>> beautiful.
>>
>> sz is the result of struct_size() which returns a size_t, and SIZE_MAX
>> in case of an overflow.
> 
> Correct.
> 
>> However the check is *greater than* SIZE_MAX
>> which will never occur even on 32 bit systems.
> 
> You've missed a part.  We add ((u64)nr_cmds * sizeof(submit->cmd[0]))
> to SIZE_MAX.  If nr_cmds is zero then, whatever, the kzmalloc() will
> fail.  No big deal.  But if it's non-zero then "sz" is larger than
> SIZE_MAX and we allocate a smaller buffer than expected leading to
> memory corruption.

Ah, my bracket matching is obviously off today - somehow I hadn't
noticed that the second line wasn't part of the call to struct_size().

> Btw, it turns out that I had a hand in writing that check so hooray for
> me.  :)  #HoorayForMe

Indeed hooray for you! ;) Although it still all seems like a very
round-a-bout way of enforcing an arbitrary maximum on the size!

Thanks,

Steve

drm/panfrost: Avoid user size passed to kvmalloc()

Commit Message

Comments

Patch