drm/nouveau/secboot/acr: Remove VLA usage

Message ID	20180524172436.GA17738@beast (mailing list archive)
State	New, archived
Headers
	Return-Path: <dri-devel-bounces@lists.freedesktop.org>
	Date: Thu, 24 May 2018 10:24:36 -0700
	From: Kees Cook <keescook@chromium.org>
	To: linux-kernel@vger.kernel.org
	Subject: [PATCH] drm/nouveau/secboot/acr: Remove VLA usage
	Message-ID: <20180524172436.GA17738@beast>
	MIME-Version: 1.0
	Content-Disposition: inline
	Precedence: list
	Cc: nouveau@lists.freedesktop.org, Alexandre Courbot <acourbot@nvidia.com>, linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org
	Content-Type: text/plain; charset="utf-8"
	Content-Transfer-Encoding: base64
	Errors-To: dri-devel-bounces@lists.freedesktop.org
	Sender: "dri-devel" <dri-devel-bounces@lists.freedesktop.org>

Kees Cook May 24, 2018, 5:24 p.m. UTC

In the quest to remove all stack VLA usage from the kernel[1], this
allocates the working buffers before starting the writing so it won't
abort in the middle. This needs an initial walk of the lists to figure
out how large the buffer should be.

[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com

Signed-off-by: Kees Cook <keescook@chromium.org>
---
 .../nouveau/nvkm/subdev/secboot/acr_r352.c    | 25 ++++++++++++++++---
 .../nouveau/nvkm/subdev/secboot/acr_r367.c    | 16 +++++++++++-
 2 files changed, 37 insertions(+), 4 deletions(-)
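
The approach in the commit message (walk the list once to find the largest descriptor, allocate a single buffer, then reuse it while writing) can be sketched in userspace C as follows. This is only an illustration under assumptions: `struct img`, `max_desc_size()` and `process_all()` are hypothetical stand-ins, not the nouveau structures or functions, and `malloc()`/`free()` stand in for `kmalloc()`/`kfree()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one image entry; not the nouveau structure. */
struct img {
	size_t desc_size;   /* size of this image's descriptor, in bytes */
	struct img *next;   /* singly linked list, NULL-terminated */
};

/* Pass 1: find the largest descriptor size in the list. */
static size_t max_desc_size(const struct img *head)
{
	size_t max = 0;

	for (const struct img *i = head; i; i = i->next)
		if (i->desc_size > max)
			max = i->desc_size;
	return max;
}

/*
 * Pass 2: allocate one scratch buffer up front and reuse it for every entry.
 * Returns the number of entries processed, or -1 if the allocation fails,
 * in which case no entry has been touched yet.
 */
static int process_all(const struct img *head)
{
	size_t max = max_desc_size(head);
	unsigned char *scratch;
	int count = 0;

	/* Avoid a zero-sized request so a NULL result always means failure. */
	scratch = malloc(max ? max : 1);
	if (!scratch)
		return -1;

	for (const struct img *i = head; i; i = i->next) {
		assert(i->desc_size <= max);
		memset(scratch, 0, i->desc_size);
		/* ... fill in and write out the descriptor here ... */
		count++;
	}

	free(scratch);
	return count;
}
```

Because the only allocation happens before the write loop starts, a failure is reported before any partial output exists, which is the "won't abort in the middle" property the commit message describes.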

Kees Cook June 20, 2018, 4:45 a.m. UTC | #1

On Thu, May 24, 2018 at 10:24 AM, Kees Cook <keescook@chromium.org> wrote:
> In the quest to remove all stack VLA usage from the kernel[1], this
> allocates the working buffers before starting the writing so it won't
> abort in the middle. This needs an initial walk of the lists to figure
> out how large the buffer should be.
>
> [1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
>
> Signed-off-by: Kees Cook <keescook@chromium.org>

Friendly ping. Who is best to take this patch?

Thanks!

-Kees

> ---
>  .../nouveau/nvkm/subdev/secboot/acr_r352.c    | 25 ++++++++++++++++---
>  .../nouveau/nvkm/subdev/secboot/acr_r367.c    | 16 +++++++++++-
>  2 files changed, 37 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/acr_r352.c b/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/acr_r352.c
> index a721354249ce..d02e183717dc 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/acr_r352.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/acr_r352.c
> @@ -414,6 +414,20 @@ acr_r352_ls_write_wpr(struct acr_r352 *acr, struct list_head *imgs,
>  {
>         struct ls_ucode_img *_img;
>         u32 pos = 0;
> +       u32 max_desc_size = 0;
> +       u8 *gdesc;
> +
> +       /* Figure out how large we need gdesc to be. */
> +       list_for_each_entry(_img, imgs, node) {
> +               const struct acr_r352_ls_func *ls_func =
> +                                           acr->func->ls_func[_img->falcon_id];
> +
> +               max_desc_size = max(max_desc_size, ls_func->bl_desc_size);
> +       }
> +
> +       gdesc = kmalloc(max_desc_size, GFP_KERNEL);
> +       if (!gdesc)
> +               return -ENOMEM;
>
>         nvkm_kmap(wpr_blob);
>
> @@ -421,7 +435,6 @@ acr_r352_ls_write_wpr(struct acr_r352 *acr, struct list_head *imgs,
>                 struct ls_ucode_img_r352 *img = ls_ucode_img_r352(_img);
>                 const struct acr_r352_ls_func *ls_func =
>                                             acr->func->ls_func[_img->falcon_id];
> -               u8 gdesc[ls_func->bl_desc_size];
>
>                 nvkm_gpuobj_memcpy_to(wpr_blob, pos, &img->wpr_header,
>                                       sizeof(img->wpr_header));
> @@ -447,6 +460,8 @@ acr_r352_ls_write_wpr(struct acr_r352 *acr, struct list_head *imgs,
>
>         nvkm_done(wpr_blob);
>
> +       kfree(gdesc);
> +
>         return 0;
>  }
>
> @@ -771,7 +786,11 @@ acr_r352_load(struct nvkm_acr *_acr, struct nvkm_falcon *falcon,
>         struct fw_bl_desc *hsbl_desc;
>         void *bl, *blob_data, *hsbl_code, *hsbl_data;
>         u32 code_size;
> -       u8 bl_desc[bl_desc_size];
> +       u8 *bl_desc;
> +
> +       bl_desc = kzalloc(bl_desc_size, GFP_KERNEL);
> +       if (!bl_desc)
> +               return -ENOMEM;
>
>         /* Find the bootloader descriptor for our blob and copy it */
>         if (blob == acr->load_blob) {
> @@ -802,7 +821,6 @@ acr_r352_load(struct nvkm_acr *_acr, struct nvkm_falcon *falcon,
>                               code_size, hsbl_desc->start_tag, 0, false);
>
>         /* Generate the BL header */
> -       memset(bl_desc, 0, bl_desc_size);
>         acr->func->generate_hs_bl_desc(load_hdr, bl_desc, offset);
>
>         /*
> @@ -811,6 +829,7 @@ acr_r352_load(struct nvkm_acr *_acr, struct nvkm_falcon *falcon,
>         nvkm_falcon_load_dmem(falcon, bl_desc, hsbl_desc->dmem_load_off,
>                               bl_desc_size, 0);
>
> +       kfree(bl_desc);
>         return hsbl_desc->start_tag << 8;
>  }
>
> diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/acr_r367.c b/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/acr_r367.c
> index 866877b88797..978ad0790367 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/acr_r367.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/acr_r367.c
> @@ -265,6 +265,19 @@ acr_r367_ls_write_wpr(struct acr_r352 *acr, struct list_head *imgs,
>  {
>         struct ls_ucode_img *_img;
>         u32 pos = 0;
> +       u32 max_desc_size = 0;
> +       u8 *gdesc;
> +
> +       list_for_each_entry(_img, imgs, node) {
> +               const struct acr_r352_ls_func *ls_func =
> +                                           acr->func->ls_func[_img->falcon_id];
> +
> +               max_desc_size = max(max_desc_size, ls_func->bl_desc_size);
> +       }
> +
> +       gdesc = kmalloc(max_desc_size, GFP_KERNEL);
> +       if (!gdesc)
> +               return -ENOMEM;
>
>         nvkm_kmap(wpr_blob);
>
> @@ -272,7 +285,6 @@ acr_r367_ls_write_wpr(struct acr_r352 *acr, struct list_head *imgs,
>                 struct ls_ucode_img_r367 *img = ls_ucode_img_r367(_img);
>                 const struct acr_r352_ls_func *ls_func =
>                                             acr->func->ls_func[_img->falcon_id];
> -               u8 gdesc[ls_func->bl_desc_size];
>
>                 nvkm_gpuobj_memcpy_to(wpr_blob, pos, &img->wpr_header,
>                                       sizeof(img->wpr_header));
> @@ -298,6 +310,8 @@ acr_r367_ls_write_wpr(struct acr_r352 *acr, struct list_head *imgs,
>
>         nvkm_done(wpr_blob);
>
> +       kfree(gdesc);
> +
>         return 0;
>  }
>
> --
> 2.17.0
>
>
> --
> Kees Cook
> Pixel Security

Karol Herbst June 22, 2018, 5:50 p.m. UTC | #2

On Thu, May 24, 2018 at 7:24 PM, Kees Cook <keescook@chromium.org> wrote:
> In the quest to remove all stack VLA usage from the kernel[1], this
> allocates the working buffers before starting the writing so it won't
> abort in the middle. This needs an initial walk of the lists to figure
> out how large the buffer should be.
>
> [1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
>
> Signed-off-by: Kees Cook <keescook@chromium.org>
> ---
>  .../nouveau/nvkm/subdev/secboot/acr_r352.c    | 25 ++++++++++++++++---
>  .../nouveau/nvkm/subdev/secboot/acr_r367.c    | 16 +++++++++++-
>  2 files changed, 37 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/acr_r352.c b/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/acr_r352.c
> index a721354249ce..d02e183717dc 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/acr_r352.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/acr_r352.c
> @@ -414,6 +414,20 @@ acr_r352_ls_write_wpr(struct acr_r352 *acr, struct list_head *imgs,
>  {
>         struct ls_ucode_img *_img;
>         u32 pos = 0;
> +       u32 max_desc_size = 0;
> +       u8 *gdesc;
> +
> +       /* Figure out how large we need gdesc to be. */
> +       list_for_each_entry(_img, imgs, node) {
> +               const struct acr_r352_ls_func *ls_func =
> +                                           acr->func->ls_func[_img->falcon_id];
> +
> +               max_desc_size = max(max_desc_size, ls_func->bl_desc_size);
> +       }
> +
> +       gdesc = kmalloc(max_desc_size, GFP_KERNEL);
> +       if (!gdesc)
> +               return -ENOMEM;
>
>         nvkm_kmap(wpr_blob);
>
> @@ -421,7 +435,6 @@ acr_r352_ls_write_wpr(struct acr_r352 *acr, struct list_head *imgs,
>                 struct ls_ucode_img_r352 *img = ls_ucode_img_r352(_img);
>                 const struct acr_r352_ls_func *ls_func =
>                                             acr->func->ls_func[_img->falcon_id];
> -               u8 gdesc[ls_func->bl_desc_size];
>

If there is no guarantee that ls_func->bl_desc_size is a multiple of 4
((ls_func->bl_desc_size & 0x3) == 0), then we need to memset a bit more,
because nvkm_gpuobj_memcpy_to copies 4 bytes at a time later in that
code, but the last 4 bytes are only partly memset to 0.

If ls_func->bl_desc_size is always a multiple of 0x4, then it isn't as
important, but still better to be fixed. Or maybe
nvkm_gpuobj_memcpy_to should do that handling and check if the size is
a multiple of 0x4 and otherwise handle that case?

Same is valid for the changes in the r367 file.
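
The concern above can be illustrated with a small userspace sketch. It assumes, as the review describes, a copy routine that always moves whole 4-byte words; `copy_words()`, `round_up4()`, `is_multiple_of_4()` and `alloc_desc()` are hypothetical helpers written for this example, not nouveau functions. A multiple-of-4 test in C is usually written `(n & 0x3) == 0` or `n % 4 == 0`; the inner parentheses matter because `==` binds tighter than `&`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Round n up to the next multiple of 4 (n itself if already aligned). */
static size_t round_up4(size_t n)
{
	return (n + 3) & ~(size_t)3;
}

/* True when n is a multiple of 4: the low two bits must both be clear. */
static int is_multiple_of_4(size_t n)
{
	return (n & 0x3) == 0;
}

/*
 * Hypothetical word-wise copy: steps through len in 4-byte units, so it
 * reads and writes round_up4(len) bytes in total.
 */
static void copy_words(unsigned char *dst, const unsigned char *src, size_t len)
{
	for (size_t i = 0; i < len; i += 4)
		memcpy(dst + i, src + i, 4);
}

/*
 * Allocate a zeroed descriptor buffer padded to a whole number of words,
 * so the final word read by copy_words() is in bounds and its tail is 0.
 */
static unsigned char *alloc_desc(size_t desc_size)
{
	return calloc(1, round_up4(desc_size ? desc_size : 1));
}
```

With a 6-byte descriptor, `copy_words()` touches 8 bytes; padding the allocation to 8 and zeroing it means the two extra bytes that get copied are zeros rather than out-of-bounds or leftover data.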

>                 nvkm_gpuobj_memcpy_to(wpr_blob, pos, &img->wpr_header,
>                                       sizeof(img->wpr_header));
> @@ -447,6 +460,8 @@ acr_r352_ls_write_wpr(struct acr_r352 *acr, struct list_head *imgs,
>
>         nvkm_done(wpr_blob);
>
> +       kfree(gdesc);
> +
>         return 0;
>  }
>
> @@ -771,7 +786,11 @@ acr_r352_load(struct nvkm_acr *_acr, struct nvkm_falcon *falcon,
>         struct fw_bl_desc *hsbl_desc;
>         void *bl, *blob_data, *hsbl_code, *hsbl_data;
>         u32 code_size;
> -       u8 bl_desc[bl_desc_size];
> +       u8 *bl_desc;
> +
> +       bl_desc = kzalloc(bl_desc_size, GFP_KERNEL);
> +       if (!bl_desc)
> +               return -ENOMEM;
>
>         /* Find the bootloader descriptor for our blob and copy it */
>         if (blob == acr->load_blob) {
> @@ -802,7 +821,6 @@ acr_r352_load(struct nvkm_acr *_acr, struct nvkm_falcon *falcon,
>                               code_size, hsbl_desc->start_tag, 0, false);
>
>         /* Generate the BL header */
> -       memset(bl_desc, 0, bl_desc_size);
>         acr->func->generate_hs_bl_desc(load_hdr, bl_desc, offset);
>
>         /*
> @@ -811,6 +829,7 @@ acr_r352_load(struct nvkm_acr *_acr, struct nvkm_falcon *falcon,
>         nvkm_falcon_load_dmem(falcon, bl_desc, hsbl_desc->dmem_load_off,
>                               bl_desc_size, 0);
>
> +       kfree(bl_desc);
>         return hsbl_desc->start_tag << 8;
>  }
>
> diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/acr_r367.c b/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/acr_r367.c
> index 866877b88797..978ad0790367 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/acr_r367.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/acr_r367.c
> @@ -265,6 +265,19 @@ acr_r367_ls_write_wpr(struct acr_r352 *acr, struct list_head *imgs,
>  {
>         struct ls_ucode_img *_img;
>         u32 pos = 0;
> +       u32 max_desc_size = 0;
> +       u8 *gdesc;
> +
> +       list_for_each_entry(_img, imgs, node) {
> +               const struct acr_r352_ls_func *ls_func =
> +                                           acr->func->ls_func[_img->falcon_id];
> +
> +               max_desc_size = max(max_desc_size, ls_func->bl_desc_size);
> +       }
> +
> +       gdesc = kmalloc(max_desc_size, GFP_KERNEL);
> +       if (!gdesc)
> +               return -ENOMEM;
>
>         nvkm_kmap(wpr_blob);
>
> @@ -272,7 +285,6 @@ acr_r367_ls_write_wpr(struct acr_r352 *acr, struct list_head *imgs,
>                 struct ls_ucode_img_r367 *img = ls_ucode_img_r367(_img);
>                 const struct acr_r352_ls_func *ls_func =
>                                             acr->func->ls_func[_img->falcon_id];
> -               u8 gdesc[ls_func->bl_desc_size];
>
>                 nvkm_gpuobj_memcpy_to(wpr_blob, pos, &img->wpr_header,
>                                       sizeof(img->wpr_header));
> @@ -298,6 +310,8 @@ acr_r367_ls_write_wpr(struct acr_r352 *acr, struct list_head *imgs,
>
>         nvkm_done(wpr_blob);
>
> +       kfree(gdesc);
> +
>         return 0;
>  }
>
> --
> 2.17.0
>
>
> --
> Kees Cook
> Pixel Security

Kees Cook June 22, 2018, 9:34 p.m. UTC | #3

On Fri, Jun 22, 2018 at 10:50 AM, Karol Herbst <kherbst@redhat.com> wrote:
> On Thu, May 24, 2018 at 7:24 PM, Kees Cook <keescook@chromium.org> wrote:
>> In the quest to remove all stack VLA usage from the kernel[1], this
>> allocates the working buffers before starting the writing so it won't
>> abort in the middle. This needs an initial walk of the lists to figure
>> out how large the buffer should be.
>>
>> [1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
>>
>> Signed-off-by: Kees Cook <keescook@chromium.org>
>> ---
>>  .../nouveau/nvkm/subdev/secboot/acr_r352.c    | 25 ++++++++++++++++---
>>  .../nouveau/nvkm/subdev/secboot/acr_r367.c    | 16 +++++++++++-
>>  2 files changed, 37 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/acr_r352.c b/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/acr_r352.c
>> index a721354249ce..d02e183717dc 100644
>> --- a/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/acr_r352.c
>> +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/acr_r352.c
>> @@ -414,6 +414,20 @@ acr_r352_ls_write_wpr(struct acr_r352 *acr, struct list_head *imgs,
>>  {
>>         struct ls_ucode_img *_img;
>>         u32 pos = 0;
>> +       u32 max_desc_size = 0;
>> +       u8 *gdesc;
>> +
>> +       /* Figure out how large we need gdesc to be. */
>> +       list_for_each_entry(_img, imgs, node) {
>> +               const struct acr_r352_ls_func *ls_func =
>> +                                           acr->func->ls_func[_img->falcon_id];
>> +
>> +               max_desc_size = max(max_desc_size, ls_func->bl_desc_size);
>> +       }
>> +
>> +       gdesc = kmalloc(max_desc_size, GFP_KERNEL);
>> +       if (!gdesc)
>> +               return -ENOMEM;
>>
>>         nvkm_kmap(wpr_blob);
>>
>> @@ -421,7 +435,6 @@ acr_r352_ls_write_wpr(struct acr_r352 *acr, struct list_head *imgs,
>>                 struct ls_ucode_img_r352 *img = ls_ucode_img_r352(_img);
>>                 const struct acr_r352_ls_func *ls_func =
>>                                             acr->func->ls_func[_img->falcon_id];
>> -               u8 gdesc[ls_func->bl_desc_size];
>>
>
> If there is no guarantee that ls_func->bl_desc_size is a multiple of 4
> ((ls_func->bl_desc_size & 0x3) == 0), then we need to memset a bit more,
> because nvkm_gpuobj_memcpy_to copies 4 bytes at a time later in that
> code, but the last 4 bytes are only partly memset to 0.

I think this is unchanged from the original code, yes? The memset() is
always against bl_desc_size; I haven't changed that.

> If ls_func->bl_desc_size is always a multiple of 0x4, then it isn't as
> important, but still better to be fixed. Or maybe
> nvkm_gpuobj_memcpy_to should do that handling and check if the size is
> a multiple of 0x4 and otherwise handle that case?
>
> Same is valid for the changes in the r367 file.

Should I resend with both the allocation and the memset getting
rounded up to the next multiple of 4?

-Kees

Karol Herbst June 22, 2018, 9:40 p.m. UTC | #4

On Fri, Jun 22, 2018 at 11:34 PM, Kees Cook <keescook@chromium.org> wrote:
> On Fri, Jun 22, 2018 at 10:50 AM, Karol Herbst <kherbst@redhat.com> wrote:
>> On Thu, May 24, 2018 at 7:24 PM, Kees Cook <keescook@chromium.org> wrote:
>>> In the quest to remove all stack VLA usage from the kernel[1], this
>>> allocates the working buffers before starting the writing so it won't
>>> abort in the middle. This needs an initial walk of the lists to figure
>>> out how large the buffer should be.
>>>
>>> [1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
>>>
>>> Signed-off-by: Kees Cook <keescook@chromium.org>
>>> ---
>>>  .../nouveau/nvkm/subdev/secboot/acr_r352.c    | 25 ++++++++++++++++---
>>>  .../nouveau/nvkm/subdev/secboot/acr_r367.c    | 16 +++++++++++-
>>>  2 files changed, 37 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/acr_r352.c b/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/acr_r352.c
>>> index a721354249ce..d02e183717dc 100644
>>> --- a/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/acr_r352.c
>>> +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/acr_r352.c
>>> @@ -414,6 +414,20 @@ acr_r352_ls_write_wpr(struct acr_r352 *acr, struct list_head *imgs,
>>>  {
>>>         struct ls_ucode_img *_img;
>>>         u32 pos = 0;
>>> +       u32 max_desc_size = 0;
>>> +       u8 *gdesc;
>>> +
>>> +       /* Figure out how large we need gdesc to be. */
>>> +       list_for_each_entry(_img, imgs, node) {
>>> +               const struct acr_r352_ls_func *ls_func =
>>> +                                           acr->func->ls_func[_img->falcon_id];
>>> +
>>> +               max_desc_size = max(max_desc_size, ls_func->bl_desc_size);
>>> +       }
>>> +
>>> +       gdesc = kmalloc(max_desc_size, GFP_KERNEL);
>>> +       if (!gdesc)
>>> +               return -ENOMEM;
>>>
>>>         nvkm_kmap(wpr_blob);
>>>
>>> @@ -421,7 +435,6 @@ acr_r352_ls_write_wpr(struct acr_r352 *acr, struct list_head *imgs,
>>>                 struct ls_ucode_img_r352 *img = ls_ucode_img_r352(_img);
>>>                 const struct acr_r352_ls_func *ls_func =
>>>                                             acr->func->ls_func[_img->falcon_id];
>>> -               u8 gdesc[ls_func->bl_desc_size];
>>>
>>
>> If there is no guarantee that ls_func->bl_desc_size is a multiple of 4
>> ((ls_func->bl_desc_size & 0x3) == 0), then we need to memset a bit more,
>> because nvkm_gpuobj_memcpy_to copies 4 bytes at a time later in that
>> code, but the last 4 bytes are only partly memset to 0.
>
> I think this is unchanged from the original code, yes? The memset() is
> always against bl_desc_size; I haven't changed that.
>

Right, but I think before we would upload undefined data (because we
run out of bounds for certain bl_desc_size values); now we get whatever
was left inside the buffer from the previous iteration. Both cases are
not good. It isn't an issue with your patch; the code before wasn't 100%
correct either. But maybe that's fine, because bl_desc_size is always
a multiple of 0x4.
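
The "left over from the previous iteration" case can be reproduced with a reused scratch buffer in userspace. The sketch below is an assumption-laden illustration, not driver code: `prepare_desc()` and `round_up_word()` are hypothetical names, and the `zero_padding` flag exists only to contrast clearing exactly `desc_size` bytes with clearing the size rounded up to a whole 4-byte word.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define WORD 4

/* Round n up to a whole number of WORD-sized units. */
static size_t round_up_word(size_t n)
{
	return (n + WORD - 1) / WORD * WORD;
}

/*
 * Prepare a descriptor of desc_size bytes (each set to `fill`) in a
 * scratch buffer that is reused across calls. When zero_padding is
 * nonzero, the bytes up to the next word boundary are cleared as well,
 * so a later word-wise copy cannot pick up stale contents.
 */
static void prepare_desc(unsigned char *scratch, size_t scratch_size,
			 size_t desc_size, unsigned char fill,
			 int zero_padding)
{
	size_t clear = zero_padding ? round_up_word(desc_size) : desc_size;

	/* The scratch buffer must be sized for the rounded-up maximum. */
	assert(clear <= scratch_size);
	memset(scratch, 0, clear);
	memset(scratch, fill, desc_size);
}
```

Preparing an 8-byte descriptor and then a 6-byte one without padding leaves the last two bytes of the first descriptor in the buffer; with padding enabled they are zero, which matches the "copying 0 is better than random data" preference voiced in the thread.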

>> If ls_func->bl_desc_size is always a multiple of 0x4, then it isn't as
>> important, but still better to be fixed. Or maybe
>> nvkm_gpuobj_memcpy_to should do that handling and check if the size is
>> a multiple of 0x4 and otherwise handle that case?
>>
>> Same is valid for the changes in the r367 file.
>
> Should I resend with both the allocation and the memset getting
> rounded up to the next multiple of 4?

Yeah, I think copying 0 is better than random data.

Your patch is fine as it is though, because it doesn't add a new
issue, it just showed us there is a potential one. We should keep that
in mind and see how we want to fix that up. I can imagine that this
might cause some issues in some places, maybe it is totally fine.

Thanks

>
> -Kees
>
> --
> Kees Cook
> Pixel Security

Kees Cook June 25, 2018, 9:47 p.m. UTC | #5

On Fri, Jun 22, 2018 at 2:40 PM, Karol Herbst <kherbst@redhat.com> wrote:
> Your patch is fine as it is though, because it doesn't add a new
> issue, it just showed us there is a potential one. We should keep that
> in mind and see how we want to fix that up. I can imagine that this
> might cause some issues in some places, maybe it is totally fine.

Okay, thanks! Who can take the patch into their tree?

-Kees
