diff mbox

[v2] rslib: Remove VLAs by setting upper bound on nroots

Message ID 20180315225919.GA43806@beast (mailing list archive)
State New, archived
Headers show

Commit Message

Kees Cook March 15, 2018, 10:59 p.m. UTC
Avoid stack VLAs[1] by always allocating the upper bound of stack space
needed. The existing users of rslib appear to max out at 24 roots[2],
so use that as the upper bound until we have a reason to change it.

Alternative considered: make init_rs() a true caller-instance and
pre-allocate the workspaces. This would possibly need locking and
a refactoring of the returned structure.

Using kmalloc in this path doesn't look great, especially since at
least one caller (pstore) is sensitive to allocations during rslib
usage (it expects to run it during an Oops, for example).

[1] https://lkml.org/lkml/2018/3/7/621
[2] https://lkml.org/lkml/2018/3/9/838

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
---
v2:
Resending to akpm, since this is in lib without an obvious owner. Added
tglx's Reviewed-by.
---
 lib/reed_solomon/decode_rs.c    | 7 ++++---
 lib/reed_solomon/reed_solomon.c | 5 ++++-
 2 files changed, 8 insertions(+), 4 deletions(-)

Comments

Andrew Morton March 16, 2018, 10:38 p.m. UTC | #1
On Thu, 15 Mar 2018 15:59:19 -0700 Kees Cook <keescook@chromium.org> wrote:

> Avoid stack VLAs[1] by always allocating the upper bound of stack space
> needed. The existing users of rslib appear to max out at 24 roots[2],
> so use that as the upper bound until we have a reason to change it.
> 
> Alternative considered: make init_rs() a true caller-instance and
> pre-allocate the workspaces. This would possibly need locking and
> a refactoring of the returned structure.
> 
> Using kmalloc in this path doesn't look great, especially since at
> least one caller (pstore) is sensitive to allocations during rslib
> usage (it expects to run it during an Oops, for example).
> 
> [1] https://lkml.org/lkml/2018/3/7/621
> [2] https://lkml.org/lkml/2018/3/9/838

Yeah, email [2] is rather important.  This patch is a bit of a pig to
review!

The restriction to 24 roots might be a significant one, dunno.  Perhaps
we should just kmalloc these things?

Otherwise, RS_MAX_ROOTS becomes part of the rs interface and should be
documented and published in rslib.h?
Andrew Morton March 16, 2018, 10:59 p.m. UTC | #2
On Thu, 15 Mar 2018 15:59:19 -0700 Kees Cook <keescook@chromium.org> wrote:

> Avoid stack VLAs[1] by always allocating the upper bound of stack space
> needed. The existing users of rslib appear to max out at 24 roots[2],
> so use that as the upper bound until we have a reason to change it.
> 
> Alternative considered: make init_rs() a true caller-instance and
> pre-allocate the workspaces. This would possibly need locking and
> a refactoring of the returned structure.
> 
> Using kmalloc in this path doesn't look great, especially since at
> least one caller (pstore) is sensitive to allocations during rslib
> usage (it expects to run it during an Oops, for example).

Oh.

Could we allocate the storage during init_rs(), attach it to `struct
rs_control'?
Kees Cook March 17, 2018, 6:25 a.m. UTC | #3
On Fri, Mar 16, 2018 at 3:59 PM, Andrew Morton
<akpm@linux-foundation.org> wrote:
> On Thu, 15 Mar 2018 15:59:19 -0700 Kees Cook <keescook@chromium.org> wrote:
>
>> Avoid stack VLAs[1] by always allocating the upper bound of stack space
>> needed. The existing users of rslib appear to max out at 24 roots[2],
>> so use that as the upper bound until we have a reason to change it.
>>
>> Alternative considered: make init_rs() a true caller-instance and
>> pre-allocate the workspaces. This would possibly need locking and
>> a refactoring of the returned structure.
>>
>> Using kmalloc in this path doesn't look great, especially since at
>> least one caller (pstore) is sensitive to allocations during rslib
>> usage (it expects to run it during an Oops, for example).
>
> Oh.
>
> Could we allocate the storage during init_rs(), attach it to `struct
> rs_control'?

No, because they're modified during decode, and struct rs_control is
shared between users. :(

Doing those changes is possible, but it requires a rather extensive
analysis of callers, etc.

Hence, the 24 ultimately.

-Kees
Kees Cook March 26, 2018, 11:17 p.m. UTC | #4
On Fri, Mar 16, 2018 at 11:25 PM, Kees Cook <keescook@chromium.org> wrote:
> On Fri, Mar 16, 2018 at 3:59 PM, Andrew Morton
> <akpm@linux-foundation.org> wrote:
>> On Thu, 15 Mar 2018 15:59:19 -0700 Kees Cook <keescook@chromium.org> wrote:
>>
>>> Avoid stack VLAs[1] by always allocating the upper bound of stack space
>>> needed. The existing users of rslib appear to max out at 24 roots[2],
>>> so use that as the upper bound until we have a reason to change it.
>>>
>>> Alternative considered: make init_rs() a true caller-instance and
>>> pre-allocate the workspaces. This would possibly need locking and
>>> a refactoring of the returned structure.
>>>
>>> Using kmalloc in this path doesn't look great, especially since at
>>> least one caller (pstore) is sensitive to allocations during rslib
>>> usage (it expects to run it during an Oops, for example).
>>
>> Oh.
>>
>> Could we allocate the storage during init_rs(), attach it to `struct
>> rs_control'?
>
> No, because they're modified during decode, and struct rs_control is
> shared between users. :(
>
> Doing those changes is possible, but it requires a rather extensive
> analysis of callers, etc.
>
> Hence, the 24 ultimately.

Can this land in -mm, or does this need further discussion?

-Kees
Andrew Morton March 27, 2018, 11:45 p.m. UTC | #5
On Mon, 26 Mar 2018 16:17:57 -0700 Kees Cook <keescook@chromium.org> wrote:

> On Fri, Mar 16, 2018 at 11:25 PM, Kees Cook <keescook@chromium.org> wrote:
> > On Fri, Mar 16, 2018 at 3:59 PM, Andrew Morton
> > <akpm@linux-foundation.org> wrote:
> >> On Thu, 15 Mar 2018 15:59:19 -0700 Kees Cook <keescook@chromium.org> wrote:
> >>
> >>> Avoid stack VLAs[1] by always allocating the upper bound of stack space
> >>> needed. The existing users of rslib appear to max out at 24 roots[2],
> >>> so use that as the upper bound until we have a reason to change it.
> >>>
> >>> Alternative considered: make init_rs() a true caller-instance and
> >>> pre-allocate the workspaces. This would possibly need locking and
> >>> a refactoring of the returned structure.
> >>>
> >>> Using kmalloc in this path doesn't look great, especially since at
> >>> least one caller (pstore) is sensitive to allocations during rslib
> >>> usage (it expects to run it during an Oops, for example).
> >>
> >> Oh.
> >>
> >> Could we allocate the storage during init_rs(), attach it to `struct
> >> rs_control'?
> >
> > No, because they're modified during decode, and struct rs_control is
> > shared between users. :(
> >
> > Doing those changes is possible, but it requires a rather extensive
> > analysis of callers, etc.
> >
> > Hence, the 24 ultimately.
> 
> Can this land in -mm, or does this need further discussion?

Grumble.  That share-the-rs_control-if-there's-already-a-matching-one
thing looks like premature optimization to me :(

I guess if we put this storage into the rs_control (rather than on the
stack) then we'd have to worry about concurrent uses of it.  It looks
like all the other fields are immutable once it's set up so there might
be such users.  In fact, I suspect there are...
Kees Cook March 27, 2018, 11:55 p.m. UTC | #6
On Tue, Mar 27, 2018 at 4:45 PM, Andrew Morton
<akpm@linux-foundation.org> wrote:
> On Mon, 26 Mar 2018 16:17:57 -0700 Kees Cook <keescook@chromium.org> wrote:
>
>> On Fri, Mar 16, 2018 at 11:25 PM, Kees Cook <keescook@chromium.org> wrote:
>> > On Fri, Mar 16, 2018 at 3:59 PM, Andrew Morton
>> > <akpm@linux-foundation.org> wrote:
>> >> On Thu, 15 Mar 2018 15:59:19 -0700 Kees Cook <keescook@chromium.org> wrote:
>> >>
>> >>> Avoid stack VLAs[1] by always allocating the upper bound of stack space
>> >>> needed. The existing users of rslib appear to max out at 24 roots[2],
>> >>> so use that as the upper bound until we have a reason to change it.
>> >>>
>> >>> Alternative considered: make init_rs() a true caller-instance and
>> >>> pre-allocate the workspaces. This would possibly need locking and
>> >>> a refactoring of the returned structure.
>> >>>
>> >>> Using kmalloc in this path doesn't look great, especially since at
>> >>> least one caller (pstore) is sensitive to allocations during rslib
>> >>> usage (it expects to run it during an Oops, for example).
>> >>
>> >> Oh.
>> >>
>> >> Could we allocate the storage during init_rs(), attach it to `struct
>> >> rs_control'?
>> >
>> > No, because they're modified during decode, and struct rs_control is
>> > shared between users. :(
>> >
>> > Doing those changes is possible, but it requires a rather extensive
>> > analysis of callers, etc.
>> >
>> > Hence, the 24 ultimately.
>>
>> Can this land in -mm, or does this need further discussion?
>
> Grumble.  That share-the-rs_control-if-there's-already-a-matching-one
> thing looks like premature optimization to me :(
>
> I guess if we put this storage into the rs_control (rather than on the
> stack) then we'd have to worry about concurrent uses of it.  It looks
> like all the other fields are immutable once it's set up so there might
> be such users.  In fact, I suspect there are...

Exactly. :( This is the same conclusion tglx and I came to.

-Kees
Thomas Gleixner March 28, 2018, 8:22 a.m. UTC | #7
On Tue, 27 Mar 2018, Kees Cook wrote:
> On Tue, Mar 27, 2018 at 4:45 PM, Andrew Morton
> <akpm@linux-foundation.org> wrote:
> > On Mon, 26 Mar 2018 16:17:57 -0700 Kees Cook <keescook@chromium.org> wrote:
> >
> >> On Fri, Mar 16, 2018 at 11:25 PM, Kees Cook <keescook@chromium.org> wrote:
> >> > On Fri, Mar 16, 2018 at 3:59 PM, Andrew Morton
> >> > <akpm@linux-foundation.org> wrote:
> >> >> On Thu, 15 Mar 2018 15:59:19 -0700 Kees Cook <keescook@chromium.org> wrote:
> >> >>
> >> >>> Avoid stack VLAs[1] by always allocating the upper bound of stack space
> >> >>> needed. The existing users of rslib appear to max out at 24 roots[2],
> >> >>> so use that as the upper bound until we have a reason to change it.
> >> >>>
> >> >>> Alternative considered: make init_rs() a true caller-instance and
> >> >>> pre-allocate the workspaces. This would possibly need locking and
> >> >>> a refactoring of the returned structure.
> >> >>>
> >> >>> Using kmalloc in this path doesn't look great, especially since at
> >> >>> least one caller (pstore) is sensitive to allocations during rslib
> >> >>> usage (it expects to run it during an Oops, for example).
> >> >>
> >> >> Oh.
> >> >>
> >> >> Could we allocate the storage during init_rs(), attach it to `struct
> >> >> rs_control'?
> >> >
> >> > No, because they're modified during decode, and struct rs_control is
> >> > shared between users. :(
> >> >
> >> > Doing those changes is possible, but it requires a rather extensive
> >> > analysis of callers, etc.
> >> >
> >> > Hence, the 24 ultimately.
> >>
> >> Can this land in -mm, or does this need further discussion?
> >
> > Grumble.  That share-the-rs_control-if-there's-already-a-matching-one
> > thing looks like premature optimization to me :(

That was done back then in the days when the first NAND chips required Reed
solomon error correction and we were still running on rather small devices.

> > I guess if we put this storage into the rs_control (rather than on the
> > stack) then we'd have to worry about concurrent uses of it.  It looks
> > like all the other fields are immutable once it's set up so there might
> > be such users.  In fact, I suspect there are...
> 
> Exactly. :( This is the same conclusion tglx and I came to.

I think we can lift that and just let all users set up a new rs_control
from scratch. Takes some time to init the tables, but ....

Thanks,

	tglx
Thomas Gleixner March 28, 2018, 10:10 a.m. UTC | #8
On Wed, 28 Mar 2018, Thomas Gleixner wrote:
> On Tue, 27 Mar 2018, Kees Cook wrote:
> 
> > > I guess if we put this storage into the rs_control (rather than on the
> > > stack) then we'd have to worry about concurrent uses of it.  It looks
> > > like all the other fields are immutable once it's set up so there might
> > > be such users.  In fact, I suspect there are...
> > 
> > Exactly. :( This is the same conclusion tglx and I came to.
> 
> I think we can lift that and just let all users set up a new rs_control
> from scratch. Takes some time to init the tables, but ....

Bah, no. dm-verity really wants to share them.....
diff mbox

Patch

diff --git a/lib/reed_solomon/decode_rs.c b/lib/reed_solomon/decode_rs.c
index 0ec3f257ffdf..3e3becb836a6 100644
--- a/lib/reed_solomon/decode_rs.c
+++ b/lib/reed_solomon/decode_rs.c
@@ -31,9 +31,10 @@ 
 	 * of nroots is 8. So the necessary stack size will be about
 	 * 220 bytes max.
 	 */
-	uint16_t lambda[nroots + 1], syn[nroots];
-	uint16_t b[nroots + 1], t[nroots + 1], omega[nroots + 1];
-	uint16_t root[nroots], reg[nroots + 1], loc[nroots];
+	uint16_t lambda[RS_MAX_ROOTS + 1], syn[RS_MAX_ROOTS];
+	uint16_t b[RS_MAX_ROOTS + 1], t[RS_MAX_ROOTS + 1];
+	uint16_t omega[RS_MAX_ROOTS + 1], root[RS_MAX_ROOTS];
+	uint16_t reg[RS_MAX_ROOTS + 1], loc[RS_MAX_ROOTS];
 	int count = 0;
 	uint16_t msk = (uint16_t) rs->nn;
 
diff --git a/lib/reed_solomon/reed_solomon.c b/lib/reed_solomon/reed_solomon.c
index 06d04cfa9339..3e218e70ac2e 100644
--- a/lib/reed_solomon/reed_solomon.c
+++ b/lib/reed_solomon/reed_solomon.c
@@ -51,6 +51,9 @@  static LIST_HEAD (rslist);
 /* Protection for the list */
 static DEFINE_MUTEX(rslistlock);
 
+/* Ultimately controls the upper bounds of the on-stack buffers. */
+#define RS_MAX_ROOTS	24
+
 /**
  * rs_init - Initialize a Reed-Solomon codec
  * @symsize:	symbol size, bits (1-8)
@@ -210,7 +213,7 @@  static struct rs_control *init_rs_internal(int symsize, int gfpoly,
     		return NULL;
 	if (prim <= 0 || prim >= (1<<symsize))
     		return NULL;
-	if (nroots < 0 || nroots >= (1<<symsize))
+	if (nroots < 0 || nroots >= (1<<symsize) || nroots > RS_MAX_ROOTS)
 		return NULL;
 
 	mutex_lock(&rslistlock);