Message ID | 20180315225919.GA43806@beast (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Thu, 15 Mar 2018 15:59:19 -0700 Kees Cook <keescook@chromium.org> wrote: > Avoid stack VLAs[1] by always allocating the upper bound of stack space > needed. The existing users of rslib appear to max out at 24 roots[2], > so use that as the upper bound until we have a reason to change it. > > Alternative considered: make init_rs() a true caller-instance and > pre-allocate the workspaces. This would possibly need locking and > a refactoring of the returned structure. > > Using kmalloc in this path doesn't look great, especially since at > least one caller (pstore) is sensitive to allocations during rslib > usage (it expects to run it during an Oops, for example). > > [1] https://lkml.org/lkml/2018/3/7/621 > [2] https://lkml.org/lkml/2018/3/9/838 Yeah, email [2] is rather important. This patch is a bit of a pig to review! The restriction to 24 roots might be a significant one, dunno. Perhaps we should just kmalloc these things? Otherwise, RS_MAX_ROOTS becomes part of the rs interface and should be documented and published in rslib.h?
On Thu, 15 Mar 2018 15:59:19 -0700 Kees Cook <keescook@chromium.org> wrote: > Avoid stack VLAs[1] by always allocating the upper bound of stack space > needed. The existing users of rslib appear to max out at 24 roots[2], > so use that as the upper bound until we have a reason to change it. > > Alternative considered: make init_rs() a true caller-instance and > pre-allocate the workspaces. This would possibly need locking and > a refactoring of the returned structure. > > Using kmalloc in this path doesn't look great, especially since at > least one caller (pstore) is sensitive to allocations during rslib > usage (it expects to run it during an Oops, for example). Oh. Could we allocate the storage during init_rs(), attach it to `struct rs_control'?
On Fri, Mar 16, 2018 at 3:59 PM, Andrew Morton <akpm@linux-foundation.org> wrote: > On Thu, 15 Mar 2018 15:59:19 -0700 Kees Cook <keescook@chromium.org> wrote: > >> Avoid stack VLAs[1] by always allocating the upper bound of stack space >> needed. The existing users of rslib appear to max out at 24 roots[2], >> so use that as the upper bound until we have a reason to change it. >> >> Alternative considered: make init_rs() a true caller-instance and >> pre-allocate the workspaces. This would possibly need locking and >> a refactoring of the returned structure. >> >> Using kmalloc in this path doesn't look great, especially since at >> least one caller (pstore) is sensitive to allocations during rslib >> usage (it expects to run it during an Oops, for example). > > Oh. > > Could we allocate the storage during init_rs(), attach it to `struct > rs_control'? No, because they're modified during decode, and struct rs_control is shared between users. :( Doing those changes is possible, but it requires a rather extensive analysis of callers, etc. Hence, the 24 ultimately. -Kees
On Fri, Mar 16, 2018 at 11:25 PM, Kees Cook <keescook@chromium.org> wrote: > On Fri, Mar 16, 2018 at 3:59 PM, Andrew Morton > <akpm@linux-foundation.org> wrote: >> On Thu, 15 Mar 2018 15:59:19 -0700 Kees Cook <keescook@chromium.org> wrote: >> >>> Avoid stack VLAs[1] by always allocating the upper bound of stack space >>> needed. The existing users of rslib appear to max out at 24 roots[2], >>> so use that as the upper bound until we have a reason to change it. >>> >>> Alternative considered: make init_rs() a true caller-instance and >>> pre-allocate the workspaces. This would possibly need locking and >>> a refactoring of the returned structure. >>> >>> Using kmalloc in this path doesn't look great, especially since at >>> least one caller (pstore) is sensitive to allocations during rslib >>> usage (it expects to run it during an Oops, for example). >> >> Oh. >> >> Could we allocate the storage during init_rs(), attach it to `struct >> rs_control'? > > No, because they're modified during decode, and struct rs_control is > shared between users. :( > > Doing those changes is possible, but it requires a rather extensive > analysis of callers, etc. > > Hence, the 24 ultimately. Can this land in -mm, or does this need further discussion? -Kees
On Mon, 26 Mar 2018 16:17:57 -0700 Kees Cook <keescook@chromium.org> wrote: > On Fri, Mar 16, 2018 at 11:25 PM, Kees Cook <keescook@chromium.org> wrote: > > On Fri, Mar 16, 2018 at 3:59 PM, Andrew Morton > > <akpm@linux-foundation.org> wrote: > >> On Thu, 15 Mar 2018 15:59:19 -0700 Kees Cook <keescook@chromium.org> wrote: > >> > >>> Avoid stack VLAs[1] by always allocating the upper bound of stack space > >>> needed. The existing users of rslib appear to max out at 24 roots[2], > >>> so use that as the upper bound until we have a reason to change it. > >>> > >>> Alternative considered: make init_rs() a true caller-instance and > >>> pre-allocate the workspaces. This would possibly need locking and > >>> a refactoring of the returned structure. > >>> > >>> Using kmalloc in this path doesn't look great, especially since at > >>> least one caller (pstore) is sensitive to allocations during rslib > >>> usage (it expects to run it during an Oops, for example). > >> > >> Oh. > >> > >> Could we allocate the storage during init_rs(), attach it to `struct > >> rs_control'? > > > > No, because they're modified during decode, and struct rs_control is > > shared between users. :( > > > > Doing those changes is possible, but it requires a rather extensive > > analysis of callers, etc. > > > > Hence, the 24 ultimately. > > Can this land in -mm, or does this need further discussion? Grumble. That share-the-rs_control-if-there's-already-a-matching-one thing looks like premature optimization to me :( I guess if we put this storage into the rs_control (rather than on the stack) then we'd have to worry about concurrent uses of it. It looks like all the other fields are immutable once it's set up so there might be such users. In fact, I suspect there are...
On Tue, Mar 27, 2018 at 4:45 PM, Andrew Morton <akpm@linux-foundation.org> wrote: > On Mon, 26 Mar 2018 16:17:57 -0700 Kees Cook <keescook@chromium.org> wrote: > >> On Fri, Mar 16, 2018 at 11:25 PM, Kees Cook <keescook@chromium.org> wrote: >> > On Fri, Mar 16, 2018 at 3:59 PM, Andrew Morton >> > <akpm@linux-foundation.org> wrote: >> >> On Thu, 15 Mar 2018 15:59:19 -0700 Kees Cook <keescook@chromium.org> wrote: >> >> >> >>> Avoid stack VLAs[1] by always allocating the upper bound of stack space >> >>> needed. The existing users of rslib appear to max out at 24 roots[2], >> >>> so use that as the upper bound until we have a reason to change it. >> >>> >> >>> Alternative considered: make init_rs() a true caller-instance and >> >>> pre-allocate the workspaces. This would possibly need locking and >> >>> a refactoring of the returned structure. >> >>> >> >>> Using kmalloc in this path doesn't look great, especially since at >> >>> least one caller (pstore) is sensitive to allocations during rslib >> >>> usage (it expects to run it during an Oops, for example). >> >> >> >> Oh. >> >> >> >> Could we allocate the storage during init_rs(), attach it to `struct >> >> rs_control'? >> > >> > No, because they're modified during decode, and struct rs_control is >> > shared between users. :( >> > >> > Doing those changes is possible, but it requires a rather extensive >> > analysis of callers, etc. >> > >> > Hence, the 24 ultimately. >> >> Can this land in -mm, or does this need further discussion? > > Grumble. That share-the-rs_control-if-there's-already-a-matching-one > thing looks like premature optimization to me :( > > I guess if we put this storage into the rs_control (rather than on the > stack) then we'd have to worry about concurrent uses of it. It looks > like all the other fields are immutable once it's set up so there might > be such users. In fact, I suspect there are... Exactly. :( This is the same conclusion tglx and I came to. -Kees
On Tue, 27 Mar 2018, Kees Cook wrote: > On Tue, Mar 27, 2018 at 4:45 PM, Andrew Morton > <akpm@linux-foundation.org> wrote: > > On Mon, 26 Mar 2018 16:17:57 -0700 Kees Cook <keescook@chromium.org> wrote: > > > >> On Fri, Mar 16, 2018 at 11:25 PM, Kees Cook <keescook@chromium.org> wrote: > >> > On Fri, Mar 16, 2018 at 3:59 PM, Andrew Morton > >> > <akpm@linux-foundation.org> wrote: > >> >> On Thu, 15 Mar 2018 15:59:19 -0700 Kees Cook <keescook@chromium.org> wrote: > >> >> > >> >>> Avoid stack VLAs[1] by always allocating the upper bound of stack space > >> >>> needed. The existing users of rslib appear to max out at 24 roots[2], > >> >>> so use that as the upper bound until we have a reason to change it. > >> >>> > >> >>> Alternative considered: make init_rs() a true caller-instance and > >> >>> pre-allocate the workspaces. This would possibly need locking and > >> >>> a refactoring of the returned structure. > >> >>> > >> >>> Using kmalloc in this path doesn't look great, especially since at > >> >>> least one caller (pstore) is sensitive to allocations during rslib > >> >>> usage (it expects to run it during an Oops, for example). > >> >> > >> >> Oh. > >> >> > >> >> Could we allocate the storage during init_rs(), attach it to `struct > >> >> rs_control'? > >> > > >> > No, because they're modified during decode, and struct rs_control is > >> > shared between users. :( > >> > > >> > Doing those changes is possible, but it requires a rather extensive > >> > analysis of callers, etc. > >> > > >> > Hence, the 24 ultimately. > >> > >> Can this land in -mm, or does this need further discussion? > > > > Grumble. That share-the-rs_control-if-there's-already-a-matching-one > > thing looks like premature optimization to me :( That was done back then in the days when the first NAND chips required Reed solomon error correction and we were still running on rather small devices. > > I guess if we put this storage into the rs_control (rather than on the > > stack) then we'd have to worry about concurrent uses of it. It looks > > like all the other fields are immutable once it's set up so there might > > be such users. In fact, I suspect there are... > > Exactly. :( This is the same conclusion tglx and I came to. I think we can lift that and just let all users set up a new rs_control from scratch. Takes some time to init the tables, but .... Thanks, tglx
On Wed, 28 Mar 2018, Thomas Gleixner wrote: > On Tue, 27 Mar 2018, Kees Cook wrote: > > > > I guess if we put this storage into the rs_control (rather than on the > > > stack) then we'd have to worry about concurrent uses of it. It looks > > > like all the other fields are immutable once it's set up so there might > > > be such users. In fact, I suspect there are... > > > > Exactly. :( This is the same conclusion tglx and I came to. > > I think we can lift that and just let all users set up a new rs_control > from scratch. Takes some time to init the tables, but .... Bah, no. dm-verity really wants to share them.....
diff --git a/lib/reed_solomon/decode_rs.c b/lib/reed_solomon/decode_rs.c index 0ec3f257ffdf..3e3becb836a6 100644 --- a/lib/reed_solomon/decode_rs.c +++ b/lib/reed_solomon/decode_rs.c @@ -31,9 +31,10 @@ * of nroots is 8. So the necessary stack size will be about * 220 bytes max. */ - uint16_t lambda[nroots + 1], syn[nroots]; - uint16_t b[nroots + 1], t[nroots + 1], omega[nroots + 1]; - uint16_t root[nroots], reg[nroots + 1], loc[nroots]; + uint16_t lambda[RS_MAX_ROOTS + 1], syn[RS_MAX_ROOTS]; + uint16_t b[RS_MAX_ROOTS + 1], t[RS_MAX_ROOTS + 1]; + uint16_t omega[RS_MAX_ROOTS + 1], root[RS_MAX_ROOTS]; + uint16_t reg[RS_MAX_ROOTS + 1], loc[RS_MAX_ROOTS]; int count = 0; uint16_t msk = (uint16_t) rs->nn; diff --git a/lib/reed_solomon/reed_solomon.c b/lib/reed_solomon/reed_solomon.c index 06d04cfa9339..3e218e70ac2e 100644 --- a/lib/reed_solomon/reed_solomon.c +++ b/lib/reed_solomon/reed_solomon.c @@ -51,6 +51,9 @@ static LIST_HEAD (rslist); /* Protection for the list */ static DEFINE_MUTEX(rslistlock); +/* Ultimately controls the upper bounds of the on-stack buffers. */ +#define RS_MAX_ROOTS 24 + /** * rs_init - Initialize a Reed-Solomon codec * @symsize: symbol size, bits (1-8) @@ -210,7 +213,7 @@ static struct rs_control *init_rs_internal(int symsize, int gfpoly, return NULL; if (prim <= 0 || prim >= (1<<symsize)) return NULL; - if (nroots < 0 || nroots >= (1<<symsize)) + if (nroots < 0 || nroots >= (1<<symsize) || nroots > RS_MAX_ROOTS) return NULL; mutex_lock(&rslistlock);