Message ID | bfb9c077-b9a6-47f4-8cd8-a7a86b056a21@moroto.mountain (mailing list archive) |
---|---|
State | Accepted |
Commit | 75e6def3b26736e7ff80639810098c9074229737 |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [1/2,net] sctp: handle invalid error codes without calling BUG() | expand |
On Fri, Jun 9, 2023 at 7:05 AM Dan Carpenter <dan.carpenter@linaro.org> wrote: > > The sctp_sf_eat_auth() function is supposed to enum sctp_disposition > values and returning a kernel error code will cause issues in the > caller. Change -ENOMEM to SCTP_DISPOSITION_NOMEM. > > Fixes: 65b07e5d0d09 ("[SCTP]: API updates to suport SCTP-AUTH extensions.") > Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> > --- > net/sctp/sm_statefuns.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/net/sctp/sm_statefuns.c b/net/sctp/sm_statefuns.c > index 97f1155a2045..08fdf1251f46 100644 > --- a/net/sctp/sm_statefuns.c > +++ b/net/sctp/sm_statefuns.c > @@ -4482,7 +4482,7 @@ enum sctp_disposition sctp_sf_eat_auth(struct net *net, > SCTP_AUTH_NEW_KEY, GFP_ATOMIC); > > if (!ev) > - return -ENOMEM; > + return SCTP_DISPOSITION_NOMEM; > > sctp_add_cmd_sf(commands, SCTP_CMD_EVENT_ULP, > SCTP_ULPEVENT(ev)); > -- > 2.39.2 > This one looks good to me. But for the patch 1/2 (somehow it doesn't show up in my mailbox): default: pr_err("impossible disposition %d in state %d, event_type %d, event_id %d\n", status, state, event_type, subtype.chunk); - BUG(); + error = status; + if (error >= 0) + error = -EINVAL; + WARN_ON_ONCE(1); I think from the sctp_do_sm() perspective, it expects the state_fn status only from enum sctp_disposition. It is a BUG to receive any other values and must be fixed, as you did in 2/2. It does the same thing as other functions in SCTP code, like sctp_sf_eat_data_*(), sctp_retransmit() etc. Thanks.
On Fri, Jun 09, 2023 at 11:13:03AM -0400, Xin Long wrote: > This one looks good to me. > > But for the patch 1/2 (somehow it doesn't show up in my mailbox): > > default: > pr_err("impossible disposition %d in state %d, event_type %d, event_id %d\n", > status, state, event_type, subtype.chunk); > - BUG(); > + error = status; > + if (error >= 0) > + error = -EINVAL; > + WARN_ON_ONCE(1); > > I think from the sctp_do_sm() perspective, it expects the state_fn > status only from > enum sctp_disposition. It is a BUG to receive any other values and > must be fixed, > as you did in 2/2. It does the same thing as other functions in SCTP code, like > sctp_sf_eat_data_*(), sctp_retransmit() etc. It is a bug, sure. And after my patch is applied it will still trigger a stack trace. But we should only call the actual BUG() function in order to prevent filesystem corruption or a privilege escalation or something along those lines. Calling BUG() makes the system unusable so it makes bugs harder to debug. This is even mentioned in checkpatch.pl "Do not crash the kernel unless it is absolutely unavoidable--use WARN_ON_ONCE() plus recovery code (if feasible) instead of BUG() or variants". regards, dan carpenter
On Fri, Jun 9, 2023 at 12:41 PM Dan Carpenter <dan.carpenter@linaro.org> wrote: > > On Fri, Jun 09, 2023 at 11:13:03AM -0400, Xin Long wrote: > > This one looks good to me. > > > > But for the patch 1/2 (somehow it doesn't show up in my mailbox): > > > > default: > > pr_err("impossible disposition %d in state %d, event_type %d, event_id %d\n", > > status, state, event_type, subtype.chunk); > > - BUG(); > > + error = status; > > + if (error >= 0) > > + error = -EINVAL; > > + WARN_ON_ONCE(1); > > > > I think from the sctp_do_sm() perspective, it expects the state_fn > > status only from > > enum sctp_disposition. It is a BUG to receive any other values and > > must be fixed, > > as you did in 2/2. It does the same thing as other functions in SCTP code, like > > sctp_sf_eat_data_*(), sctp_retransmit() etc. > > It is a bug, sure. And after my patch is applied it will still trigger > a stack trace. But we should only call the actual BUG() function > in order to prevent filesystem corruption or a privilege escalation or > something along those lines. Hi, Dan, Sorry, I'm not sure about this. Look at the places where it's using BUG(), it's not exactly the case, like in ping_err() or ping_common_sendmsg(), BUG() are used more for unexpected cases, which don't cause any filesystem corruption or a privilege escalation. You may also check more others under net/*. > > Calling BUG() makes the system unusable so it makes bugs harder to > debug. This is even mentioned in checkpatch.pl "Do not crash the kernel > unless it is absolutely unavoidable--use WARN_ON_ONCE() plus recovery > code (if feasible) instead of BUG() or variants". > "absolutely unavoidable", I think in a module, if it gets a case that is not expected at all, and the consequence (it may cause or has caused) is unsure, WARN_ON_ONCE() is not enough. Thanks.
On Fri, 9 Jun 2023 19:04:17 -0400 Xin Long wrote: > On Fri, Jun 9, 2023 at 12:41 PM Dan Carpenter <dan.carpenter@linaro.org> wrote: > > > > On Fri, Jun 09, 2023 at 11:13:03AM -0400, Xin Long wrote: > > > This one looks good to me. > > > > > > But for the patch 1/2 (somehow it doesn't show up in my mailbox): > > > > > > default: > > > pr_err("impossible disposition %d in state %d, event_type %d, event_id %d\n", > > > status, state, event_type, subtype.chunk); > > > - BUG(); > > > + error = status; > > > + if (error >= 0) > > > + error = -EINVAL; > > > + WARN_ON_ONCE(1); > > > > > > I think from the sctp_do_sm() perspective, it expects the state_fn > > > status only from > > > enum sctp_disposition. It is a BUG to receive any other values and > > > must be fixed, > > > as you did in 2/2. It does the same thing as other functions in SCTP code, like > > > sctp_sf_eat_data_*(), sctp_retransmit() etc. > > > > It is a bug, sure. And after my patch is applied it will still trigger > > a stack trace. But we should only call the actual BUG() function > > in order to prevent filesystem corruption or a privilege escalation or > > something along those lines. > Hi, Dan, > > Sorry, I'm not sure about this. > > Look at the places where it's using BUG(), it's not exactly the case, like > in ping_err() or ping_common_sendmsg(), BUG() are used more for > unexpected cases, which don't cause any filesystem corruption or a > privilege escalation. > > You may also check more others under net/*. Most BUG()s under net/ are historic. The legit BUG() uses I can think of are at boot, if something fails you can't bring up the system at all. https://docs.kernel.org/process/deprecated.html?highlight=bug#bug-and-bug-on > > Calling BUG() makes the system unusable so it makes bugs harder to > > debug. This is even mentioned in checkpatch.pl "Do not crash the kernel > > unless it is absolutely unavoidable--use WARN_ON_ONCE() plus recovery > > code (if feasible) instead of BUG() or variants". > > > "absolutely unavoidable", I think in a module, if it gets a case that is not > expected at all, and the consequence (it may cause or has caused) is > unsure, WARN_ON_ONCE() is not enough.
On Fri, Jun 09, 2023 at 07:04:17PM -0400, Xin Long wrote: > On Fri, Jun 9, 2023 at 12:41 PM Dan Carpenter <dan.carpenter@linaro.org> wrote: > > > > On Fri, Jun 09, 2023 at 11:13:03AM -0400, Xin Long wrote: > > It is a bug, sure. And after my patch is applied it will still trigger > > a stack trace. But we should only call the actual BUG() function > > in order to prevent filesystem corruption or a privilege escalation or > > something along those lines. > Hi, Dan, > > Sorry, I'm not sure about this. > > Look at the places where it's using BUG(), it's not exactly the case, like > in ping_err() or ping_common_sendmsg(), BUG() are used more for > unexpected cases, which don't cause any filesystem corruption or a > privilege escalation. > > You may also check more others under net/*. > Linus has been very clear that the BUG() in ping_err() is wrong and should be removed. But to me if you're very very sure a BUG() can't be triggered that's more like a style or philosophy debate than a real life issue. https://lore.kernel.org/all/CAHk-=wg40EAZofO16Eviaj7mfqDhZ2gVEbvfsMf6gYzspRjYvw@mail.gmail.com/ When you look at ping_err() then it's like. Ugh... If we leave off the else statement then GCC and other static checkers will complain that the variables are uninitialized. It we add a return then it communicates to the reader that this path is possible. But the BUG() silences the static checker warning and communicates that the path is impossible. A different solution might be to do a WARN(); followed by a return. Or unreachable();. But the last time I proposed using unreachable() for annotating impossible paths it lead to link errors and I haven't had time to investigate. Another idea is that we could create a WARN() that included an unreachable() annotation. } else { IMPOSSIBLE("An impossible thing has occured"); } As a static analysis developer, I have made Smatch ignore WARN() information because warnings happen regularly and the information they provide is not useful. Smatch does consider unreachable() annotations as accurate. Anyway, in this patch the situation is completely different. Returning wrong error codes is a very common bug. It's already happened once and it will likely happen again. My main worry with this patch is that the networking maintainers will say, "Thanks, but please delete all the calls to BUG() in this function". I just selected this one because it was particularly bad and it needs to be handled a bit specially. Deleting all the other calls to BUG() isn't something that I want to take on. regards, dan carpenter
On Sat, Jun 10, 2023 at 2:28 AM Dan Carpenter <dan.carpenter@linaro.org> wrote: > > On Fri, Jun 09, 2023 at 07:04:17PM -0400, Xin Long wrote: > > On Fri, Jun 9, 2023 at 12:41 PM Dan Carpenter <dan.carpenter@linaro.org> wrote: > > > > > > On Fri, Jun 09, 2023 at 11:13:03AM -0400, Xin Long wrote: > > > It is a bug, sure. And after my patch is applied it will still trigger > > > a stack trace. But we should only call the actual BUG() function > > > in order to prevent filesystem corruption or a privilege escalation or > > > something along those lines. > > Hi, Dan, > > > > Sorry, I'm not sure about this. > > > > Look at the places where it's using BUG(), it's not exactly the case, like > > in ping_err() or ping_common_sendmsg(), BUG() are used more for > > unexpected cases, which don't cause any filesystem corruption or a > > privilege escalation. > > > > You may also check more others under net/*. > > > > Linus has been very clear that the BUG() in ping_err() is wrong and > should be removed. But to me if you're very very sure a BUG() can't be > triggered that's more like a style or philosophy debate than a real life > issue. > > https://lore.kernel.org/all/CAHk-=wg40EAZofO16Eviaj7mfqDhZ2gVEbvfsMf6gYzspRjYvw@mail.gmail.com/ I see, didn't know that there are so many BUG() uses that are historic in the kernel, including net/*. > > When you look at ping_err() then it's like. Ugh... If we leave off the > else statement then GCC and other static checkers will complain that the > variables are uninitialized. It we add a return then it communicates to > the reader that this path is possible. But the BUG() silences the > static checker warning and communicates that the path is impossible. > > A different solution might be to do a WARN(); followed by a return. Or > unreachable();. But the last time I proposed using unreachable() for > annotating impossible paths it lead to link errors and I haven't had > time to investigate. Another idea is that we could create a WARN() that > included an unreachable() annotation. > > } else { > IMPOSSIBLE("An impossible thing has occured"); > } > > As a static analysis developer, I have made Smatch ignore WARN() > information because warnings happen regularly and the information they > provide is not useful. Smatch does consider unreachable() annotations > as accurate. Got it, thanks for the extra information. A WARN() with unreachable() annotation sounds like a good idea. > > Anyway, in this patch the situation is completely different. Returning > wrong error codes is a very common bug. It's already happened once and > it will likely happen again. > > My main worry with this patch is that the networking maintainers will > say, "Thanks, but please delete all the calls to BUG() in this function". > I just selected this one because it was particularly bad and it needs to > be handled a bit specially. Deleting all the other calls to BUG() isn't > something that I want to take on. > Yeah, we should gradually replace these bogus BUG()s. Anyway, for these two patches: Acked-by: Xin Long <lucien.xin@gmail.com> Thanks.
diff --git a/net/sctp/sm_statefuns.c b/net/sctp/sm_statefuns.c index 97f1155a2045..08fdf1251f46 100644 --- a/net/sctp/sm_statefuns.c +++ b/net/sctp/sm_statefuns.c @@ -4482,7 +4482,7 @@ enum sctp_disposition sctp_sf_eat_auth(struct net *net, SCTP_AUTH_NEW_KEY, GFP_ATOMIC); if (!ev) - return -ENOMEM; + return SCTP_DISPOSITION_NOMEM; sctp_add_cmd_sf(commands, SCTP_CMD_EVENT_ULP, SCTP_ULPEVENT(ev));
The sctp_sf_eat_auth() function is supposed to enum sctp_disposition values and returning a kernel error code will cause issues in the caller. Change -ENOMEM to SCTP_DISPOSITION_NOMEM. Fixes: 65b07e5d0d09 ("[SCTP]: API updates to suport SCTP-AUTH extensions.") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> --- net/sctp/sm_statefuns.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)