Message ID | 20240522183133.729159-2-lars@oddbit.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [v4] ax25: Fix refcount imbalance on inbound connections | expand |
On Wed, May 22, 2024 at 02:31:34PM -0400, lars@oddbit.com wrote: > From: Lars Kellogg-Stedman <lars@oddbit.com> > > When releasing a socket in ax25_release(), we call netdev_put() to > decrease the refcount on the associated ax.25 device. However, the > execution path for accepting an incoming connection never calls > netdev_hold(). This imbalance leads to refcount errors, and ultimately > to kernel crashes. > > A typical call trace for the above situation looks like this: > > Call Trace: > <TASK> > ? show_regs+0x64/0x70 > ? __warn+0x83/0x120 > ? refcount_warn_saturate+0xb2/0x100 > ? report_bug+0x158/0x190 > ? prb_read_valid+0x20/0x30 > ? handle_bug+0x3e/0x70 > ? exc_invalid_op+0x1c/0x70 > ? asm_exc_invalid_op+0x1f/0x30 > ? refcount_warn_saturate+0xb2/0x100 > ? refcount_warn_saturate+0xb2/0x100 > ax25_release+0x2ad/0x360 > __sock_release+0x35/0xa0 > sock_close+0x19/0x20 > [...] > > On reboot (or any attempt to remove the interface), the kernel gets > stuck in an infinite loop: > > unregister_netdevice: waiting for ax0 to become free. Usage count = 0 > > This patch corrects these issues by ensuring that we call netdev_hold() > and ax25_dev_hold() for new connections in ax25_accept(). > > Fixes: 7d8a3a477b ("ax25: Fix ax25 session cleanup problems") I thought the fixes tag was: Fixes: 9fd75b66b8f6 ("ax25: Fix refcount leaks caused by ax25_cb_del()") I've already said that I don't think the patch is correct and offered an alternative which takes a reference in accept() but also adds a matching put()... But I can't really test my patch so if we're going to do something that we know is wrong, I'd prefer to just revert Duoming's patch. regards, dan carpenter
On Thu, May 23, 2024 at 11:05 AM Dan Carpenter <dan.carpenter@linaro.org> wrote: > > [snip] > > I've already said that I don't think the patch is correct and offered > an alternative which takes a reference in accept() but also adds a > matching put()... But I can't really test my patch so if we're going to > do something that we know is wrong, I'd prefer to just revert Duoming's > patch. Dan, may I ask how you determined that Lars's patch is incorrect? Testing so far indicates that it works as expected. On the other hand, Lars tested your patch and found that it did not address the underlying issue (https://marc.info/?l=linux-hams&m=171646940902757&w=2). If I may suggest a path forward, given that observed results show that Lars's patch works as expected, perhaps we can commit that and then work to incorporate a more robust ref counting strategy a la your patch? - Dan C.
On Thu, May 23, 2024 at 11:22:43AM -0400, Dan Cross wrote: > On Thu, May 23, 2024 at 11:05 AM Dan Carpenter <dan.carpenter@linaro.org> wrote: > > > [snip] > > > > I've already said that I don't think the patch is correct and offered > > an alternative which takes a reference in accept() but also adds a > > matching put()... But I can't really test my patch so if we're going to > > do something that we know is wrong, I'd prefer to just revert Duoming's > > patch. > > Dan, may I ask how you determined that Lars's patch is incorrect? The problem is that accept() and ax25_release() are not mirrored pairs. We're just taking the reference and never dropping it. Which fixes the use after free but introduces a leak. > Testing so far indicates that it works as expected. On the other hand, > Lars tested your patch and found that it did not address the > underlying issue > (https://marc.info/?l=linux-hams&m=171646940902757&w=2). Yeah. I've said a couple times that my patch wasn't complete. I keep hoping that Duoming will chime in here... > > If I may suggest a path forward, given that observed results show that > Lars's patch works as expected, perhaps we can commit that and then > work to incorporate a more robust ref counting strategy a la your > patch? The argument for this patch is that it works in testing even though we think it's not totally correct. That's not really a good argument. Like we can revert patches that clearly don't work so we could revert Duoming's patch, but when we're adding code then that should work. regards, dan carpenter
On Thu, May 23, 2024 at 2:23 PM Dan Carpenter <dan.carpenter@linaro.org> wrote: > On Thu, May 23, 2024 at 11:22:43AM -0400, Dan Cross wrote: > > On Thu, May 23, 2024 at 11:05 AM Dan Carpenter <dan.carpenter@linaro.org> wrote: > > > > [snip] > > > > > > I've already said that I don't think the patch is correct and offered > > > an alternative which takes a reference in accept() but also adds a > > > matching put()... But I can't really test my patch so if we're going to > > > do something that we know is wrong, I'd prefer to just revert Duoming's > > > patch. > > > > Dan, may I ask how you determined that Lars's patch is incorrect? > > The problem is that accept() and ax25_release() are not mirrored pairs. I'm having a hard time understanding why. Here's my reasoning; please correct me if I'm wrong? Taking a step back, the semantics of `accept` are that, on successful completion, it creates a new socket associated with the accepted connection. It makes sense that such a new socket would take a reference on the underlying device, since the socket is inherently tied to that device; this is what Lars's patch does. Indeed, consider the case that a connection was accepted, and then the bound listening socket was immediately closed, thus dropping the reference on the device: it seems that adding a reference onto the device in the `accept` path is necessary. So how does `ax25_release` get called? That ends up getting invoked from `close`; I traced this through the kernel from `sys_close` until the invocation of the `.release` function from the `proto_ops` vector. The call graph looks something like this: sys_close (fs/open.c) -> file_close_fd (fs/file.c) -> file_closed_fd_locked(same) <- returns struct file to file_close_fd <- returns struct file to sys_close -> filp_flush (fs/open.c) -> ops vec `.flush` (nop for socket) -> __fput_sync (fs/file_table.c): decref(f_count) => __fput -> __fput (fs/file_table.c) -> fsnotify_close(file) (irrelevant) -> f_op->release -> sock_close (net/socket.c) -> __sock_release (net/socket.c) -> proto_ops vec `.release` -> ax25_release (net/ax25/af_ax25.c) There may be other ways it's invoked, but that's likely the main one. It seems clear that this will happen for sockets that have a ref on the device either via `bind` or via `accept`. > We're just taking the reference and never dropping it. Which fixes the > use after free but introduces a leak. I'm not sure that's true. It looks to me like the ref is dropped when the accepted socket is eventually closed. What am I missing? > > Testing so far indicates that it works as expected. On the other hand, > > Lars tested your patch and found that it did not address the > > underlying issue > > (https://marc.info/?l=linux-hams&m=171646940902757&w=2). > > Yeah. I've said a couple times that my patch wasn't complete. I keep > hoping that Duoming will chime in here... +1! > > If I may suggest a path forward, given that observed results show that > > Lars's patch works as expected, perhaps we can commit that and then > > work to incorporate a more robust ref counting strategy a la your > > patch? > > The argument for this patch is that it works in testing even though we > think it's not totally correct. That's not really a good argument. > Like we can revert patches that clearly don't work so we could revert > Duoming's patch, but when we're adding code then that should work. I agree that absence of evidence is not evidence of absence, but I'm not sure I follow the reasoning behind their being a leak. It's not clear to me that Lars's patch is obviously wrong. I _do_ think this code hasn't been shown much love in a long time, and I am totally prepared to admit that I'm wrong, but right now, I don't see it? - Dan C.
On Thu, May 23, 2024 at 04:39:27PM GMT, Dan Cross wrote: > On Thu, May 23, 2024 at 2:23 PM Dan Carpenter <dan.carpenter@linaro.org> wrote: > > The problem is that accept() and ax25_release() are not mirrored pairs. Right, but my in making this patch I wasn't thinking so much about accept/ax25_release, which as you say are not necessarily a mirrored pair... > It seems clear that this will happen for sockets that have a ref on > the device either via `bind` or via `accept`. ...but rather bind/accept, which *are*. The patch I've submitted gives us equivalent behavior on the code paths for inbound and outbound connections. Without this change, the ax.25 subsystem is completely broken. Maybe we can come up with a more correct fix down the road, or maybe we'll refactor all the things, but I would prefer to return the subsystem to a usable state while we figure that out.
On Fri, May 24, 2024 at 11:25 AM Lars Kellogg-Stedman <lars@oddbit.com> wrote: > On Thu, May 23, 2024 at 04:39:27PM GMT, Dan Cross wrote: > > On Thu, May 23, 2024 at 2:23 PM Dan Carpenter <dan.carpenter@linaro.org> wrote: > > > The problem is that accept() and ax25_release() are not mirrored pairs. > > Right, but my in making this patch I wasn't thinking so much about > accept/ax25_release, which as you say are not necessarily a mirrored > pair... > > > It seems clear that this will happen for sockets that have a ref on > > the device either via `bind` or via `accept`. > > ...but rather bind/accept, which *are*. The patch I've submitted gives > us equivalent behavior on the code paths for inbound and outbound > connections. > > Without this change, the ax.25 subsystem is completely broken. Maybe we > can come up with a more correct fix down the road, or maybe we'll > refactor all the things, but I would prefer to return the subsystem to a > usable state while we figure that out. I think the main thrust of my point yesterday was that your accept/release _are_ paired, via `close`. It seems logical to me that every active socket associated with a device should hold a ref against that device, and your change does that. Since I wrote yesterday, I've enabled a bunch of additional debugging code in my kernel (ref tracking, memory leak detection, etc). I can detect no evidence of leaks, despite trying several times to induce them. Perhaps I'm not looking correctly? Dan Carpenter seems fairly convinced that there are, but if that were the case, we ought to be able to detect some evidence of it, no? Absent evidence to the contrary, I believe that the patch you submitted for integration is correct. There may be other bugs lurking in that general area, but if so, they're only tangentially related, if at all. - Dan C.
On Fri, May 24, 2024 at 11:25:36AM -0400, Lars Kellogg-Stedman wrote: > On Thu, May 23, 2024 at 04:39:27PM GMT, Dan Cross wrote: > > On Thu, May 23, 2024 at 2:23 PM Dan Carpenter <dan.carpenter@linaro.org> wrote: > > > The problem is that accept() and ax25_release() are not mirrored pairs. > > Right, but my in making this patch I wasn't thinking so much about > accept/ax25_release, which as you say are not necessarily a mirrored > pair... > > > It seems clear that this will happen for sockets that have a ref on > > the device either via `bind` or via `accept`. > > ...but rather bind/accept, which *are*. The patch I've submitted gives > us equivalent behavior on the code paths for inbound and outbound > connections. This is the explanation I was looking for. Sorry, I meant to review these patches again over the weekend but I got caught up in other things. Give me until tomorrow to review it again. regards, dan carpenter
On Wed, 2024-05-22 at 14:31 -0400, lars@oddbit.com wrote: > From: Lars Kellogg-Stedman <lars@oddbit.com> > > When releasing a socket in ax25_release(), we call netdev_put() to > decrease the refcount on the associated ax.25 device. However, the > execution path for accepting an incoming connection never calls > netdev_hold(). This imbalance leads to refcount errors, and ultimately > to kernel crashes. > > A typical call trace for the above situation looks like this: > > Call Trace: > <TASK> > ? show_regs+0x64/0x70 > ? __warn+0x83/0x120 > ? refcount_warn_saturate+0xb2/0x100 > ? report_bug+0x158/0x190 > ? prb_read_valid+0x20/0x30 > ? handle_bug+0x3e/0x70 > ? exc_invalid_op+0x1c/0x70 > ? asm_exc_invalid_op+0x1f/0x30 > ? refcount_warn_saturate+0xb2/0x100 > ? refcount_warn_saturate+0xb2/0x100 > ax25_release+0x2ad/0x360 > __sock_release+0x35/0xa0 > sock_close+0x19/0x20 > [...] > > On reboot (or any attempt to remove the interface), the kernel gets > stuck in an infinite loop: > > unregister_netdevice: waiting for ax0 to become free. Usage count = 0 > > This patch corrects these issues by ensuring that we call netdev_hold() > and ax25_dev_hold() for new connections in ax25_accept(). > > Fixes: 7d8a3a477b ("ax25: Fix ax25 session cleanup problems") Note that the fixes tag above is still wrong - the hash must be 12 chars long, see: https://elixir.bootlin.com/linux/v6.10-rc1/source/Documentation/process/5.Posting.rst#L207 so regardless of Dan's eventual ack you will have to repost this patch. Please run checkpatch locally before the next submission, thanks, Paolo
On Tue, May 28, 2024 at 11:40:38AM GMT, Paolo Abeni wrote: > Note that the fixes tag above is still wrong - the hash must be 12 > chars long, see: I had spotted that, thanks. Just waiting to see if there are any real change requests before re-submitting.
[ Sorry for the delay on sending this. My email daemon messed up so all my outgoing mail has been going to /dev/null for the past few days. Resending everything... -dan ] On Tue, May 28, 2024 at 12:06:39PM -0400, Lars Kellogg-Stedman wrote: > On Tue, May 28, 2024 at 11:40:38AM GMT, Paolo Abeni wrote: > > Note that the fixes tag above is still wrong - the hash must be 12 > > chars long, see: > > I had spotted that, thanks. Just waiting to see if there are any real > change requests before re-submitting. Okay. It looks good to me. Sorry for all the headache. 1) The Fixes tag points to the wrong commit, though, right? The one you have here doesn't make sense and it doesn't match the bisect. 2) Can we edit the commitmessage a bit to say include what you wrote about "but rather bind/accept" being paired. We increment in bind and we should increment in accept as well. It's the same. 3) The other thing that I notice is that Duoming dropped part of his commit when he resent v6. https://lore.kernel.org/all/5c61fea1b20f3c1596e4fb46282c3dedc54513a3.1715065005.git.duoming@zju.edu.cn/ That part of the commit was correct. Maybe it wasn't necessary but it feels right and it's more readable and it's obviously harmless. I can resend that. regards, dan carpenter
On Wed, May 29, 2024 at 05:34:20PM GMT, Dan Carpenter wrote: > 1) The Fixes tag points to the wrong commit, though, right? The one > you have here doesn't make sense and it doesn't match the bisect. I'll double check that; thanks for checking. > 2) Can we edit the commitmessage a bit to say include what you wrote > about "but rather bind/accept" being paired. We increment in bind > and we should increment in accept as well. It's the same. I'll update the wording. > 3) The other thing that I notice is that Duoming dropped part of his > commit when he resent v6. > https://lore.kernel.org/all/5c61fea1b20f3c1596e4fb46282c3dedc54513a3.1715065005.git.duoming@zju.edu.cn/ > That part of the commit was correct. Maybe it wasn't necessary but it > feels right and it's more readable and it's obviously harmless. I can > resend that. Just so that I'm clear, with that comment you're not suggesting any changes to my patch, right?
On Wed, 29 May 2024 17:34:20 +0300 Dan Carpenter wrote: > 1) The Fixes tag points to the wrong commit, though, right? The one > you have here doesn't make sense and it doesn't match the bisect. I also have tested Lars Kellogg-Stedman`s patch, it works well. I think the Fixes tag shoud be 9fd75b66b8f6 ("ax25: Fix refcount leaks caused by ax25_cb_del()"). > 2) Can we edit the commitmessage a bit to say include what you wrote > about "but rather bind/accept" being paired. We increment in bind > and we should increment in accept as well. It's the same. > > 3) The other thing that I notice is that Duoming dropped part of his > commit when he resent v6. > https://lore.kernel.org/all/5c61fea1b20f3c1596e4fb46282c3dedc54513a3.1715065005.git.duoming@zju.edu.cn/ > That part of the commit was correct. Maybe it wasn't necessary but it > feels right and it's more readable and it's obviously harmless. I can > resend that. I will resend it latter. Best regards, Duoming Zhou
On Wed, May 29, 2024 at 10:54:45AM -0400, Lars Kellogg-Stedman wrote: > > 3) The other thing that I notice is that Duoming dropped part of his > > commit when he resent v6. > > https://lore.kernel.org/all/5c61fea1b20f3c1596e4fb46282c3dedc54513a3.1715065005.git.duoming@zju.edu.cn/ > > That part of the commit was correct. Maybe it wasn't necessary but it > > feels right and it's more readable and it's obviously harmless. I can > > resend that. > > Just so that I'm clear, with that comment you're not suggesting any > changes to my patch, right? No, I just noticed it while reviewing the code. I'll take care of it. regards, dan carpenter
On Wed, May 29, 2024 at 11:01:52PM +0800, duoming@zju.edu.cn wrote: > On Wed, 29 May 2024 17:34:20 +0300 Dan Carpenter wrote: > > 1) The Fixes tag points to the wrong commit, though, right? The one > > you have here doesn't make sense and it doesn't match the bisect. > > I also have tested Lars Kellogg-Stedman`s patch, it works well. I think the Fixes > tag shoud be 9fd75b66b8f6 ("ax25: Fix refcount leaks caused by ax25_cb_del()"). > > > 2) Can we edit the commitmessage a bit to say include what you wrote > > about "but rather bind/accept" being paired. We increment in bind > > and we should increment in accept as well. It's the same. > > > > 3) The other thing that I notice is that Duoming dropped part of his > > commit when he resent v6. > > https://lore.kernel.org/all/5c61fea1b20f3c1596e4fb46282c3dedc54513a3.1715065005.git.duoming@zju.edu.cn/ > > That part of the commit was correct. Maybe it wasn't necessary but it > > feels right and it's more readable and it's obviously harmless. I can > > resend that. > > I will resend it latter. Awesome! Thanks, and thanks for testing as well. regards, dan carpenter
diff --git a/net/ax25/af_ax25.c b/net/ax25/af_ax25.c index 8077cf2ee44..d6f9fae06a9 100644 --- a/net/ax25/af_ax25.c +++ b/net/ax25/af_ax25.c @@ -1378,8 +1378,10 @@ static int ax25_accept(struct socket *sock, struct socket *newsock, { struct sk_buff *skb; struct sock *newsk; + ax25_dev *ax25_dev; DEFINE_WAIT(wait); struct sock *sk; + ax25_cb *ax25; int err = 0; if (sock->state != SS_UNCONNECTED) @@ -1434,6 +1436,10 @@ static int ax25_accept(struct socket *sock, struct socket *newsock, kfree_skb(skb); sk_acceptq_removed(sk); newsock->state = SS_CONNECTED; + ax25 = sk_to_ax25(newsk); + ax25_dev = ax25->ax25_dev; + netdev_hold(ax25_dev->dev, &ax25->dev_tracker, GFP_ATOMIC); + ax25_dev_hold(ax25_dev); out: release_sock(sk);