diff mbox series

[v2] ax25: Fix refcount imbalance on inbound connections

Message ID 46ydfjtpinm3py3zt6lltxje4cpdvuugaatbvx4y27m7wxc2hz@4wdtoq7yfrd5 (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series [v2] ax25: Fix refcount imbalance on inbound connections | expand

Checks

Context Check Description
netdev/series_format warning Single patches do not need cover letters; Target tree name not specified in the subject
netdev/tree_selection success Guessed tree name to be net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 905 this patch: 905
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers fail 2 blamed authors not CCed: pabeni@redhat.com duoming@zju.edu.cn; 5 maintainers not CCed: pabeni@redhat.com kuba@kernel.org duoming@zju.edu.cn edumazet@google.com jreuter@yaina.de
netdev/build_clang success Errors and warnings before: 909 this patch: 909
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes fail Problems with Fixes tag: 2
netdev/build_allmodconfig_warn success Errors and warnings before: 909 this patch: 909
netdev/checkpatch warning WARNING: Please use correct Fixes: style 'Fixes: <12 chars of sha1> ("<title line>")' - ie: 'Fixes: 2fa653e0a4db ("ax25: Fix refcount imbalance on inbound connections")' WARNING: Use lore.kernel.org archive links when possible - see https://lore.kernel.org/lists.html
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2024-05-21--15-00 (tests: 1038)

Commit Message

Lars Kellogg-Stedman May 21, 2024, 1:48 p.m. UTC
The first version of this patch was posted only to the linux-hams
mailing list. It has been difficult to get the patch reviewed, but the
patch has now been tested successfully by three people (that includes
me) who have all verified that it prevents the crashes that were
previously plaguing inbound ax.25 connections.

Related discussions:

- https://marc.info/?l=linux-hams&m=171629285223248&w=2
- https://marc.info/?l=linux-hams&m=171270115728031&w=2

>8------------------------------------------------------8<

When releasing a socket in ax25_release(), we call netdev_put() to
decrease the refcount on the associated ax.25 device. However, the
execution path for accepting an incoming connection never calls
netdev_hold(). This imbalance leads to refcount errors, and ultimately
to kernel crashes.

A typical call trace for the above situation looks like this:

    Call Trace:
    <TASK>
    ? show_regs+0x64/0x70
    ? __warn+0x83/0x120
    ? refcount_warn_saturate+0xb2/0x100
    ? report_bug+0x158/0x190
    ? prb_read_valid+0x20/0x30
    ? handle_bug+0x3e/0x70
    ? exc_invalid_op+0x1c/0x70
    ? asm_exc_invalid_op+0x1f/0x30
    ? refcount_warn_saturate+0xb2/0x100
    ? refcount_warn_saturate+0xb2/0x100
    ax25_release+0x2ad/0x360
    __sock_release+0x35/0xa0
    sock_close+0x19/0x20
    [...]

On reboot (or any attempt to remove the interface), the kernel gets
stuck in an infinite loop:

    unregister_netdevice: waiting for ax0 to become free. Usage count = 0

This patch corrects these issues by ensuring that we call netdev_hold()
and ax25_dev_hold() for new connections in ax25_accept(), balancing the
calls to netdev_put() and ax25_dev_put() in ax25_release.

Fixes: 7d8a3a477b
Signed-off-by: Lars Kellogg-Stedman <lars@oddbit.com>
---
 net/ax25/af_ax25.c | 6 ++++++
 1 file changed, 6 insertions(+)

Comments

Chris Maness May 21, 2024, 2:32 p.m. UTC | #1
Should I expect to this downstream in 6.9.2?

-Chris KQ6UP

On Tue, May 21, 2024 at 7:26 AM Lars Kellogg-Stedman <lars@oddbit.com> wrote:
>
> The first version of this patch was posted only to the linux-hams
> mailing list. It has been difficult to get the patch reviewed, but the
> patch has now been tested successfully by three people (that includes
> me) who have all verified that it prevents the crashes that were
> previously plaguing inbound ax.25 connections.
>
> Related discussions:
>
> - https://marc.info/?l=linux-hams&m=171629285223248&w=2
> - https://marc.info/?l=linux-hams&m=171270115728031&w=2
>
> >8------------------------------------------------------8<
>
> When releasing a socket in ax25_release(), we call netdev_put() to
> decrease the refcount on the associated ax.25 device. However, the
> execution path for accepting an incoming connection never calls
> netdev_hold(). This imbalance leads to refcount errors, and ultimately
> to kernel crashes.
>
> A typical call trace for the above situation looks like this:
>
>     Call Trace:
>     <TASK>
>     ? show_regs+0x64/0x70
>     ? __warn+0x83/0x120
>     ? refcount_warn_saturate+0xb2/0x100
>     ? report_bug+0x158/0x190
>     ? prb_read_valid+0x20/0x30
>     ? handle_bug+0x3e/0x70
>     ? exc_invalid_op+0x1c/0x70
>     ? asm_exc_invalid_op+0x1f/0x30
>     ? refcount_warn_saturate+0xb2/0x100
>     ? refcount_warn_saturate+0xb2/0x100
>     ax25_release+0x2ad/0x360
>     __sock_release+0x35/0xa0
>     sock_close+0x19/0x20
>     [...]
>
> On reboot (or any attempt to remove the interface), the kernel gets
> stuck in an infinite loop:
>
>     unregister_netdevice: waiting for ax0 to become free. Usage count = 0
>
> This patch corrects these issues by ensuring that we call netdev_hold()
> and ax25_dev_hold() for new connections in ax25_accept(), balancing the
> calls to netdev_put() and ax25_dev_put() in ax25_release.
>
> Fixes: 7d8a3a477b
> Signed-off-by: Lars Kellogg-Stedman <lars@oddbit.com>
> ---
>  net/ax25/af_ax25.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/net/ax25/af_ax25.c b/net/ax25/af_ax25.c
> index 8077cf2ee44..ff921272d40 100644
> --- a/net/ax25/af_ax25.c
> +++ b/net/ax25/af_ax25.c
> @@ -1381,6 +1381,8 @@ static int ax25_accept(struct socket *sock, struct socket *newsock,
>         DEFINE_WAIT(wait);
>         struct sock *sk;
>         int err = 0;
> +       ax25_cb *ax25;
> +       ax25_dev *ax25_dev;
>
>         if (sock->state != SS_UNCONNECTED)
>                 return -EINVAL;
> @@ -1434,6 +1436,10 @@ static int ax25_accept(struct socket *sock, struct socket *newsock,
>         kfree_skb(skb);
>         sk_acceptq_removed(sk);
>         newsock->state = SS_CONNECTED;
> +       ax25 = sk_to_ax25(newsk);
> +       ax25_dev = ax25->ax25_dev;
> +       netdev_hold(ax25_dev->dev, &ax25->dev_tracker, GFP_ATOMIC);
> +       ax25_dev_hold(ax25_dev);
>
>  out:
>         release_sock(sk);
> --
> 2.45.1
>
> --
> Lars Kellogg-Stedman <lars@oddbit.com> | larsks @ {irc,twitter,github}
> http://blog.oddbit.com/                | N1LKS
>
Naveen Mamindlapalli May 21, 2024, 5:21 p.m. UTC | #2
> -----Original Message-----
> From: Lars Kellogg-Stedman <lars@oddbit.com>
> Sent: Tuesday, May 21, 2024 7:19 PM
> To: netdev@vger.kernel.org; linux-hams@vger.kernel.org
> Subject: [PATCH v2] ax25: Fix refcount imbalance on inbound
> connections
> 
> The first version of this patch was posted only to the linux-hams mailing list. It
> has been difficult to get the patch reviewed, but the patch has now been tested
> successfully by three people (that includes
> me) who have all verified that it prevents the crashes that were previously
> plaguing inbound ax.25 connections.
> 
> Related discussions:
> 
> - https://urldefense.proofpoint.com/v2/url?u=https-3A__marc.info_-3Fl-3Dlinux-
> 2Dhams-26m-3D171629285223248-26w-
> 3D2&d=DwIBAg&c=nKjWec2b6R0mOyPaz7xtfQ&r=TwreqwV6mQ8K9wIpqwFO8y
> jikO_w1jUOe2MzChg4Rmg&m=T0z8LSTV_ukO1CjzoYF7lKkzfKSd1iQ3b4biO_v8e
> 0R8Llg9vHP8lSQ3tGo5sXr4&s=-
> MpoDvDpxv4ixHLeXbhPn1j9UnlUz6cJ29sdjMqsA6I&e=
> - https://urldefense.proofpoint.com/v2/url?u=https-3A__marc.info_-3Fl-3Dlinux-
> 2Dhams-26m-3D171270115728031-26w-
> 3D2&d=DwIBAg&c=nKjWec2b6R0mOyPaz7xtfQ&r=TwreqwV6mQ8K9wIpqwFO8y
> jikO_w1jUOe2MzChg4Rmg&m=T0z8LSTV_ukO1CjzoYF7lKkzfKSd1iQ3b4biO_v8e
> 0R8Llg9vHP8lSQ3tGo5sXr4&s=3ZUXfkS5ChKuooKI5BBasvYQeJMH-
> FYqbeskhJKJdOI&e=
> 
> >8------------------------------------------------------8<
> 
> When releasing a socket in ax25_release(), we call netdev_put() to decrease the
> refcount on the associated ax.25 device. However, the execution path for
> accepting an incoming connection never calls netdev_hold(). This imbalance
> leads to refcount errors, and ultimately to kernel crashes.
> 
> A typical call trace for the above situation looks like this:
> 
>     Call Trace:
>     <TASK>
>     ? show_regs+0x64/0x70
>     ? __warn+0x83/0x120
>     ? refcount_warn_saturate+0xb2/0x100
>     ? report_bug+0x158/0x190
>     ? prb_read_valid+0x20/0x30
>     ? handle_bug+0x3e/0x70
>     ? exc_invalid_op+0x1c/0x70
>     ? asm_exc_invalid_op+0x1f/0x30
>     ? refcount_warn_saturate+0xb2/0x100
>     ? refcount_warn_saturate+0xb2/0x100
>     ax25_release+0x2ad/0x360
>     __sock_release+0x35/0xa0
>     sock_close+0x19/0x20
>     [...]
> 
> On reboot (or any attempt to remove the interface), the kernel gets stuck in an
> infinite loop:
> 
>     unregister_netdevice: waiting for ax0 to become free. Usage count = 0
> 
> This patch corrects these issues by ensuring that we call netdev_hold() and
> ax25_dev_hold() for new connections in ax25_accept(), balancing the calls to
> netdev_put() and ax25_dev_put() in ax25_release.
> 
> Fixes: 7d8a3a477b
> Signed-off-by: Lars Kellogg-Stedman <lars@oddbit.com>
> ---
>  net/ax25/af_ax25.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/net/ax25/af_ax25.c b/net/ax25/af_ax25.c index
> 8077cf2ee44..ff921272d40 100644
> --- a/net/ax25/af_ax25.c
> +++ b/net/ax25/af_ax25.c
> @@ -1381,6 +1381,8 @@ static int ax25_accept(struct socket *sock, struct
> socket *newsock,
>  	DEFINE_WAIT(wait);
>  	struct sock *sk;
>  	int err = 0;
> +	ax25_cb *ax25;
> +	ax25_dev *ax25_dev;

nit: Please follow reverse Christmas tree.

> 
>  	if (sock->state != SS_UNCONNECTED)
>  		return -EINVAL;
> @@ -1434,6 +1436,10 @@ static int ax25_accept(struct socket *sock, struct
> socket *newsock,
>  	kfree_skb(skb);
>  	sk_acceptq_removed(sk);
>  	newsock->state = SS_CONNECTED;
> +	ax25 = sk_to_ax25(newsk);
> +	ax25_dev = ax25->ax25_dev;
> +	netdev_hold(ax25_dev->dev, &ax25->dev_tracker, GFP_ATOMIC);
> +	ax25_dev_hold(ax25_dev);
> 
>  out:
>  	release_sock(sk);
> --
> 2.45.1
> 
> --
> Lars Kellogg-Stedman <lars@oddbit.com> | larsks @ {irc,twitter,github}
> https://urldefense.proofpoint.com/v2/url?u=http-
> 3A__blog.oddbit.com_&d=DwIBAg&c=nKjWec2b6R0mOyPaz7xtfQ&r=TwreqwV6
> mQ8K9wIpqwFO8yjikO_w1jUOe2MzChg4Rmg&m=T0z8LSTV_ukO1CjzoYF7lKkzf
> KSd1iQ3b4biO_v8e0R8Llg9vHP8lSQ3tGo5sXr4&s=8ug-soRMsh7y1vLY4tO8OSx-
> bKun8bXlZ7uAPiQnePU&e=                 | N1LKS
Lars Kellogg-Stedman May 21, 2024, 5:56 p.m. UTC | #3
On Tue, May 21, 2024 at 05:21:40PM GMT, Naveen Mamindlapalli wrote:
> > socket *newsock,
> >  	DEFINE_WAIT(wait);
> >  	struct sock *sk;
> >  	int err = 0;
> > +	ax25_cb *ax25;
> > +	ax25_dev *ax25_dev;
> 
> nit: Please follow reverse Christmas tree.

That is a new phrase for me; I had to look it up. Do you mean this:

        DEFINE_WAIT(wait);
        struct sock *sk;
        int err = 0;
+	      ax25_dev *ax25_dev;
+	      ax25_cb *ax25;

Or should I apply this to the entire block of variable declarations,
like this:

        struct sk_buff *skb;
        struct sock *newsk;
+       ax25_dev *ax25_dev;
        DEFINE_WAIT(wait);
        struct sock *sk;
+       ax25_cb *ax25;
        int err = 0;

Thanks,
Naveen Mamindlapalli May 23, 2024, 6:37 a.m. UTC | #4
> -----Original Message-----
> From: Lars Kellogg-Stedman <lars@oddbit.com>
> Sent: Tuesday, May 21, 2024 11:27 PM
> To: Naveen Mamindlapalli <naveenm@marvell.com>
> Cc: netdev@vger.kernel.org; linux-hams@vger.kernel.org
> Subject: Re: [PATCH v2] ax25: Fix refcount imbalance on inbound
> connections
> 
> On Tue, May 21, 2024 at 05:21:40PM GMT, Naveen Mamindlapalli wrote:
> > > socket *newsock,
> > >  	DEFINE_WAIT(wait);
> > >  	struct sock *sk;
> > >  	int err = 0;
> > > +	ax25_cb *ax25;
> > > +	ax25_dev *ax25_dev;
> >
> > nit: Please follow reverse Christmas tree.
> 
> That is a new phrase for me; I had to look it up. Do you mean this:
> 
>         DEFINE_WAIT(wait);
>         struct sock *sk;
>         int err = 0;
> +	      ax25_dev *ax25_dev;
> +	      ax25_cb *ax25;
> 
> Or should I apply this to the entire block of variable declarations, like this:
> 
>         struct sk_buff *skb;
>         struct sock *newsk;
> +       ax25_dev *ax25_dev;
>         DEFINE_WAIT(wait);
>         struct sock *sk;
> +       ax25_cb *ax25;
>         int err = 0;

Yes, apply reverse xmas tree order to entire block.

Thanks,
Naveen

> 
> Thanks,
> 
> --
> Lars Kellogg-Stedman <lars@oddbit.com> | larsks @ {irc,twitter,github}
> https://urldefense.proofpoint.com/v2/url?u=http-
> 3A__blog.oddbit.com_&d=DwIBAg&c=nKjWec2b6R0mOyPaz7xtfQ&r=TwreqwV6
> mQ8K9wIpqwFO8yjikO_w1jUOe2MzChg4Rmg&m=FtlS2pOuM2TyZSjXUe6s5L7w
> o2YtvcK9Ep3HQRqf8dMeSy9VLui3rMQDUcVMFLcK&s=0fvO6BKSQQg3niGImy4
> VLvjVZ0kOAeAjIB2WwdZNYRs&e=                 | N1LKS
diff mbox series

Patch

diff --git a/net/ax25/af_ax25.c b/net/ax25/af_ax25.c
index 8077cf2ee44..ff921272d40 100644
--- a/net/ax25/af_ax25.c
+++ b/net/ax25/af_ax25.c
@@ -1381,6 +1381,8 @@  static int ax25_accept(struct socket *sock, struct socket *newsock,
 	DEFINE_WAIT(wait);
 	struct sock *sk;
 	int err = 0;
+	ax25_cb *ax25;
+	ax25_dev *ax25_dev;
 
 	if (sock->state != SS_UNCONNECTED)
 		return -EINVAL;
@@ -1434,6 +1436,10 @@  static int ax25_accept(struct socket *sock, struct socket *newsock,
 	kfree_skb(skb);
 	sk_acceptq_removed(sk);
 	newsock->state = SS_CONNECTED;
+	ax25 = sk_to_ax25(newsk);
+	ax25_dev = ax25->ax25_dev;
+	netdev_hold(ax25_dev->dev, &ax25->dev_tracker, GFP_ATOMIC);
+	ax25_dev_hold(ax25_dev);
 
 out:
 	release_sock(sk);