Message ID | 20220311014624.51117-1-duoming@zju.edu.cn (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [V3] ax25: Fix refcount leaks caused by ax25_cb_del() | expand |
Context | Check | Description |
---|---|---|
netdev/tree_selection | success | Guessing tree name failed - patch did not apply |
On Fri, Mar 11, 2022 at 09:46:24AM +0800, Duoming Zhou wrote: > This patch adds a flag in ax25_dev in order to prevent reference count > leaks. If the above condition happens, the "test_bit" condition check > in ax25_kill_by_device() could pass and the refcounts could be > decreased properly. > Why are you using test_bit()? Just use booleans. I am not a networking person but this whole thing feel like adding on layers of cruft. If refcounting needs a lot of additional tracking information then probably the reference counts are not being done in the correct location. > diff --git a/net/ax25/af_ax25.c b/net/ax25/af_ax25.c > index 6bd09718077..fc564b87acc 100644 > --- a/net/ax25/af_ax25.c > +++ b/net/ax25/af_ax25.c > @@ -86,6 +86,7 @@ static void ax25_kill_by_device(struct net_device *dev) > again: > ax25_for_each(s, &ax25_list) { > if (s->ax25_dev == ax25_dev) { > + set_bit(AX25_DEV_KILL, &ax25_dev->flag); > sk = s->sk; > if (!sk) { > spin_unlock_bh(&ax25_list_lock); Why can this not be a local variable. bool found = false; ax25_for_each(s, &ax25_list) { if (s->ax25_dev == ax25_dev) { found = true; ... > @@ -115,6 +116,10 @@ static void ax25_kill_by_device(struct net_device *dev) > } > } > spin_unlock_bh(&ax25_list_lock); > + if (!test_bit(AX25_DEV_KILL, &ax25_dev->flag) && test_bit(AX25_DEV_BIND, &ax25_dev->flag)) { The comments for ax25_kill_by_device() say: /* * Kill all bound sockets on a dropped device. */ So how can test_bit(AX25_DEV_BIND, &ax25_dev->flag) ever be false at this location? if (!found) { dev_put_track(ax25_dev->dev, &ax25_dev->dev_tracker); ax25_dev_put(ax25_dev); } But even here, my instinct is that if the refcounting is were done in the correct place we would not need any additional variables. Is there no simpler solution? > diff --git a/net/ax25/ax25_dev.c b/net/ax25/ax25_dev.c > index d2a244e1c26..9b04d74a1be 100644 > --- a/net/ax25/ax25_dev.c > +++ b/net/ax25/ax25_dev.c > @@ -77,6 +77,7 @@ void ax25_dev_device_up(struct net_device *dev) > ax25_dev->values[AX25_VALUES_PACLEN] = AX25_DEF_PACLEN; > ax25_dev->values[AX25_VALUES_PROTOCOL] = AX25_DEF_PROTOCOL; > ax25_dev->values[AX25_VALUES_DS_TIMEOUT]= AX25_DEF_DS_TIMEOUT; > + ax25_dev->flag = AX25_DEV_INIT; There is no need for this, it's allocated with kzalloc(). > > #if defined(CONFIG_AX25_DAMA_SLAVE) || defined(CONFIG_AX25_DAMA_MASTER) > ax25_ds_setup_timer(ax25_dev); regards, dan carpenter
Hello, On Fri, Mar 11, 2022 Dan Carpen wrote: > But even here, my instinct is that if the refcounting is were done in > the correct place we would not need any additional variables. Is there > no simpler solution? I sent "[PATCH net V2 1/2] ax25: Fix refcount leaks caused by ax25_cb_del()" on On Fri, Mar 11, 2022. Could this patch solve your question? Best wishes, Duoming Zhou
On Sun, Mar 13, 2022 at 12:26:45PM +0800, 周多明 wrote: > Hello, > > On Fri, Mar 11, 2022 Dan Carpen wrote: > > > But even here, my instinct is that if the refcounting is were done in > > the correct place we would not need any additional variables. Is there > > no simpler solution? > > I sent "[PATCH net V2 1/2] ax25: Fix refcount leaks caused by ax25_cb_del()" > on On Fri, Mar 11, 2022. Could this patch solve your question? I had a bunch of questions... You just ignored them, and sent a patch called v2 instead of v4 so I was puzzled and confused. I guess the answer is no, could you please answer the questions? regards, dan carpenter
Hello, On Mon, 14 Mar 2022 14:02:56 +0300, Dan Carpen wrote: > > > But even here, my instinct is that if the refcounting is were done in > > > the correct place we would not need any additional variables. Is there > > > no simpler solution? I think there is a simpler solution instead of using any additional variables, which is shown below: diff --git a/net/ax25/af_ax25.c b/net/ax25/af_ax25.c index 6bd09718077..0886109421a 100644 --- a/net/ax25/af_ax25.c +++ b/net/ax25/af_ax25.c @@ -98,8 +98,10 @@ static void ax25_kill_by_device(struct net_device *dev) spin_unlock_bh(&ax25_list_lock); lock_sock(sk); s->ax25_dev = NULL; - dev_put_track(ax25_dev->dev, &ax25_dev->dev_tracker); - ax25_dev_put(ax25_dev); + if (sk->sk_wq) { + dev_put_track(ax25_dev->dev, &ax25_dev->dev_tracker); + ax25_dev_put(ax25_dev); + } ax25_disconnect(s, ENETUNREACH); release_sock(sk); spin_lock_bh(&ax25_list_lock); @@ -979,14 +981,20 @@ static int ax25_release(struct socket *sock) { struct sock *sk = sock->sk; ax25_cb *ax25; + ax25_dev *ax25_dev; if (sk == NULL) return 0; sock_hold(sk); - sock_orphan(sk); lock_sock(sk); + sock_orphan(sk); ax25 = sk_to_ax25(sk); + ax25_dev = ax25->ax25_dev; + if (ax25_dev) { + dev_put_track(ax25_dev->dev, &ax25_dev->dev_tracker); + ax25_dev_put(ax25_dev); + } if (sk->sk_type == SOCK_SEQPACKET) { switch (ax25->state) { we add decrements of refcounts in ax25_release(), and use lock_sock() to do synchronization. If refcounts decrease in ax25_release(), the decrements of refcounts in ax25_kill_by_device() will not be executed and vice versa. 1. If we decrease the refcounts in ax25_release(), the decrements of refcounts in ax25_kill_by_device() will not execute. Because we set NULL to sk->sk_wq in sock_orphan() and we check whether sk->sk_wq is NULL in ax25_kill_by_device(). Only positions (3) and (4) could execute. (Thread 1) | ax25_bind() | ... | ax25_addr_ax25dev() | ax25_dev_hold() //(1) | ... | dev_hold_track() //(2) | (Thread 2) ... | ax25_release() | ... | lock_sock(sk) | sock_orphan(sk) | sk->sk_wq = NULL //set NULL | ... | if (ax25_dev) { | dev_put_track() //(3) | ax25_dev_put() //(4) | (thread 3) | ax25_kill_by_device() | ... | lock_sock(sk) | s->ax25_dev = NULL | if (sk->sk_wq) { //check | dev_put_track() //(5) | ax25_dev_put() //(6) | 2. If we decrease refcounts in ax25_kill_by_device(), the decrements of refcounts in ax25_release() will not execute. Because we set NULL to s->ax25_dev in ax25_kill_by_device() and we check whether ax25_dev is NULL in ax25_release(). Only positions (3) and (4) could execute. (Thread 1) | ax25_bind() | ... | ax25_addr_ax25dev() | ax25_dev_hold() //(1) | ... | dev_hold_track() //(2) | (Thread 2) ... | ax25_kill_by_device() | ... | lock_sock(sk) | s->ax25_dev = NULL //set NULL | if (sk->sk_wq) { | dev_put_track() //(3) | ax25_dev_put() //(4) | ... | (thread 3) | ax25_release() | ... | lock_sock(sk); | sock_orphan(sk); | ... | if (ax25_dev) { //check | dev_put_track() //(5) | ax25_dev_put() //(6) | > > I sent "[PATCH net V2 1/2] ax25: Fix refcount leaks caused by ax25_cb_del()" > > on On Fri, Mar 11, 2022. Could this patch solve your question? > > I had a bunch of questions... You just ignored them, and sent a patch > called v2 instead of v4 so I was puzzled and confused. I guess the > answer is no, could you please answer the questions? I hope my answer could solve your questions. If you still have any questions welcome to send email to me. I will send "[PATCH net V4 1/2] ax25: Fix refcount leaks caused by ax25_cb_del()" as soon as possible. What`s more, I found NPD bugs in ax25 timers, I will send "[PATCH net V4 2/2] ax25: Fix NULL pointer dereferences in ax25 timers" together. Best wishes, Duoming Zhou
diff --git a/include/net/ax25.h b/include/net/ax25.h index 8221af1811d..ea6ca385190 100644 --- a/include/net/ax25.h +++ b/include/net/ax25.h @@ -158,6 +158,10 @@ enum { #define AX25_DEF_PROTOCOL AX25_PROTO_STD_SIMPLEX /* Standard AX.25 */ #define AX25_DEF_DS_TIMEOUT 180000 /* DAMA timeout 3 minutes */ +#define AX25_DEV_INIT 0 +#define AX25_DEV_KILL 0 +#define AX25_DEV_BIND 1 + typedef struct ax25_uid_assoc { struct hlist_node uid_node; refcount_t refcount; @@ -240,6 +244,7 @@ typedef struct ax25_dev { ax25_dama_info dama; #endif refcount_t refcount; + unsigned long flag; } ax25_dev; typedef struct ax25_cb { diff --git a/net/ax25/af_ax25.c b/net/ax25/af_ax25.c index 6bd09718077..fc564b87acc 100644 --- a/net/ax25/af_ax25.c +++ b/net/ax25/af_ax25.c @@ -86,6 +86,7 @@ static void ax25_kill_by_device(struct net_device *dev) again: ax25_for_each(s, &ax25_list) { if (s->ax25_dev == ax25_dev) { + set_bit(AX25_DEV_KILL, &ax25_dev->flag); sk = s->sk; if (!sk) { spin_unlock_bh(&ax25_list_lock); @@ -115,6 +116,10 @@ static void ax25_kill_by_device(struct net_device *dev) } } spin_unlock_bh(&ax25_list_lock); + if (!test_bit(AX25_DEV_KILL, &ax25_dev->flag) && test_bit(AX25_DEV_BIND, &ax25_dev->flag)) { + dev_put_track(ax25_dev->dev, &ax25_dev->dev_tracker); + ax25_dev_put(ax25_dev); + } } /* @@ -1132,6 +1137,7 @@ static int ax25_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len) done: ax25_cb_add(ax25); sock_reset_flag(sk, SOCK_ZAPPED); + set_bit(AX25_DEV_BIND, &ax25_dev->flag); out: release_sock(sk); diff --git a/net/ax25/ax25_dev.c b/net/ax25/ax25_dev.c index d2a244e1c26..9b04d74a1be 100644 --- a/net/ax25/ax25_dev.c +++ b/net/ax25/ax25_dev.c @@ -77,6 +77,7 @@ void ax25_dev_device_up(struct net_device *dev) ax25_dev->values[AX25_VALUES_PACLEN] = AX25_DEF_PACLEN; ax25_dev->values[AX25_VALUES_PROTOCOL] = AX25_DEF_PROTOCOL; ax25_dev->values[AX25_VALUES_DS_TIMEOUT]= AX25_DEF_DS_TIMEOUT; + ax25_dev->flag = AX25_DEV_INIT; #if defined(CONFIG_AX25_DAMA_SLAVE) || defined(CONFIG_AX25_DAMA_MASTER) ax25_ds_setup_timer(ax25_dev);
The previous commit d01ffb9eee4a ("ax25: add refcount in ax25_dev to avoid UAF bugs") and commit feef318c855a ("ax25: fix UAF bugs of net_device caused by rebinding operation") increase the refcounts of ax25_dev and net_device in ax25_bind() and decrease the matching refcounts in ax25_kill_by_device() in order to prevent UAF bugs, but there are reference count leaks. The root cause of refcount leaks is shown below: (Thread 1) | (Thread 2) ax25_bind() | ... | ax25_addr_ax25dev() | ax25_dev_hold() //(1) | ... | dev_hold_track() //(2) | ... | ax25_destroy_socket() | ax25_cb_del() | ... | hlist_del_init() //(3) | | (thread 3) | ax25_kill_by_device() | ... | ax25_for_each(s, &ax25_list) { | if (s->ax25_dev == ax25_dev) //(4) | ... | Firstly, we use ax25_bind() to increase the refcount of ax25_dev in position (1) and increase the refcount of net_device in position (2). Then, we use ax25_cb_del() invoked by ax25_destroy_socket() to delete ax25_cb in hlist in position (3) before calling ax25_kill_by_device(). Finally, the decrements of refcounts in ax25_kill_by_device() will not be executed, because no s->ax25_dev equals to ax25_dev in position (4). This patch adds a flag in ax25_dev in order to prevent reference count leaks. If the above condition happens, the "test_bit" condition check in ax25_kill_by_device() could pass and the refcounts could be decreased properly. Fixes: d01ffb9eee4a ("ax25: add refcount in ax25_dev to avoid UAF bugs") Fixes: feef318c855a ("ax25: fix UAF bugs of net_device caused by rebinding operation") Reported-by: Thomas Osterried <thomas@osterried.de> Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> --- Changes in V3: - Use one flag in ax25_dev to check. - Make commit message more clearer. - Fix format problem. include/net/ax25.h | 5 +++++ net/ax25/af_ax25.c | 6 ++++++ net/ax25/ax25_dev.c | 1 + 3 files changed, 12 insertions(+)