Message ID | 20220315015403.79201-1-duoming@zju.edu.cn (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net,V4,1/2] ax25: Fix refcount leaks caused by ax25_cb_del() | expand |
Context | Check | Description |
---|---|---|
netdev/tree_selection | success | Clearly marked for net |
netdev/fixes_present | success | Fixes tag present in non-next series |
netdev/subject_prefix | success | Link |
netdev/cover_letter | success | Single patches do not need cover letters |
netdev/patch_count | success | Link |
netdev/header_inline | success | No static functions without inline keyword in header files |
netdev/build_32bit | success | Errors and warnings before: 0 this patch: 0 |
netdev/cc_maintainers | success | CCed 8 of 8 maintainers |
netdev/build_clang | success | Errors and warnings before: 0 this patch: 0 |
netdev/module_param | success | Was 0 now: 0 |
netdev/verify_signedoff | success | Signed-off-by tag matches author and committer |
netdev/verify_fixes | success | Fixes tag looks correct |
netdev/build_allmodconfig_warn | success | Errors and warnings before: 0 this patch: 0 |
netdev/checkpatch | warning | WARNING: line length of 85 exceeds 80 columns |
netdev/kdoc | success | Errors and warnings before: 0 this patch: 0 |
netdev/source_inline | success | Was 0 now: 0 |
On Tue, Mar 15, 2022 at 09:54:03AM +0800, Duoming Zhou wrote: > The previous commit d01ffb9eee4a ("ax25: add refcount in ax25_dev to > avoid UAF bugs") and commit feef318c855a ("ax25: fix UAF bugs of > net_device caused by rebinding operation") increase the refcounts of > ax25_dev and net_device in ax25_bind() and decrease the matching refcounts > in ax25_kill_by_device() in order to prevent UAF bugs, but there are > reference count leaks. > > The root cause of refcount leaks is shown below: > > (Thread 1) | (Thread 2) > ax25_bind() | > ... | > ax25_addr_ax25dev() | > ax25_dev_hold() //(1) | > ... | > dev_hold_track() //(2) | > ... | ax25_destroy_socket() > | ax25_cb_del() > | ... > | hlist_del_init() //(3) > | > | > (Thread 3) | > ax25_kill_by_device() | > ... | > ax25_for_each(s, &ax25_list) { | > if (s->ax25_dev == ax25_dev) //(4) | > ... | > > Firstly, we use ax25_bind() to increase the refcount of ax25_dev in > position (1) and increase the refcount of net_device in position (2). > Then, we use ax25_cb_del() invoked by ax25_destroy_socket() to delete > ax25_cb in hlist in position (3) before calling ax25_kill_by_device(). > Finally, the decrements of refcounts in ax25_kill_by_device() will not > be executed, because no s->ax25_dev equals to ax25_dev in position (4). > > This patch adds decrements of refcounts in ax25_release() and use > lock_sock() to do synchronization. If refcounts decrease in ax25_release(), > the decrements of refcounts in ax25_kill_by_device() will not be > executed and vice versa. > > Fixes: d01ffb9eee4a ("ax25: add refcount in ax25_dev to avoid UAF bugs") > Fixes: 87563a043cef ("ax25: fix reference count leaks of ax25_dev") > Fixes: feef318c855a ("ax25: fix UAF bugs of net_device caused by rebinding operation") > Reported-by: Thomas Osterried <thomas@osterried.de> > Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> > --- > Changes in V4: > - Add decrements of refcounts in ax25_release() instead of using any additional variables. I'm happy that this is simpler. I'm not super happy about the if (sk->sk_wq) check. That seems like a fragile side-effect condition instead of something deliberate. But I don't know networking so maybe this is something which we can rely on. When you sent the earlier patch then I asked if the devices in ax25_kill_by_device() were always bound and if we could just use a local variable instead of something tied to the ax25_dev struct. I still wonder about that. In other words, could we just do this? regards, dan carpenter diff --git a/net/ax25/af_ax25.c b/net/ax25/af_ax25.c index 6bd097180772..4af9d9a939c6 100644 --- a/net/ax25/af_ax25.c +++ b/net/ax25/af_ax25.c @@ -78,6 +78,7 @@ static void ax25_kill_by_device(struct net_device *dev) ax25_dev *ax25_dev; ax25_cb *s; struct sock *sk; + bool found = false; if ((ax25_dev = ax25_dev_ax25dev(dev)) == NULL) return; @@ -86,6 +87,7 @@ static void ax25_kill_by_device(struct net_device *dev) again: ax25_for_each(s, &ax25_list) { if (s->ax25_dev == ax25_dev) { + found = true; sk = s->sk; if (!sk) { spin_unlock_bh(&ax25_list_lock); @@ -115,6 +117,11 @@ static void ax25_kill_by_device(struct net_device *dev) } } spin_unlock_bh(&ax25_list_lock); + + if (!found) { + dev_put_track(ax25_dev->dev, &ax25_dev->dev_tracker); + ax25_dev_put(ax25_dev); + } } /*
Hello, On Tue, 15 Mar 2022 13:26:57 +0300, Dan Carpenter wrote: > I'm happy that this is simpler. I'm not super happy about the > if (sk->sk_wq) check. That seems like a fragile side-effect condition > instead of something deliberate. But I don't know networking so maybe > this is something which we can rely on. The variable sk->sk_wq is the address of waiting queue of sock, it is initialized to the address of sock->wq through the following path: sock_create->__sock_create->ax25_create()->sock_init_data()->RCU_INIT_POINTER(sk->sk_wq, &sock->wq). Because we have used sock_alloc() to allocate the socket in __sock_create(), sock or the address of sock->wq is not null. What`s more, sk->sk_wq is set to null only in sock_orphan(). Another solution: We could also use sk->sk_socket to check. We set sk->sk_socket to sock in the following path: sock_create()->__sock_create()->ax25_create()->sock_init_data()->sk_set_socket(sk, sock). Because we have used sock_alloc() to allocate the socket in __sock_create(), sock or sk->sk_socket is not null. What`s more, sk->sk_socket is set to null only in sock_orphan(). I will change the if (sk->sk_wq) check to if(sk->sk_socket) check, because I think it is easier to understand. > When you sent the earlier patch then I asked if the devices in > ax25_kill_by_device() were always bound and if we could just use a local > variable instead of something tied to the ax25_dev struct. I still > wonder about that. In other words, could we just do this? > > diff --git a/net/ax25/af_ax25.c b/net/ax25/af_ax25.c > index 6bd097180772..4af9d9a939c6 100644 > --- a/net/ax25/af_ax25.c > +++ b/net/ax25/af_ax25.c > @@ -78,6 +78,7 @@ static void ax25_kill_by_device(struct net_device *dev) > ax25_dev *ax25_dev; > ax25_cb *s; > struct sock *sk; > + bool found = false; > > if ((ax25_dev = ax25_dev_ax25dev(dev)) == NULL) > return; > @@ -86,6 +87,7 @@ static void ax25_kill_by_device(struct net_device *dev) > again: > ax25_for_each(s, &ax25_list) { > if (s->ax25_dev == ax25_dev) { > + found = true; > sk = s->sk; > if (!sk) { > spin_unlock_bh(&ax25_list_lock); > @@ -115,6 +117,11 @@ static void ax25_kill_by_device(struct net_device *dev) > } > } > spin_unlock_bh(&ax25_list_lock); > + > + if (!found) { > + dev_put_track(ax25_dev->dev, &ax25_dev->dev_tracker); > + ax25_dev_put(ax25_dev); > + } > } If we just use ax25_dev_device_up() to bring device up without using ax25_bind(), the "found" flag could be false when we enter ax25_kill_by_device() and the refcounts underflow will happen. So we should use two additional variables. If we use additional variables to fix the bug, I think there is a problem. In the real world, the device could be detached only once. If the following race condition happens, we could not deallocate ax25_dev and net_device anymore, because we could not call ax25_kill_by_device() again. (Thread 1) | (Thread 2) ax25_bind() | | ax25_kill_by_device() //decrease refcounts (Thread 3) | ax25_bind() | ... | ... ax25_dev_hold() //(1) | dev_hold_track() //(2) | | ax25_dev_device_down() In patch "[PATCH net V4 1/2] ax25: Fix refcount leaks caused by ax25_cb_del()", even the device has been detached, we could also decrease the refcouns by using ax25_release(), which could ensure ax25_dev and net_device could be deallocated. So I think "[PATCH net V4 1/2]" is better. Best wishes, Duoming Zhou
On Tue, Mar 15, 2022 at 10:11:10PM +0800, 周多明 wrote: > Hello, > > On Tue, 15 Mar 2022 13:26:57 +0300, Dan Carpenter wrote: > > I'm happy that this is simpler. I'm not super happy about the > > if (sk->sk_wq) check. That seems like a fragile side-effect condition > > instead of something deliberate. But I don't know networking so maybe > > this is something which we can rely on. > > The variable sk->sk_wq is the address of waiting queue of sock, it is initialized to the > address of sock->wq through the following path: > sock_create->__sock_create->ax25_create()->sock_init_data()->RCU_INIT_POINTER(sk->sk_wq, &sock->wq). > Because we have used sock_alloc() to allocate the socket in __sock_create(), sock or the address of > sock->wq is not null. > What`s more, sk->sk_wq is set to null only in sock_orphan(). > > Another solution: > We could also use sk->sk_socket to check. We set sk->sk_socket to sock in the following path: > sock_create()->__sock_create()->ax25_create()->sock_init_data()->sk_set_socket(sk, sock). > Because we have used sock_alloc() to allocate the socket in __sock_create(), sock or sk->sk_socket > is not null. > What`s more, sk->sk_socket is set to null only in sock_orphan(). > > I will change the if (sk->sk_wq) check to if(sk->sk_socket) check, because I think it is > easier to understand. > > > When you sent the earlier patch then I asked if the devices in > > ax25_kill_by_device() were always bound and if we could just use a local > > variable instead of something tied to the ax25_dev struct. I still > > wonder about that. In other words, could we just do this? > > > > diff --git a/net/ax25/af_ax25.c b/net/ax25/af_ax25.c > > index 6bd097180772..4af9d9a939c6 100644 > > --- a/net/ax25/af_ax25.c > > +++ b/net/ax25/af_ax25.c > > @@ -78,6 +78,7 @@ static void ax25_kill_by_device(struct net_device *dev) > > ax25_dev *ax25_dev; > > ax25_cb *s; > > struct sock *sk; > > + bool found = false; > > > > if ((ax25_dev = ax25_dev_ax25dev(dev)) == NULL) > > return; > > @@ -86,6 +87,7 @@ static void ax25_kill_by_device(struct net_device *dev) > > again: > > ax25_for_each(s, &ax25_list) { > > if (s->ax25_dev == ax25_dev) { > > + found = true; > > sk = s->sk; > > if (!sk) { > > spin_unlock_bh(&ax25_list_lock); > > @@ -115,6 +117,11 @@ static void ax25_kill_by_device(struct net_device *dev) > > } > > } > > spin_unlock_bh(&ax25_list_lock); > > + > > + if (!found) { > > + dev_put_track(ax25_dev->dev, &ax25_dev->dev_tracker); > > + ax25_dev_put(ax25_dev); > > + } > > } > > If we just use ax25_dev_device_up() to bring device up without using ax25_bind(), > the "found" flag could be false when we enter ax25_kill_by_device() and the refcounts > underflow will happen. So we should use two additional variables. That answers my question. Thank you. > > If we use additional variables to fix the bug, I think there is a problem. So the v3 patch was buggy? Why was this not explained under the --- cut off line? regards, dan carpenter
Hello, On Tue, 15 Mar 2022 17:19:05 +0300, Dan Carpenter wrote: > So the v3 patch was buggy? I think v3 is not a good patch that could be applied in the real world. > Why was this not explained under the --- cut off line? I will add explanation under the --- cut off line in [PATCH net V5 1/2] and send it to you as soon as possible. Best wishes, Duoming Zhou
diff --git a/net/ax25/af_ax25.c b/net/ax25/af_ax25.c index 6bd09718077..0886109421a 100644 --- a/net/ax25/af_ax25.c +++ b/net/ax25/af_ax25.c @@ -98,8 +98,10 @@ static void ax25_kill_by_device(struct net_device *dev) spin_unlock_bh(&ax25_list_lock); lock_sock(sk); s->ax25_dev = NULL; - dev_put_track(ax25_dev->dev, &ax25_dev->dev_tracker); - ax25_dev_put(ax25_dev); + if (sk->sk_wq) { + dev_put_track(ax25_dev->dev, &ax25_dev->dev_tracker); + ax25_dev_put(ax25_dev); + } ax25_disconnect(s, ENETUNREACH); release_sock(sk); spin_lock_bh(&ax25_list_lock); @@ -979,14 +981,20 @@ static int ax25_release(struct socket *sock) { struct sock *sk = sock->sk; ax25_cb *ax25; + ax25_dev *ax25_dev; if (sk == NULL) return 0; sock_hold(sk); - sock_orphan(sk); lock_sock(sk); + sock_orphan(sk); ax25 = sk_to_ax25(sk); + ax25_dev = ax25->ax25_dev; + if (ax25_dev) { + dev_put_track(ax25_dev->dev, &ax25_dev->dev_tracker); + ax25_dev_put(ax25_dev); + } if (sk->sk_type == SOCK_SEQPACKET) { switch (ax25->state) {
The previous commit d01ffb9eee4a ("ax25: add refcount in ax25_dev to avoid UAF bugs") and commit feef318c855a ("ax25: fix UAF bugs of net_device caused by rebinding operation") increase the refcounts of ax25_dev and net_device in ax25_bind() and decrease the matching refcounts in ax25_kill_by_device() in order to prevent UAF bugs, but there are reference count leaks. The root cause of refcount leaks is shown below: (Thread 1) | (Thread 2) ax25_bind() | ... | ax25_addr_ax25dev() | ax25_dev_hold() //(1) | ... | dev_hold_track() //(2) | ... | ax25_destroy_socket() | ax25_cb_del() | ... | hlist_del_init() //(3) | | (Thread 3) | ax25_kill_by_device() | ... | ax25_for_each(s, &ax25_list) { | if (s->ax25_dev == ax25_dev) //(4) | ... | Firstly, we use ax25_bind() to increase the refcount of ax25_dev in position (1) and increase the refcount of net_device in position (2). Then, we use ax25_cb_del() invoked by ax25_destroy_socket() to delete ax25_cb in hlist in position (3) before calling ax25_kill_by_device(). Finally, the decrements of refcounts in ax25_kill_by_device() will not be executed, because no s->ax25_dev equals to ax25_dev in position (4). This patch adds decrements of refcounts in ax25_release() and use lock_sock() to do synchronization. If refcounts decrease in ax25_release(), the decrements of refcounts in ax25_kill_by_device() will not be executed and vice versa. Fixes: d01ffb9eee4a ("ax25: add refcount in ax25_dev to avoid UAF bugs") Fixes: 87563a043cef ("ax25: fix reference count leaks of ax25_dev") Fixes: feef318c855a ("ax25: fix UAF bugs of net_device caused by rebinding operation") Reported-by: Thomas Osterried <thomas@osterried.de> Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> --- Changes in V4: - Add decrements of refcounts in ax25_release() instead of using any additional variables. net/ax25/af_ax25.c | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-)