Message ID | 20200604144910.133756-1-zhengbin13@huawei.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | sunrpc: need delete xprt->timer in xs_destroy | expand |
On Thu, 2020-06-04 at 22:49 +0800, Zheng Bin wrote: > If RPC use udp as it's transport protocol, transport->connect_worker > will call xs_udp_setup_socket. > xs_udp_setup_socket > sock = xs_create_sock > if (IS_ERR(sock)) > goto out; > out: > xprt_unlock_connect > xprt_schedule_autodisconnect > mod_timer > internal_add_timer -->insert xprt->timer to base timer > list > > xs_destroy > cancel_delayed_work_sync(&transport->connect_worker) > xs_xprt_free(xprt) -->free xprt > > Thus use-after-free will happen. > > Signed-off-by: Zheng Bin <zhengbin13@huawei.com> > --- > net/sunrpc/xprtsock.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c > index 845d0be805ec..c796808e7f7a 100644 > --- a/net/sunrpc/xprtsock.c > +++ b/net/sunrpc/xprtsock.c > @@ -1242,6 +1242,7 @@ static void xs_destroy(struct rpc_xprt *xprt) > dprintk("RPC: xs_destroy xprt %p\n", xprt); > > cancel_delayed_work_sync(&transport->connect_worker); > + del_timer_sync(&xprt->timer); > xs_close(xprt); > cancel_work_sync(&transport->recv_worker); > cancel_work_sync(&transport->error_worker); > -- > 2.26.0.106.g9fadedd > I'm confused. How can this happen given that xprt_destroy() first takes the XPRT_LOCK, and then deletes xprt->timer? Right now, the socket code knows nothing about the details of xprt- >timer and what it is used for. I'd prefer to keep it that way if possible.
The complete process is like this: xprt_destroy wait_on_bit_lock(&xprt->state, XPRT_LOCKED, TASK_UNINTERRUPTIBLE) -->getlock del_timer_sync(&xprt->timer) -->del xprt->timer INIT_WORK(&xprt->task_cleanup, xprt_destroy_cb) xprt_destroy_cb xs_destroy(xprt->ops->destroy) cancel_delayed_work_sync -->will call transport->connect_worker, whose callback is xs_udp_setup_socket xs_xprt_free(xprt) -->free xprt xs_udp_setup_socket sock = xs_create_sock xprt_unlock_connect if (!test_bit(XPRT_LOCKED, &xprt->state)) -->state is XPRT_LOCKED xprt_schedule_autodisconnect mod_timer internal_add_timer -->insert xprt->timer to base timer list On 2020/6/4 23:39, Trond Myklebust wrote: > On Thu, 2020-06-04 at 22:49 +0800, Zheng Bin wrote: >> If RPC use udp as it's transport protocol, transport->connect_worker >> will call xs_udp_setup_socket. >> xs_udp_setup_socket >> sock = xs_create_sock >> if (IS_ERR(sock)) >> goto out; >> out: >> xprt_unlock_connect >> xprt_schedule_autodisconnect >> mod_timer >> internal_add_timer -->insert xprt->timer to base timer >> list >> >> xs_destroy >> cancel_delayed_work_sync(&transport->connect_worker) >> xs_xprt_free(xprt) -->free xprt >> >> Thus use-after-free will happen. >> >> Signed-off-by: Zheng Bin <zhengbin13@huawei.com> >> --- >> net/sunrpc/xprtsock.c | 1 + >> 1 file changed, 1 insertion(+) >> >> diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c >> index 845d0be805ec..c796808e7f7a 100644 >> --- a/net/sunrpc/xprtsock.c >> +++ b/net/sunrpc/xprtsock.c >> @@ -1242,6 +1242,7 @@ static void xs_destroy(struct rpc_xprt *xprt) >> dprintk("RPC: xs_destroy xprt %p\n", xprt); >> >> cancel_delayed_work_sync(&transport->connect_worker); >> + del_timer_sync(&xprt->timer); >> xs_close(xprt); >> cancel_work_sync(&transport->recv_worker); >> cancel_work_sync(&transport->error_worker); >> -- >> 2.26.0.106.g9fadedd >> > I'm confused. How can this happen given that xprt_destroy() first takes > the XPRT_LOCK, and then deletes xprt->timer? > > Right now, the socket code knows nothing about the details of xprt- >> timer and what it is used for. I'd prefer to keep it that way if > possible. >
The issue trigger like this: mount_fs nfs_try_mount nfs_create_server nfs_init_server nfs_init_client nfs_create_rpc_client rpc_create xprt_create_transport xs_setup_udp xs_setup_xprt INIT_DELAYED_WORK(xs_udp_setup_socket)//timer xs_connect queue_delayed_work rpc_create_xprt rpc_new_client rpc_ping //fail rpc_shutdown rpc_release_client rpc_free_auth rpc_free_client xprt_put xprt_destroy XPRT_LOCKED //wait lock del_timer_sync xprt_destroy_cb xs_destroy cancel_delayed_work_sync --fire--> xs_udp_setup_socket xprt_unlock_connect xprt_schedule_autodisconnect mod_timer xs_xprt_free timer out, access xprt //UAF On 2020/6/5 10:10, Zhengbin (OSKernel) wrote: > The complete process is like this: > > xprt_destroy > wait_on_bit_lock(&xprt->state, XPRT_LOCKED, TASK_UNINTERRUPTIBLE) -->getlock > del_timer_sync(&xprt->timer) -->del xprt->timer > INIT_WORK(&xprt->task_cleanup, xprt_destroy_cb) > > xprt_destroy_cb > xs_destroy(xprt->ops->destroy) > cancel_delayed_work_sync -->will call transport->connect_worker, whose callback is xs_udp_setup_socket > xs_xprt_free(xprt) -->free xprt > > xs_udp_setup_socket > sock = xs_create_sock > xprt_unlock_connect > if (!test_bit(XPRT_LOCKED, &xprt->state)) -->state is XPRT_LOCKED > xprt_schedule_autodisconnect > mod_timer > internal_add_timer -->insert xprt->timer to base timer list > > On 2020/6/4 23:39, Trond Myklebust wrote: >> On Thu, 2020-06-04 at 22:49 +0800, Zheng Bin wrote: >>> If RPC use udp as it's transport protocol, transport->connect_worker >>> will call xs_udp_setup_socket. >>> xs_udp_setup_socket >>> sock = xs_create_sock >>> if (IS_ERR(sock)) >>> goto out; >>> out: >>> xprt_unlock_connect >>> xprt_schedule_autodisconnect >>> mod_timer >>> internal_add_timer -->insert xprt->timer to base timer >>> list >>> >>> xs_destroy >>> cancel_delayed_work_sync(&transport->connect_worker) >>> xs_xprt_free(xprt) -->free xprt >>> >>> Thus use-after-free will happen. >>> >>> Signed-off-by: Zheng Bin <zhengbin13@huawei.com> >>> --- >>> net/sunrpc/xprtsock.c | 1 + >>> 1 file changed, 1 insertion(+) >>> >>> diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c >>> index 845d0be805ec..c796808e7f7a 100644 >>> --- a/net/sunrpc/xprtsock.c >>> +++ b/net/sunrpc/xprtsock.c >>> @@ -1242,6 +1242,7 @@ static void xs_destroy(struct rpc_xprt *xprt) >>> dprintk("RPC: xs_destroy xprt %p\n", xprt); >>> >>> cancel_delayed_work_sync(&transport->connect_worker); >>> + del_timer_sync(&xprt->timer); >>> xs_close(xprt); >>> cancel_work_sync(&transport->recv_worker); >>> cancel_work_sync(&transport->error_worker); >>> -- >>> 2.26.0.106.g9fadedd >>> >> I'm confused. How can this happen given that xprt_destroy() first takes >> the XPRT_LOCK, and then deletes xprt->timer? >> >> Right now, the socket code knows nothing about the details of xprt- >>> timer and what it is used for. I'd prefer to keep it that way if >> possible. >> > > > . >
On Fri, 2020-06-05 at 10:10 +0800, Zhengbin (OSKernel) wrote: > The complete process is like this: > > xprt_destroy > wait_on_bit_lock(&xprt->state, XPRT_LOCKED, > TASK_UNINTERRUPTIBLE) > -->getlock > del_timer_sync(&xprt->timer) -->del xprt->timer > INIT_WORK(&xprt->task_cleanup, xprt_destroy_cb) > > xprt_destroy_cb > xs_destroy(xprt->ops->destroy) > cancel_delayed_work_sync -->will call > transport->connect_worker, whose callback is xs_udp_setup_socket > xs_xprt_free(xprt) -->free xprt > > xs_udp_setup_socket > sock = xs_create_sock > xprt_unlock_connect > if (!test_bit(XPRT_LOCKED, &xprt->state)) -->state is > XPRT_LOCKED > xprt_schedule_autodisconnect > mod_timer > internal_add_timer -->insert xprt->timer to base timer > list The call to xprt_lock_connect() in xs_connect() is supposed to ensure exclusion w.r.t. wait_on_bit_lock(). In other words if the transport- >connect_worker is actually queued, it is also supposed to be holding the XPRT_LOCK and ensuring that we can't get into the above situation. Why is that not the case here? > > On 2020/6/4 23:39, Trond Myklebust wrote: > > On Thu, 2020-06-04 at 22:49 +0800, Zheng Bin wrote: > > > If RPC use udp as it's transport protocol, transport- > > > >connect_worker > > > will call xs_udp_setup_socket. > > > xs_udp_setup_socket > > > sock = xs_create_sock > > > if (IS_ERR(sock)) > > > goto out; > > > out: > > > xprt_unlock_connect > > > xprt_schedule_autodisconnect > > > mod_timer > > > internal_add_timer -->insert xprt->timer to base > > > timer > > > list > > > > > > xs_destroy > > > cancel_delayed_work_sync(&transport->connect_worker) > > > xs_xprt_free(xprt) -->free xprt > > > > > > Thus use-after-free will happen. > > > > > > Signed-off-by: Zheng Bin <zhengbin13@huawei.com> > > > --- > > > net/sunrpc/xprtsock.c | 1 + > > > 1 file changed, 1 insertion(+) > > > > > > diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c > > > index 845d0be805ec..c796808e7f7a 100644 > > > --- a/net/sunrpc/xprtsock.c > > > +++ b/net/sunrpc/xprtsock.c > > > @@ -1242,6 +1242,7 @@ static void xs_destroy(struct rpc_xprt > > > *xprt) > > > dprintk("RPC: xs_destroy xprt %p\n", xprt); > > > > > > cancel_delayed_work_sync(&transport->connect_worker); > > > + del_timer_sync(&xprt->timer); > > > xs_close(xprt); > > > cancel_work_sync(&transport->recv_worker); > > > cancel_work_sync(&transport->error_worker); > > > -- > > > 2.26.0.106.g9fadedd > > > > > I'm confused. How can this happen given that xprt_destroy() first > > takes > > the XPRT_LOCK, and then deletes xprt->timer? > > > > Right now, the socket code knows nothing about the details of xprt- > > > timer and what it is used for. I'd prefer to keep it that way if > > possible. > >
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c index 845d0be805ec..c796808e7f7a 100644 --- a/net/sunrpc/xprtsock.c +++ b/net/sunrpc/xprtsock.c @@ -1242,6 +1242,7 @@ static void xs_destroy(struct rpc_xprt *xprt) dprintk("RPC: xs_destroy xprt %p\n", xprt); cancel_delayed_work_sync(&transport->connect_worker); + del_timer_sync(&xprt->timer); xs_close(xprt); cancel_work_sync(&transport->recv_worker); cancel_work_sync(&transport->error_worker);
If RPC use udp as it's transport protocol, transport->connect_worker will call xs_udp_setup_socket. xs_udp_setup_socket sock = xs_create_sock if (IS_ERR(sock)) goto out; out: xprt_unlock_connect xprt_schedule_autodisconnect mod_timer internal_add_timer -->insert xprt->timer to base timer list xs_destroy cancel_delayed_work_sync(&transport->connect_worker) xs_xprt_free(xprt) -->free xprt Thus use-after-free will happen. Signed-off-by: Zheng Bin <zhengbin13@huawei.com> --- net/sunrpc/xprtsock.c | 1 + 1 file changed, 1 insertion(+) -- 2.26.0.106.g9fadedd