diff mbox series

sunrpc: need delete xprt->timer in xs_destroy

Message ID 20200604144910.133756-1-zhengbin13@huawei.com (mailing list archive)
State New, archived
Headers show
Series sunrpc: need delete xprt->timer in xs_destroy | expand

Commit Message

Zheng Bin June 4, 2020, 2:49 p.m. UTC
If RPC use udp as it's transport protocol, transport->connect_worker
will call xs_udp_setup_socket.
xs_udp_setup_socket
  sock = xs_create_sock
  if (IS_ERR(sock))
    goto out;
  out:
    xprt_unlock_connect
      xprt_schedule_autodisconnect
        mod_timer
          internal_add_timer  -->insert xprt->timer to base timer list

xs_destroy
  cancel_delayed_work_sync(&transport->connect_worker)
  xs_xprt_free(xprt)           -->free xprt

Thus use-after-free will happen.

Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
---
 net/sunrpc/xprtsock.c | 1 +
 1 file changed, 1 insertion(+)

--
2.26.0.106.g9fadedd

Comments

Trond Myklebust June 4, 2020, 3:39 p.m. UTC | #1
On Thu, 2020-06-04 at 22:49 +0800, Zheng Bin wrote:
> If RPC use udp as it's transport protocol, transport->connect_worker
> will call xs_udp_setup_socket.
> xs_udp_setup_socket
>   sock = xs_create_sock
>   if (IS_ERR(sock))
>     goto out;
>   out:
>     xprt_unlock_connect
>       xprt_schedule_autodisconnect
>         mod_timer
>           internal_add_timer  -->insert xprt->timer to base timer
> list
> 
> xs_destroy
>   cancel_delayed_work_sync(&transport->connect_worker)
>   xs_xprt_free(xprt)           -->free xprt
> 
> Thus use-after-free will happen.
> 
> Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
> ---
>  net/sunrpc/xprtsock.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> index 845d0be805ec..c796808e7f7a 100644
> --- a/net/sunrpc/xprtsock.c
> +++ b/net/sunrpc/xprtsock.c
> @@ -1242,6 +1242,7 @@ static void xs_destroy(struct rpc_xprt *xprt)
>  	dprintk("RPC:       xs_destroy xprt %p\n", xprt);
> 
>  	cancel_delayed_work_sync(&transport->connect_worker);
> +	del_timer_sync(&xprt->timer);
>  	xs_close(xprt);
>  	cancel_work_sync(&transport->recv_worker);
>  	cancel_work_sync(&transport->error_worker);
> --
> 2.26.0.106.g9fadedd
> 

I'm confused. How can this happen given that xprt_destroy() first takes
the XPRT_LOCK, and then deletes xprt->timer?

Right now, the socket code knows nothing about the details of xprt-
>timer and what it is used for. I'd prefer to keep it that way if
possible.
Zheng Bin June 5, 2020, 2:10 a.m. UTC | #2
The complete process is like this:

xprt_destroy
   wait_on_bit_lock(&xprt->state, XPRT_LOCKED, TASK_UNINTERRUPTIBLE)  
-->getlock
   del_timer_sync(&xprt->timer)   -->del xprt->timer
   INIT_WORK(&xprt->task_cleanup, xprt_destroy_cb)

xprt_destroy_cb
   xs_destroy(xprt->ops->destroy)
     cancel_delayed_work_sync     -->will call 
transport->connect_worker, whose callback is xs_udp_setup_socket
     xs_xprt_free(xprt)                    -->free xprt

xs_udp_setup_socket
   sock = xs_create_sock
   xprt_unlock_connect
       if (!test_bit(XPRT_LOCKED, &xprt->state)) -->state is XPRT_LOCKED
       xprt_schedule_autodisconnect
         mod_timer
           internal_add_timer  -->insert xprt->timer to base timer list

On 2020/6/4 23:39, Trond Myklebust wrote:
> On Thu, 2020-06-04 at 22:49 +0800, Zheng Bin wrote:
>> If RPC use udp as it's transport protocol, transport->connect_worker
>> will call xs_udp_setup_socket.
>> xs_udp_setup_socket
>>    sock = xs_create_sock
>>    if (IS_ERR(sock))
>>      goto out;
>>    out:
>>      xprt_unlock_connect
>>        xprt_schedule_autodisconnect
>>          mod_timer
>>            internal_add_timer  -->insert xprt->timer to base timer
>> list
>>
>> xs_destroy
>>    cancel_delayed_work_sync(&transport->connect_worker)
>>    xs_xprt_free(xprt)           -->free xprt
>>
>> Thus use-after-free will happen.
>>
>> Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
>> ---
>>   net/sunrpc/xprtsock.c | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
>> index 845d0be805ec..c796808e7f7a 100644
>> --- a/net/sunrpc/xprtsock.c
>> +++ b/net/sunrpc/xprtsock.c
>> @@ -1242,6 +1242,7 @@ static void xs_destroy(struct rpc_xprt *xprt)
>>   	dprintk("RPC:       xs_destroy xprt %p\n", xprt);
>>
>>   	cancel_delayed_work_sync(&transport->connect_worker);
>> +	del_timer_sync(&xprt->timer);
>>   	xs_close(xprt);
>>   	cancel_work_sync(&transport->recv_worker);
>>   	cancel_work_sync(&transport->error_worker);
>> --
>> 2.26.0.106.g9fadedd
>>
> I'm confused. How can this happen given that xprt_destroy() first takes
> the XPRT_LOCK, and then deletes xprt->timer?
>
> Right now, the socket code knows nothing about the details of xprt-
>> timer and what it is used for. I'd prefer to keep it that way if
> possible.
>
Yue Haibing June 5, 2020, 10:18 a.m. UTC | #3
The issue trigger like this:

mount_fs
  nfs_try_mount
   nfs_create_server
    nfs_init_server
     nfs_init_client
      nfs_create_rpc_client
       rpc_create
        xprt_create_transport
          xs_setup_udp
            xs_setup_xprt
          INIT_DELAYED_WORK(xs_udp_setup_socket)//timer
                                                                xs_connect
                                                                  queue_delayed_work
        rpc_create_xprt
          rpc_new_client
          rpc_ping //fail
          rpc_shutdown
            rpc_release_client
               rpc_free_auth
                  rpc_free_client
                      xprt_put
                        xprt_destroy
                          XPRT_LOCKED //wait lock
                          del_timer_sync
                          xprt_destroy_cb
                            xs_destroy
                               cancel_delayed_work_sync  --fire--> xs_udp_setup_socket
                                                                 xprt_unlock_connect
                                                                   xprt_schedule_autodisconnect
                                                                     mod_timer
                                 xs_xprt_free
                                                                 timer out, access xprt //UAF


On 2020/6/5 10:10, Zhengbin (OSKernel) wrote:
> The complete process is like this:
> 
> xprt_destroy
>   wait_on_bit_lock(&xprt->state, XPRT_LOCKED, TASK_UNINTERRUPTIBLE)  -->getlock
>   del_timer_sync(&xprt->timer)   -->del xprt->timer
>   INIT_WORK(&xprt->task_cleanup, xprt_destroy_cb)
> 
> xprt_destroy_cb
>   xs_destroy(xprt->ops->destroy)
>     cancel_delayed_work_sync     -->will call transport->connect_worker, whose callback is xs_udp_setup_socket
>     xs_xprt_free(xprt)                    -->free xprt
> 
> xs_udp_setup_socket
>   sock = xs_create_sock
>   xprt_unlock_connect
>       if (!test_bit(XPRT_LOCKED, &xprt->state)) -->state is XPRT_LOCKED
>       xprt_schedule_autodisconnect
>         mod_timer
>           internal_add_timer  -->insert xprt->timer to base timer list
> 
> On 2020/6/4 23:39, Trond Myklebust wrote:
>> On Thu, 2020-06-04 at 22:49 +0800, Zheng Bin wrote:
>>> If RPC use udp as it's transport protocol, transport->connect_worker
>>> will call xs_udp_setup_socket.
>>> xs_udp_setup_socket
>>>    sock = xs_create_sock
>>>    if (IS_ERR(sock))
>>>      goto out;
>>>    out:
>>>      xprt_unlock_connect
>>>        xprt_schedule_autodisconnect
>>>          mod_timer
>>>            internal_add_timer  -->insert xprt->timer to base timer
>>> list
>>>
>>> xs_destroy
>>>    cancel_delayed_work_sync(&transport->connect_worker)
>>>    xs_xprt_free(xprt)           -->free xprt
>>>
>>> Thus use-after-free will happen.
>>>
>>> Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
>>> ---
>>>   net/sunrpc/xprtsock.c | 1 +
>>>   1 file changed, 1 insertion(+)
>>>
>>> diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
>>> index 845d0be805ec..c796808e7f7a 100644
>>> --- a/net/sunrpc/xprtsock.c
>>> +++ b/net/sunrpc/xprtsock.c
>>> @@ -1242,6 +1242,7 @@ static void xs_destroy(struct rpc_xprt *xprt)
>>>       dprintk("RPC:       xs_destroy xprt %p\n", xprt);
>>>
>>>       cancel_delayed_work_sync(&transport->connect_worker);
>>> +    del_timer_sync(&xprt->timer);
>>>       xs_close(xprt);
>>>       cancel_work_sync(&transport->recv_worker);
>>>       cancel_work_sync(&transport->error_worker);
>>> -- 
>>> 2.26.0.106.g9fadedd
>>>
>> I'm confused. How can this happen given that xprt_destroy() first takes
>> the XPRT_LOCK, and then deletes xprt->timer?
>>
>> Right now, the socket code knows nothing about the details of xprt-
>>> timer and what it is used for. I'd prefer to keep it that way if
>> possible.
>>
> 
> 
> .
>
Trond Myklebust June 7, 2020, 1:03 a.m. UTC | #4
On Fri, 2020-06-05 at 10:10 +0800, Zhengbin (OSKernel) wrote:
> The complete process is like this:
> 
> xprt_destroy
>    wait_on_bit_lock(&xprt->state, XPRT_LOCKED,
> TASK_UNINTERRUPTIBLE)  
> -->getlock
>    del_timer_sync(&xprt->timer)   -->del xprt->timer
>    INIT_WORK(&xprt->task_cleanup, xprt_destroy_cb)
> 
> xprt_destroy_cb
>    xs_destroy(xprt->ops->destroy)
>      cancel_delayed_work_sync     -->will call 
> transport->connect_worker, whose callback is xs_udp_setup_socket
>      xs_xprt_free(xprt)                    -->free xprt
> 
> xs_udp_setup_socket
>    sock = xs_create_sock
>    xprt_unlock_connect
>        if (!test_bit(XPRT_LOCKED, &xprt->state)) -->state is
> XPRT_LOCKED
>        xprt_schedule_autodisconnect
>          mod_timer
>            internal_add_timer  -->insert xprt->timer to base timer
> list

The call to xprt_lock_connect() in xs_connect() is supposed to ensure
exclusion w.r.t. wait_on_bit_lock(). In other words if the transport-
>connect_worker is actually queued, it is also supposed to be holding
the XPRT_LOCK and ensuring that we can't get into the above situation.

Why is that not the case here?

> 
> On 2020/6/4 23:39, Trond Myklebust wrote:
> > On Thu, 2020-06-04 at 22:49 +0800, Zheng Bin wrote:
> > > If RPC use udp as it's transport protocol, transport-
> > > >connect_worker
> > > will call xs_udp_setup_socket.
> > > xs_udp_setup_socket
> > >    sock = xs_create_sock
> > >    if (IS_ERR(sock))
> > >      goto out;
> > >    out:
> > >      xprt_unlock_connect
> > >        xprt_schedule_autodisconnect
> > >          mod_timer
> > >            internal_add_timer  -->insert xprt->timer to base
> > > timer
> > > list
> > > 
> > > xs_destroy
> > >    cancel_delayed_work_sync(&transport->connect_worker)
> > >    xs_xprt_free(xprt)           -->free xprt
> > > 
> > > Thus use-after-free will happen.
> > > 
> > > Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
> > > ---
> > >   net/sunrpc/xprtsock.c | 1 +
> > >   1 file changed, 1 insertion(+)
> > > 
> > > diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> > > index 845d0be805ec..c796808e7f7a 100644
> > > --- a/net/sunrpc/xprtsock.c
> > > +++ b/net/sunrpc/xprtsock.c
> > > @@ -1242,6 +1242,7 @@ static void xs_destroy(struct rpc_xprt
> > > *xprt)
> > >   	dprintk("RPC:       xs_destroy xprt %p\n", xprt);
> > > 
> > >   	cancel_delayed_work_sync(&transport->connect_worker);
> > > +	del_timer_sync(&xprt->timer);
> > >   	xs_close(xprt);
> > >   	cancel_work_sync(&transport->recv_worker);
> > >   	cancel_work_sync(&transport->error_worker);
> > > --
> > > 2.26.0.106.g9fadedd
> > > 
> > I'm confused. How can this happen given that xprt_destroy() first
> > takes
> > the XPRT_LOCK, and then deletes xprt->timer?
> > 
> > Right now, the socket code knows nothing about the details of xprt-
> > > timer and what it is used for. I'd prefer to keep it that way if
> > possible.
> >
diff mbox series

Patch

diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 845d0be805ec..c796808e7f7a 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -1242,6 +1242,7 @@  static void xs_destroy(struct rpc_xprt *xprt)
 	dprintk("RPC:       xs_destroy xprt %p\n", xprt);

 	cancel_delayed_work_sync(&transport->connect_worker);
+	del_timer_sync(&xprt->timer);
 	xs_close(xprt);
 	cancel_work_sync(&transport->recv_worker);
 	cancel_work_sync(&transport->error_worker);