[v3,0/3] auth_gss: netns refcount leaks when use-gss-proxy==1

Message ID 1560341370-24197-1-git-send-email-wenbinzeng@tencent.com (mailing list archive)

Message

Wenbin Zeng June 12, 2019, 12:09 p.m. UTC
This patch series fixes an auth_gss bug that results in netns refcount
leaks when use-gss-proxy is set to 1.

The problem was found in privileged Docker containers with the gssproxy
service enabled and /proc/net/rpc/use-gss-proxy set to 1. After the
container is killed, the corresponding struct net->count stays at 2, so
the struct net can never be freed.

It turns out that write_gssp() calls gssp_rpc_create() to create an rpc
client, which increases net->count by 2. rpcsec_gss_exit_net() is supposed
to decrease net->count, but it never gets called because its call path is:
        net->count==0 -> cleanup_net -> ops_exit_list -> rpcsec_gss_exit_net
net->count cannot reach 0 until rpcsec_gss_exit_net() has been called, and
rpcsec_gss_exit_net() is not called until net->count reaches 0. This is a
deadlock.
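
The cycle described above can be sketched as a small userspace model. This
is only an illustration, not kernel code: struct toy_net and every toy_*
function are hypothetical stand-ins, and the initial count of 1 (the
container's own reference) is an assumption made to keep the example small.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct net; only what the example needs. */
struct toy_net {
    int count;           /* models net->count */
    bool has_gssp_clnt;  /* models the rpc client from gssp_rpc_create() */
    bool cleaned_up;     /* true once the cleanup path has run */
};

/* Models write_gssp() -> gssp_rpc_create(): the client pins the netns. */
static void toy_write_gssp(struct toy_net *net)
{
    net->has_gssp_clnt = true;
    net->count += 2;
}

/*
 * Models the put() path. Cleanup (and with it the exit hook that would
 * release the rpc client) only runs once the count reaches 0.
 */
static void toy_put_net(struct toy_net *net)
{
    assert(net->count > 0);
    if (--net->count == 0) {
        net->has_gssp_clnt = false;
        net->cleaned_up = true;
    }
}

/* Runs one container lifetime and returns the references left behind. */
int toy_container_lifetime(struct toy_net *net, bool use_gss_proxy)
{
    *net = (struct toy_net){ .count = 1 };  /* the container's own reference */
    if (use_gss_proxy)
        toy_write_gssp(net);
    toy_put_net(net);                       /* container gets killed */
    return net->count;
}
```

In this model, calling toy_container_lifetime() with use_gss_proxy set
leaves the count at 2 and cleaned_up false, because the only code that
could release the client sits behind the count reaching 0.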

To fix the problem, we must break the deadlock: rpcsec_gss_exit_net()
has to move out of the put() path and be called from somewhere else.
I think nsfs_evict() is a good place. When the netns inode gets evicted,
we call rpcsec_gss_exit_net() to free the rpc client. This requires adding
a new callback, evict, to struct proc_ns_operations and adding
netns_evict() as one of the netns_operations.
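
The shape of the proposed fix can be sketched in the same userspace style.
Again this is a hedged illustration rather than the actual patch: struct
toy_ns_ops, the toy_* functions, the NULL check on the hook, and the
starting count of 3 (one reference held through the nsfs inode plus two
held by the rpc client) are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct net. */
struct toy_net {
    int count;           /* models net->count */
    bool has_gssp_clnt;  /* models the rpc client */
    bool cleaned_up;     /* true once the cleanup path has run */
};

/* Hypothetical analogue of struct proc_ns_operations with the new hook. */
struct toy_ns_ops {
    void (*evict)(struct toy_net *net);  /* new, optional callback */
    void (*put)(struct toy_net *net);
};

static void toy_put_net(struct toy_net *net)
{
    assert(net->count > 0);
    if (--net->count == 0)
        net->cleaned_up = true;
}

/* Models the evict hook: free the rpc client, dropping its references. */
static void toy_evict_net(struct toy_net *net)
{
    if (net->has_gssp_clnt) {
        net->has_gssp_clnt = false;
        net->count -= 2;
    }
}

/* Models nsfs_evict(): ->evict() runs before ->put(). */
static void toy_nsfs_evict(const struct toy_ns_ops *ops, struct toy_net *net)
{
    if (ops->evict)
        ops->evict(net);
    ops->put(net);
}

/* Evicts the toy netns inode and returns the references left behind. */
int toy_evict_inode(struct toy_net *net, bool with_evict_hook)
{
    const struct toy_ns_ops ops = {
        .evict = with_evict_hook ? toy_evict_net : NULL,
        .put = toy_put_net,
    };
    /* One reference held via the nsfs inode, two by the rpc client. */
    *net = (struct toy_net){ .count = 3, .has_gssp_clnt = true };
    toy_nsfs_evict(&ops, net);
    return net->count;
}
```

With the hook installed, the client's references are dropped before the
final put, so the count reaches 0 and cleanup runs. Without it, the model
ends with a count of 2, as in the bug report.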

v1->v2:
 * in nsfs_evict(), move ->evict() in front of ->put()
v2->v3:
 * rpcsec_gss_evict_net() now calls gss_svc_shutdown_net() directly,
   regardless of whether gssp_clnt is NULL; this is exactly what
   rpcsec_gss_exit_net() previously did

Wenbin Zeng (3):
  nsfs: add evict callback into struct proc_ns_operations
  netns: add netns_evict into netns_operations
  auth_gss: fix deadlock that blocks rpcsec_gss_exit_net when
    use-gss-proxy==1

 fs/nsfs.c                      |  2 ++
 include/linux/proc_ns.h        |  1 +
 include/net/net_namespace.h    |  1 +
 net/core/net_namespace.c       | 12 ++++++++++++
 net/sunrpc/auth_gss/auth_gss.c |  4 ++--
 5 files changed, 18 insertions(+), 2 deletions(-)

Comments

J. Bruce Fields Aug. 1, 2019, 7:53 p.m. UTC | #1
I lost track, what happened to these patches?

--b.

Wang Hai Aug. 28, 2021, 11:26 a.m. UTC | #2
On 2019/8/2 3:53, J. Bruce Fields wrote:
> I lost track, what happened to these patches?
>
> --b.
This patch series does not seem to have been merged into mainline. Are
there any other patches that fix this bug?