net/sunrpc: Add user namespace support

Message ID 20180719174246.GA19824@ircssh-2.c.rugged-nimbus-611.internal (mailing list archive)
State New, archived

Commit Message

Sargun Dhillon July 19, 2018, 5:42 p.m. UTC
This adds the ability to pass a non-init user namespace to rpcauth_create,
via rpc_auth_create_args. If the specific authentication mechanism
does not support non-init user namespaces, then it will return an
error.
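
The gating rule described above can be sketched in user space. This is a simplified model, not the kernel code: `struct authops_model` and `auth_create_allowed()` are hypothetical names, the `user_namespace` struct is a stand-in, and the `-EINVAL` return value is an assumption about which error the caller sees.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel type; not the real definition. */
struct user_namespace {
	int id;
};

struct user_namespace init_user_ns = { 0 };

/* Models only the new per-flavor capability flag from the patch. */
struct authops_model {
	const char *name;
	bool user_ns;	/* true if the flavor supports non-init user namespaces */
};

/*
 * Returns 0 if an authenticator may be created for this namespace,
 * or -EINVAL (assumed error code) if the flavor cannot handle it.
 */
int auth_create_allowed(const struct authops_model *ops,
			const struct user_namespace *ns)
{
	assert(ops != NULL && ns != NULL);
	if (ns == &init_user_ns || ops->user_ns)
		return 0;
	return -EINVAL;
}
```

With this model, a flavor that sets `user_ns = false` keeps working for `init_user_ns` and is rejected only when a different namespace is passed in.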

Currently, the only two authentication mechanisms that support
non-init user namespaces are auth_null, and auth_unix. auth_unix
will send the UID / GID from the user namespace for authentication.

Signed-off-by: Sargun Dhillon <sargun@sargun.me>
---
 fs/nfs/nfs4proc.c              |  3 +-
 include/linux/sunrpc/auth.h    |  9 ++--
 net/sunrpc/auth.c              | 17 +++-----
 net/sunrpc/auth_generic.c      |  1 +
 net/sunrpc/auth_gss/auth_gss.c | 10 +++--
 net/sunrpc/auth_null.c         |  3 +-
 net/sunrpc/auth_unix.c         | 97 ++++++++++++++++++++++++++++--------------
 net/sunrpc/clnt.c              |  5 ++-
 8 files changed, 89 insertions(+), 56 deletions(-)

Comments

Trond Myklebust July 19, 2018, 7:45 p.m. UTC | #1
On Thu, 2018-07-19 at 17:42 +0000, Sargun Dhillon wrote:
> This adds the ability to pass a non-init user namespace to
> rpcauth_create,
> via rpc_auth_create_args. If the specific authentication mechanism
> does not support non-init user namespaces, then it will return an
> error.
>
> Currently, the only two authentication mechanisms that support
> non-init user namespaces are auth_null, and auth_unix. auth_unix
> will send the UID / GID from the user namespace for authentication.
>

Firstly, please at least Cc the linux-nfs mailing list (as per the
MAINTAINERS file) when changing NFS and sunrpc code.

Secondly, can you please explain why we would want to use any user
namespace other than the one specified in the net namespace structure
(struct net) when communicating with network resources such as
rpc.gssd, the idmapper or, for that matter, the NFS server?

Thanks
  Trond
-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com
Sargun Dhillon July 20, 2018, midnight UTC | #2
On Thu, Jul 19, 2018 at 12:45 PM, Trond Myklebust
<trondmy@hammerspace.com> wrote:
>
> On Thu, 2018-07-19 at 17:42 +0000, Sargun Dhillon wrote:
> > This adds the ability to pass a non-init user namespace to
> > rpcauth_create,
> > via rpc_auth_create_args. If the specific authentication mechanism
> > does not support non-init user namespaces, then it will return an
> > error.
> >
> > Currently, the only two authentication mechanisms that support
> > non-init user namespaces are auth_null, and auth_unix. auth_unix
> > will send the UID / GID from the user namespace for authentication.
> >
>
> Firstly, please at least Cc the linux-nfs mailing list (as per the
> MAINTAINERS file) when changing NFS and sunrpc code.
Sorry about that.

>
> Secondly, can you please explain why we would want to use any user
> namespace other than the one specified in the net namespace structure
> (struct net) when communicating with network resources such as
> rpc.gssd, the idmapper or, for that matter, the NFS server?
We mount NFS volumes for containers (user namespaces) today. On
multiple machines, they may have different mappings of uids in the
user namespace to kuids. If this is the case, it breaks auth_unix
because it uses the kuid in the init user ns mapping for the uid it
sends to the server.
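
To make the mismatch concrete, here is a small user-space sketch of the two translations involved. It is an illustration only: `struct uid_extent`, `ns_to_kuid()` and `kuid_to_ns()` are hypothetical names loosely modelled on the kernel's make_kuid() and from_kuid(), and the extent values in the usage note are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * One extent of a uid map: namespace ids [inside, inside + count)
 * correspond to kernel-wide ids [outside, outside + count).
 */
struct uid_extent {
	uint32_t inside;
	uint32_t outside;
	uint32_t count;
};

#define INVALID_UID ((uint32_t)-1)

/* Namespace uid -> kernel-wide uid (analogue of make_kuid()). */
uint32_t ns_to_kuid(const struct uid_extent *map, size_t n, uint32_t uid)
{
	assert(map != NULL);
	for (size_t i = 0; i < n; i++)
		if (uid >= map[i].inside && uid - map[i].inside < map[i].count)
			return map[i].outside + (uid - map[i].inside);
	return INVALID_UID;
}

/* Kernel-wide uid -> namespace uid (analogue of from_kuid()). */
uint32_t kuid_to_ns(const struct uid_extent *map, size_t n, uint32_t kuid)
{
	assert(map != NULL);
	for (size_t i = 0; i < n; i++)
		if (kuid >= map[i].outside && kuid - map[i].outside < map[i].count)
			return map[i].inside + (kuid - map[i].outside);
	return INVALID_UID;
}
```

If host A maps container uid 0 to kuid 100000 and host B maps it to 200000, translating through the init namespace puts 100000 and 200000 on the wire, while translating through each container's own map yields 0 on both hosts.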

I think that if we moved to using the net->user_ns for auth_unix,
that'd be great, but it'd break userspace, as far as I know. We have a
slightly hacked version of this patch that uses the s_user_ns from the
nfs superblock, and I think that uids from the backing store (whether
it be a block device, or a server), should be written as the kuid, and
translated when it goes in and out of the userns.

Do you have any other suggestions, if we eventually want to enable
NFS4 for user namespaces?
>
> Thanks
>   Trond
> --
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> trond.myklebust@hammerspace.com
>
Trond Myklebust July 20, 2018, 12:37 a.m. UTC | #3
On Thu, 2018-07-19 at 17:00 -0700, Sargun Dhillon wrote:
> On Thu, Jul 19, 2018 at 12:45 PM, Trond Myklebust
> <trondmy@hammerspace.com> wrote:
> >
> > On Thu, 2018-07-19 at 17:42 +0000, Sargun Dhillon wrote:
> > > This adds the ability to pass a non-init user namespace to
> > > rpcauth_create,
> > > via rpc_auth_create_args. If the specific authentication
> > > mechanism
> > > does not support non-init user namespaces, then it will return an
> > > error.
> > >
> > > Currently, the only two authentication mechanisms that support
> > > non-init user namespaces are auth_null, and auth_unix. auth_unix
> > > will send the UID / GID from the user namespace for
> > > authentication.
> > >
> >
> > Firstly, please at least Cc the linux-nfs mailing list (as per the
> > MAINTAINERS file) when changing NFS and sunrpc code.
>
> Sorry about that.
>
> >
> > Secondly, can you please explain why we would want to use any user
> > namespace other than the one specified in the net namespace
> > structure
> > (struct net) when communicating with network resources such as
> > rpc.gssd, the idmapper or, for that matter, the NFS server?
>
> We mount NFS volumes for containers (user namespaces) today. On
> multiple machines, they may have different mappings of uids in the
> user namespace to kuids. If this is the case, it breaks auth_unix
> because it uses the kuid in the init user ns mapping for the uid it
> sends to the server.
>

The point is that the user namespace conversions that happen in the
sunrpc layer are all for dealing with services. The AUTH_GSS upcalls
should _only_ be speaking to an rpc.gssd daemon that runs in whatever
container that owns the net namespace (and that created the rpc_pipefs
objects).

Ditto for the idmapper although if you use the keyring based (i.e. the
non legacy) idmapper, that runs in the init namespace.

> I think that if we moved to using the net->user_ns for auth_unix,
> that'd be great, but it'd break userspace, as far as I know. We have
> a
> slightly hacked version of this patch that uses the s_user_ns from
> the
> nfs superblock, and I think that uids from the backing store (whether
> it be a block device, or a server), should be written as the kuid,
> and
> translated when it goes in and out of the userns.

The actual applications running in the containers are interacting
through the standard system calls. They do not need any extra
conversion, because the syscalls convert them to kuids and back.

IOW: We can completely ignore the user namespace of the container,
since that is taken care of at the syscall level.

The only namespaces we care about are:

1) The container that set up the mount in the first place, since
presumably it is authorised to use its own uid/gids when talking to the
mountpoint. That user namespace had better be the same one as the one
saved in 'struct net' that was saved when we set up the mountpoint.

2) The containers that are running rpc.gssd and rpc.idmapd. Again,
those are tied to struct net.

> Do you have any other suggestions, if we eventually want to enable
> NFS4 for user namespaces?

See above.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com
Sargun Dhillon July 20, 2018, 6:12 a.m. UTC | #4
On Thu, Jul 19, 2018 at 5:37 PM, Trond Myklebust
<trondmy@hammerspace.com> wrote:
> On Thu, 2018-07-19 at 17:00 -0700, Sargun Dhillon wrote:
>> On Thu, Jul 19, 2018 at 12:45 PM, Trond Myklebust
>> <trondmy@hammerspace.com> wrote:
>> >
>> > On Thu, 2018-07-19 at 17:42 +0000, Sargun Dhillon wrote:
>> > > This adds the ability to pass a non-init user namespace to
>> > > rpcauth_create,
>> > > via rpc_auth_create_args. If the specific authentication
>> > > mechanism
>> > > does not support non-init user namespaces, then it will return an
>> > > error.
>> > >
>> > > Currently, the only two authentication mechanisms that support
>> > > non-init user namespaces are auth_null, and auth_unix. auth_unix
>> > > will send the UID / GID from the user namespace for
>> > > authentication.
>> > >
>> >
>> > Firstly, please at least Cc the linux-nfs mailing list (as per the
>> > MAINTAINERS file) when changing NFS and sunrpc code.
>>
>> Sorry about that.
>>
>> >
>> > Secondly, can you please explain why we would want to use any user
>> > namespace other than the one specified in the net namespace
>> > structure
>> > (struct net) when communicating with network resources such as
>> > rpc.gssd, the idmapper or, for that matter, the NFS server?
>>
>> We mount NFS volumes for containers (user namespaces) today. On
>> multiple machines, they may have different mappings of uids in the
>> user namespace to kuids. If this is the case, it breaks auth_unix
>> because it uses the kuid in the init user ns mapping for the uid it
>> sends to the server.
>>
>
> The point is that the user namespace conversions that happen in the
> sunrpc layer are all for dealing with services. The AUTH_GSS upcalls
> should _only_ be speaking to an rpc.gssd daemon that runs in whatever
> container that owns the net namespace (and that created the rpc_pipefs
> objects).
>
> Ditto for the idmapper although if you use the keyring based (i.e. the
> non legacy) idmapper, that runs in the init namespace.
>
>> I think that if we moved to using the net->user_ns for auth_unix,
>> that'd be great, but it'd break userspace, as far as I know. We have
>> a
>> slightly hacked version of this patch that uses the s_user_ns from
>> the
>> nfs superblock, and I think that uids from the backing store (whether
>> it be a block device, or a server), should be written as the kuid,
>> and
>> translated when it goes in and out of the userns.
>
> The actual applications running in the containers are interacting
> through the standard system calls. They do not need any extra
> conversion, because the syscalls convert them to kuids and back.
>
> IOW: We can completely ignore the user namespace of the container,
> since that is taken care of at the syscall level.
>
> The only namespaces we care about are:
>
> 1) The container that set up the mount in the first place, since
> presumably it is authorised to use its own uid/gids when talking to the
> mountpoint. That user namespace had better be the same one as the one
> saved in 'struct net' that was saved when we set up the mountpoint.
>
> 2) The containers that are running rpc.gssd and rpc.idmapd. Again,
> those are tied to struct net.
>

When the server presents with NFS_CAP_UIDGID_NOMAP, and you use
auth_unix there are no upcalls to rpc.gssd, nor rpc.idmapd. The
mapping to uid in the init user ns are sent to the NFS server, even if
net->user_ns is not init_user_ns. The syscall happens with a user in a
user namespace with, say, ID 0, and their cred has the
from_kuid(&init_user_ns...) of 100, the uid the server receives is
still 100.

If we choose to convert them based on the network namespace, it would
solve the problem just fine, but that'd be a userspace breaking
change. I think we have to use the s_user_ns.

>> Do you have any other suggestions, if we eventually want to enable
>> NFS4 for user namespaces?
>
> See above.
>
> --
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> trond.myklebust@hammerspace.com
>
Trond Myklebust July 20, 2018, 11:48 a.m. UTC | #5
On Thu, 2018-07-19 at 23:12 -0700, Sargun Dhillon wrote:
> On Thu, Jul 19, 2018 at 5:37 PM, Trond Myklebust
> <trondmy@hammerspace.com> wrote:
> > On Thu, 2018-07-19 at 17:00 -0700, Sargun Dhillon wrote:
> > > On Thu, Jul 19, 2018 at 12:45 PM, Trond Myklebust
> > > <trondmy@hammerspace.com> wrote:
> > > >
> > > > On Thu, 2018-07-19 at 17:42 +0000, Sargun Dhillon wrote:
> > > > > This adds the ability to pass a non-init user namespace to
> > > > > rpcauth_create,
> > > > > via rpc_auth_create_args. If the specific authentication
> > > > > mechanism
> > > > > does not support non-init user namespaces, then it will
> > > > > return an
> > > > > error.
> > > > >
> > > > > Currently, the only two authentication mechanisms that
> > > > > support
> > > > > non-init user namespaces are auth_null, and auth_unix.
> > > > > auth_unix
> > > > > will send the UID / GID from the user namespace for
> > > > > authentication.
> > > > >
> > > >
> > > > Firstly, please at least Cc the linux-nfs mailing list (as per
> > > > the
> > > > MAINTAINERS file) when changing NFS and sunrpc code.
> > >
> > > Sorry about that.
> > >
> > > >
> > > > Secondly, can you please explain why we would want to use any
> > > > user
> > > > namespace other than the one specified in the net namespace
> > > > structure
> > > > (struct net) when communicating with network resources such as
> > > > rpc.gssd, the idmapper or, for that matter, the NFS server?
> > >
> > > We mount NFS volumes for containers (user namespaces) today. On
> > > multiple machines, they may have different mappings of uids in
> > > the
> > > user namespace to kuids. If this is the case, it breaks auth_unix
> > > because it uses the kuid in the init user ns mapping for the uid
> > > it
> > > sends to the server.
> > >
> >
> > The point is that the user namespace conversions that happen in the
> > sunrpc layer are all for dealing with services. The AUTH_GSS
> > upcalls
> > should _only_ be speaking to an rpc.gssd daemon that runs in
> > whatever
> > container that owns the net namespace (and that created the
> > rpc_pipefs
> > objects).
> >
> > Ditto for the idmapper although if you use the keyring based (i.e.
> > the
> > non legacy) idmapper, that runs in the init namespace.
> >
> > > I think that if we moved to using the net->user_ns for auth_unix,
> > > that'd be great, but it'd break userspace, as far as I know. We
> > > have
> > > a
> > > slightly hacked version of this patch that uses the s_user_ns
> > > from
> > > the
> > > nfs superblock, and I think that uids from the backing store
> > > (whether
> > > it be a block device, or a server), should be written as the
> > > kuid,
> > > and
> > > translated when it goes in and out of the userns.
> >
> > The actual applications running in the containers are interacting
> > through the standard system calls. They do not need any extra
> > conversion, because the syscalls convert them to kuids and back.
> >
> > IOW: We can completely ignore the user namespace of the container,
> > since that is taken care of at the syscall level.
> >
> > The only namespaces we care about are:
> >
> > 1) The container that set up the mount in the first place, since
> > presumably it is authorised to use its own uid/gids when talking to
> > the
> > mountpoint. That user namespace had better be the same one as the
> > one
> > saved in 'struct net' that was saved when we set up the mountpoint.
> >
> > 2) The containers that are running rpc.gssd and rpc.idmapd. Again,
> > those are tied to struct net.
> >
>
> When the server presents with NFS_CAP_UIDGID_NOMAP, and you use
> auth_unix there are no upcalls to rpc.gssd, nor rpc.idmapd. The
> mapping to uid in the init user ns are sent to the NFS server, even
> if
> net->user_ns is not init_user_ns. The syscall happens with a user in
> a
> user namespace with, say, ID 0, and their cred has the
> from_kuid(&init_user_ns...) of 100, the uid the server receives is
> still 100.

The current code assumes that the init namespace sets up all
mountpoints. It is broken if the mountpoint gets set up from inside a
container.

> If we choose to convert them based on the network namespace, it would
> solve the problem just fine, but that'd be a userspace breaking
> change. I think we have to use the s_user_ns.

The s_user_ns doesn't relate to anything special on the server. It
doesn't relate to the rpc.gssd process, and it doesn't relate to the
rpc.idmapd process. Why would we want to give it a role at all for NFS?

Aside from that, why would a container orchestrator process (or
whatever is setting up the mountpoint here) need to run with a
different user namespace in its process creds and its net namespace?
That would mean that we'd be using different user namespaces for
rpc_pipefs and for the NFS filesystem.
IOW: when talking to the rpc.gssd daemon, I'd end up using one user
namespace for setting up the link to the daemon via rpc_pipefs, then
I'd be using a different user namespace when communicating with the
rpc.gssd daemon on the other end of that link. In what user namespace
would the rpc.gssd daemon be expected to run in this kind of scenario?
Ditto for rpc.idmapd.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com
Sargun Dhillon July 20, 2018, 5:06 p.m. UTC | #6
On Fri, Jul 20, 2018 at 4:48 AM, Trond Myklebust
<trondmy@hammerspace.com> wrote:
> On Thu, 2018-07-19 at 23:12 -0700, Sargun Dhillon wrote:
>> On Thu, Jul 19, 2018 at 5:37 PM, Trond Myklebust
>> <trondmy@hammerspace.com> wrote:
>> > On Thu, 2018-07-19 at 17:00 -0700, Sargun Dhillon wrote:
>> > > On Thu, Jul 19, 2018 at 12:45 PM, Trond Myklebust
>> > > <trondmy@hammerspace.com> wrote:
>> > > >
>> > > > On Thu, 2018-07-19 at 17:42 +0000, Sargun Dhillon wrote:
>> > > > > This adds the ability to pass a non-init user namespace to
>> > > > > rpcauth_create,
>> > > > > via rpc_auth_create_args. If the specific authentication
>> > > > > mechanism
>> > > > > does not support non-init user namespaces, then it will
>> > > > > return an
>> > > > > error.
>> > > > >
>> > > > > Currently, the only two authentication mechanisms that
>> > > > > support
>> > > > > non-init user namespaces are auth_null, and auth_unix.
>> > > > > auth_unix
>> > > > > will send the UID / GID from the user namespace for
>> > > > > authentication.
>> > > > >
>> > > >
>> > > > Firstly, please at least Cc the linux-nfs mailing list (as per
>> > > > the
>> > > > MAINTAINERS file) when changing NFS and sunrpc code.
>> > >
>> > > Sorry about that.
>> > >
>> > > >
>> > > > Secondly, can you please explain why we would want to use any
>> > > > user
>> > > > namespace other than the one specified in the net namespace
>> > > > structure
>> > > > (struct net) when communicating with network resources such as
>> > > > rpc.gssd, the idmapper or, for that matter, the NFS server?
>> > >
>> > > We mount NFS volumes for containers (user namespaces) today. On
>> > > multiple machines, they may have different mappings of uids in
>> > > the
>> > > user namespace to kuids. If this is the case, it breaks auth_unix
>> > > because it uses the kuid in the init user ns mapping for the uid
>> > > it
>> > > sends to the server.
>> > >
>> >
>> > The point is that the user namespace conversions that happen in the
>> > sunrpc layer are all for dealing with services. The AUTH_GSS
>> > upcalls
>> > should _only_ be speaking to an rpc.gssd daemon that runs in
>> > whatever
>> > container that owns the net namespace (and that created the
>> > rpc_pipefs
>> > objects).
>> >
>> > Ditto for the idmapper although if you use the keyring based (i.e.
>> > the
>> > non legacy) idmapper, that runs in the init namespace.
>> >
>> > > I think that if we moved to using the net->user_ns for auth_unix,
>> > > that'd be great, but it'd break userspace, as far as I know. We
>> > > have
>> > > a
>> > > slightly hacked version of this patch that uses the s_user_ns
>> > > from
>> > > the
>> > > nfs superblock, and I think that uids from the backing store
>> > > (whether
>> > > it be a block device, or a server), should be written as the
>> > > kuid,
>> > > and
>> > > translated when it goes in and out of the userns.
>> >
>> > The actual applications running in the containers are interacting
>> > through the standard system calls. They do not need any extra
>> > conversion, because the syscalls convert them to kuids and back.
>> >
>> > IOW: We can completely ignore the user namespace of the container,
>> > since that is taken care of at the syscall level.
>> >
>> > The only namespaces we care about are:
>> >
>> > 1) The container that set up the mount in the first place, since
>> > presumably it is authorised to use its own uid/gids when talking to
>> > the
>> > mountpoint. That user namespace had better be the same one as the
>> > one
>> > saved in 'struct net' that was saved when we set up the mountpoint.
>> >
>> > 2) The containers that are running rpc.gssd and rpc.idmapd. Again,
>> > those are tied to struct net.
>> >
>>
>> When the server presents with NFS_CAP_UIDGID_NOMAP, and you use
>> auth_unix there are no upcalls to rpc.gssd, nor rpc.idmapd. The
>> mapping to uid in the init user ns are sent to the NFS server, even
>> if
>> net->user_ns is not init_user_ns. The syscall happens with a user in
>> a
>> user namespace with, say, ID 0, and their cred has the
>> from_kuid(&init_user_ns...) of 100, the uid the server receives is
>> still 100.
>
> The current code assumes that the init namespace sets up all
> mountpoints. It is broken if the mountpoint gets set up from inside a
> container.
>
So, is it okay to change the current "broken" behaviour, even if it
breaks existing users, who do NFS mounts from network namespaces,
which are in turn owned by non init user namespaces? You can do this
today by:
# Session 1
unshare -U
unshare -n
PID=$(echo $$)

# Session 2
nsenter -t $PID -n
# (set up networking for the new network namespace here)

# Session 1
mount ${VOLUME that has NFS_CAP_UIDGID_NOMAP}:/ /mnt/tmp

It'll then send init user NS UIDs, instead of user namespace UIDs, to
the NFS server for auth_unix writes. This means you have to have the
same mapping of user NS UIDs to init user NS UIDs across all systems.
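
One way to compare the mappings across systems is to read /proc/self/uid_map from inside each container. The following is a rough sketch, assuming the usual three-column "inside outside count" line format; `struct uid_map_line`, `parse_uid_map_line()` and `dump_self_uid_map()` are hypothetical helper names.

```c
#include <assert.h>
#include <stdio.h>

/* One parsed line of a uid_map file. */
struct uid_map_line {
	unsigned int inside;	/* first uid inside the namespace */
	unsigned int outside;	/* first uid it maps to outside */
	unsigned int count;	/* length of the mapped range */
};

/* Parse "inside outside count"; returns 0 on success, -1 on bad input. */
int parse_uid_map_line(const char *line, struct uid_map_line *out)
{
	assert(line != NULL && out != NULL);
	if (sscanf(line, "%u %u %u",
		   &out->inside, &out->outside, &out->count) != 3)
		return -1;
	return 0;
}

/*
 * Print the calling process's uid map to dst.
 * Returns the number of extents printed, or -1 if the file
 * could not be opened (for example on a non-Linux system).
 */
int dump_self_uid_map(FILE *dst)
{
	FILE *f = fopen("/proc/self/uid_map", "r");
	struct uid_map_line m;
	char buf[128];
	int n = 0;

	if (!f)
		return -1;
	while (fgets(buf, sizeof(buf), f)) {
		if (parse_uid_map_line(buf, &m) == 0) {
			fprintf(dst, "ns uid %u -> outside uid %u (count %u)\n",
				m.inside, m.outside, m.count);
			n++;
		}
	}
	fclose(f);
	return n;
}
```

Running `dump_self_uid_map(stdout)` in Session 1 on each machine and comparing the output would show whether the same in-namespace UID ends up as the same init user NS UID everywhere.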

Is this the "broken" behaviour you're talking about? Can we change
this behaviour, so auth_unix looks at the network namespace -> user_ns
when encoding UIDs on the wire?

>> If we choose to convert them based on the network namespace, it would
>> solve the problem just fine, but that'd be a userspace breaking
>> change. I think we have to use the s_user_ns.
>
> The s_user_ns doesn't relate to anything special on the server. It
> doesn't relate to the rpc.gssd process, and it doesn't relate to the
> rpc.idmapd process. Why would we want to give it a role at all for NFS?
See above. Right now, s_user_ns is always init_user_ns, since we don't
allow the mount to be owned by a non-init user ns. This would allow us
to safely change the behaviour in the future, without changing the
behaviour on userspace.

>
> Aside from that, why would a container orchestrator process (or
> whatever is setting up the mountpoint here) need to run with a
> different user namespace in its process creds and its net namespace?
> That would mean that we'd be using different user namespaces for
> rpc_pipefs and for the NFS filesystem.
> IOW: when talking to the rpc.gssd daemon, I'd end up using one user
> namespace for setting up the link to the daemon via rpc_pipefs, then
> I'd be using a different user namespace when communicating with the
> rpc.gssd daemon on the other end of that link. In what user namespace
> would the rpc.gssd daemon be expected to run in this kind of scenario?
> Ditto for rpc.idmapd.
I don't have strong opinions about this. The only thing I care about
is which UIDs get sent to and fro the NFS server via AUTH_UNIX, and
how are UIDs interpreted when you have NFS_CAP_UIDGID_NOMAP? Right
now, all of this is interpreted based on init_user_ns.

>
> --
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> trond.myklebust@hammerspace.com
>
Patch

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 6dd146885da9..ab92ac8d48a8 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -3657,7 +3657,8 @@  static int nfs4_lookup_root_sec(struct nfs_server *server, struct nfs_fh *fhandl
 				struct nfs_fsinfo *info, rpc_authflavor_t flavor)
 {
 	struct rpc_auth_create_args auth_args = {
-		.pseudoflavor = flavor,
+		.pseudoflavor	= flavor,
+		.user_ns	= &init_user_ns,
 	};
 	struct rpc_auth *auth;
 
diff --git a/include/linux/sunrpc/auth.h b/include/linux/sunrpc/auth.h
index d9af474a857d..7f320be28efc 100644
--- a/include/linux/sunrpc/auth.h
+++ b/include/linux/sunrpc/auth.h
@@ -111,6 +111,7 @@  struct rpc_auth {
 
 struct rpc_auth_create_args {
 	rpc_authflavor_t pseudoflavor;
+	struct user_namespace *user_ns;
 	const char *target_name;
 };
 
@@ -125,7 +126,9 @@  struct rpc_authops {
 	struct module		*owner;
 	rpc_authflavor_t	au_flavor;	/* flavor (RPC_AUTH_*) */
 	char *			au_name;
-	struct rpc_auth *	(*create)(struct rpc_auth_create_args *, struct rpc_clnt *);
+	bool			user_ns;	/* supports user namespaces */
+	struct rpc_auth *	(*create)(const struct rpc_auth_create_args *,
+					  struct rpc_clnt *);
 	void			(*destroy)(struct rpc_auth *);
 
 	int			(*hash_cred)(struct auth_cred *, unsigned int);
@@ -161,12 +164,10 @@  struct rpc_credops {
 extern const struct rpc_authops	authunix_ops;
 extern const struct rpc_authops	authnull_ops;
 
-int __init		rpc_init_authunix(void);
 int __init		rpc_init_generic_auth(void);
 int __init		rpcauth_init_module(void);
 void			rpcauth_remove_module(void);
 void			rpc_destroy_generic_auth(void);
-void 			rpc_destroy_authunix(void);
 
 struct rpc_cred *	rpc_lookup_cred(void);
 struct rpc_cred *	rpc_lookup_cred_nonblock(void);
@@ -174,7 +175,7 @@  struct rpc_cred *	rpc_lookup_generic_cred(struct auth_cred *, int, gfp_t);
 struct rpc_cred *	rpc_lookup_machine_cred(const char *service_name);
 int			rpcauth_register(const struct rpc_authops *);
 int			rpcauth_unregister(const struct rpc_authops *);
-struct rpc_auth *	rpcauth_create(struct rpc_auth_create_args *,
+struct rpc_auth *	rpcauth_create(const struct rpc_auth_create_args *,
 				struct rpc_clnt *);
 void			rpcauth_release(struct rpc_auth *);
 rpc_authflavor_t	rpcauth_get_pseudoflavor(rpc_authflavor_t,
diff --git a/net/sunrpc/auth.c b/net/sunrpc/auth.c
index d2623b9f23d6..9cf1076375d5 100644
--- a/net/sunrpc/auth.c
+++ b/net/sunrpc/auth.c
@@ -253,7 +253,7 @@  rpcauth_list_flavors(rpc_authflavor_t *array, int size)
 EXPORT_SYMBOL_GPL(rpcauth_list_flavors);
 
 struct rpc_auth *
-rpcauth_create(struct rpc_auth_create_args *args, struct rpc_clnt *clnt)
+rpcauth_create(const struct rpc_auth_create_args *args, struct rpc_clnt *clnt)
 {
 	struct rpc_auth		*auth;
 	const struct rpc_authops *ops;
@@ -272,7 +272,8 @@  rpcauth_create(struct rpc_auth_create_args *args, struct rpc_clnt *clnt)
 		goto out;
 	}
 	spin_unlock(&rpc_authflavor_lock);
-	auth = ops->create(args, clnt);
+	if (args->user_ns == &init_user_ns || ops->user_ns)
+		auth = ops->create(args, clnt);
 	module_put(ops->owner);
 	if (IS_ERR(auth))
 		return auth;
@@ -870,27 +871,21 @@  int __init rpcauth_init_module(void)
 {
 	int err;
 
-	err = rpc_init_authunix();
-	if (err < 0)
-		goto out1;
 	err = rpc_init_generic_auth();
 	if (err < 0)
-		goto out2;
+		goto out1;
 	err = register_shrinker(&rpc_cred_shrinker);
 	if (err < 0)
-		goto out3;
+		goto out2;
 	return 0;
-out3:
-	rpc_destroy_generic_auth();
 out2:
-	rpc_destroy_authunix();
+	rpc_destroy_generic_auth();
 out1:
 	return err;
 }
 
 void rpcauth_remove_module(void)
 {
-	rpc_destroy_authunix();
 	rpc_destroy_generic_auth();
 	unregister_shrinker(&rpc_cred_shrinker);
 }
diff --git a/net/sunrpc/auth_generic.c b/net/sunrpc/auth_generic.c
index f1df9837f1ac..2ce9dc8a843b 100644
--- a/net/sunrpc/auth_generic.c
+++ b/net/sunrpc/auth_generic.c
@@ -270,6 +270,7 @@  static const struct rpc_authops generic_auth_ops = {
 	.lookup_cred = generic_lookup_cred,
 	.crcreate = generic_create_cred,
 	.key_timeout = generic_key_timeout,
+	.user_ns = false,
 };
 
 static struct rpc_auth generic_auth = {
diff --git a/net/sunrpc/auth_gss/auth_gss.c b/net/sunrpc/auth_gss/auth_gss.c
index be8f103d22fd..34ec2770c71c 100644
--- a/net/sunrpc/auth_gss/auth_gss.c
+++ b/net/sunrpc/auth_gss/auth_gss.c
@@ -985,7 +985,7 @@  static void gss_pipe_free(struct gss_pipe *p)
  * parameters based on the input flavor (which must be a pseudoflavor)
  */
 static struct gss_auth *
-gss_create_new(struct rpc_auth_create_args *args, struct rpc_clnt *clnt)
+gss_create_new(const struct rpc_auth_create_args *args, struct rpc_clnt *clnt)
 {
 	rpc_authflavor_t flavor = args->pseudoflavor;
 	struct gss_auth *gss_auth;
@@ -1132,7 +1132,7 @@  gss_destroy(struct rpc_auth *auth)
  * (which is guaranteed to last as long as any of its descendants).
  */
 static struct gss_auth *
-gss_auth_find_or_add_hashed(struct rpc_auth_create_args *args,
+gss_auth_find_or_add_hashed(const struct rpc_auth_create_args *args,
 		struct rpc_clnt *clnt,
 		struct gss_auth *new)
 {
@@ -1169,7 +1169,8 @@  gss_auth_find_or_add_hashed(struct rpc_auth_create_args *args,
 }
 
 static struct gss_auth *
-gss_create_hashed(struct rpc_auth_create_args *args, struct rpc_clnt *clnt)
+gss_create_hashed(const struct rpc_auth_create_args *args,
+		  struct rpc_clnt *clnt)
 {
 	struct gss_auth *gss_auth;
 	struct gss_auth *new;
@@ -1188,7 +1189,7 @@  gss_create_hashed(struct rpc_auth_create_args *args, struct rpc_clnt *clnt)
 }
 
 static struct rpc_auth *
-gss_create(struct rpc_auth_create_args *args, struct rpc_clnt *clnt)
+gss_create(const struct rpc_auth_create_args *args, struct rpc_clnt *clnt)
 {
 	struct gss_auth *gss_auth;
 	struct rpc_xprt_switch *xps = rcu_access_pointer(clnt->cl_xpi.xpi_xpswitch);
@@ -2005,6 +2006,7 @@  static const struct rpc_authops authgss_ops = {
 	.list_pseudoflavors = gss_mech_list_pseudoflavors,
 	.info2flavor	= gss_mech_info2flavor,
 	.flavor2info	= gss_mech_flavor2info,
+	.user_ns	= false,
 };
 
 static const struct rpc_credops gss_credops = {
diff --git a/net/sunrpc/auth_null.c b/net/sunrpc/auth_null.c
index 75d72e109a04..a2743bfc79f9 100644
--- a/net/sunrpc/auth_null.c
+++ b/net/sunrpc/auth_null.c
@@ -19,7 +19,7 @@  static struct rpc_auth null_auth;
 static struct rpc_cred null_cred;
 
 static struct rpc_auth *
-nul_create(struct rpc_auth_create_args *args, struct rpc_clnt *clnt)
+nul_create(const struct rpc_auth_create_args *args, struct rpc_clnt *clnt)
 {
 	atomic_inc(&null_auth.au_count);
 	return &null_auth;
@@ -110,6 +110,7 @@  const struct rpc_authops authnull_ops = {
 	.create		= nul_create,
 	.destroy	= nul_destroy,
 	.lookup_cred	= nul_lookup_cred,
+	.user_ns	= true,
 };
 
 static
diff --git a/net/sunrpc/auth_unix.c b/net/sunrpc/auth_unix.c
index dafd6b870ba3..9935e878aac0 100644
--- a/net/sunrpc/auth_unix.c
+++ b/net/sunrpc/auth_unix.c
@@ -15,10 +15,16 @@ 
 #include <linux/sunrpc/auth.h>
 #include <linux/user_namespace.h>
 
+struct unix_auth {
+	struct rpc_auth		rpc_auth;
+	struct user_namespace	*user_ns;
+};
+
 struct unx_cred {
 	struct rpc_cred		uc_base;
 	kgid_t			uc_gid;
 	kgid_t			uc_gids[UNX_NGROUPS];
+	struct user_namespace	*user_ns;
 };
 #define uc_uid			uc_base.cr_uid
 
@@ -26,31 +32,71 @@  struct unx_cred {
 # define RPCDBG_FACILITY	RPCDBG_AUTH
 #endif
 
-static struct rpc_auth		unix_auth;
 static const struct rpc_credops	unix_credops;
 
 static struct rpc_auth *
-unx_create(struct rpc_auth_create_args *args, struct rpc_clnt *clnt)
+unx_create(const struct rpc_auth_create_args *args, struct rpc_clnt *clnt)
 {
+	struct unix_auth *unix_auth;
+	struct rpc_auth *auth;
+	int err = -ENOMEM;
+
 	dprintk("RPC:       creating UNIX authenticator for client %p\n",
 			clnt);
-	atomic_inc(&unix_auth.au_count);
-	return &unix_auth;
+	if (!try_module_get(THIS_MODULE))
+		return ERR_PTR(-EINVAL);
+	unix_auth = kmalloc(sizeof(*unix_auth), GFP_KERNEL);
+	if (!unix_auth)
+		goto error;
+
+	unix_auth->user_ns = get_user_ns(args->user_ns);
+	auth = &unix_auth->rpc_auth;
+
+	auth->au_cslack = UNX_CALLSLACK;
+	auth->au_rslack = NUL_REPLYSLACK;
+	auth->au_flags = RPCAUTH_AUTH_NO_CRKEY_TIMEOUT;
+	auth->au_ops = &authunix_ops;
+	auth->au_flavor = RPC_AUTH_UNIX;
+	atomic_set(&unix_auth->rpc_auth.au_count, 1);
+
+	err = rpcauth_init_credcache(auth);
+	if (err)
+		goto error_free_auth;
+
+	return auth;
+
+error_free_auth:
+	put_user_ns(unix_auth->user_ns);
+	kfree(unix_auth);
+error:
+	module_put(THIS_MODULE);
+	return ERR_PTR(err);
 }
 
 static void
 unx_destroy(struct rpc_auth *auth)
 {
+	struct unix_auth *unix_auth;
+
+	unix_auth = container_of(auth, struct unix_auth, rpc_auth);
 	dprintk("RPC:       destroying UNIX authenticator %p\n", auth);
-	rpcauth_clear_credcache(auth->au_credcache);
+	rpcauth_destroy_credcache(auth);
+	put_user_ns(unix_auth->user_ns);
+	kfree(unix_auth);
+	module_put(THIS_MODULE);
 }
 
 static int
 unx_hash_cred(struct auth_cred *acred, unsigned int hashbits)
 {
-	return hash_64(from_kgid(&init_user_ns, acred->gid) |
-		((u64)from_kuid(&init_user_ns, acred->uid) <<
-			(sizeof(gid_t) * 8)), hashbits);
+	/*
+	 * No need to convert these IDs through the user namespace: the
+	 * cred cache is scoped to a single unix_auth instance.
+	 */
+	uid_t uid = __kuid_val(acred->uid);
+	gid_t gid = __kgid_val(acred->gid);
+
+	return hash_64(gid | ((u64)uid << (sizeof(gid_t) * 8)), hashbits);
 }
 
 /*
@@ -65,19 +111,22 @@  unx_lookup_cred(struct rpc_auth *auth, struct auth_cred *acred, int flags)
 static struct rpc_cred *
 unx_create_cred(struct rpc_auth *auth, struct auth_cred *acred, int flags, gfp_t gfp)
 {
+	struct unix_auth *unix_auth;
 	struct unx_cred	*cred;
 	unsigned int groups = 0;
 	unsigned int i;
 
+	unix_auth = container_of(auth, struct unix_auth, rpc_auth);
 	dprintk("RPC:       allocating UNIX cred for uid %d gid %d\n",
-			from_kuid(&init_user_ns, acred->uid),
-			from_kgid(&init_user_ns, acred->gid));
+			from_kuid(unix_auth->user_ns, acred->uid),
+			from_kgid(unix_auth->user_ns, acred->gid));
 
 	if (!(cred = kmalloc(sizeof(*cred), gfp)))
 		return ERR_PTR(-ENOMEM);
 
 	rpcauth_init_cred(&cred->uc_base, acred, auth, &unix_credops);
 	cred->uc_base.cr_flags = 1UL << RPCAUTH_CRED_UPTODATE;
+	cred->user_ns = get_user_ns(unix_auth->user_ns);
 
 	if (acred->group_info != NULL)
 		groups = acred->group_info->ngroups;
@@ -97,6 +146,7 @@  static void
 unx_free_cred(struct unx_cred *unx_cred)
 {
 	dprintk("RPC:       unx_free_cred %p\n", unx_cred);
+	put_user_ns(unx_cred->user_ns);
 	kfree(unx_cred);
 }
 
@@ -162,11 +212,11 @@  unx_marshal(struct rpc_task *task, __be32 *p)
 	 */
 	p = xdr_encode_array(p, clnt->cl_nodename, clnt->cl_nodelen);
 
-	*p++ = htonl((u32) from_kuid(&init_user_ns, cred->uc_uid));
-	*p++ = htonl((u32) from_kgid(&init_user_ns, cred->uc_gid));
+	*p++ = htonl((u32)from_kuid(cred->user_ns, cred->uc_uid));
+	*p++ = htonl((u32)from_kgid(cred->user_ns, cred->uc_gid));
 	hold = p++;
 	for (i = 0; i < UNX_NGROUPS && gid_valid(cred->uc_gids[i]); i++)
-		*p++ = htonl((u32) from_kgid(&init_user_ns, cred->uc_gids[i]));
+		*p++ = htonl((u32)from_kgid(cred->user_ns, cred->uc_gids[i]));
 	*hold = htonl(p - hold - 1);		/* gid array length */
 	*base = htonl((p - base - 1) << 2);	/* cred length */
 
@@ -211,16 +261,6 @@  unx_validate(struct rpc_task *task, __be32 *p)
 	return p;
 }
 
-int __init rpc_init_authunix(void)
-{
-	return rpcauth_init_credcache(&unix_auth);
-}
-
-void rpc_destroy_authunix(void)
-{
-	rpcauth_destroy_credcache(&unix_auth);
-}
-
 const struct rpc_authops authunix_ops = {
 	.owner		= THIS_MODULE,
 	.au_flavor	= RPC_AUTH_UNIX,
@@ -230,16 +270,7 @@  const struct rpc_authops authunix_ops = {
 	.hash_cred	= unx_hash_cred,
 	.lookup_cred	= unx_lookup_cred,
 	.crcreate	= unx_create_cred,
-};
-
-static
-struct rpc_auth		unix_auth = {
-	.au_cslack	= UNX_CALLSLACK,
-	.au_rslack	= NUL_REPLYSLACK,
-	.au_flags	= RPCAUTH_AUTH_NO_CRKEY_TIMEOUT,
-	.au_ops		= &authunix_ops,
-	.au_flavor	= RPC_AUTH_UNIX,
-	.au_count	= ATOMIC_INIT(0),
+	.user_ns	= true,
 };
 
 static
diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index d839c33ae7d9..33d4c18060e4 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -294,8 +294,9 @@  static int rpc_client_register(struct rpc_clnt *clnt,
 			       const char *client_name)
 {
 	struct rpc_auth_create_args auth_args = {
-		.pseudoflavor = pseudoflavor,
-		.target_name = client_name,
+		.pseudoflavor	= pseudoflavor,
+		.target_name	= client_name,
+		.user_ns	= &init_user_ns,
 	};
 	struct rpc_auth *auth;
 	struct net *net = rpc_net_ns(clnt);