
[RFC,1/1] destroy_creds.2: new page documenting destroy_creds()

Message ID 20170807212355.29127-3-kolga@netapp.com (mailing list archive)
State New, archived

Commit Message

Olga Kornievskaia Aug. 7, 2017, 9:23 p.m. UTC
destroy_creds() is a new system call for destroying file system
credentials. This is useful for file systems that manage their
own security contexts that were bootstrapped via some userland
credentials (such as Kerberos).

Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
---
 man2/destroy_creds.2 | 130 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 130 insertions(+)
 create mode 100644 man2/destroy_creds.2

Comments

Jeff Layton Aug. 9, 2017, 12:30 p.m. UTC | #1
On Mon, 2017-08-07 at 17:23 -0400, Olga Kornievskaia wrote:
> destroy_creds() is a new system call for destroying file system
> credentials. This is usefulf for file systems that manage its
> own security contexts that were bootstrapped via some user land
> credentials (such as Kerberos).
> 
> Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
> ---
>  man2/destroy_creds.2 | 130 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 130 insertions(+)
>  create mode 100644 man2/destroy_creds.2
> 
> diff --git a/man2/destroy_creds.2 b/man2/destroy_creds.2
> new file mode 100644
> index 0000000..7b41c9d
> --- /dev/null
> +++ b/man2/destroy_creds.2
> @@ -0,0 +1,130 @@
> +.\"This manpage is Copyright (C) 2015 Olga Kornievskaia <kolga@Netapp.com>
> +.\"
> +.\" %%%LICENSE_START(VERBATIM)
> +.\" Permission is granted to make and distribute verbatim copies of this
> +.\" manual provided the copyright notice and this permission notice are
> +.\" preserved on all copies.
> +.\"
> +.\" Permission is granted to copy and distribute modified versions of
> +.\" this manual under the conditions for verbatim copying, provided that
> +.\" the entire resulting derived work is distributed under the terms of
> +.\" a permission notice identical to this one.
> +.\"
> +.\" Since the Linux kernel and libraries are constantly changing, this
> +.\" manual page may be incorrect or out-of-date.  The author(s) assume
> +.\" no responsibility for errors or omissions, or for damages resulting
> +.\" from the use of the information contained herein.  The author(s) may
> +.\" not have taken the same level of care in the production of this
> +.\" manual, which is licensed free of charge, as they might when working
> +.\" professionally.
> +.\"
> +.\" Formatted or processed versions of this manual, if unaccompanied by
> +.\" the source, must acknowledge the copyright and authors of this work.
> +.\" %%%LICENSE_END
> +.\"
> +.TH COPY 2 2017-08-07 "Linux" "Linux Programmer's Manual"
> +.SH NAME
> +destroy_creds \- destroy current user's file system credentials for a mount point
> +.SH SYNOPSIS
> +.nf
> +.B #include <sys/syscall.h>
> +.B #include <unistd.h>
> +
> +.BI "int destroy_creds(int " fd ");
> +.fi
> +.SH DESCRIPTION
> +The
> +.BR destroy ()
> +system call performs destruction of file system credentials for the current
> +user. It identifies the file system by the supplied file descriptor in
> +.I fd
> +that represents a mount point.
> +
> +.SH RETURN VALUE
> +Upon successful completion,
> +.BR destroy_creds ()
> +will return 0.
> +
> +On error,
> +.BR destroy_creds ()
> +returns \-1 and
> +.I errno
> +is set to indicate the error.
> +.SH ERRORS
> +.TP
> +.B EBADF
> +.I fd
> +file descriptor is not valid
> +.TP
> +.B EINVAL
> +if the input file descriptor is not a directory
> +.TP
> +.B ENOENT
> +no credentials found
> +.TP
> +.B EACCES
> +unable to access credentials
> +.TP
> +.B ENOSYS
> +file system does not implement destroy_creds() functionality
> +.SH VERSIONS
> +The
> +.BR destroy_creds ()
> +system call first appeared in Linux 4.1?.
> +.SH CONFORMING TO
> +The
> +.BR destroy_creds ()
> +system call is a nonstandard Linux extension.
> +.SH NOTES
> +
> +.BR destroy_creds ()
> +gives filesystems an opportunity to destroy credentials. For instance,
> +NFS uses Kerberos credentials stored in Kerberos credential cache to
> +create its security contexts that then are stored and managed by the
> +kernel. Once the user logs out and destroys Kerberos credentials via
> +kdestroy, NFS security contexts associate with that user are valid 
> +until they expire. fslogout application such provided by the example
> +allows the user driven credential destruction in the file system.
> +
> +.SH EXAMPLE
> +.nf
> +#define _GNU_SOURCE
> +#include <fcntl.h>
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <sys/stat.h>
> +#include <sys/syscall.h>
> +#include <unistd.h>
> +
> +static int
> +destroy_creds(int fd)
> +{
> +    return syscall(__NR_destroy_creds, fd);
> +}
> +
> +int
> +main(int argc, char **argv)
> +{
> +    int fd, ret;
> +
> +    if (argc != 2) {
> +        fprintf(stderr, "Usage: %s <mount point>\\n", argv[0]);
> +        exit(EXIT_FAILURE);
> +    }
> +
> +    fd = open(argv[1], O_DIRECTORY|O_RDONLY);
> +    if (fd == \-1) {
> +        perror("open (argv[1])");
> +        exit(EXIT_FAILURE);
> +    }
> +
> +    ret = destroy_creds(fd);
> +    if (ret == \-1) {
> +        perror("destroy_creds");
> +        exit(EXIT_FAILURE);
> +    }
> +
> +    close(fd);
> +    exit(EXIT_SUCCESS);
> +}
> +.fi

Thanks, that helps a bit. I'm less clear on what the higher-level
vision is here though:

Are we all going to be running scripts on logout that scrape
/proc/mounts and run fslogout on each? Will this be added to kdestroy?

Or are you aiming to have KCM do this on some trigger? (see:
https://fedoraproject.org/wiki/Changes/KerberosKCMCache)

Also, doing this per-mount seems wrong to me. Shouldn't this be done on
a per-net-namespace basis or maybe even globally?

It seems like we can afford to be rather cavalier about destroying
creds here. Even if we purge creds for a user that should have remained
valid, we just end up having to re-upcall for them, right?
Olga Kornievskaia Aug. 9, 2017, 3:45 p.m. UTC | #2
On Wed, Aug 9, 2017 at 8:30 AM, Jeff Layton <jlayton@redhat.com> wrote:
> On Mon, 2017-08-07 at 17:23 -0400, Olga Kornievskaia wrote:
>> destroy_creds() is a new system call for destroying file system
>> credentials. This is usefulf for file systems that manage its
>> own security contexts that were bootstrapped via some user land
>> credentials (such as Kerberos).
>>
>> Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
>> ---
>>  man2/destroy_creds.2 | 130 +++++++++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 130 insertions(+)
>>  create mode 100644 man2/destroy_creds.2
>>
>> diff --git a/man2/destroy_creds.2 b/man2/destroy_creds.2
>> new file mode 100644
>> index 0000000..7b41c9d
>> --- /dev/null
>> +++ b/man2/destroy_creds.2
>> @@ -0,0 +1,130 @@
>> +.\"This manpage is Copyright (C) 2015 Olga Kornievskaia <kolga@Netapp.com>
>> +.\"
>> +.\" %%%LICENSE_START(VERBATIM)
>> +.\" Permission is granted to make and distribute verbatim copies of this
>> +.\" manual provided the copyright notice and this permission notice are
>> +.\" preserved on all copies.
>> +.\"
>> +.\" Permission is granted to copy and distribute modified versions of
>> +.\" this manual under the conditions for verbatim copying, provided that
>> +.\" the entire resulting derived work is distributed under the terms of
>> +.\" a permission notice identical to this one.
>> +.\"
>> +.\" Since the Linux kernel and libraries are constantly changing, this
>> +.\" manual page may be incorrect or out-of-date.  The author(s) assume
>> +.\" no responsibility for errors or omissions, or for damages resulting
>> +.\" from the use of the information contained herein.  The author(s) may
>> +.\" not have taken the same level of care in the production of this
>> +.\" manual, which is licensed free of charge, as they might when working
>> +.\" professionally.
>> +.\"
>> +.\" Formatted or processed versions of this manual, if unaccompanied by
>> +.\" the source, must acknowledge the copyright and authors of this work.
>> +.\" %%%LICENSE_END
>> +.\"
>> +.TH COPY 2 2017-08-07 "Linux" "Linux Programmer's Manual"
>> +.SH NAME
>> +destroy_creds \- destroy current user's file system credentials for a mount point
>> +.SH SYNOPSIS
>> +.nf
>> +.B #include <sys/syscall.h>
>> +.B #include <unistd.h>
>> +
>> +.BI "int destroy_creds(int " fd ");
>> +.fi
>> +.SH DESCRIPTION
>> +The
>> +.BR destroy ()
>> +system call performs destruction of file system credentials for the current
>> +user. It identifies the file system by the supplied file descriptor in
>> +.I fd
>> +that represents a mount point.
>> +
>> +.SH RETURN VALUE
>> +Upon successful completion,
>> +.BR destroy_creds ()
>> +will return 0.
>> +
>> +On error,
>> +.BR destroy_creds ()
>> +returns \-1 and
>> +.I errno
>> +is set to indicate the error.
>> +.SH ERRORS
>> +.TP
>> +.B EBADF
>> +.I fd
>> +file descriptor is not valid
>> +.TP
>> +.B EINVAL
>> +if the input file descriptor is not a directory
>> +.TP
>> +.B ENOENT
>> +no credentials found
>> +.TP
>> +.B EACCES
>> +unable to access credentials
>> +.TP
>> +.B ENOSYS
>> +file system does not implement destroy_creds() functionality
>> +.SH VERSIONS
>> +The
>> +.BR destroy_creds ()
>> +system call first appeared in Linux 4.1?.
>> +.SH CONFORMING TO
>> +The
>> +.BR destroy_creds ()
>> +system call is a nonstandard Linux extension.
>> +.SH NOTES
>> +
>> +.BR destroy_creds ()
>> +gives filesystems an opportunity to destroy credentials. For instance,
>> +NFS uses Kerberos credentials stored in Kerberos credential cache to
>> +create its security contexts that then are stored and managed by the
>> +kernel. Once the user logs out and destroys Kerberos credentials via
>> +kdestroy, NFS security contexts associate with that user are valid
>> +until they expire. fslogout application such provided by the example
>> +allows the user driven credential destruction in the file system.
>> +
>> +.SH EXAMPLE
>> +.nf
>> +#define _GNU_SOURCE
>> +#include <fcntl.h>
>> +#include <stdio.h>
>> +#include <stdlib.h>
>> +#include <sys/stat.h>
>> +#include <sys/syscall.h>
>> +#include <unistd.h>
>> +
>> +static int
>> +destroy_creds(int fd)
>> +{
>> +    return syscall(__NR_destroy_creds, fd);
>> +}
>> +
>> +int
>> +main(int argc, char **argv)
>> +{
>> +    int fd, ret;
>> +
>> +    if (argc != 2) {
>> +        fprintf(stderr, "Usage: %s <mount point>\\n", argv[0]);
>> +        exit(EXIT_FAILURE);
>> +    }
>> +
>> +    fd = open(argv[1], O_DIRECTORY|O_RDONLY);
>> +    if (fd == \-1) {
>> +        perror("open (argv[1])");
>> +        exit(EXIT_FAILURE);
>> +    }
>> +
>> +    ret = destroy_creds(fd);
>> +    if (ret == \-1) {
>> +        perror("destroy_creds");
>> +        exit(EXIT_FAILURE);
>> +    }
>> +
>> +    close(fd);
>> +    exit(EXIT_SUCCESS);
>> +}
>> +.fi
>
> Thanks, that helps a bit. I'm less clear on what the higher-level
> vision is here though:

My vision is simple: provide a simple userland application as is, and
let it be customized for whatever environment needs it.

> Are we all going to be running scripts on logout that scrape
> /proc/mounts and run fslogout on each?

Yes, I think this would be a good use case.
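
A minimal sketch of such a logout helper, assuming the RFC syscall is
available. The names fslogout_all() and is_nfs_type() are hypothetical,
and __NR_destroy_creds is defined only on a kernel with this patch applied,
so the wrapper falls back to ENOSYS elsewhere. The mount table is read
with getmntent(3) rather than by parsing /proc/mounts by hand.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <mntent.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical wrapper: __NR_destroy_creds exists only with the RFC patch. */
static int
destroy_creds(int fd)
{
#ifdef __NR_destroy_creds
    return syscall(__NR_destroy_creds, fd);
#else
    (void) fd;
    errno = ENOSYS;
    return -1;
#endif
}

/* Return 1 for the mount types this sketch treats as NFS (an assumption). */
static int
is_nfs_type(const char *fstype)
{
    return strcmp(fstype, "nfs") == 0 || strcmp(fstype, "nfs4") == 0;
}

/*
 * Walk a mount table (e.g. "/proc/mounts") and call destroy_creds() on
 * every NFS mount point.  Returns the number of NFS mounts visited, or
 * -1 if the table cannot be opened.
 */
static int
fslogout_all(const char *mounts_path)
{
    FILE *mounts = setmntent(mounts_path, "r");
    struct mntent *ent;
    int visited = 0;

    if (mounts == NULL)
        return -1;

    while ((ent = getmntent(mounts)) != NULL) {
        int fd;

        if (!is_nfs_type(ent->mnt_type))
            continue;
        visited++;

        fd = open(ent->mnt_dir, O_DIRECTORY | O_RDONLY);
        if (fd == -1) {
            perror(ent->mnt_dir);
            continue;
        }
        if (destroy_creds(fd) == -1)
            perror("destroy_creds");
        close(fd);
    }
    endmntent(mounts);
    return visited;
}
```

A logout script could call fslogout_all("/proc/mounts"); an administrator
who wants only specific mount points would filter on mnt_dir instead.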

> Will this be added to kdestroy?

At the time (http://marc.info/?l=linux-nfs&m=138246272628823&w=2),
Simo pointed out that adding something to kdestroy was not general
enough, as there might be other applications calling the krb5 libraries
directly and managing the credential cache. He suggested a standalone app
that is used as needed.

> Or are you aiming to have KCM do this on some trigger? (see:
> https://fedoraproject.org/wiki/Changes/KerberosKCMCache)

Again, I wasn't considering this because of
http://marc.info/?l=linux-nfs&m=138497973702474&w=2
where Greg Hudson wrote:
"Kerberos credential caches can be used for several different purposes;
they aren't only used to store login credentials.  For instance, a user could
run a server process which receives delegated credentials from a client, or
could run admin and get credentials for username/admin to administer the
realm's KDB. Notifying the kernel any time any credential cache is destroyed
would create a lot of false positives. I would be happy to have a pluggable
interface which allows for implementations of new ccache types, but I don't
think I would welcome a hook-style interface which causes ccache operations
to have arbitrary side effects beyond changing the ccache."

> Also, doing this per-mount seems wrong to me. Shouldn't this be done on
> a per-net-namespace basis or maybe even globally?

One use case that has nothing to do with a user logging out of the
system comes from scientific computing. In such an environment, there is
infrastructure that runs jobs under different uids, each with its own TGT.
But once a job is done and its creds are destroyed, they can't start the new
job until the NFS gss context expires. So the sequence would be: kinit, run
job, kdestroy, fslogout, kinit as a different user, and run another job.

Another, similar example comes from a site that uses microscopes for
large-scale data recording. They had a setup with a "single user" on
the machine, but in reality there are multiple human users (serialized, from
what I understand, one at a time), and when they kinit they store their creds
in the ticket cache of the "single user". But if a new user logs in before
the gss context expires, the NFS context in the kernel is still the one from
the old user.

Whether the user land application does a log out of all the mount points or
specific mount points I think should be configurable and something that an
administrator can manipulate.

Although I stated that it's too late for AFS to make use of this, AFS's
"unlog" provides an option to remove creds from a specific cell. Perhaps some
other future FS might want the ability to do the same. So doing it per
mount seems the right thing.

> It seems like we can afford to be rather cavalier about destroying
> creds here. Even if we purge creds for a user that should have remained
> valid, we just end up having to re-upcall for them, right?

Correct. If the user didn't destroy Kerberos credentials, then fslogout just
invalidates current creds and new creds will be acquired using the ticket cache.

> --
> Jeff Layton <jlayton@redhat.com>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
Andy Lutomirski Aug. 9, 2017, 4:08 p.m. UTC | #3
On Mon, Aug 7, 2017 at 2:23 PM, Olga Kornievskaia <kolga@netapp.com> wrote:
> destroy_creds() is a new system call for destroying file system
> credentials. This is usefulf for file systems that manage its
> own security contexts that were bootstrapped via some user land
> credentials (such as Kerberos).
>
> Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
> ---
>  man2/destroy_creds.2 | 130 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 130 insertions(+)
>  create mode 100644 man2/destroy_creds.2
>
> diff --git a/man2/destroy_creds.2 b/man2/destroy_creds.2
> new file mode 100644
> index 0000000..7b41c9d
> --- /dev/null
> +++ b/man2/destroy_creds.2
> @@ -0,0 +1,130 @@
> +.\"This manpage is Copyright (C) 2015 Olga Kornievskaia <kolga@Netapp.com>
> +.\"
> +.\" %%%LICENSE_START(VERBATIM)
> +.\" Permission is granted to make and distribute verbatim copies of this
> +.\" manual provided the copyright notice and this permission notice are
> +.\" preserved on all copies.
> +.\"
> +.\" Permission is granted to copy and distribute modified versions of
> +.\" this manual under the conditions for verbatim copying, provided that
> +.\" the entire resulting derived work is distributed under the terms of
> +.\" a permission notice identical to this one.
> +.\"
> +.\" Since the Linux kernel and libraries are constantly changing, this
> +.\" manual page may be incorrect or out-of-date.  The author(s) assume
> +.\" no responsibility for errors or omissions, or for damages resulting
> +.\" from the use of the information contained herein.  The author(s) may
> +.\" not have taken the same level of care in the production of this
> +.\" manual, which is licensed free of charge, as they might when working
> +.\" professionally.
> +.\"
> +.\" Formatted or processed versions of this manual, if unaccompanied by
> +.\" the source, must acknowledge the copyright and authors of this work.
> +.\" %%%LICENSE_END
> +.\"
> +.TH COPY 2 2017-08-07 "Linux" "Linux Programmer's Manual"
> +.SH NAME
> +destroy_creds \- destroy current user's file system credentials for a mount point
> +.SH SYNOPSIS
> +.nf
> +.B #include <sys/syscall.h>
> +.B #include <unistd.h>
> +
> +.BI "int destroy_creds(int " fd ");
> +.fi
> +.SH DESCRIPTION
> +The
> +.BR destroy ()
> +system call performs destruction of file system credentials for the current
> +user. It identifies the file system by the supplied file descriptor in
> +.I fd
> +that represents a mount point.

Does this mean that whatever credentials are used for the current
*fsuid* are destroyed?  Are there actually per-uid credentials in the
first place?

What privileges, if any, are needed to call this?

What if fd points to a bind mount?
Olga Kornievskaia Aug. 9, 2017, 4:44 p.m. UTC | #4
On Wed, Aug 9, 2017 at 12:08 PM, Andy Lutomirski <luto@amacapital.net> wrote:
> On Mon, Aug 7, 2017 at 2:23 PM, Olga Kornievskaia <kolga@netapp.com> wrote:
>> destroy_creds() is a new system call for destroying file system
>> credentials. This is usefulf for file systems that manage its
>> own security contexts that were bootstrapped via some user land
>> credentials (such as Kerberos).
>>
>> Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
>> ---
>>  man2/destroy_creds.2 | 130 +++++++++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 130 insertions(+)
>>  create mode 100644 man2/destroy_creds.2
>>
>> diff --git a/man2/destroy_creds.2 b/man2/destroy_creds.2
>> new file mode 100644
>> index 0000000..7b41c9d
>> --- /dev/null
>> +++ b/man2/destroy_creds.2
>> @@ -0,0 +1,130 @@
>> +.\"This manpage is Copyright (C) 2015 Olga Kornievskaia <kolga@Netapp.com>
>> +.\"
>> +.\" %%%LICENSE_START(VERBATIM)
>> +.\" Permission is granted to make and distribute verbatim copies of this
>> +.\" manual provided the copyright notice and this permission notice are
>> +.\" preserved on all copies.
>> +.\"
>> +.\" Permission is granted to copy and distribute modified versions of
>> +.\" this manual under the conditions for verbatim copying, provided that
>> +.\" the entire resulting derived work is distributed under the terms of
>> +.\" a permission notice identical to this one.
>> +.\"
>> +.\" Since the Linux kernel and libraries are constantly changing, this
>> +.\" manual page may be incorrect or out-of-date.  The author(s) assume
>> +.\" no responsibility for errors or omissions, or for damages resulting
>> +.\" from the use of the information contained herein.  The author(s) may
>> +.\" not have taken the same level of care in the production of this
>> +.\" manual, which is licensed free of charge, as they might when working
>> +.\" professionally.
>> +.\"
>> +.\" Formatted or processed versions of this manual, if unaccompanied by
>> +.\" the source, must acknowledge the copyright and authors of this work.
>> +.\" %%%LICENSE_END
>> +.\"
>> +.TH COPY 2 2017-08-07 "Linux" "Linux Programmer's Manual"
>> +.SH NAME
>> +destroy_creds \- destroy current user's file system credentials for a mount point
>> +.SH SYNOPSIS
>> +.nf
>> +.B #include <sys/syscall.h>
>> +.B #include <unistd.h>
>> +
>> +.BI "int destroy_creds(int " fd ");
>> +.fi
>> +.SH DESCRIPTION
>> +The
>> +.BR destroy ()
>> +system call performs destruction of file system credentials for the current
>> +user. It identifies the file system by the supplied file descriptor in
>> +.I fd
>> +that represents a mount point.
>
> Does this mean that whatever credentials are used for the current
> *fsuid* are destroyed?

It allows a filesystem to remove in-kernel credentials associated with
the current user (fsuid). A file system like NFS bootstraps its
in-kernel credentials with credentials stored in the Kerberos ticket
cache. Running the example "fslogout" will not remove the Kerberos ticket
cache (which is typically done by running kdestroy).  Running fslogout
without running kdestroy invalidates the current in-kernel credentials, but
NFS will acquire new ones using the still-existing Kerberos ticket
cache.

> Are there actually per-uid credentials in the first place?

For something like NFS, yes. AFS and CIFS too. Since no system call
like this was available, AFS has its own mechanism for removing
credentials (unlog). Linux CIFS security is based on Kerberos too but
is not currently implemented. It could benefit from a system call of
this sort.

> What privileges, if any, are needed to call this?

No privileges. A file system implementing destroy_creds() should
remove credentials associated with the credentials of the running user
context (fsuid).

> What if fd points to a bind mount?

It's just like the original mount. Removing credentials on either
of the mount points (original or bind) will remove credentials for
the user.
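
For illustration, the ERRORS list from the proposed man page can be turned
into messages for a tool like fslogout. This is only a sketch: the function
names are hypothetical, and treating ENOENT as non-fatal (nothing cached,
so nothing to destroy) is an assumption, not something the patch specifies.

```c
#include <errno.h>

/* Map the errno values listed in the proposed man page to short messages. */
static const char *
destroy_creds_strerror(int err)
{
    switch (err) {
    case EBADF:  return "fd is not a valid file descriptor";
    case EINVAL: return "fd does not refer to a directory";
    case ENOENT: return "no credentials found";
    case EACCES: return "unable to access credentials";
    case ENOSYS: return "file system does not implement destroy_creds()";
    default:     return "unexpected error";
    }
}

/*
 * Hypothetical policy for a logout tool: if no credentials were cached
 * for the caller, there was nothing to destroy, so do not fail.
 */
static int
fslogout_error_is_fatal(int err)
{
    return err != ENOENT;
}
```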
NeilBrown Aug. 11, 2017, 7:17 a.m. UTC | #5
On Wed, Aug 09 2017, Jeff Layton wrote:
....
>
> Thanks, that helps a bit. I'm less clear on what the higher-level
> vision is here though:
>
> Are we all going to be running scripts on logout that scrape
> /proc/mounts and run fslogout on each? Will this be added to kdestroy?
>
> Or are you aiming to have KCM do this on some trigger? (see:
> https://fedoraproject.org/wiki/Changes/KerberosKCMCache)
>
> Also, doing this per-mount seems wrong to me. Shouldn't this be done on
> a per-net-namespace basis or maybe even globally?

Having looked at the code, I think this is invalidating cached
credentials globally -- or at least, globally for all filesystems that
use sunrpc.

I actually question the premise for wanting to do this.  Tickets have a
timeout and will expire.  Any code that is allowed to get a ticket, can
hold on to it as long as it likes - but it will cease to work after the
expiry time.  Hunting out all the places that a key might be cached, and
invalidating them, seems to deviate from the model.  If you are concerned
about leaving credentials around where they can theoretically be
misused, then set a smaller expiry time.

What is the threat-model that this change is supposed to guard against?

Looking at the syscall itself:
 1/ why restrict the call to directories only?
 2/ Every new syscall should have a 'flags' argument, because you never
    know when you'll need one.

NeilBrown

   
>
> It seems like we can afford to be rather cavalier about destroying
> creds here. Even if we purge creds for a user that should have remained
> valid, we just end up having to re-upcall for them, right?
> -- 
> Jeff Layton <jlayton@redhat.com>
Jeff Layton Aug. 11, 2017, 11:18 a.m. UTC | #6
On Fri, 2017-08-11 at 17:17 +1000, NeilBrown wrote:
> On Wed, Aug 09 2017, Jeff Layton wrote:
> ....
> > 
> > Thanks, that helps a bit. I'm less clear on what the higher-level
> > vision is here though:
> > 
> > Are we all going to be running scripts on logout that scrape
> > /proc/mounts and run fslogout on each? Will this be added to kdestroy?
> > 
> > Or are you aiming to have KCM do this on some trigger? (see:
> > https://fedoraproject.org/wiki/Changes/KerberosKCMCache)
> > 
> > Also, doing this per-mount seems wrong to me. Shouldn't this be done on
> > a per-net-namespace basis or maybe even globally?
> 
> Having looked at the code, I think this is invalidating cached
> credentials globally -- or at least, globally for all filesystems that
> use sunrpc.
> 
> I actually question the premise for wanting to do this.  Tickets have a
> timeout and will expire.  Any code that is allowed to get a ticket, can
> hold on to it as long as it likes - but it will cease to work after the
> expiry time.  Hunting out all the places that a key might be cached, and
> invalidating them, seems to deviate from the model.  If you are concerned
> about leaving credentials around where they can theoretically be
> misused, then set a smaller expiry time.
> 
> What is the threat-model that this change is supposed to guard against?
> 
> Looking that the syscall itself:
>  1/ why restrict the call to directories only?
>  2/ Every new syscall should have a 'flags' argument, because you never
>     know when you'll need one.
> 

I have some of the same concerns. For instance, we don't kill off ssh
sessions that were established with krb5 just because the credcache was
destroyed. RPC is a bit different since we authenticate every call, but
is this fundamentally different from keeping an ssh session around that
was established before the credcache was destroyed?

Are we just getting tickets with too long a lifetime here? Maybe we just
need to be more cavalier about destroying cached creds on some event or
on a more timely basis?

Also, the whole gssapi credcache in the kernel is showing its age a bit.
struct auth_cred has had this over it for about as long as I've been
doing kernel work:

    /* Work around the lack of a VFS credential */

We've had struct cred for ages now.

David and I were chatting about this the other day and were wondering if
we could change the RPC gssapi code to cache credentials in one of the
keyrings in struct cred. Then, once the struct cred goes away, the key
would go away as well. It wouldn't be destroyed on kdestroy, but once
the last process with those creds exits, they would go away.
Olga Kornievskaia Aug. 11, 2017, 1:37 p.m. UTC | #7
> On Aug 11, 2017, at 3:17 AM, NeilBrown <neilb@suse.com> wrote:
> 
> On Wed, Aug 09 2017, Jeff Layton wrote:
> ....
>> 
>> Thanks, that helps a bit. I'm less clear on what the higher-level
>> vision is here though:
>> 
>> Are we all going to be running scripts on logout that scrape
>> /proc/mounts and run fslogout on each? Will this be added to kdestroy?
>> 
>> Or are you aiming to have KCM do this on some trigger? (see:
>> https://fedoraproject.org/wiki/Changes/KerberosKCMCache)
>> 
>> Also, doing this per-mount seems wrong to me. Shouldn't this be done on
>> a per-net-namespace basis or maybe even globally?
> 
> Having looked at the code, I think this is invalidating cached
> credentials globally -- or at least, globally for all filesystems that
> use sunrpc.

Yes, all filesystems that use sunrpc could benefit by calling the
same routine that NFS calls. It only does it per “auth” flavor. If you
have mounts with multiple flavors, only the specified one is affected.

> I actually question the premise for wanting to do this.  Tickets have a
> timeout and will expire.  Any code that is allowed to get a ticket, can
> hold on to it as long as it likes - but it will cease to work after the
> expiry time.

However, when kdestroy is called, any code that tries to use the
ticket will fail. Userland is unaware that the kernel has cached the
user's credentials.

>  Hunting out all the places that a key might be cached, and
> invalidating them, seems to deviate from the model.  

No caching should be valid after credentials were explicitly removed. 

> If you are concerned
> about leaving credentials around where they can theoretically be
> misused, then set a smaller expiry time.

That’s correct. The only option left to the people who have complained
about this is to use short-lived credentials. But security context
establishment is not free and impacts performance.

> What is the threat-model that this change is supposed to guard against?

It’s a limitation of the system that I feel can be solved by providing an
extension to kdestroy that destroys FS creds.

What’s the disadvantage of providing this feature? There are folks who 
have been asking for it. It tightens up the security.

> Looking that the syscall itself:
> 1/ why restrict the call to directories only?

No real reason. It seems unlikely that a practical unlog application would
open a filesystem-specific file and then call unlog on that file. It could be
problematic, since without creds no close can be done, which would leave state
on the server?

> 2/ Every new syscall should have a 'flags' argument, because you never
>    know when you'll need one.

Ok.
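
A rough sketch of what the interface might look like with a flags
argument. Everything here is an assumption: destroy_creds2() and
DESTROY_CREDS_VALID_FLAGS are hypothetical names, no flags are defined by
the RFC, and __NR_destroy_creds exists only with the patch applied. The
point is the usual convention that unknown flag bits are rejected with
EINVAL so they can be given meaning later.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical: no flag bits are defined yet, so the valid mask is empty. */
#define DESTROY_CREDS_VALID_FLAGS 0u

static int
destroy_creds2(int fd, unsigned int flags)
{
    /* Reject unknown bits so they remain available for future use. */
    if (flags & ~DESTROY_CREDS_VALID_FLAGS) {
        errno = EINVAL;
        return -1;
    }
#ifdef __NR_destroy_creds
    return syscall(__NR_destroy_creds, fd);
#else
    (void) fd;
    errno = ENOSYS;
    return -1;
#endif
}
```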

> 
> NeilBrown
> 
> 
>> 
>> It seems like we can afford to be rather cavalier about destroying
>> creds here. Even if we purge creds for a user that should have remained
>> valid, we just end up having to re-upcall for them, right?
>> -- 
>> Jeff Layton <jlayton@redhat.com>
Olga Kornievskaia Aug. 11, 2017, 2:05 p.m. UTC | #8
On Fri, Aug 11, 2017 at 7:18 AM, Jeff Layton <jlayton@redhat.com> wrote:
> On Fri, 2017-08-11 at 17:17 +1000, NeilBrown wrote:
>> On Wed, Aug 09 2017, Jeff Layton wrote:
>> ....
>> >
>> > Thanks, that helps a bit. I'm less clear on what the higher-level
>> > vision is here though:
>> >
>> > Are we all going to be running scripts on logout that scrape
>> > /proc/mounts and run fslogout on each? Will this be added to kdestroy?
>> >
>> > Or are you aiming to have KCM do this on some trigger? (see:
>> > https://fedoraproject.org/wiki/Changes/KerberosKCMCache)
>> >
>> > Also, doing this per-mount seems wrong to me. Shouldn't this be done on
>> > a per-net-namespace basis or maybe even globally?
>>
>> Having looked at the code, I think this is invalidating cached
>> credentials globally -- or at least, globally for all filesystems that
>> use sunrpc.
>>
>> I actually question the premise for wanting to do this.  Tickets have a
>> timeout and will expire.  Any code that is allowed to get a ticket, can
>> hold on to it as long as it likes - but it will cease to work after the
>> expiry time.  Hunting out all the places that a key might be cached, and
>> invalidating them, seems to deviate from the model.  If you are concerned
>> about leaving credentials around where they can theoretically be
>> misused, then set a smaller expiry time.
>>
>> What is the threat-model that this change is supposed to guard against?
>>
>> Looking that the syscall itself:
>>  1/ why restrict the call to directories only?
>>  2/ Every new syscall should have a 'flags' argument, because you never
>>     know when you'll need one.
>>
>
> I have some of the same concerns. For instance, we don't kill off ssh
> sessions that were established with krb5 just because the credcache was
> destroyed. RPC is a bit different since we authenticate every call, but
> is this fundamentally different from keeping an ssh session around that
> was established before the credcache was destroyed?

Probably because fundamentally, it’s the same user that keeps using it.
If the same ssh connection was shared by multiple users that were inserting
and deleting their credentials then it would be as problematic.

>
> Are we just getting tickets with too long a lifetime here? Maybe we just
> need to be more cavalier about destroying cached creds on some event or
> on a more timely basis?
>
> Also, the whole gssapi credcache in the kernel is showing its age a bit.
> struct auth_cred has had this over it for about as long as I've been
> doing kernel work:
>
>     /* Work around the lack of a VFS credential */
>
> We've had struct cred for ages now.
>
> David and I were chatting about this the other day and were wondering if
> we could change the RPC gssapi code to cache credentials in one of the
> keyrings in struct cred. Then, once the struct cred goes away, the key
> would go away as well. It wouldn't be destroyed on kdestroy, but once
> the last process with those creds exits, they would go away.

One argument against it: Kerberos has changed its storage location
over the years (FILES … to keyring). What if it changes again? Then NFS
would have to change its implementation as well.

Having said that: outside of the fs mailing list, I have asked Trond what
the alternative would be if VFS decides to reject the syscall idea, and
one of the choices is the keyring. Of course, there are variations on
how the keyring would be used. One option would be to switch entirely to
storing credentials in the keyring. Another is what Andy had originally
proposed: introducing a gss key type and storing the gss context in
the keyring.

>
> --
> Jeff Layton <jlayton@redhat.com>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
Olga Kornievskaia Aug. 11, 2017, 2:09 p.m. UTC | #9
apologies for duplicates (first attempt bounced from the mailing lists)

On Fri, Aug 11, 2017 at 3:17 AM, NeilBrown <neilb@suse.com> wrote:
> On Wed, Aug 09 2017, Jeff Layton wrote:
> ....
>>
>> Thanks, that helps a bit. I'm less clear on what the higher-level
>> vision is here though:
>>
>> Are we all going to be running scripts on logout that scrape
>> /proc/mounts and run fslogout on each? Will this be added to kdestroy?
>>
>> Or are you aiming to have KCM do this on some trigger? (see:
>> https://fedoraproject.org/wiki/Changes/KerberosKCMCache)
>>
>> Also, doing this per-mount seems wrong to me. Shouldn't this be done on
>> a per-net-namespace basis or maybe even globally?
>
> Having looked at the code, I think this is invalidating cached
> credentials globally -- or at least, globally for all filesystems that
> use sunrpc.

Yes, all filesystems that use sunrpc could benefit from this by calling the
same routine that NFS calls. It only does it per “auth” flavor. If you
have mounts with multiple flavors, only the specified one is affected.

>
> I actually question the premise for wanting to do this.  Tickets have a
> timeout and will expire.  Any code that is allowed to get a ticket, can
> hold on to it as long as it likes - but it will cease to work after the
> expiry time.

However, when kdestroy is called, then any code that tries to use it
will yet. User land is unaware that the kernel has cached his
credentials.

> Hunting out all the places that a key might be cached, and
> invalidating them, seems to deviate from the model.

No caching should be valid after credentials were explicitly removed.

> If you are concerned
> about leaving credentials around where they can theoretically be
> misused, then set a smaller expiry time.

That’s correct. The only means left to the people who have complained
about this is using short-lived credentials. But security context
establishment is not free and impacts performance.

>
> What is the threat-model that this change is supposed to guard against?

It’s a limitation of the system that I feel has a solution: providing an
extension to kdestroy that destroys FS creds.

What’s the disadvantage of providing this feature? There are folks who
have been asking for it. It tightens up the security.

>
> Looking that the syscall itself:
>  1/ why restrict the call to directories only?

No real reason. It seems unlikely that a practical unlog application would
open a filesystem-specific file and then call unlog on that file. It would
also be problematic: without creds no close can be done, which would leave
state on the server?

>  2/ Every new syscall should have a 'flags' argument, because you never
>     know when you'll need one.
>
> NeilBrown
>
>
>>
>> It seems like we can afford to be rather cavalier about destroying
>> creds here. Even if we purge creds for a user that should have remained
>> valid, we just end up having to re-upcall for them, right?

Ok.

>> --
>> Jeff Layton <jlayton@redhat.com>
Jeff Layton Aug. 11, 2017, 2:22 p.m. UTC | #10
On Fri, 2017-08-11 at 09:49 -0400, Olga Kornievskaia wrote:
> > On Aug 11, 2017, at 7:18 AM, Jeff Layton <jlayton@redhat.com> wrote:
> > 
> > On Fri, 2017-08-11 at 17:17 +1000, NeilBrown wrote:
> > > On Wed, Aug 09 2017, Jeff Layton wrote:
> > > ....
> > > > Thanks, that helps a bit. I'm less clear on what the higher-level
> > > > vision is here though:
> > > > 
> > > > Are we all going to be running scripts on logout that scrape
> > > > /proc/mounts and run fslogout on each? Will this be added to kdestroy?
> > > > 
> > > > Or are you aiming to have KCM do this on some trigger? (see:
> > > > https://fedoraproject.org/wiki/Changes/KerberosKCMCache)
> > > > 
> > > > Also, doing this per-mount seems wrong to me. Shouldn't this be done on
> > > > a per-net-namespace basis or maybe even globally?
> > > 
> > > Having looked at the code, I think this is invalidating cached
> > > credentials globally -- or at least, globally for all filesystems that
> > > use sunrpc.
> > > 
> > > I actually question the premise for wanting to do this.  Tickets have a
> > > timeout and will expire.  Any code that is allowed to get a ticket, can
> > > hold on to it as long as it likes - but it will cease to work after the
> > > expiry time.  Hunting out all the places that a key might be cached, and
> > > invalidating them, seems to deviate from the model.  If you are concerned
> > > about leaving credentials around where they can theoretically be
> > > misused, then set a smaller expiry time.
> > > 
> > > What is the threat-model that this change is supposed to guard against?
> > > 
> > > Looking that the syscall itself:
> > > 1/ why restrict the call to directories only?
> > > 2/ Every new syscall should have a 'flags' argument, because you never
> > >    know when you'll need one.
> > > 
> > 
> > I have some of the same concerns. For instance, we don't kill off ssh
> > sessions that were established with krb5 just because the credcache was
> > destroyed. RPC is a bit different since we authenticate every call, but
> > is this fundamentally different from keeping an ssh session around that
> > was established before the credcache was destroyed?
> 
> Probably because fundamentally, it’s the same user that keeps using it.
> If the same ssh connection was shared by multiple users that were inserting
> and deleting their credentials then it would be as problematic.
> 
> > Are we just getting tickets with too long a lifetime here? Maybe we just
> > need to be more cavalier about destroying cached creds on some event or
> > on a more timely basis?
> > 
> > Also, the whole gssapi credcache in the kernel is showing its age a bit.
> > struct auth_cred has had this over it for about as long as I've been
> > doing kernel work:
> > 
> >    /* Work around the lack of a VFS credential */
> > 
> > We've had struct cred for ages now.
> > 
> > David and I were chatting about this the other day and were wondering if
> > we could change the RPC gssapi code to cache credentials in one of the
> > keyrings in struct cred. Then, once the struct cred goes away, the key
> > would go away as well. It wouldn't be destroyed on kdestroy, but once
> > the last process with those creds exits, they would go away.
> 
> One argument against it: Kerberos has changed their storage location 
> over the years (FILES … to keyring). What if they change again? Then NFS 
> would have to change their implementation as well.
> 
> Having said that: outside of the fs-mailing list, I have asked Trond that
> if VFS decides to reject the syscall idea, what would be an alternative 
> and one of the choices is the keyring. Of course there are variations of 
> how the keyring would be used. One option would be to totally switch to 
> storing credentials in the keyring. To what what Andy had originally 
> proposed of introducing a gss key type and storing the gss context in 
> the keyring.
> 
> 

I think I wasn't clear here. I'm not proposing that you move everyone to
KEYRING: credcaches. This would not be a visible change to userland.
We'd still use rpc.gssd to upcall for creds.

What I'm saying is that instead of storing the creds in a hashtable like
we do today, we'd just stash them in one of the keyrings hanging off of
struct cred.

Change all of the authgss_ops operations to do query/store from the
appropriate keyring directly. With that, the effective lifetime of
GSSAPI creds would be bounded by the lifetime of the keyrings that hold
references to it.

We'd probably need a new key_type for this to ensure that this couldn't
be manipulated directly from userland. Or...maybe you'd still want to
allow userland to destroy the creds? No need for a new syscall with that
-- they can just do a "keyctl unlink". There are a lot of options here.

It's a non-trivial amount of work though (rpcauth_lookupcred() on down
would probably need to be reworked) and I haven't looked at it in detail.
Still, it seems like it could be a more modern and cleaner design than
what we have today.
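
For reference, the userland side of the "keyctl unlink" idea could look roughly like the sketch below. It assumes the GSS context would be exposed as a key linked into the caller's session keyring, which nothing in the kernel does today; unlink_from_session_keyring() is a name made up for illustration. Only the keyctl(2) KEYCTL_UNLINK operation itself is existing API.

```c
#include <linux/keyctl.h>   /* KEYCTL_UNLINK, KEY_SPEC_SESSION_KEYRING */
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: drop the link to key `id` from the caller's
 * session keyring. Which keyring a GSS context key would actually be
 * linked into is an assumption; the thread leaves that open.
 * Returns 0 on success, or -1 with errno set on failure.
 */
static long
unlink_from_session_keyring(int32_t id)
{
    return syscall(SYS_keyctl, (long)KEYCTL_UNLINK, (long)id,
                   (long)KEY_SPEC_SESSION_KEYRING);
}
```

A key is only garbage collected once its last link is gone, so whether this is enough to destroy the context depends on who else holds a reference.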
Trond Myklebust Aug. 11, 2017, 3:12 p.m. UTC | #11
On Fri, 2017-08-11 at 10:22 -0400, Jeff Layton wrote:
> I think I wasn't clear here. I'm not proposing that you move everyone
> to KEYRING: credcaches. This would not be a visible change to userland.
> We'd still use rpc.gssd to upcall for creds.
>
> What I'm saying is that instead of storing the creds in a hashtable
> like we do today, we'd just stash them in one of the keyrings hanging
> off of struct cred.
>
> Change all of the authgss_ops operations to do query/store from the
> appropriate keyring directly. With that, the effective lifetime of
> GSSAPI creds would be bounded by the lifetime of the keyrings that
> hold references to it.
>
> We'd probably need a new key_type for this to ensure that this
> couldn't be manipulated directly from userland. Or...maybe you'd still
> want to allow userland to destroy the creds? No need for a new syscall
> with that -- they can just do a "keyctl unlink". There are a lot of
> options here.
>
> It's a non-trivial amount of work though (rpcauth_lookupcred() on
> down would probably need to be reworked) and I haven't looked at it
> detail. Still, it seems like it could be a more modern and cleaner
> design than what we have today.

The main annoyance with going from a global to a local cache such as
the keyrings is that it makes comparing credentials a lot more work.
Today, because the credentials are essentially unique per server, we
just do pointer comparisons. Once we have non-global caches, we would
need to do more elaborate comparisons to ensure that the uid, gid, and
list of groups match.
That's also why we never made the leap to using 'struct cred', btw...
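
To make the cost difference concrete, here is a userspace sketch of the two kinds of comparison. struct demo_cred and both function names are invented for illustration and are much simpler than the real struct rpc_cred or struct cred; the group list is assumed to be sorted so that a byte-wise compare is a valid set compare.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical, simplified credential; not a kernel structure. */
struct demo_cred {
    uid_t        uid;
    gid_t        gid;
    size_t       ngroups;
    const gid_t *groups;    /* assumed sorted */
};

/* Global cache: one object per identity, so identity equals address. */
static bool
cred_match_global(const struct demo_cred *a, const struct demo_cred *b)
{
    return a == b;
}

/* Local caches: distinct objects may describe the same identity,
 * so every field has to be compared. */
static bool
cred_match_local(const struct demo_cred *a, const struct demo_cred *b)
{
    assert(a != NULL && b != NULL);
    if (a->uid != b->uid || a->gid != b->gid || a->ngroups != b->ngroups)
        return false;
    return a->ngroups == 0 ||
           memcmp(a->groups, b->groups, a->ngroups * sizeof(gid_t)) == 0;
}
```

The first check is constant time; the second grows with the length of the group list and has to run on every lookup.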

-- 
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.myklebust@primarydata.com
Jeff Layton Aug. 13, 2017, 11:38 a.m. UTC | #12
On Fri, 2017-08-11 at 15:12 +0000, Trond Myklebust wrote:
> On Fri, 2017-08-11 at 10:22 -0400, Jeff Layton wrote:
> > I think I wasn't clear here. I'm not proposing that you move everyone
> > to
> > KEYRING: credcaches. This would not be a visible change to userland.
> > We'd still use rpc.gssd to upcall for creds.
> > 
> > What I'm saying is that instead of storing the creds in a hashtable
> > like
> > we do today, we'd just stash them in one of the keyrings hanging off
> > of
> > struct cred.
> > 
> > Change all of the authgss_ops operations to do query/store from the
> > appropriate keyring directly. With that, the effective lifetime of
> > GSSAPI creds would be bounded by the lifetime of the keyrings that
> > hold
> > references to it.
> > 
> > We'd probably need a new key_type for this to ensure that this
> > couldn't
> > be manipulated directly from userland. Or...maybe you'd still want to
> > allow userland to destroy the creds? No need for a new syscall with
> > that
> > -- they can just do a "keyctl unlink". There are a lot of options
> > here.
> > 
> > It's a non-trivial amount of work though (rpcauth_lookupcred() on
> > down
> > would probably need to be reworked) and I haven't looked at it
> > detail.
> > Still, it seems like it could be a more modern and cleaner design
> > than
> > what we have today.
> > 
> 
> The main annoyance with going from a global to a local cache such as
> the keyrings is that it makes comparing credentials a lot more work.
> Today, because the credentials are essentially unique per server, we
> just do pointer comparisons. Once we have non-global caches, we would
> need to do more elaborate comparisons to ensure that the uid, gid, and
> list of groups match.
> That's also why we never made the leap to using 'struct cred', btw...


Ok, it does seem better to have a global cache from that standpoint.
Still, a new syscall for this doesn't seem very elegant. I also worry a
bit about writeback here too (like David and Neil have pointed out).

What about changing how we hold references on these objects instead?

After we look up an auth token in e.g. rpcauth_lookupcred, take a
reference to it and stash a pointer to it somewhere in the cred.
Possibly in the thread or process keyrings, but it may work better
elsewhere.

When we go to look up creds from that thread in the future, we can get
to it directly (which is a nice bonus). When the cred is destroyed
(usually on process destruction), we'd drop the reference to the object,
which would drop the reference to the global cache object.

The global cache could then be changed to have a pretty short timeout (a
few seconds?) and reap the object soon afterward when there are no more
active processes that have used it.

It's a bit more work and we might need to grow struct cred to handle it
(maybe give it its own keyring?), but it seems like that might be a
cleaner solution than giving userland knobs to manage the kernel's
caches.
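
A rough userspace sketch of the bookkeeping I have in mind follows. All of the names are made up, the 5 second grace period is just a placeholder for "a few seconds", and real kernel code would need atomic refcounts and locking, which are left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Placeholder grace period; the right value is an open question. */
#define REAP_GRACE_SECONDS 5

/* Hypothetical global-cache entry for one GSS context. */
struct demo_ctx {
    int    holders;     /* creds currently holding a reference */
    time_t idle_since;  /* when holders last dropped to zero */
};

/* Called when a cred stashes a pointer to the entry. */
static void
ctx_get(struct demo_ctx *c)
{
    c->holders++;
}

/* Called when a cred is destroyed; records when the entry went idle. */
static void
ctx_put(struct demo_ctx *c, time_t now)
{
    assert(c->holders > 0);
    if (--c->holders == 0)
        c->idle_since = now;
}

/* The cache reaper frees an entry only once it has been idle long enough. */
static bool
ctx_should_reap(const struct demo_ctx *c, time_t now)
{
    return c->holders == 0 && now - c->idle_since >= REAP_GRACE_SECONDS;
}
```

As long as any cred holds a reference, the entry is never reaped, no matter how much time passes.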
Olga Kornievskaia Aug. 14, 2017, 3:43 p.m. UTC | #13
On Sun, Aug 13, 2017 at 7:38 AM, Jeff Layton <jlayton@redhat.com> wrote:
> On Fri, 2017-08-11 at 15:12 +0000, Trond Myklebust wrote:
>> On Fri, 2017-08-11 at 10:22 -0400, Jeff Layton wrote:
>> > I think I wasn't clear here. I'm not proposing that you move everyone
>> > to
>> > KEYRING: credcaches. This would not be a visible change to userland.
>> > We'd still use rpc.gssd to upcall for creds.
>> >
>> > What I'm saying is that instead of storing the creds in a hashtable
>> > like
>> > we do today, we'd just stash them in one of the keyrings hanging off
>> > of
>> > struct cred.
>> >
>> > Change all of the authgss_ops operations to do query/store from the
>> > appropriate keyring directly. With that, the effective lifetime of
>> > GSSAPI creds would be bounded by the lifetime of the keyrings that
>> > hold
>> > references to it.
>> >
>> > We'd probably need a new key_type for this to ensure that this
>> > couldn't
>> > be manipulated directly from userland. Or...maybe you'd still want to
>> > allow userland to destroy the creds? No need for a new syscall with
>> > that
>> > -- they can just do a "keyctl unlink". There are a lot of options
>> > here.
>> >
>> > It's a non-trivial amount of work though (rpcauth_lookupcred() on
>> > down
>> > would probably need to be reworked) and I haven't looked at it
>> > detail.
>> > Still, it seems like it could be a more modern and cleaner design
>> > than
>> > what we have today.
>> >
>>
>> The main annoyance with going from a global to a local cache such as
>> the keyrings is that it makes comparing credentials a lot more work.
>> Today, because the credentials are essentially unique per server, we
>> just do pointer comparisons. Once we have non-global caches, we would
>> need to do more elaborate comparisons to ensure that the uid, gid, and
>> list of groups match.
>> That's also why we never made the leap to using 'struct cred', btw...
>
>
> Ok, it does seem better to have a global cache from that standpoint.
> Still, a new syscall for this doesn't seem very elegant. I also worry a
> bit about writeback here too (like David and Neil have pointed out).
>
> What about changing how we hold references on these objects instead?
>
> After we look up an auth token in e.g. rpcauth_lookupcred, take a
> reference to it and stash a pointer to it somewhere in the cred.
> Possibly in the thread or process keyrings, but it may work better
> elsewhere.
>
> When we go to look up creds from that thread in the future, we can get
> to it directly (which is a nice bonus). When the cred is destroyed
> (usually on process destruction), we'd drop the reference to the object,
> which would drop the reference to the global cache object.
>
> The global cache could then be changed to have a pretty short timeout (a
> few seconds?) and reap the object soon afterward when there are no more
> active processes that have used it.

Wouldn’t that produce a lot of unnecessary context re-establishments?

> It's a bit more work and we might need to grow struct cred to handle it
> (maybe give it its own keyring?), but it seems like that might be a
> cleaner solution than giving userland knobs to manage the kernel's
> caches.

Userland is the only place that knows that kdestroy ran and is the best
place to tell the kernel to remove its cache. Everything else is guessing.

> --
> Jeff Layton <jlayton@redhat.com>
Jeff Layton Aug. 14, 2017, 3:59 p.m. UTC | #14
On Mon, 2017-08-14 at 11:15 -0400, Olga Kornievskaia wrote:
> > On Aug 13, 2017, at 7:38 AM, Jeff Layton <jlayton@redhat.com> wrote:
> > 
> > On Fri, 2017-08-11 at 15:12 +0000, Trond Myklebust wrote:
> > > On Fri, 2017-08-11 at 10:22 -0400, Jeff Layton wrote:
> > > > I think I wasn't clear here. I'm not proposing that you move everyone
> > > > to
> > > > KEYRING: credcaches. This would not be a visible change to userland.
> > > > We'd still use rpc.gssd to upcall for creds.
> > > > 
> > > > What I'm saying is that instead of storing the creds in a hashtable
> > > > like
> > > > we do today, we'd just stash them in one of the keyrings hanging off
> > > > of
> > > > struct cred.
> > > > 
> > > > Change all of the authgss_ops operations to do query/store from the
> > > > appropriate keyring directly. With that, the effective lifetime of
> > > > GSSAPI creds would be bounded by the lifetime of the keyrings that
> > > > hold
> > > > references to it.
> > > > 
> > > > We'd probably need a new key_type for this to ensure that this
> > > > couldn't
> > > > be manipulated directly from userland. Or...maybe you'd still want to
> > > > allow userland to destroy the creds? No need for a new syscall with
> > > > that
> > > > -- they can just do a "keyctl unlink". There are a lot of options
> > > > here.
> > > > 
> > > > It's a non-trivial amount of work though (rpcauth_lookupcred() on
> > > > down
> > > > would probably need to be reworked) and I haven't looked at it
> > > > detail.
> > > > Still, it seems like it could be a more modern and cleaner design
> > > > than
> > > > what we have today.
> > > > 
> > > 
> > > The main annoyance with going from a global to a local cache such as
> > > the keyrings is that it makes comparing credentials a lot more work.
> > > Today, because the credentials are essentially unique per server, we
> > > just do pointer comparisons. Once we have non-global caches, we would
> > > need to do more elaborate comparisons to ensure that the uid, gid, and
> > > list of groups match.
> > > That's also why we never made the leap to using 'struct cred', btw...
> > 
> > Ok, it does seem better to have a global cache from that standpoint.
> > Still, a new syscall for this doesn't seem very elegant. I also worry a
> > bit about writeback here too (like David and Neil have pointed out).
> > 
> > What about changing how we hold references on these objects instead?
> > 
> > After we look up an auth token in e.g. rpcauth_lookupcred, take a
> > reference to it and stash a pointer to it somewhere in the cred.
> > Possibly in the thread or process keyrings, but it may work better
> > elsewhere.
> > 
> > When we go to look up creds from that thread in the future, we can get
> > to it directly (which is a nice bonus). When the cred is destroyed
> > (usually on process destruction), we'd drop the reference to the object,
> > which would drop the reference to the global cache object.
> > 
> > The global cache could then be changed to have a pretty short timeout (a
> > few seconds?) and reap the object soon afterward when there are no more
> > active processes that have used it.
> 
> Wouldn’t that produce a lot of unnecessary context re-establishments. 
> 

I wouldn't think so.

As long as there is an outstanding struct cred that holds a reference to
the rpc cred, then the context will stick around and you shouldn't need
to upcall. Even if all you have is short-lived tasks, you still will
only need to upcall at the rate of the cache timeout, at the max.

Granted, timing out caches like this is a bit of a black art, and I'm
assuming that a small delay (<1 minute) between struct cred destruction
and the context destruction would be ok.

> > It's a bit more work and we might need to grow struct cred to handle it
> > (maybe give it its own keyring?), but it seems like that might be a
> > cleaner solution than giving userland knobs to manage the kernel's
> > caches.
> 
> Userland is the only place that know that kdestroy ran and is the best
> place to tell the kernel to remove its cache. Everything else is guessing.

This doesn't necessarily preclude destroying them manually. If you store
the key in a keyring you could still manually purge that reference with
a keyctl_unlink(). This approach would also mean you wouldn't need a new
syscall as well.

Regardless...I think there is a lot of mileage to be gained out of
handling cache timeouts intelligently, if for no other reason than to
have sane context timeouts for environments that can't or won't
destroy the creds when the cache is destroyed.

Patch

diff --git a/man2/destroy_creds.2 b/man2/destroy_creds.2
new file mode 100644
index 0000000..7b41c9d
--- /dev/null
+++ b/man2/destroy_creds.2
@@ -0,0 +1,130 @@ 
+.\"This manpage is Copyright (C) 2015 Olga Kornievskaia <kolga@Netapp.com>
+.\"
+.\" %%%LICENSE_START(VERBATIM)
+.\" Permission is granted to make and distribute verbatim copies of this
+.\" manual provided the copyright notice and this permission notice are
+.\" preserved on all copies.
+.\"
+.\" Permission is granted to copy and distribute modified versions of
+.\" this manual under the conditions for verbatim copying, provided that
+.\" the entire resulting derived work is distributed under the terms of
+.\" a permission notice identical to this one.
+.\"
+.\" Since the Linux kernel and libraries are constantly changing, this
+.\" manual page may be incorrect or out-of-date.  The author(s) assume
+.\" no responsibility for errors or omissions, or for damages resulting
+.\" from the use of the information contained herein.  The author(s) may
+.\" not have taken the same level of care in the production of this
+.\" manual, which is licensed free of charge, as they might when working
+.\" professionally.
+.\"
+.\" Formatted or processed versions of this manual, if unaccompanied by
+.\" the source, must acknowledge the copyright and authors of this work.
+.\" %%%LICENSE_END
+.\"
+.TH DESTROY_CREDS 2 2017-08-07 "Linux" "Linux Programmer's Manual"
+.SH NAME
+destroy_creds \- destroy current user's file system credentials for a mount point
+.SH SYNOPSIS
+.nf
+.B #include <sys/syscall.h>
+.B #include <unistd.h>
+
+.BI "int destroy_creds(int " fd );
+.fi
+.SH DESCRIPTION
+The
+.BR destroy_creds ()
+system call performs destruction of file system credentials for the current
+user. It identifies the file system by the supplied file descriptor in
+.I fd
+that represents a mount point.
+
+.SH RETURN VALUE
+Upon successful completion,
+.BR destroy_creds ()
+will return 0.
+
+On error,
+.BR destroy_creds ()
+returns \-1 and
+.I errno
+is set to indicate the error.
+.SH ERRORS
+.TP
+.B EBADF
+.I fd
+file descriptor is not valid
+.TP
+.B EINVAL
+if the input file descriptor is not a directory
+.TP
+.B ENOENT
+no credentials found
+.TP
+.B EACCES
+unable to access credentials
+.TP
+.B ENOSYS
+file system does not implement destroy_creds() functionality
+.SH VERSIONS
+The
+.BR destroy_creds ()
+system call first appeared in Linux 4.1?.
+.SH CONFORMING TO
+The
+.BR destroy_creds ()
+system call is a nonstandard Linux extension.
+.SH NOTES
+
+.BR destroy_creds ()
+gives filesystems an opportunity to destroy credentials. For instance,
+NFS uses Kerberos credentials stored in Kerberos credential cache to
+create its security contexts that then are stored and managed by the
+kernel. Once the user logs out and destroys Kerberos credentials via
+kdestroy, NFS security contexts associated with that user remain valid
+until they expire. An fslogout application such as the one in the example
+allows user-driven credential destruction in the file system.
+
+.SH EXAMPLE
+.nf
+#define _GNU_SOURCE
+#include <fcntl.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <sys/stat.h>
+#include <sys/syscall.h>
+#include <unistd.h>
+
+static int
+destroy_creds(int fd)
+{
+    return syscall(__NR_destroy_creds, fd);
+}
+
+int
+main(int argc, char **argv)
+{
+    int fd, ret;
+
+    if (argc != 2) {
+        fprintf(stderr, "Usage: %s <mount point>\\n", argv[0]);
+        exit(EXIT_FAILURE);
+    }
+
+    fd = open(argv[1], O_DIRECTORY|O_RDONLY);
+    if (fd == \-1) {
+        perror("open (argv[1])");
+        exit(EXIT_FAILURE);
+    }
+
+    ret = destroy_creds(fd);
+    if (ret == \-1) {
+        perror("destroy_creds");
+        exit(EXIT_FAILURE);
+    }
+
+    close(fd);
+    exit(EXIT_SUCCESS);
+}
+.fi