mbox series

[RFC,0/2] nfsd: add principal to the data being tracked by nfsdcld

Message ID 20190730210847.9804-1-smayhew@redhat.com (mailing list archive)
Headers show
Series nfsd: add principal to the data being tracked by nfsdcld | expand

Message

Scott Mayhew July 30, 2019, 9:08 p.m. UTC
At the spring bakeathon, Chuck suggested that we should store the
kerberos principal in addition to the client id string in nfsdcld.  The
idea is to prevent an illegitimate client from reclaiming another
client's opens by supplying that client's id string.  This is an initial
attempt at doing that.

The first patch lays some groundwork for supporting multiple message
versions for the nfsdcld upcalls, adding fields for version and message
length to the nfsd4_client_tracking_ops (these fields are only used for
the nfsdcld upcalls and ignored for the other tracking methods), as well
as an upcall to get the maximum version supported by the userspace
daemon. 

Initially I had played with the idea of hanging a version number off
either the cld_net or nfsd_net structure, exposing that via a proc file,
and having the userspace daemon write the desired version number to the
proc file prior to opening the rpc_pipefs file.  That was potentially
problematic if nfsd was already trying to queue an upcall using a
version that nfscld didn't support... so I like the GetVersion upcall
idea better.  Another option might be to have different rpc_pipefs
files for the different message versions and decide which version to use
based on which pipefs file the daemon opens.

The second patch actually adds the v2 message, which adds the principal
(actually just the first 1024 bytes) to the Cld_Create upcall and to the 
Cld_GraceStart downcall (which is what loads the data in the
reclaim_str_hashtbl). I couldn't really figure out what the maximum length
of a kerberos principal actually is (looking at krb5.h the length field in
the struct krb5_data is an unsigned int, so I guess it can be pretty big).
I don't think the principal will be that large in practice, and even if
it is the first 1024 bytes should be sufficient for our purposes.

Scott Mayhew (2):
  nfsd: add a "GetVersion" upcall for nfsdcld
  nfsd: add support for upcall version 2

 fs/nfsd/nfs4recover.c         | 332 +++++++++++++++++++++++++++-------
 fs/nfsd/nfs4state.c           |   6 +-
 fs/nfsd/state.h               |   3 +-
 include/uapi/linux/nfsd/cld.h |  37 +++-
 4 files changed, 311 insertions(+), 67 deletions(-)

Comments

J. Bruce Fields July 30, 2019, 9:54 p.m. UTC | #1
On Tue, Jul 30, 2019 at 05:08:45PM -0400, Scott Mayhew wrote:
> At the spring bakeathon, Chuck suggested that we should store the
> kerberos principal in addition to the client id string in nfsdcld.  The
> idea is to prevent an illegitimate client from reclaiming another
> client's opens by supplying that client's id string.  This is an initial
> attempt at doing that.

Great to see this, thanks!

> Initially I had played with the idea of hanging a version number off
> either the cld_net or nfsd_net structure, exposing that via a proc file,
> and having the userspace daemon write the desired version number to the
> proc file prior to opening the rpc_pipefs file.  That was potentially
> problematic if nfsd was already trying to queue an upcall using a
> version that nfscld didn't support... so I like the GetVersion upcall
> idea better.

Sounds OK to me.

It queries the version once on nfsd startup and sticks with that
version.  We allow rpc.mountd to be restarted while nfsd is running.  So
the one case I think it wouldn't handle is the case where somebody
downgrades mountd while nfsd is running.

My feeling is that handling that case would be way too much trouble, so
actually I'm going to consider that a plus.  But it might be worth
documenting (if only in a patch changelog).

> The second patch actually adds the v2 message, which adds the principal
> (actually just the first 1024 bytes) to the Cld_Create upcall and to the 
> Cld_GraceStart downcall (which is what loads the data in the
> reclaim_str_hashtbl). I couldn't really figure out what the maximum length
> of a kerberos principal actually is (looking at krb5.h the length field in
> the struct krb5_data is an unsigned int, so I guess it can be pretty big).
> I don't think the principal will be that large in practice, and even if
> it is the first 1024 bytes should be sufficient for our purposes.

How does it fail when principals are longer?  Does it error out, or
treat two principals as equal if they agree in the first 1024 bytes?

--b.
J. Bruce Fields July 30, 2019, 9:56 p.m. UTC | #2
On Tue, Jul 30, 2019 at 05:54:28PM -0400, J. Bruce Fields wrote:
> How does it fail when principals are longer?  Does it error out, or
> treat two principals as equal if they agree in the first 1024 bytes?

I guess it's being compared against a string passed from gss-proxy?  We
could also check for limits there.

--b.
Scott Mayhew July 31, 2019, 12:07 p.m. UTC | #3
On Tue, 30 Jul 2019, J. Bruce Fields wrote:

> On Tue, Jul 30, 2019 at 05:08:45PM -0400, Scott Mayhew wrote:
> > At the spring bakeathon, Chuck suggested that we should store the
> > kerberos principal in addition to the client id string in nfsdcld.  The
> > idea is to prevent an illegitimate client from reclaiming another
> > client's opens by supplying that client's id string.  This is an initial
> > attempt at doing that.
> 
> Great to see this, thanks!
> 
> > Initially I had played with the idea of hanging a version number off
> > either the cld_net or nfsd_net structure, exposing that via a proc file,
> > and having the userspace daemon write the desired version number to the
> > proc file prior to opening the rpc_pipefs file.  That was potentially
> > problematic if nfsd was already trying to queue an upcall using a
> > version that nfscld didn't support... so I like the GetVersion upcall
> > idea better.
> 
> Sounds OK to me.
> 
> It queries the version once on nfsd startup and sticks with that
> version.  We allow rpc.mountd to be restarted while nfsd is running.  So
> the one case I think it wouldn't handle is the case where somebody
> downgrades mountd while nfsd is running.

I'm assuming you meant nfsdcld rather than mountd here... currently if
someone were to downgrade nfsdcld while nfsd is running then that case
wouldn't be handled, so a restart of nfsd would also be necessary.

> 
> My feeling is that handling that case would be way too much trouble, so
> actually I'm going to consider that a plus.  But it might be worth
> documenting (if only in a patch changelog).
> 
To handle that scenario, We'd need to change the database schema.  I'd
really rather do that in an external script than stick downgrade logic
into nfsdcld... mainly because I'd want to check if there was any data
in the columns being eliminated and warn the user & ask for
confirmation.  

We'd also need to change how nfsdcld reacts when it gets a message
version it doesn't support.  Currently it just reopens the pipe file,
which causes the upcall to fail with -EAGAIN, which causes nfsd to retry
the upcall, and it just goes into a loop.  I could implement a "version
not supported" downcall.  I'm not sure what error number to use... maybe
-EPROTONOSUPP?  Also even if we implemented a "version not supported"
downcall now, that wouldn't handle the problem with existing nfsdcld's.
The only thing I can think of there is to add a counter to
cld_pipe_upcall() and exit the loop after a certain number of iterations
(10? 100? 1000?).

> > The second patch actually adds the v2 message, which adds the principal
> > (actually just the first 1024 bytes) to the Cld_Create upcall and to the 
> > Cld_GraceStart downcall (which is what loads the data in the
> > reclaim_str_hashtbl). I couldn't really figure out what the maximum length
> > of a kerberos principal actually is (looking at krb5.h the length field in
> > the struct krb5_data is an unsigned int, so I guess it can be pretty big).
> > I don't think the principal will be that large in practice, and even if
> > it is the first 1024 bytes should be sufficient for our purposes.
> 
> How does it fail when principals are longer?  Does it error out, or
> treat two principals as equal if they agree in the first 1024 bytes?

It treats them as equal.

> 
> --b.
Scott Mayhew July 31, 2019, 12:11 p.m. UTC | #4
On Tue, 30 Jul 2019, J. Bruce Fields wrote:

> On Tue, Jul 30, 2019 at 05:54:28PM -0400, J. Bruce Fields wrote:
> > How does it fail when principals are longer?  Does it error out, or
> > treat two principals as equal if they agree in the first 1024 bytes?
> 
> I guess it's being compared against a string passed from gss-proxy?  We
> could also check for limits there.

I'm using cr_principal (servicetype@hostname) since it's set by both
gssproxy and rpc.svcgssd.  cr_raw_principal (servicetype/hostname@REALM)
is only set by gssproxy.

-Scott
> 
> --b.
J. Bruce Fields July 31, 2019, 2:31 p.m. UTC | #5
On Wed, Jul 31, 2019 at 08:07:53AM -0400, Scott Mayhew wrote:
> On Tue, 30 Jul 2019, J. Bruce Fields wrote:
> 
> > On Tue, Jul 30, 2019 at 05:08:45PM -0400, Scott Mayhew wrote:
> > > At the spring bakeathon, Chuck suggested that we should store the
> > > kerberos principal in addition to the client id string in nfsdcld.  The
> > > idea is to prevent an illegitimate client from reclaiming another
> > > client's opens by supplying that client's id string.  This is an initial
> > > attempt at doing that.
> > 
> > Great to see this, thanks!
> > 
> > > Initially I had played with the idea of hanging a version number off
> > > either the cld_net or nfsd_net structure, exposing that via a proc file,
> > > and having the userspace daemon write the desired version number to the
> > > proc file prior to opening the rpc_pipefs file.  That was potentially
> > > problematic if nfsd was already trying to queue an upcall using a
> > > version that nfscld didn't support... so I like the GetVersion upcall
> > > idea better.
> > 
> > Sounds OK to me.
> > 
> > It queries the version once on nfsd startup and sticks with that
> > version.  We allow rpc.mountd to be restarted while nfsd is running.  So
> > the one case I think it wouldn't handle is the case where somebody
> > downgrades mountd while nfsd is running.
> 
> I'm assuming you meant nfsdcld rather than mountd here...

Oops, right.

> currently if
> someone were to downgrade nfsdcld while nfsd is running then that case
> wouldn't be handled, so a restart of nfsd would also be necessary.

Right.

> > My feeling is that handling that case would be way too much trouble, so
> > actually I'm going to consider that a plus.  But it might be worth
> > documenting (if only in a patch changelog).
>
> To handle that scenario, We'd need to change the database schema.  I'd
> really rather do that in an external script than stick downgrade logic
> into nfsdcld... mainly because I'd want to check if there was any data
> in the columns being eliminated and warn the user & ask for
> confirmation.  
> 
> We'd also need to change how nfsdcld reacts when it gets a message
> version it doesn't support.  Currently it just reopens the pipe file,
> which causes the upcall to fail with -EAGAIN, which causes nfsd to retry
> the upcall, and it just goes into a loop.  I could implement a "version
> not supported" downcall.  I'm not sure what error number to use... maybe
> -EPROTONOSUPP?  Also even if we implemented a "version not supported"
> downcall now, that wouldn't handle the problem with existing nfsdcld's.
> The only thing I can think of there is to add a counter to
> cld_pipe_upcall() and exit the loop after a certain number of iterations
> (10? 100? 1000?).

This sounds too complicated.  Let's just add a note that downgrades
require an nfsd restart.

> > > The second patch actually adds the v2 message, which adds the principal
> > > (actually just the first 1024 bytes) to the Cld_Create upcall and to the 
> > > Cld_GraceStart downcall (which is what loads the data in the
> > > reclaim_str_hashtbl). I couldn't really figure out what the maximum length
> > > of a kerberos principal actually is (looking at krb5.h the length field in
> > > the struct krb5_data is an unsigned int, so I guess it can be pretty big).
> > > I don't think the principal will be that large in practice, and even if
> > > it is the first 1024 bytes should be sufficient for our purposes.
> > 
> > How does it fail when principals are longer?  Does it error out, or
> > treat two principals as equal if they agree in the first 1024 bytes?
> 
> It treats them as equal.

Got it, thanks.

--b.