mbox series

[v2,0/5] Implement referring call lists for CB_OFFLOAD

Message ID 20250301183151.11362-1-cel@kernel.org (mailing list archive)
Headers show
Series Implement referring call lists for CB_OFFLOAD | expand

Message

Chuck Lever March 1, 2025, 6:31 p.m. UTC
From: Chuck Lever <chuck.lever@oracle.com>

I've built a naive proof-of-concept of the csa_referring_call_list
argument of the CB_SEQUENCE operation, and hooked it up for the
CB_OFFLOAD callback operation.

This has been pushed to my kernel.org "fix-async-copy" branch for
folks to play around with.

I've done some basic testing with a server that ensures the
CB_OFFLOAD callback is sent before the COPY reply, while running a
network capture. Operation appears correct, Wireshark is happy
with the construction of the XDR, and the CB_SEQUENCE arguments
match the SEQUENCE operation in the COPY COMPOUND.

I'd like to include this series in nfsd-testing.

Changes since RFC:
- Add a field to struct nfsd4_slot that records its table index
- Include a few additional COPY-related fixes
- Some operational testing has been done

Chuck Lever (5):
  NFSD: OFFLOAD_CANCEL should mark an async COPY as completed
  NFSD: Shorten CB_OFFLOAD response to NFS4ERR_DELAY
  NFSD: Implement CB_SEQUENCE referring call lists
  NFSD: Record each NFSv4 call's session slot index
  NFSD: Use a referring call list for CB_OFFLOAD

 fs/nfsd/nfs4callback.c | 132 +++++++++++++++++++++++++++++++++++++++--
 fs/nfsd/nfs4proc.c     |  16 ++++-
 fs/nfsd/nfs4state.c    |  38 ++++++------
 fs/nfsd/state.h        |  23 +++++++
 fs/nfsd/xdr4.h         |   4 ++
 fs/nfsd/xdr4cb.h       |   5 +-
 6 files changed, 193 insertions(+), 25 deletions(-)

Comments

Jeff Layton March 7, 2025, 3 p.m. UTC | #1
On Sat, 2025-03-01 at 13:31 -0500, cel@kernel.org wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
> 
> I've built a naive proof-of-concept of the csa_referring_call_list
> argument of the CB_SEQUENCE operation, and hooked it up for the
> CB_OFFLOAD callback operation.
> 
> This has been pushed to my kernel.org "fix-async-copy" branch for
> folks to play around with.
> 
> I've done some basic testing with a server that ensures the
> CB_OFFLOAD callback is sent before the COPY reply, while running a
> network capture. Operation appears correct, Wireshark is happy
> with the construction of the XDR, and the CB_SEQUENCE arguments
> match the SEQUENCE operation in the COPY COMPOUND.
> 
> I'd like to include this series in nfsd-testing.
> 
> Changes since RFC:
> - Add a field to struct nfsd4_slot that records its table index
> - Include a few additional COPY-related fixes
> - Some operational testing has been done
> 
> Chuck Lever (5):
>   NFSD: OFFLOAD_CANCEL should mark an async COPY as completed
>   NFSD: Shorten CB_OFFLOAD response to NFS4ERR_DELAY
>   NFSD: Implement CB_SEQUENCE referring call lists
>   NFSD: Record each NFSv4 call's session slot index
>   NFSD: Use a referring call list for CB_OFFLOAD
> 
>  fs/nfsd/nfs4callback.c | 132 +++++++++++++++++++++++++++++++++++++++--
>  fs/nfsd/nfs4proc.c     |  16 ++++-
>  fs/nfsd/nfs4state.c    |  38 ++++++------
>  fs/nfsd/state.h        |  23 +++++++
>  fs/nfsd/xdr4.h         |   4 ++
>  fs/nfsd/xdr4cb.h       |   5 +-
>  6 files changed, 193 insertions(+), 25 deletions(-)
> 

I think this all looks good for a first pass, and should be OK for
COPY. You can add:


    Reviewed-by: Jeff Layton <jlayton@kernel.org>


I think we'll eventually want this for longer-lived stateids too.
Specifically:

    CB_RECALL
    CB_LAYOUTRECALL
    CB_NOTIFY_LOCK

The main thing missing for that is the ability to free referring call
records once we ensure that the client has seen the reply. For
instance, if nfsd records a referring call on slot:seq 1:2, then once
it sees a SEQUENCE for 1:3, then it doesn't need to keep around the
referring call for 1:2. The server knows that call is no longer in
flight so it's no longer needed.

If we don't do that, then we could end up with rather long referring
call lists, with a bunch of long-completed calls.
Chuck Lever March 7, 2025, 4 p.m. UTC | #2
On 3/7/25 10:00 AM, Jeff Layton wrote:
> On Sat, 2025-03-01 at 13:31 -0500, cel@kernel.org wrote:
>> From: Chuck Lever <chuck.lever@oracle.com>
>>
>> I've built a naive proof-of-concept of the csa_referring_call_list
>> argument of the CB_SEQUENCE operation, and hooked it up for the
>> CB_OFFLOAD callback operation.
>>
>> This has been pushed to my kernel.org "fix-async-copy" branch for
>> folks to play around with.
>>
>> I've done some basic testing with a server that ensures the
>> CB_OFFLOAD callback is sent before the COPY reply, while running a
>> network capture. Operation appears correct, Wireshark is happy
>> with the construction of the XDR, and the CB_SEQUENCE arguments
>> match the SEQUENCE operation in the COPY COMPOUND.
>>
>> I'd like to include this series in nfsd-testing.
>>
>> Changes since RFC:
>> - Add a field to struct nfsd4_slot that records its table index
>> - Include a few additional COPY-related fixes
>> - Some operational testing has been done
>>
>> Chuck Lever (5):
>>   NFSD: OFFLOAD_CANCEL should mark an async COPY as completed
>>   NFSD: Shorten CB_OFFLOAD response to NFS4ERR_DELAY
>>   NFSD: Implement CB_SEQUENCE referring call lists
>>   NFSD: Record each NFSv4 call's session slot index
>>   NFSD: Use a referring call list for CB_OFFLOAD
>>
>>  fs/nfsd/nfs4callback.c | 132 +++++++++++++++++++++++++++++++++++++++--
>>  fs/nfsd/nfs4proc.c     |  16 ++++-
>>  fs/nfsd/nfs4state.c    |  38 ++++++------
>>  fs/nfsd/state.h        |  23 +++++++
>>  fs/nfsd/xdr4.h         |   4 ++
>>  fs/nfsd/xdr4cb.h       |   5 +-
>>  6 files changed, 193 insertions(+), 25 deletions(-)
>>
> 
> I think this all looks good for a first pass, and should be OK for
> COPY. You can add:
> 
> 
>     Reviewed-by: Jeff Layton <jlayton@kernel.org>

Thanks!


> I think we'll eventually want this for longer-lived stateids too.
> Specifically:
> 
>     CB_RECALL
>     CB_LAYOUTRECALL
>     CB_NOTIFY_LOCK
> 
> The main thing missing for that is the ability to free referring call
> records once we ensure that the client has seen the reply. For
> instance, if nfsd records a referring call on slot:seq 1:2, then once
> it sees a SEQUENCE for 1:3, then it doesn't need to keep around the
> referring call for 1:2. The server knows that call is no longer in
> flight so it's no longer needed.
> 
> If we don't do that, then we could end up with rather long referring
> call lists, with a bunch of long-completed calls.

Agreed that RCLs will have uses outside of COPY. I don't have a good
understanding of the use cases you mention above, so it would delay
RCL support in CB_OFFLOAD quite a bit if we were to wait for the
implementation of the other use cases.
Jeff Layton March 7, 2025, 4:07 p.m. UTC | #3
On Fri, 2025-03-07 at 11:00 -0500, Chuck Lever wrote:
> On 3/7/25 10:00 AM, Jeff Layton wrote:
> > On Sat, 2025-03-01 at 13:31 -0500, cel@kernel.org wrote:
> > > From: Chuck Lever <chuck.lever@oracle.com>
> > > 
> > > I've built a naive proof-of-concept of the csa_referring_call_list
> > > argument of the CB_SEQUENCE operation, and hooked it up for the
> > > CB_OFFLOAD callback operation.
> > > 
> > > This has been pushed to my kernel.org "fix-async-copy" branch for
> > > folks to play around with.
> > > 
> > > I've done some basic testing with a server that ensures the
> > > CB_OFFLOAD callback is sent before the COPY reply, while running a
> > > network capture. Operation appears correct, Wireshark is happy
> > > with the construction of the XDR, and the CB_SEQUENCE arguments
> > > match the SEQUENCE operation in the COPY COMPOUND.
> > > 
> > > I'd like to include this series in nfsd-testing.
> > > 
> > > Changes since RFC:
> > > - Add a field to struct nfsd4_slot that records its table index
> > > - Include a few additional COPY-related fixes
> > > - Some operational testing has been done
> > > 
> > > Chuck Lever (5):
> > >   NFSD: OFFLOAD_CANCEL should mark an async COPY as completed
> > >   NFSD: Shorten CB_OFFLOAD response to NFS4ERR_DELAY
> > >   NFSD: Implement CB_SEQUENCE referring call lists
> > >   NFSD: Record each NFSv4 call's session slot index
> > >   NFSD: Use a referring call list for CB_OFFLOAD
> > > 
> > >  fs/nfsd/nfs4callback.c | 132 +++++++++++++++++++++++++++++++++++++++--
> > >  fs/nfsd/nfs4proc.c     |  16 ++++-
> > >  fs/nfsd/nfs4state.c    |  38 ++++++------
> > >  fs/nfsd/state.h        |  23 +++++++
> > >  fs/nfsd/xdr4.h         |   4 ++
> > >  fs/nfsd/xdr4cb.h       |   5 +-
> > >  6 files changed, 193 insertions(+), 25 deletions(-)
> > > 
> > 
> > I think this all looks good for a first pass, and should be OK for
> > COPY. You can add:
> > 
> > 
> >     Reviewed-by: Jeff Layton <jlayton@kernel.org>
> 
> Thanks!
> 
> 
> > I think we'll eventually want this for longer-lived stateids too.
> > Specifically:
> > 
> >     CB_RECALL
> >     CB_LAYOUTRECALL
> >     CB_NOTIFY_LOCK
> > 
> > The main thing missing for that is the ability to free referring call
> > records once we ensure that the client has seen the reply. For
> > instance, if nfsd records a referring call on slot:seq 1:2, then once
> > it sees a SEQUENCE for 1:3, then it doesn't need to keep around the
> > referring call for 1:2. The server knows that call is no longer in
> > flight so it's no longer needed.
> > 
> > If we don't do that, then we could end up with rather long referring
> > call lists, with a bunch of long-completed calls.
> 
> Agreed that RCLs will have uses outside of COPY. I don't have a good
> understanding of the use cases you mention above, so it would delay
> RCL support in CB_OFFLOAD quite a bit if we were to wait for the
> implementation of the other use cases.
> 
> 

The use-cases for CB_RECALL and CB_LAYOUTRECALL are basically the
same: 

Server handed out a delegation or layout stateid, and then had to
immediately recall it. We need the RCL in case the callback races ahead
of the reply that has the stateid.

Now that I look, I don't think it's actually needed for CB_NOTIFY_LOCK
(no stateid in the call), so we can strike that one off the list.