Message ID | 20230801-wdeleg-v2-1-20c14252bab4@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v2] nfsd: don't hand out write delegations on O_WRONLY opens | expand |
On Tue, 01 Aug 2023, Jeff Layton wrote: > I noticed that xfstests generic/001 was failing against linux-next nfsd. > > The client would request a OPEN4_SHARE_ACCESS_WRITE open, and the server > would hand out a write delegation. The client would then try to use that > write delegation as the source stateid in a COPY or CLONE operation, and > the server would respond with NFS4ERR_STALE. > > The problem is that the struct file associated with the delegation does > not necessarily have read permissions. It's handing out a write > delegation on what is effectively an O_WRONLY open. RFC 8881 states: > > "An OPEN_DELEGATE_WRITE delegation allows the client to handle, on its > own, all opens." > > Given that the client didn't request any read permissions, and that nfsd > didn't check for any, it seems wrong to give out a write delegation. > > Only hand out a write delegation if we have a O_RDWR descriptor > available. If it fails to find an appropriate write descriptor, go > ahead and try for a read delegation if NFS4_SHARE_ACCESS_READ was > requested. > > This fixes xfstest generic/001. > > Closes: https://bugzilla.linux-nfs.org/show_bug.cgi?id=412 > Signed-off-by: Jeff Layton <jlayton@kernel.org> > --- > Changes in v2: > - Rework the logic when finding struct file for the delegation. The > earlier patch might still have attached a O_WRONLY file to the deleg > in some cases, and could still have handed out a write delegation on > an O_WRONLY OPEN request in some cases. > --- > fs/nfsd/nfs4state.c | 29 ++++++++++++++++++----------- > 1 file changed, 18 insertions(+), 11 deletions(-) > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c > index ef7118ebee00..e79d82fd05e7 100644 > --- a/fs/nfsd/nfs4state.c > +++ b/fs/nfsd/nfs4state.c > @@ -5449,7 +5449,7 @@ nfs4_set_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp, > struct nfs4_file *fp = stp->st_stid.sc_file; > struct nfs4_clnt_odstate *odstate = stp->st_clnt_odstate; > struct nfs4_delegation *dp; > - struct nfsd_file *nf; > + struct nfsd_file *nf = NULL; > struct file_lock *fl; > u32 dl_type; > > @@ -5461,21 +5461,28 @@ nfs4_set_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp, > if (fp->fi_had_conflict) > return ERR_PTR(-EAGAIN); > > - if (open->op_share_access & NFS4_SHARE_ACCESS_WRITE) { > - nf = find_writeable_file(fp); > + /* > + * Try for a write delegation first. We need an O_RDWR file > + * since a write delegation allows the client to perform any open > + * from its cache. > + */ > + if ((open->op_share_access & NFS4_SHARE_ACCESS_BOTH) == NFS4_SHARE_ACCESS_BOTH) { > + nf = nfsd_file_get(fp->fi_fds[O_RDWR]); This doesn't take fp->fi_lock before accessing ->fi_fds[], while the find_readable_file() call below does. This inconsistency suggests a bug? Maybe the provided API is awkward. Should we have find_suitable_file() and find_suitable_file_locked() that gets passed an nfs4_file and an O_MODE? It tries the given mode, then O_RDWR NeilBrown > dl_type = NFS4_OPEN_DELEGATE_WRITE; > - } else { > + } > + > + /* > + * If the file is being opened O_RDONLY or we couldn't get a O_RDWR > + * file for some reason, then try for a read deleg instead. > + */ > + if (!nf && (open->op_share_access & NFS4_SHARE_ACCESS_READ)) { > nf = find_readable_file(fp); > dl_type = NFS4_OPEN_DELEGATE_READ; > } > - if (!nf) { > - /* > - * We probably could attempt another open and get a read > - * delegation, but for now, don't bother until the > - * client actually sends us one. > - */ > + > + if (!nf) > return ERR_PTR(-EAGAIN); > - } > + > spin_lock(&state_lock); > spin_lock(&fp->fi_lock); > if (nfs4_delegation_exists(clp, fp)) > > --- > base-commit: a734662572708cf062e974f659ae50c24fc1ad17 > change-id: 20230731-wdeleg-bbdb6b25a3c6 > > Best regards, > -- > Jeff Layton <jlayton@kernel.org> > >
On Wed, Aug 02, 2023 at 08:26:15AM +1000, NeilBrown wrote: > On Tue, 01 Aug 2023, Jeff Layton wrote: > > I noticed that xfstests generic/001 was failing against linux-next nfsd. > > > > The client would request a OPEN4_SHARE_ACCESS_WRITE open, and the server > > would hand out a write delegation. The client would then try to use that > > write delegation as the source stateid in a COPY or CLONE operation, and > > the server would respond with NFS4ERR_STALE. > > > > The problem is that the struct file associated with the delegation does > > not necessarily have read permissions. It's handing out a write > > delegation on what is effectively an O_WRONLY open. RFC 8881 states: > > > > "An OPEN_DELEGATE_WRITE delegation allows the client to handle, on its > > own, all opens." > > > > Given that the client didn't request any read permissions, and that nfsd > > didn't check for any, it seems wrong to give out a write delegation. > > > > Only hand out a write delegation if we have a O_RDWR descriptor > > available. If it fails to find an appropriate write descriptor, go > > ahead and try for a read delegation if NFS4_SHARE_ACCESS_READ was > > requested. > > > > This fixes xfstest generic/001. > > > > Closes: https://bugzilla.linux-nfs.org/show_bug.cgi?id=412 > > Signed-off-by: Jeff Layton <jlayton@kernel.org> > > --- > > Changes in v2: > > - Rework the logic when finding struct file for the delegation. The > > earlier patch might still have attached a O_WRONLY file to the deleg > > in some cases, and could still have handed out a write delegation on > > an O_WRONLY OPEN request in some cases. > > --- > > fs/nfsd/nfs4state.c | 29 ++++++++++++++++++----------- > > 1 file changed, 18 insertions(+), 11 deletions(-) > > > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c > > index ef7118ebee00..e79d82fd05e7 100644 > > --- a/fs/nfsd/nfs4state.c > > +++ b/fs/nfsd/nfs4state.c > > @@ -5449,7 +5449,7 @@ nfs4_set_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp, > > struct nfs4_file *fp = stp->st_stid.sc_file; > > struct nfs4_clnt_odstate *odstate = stp->st_clnt_odstate; > > struct nfs4_delegation *dp; > > - struct nfsd_file *nf; > > + struct nfsd_file *nf = NULL; > > struct file_lock *fl; > > u32 dl_type; > > > > @@ -5461,21 +5461,28 @@ nfs4_set_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp, > > if (fp->fi_had_conflict) > > return ERR_PTR(-EAGAIN); > > > > - if (open->op_share_access & NFS4_SHARE_ACCESS_WRITE) { > > - nf = find_writeable_file(fp); > > + /* > > + * Try for a write delegation first. We need an O_RDWR file > > + * since a write delegation allows the client to perform any open > > + * from its cache. > > + */ > > + if ((open->op_share_access & NFS4_SHARE_ACCESS_BOTH) == NFS4_SHARE_ACCESS_BOTH) { > > + nf = nfsd_file_get(fp->fi_fds[O_RDWR]); > > This doesn't take fp->fi_lock before accessing ->fi_fds[], while the > find_readable_file() call below does. Note that the code it replaces (find_writeable_file) takes the fi_lock, so that seems like an important omission. I noticed this earlier, but I was anxious to test whether this fix is on the right path. So far, NFSv4.2 behavior seems much improved. And, I like the new comments. > This inconsistency suggests a bug? > > Maybe the provided API is awkward. Should we have > find_suitable_file() and find_suitable_file_locked() > that gets passed an nfs4_file and an O_MODE? > It tries the given mode, then O_RDWR > > NeilBrown > > > > dl_type = NFS4_OPEN_DELEGATE_WRITE; > > - } else { > > + } > > + > > + /* > > + * If the file is being opened O_RDONLY or we couldn't get a O_RDWR > > + * file for some reason, then try for a read deleg instead. > > + */ > > + if (!nf && (open->op_share_access & NFS4_SHARE_ACCESS_READ)) { > > nf = find_readable_file(fp); > > dl_type = NFS4_OPEN_DELEGATE_READ; > > } > > - if (!nf) { > > - /* > > - * We probably could attempt another open and get a read > > - * delegation, but for now, don't bother until the > > - * client actually sends us one. > > - */ > > + > > + if (!nf) > > return ERR_PTR(-EAGAIN); > > - } > > + > > spin_lock(&state_lock); > > spin_lock(&fp->fi_lock); > > if (nfs4_delegation_exists(clp, fp)) > > > > --- > > base-commit: a734662572708cf062e974f659ae50c24fc1ad17 > > change-id: 20230731-wdeleg-bbdb6b25a3c6 > > > > Best regards, > > -- > > Jeff Layton <jlayton@kernel.org> > > > > >
On Tue, 2023-08-01 at 18:51 -0400, Chuck Lever wrote: > On Wed, Aug 02, 2023 at 08:26:15AM +1000, NeilBrown wrote: > > On Tue, 01 Aug 2023, Jeff Layton wrote: > > > I noticed that xfstests generic/001 was failing against linux-next nfsd. > > > > > > The client would request a OPEN4_SHARE_ACCESS_WRITE open, and the server > > > would hand out a write delegation. The client would then try to use that > > > write delegation as the source stateid in a COPY or CLONE operation, and > > > the server would respond with NFS4ERR_STALE. > > > > > > The problem is that the struct file associated with the delegation does > > > not necessarily have read permissions. It's handing out a write > > > delegation on what is effectively an O_WRONLY open. RFC 8881 states: > > > > > > "An OPEN_DELEGATE_WRITE delegation allows the client to handle, on its > > > own, all opens." > > > > > > Given that the client didn't request any read permissions, and that nfsd > > > didn't check for any, it seems wrong to give out a write delegation. > > > > > > Only hand out a write delegation if we have a O_RDWR descriptor > > > available. If it fails to find an appropriate write descriptor, go > > > ahead and try for a read delegation if NFS4_SHARE_ACCESS_READ was > > > requested. > > > > > > This fixes xfstest generic/001. > > > > > > Closes: https://bugzilla.linux-nfs.org/show_bug.cgi?id=412 > > > Signed-off-by: Jeff Layton <jlayton@kernel.org> > > > --- > > > Changes in v2: > > > - Rework the logic when finding struct file for the delegation. The > > > earlier patch might still have attached a O_WRONLY file to the deleg > > > in some cases, and could still have handed out a write delegation on > > > an O_WRONLY OPEN request in some cases. > > > --- > > > fs/nfsd/nfs4state.c | 29 ++++++++++++++++++----------- > > > 1 file changed, 18 insertions(+), 11 deletions(-) > > > > > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c > > > index ef7118ebee00..e79d82fd05e7 100644 > > > --- a/fs/nfsd/nfs4state.c > > > +++ b/fs/nfsd/nfs4state.c > > > @@ -5449,7 +5449,7 @@ nfs4_set_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp, > > > struct nfs4_file *fp = stp->st_stid.sc_file; > > > struct nfs4_clnt_odstate *odstate = stp->st_clnt_odstate; > > > struct nfs4_delegation *dp; > > > - struct nfsd_file *nf; > > > + struct nfsd_file *nf = NULL; > > > struct file_lock *fl; > > > u32 dl_type; > > > > > > @@ -5461,21 +5461,28 @@ nfs4_set_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp, > > > if (fp->fi_had_conflict) > > > return ERR_PTR(-EAGAIN); > > > > > > - if (open->op_share_access & NFS4_SHARE_ACCESS_WRITE) { > > > - nf = find_writeable_file(fp); > > > + /* > > > + * Try for a write delegation first. We need an O_RDWR file > > > + * since a write delegation allows the client to perform any open > > > + * from its cache. > > > + */ > > > + if ((open->op_share_access & NFS4_SHARE_ACCESS_BOTH) == NFS4_SHARE_ACCESS_BOTH) { > > > + nf = nfsd_file_get(fp->fi_fds[O_RDWR]); > > > > This doesn't take fp->fi_lock before accessing ->fi_fds[], while the > > find_readable_file() call below does. > > Note that the code it replaces (find_writeable_file) takes the fi_lock, > so that seems like an important omission. > Yes, you and Neil are correct. We need the lock there. I'll respin the patch, re-test and resend soon (once I sort out an issue with my test setup). > I noticed this earlier, but I was anxious to test whether this fix is > on the right path. So far, NFSv4.2 behavior seems much improved. And, > I like the new comments. > > > > This inconsistency suggests a bug? > > > > Maybe the provided API is awkward. Should we have > > find_suitable_file() and find_suitable_file_locked() > > that gets passed an nfs4_file and an O_MODE? > > It tries the given mode, then O_RDWR > > > > NeilBrown > > > > > > > dl_type = NFS4_OPEN_DELEGATE_WRITE; > > > - } else { > > > + } > > > + > > > + /* > > > + * If the file is being opened O_RDONLY or we couldn't get a O_RDWR > > > + * file for some reason, then try for a read deleg instead. > > > + */ > > > + if (!nf && (open->op_share_access & NFS4_SHARE_ACCESS_READ)) { > > > nf = find_readable_file(fp); > > > dl_type = NFS4_OPEN_DELEGATE_READ; > > > } > > > - if (!nf) { > > > - /* > > > - * We probably could attempt another open and get a read > > > - * delegation, but for now, don't bother until the > > > - * client actually sends us one. > > > - */ > > > + > > > + if (!nf) > > > return ERR_PTR(-EAGAIN); > > > - } > > > + > > > spin_lock(&state_lock); > > > spin_lock(&fp->fi_lock); > > > if (nfs4_delegation_exists(clp, fp)) > > > > > > --- > > > base-commit: a734662572708cf062e974f659ae50c24fc1ad17 > > > change-id: 20230731-wdeleg-bbdb6b25a3c6 > > > > > > Best regards, > > > -- > > > Jeff Layton <jlayton@kernel.org> > > > > > > > > >
On 8/1/23 6:33 AM, Jeff Layton wrote: > I noticed that xfstests generic/001 was failing against linux-next nfsd. > > The client would request a OPEN4_SHARE_ACCESS_WRITE open, and the server > would hand out a write delegation. The client would then try to use that > write delegation as the source stateid in a COPY not sure why the client opens the source file of a COPY operation with OPEN4_SHARE_ACCESS_WRITE? > or CLONE operation, and > the server would respond with NFS4ERR_STALE. If the server does not allow client to use write delegation for the READ, should the correct error return be NFS4ERR_OPENMODE? > > The problem is that the struct file associated with the delegation does > not necessarily have read permissions. It's handing out a write > delegation on what is effectively an O_WRONLY open. RFC 8881 states: > > "An OPEN_DELEGATE_WRITE delegation allows the client to handle, on its > own, all opens." > > Given that the client didn't request any read permissions, and that nfsd > didn't check for any, it seems wrong to give out a write delegation. > > Only hand out a write delegation if we have a O_RDWR descriptor > available. If it fails to find an appropriate write descriptor, go > ahead and try for a read delegation if NFS4_SHARE_ACCESS_READ was > requested. > > This fixes xfstest generic/001. > > Closes: https://bugzilla.linux-nfs.org/show_bug.cgi?id=412 > Signed-off-by: Jeff Layton <jlayton@kernel.org> > --- > Changes in v2: > - Rework the logic when finding struct file for the delegation. The > earlier patch might still have attached a O_WRONLY file to the deleg > in some cases, and could still have handed out a write delegation on > an O_WRONLY OPEN request in some cases. > --- > fs/nfsd/nfs4state.c | 29 ++++++++++++++++++----------- > 1 file changed, 18 insertions(+), 11 deletions(-) > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c > index ef7118ebee00..e79d82fd05e7 100644 > --- a/fs/nfsd/nfs4state.c > +++ b/fs/nfsd/nfs4state.c > @@ -5449,7 +5449,7 @@ nfs4_set_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp, > struct nfs4_file *fp = stp->st_stid.sc_file; > struct nfs4_clnt_odstate *odstate = stp->st_clnt_odstate; > struct nfs4_delegation *dp; > - struct nfsd_file *nf; > + struct nfsd_file *nf = NULL; > struct file_lock *fl; > u32 dl_type; > > @@ -5461,21 +5461,28 @@ nfs4_set_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp, > if (fp->fi_had_conflict) > return ERR_PTR(-EAGAIN); > > - if (open->op_share_access & NFS4_SHARE_ACCESS_WRITE) { > - nf = find_writeable_file(fp); > + /* > + * Try for a write delegation first. We need an O_RDWR file > + * since a write delegation allows the client to perform any open > + * from its cache. > + */ > + if ((open->op_share_access & NFS4_SHARE_ACCESS_BOTH) == NFS4_SHARE_ACCESS_BOTH) { > + nf = nfsd_file_get(fp->fi_fds[O_RDWR]); > dl_type = NFS4_OPEN_DELEGATE_WRITE; > - } else { Does this mean OPEN4_SHARE_ACCESS_WRITE do not get a write delegation? It does not seem right. -Dai > + } > + > + /* > + * If the file is being opened O_RDONLY or we couldn't get a O_RDWR > + * file for some reason, then try for a read deleg instead. > + */ > + if (!nf && (open->op_share_access & NFS4_SHARE_ACCESS_READ)) { > nf = find_readable_file(fp); > dl_type = NFS4_OPEN_DELEGATE_READ; > } > - if (!nf) { > - /* > - * We probably could attempt another open and get a read > - * delegation, but for now, don't bother until the > - * client actually sends us one. > - */ > + > + if (!nf) > return ERR_PTR(-EAGAIN); > - } > + > spin_lock(&state_lock); > spin_lock(&fp->fi_lock); > if (nfs4_delegation_exists(clp, fp)) > > --- > base-commit: a734662572708cf062e974f659ae50c24fc1ad17 > change-id: 20230731-wdeleg-bbdb6b25a3c6 > > Best regards,
On Wed, 2023-08-02 at 09:29 -0700, dai.ngo@oracle.com wrote: > On 8/1/23 6:33 AM, Jeff Layton wrote: > > I noticed that xfstests generic/001 was failing against linux-next nfsd. > > > > The client would request a OPEN4_SHARE_ACCESS_WRITE open, and the server > > would hand out a write delegation. The client would then try to use that > > write delegation as the source stateid in a COPY > > not sure why the client opens the source file of a COPY operation with > OPEN4_SHARE_ACCESS_WRITE? > It doesn't. The original open is to write the data for the file being copied. It then opens the file again for READ, but since it has a write delegation, it doesn't need to talk to the server at all -- it can just use that stateid for later operations. > > or CLONE operation, and > > the server would respond with NFS4ERR_STALE. > > If the server does not allow client to use write delegation for the > READ, should the correct error return be NFS4ERR_OPENMODE? > The server must allow the client to use a write delegation for read operations. It's required by the spec, AFAIU. The error in this case was just bogus. The vfs copy routine would return -EBADF since the file didn't have FMODE_READ, and the nfs server would translate that into NFS4ERR_STALE. Probably there is a better v4 error code that we could translate EBADF to, but with this patch it shouldn't be a problem any longer. > > > > The problem is that the struct file associated with the delegation does > > not necessarily have read permissions. It's handing out a write > > delegation on what is effectively an O_WRONLY open. RFC 8881 states: > > > > "An OPEN_DELEGATE_WRITE delegation allows the client to handle, on its > > own, all opens." > > > > Given that the client didn't request any read permissions, and that nfsd > > didn't check for any, it seems wrong to give out a write delegation. > > > > Only hand out a write delegation if we have a O_RDWR descriptor > > available. If it fails to find an appropriate write descriptor, go > > ahead and try for a read delegation if NFS4_SHARE_ACCESS_READ was > > requested. > > > > This fixes xfstest generic/001. > > > > Closes: https://bugzilla.linux-nfs.org/show_bug.cgi?id=412 > > Signed-off-by: Jeff Layton <jlayton@kernel.org> > > --- > > Changes in v2: > > - Rework the logic when finding struct file for the delegation. The > > earlier patch might still have attached a O_WRONLY file to the deleg > > in some cases, and could still have handed out a write delegation on > > an O_WRONLY OPEN request in some cases. > > --- > > fs/nfsd/nfs4state.c | 29 ++++++++++++++++++----------- > > 1 file changed, 18 insertions(+), 11 deletions(-) > > > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c > > index ef7118ebee00..e79d82fd05e7 100644 > > --- a/fs/nfsd/nfs4state.c > > +++ b/fs/nfsd/nfs4state.c > > @@ -5449,7 +5449,7 @@ nfs4_set_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp, > > struct nfs4_file *fp = stp->st_stid.sc_file; > > struct nfs4_clnt_odstate *odstate = stp->st_clnt_odstate; > > struct nfs4_delegation *dp; > > - struct nfsd_file *nf; > > + struct nfsd_file *nf = NULL; > > struct file_lock *fl; > > u32 dl_type; > > > > @@ -5461,21 +5461,28 @@ nfs4_set_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp, > > if (fp->fi_had_conflict) > > return ERR_PTR(-EAGAIN); > > > > - if (open->op_share_access & NFS4_SHARE_ACCESS_WRITE) { > > - nf = find_writeable_file(fp); > > + /* > > + * Try for a write delegation first. We need an O_RDWR file > > + * since a write delegation allows the client to perform any open > > + * from its cache. > > + */ > > + if ((open->op_share_access & NFS4_SHARE_ACCESS_BOTH) == NFS4_SHARE_ACCESS_BOTH) { > > + nf = nfsd_file_get(fp->fi_fds[O_RDWR]); > > dl_type = NFS4_OPEN_DELEGATE_WRITE; > > - } else { > > Does this mean OPEN4_SHARE_ACCESS_WRITE do not get a write delegation? > It does not seem right. > > -Dai > Why? Per RFC 8881: "An OPEN_DELEGATE_WRITE delegation allows the client to handle, on its own, all opens." All opens. That includes read opens. An OPEN4_SHARE_ACCESS_WRITE open will succeed on a file to which the user has no read permissions. Therefore, we can't grant a write delegation since can't guarantee that the user is allowed to do that. > > + } > > + > > + /* > > + * If the file is being opened O_RDONLY or we couldn't get a O_RDWR > > + * file for some reason, then try for a read deleg instead. > > + */ > > + if (!nf && (open->op_share_access & NFS4_SHARE_ACCESS_READ)) { > > nf = find_readable_file(fp); > > dl_type = NFS4_OPEN_DELEGATE_READ; > > } > > - if (!nf) { > > - /* > > - * We probably could attempt another open and get a read > > - * delegation, but for now, don't bother until the > > - * client actually sends us one. > > - */ > > + > > + if (!nf) > > return ERR_PTR(-EAGAIN); > > - } > > + > > spin_lock(&state_lock); > > spin_lock(&fp->fi_lock); > > if (nfs4_delegation_exists(clp, fp)) > > > > --- > > base-commit: a734662572708cf062e974f659ae50c24fc1ad17 > > change-id: 20230731-wdeleg-bbdb6b25a3c6 > > > > Best regards,
> On Aug 2, 2023, at 2:15 PM, Jeff Layton <jlayton@kernel.org> wrote: > > On Wed, 2023-08-02 at 09:29 -0700, dai.ngo@oracle.com wrote: >> On 8/1/23 6:33 AM, Jeff Layton wrote: >>> I noticed that xfstests generic/001 was failing against linux-next nfsd. >>> >>> The client would request a OPEN4_SHARE_ACCESS_WRITE open, and the server >>> would hand out a write delegation. The client would then try to use that >>> write delegation as the source stateid in a COPY >> >> not sure why the client opens the source file of a COPY operation with >> OPEN4_SHARE_ACCESS_WRITE? >> > > It doesn't. The original open is to write the data for the file being > copied. It then opens the file again for READ, but since it has a write > delegation, it doesn't need to talk to the server at all -- it can just > use that stateid for later operations. > >>> or CLONE operation, and >>> the server would respond with NFS4ERR_STALE. >> >> If the server does not allow client to use write delegation for the >> READ, should the correct error return be NFS4ERR_OPENMODE? >> > > The server must allow the client to use a write delegation for read > operations. It's required by the spec, AFAIU. > > The error in this case was just bogus. The vfs copy routine would return > -EBADF since the file didn't have FMODE_READ, and the nfs server would > translate that into NFS4ERR_STALE. > > Probably there is a better v4 error code that we could translate EBADF > to, but with this patch it shouldn't be a problem any longer. > >>> >>> The problem is that the struct file associated with the delegation does >>> not necessarily have read permissions. It's handing out a write >>> delegation on what is effectively an O_WRONLY open. RFC 8881 states: >>> >>> "An OPEN_DELEGATE_WRITE delegation allows the client to handle, on its >>> own, all opens." >>> >>> Given that the client didn't request any read permissions, and that nfsd >>> didn't check for any, it seems wrong to give out a write delegation. >>> >>> Only hand out a write delegation if we have a O_RDWR descriptor >>> available. If it fails to find an appropriate write descriptor, go >>> ahead and try for a read delegation if NFS4_SHARE_ACCESS_READ was >>> requested. >>> >>> This fixes xfstest generic/001. >>> >>> Closes: https://bugzilla.linux-nfs.org/show_bug.cgi?id=412 >>> Signed-off-by: Jeff Layton <jlayton@kernel.org> >>> --- >>> Changes in v2: >>> - Rework the logic when finding struct file for the delegation. The >>> earlier patch might still have attached a O_WRONLY file to the deleg >>> in some cases, and could still have handed out a write delegation on >>> an O_WRONLY OPEN request in some cases. >>> --- >>> fs/nfsd/nfs4state.c | 29 ++++++++++++++++++----------- >>> 1 file changed, 18 insertions(+), 11 deletions(-) >>> >>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c >>> index ef7118ebee00..e79d82fd05e7 100644 >>> --- a/fs/nfsd/nfs4state.c >>> +++ b/fs/nfsd/nfs4state.c >>> @@ -5449,7 +5449,7 @@ nfs4_set_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp, >>> struct nfs4_file *fp = stp->st_stid.sc_file; >>> struct nfs4_clnt_odstate *odstate = stp->st_clnt_odstate; >>> struct nfs4_delegation *dp; >>> - struct nfsd_file *nf; >>> + struct nfsd_file *nf = NULL; >>> struct file_lock *fl; >>> u32 dl_type; >>> >>> @@ -5461,21 +5461,28 @@ nfs4_set_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp, >>> if (fp->fi_had_conflict) >>> return ERR_PTR(-EAGAIN); >>> >>> - if (open->op_share_access & NFS4_SHARE_ACCESS_WRITE) { >>> - nf = find_writeable_file(fp); >>> + /* >>> + * Try for a write delegation first. We need an O_RDWR file >>> + * since a write delegation allows the client to perform any open >>> + * from its cache. Since you're sending a v3... can you cite the section of RFC 8881 that specifies this "all open" case in this commment? I'm sure we're all going to forget this requirement. >>> + */ >>> + if ((open->op_share_access & NFS4_SHARE_ACCESS_BOTH) == NFS4_SHARE_ACCESS_BOTH) { >>> + nf = nfsd_file_get(fp->fi_fds[O_RDWR]); >>> dl_type = NFS4_OPEN_DELEGATE_WRITE; >>> - } else { >> >> Does this mean OPEN4_SHARE_ACCESS_WRITE do not get a write delegation? >> It does not seem right. >> >> -Dai >> > > Why? Per RFC 8881: > > "An OPEN_DELEGATE_WRITE delegation allows the client to handle, on its > own, all opens." > > All opens. That includes read opens. > > An OPEN4_SHARE_ACCESS_WRITE open will succeed on a file to which the > user has no read permissions. Therefore, we can't grant a write > delegation since can't guarantee that the user is allowed to do that. > > >>> + } >>> + >>> + /* >>> + * If the file is being opened O_RDONLY or we couldn't get a O_RDWR >>> + * file for some reason, then try for a read deleg instead. >>> + */ >>> + if (!nf && (open->op_share_access & NFS4_SHARE_ACCESS_READ)) { >>> nf = find_readable_file(fp); >>> dl_type = NFS4_OPEN_DELEGATE_READ; >>> } >>> - if (!nf) { >>> - /* >>> - * We probably could attempt another open and get a read >>> - * delegation, but for now, don't bother until the >>> - * client actually sends us one. >>> - */ >>> + >>> + if (!nf) >>> return ERR_PTR(-EAGAIN); >>> - } >>> + >>> spin_lock(&state_lock); >>> spin_lock(&fp->fi_lock); >>> if (nfs4_delegation_exists(clp, fp)) >>> >>> --- >>> base-commit: a734662572708cf062e974f659ae50c24fc1ad17 >>> change-id: 20230731-wdeleg-bbdb6b25a3c6 >>> >>> Best regards, > > -- > Jeff Layton <jlayton@kernel.org> -- Chuck Lever
On 8/2/23 11:15 AM, Jeff Layton wrote: > On Wed, 2023-08-02 at 09:29 -0700, dai.ngo@oracle.com wrote: >> On 8/1/23 6:33 AM, Jeff Layton wrote: >>> I noticed that xfstests generic/001 was failing against linux-next nfsd. >>> >>> The client would request a OPEN4_SHARE_ACCESS_WRITE open, and the server >>> would hand out a write delegation. The client would then try to use that >>> write delegation as the source stateid in a COPY >> not sure why the client opens the source file of a COPY operation with >> OPEN4_SHARE_ACCESS_WRITE? >> > It doesn't. The original open is to write the data for the file being > copied. It then opens the file again for READ, but since it has a write > delegation, it doesn't need to talk to the server at all -- it can just > use that stateid for later operations. > >>> or CLONE operation, and >>> the server would respond with NFS4ERR_STALE. >> If the server does not allow client to use write delegation for the >> READ, should the correct error return be NFS4ERR_OPENMODE? >> > The server must allow the client to use a write delegation for read > operations. It's required by the spec, AFAIU. > > The error in this case was just bogus. The vfs copy routine would return > -EBADF since the file didn't have FMODE_READ, and the nfs server would > translate that into NFS4ERR_STALE. > > Probably there is a better v4 error code that we could translate EBADF > to, but with this patch it shouldn't be a problem any longer. > >>> The problem is that the struct file associated with the delegation does >>> not necessarily have read permissions. It's handing out a write >>> delegation on what is effectively an O_WRONLY open. RFC 8881 states: >>> >>> "An OPEN_DELEGATE_WRITE delegation allows the client to handle, on its >>> own, all opens." >>> >>> Given that the client didn't request any read permissions, and that nfsd >>> didn't check for any, it seems wrong to give out a write delegation. >>> >>> Only hand out a write delegation if we have a O_RDWR descriptor >>> available. If it fails to find an appropriate write descriptor, go >>> ahead and try for a read delegation if NFS4_SHARE_ACCESS_READ was >>> requested. >>> >>> This fixes xfstest generic/001. >>> >>> Closes: https://bugzilla.linux-nfs.org/show_bug.cgi?id=412 >>> Signed-off-by: Jeff Layton <jlayton@kernel.org> >>> --- >>> Changes in v2: >>> - Rework the logic when finding struct file for the delegation. The >>> earlier patch might still have attached a O_WRONLY file to the deleg >>> in some cases, and could still have handed out a write delegation on >>> an O_WRONLY OPEN request in some cases. >>> --- >>> fs/nfsd/nfs4state.c | 29 ++++++++++++++++++----------- >>> 1 file changed, 18 insertions(+), 11 deletions(-) >>> >>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c >>> index ef7118ebee00..e79d82fd05e7 100644 >>> --- a/fs/nfsd/nfs4state.c >>> +++ b/fs/nfsd/nfs4state.c >>> @@ -5449,7 +5449,7 @@ nfs4_set_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp, >>> struct nfs4_file *fp = stp->st_stid.sc_file; >>> struct nfs4_clnt_odstate *odstate = stp->st_clnt_odstate; >>> struct nfs4_delegation *dp; >>> - struct nfsd_file *nf; >>> + struct nfsd_file *nf = NULL; >>> struct file_lock *fl; >>> u32 dl_type; >>> >>> @@ -5461,21 +5461,28 @@ nfs4_set_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp, >>> if (fp->fi_had_conflict) >>> return ERR_PTR(-EAGAIN); >>> >>> - if (open->op_share_access & NFS4_SHARE_ACCESS_WRITE) { >>> - nf = find_writeable_file(fp); >>> + /* >>> + * Try for a write delegation first. We need an O_RDWR file >>> + * since a write delegation allows the client to perform any open >>> + * from its cache. >>> + */ >>> + if ((open->op_share_access & NFS4_SHARE_ACCESS_BOTH) == NFS4_SHARE_ACCESS_BOTH) { >>> + nf = nfsd_file_get(fp->fi_fds[O_RDWR]); >>> dl_type = NFS4_OPEN_DELEGATE_WRITE; >>> - } else { >> Does this mean OPEN4_SHARE_ACCESS_WRITE do not get a write delegation? >> It does not seem right. >> >> -Dai >> > Why? Per RFC 8881: > > "An OPEN_DELEGATE_WRITE delegation allows the client to handle, on its > own, all opens." > > All opens. That includes read opens. > > An OPEN4_SHARE_ACCESS_WRITE open will succeed on a file to which the > user has no read permissions. Therefore, we can't grant a write > delegation since can't guarantee that the user is allowed to do that. If the server grants the write delegation on an OPEN with OPEN4_SHARE_ACCESS_WRITE on the file with WR-only access mode then why can't the server checks and denies the subsequent READ? Per RFC 8881, section 9.1.2: For delegation stateids, the access mode is based on the type of delegation. When a READ, WRITE, or SETATTR (that specifies the size attribute) operation is done, the operation is subject to checking against the access mode to verify that the operation is appropriate given the stateid with which the operation is associated. In the case of WRITE-type operations (i.e., WRITEs and SETATTRs that set size), the server MUST verify that the access mode allows writing and MUST return an NFS4ERR_OPENMODE error if it does not. In the case of READ, the server may perform the corresponding check on the access mode, or it may choose to allow READ on OPENs for OPEN4_SHARE_ACCESS_WRITE, to accommodate clients whose WRITE implementation may unavoidably do reads (e.g., due to buffer cache constraints). However, even if READs are allowed in these circumstances, the server MUST still check for locks that conflict with the READ (e.g., another OPEN specified OPEN4_SHARE_DENY_READ or OPEN4_SHARE_DENY_BOTH). Note that a server that does enforce the access mode check on READs need not explicitly check for conflicting share reservations since the existence of OPEN for OPEN4_SHARE_ACCESS_READ guarantees that no conflicting share reservation can exist. FWIW, The Solaris server grants write delegation on OPEN with OPEN4_SHARE_ACCESS_WRITE on file with access mode either RW or WR-only. Maybe this is a bug? or the spec is not clear? It'd would be interesting to know how ONTAP server behaves in this scenario. -Dai > > >>> + } >>> + >>> + /* >>> + * If the file is being opened O_RDONLY or we couldn't get a O_RDWR >>> + * file for some reason, then try for a read deleg instead. >>> + */ >>> + if (!nf && (open->op_share_access & NFS4_SHARE_ACCESS_READ)) { >>> nf = find_readable_file(fp); >>> dl_type = NFS4_OPEN_DELEGATE_READ; >>> } >>> - if (!nf) { >>> - /* >>> - * We probably could attempt another open and get a read >>> - * delegation, but for now, don't bother until the >>> - * client actually sends us one. >>> - */ >>> + >>> + if (!nf) >>> return ERR_PTR(-EAGAIN); >>> - } >>> + >>> spin_lock(&state_lock); >>> spin_lock(&fp->fi_lock); >>> if (nfs4_delegation_exists(clp, fp)) >>> >>> --- >>> base-commit: a734662572708cf062e974f659ae50c24fc1ad17 >>> change-id: 20230731-wdeleg-bbdb6b25a3c6 >>> >>> Best regards,
On Wed, 2023-08-02 at 13:15 -0700, dai.ngo@oracle.com wrote: > On 8/2/23 11:15 AM, Jeff Layton wrote: > > On Wed, 2023-08-02 at 09:29 -0700, dai.ngo@oracle.com wrote: > > > On 8/1/23 6:33 AM, Jeff Layton wrote: > > > > I noticed that xfstests generic/001 was failing against linux-next nfsd. > > > > > > > > The client would request a OPEN4_SHARE_ACCESS_WRITE open, and the server > > > > would hand out a write delegation. The client would then try to use that > > > > write delegation as the source stateid in a COPY > > > not sure why the client opens the source file of a COPY operation with > > > OPEN4_SHARE_ACCESS_WRITE? > > > > > It doesn't. The original open is to write the data for the file being > > copied. It then opens the file again for READ, but since it has a write > > delegation, it doesn't need to talk to the server at all -- it can just > > use that stateid for later operations. > > > > > > or CLONE operation, and > > > > the server would respond with NFS4ERR_STALE. > > > If the server does not allow client to use write delegation for the > > > READ, should the correct error return be NFS4ERR_OPENMODE? > > > > > The server must allow the client to use a write delegation for read > > operations. It's required by the spec, AFAIU. > > > > The error in this case was just bogus. The vfs copy routine would return > > -EBADF since the file didn't have FMODE_READ, and the nfs server would > > translate that into NFS4ERR_STALE. > > > > Probably there is a better v4 error code that we could translate EBADF > > to, but with this patch it shouldn't be a problem any longer. > > > > > > The problem is that the struct file associated with the delegation does > > > > not necessarily have read permissions. It's handing out a write > > > > delegation on what is effectively an O_WRONLY open. RFC 8881 states: > > > > > > > > "An OPEN_DELEGATE_WRITE delegation allows the client to handle, on its > > > > own, all opens." > > > > > > > > Given that the client didn't request any read permissions, and that nfsd > > > > didn't check for any, it seems wrong to give out a write delegation. > > > > > > > > Only hand out a write delegation if we have a O_RDWR descriptor > > > > available. If it fails to find an appropriate write descriptor, go > > > > ahead and try for a read delegation if NFS4_SHARE_ACCESS_READ was > > > > requested. > > > > > > > > This fixes xfstest generic/001. > > > > > > > > Closes: https://bugzilla.linux-nfs.org/show_bug.cgi?id=412 > > > > Signed-off-by: Jeff Layton <jlayton@kernel.org> > > > > --- > > > > Changes in v2: > > > > - Rework the logic when finding struct file for the delegation. The > > > > earlier patch might still have attached a O_WRONLY file to the deleg > > > > in some cases, and could still have handed out a write delegation on > > > > an O_WRONLY OPEN request in some cases. > > > > --- > > > > fs/nfsd/nfs4state.c | 29 ++++++++++++++++++----------- > > > > 1 file changed, 18 insertions(+), 11 deletions(-) > > > > > > > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c > > > > index ef7118ebee00..e79d82fd05e7 100644 > > > > --- a/fs/nfsd/nfs4state.c > > > > +++ b/fs/nfsd/nfs4state.c > > > > @@ -5449,7 +5449,7 @@ nfs4_set_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp, > > > > struct nfs4_file *fp = stp->st_stid.sc_file; > > > > struct nfs4_clnt_odstate *odstate = stp->st_clnt_odstate; > > > > struct nfs4_delegation *dp; > > > > - struct nfsd_file *nf; > > > > + struct nfsd_file *nf = NULL; > > > > struct file_lock *fl; > > > > u32 dl_type; > > > > > > > > @@ -5461,21 +5461,28 @@ nfs4_set_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp, > > > > if (fp->fi_had_conflict) > > > > return ERR_PTR(-EAGAIN); > > > > > > > > - if (open->op_share_access & NFS4_SHARE_ACCESS_WRITE) { > > > > - nf = find_writeable_file(fp); > > > > + /* > > > > + * Try for a write delegation first. We need an O_RDWR file > > > > + * since a write delegation allows the client to perform any open > > > > + * from its cache. > > > > + */ > > > > + if ((open->op_share_access & NFS4_SHARE_ACCESS_BOTH) == NFS4_SHARE_ACCESS_BOTH) { > > > > + nf = nfsd_file_get(fp->fi_fds[O_RDWR]); > > > > dl_type = NFS4_OPEN_DELEGATE_WRITE; > > > > - } else { > > > Does this mean OPEN4_SHARE_ACCESS_WRITE do not get a write delegation? > > > It does not seem right. > > > > > > -Dai > > > > > Why? Per RFC 8881: > > > > "An OPEN_DELEGATE_WRITE delegation allows the client to handle, on its > > own, all opens." > > > > All opens. That includes read opens. > > > > An OPEN4_SHARE_ACCESS_WRITE open will succeed on a file to which the > > user has no read permissions. Therefore, we can't grant a write > > delegation since can't guarantee that the user is allowed to do that. > > If the server grants the write delegation on an OPEN with > OPEN4_SHARE_ACCESS_WRITE on the file with WR-only access mode then > why can't the server checks and denies the subsequent READ? > > Per RFC 8881, section 9.1.2: > > For delegation stateids, the access mode is based on the type of > delegation. > > When a READ, WRITE, or SETATTR (that specifies the size attribute) > operation is done, the operation is subject to checking against the > access mode to verify that the operation is appropriate given the > stateid with which the operation is associated. > > In the case of WRITE-type operations (i.e., WRITEs and SETATTRs that > set size), the server MUST verify that the access mode allows writing > and MUST return an NFS4ERR_OPENMODE error if it does not. In the case > of READ, the server may perform the corresponding check on the access > mode, or it may choose to allow READ on OPENs for OPEN4_SHARE_ACCESS_WRITE, > to accommodate clients whose WRITE implementation may unavoidably do > reads (e.g., due to buffer cache constraints). However, even if READs > are allowed in these circumstances, the server MUST still check for > locks that conflict with the READ (e.g., another OPEN specified > OPEN4_SHARE_DENY_READ or OPEN4_SHARE_DENY_BOTH). Note that a server > that does enforce the access mode check on READs need not explicitly > check for conflicting share reservations since the existence of OPEN > for OPEN4_SHARE_ACCESS_READ guarantees that no conflicting share > reservation can exist. > > FWIW, The Solaris server grants write delegation on OPEN with > OPEN4_SHARE_ACCESS_WRITE on file with access mode either RW or > WR-only. Maybe this is a bug? or the spec is not clear? > I don't think that's necessarily a bug. It's not that the spec demands that we only hand out delegations on BOTH opens. This is more of a quirk of the Linux implementation. Linux' write delegations require an open O_RDWR file descriptor because we may be called upon to do a read on its behalf. Technically, we could probably just have it check for OPEN4_SHARE_ACCESS_WRITE, but in the case where READ isn't also set, then you're unlikely to get a delegation. Either the O_RDWR descriptor will be NULL, or there are other, conflicting opens already present. Solaris may have a completely different design that doesn't require this. I haven't looked at its code to know. > It'd would be interesting to know how ONTAP server behaves in > this scenario. > Indeed. Most likely it behaves more like Solaris does, but it'd nice to know. > > > > > > > > > + } > > > > + > > > > + /* > > > > + * If the file is being opened O_RDONLY or we couldn't get a O_RDWR > > > > + * file for some reason, then try for a read deleg instead. > > > > + */ > > > > + if (!nf && (open->op_share_access & NFS4_SHARE_ACCESS_READ)) { > > > > nf = find_readable_file(fp); > > > > dl_type = NFS4_OPEN_DELEGATE_READ; > > > > } > > > > - if (!nf) { > > > > - /* > > > > - * We probably could attempt another open and get a read > > > > - * delegation, but for now, don't bother until the > > > > - * client actually sends us one. > > > > - */ > > > > + > > > > + if (!nf) > > > > return ERR_PTR(-EAGAIN); > > > > - } > > > > + > > > > spin_lock(&state_lock); > > > > spin_lock(&fp->fi_lock); > > > > if (nfs4_delegation_exists(clp, fp)) > > > > > > > > --- > > > > base-commit: a734662572708cf062e974f659ae50c24fc1ad17 > > > > change-id: 20230731-wdeleg-bbdb6b25a3c6 > > > > > > > > Best regards,
> On Aug 2, 2023, at 4:48 PM, Jeff Layton <jlayton@kernel.org> wrote: > > On Wed, 2023-08-02 at 13:15 -0700, dai.ngo@oracle.com wrote: >> On 8/2/23 11:15 AM, Jeff Layton wrote: >>> On Wed, 2023-08-02 at 09:29 -0700, dai.ngo@oracle.com wrote: >>>> On 8/1/23 6:33 AM, Jeff Layton wrote: >>>>> I noticed that xfstests generic/001 was failing against linux-next nfsd. >>>>> >>>>> The client would request a OPEN4_SHARE_ACCESS_WRITE open, and the server >>>>> would hand out a write delegation. The client would then try to use that >>>>> write delegation as the source stateid in a COPY >>>> not sure why the client opens the source file of a COPY operation with >>>> OPEN4_SHARE_ACCESS_WRITE? >>>> >>> It doesn't. The original open is to write the data for the file being >>> copied. It then opens the file again for READ, but since it has a write >>> delegation, it doesn't need to talk to the server at all -- it can just >>> use that stateid for later operations. >>> >>>>> or CLONE operation, and >>>>> the server would respond with NFS4ERR_STALE. >>>> If the server does not allow client to use write delegation for the >>>> READ, should the correct error return be NFS4ERR_OPENMODE? >>>> >>> The server must allow the client to use a write delegation for read >>> operations. It's required by the spec, AFAIU. >>> >>> The error in this case was just bogus. The vfs copy routine would return >>> -EBADF since the file didn't have FMODE_READ, and the nfs server would >>> translate that into NFS4ERR_STALE. >>> >>> Probably there is a better v4 error code that we could translate EBADF >>> to, but with this patch it shouldn't be a problem any longer. >>> >>>>> The problem is that the struct file associated with the delegation does >>>>> not necessarily have read permissions. It's handing out a write >>>>> delegation on what is effectively an O_WRONLY open. RFC 8881 states: >>>>> >>>>> "An OPEN_DELEGATE_WRITE delegation allows the client to handle, on its >>>>> own, all opens." >>>>> >>>>> Given that the client didn't request any read permissions, and that nfsd >>>>> didn't check for any, it seems wrong to give out a write delegation. >>>>> >>>>> Only hand out a write delegation if we have a O_RDWR descriptor >>>>> available. If it fails to find an appropriate write descriptor, go >>>>> ahead and try for a read delegation if NFS4_SHARE_ACCESS_READ was >>>>> requested. >>>>> >>>>> This fixes xfstest generic/001. >>>>> >>>>> Closes: https://bugzilla.linux-nfs.org/show_bug.cgi?id=412 >>>>> Signed-off-by: Jeff Layton <jlayton@kernel.org> >>>>> --- >>>>> Changes in v2: >>>>> - Rework the logic when finding struct file for the delegation. The >>>>> earlier patch might still have attached a O_WRONLY file to the deleg >>>>> in some cases, and could still have handed out a write delegation on >>>>> an O_WRONLY OPEN request in some cases. >>>>> --- >>>>> fs/nfsd/nfs4state.c | 29 ++++++++++++++++++----------- >>>>> 1 file changed, 18 insertions(+), 11 deletions(-) >>>>> >>>>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c >>>>> index ef7118ebee00..e79d82fd05e7 100644 >>>>> --- a/fs/nfsd/nfs4state.c >>>>> +++ b/fs/nfsd/nfs4state.c >>>>> @@ -5449,7 +5449,7 @@ nfs4_set_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp, >>>>> struct nfs4_file *fp = stp->st_stid.sc_file; >>>>> struct nfs4_clnt_odstate *odstate = stp->st_clnt_odstate; >>>>> struct nfs4_delegation *dp; >>>>> - struct nfsd_file *nf; >>>>> + struct nfsd_file *nf = NULL; >>>>> struct file_lock *fl; >>>>> u32 dl_type; >>>>> >>>>> @@ -5461,21 +5461,28 @@ nfs4_set_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp, >>>>> if (fp->fi_had_conflict) >>>>> return ERR_PTR(-EAGAIN); >>>>> >>>>> - if (open->op_share_access & NFS4_SHARE_ACCESS_WRITE) { >>>>> - nf = find_writeable_file(fp); >>>>> + /* >>>>> + * Try for a write delegation first. We need an O_RDWR file >>>>> + * since a write delegation allows the client to perform any open >>>>> + * from its cache. >>>>> + */ >>>>> + if ((open->op_share_access & NFS4_SHARE_ACCESS_BOTH) == NFS4_SHARE_ACCESS_BOTH) { >>>>> + nf = nfsd_file_get(fp->fi_fds[O_RDWR]); >>>>> dl_type = NFS4_OPEN_DELEGATE_WRITE; >>>>> - } else { >>>> Does this mean OPEN4_SHARE_ACCESS_WRITE do not get a write delegation? >>>> It does not seem right. >>>> >>>> -Dai >>>> >>> Why? Per RFC 8881: >>> >>> "An OPEN_DELEGATE_WRITE delegation allows the client to handle, on its >>> own, all opens." >>> >>> All opens. That includes read opens. >>> >>> An OPEN4_SHARE_ACCESS_WRITE open will succeed on a file to which the >>> user has no read permissions. Therefore, we can't grant a write >>> delegation since can't guarantee that the user is allowed to do that. >> >> If the server grants the write delegation on an OPEN with >> OPEN4_SHARE_ACCESS_WRITE on the file with WR-only access mode then >> why can't the server checks and denies the subsequent READ? >> >> Per RFC 8881, section 9.1.2: >> >> For delegation stateids, the access mode is based on the type of >> delegation. >> >> When a READ, WRITE, or SETATTR (that specifies the size attribute) >> operation is done, the operation is subject to checking against the >> access mode to verify that the operation is appropriate given the >> stateid with which the operation is associated. >> >> In the case of WRITE-type operations (i.e., WRITEs and SETATTRs that >> set size), the server MUST verify that the access mode allows writing >> and MUST return an NFS4ERR_OPENMODE error if it does not. In the case >> of READ, the server may perform the corresponding check on the access >> mode, or it may choose to allow READ on OPENs for OPEN4_SHARE_ACCESS_WRITE, >> to accommodate clients whose WRITE implementation may unavoidably do >> reads (e.g., due to buffer cache constraints). However, even if READs >> are allowed in these circumstances, the server MUST still check for >> locks that conflict with the READ (e.g., another OPEN specified >> OPEN4_SHARE_DENY_READ or OPEN4_SHARE_DENY_BOTH). Note that a server >> that does enforce the access mode check on READs need not explicitly >> check for conflicting share reservations since the existence of OPEN >> for OPEN4_SHARE_ACCESS_READ guarantees that no conflicting share >> reservation can exist. >> >> FWIW, The Solaris server grants write delegation on OPEN with >> OPEN4_SHARE_ACCESS_WRITE on file with access mode either RW or >> WR-only. Maybe this is a bug? or the spec is not clear? >> > > I don't think that's necessarily a bug. > > It's not that the spec demands that we only hand out delegations on BOTH > opens. This is more of a quirk of the Linux implementation. Linux' > write delegations require an open O_RDWR file descriptor because we may > be called upon to do a read on its behalf. > > Technically, we could probably just have it check for > OPEN4_SHARE_ACCESS_WRITE, but in the case where READ isn't also set, > then you're unlikely to get a delegation. Either the O_RDWR descriptor > will be NULL, or there are other, conflicting opens already present. > > Solaris may have a completely different design that doesn't require > this. I haven't looked at its code to know. I'm comfortable for now with not handing out write delegations for SHARE_ACCESS_WRITE opens. I prefer that to permission checking on every READ operation. If we find that it's a significant performance issue, we can revisit. >> It'd would be interesting to know how ONTAP server behaves in >> this scenario. >> > > Indeed. Most likely it behaves more like Solaris does, but it'd nice to > know. > >> >>> >>> >>>>> + } >>>>> + >>>>> + /* >>>>> + * If the file is being opened O_RDONLY or we couldn't get a O_RDWR >>>>> + * file for some reason, then try for a read deleg instead. >>>>> + */ >>>>> + if (!nf && (open->op_share_access & NFS4_SHARE_ACCESS_READ)) { >>>>> nf = find_readable_file(fp); >>>>> dl_type = NFS4_OPEN_DELEGATE_READ; >>>>> } >>>>> - if (!nf) { >>>>> - /* >>>>> - * We probably could attempt another open and get a read >>>>> - * delegation, but for now, don't bother until the >>>>> - * client actually sends us one. >>>>> - */ >>>>> + >>>>> + if (!nf) >>>>> return ERR_PTR(-EAGAIN); >>>>> - } >>>>> + >>>>> spin_lock(&state_lock); >>>>> spin_lock(&fp->fi_lock); >>>>> if (nfs4_delegation_exists(clp, fp)) >>>>> >>>>> --- >>>>> base-commit: a734662572708cf062e974f659ae50c24fc1ad17 >>>>> change-id: 20230731-wdeleg-bbdb6b25a3c6 >>>>> >>>>> Best regards, > > -- > Jeff Layton <jlayton@kernel.org> -- Chuck Lever
On Wed, 2023-08-02 at 20:57 +0000, Chuck Lever III wrote: > > > On Aug 2, 2023, at 4:48 PM, Jeff Layton <jlayton@kernel.org> wrote: > > > > On Wed, 2023-08-02 at 13:15 -0700, dai.ngo@oracle.com wrote: > > > On 8/2/23 11:15 AM, Jeff Layton wrote: > > > > On Wed, 2023-08-02 at 09:29 -0700, dai.ngo@oracle.com wrote: > > > > > On 8/1/23 6:33 AM, Jeff Layton wrote: > > > > > > I noticed that xfstests generic/001 was failing against linux-next nfsd. > > > > > > > > > > > > The client would request a OPEN4_SHARE_ACCESS_WRITE open, and the server > > > > > > would hand out a write delegation. The client would then try to use that > > > > > > write delegation as the source stateid in a COPY > > > > > not sure why the client opens the source file of a COPY operation with > > > > > OPEN4_SHARE_ACCESS_WRITE? > > > > > > > > > It doesn't. The original open is to write the data for the file being > > > > copied. It then opens the file again for READ, but since it has a write > > > > delegation, it doesn't need to talk to the server at all -- it can just > > > > use that stateid for later operations. > > > > > > > > > > or CLONE operation, and > > > > > > the server would respond with NFS4ERR_STALE. > > > > > If the server does not allow client to use write delegation for the > > > > > READ, should the correct error return be NFS4ERR_OPENMODE? > > > > > > > > > The server must allow the client to use a write delegation for read > > > > operations. It's required by the spec, AFAIU. > > > > > > > > The error in this case was just bogus. The vfs copy routine would return > > > > -EBADF since the file didn't have FMODE_READ, and the nfs server would > > > > translate that into NFS4ERR_STALE. > > > > > > > > Probably there is a better v4 error code that we could translate EBADF > > > > to, but with this patch it shouldn't be a problem any longer. > > > > > > > > > > The problem is that the struct file associated with the delegation does > > > > > > not necessarily have read permissions. It's handing out a write > > > > > > delegation on what is effectively an O_WRONLY open. RFC 8881 states: > > > > > > > > > > > > "An OPEN_DELEGATE_WRITE delegation allows the client to handle, on its > > > > > > own, all opens." > > > > > > > > > > > > Given that the client didn't request any read permissions, and that nfsd > > > > > > didn't check for any, it seems wrong to give out a write delegation. > > > > > > > > > > > > Only hand out a write delegation if we have a O_RDWR descriptor > > > > > > available. If it fails to find an appropriate write descriptor, go > > > > > > ahead and try for a read delegation if NFS4_SHARE_ACCESS_READ was > > > > > > requested. > > > > > > > > > > > > This fixes xfstest generic/001. > > > > > > > > > > > > Closes: https://bugzilla.linux-nfs.org/show_bug.cgi?id=412 > > > > > > Signed-off-by: Jeff Layton <jlayton@kernel.org> > > > > > > --- > > > > > > Changes in v2: > > > > > > - Rework the logic when finding struct file for the delegation. The > > > > > > earlier patch might still have attached a O_WRONLY file to the deleg > > > > > > in some cases, and could still have handed out a write delegation on > > > > > > an O_WRONLY OPEN request in some cases. > > > > > > --- > > > > > > fs/nfsd/nfs4state.c | 29 ++++++++++++++++++----------- > > > > > > 1 file changed, 18 insertions(+), 11 deletions(-) > > > > > > > > > > > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c > > > > > > index ef7118ebee00..e79d82fd05e7 100644 > > > > > > --- a/fs/nfsd/nfs4state.c > > > > > > +++ b/fs/nfsd/nfs4state.c > > > > > > @@ -5449,7 +5449,7 @@ nfs4_set_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp, > > > > > > struct nfs4_file *fp = stp->st_stid.sc_file; > > > > > > struct nfs4_clnt_odstate *odstate = stp->st_clnt_odstate; > > > > > > struct nfs4_delegation *dp; > > > > > > - struct nfsd_file *nf; > > > > > > + struct nfsd_file *nf = NULL; > > > > > > struct file_lock *fl; > > > > > > u32 dl_type; > > > > > > > > > > > > @@ -5461,21 +5461,28 @@ nfs4_set_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp, > > > > > > if (fp->fi_had_conflict) > > > > > > return ERR_PTR(-EAGAIN); > > > > > > > > > > > > - if (open->op_share_access & NFS4_SHARE_ACCESS_WRITE) { > > > > > > - nf = find_writeable_file(fp); > > > > > > + /* > > > > > > + * Try for a write delegation first. We need an O_RDWR file > > > > > > + * since a write delegation allows the client to perform any open > > > > > > + * from its cache. > > > > > > + */ > > > > > > + if ((open->op_share_access & NFS4_SHARE_ACCESS_BOTH) == NFS4_SHARE_ACCESS_BOTH) { > > > > > > + nf = nfsd_file_get(fp->fi_fds[O_RDWR]); > > > > > > dl_type = NFS4_OPEN_DELEGATE_WRITE; > > > > > > - } else { > > > > > Does this mean OPEN4_SHARE_ACCESS_WRITE do not get a write delegation? > > > > > It does not seem right. > > > > > > > > > > -Dai > > > > > > > > > Why? Per RFC 8881: > > > > > > > > "An OPEN_DELEGATE_WRITE delegation allows the client to handle, on its > > > > own, all opens." > > > > > > > > All opens. That includes read opens. > > > > > > > > An OPEN4_SHARE_ACCESS_WRITE open will succeed on a file to which the > > > > user has no read permissions. Therefore, we can't grant a write > > > > delegation since can't guarantee that the user is allowed to do that. > > > > > > If the server grants the write delegation on an OPEN with > > > OPEN4_SHARE_ACCESS_WRITE on the file with WR-only access mode then > > > why can't the server checks and denies the subsequent READ? > > > > > > Per RFC 8881, section 9.1.2: > > > > > > For delegation stateids, the access mode is based on the type of > > > delegation. > > > > > > When a READ, WRITE, or SETATTR (that specifies the size attribute) > > > operation is done, the operation is subject to checking against the > > > access mode to verify that the operation is appropriate given the > > > stateid with which the operation is associated. > > > > > > In the case of WRITE-type operations (i.e., WRITEs and SETATTRs that > > > set size), the server MUST verify that the access mode allows writing > > > and MUST return an NFS4ERR_OPENMODE error if it does not. In the case > > > of READ, the server may perform the corresponding check on the access > > > mode, or it may choose to allow READ on OPENs for OPEN4_SHARE_ACCESS_WRITE, > > > to accommodate clients whose WRITE implementation may unavoidably do > > > reads (e.g., due to buffer cache constraints). However, even if READs > > > are allowed in these circumstances, the server MUST still check for > > > locks that conflict with the READ (e.g., another OPEN specified > > > OPEN4_SHARE_DENY_READ or OPEN4_SHARE_DENY_BOTH). Note that a server > > > that does enforce the access mode check on READs need not explicitly > > > check for conflicting share reservations since the existence of OPEN > > > for OPEN4_SHARE_ACCESS_READ guarantees that no conflicting share > > > reservation can exist. > > > > > > FWIW, The Solaris server grants write delegation on OPEN with > > > OPEN4_SHARE_ACCESS_WRITE on file with access mode either RW or > > > WR-only. Maybe this is a bug? or the spec is not clear? > > > > > > > I don't think that's necessarily a bug. > > > > It's not that the spec demands that we only hand out delegations on BOTH > > opens. This is more of a quirk of the Linux implementation. Linux' > > write delegations require an open O_RDWR file descriptor because we may > > be called upon to do a read on its behalf. > > > > Technically, we could probably just have it check for > > OPEN4_SHARE_ACCESS_WRITE, but in the case where READ isn't also set, > > then you're unlikely to get a delegation. Either the O_RDWR descriptor > > will be NULL, or there are other, conflicting opens already present. > > > > Solaris may have a completely different design that doesn't require > > this. I haven't looked at its code to know. > > I'm comfortable for now with not handing out write delegations for > SHARE_ACCESS_WRITE opens. I prefer that to permission checking on > every READ operation. > > If we find that it's a significant performance issue, we can revisit. > > Yeah. The thing to remember here is that delegations are optional. We can always say no. One thing we could consider to allow this is trying to open O_RDWR first, and then only fall back to doing an O_WRONLY open if that fails. I'm not sure how that would work out in practice though. One thing that'd be interesting to know with Solaris (and maybe OnTap) is whether they will give you a write delegation for an O_WRONLY open when you don't have any read permissions on the file. If they do, then is the client expected to do permissions enforcement for the cached open and reject local openers for read? I guess I ought to be looking at the Linux client code for this... > > > It'd would be interesting to know how ONTAP server behaves in > > > this scenario. > > > > > > > Indeed. Most likely it behaves more like Solaris does, but it'd nice to > > know. > > > > > > > > > > > > > > > > > > > + } > > > > > > + > > > > > > + /* > > > > > > + * If the file is being opened O_RDONLY or we couldn't get a O_RDWR > > > > > > + * file for some reason, then try for a read deleg instead. > > > > > > + */ > > > > > > + if (!nf && (open->op_share_access & NFS4_SHARE_ACCESS_READ)) { > > > > > > nf = find_readable_file(fp); > > > > > > dl_type = NFS4_OPEN_DELEGATE_READ; > > > > > > } > > > > > > - if (!nf) { > > > > > > - /* > > > > > > - * We probably could attempt another open and get a read > > > > > > - * delegation, but for now, don't bother until the > > > > > > - * client actually sends us one. > > > > > > - */ > > > > > > + > > > > > > + if (!nf) > > > > > > return ERR_PTR(-EAGAIN); > > > > > > - } > > > > > > + > > > > > > spin_lock(&state_lock); > > > > > > spin_lock(&fp->fi_lock); > > > > > > if (nfs4_delegation_exists(clp, fp)) > > > > > > > > > > > > --- > > > > > > base-commit: a734662572708cf062e974f659ae50c24fc1ad17 > > > > > > change-id: 20230731-wdeleg-bbdb6b25a3c6 > > > > > > > > > > > > Best regards, > > > > -- > > Jeff Layton <jlayton@kernel.org> > > -- > Chuck Lever > >
On 8/2/23 1:57 PM, Chuck Lever III wrote: > >> On Aug 2, 2023, at 4:48 PM, Jeff Layton <jlayton@kernel.org> wrote: >> >> On Wed, 2023-08-02 at 13:15 -0700, dai.ngo@oracle.com wrote: >>> On 8/2/23 11:15 AM, Jeff Layton wrote: >>>> On Wed, 2023-08-02 at 09:29 -0700, dai.ngo@oracle.com wrote: >>>>> On 8/1/23 6:33 AM, Jeff Layton wrote: >>>>>> I noticed that xfstests generic/001 was failing against linux-next nfsd. >>>>>> >>>>>> The client would request a OPEN4_SHARE_ACCESS_WRITE open, and the server >>>>>> would hand out a write delegation. The client would then try to use that >>>>>> write delegation as the source stateid in a COPY >>>>> not sure why the client opens the source file of a COPY operation with >>>>> OPEN4_SHARE_ACCESS_WRITE? >>>>> >>>> It doesn't. The original open is to write the data for the file being >>>> copied. It then opens the file again for READ, but since it has a write >>>> delegation, it doesn't need to talk to the server at all -- it can just >>>> use that stateid for later operations. >>>> >>>>>> or CLONE operation, and >>>>>> the server would respond with NFS4ERR_STALE. >>>>> If the server does not allow client to use write delegation for the >>>>> READ, should the correct error return be NFS4ERR_OPENMODE? >>>>> >>>> The server must allow the client to use a write delegation for read >>>> operations. It's required by the spec, AFAIU. >>>> >>>> The error in this case was just bogus. The vfs copy routine would return >>>> -EBADF since the file didn't have FMODE_READ, and the nfs server would >>>> translate that into NFS4ERR_STALE. >>>> >>>> Probably there is a better v4 error code that we could translate EBADF >>>> to, but with this patch it shouldn't be a problem any longer. >>>> >>>>>> The problem is that the struct file associated with the delegation does >>>>>> not necessarily have read permissions. It's handing out a write >>>>>> delegation on what is effectively an O_WRONLY open. RFC 8881 states: >>>>>> >>>>>> "An OPEN_DELEGATE_WRITE delegation allows the client to handle, on its >>>>>> own, all opens." >>>>>> >>>>>> Given that the client didn't request any read permissions, and that nfsd >>>>>> didn't check for any, it seems wrong to give out a write delegation. >>>>>> >>>>>> Only hand out a write delegation if we have a O_RDWR descriptor >>>>>> available. If it fails to find an appropriate write descriptor, go >>>>>> ahead and try for a read delegation if NFS4_SHARE_ACCESS_READ was >>>>>> requested. >>>>>> >>>>>> This fixes xfstest generic/001. >>>>>> >>>>>> Closes: https://bugzilla.linux-nfs.org/show_bug.cgi?id=412 >>>>>> Signed-off-by: Jeff Layton <jlayton@kernel.org> >>>>>> --- >>>>>> Changes in v2: >>>>>> - Rework the logic when finding struct file for the delegation. The >>>>>> earlier patch might still have attached a O_WRONLY file to the deleg >>>>>> in some cases, and could still have handed out a write delegation on >>>>>> an O_WRONLY OPEN request in some cases. >>>>>> --- >>>>>> fs/nfsd/nfs4state.c | 29 ++++++++++++++++++----------- >>>>>> 1 file changed, 18 insertions(+), 11 deletions(-) >>>>>> >>>>>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c >>>>>> index ef7118ebee00..e79d82fd05e7 100644 >>>>>> --- a/fs/nfsd/nfs4state.c >>>>>> +++ b/fs/nfsd/nfs4state.c >>>>>> @@ -5449,7 +5449,7 @@ nfs4_set_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp, >>>>>> struct nfs4_file *fp = stp->st_stid.sc_file; >>>>>> struct nfs4_clnt_odstate *odstate = stp->st_clnt_odstate; >>>>>> struct nfs4_delegation *dp; >>>>>> - struct nfsd_file *nf; >>>>>> + struct nfsd_file *nf = NULL; >>>>>> struct file_lock *fl; >>>>>> u32 dl_type; >>>>>> >>>>>> @@ -5461,21 +5461,28 @@ nfs4_set_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp, >>>>>> if (fp->fi_had_conflict) >>>>>> return ERR_PTR(-EAGAIN); >>>>>> >>>>>> - if (open->op_share_access & NFS4_SHARE_ACCESS_WRITE) { >>>>>> - nf = find_writeable_file(fp); >>>>>> + /* >>>>>> + * Try for a write delegation first. We need an O_RDWR file >>>>>> + * since a write delegation allows the client to perform any open >>>>>> + * from its cache. >>>>>> + */ >>>>>> + if ((open->op_share_access & NFS4_SHARE_ACCESS_BOTH) == NFS4_SHARE_ACCESS_BOTH) { >>>>>> + nf = nfsd_file_get(fp->fi_fds[O_RDWR]); >>>>>> dl_type = NFS4_OPEN_DELEGATE_WRITE; >>>>>> - } else { >>>>> Does this mean OPEN4_SHARE_ACCESS_WRITE do not get a write delegation? >>>>> It does not seem right. >>>>> >>>>> -Dai >>>>> >>>> Why? Per RFC 8881: >>>> >>>> "An OPEN_DELEGATE_WRITE delegation allows the client to handle, on its >>>> own, all opens." >>>> >>>> All opens. That includes read opens. >>>> >>>> An OPEN4_SHARE_ACCESS_WRITE open will succeed on a file to which the >>>> user has no read permissions. Therefore, we can't grant a write >>>> delegation since can't guarantee that the user is allowed to do that. >>> If the server grants the write delegation on an OPEN with >>> OPEN4_SHARE_ACCESS_WRITE on the file with WR-only access mode then >>> why can't the server checks and denies the subsequent READ? >>> >>> Per RFC 8881, section 9.1.2: >>> >>> For delegation stateids, the access mode is based on the type of >>> delegation. >>> >>> When a READ, WRITE, or SETATTR (that specifies the size attribute) >>> operation is done, the operation is subject to checking against the >>> access mode to verify that the operation is appropriate given the >>> stateid with which the operation is associated. >>> >>> In the case of WRITE-type operations (i.e., WRITEs and SETATTRs that >>> set size), the server MUST verify that the access mode allows writing >>> and MUST return an NFS4ERR_OPENMODE error if it does not. In the case >>> of READ, the server may perform the corresponding check on the access >>> mode, or it may choose to allow READ on OPENs for OPEN4_SHARE_ACCESS_WRITE, >>> to accommodate clients whose WRITE implementation may unavoidably do >>> reads (e.g., due to buffer cache constraints). However, even if READs >>> are allowed in these circumstances, the server MUST still check for >>> locks that conflict with the READ (e.g., another OPEN specified >>> OPEN4_SHARE_DENY_READ or OPEN4_SHARE_DENY_BOTH). Note that a server >>> that does enforce the access mode check on READs need not explicitly >>> check for conflicting share reservations since the existence of OPEN >>> for OPEN4_SHARE_ACCESS_READ guarantees that no conflicting share >>> reservation can exist. >>> >>> FWIW, The Solaris server grants write delegation on OPEN with >>> OPEN4_SHARE_ACCESS_WRITE on file with access mode either RW or >>> WR-only. Maybe this is a bug? or the spec is not clear? >>> >> I don't think that's necessarily a bug. >> >> It's not that the spec demands that we only hand out delegations on BOTH >> opens. This is more of a quirk of the Linux implementation. Linux' >> write delegations require an open O_RDWR file descriptor because we may >> be called upon to do a read on its behalf. >> >> Technically, we could probably just have it check for >> OPEN4_SHARE_ACCESS_WRITE, but in the case where READ isn't also set, >> then you're unlikely to get a delegation. Either the O_RDWR descriptor >> will be NULL, or there are other, conflicting opens already present. >> >> Solaris may have a completely different design that doesn't require >> this. I haven't looked at its code to know. > I'm comfortable for now with not handing out write delegations for > SHARE_ACCESS_WRITE opens. I prefer that to permission checking on > every READ operation. I'm fine with just handling out write delegation for SHARE_ACCESS_BOTH only. Just a concern about not checking for access at the time of READ operation. If the file was opened with SHARE_ACCESS_WRITE (no write delegation granted) and the file access mode was changed to read-only, is it a correct behavior for the server to allow the READ to go through? -Dai > > If we find that it's a significant performance issue, we can revisit. > > >>> It'd would be interesting to know how ONTAP server behaves in >>> this scenario. >>> >> Indeed. Most likely it behaves more like Solaris does, but it'd nice to >> know. >> >>>> >>>>>> + } >>>>>> + >>>>>> + /* >>>>>> + * If the file is being opened O_RDONLY or we couldn't get a O_RDWR >>>>>> + * file for some reason, then try for a read deleg instead. >>>>>> + */ >>>>>> + if (!nf && (open->op_share_access & NFS4_SHARE_ACCESS_READ)) { >>>>>> nf = find_readable_file(fp); >>>>>> dl_type = NFS4_OPEN_DELEGATE_READ; >>>>>> } >>>>>> - if (!nf) { >>>>>> - /* >>>>>> - * We probably could attempt another open and get a read >>>>>> - * delegation, but for now, don't bother until the >>>>>> - * client actually sends us one. >>>>>> - */ >>>>>> + >>>>>> + if (!nf) >>>>>> return ERR_PTR(-EAGAIN); >>>>>> - } >>>>>> + >>>>>> spin_lock(&state_lock); >>>>>> spin_lock(&fp->fi_lock); >>>>>> if (nfs4_delegation_exists(clp, fp)) >>>>>> >>>>>> --- >>>>>> base-commit: a734662572708cf062e974f659ae50c24fc1ad17 >>>>>> change-id: 20230731-wdeleg-bbdb6b25a3c6 >>>>>> >>>>>> Best regards, >> -- >> Jeff Layton <jlayton@kernel.org> > -- > Chuck Lever > >
On 8/2/23 2:13 PM, Jeff Layton wrote: > On Wed, 2023-08-02 at 20:57 +0000, Chuck Lever III wrote: >>> On Aug 2, 2023, at 4:48 PM, Jeff Layton <jlayton@kernel.org> wrote: >>> >>> On Wed, 2023-08-02 at 13:15 -0700, dai.ngo@oracle.com wrote: >>>> On 8/2/23 11:15 AM, Jeff Layton wrote: >>>>> On Wed, 2023-08-02 at 09:29 -0700, dai.ngo@oracle.com wrote: >>>>>> On 8/1/23 6:33 AM, Jeff Layton wrote: >>>>>>> I noticed that xfstests generic/001 was failing against linux-next nfsd. >>>>>>> >>>>>>> The client would request a OPEN4_SHARE_ACCESS_WRITE open, and the server >>>>>>> would hand out a write delegation. The client would then try to use that >>>>>>> write delegation as the source stateid in a COPY >>>>>> not sure why the client opens the source file of a COPY operation with >>>>>> OPEN4_SHARE_ACCESS_WRITE? >>>>>> >>>>> It doesn't. The original open is to write the data for the file being >>>>> copied. It then opens the file again for READ, but since it has a write >>>>> delegation, it doesn't need to talk to the server at all -- it can just >>>>> use that stateid for later operations. >>>>> >>>>>>> or CLONE operation, and >>>>>>> the server would respond with NFS4ERR_STALE. >>>>>> If the server does not allow client to use write delegation for the >>>>>> READ, should the correct error return be NFS4ERR_OPENMODE? >>>>>> >>>>> The server must allow the client to use a write delegation for read >>>>> operations. It's required by the spec, AFAIU. >>>>> >>>>> The error in this case was just bogus. The vfs copy routine would return >>>>> -EBADF since the file didn't have FMODE_READ, and the nfs server would >>>>> translate that into NFS4ERR_STALE. >>>>> >>>>> Probably there is a better v4 error code that we could translate EBADF >>>>> to, but with this patch it shouldn't be a problem any longer. >>>>> >>>>>>> The problem is that the struct file associated with the delegation does >>>>>>> not necessarily have read permissions. It's handing out a write >>>>>>> delegation on what is effectively an O_WRONLY open. RFC 8881 states: >>>>>>> >>>>>>> "An OPEN_DELEGATE_WRITE delegation allows the client to handle, on its >>>>>>> own, all opens." >>>>>>> >>>>>>> Given that the client didn't request any read permissions, and that nfsd >>>>>>> didn't check for any, it seems wrong to give out a write delegation. >>>>>>> >>>>>>> Only hand out a write delegation if we have a O_RDWR descriptor >>>>>>> available. If it fails to find an appropriate write descriptor, go >>>>>>> ahead and try for a read delegation if NFS4_SHARE_ACCESS_READ was >>>>>>> requested. >>>>>>> >>>>>>> This fixes xfstest generic/001. >>>>>>> >>>>>>> Closes: https://bugzilla.linux-nfs.org/show_bug.cgi?id=412 >>>>>>> Signed-off-by: Jeff Layton <jlayton@kernel.org> >>>>>>> --- >>>>>>> Changes in v2: >>>>>>> - Rework the logic when finding struct file for the delegation. The >>>>>>> earlier patch might still have attached a O_WRONLY file to the deleg >>>>>>> in some cases, and could still have handed out a write delegation on >>>>>>> an O_WRONLY OPEN request in some cases. >>>>>>> --- >>>>>>> fs/nfsd/nfs4state.c | 29 ++++++++++++++++++----------- >>>>>>> 1 file changed, 18 insertions(+), 11 deletions(-) >>>>>>> >>>>>>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c >>>>>>> index ef7118ebee00..e79d82fd05e7 100644 >>>>>>> --- a/fs/nfsd/nfs4state.c >>>>>>> +++ b/fs/nfsd/nfs4state.c >>>>>>> @@ -5449,7 +5449,7 @@ nfs4_set_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp, >>>>>>> struct nfs4_file *fp = stp->st_stid.sc_file; >>>>>>> struct nfs4_clnt_odstate *odstate = stp->st_clnt_odstate; >>>>>>> struct nfs4_delegation *dp; >>>>>>> - struct nfsd_file *nf; >>>>>>> + struct nfsd_file *nf = NULL; >>>>>>> struct file_lock *fl; >>>>>>> u32 dl_type; >>>>>>> >>>>>>> @@ -5461,21 +5461,28 @@ nfs4_set_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp, >>>>>>> if (fp->fi_had_conflict) >>>>>>> return ERR_PTR(-EAGAIN); >>>>>>> >>>>>>> - if (open->op_share_access & NFS4_SHARE_ACCESS_WRITE) { >>>>>>> - nf = find_writeable_file(fp); >>>>>>> + /* >>>>>>> + * Try for a write delegation first. We need an O_RDWR file >>>>>>> + * since a write delegation allows the client to perform any open >>>>>>> + * from its cache. >>>>>>> + */ >>>>>>> + if ((open->op_share_access & NFS4_SHARE_ACCESS_BOTH) == NFS4_SHARE_ACCESS_BOTH) { >>>>>>> + nf = nfsd_file_get(fp->fi_fds[O_RDWR]); >>>>>>> dl_type = NFS4_OPEN_DELEGATE_WRITE; >>>>>>> - } else { >>>>>> Does this mean OPEN4_SHARE_ACCESS_WRITE do not get a write delegation? >>>>>> It does not seem right. >>>>>> >>>>>> -Dai >>>>>> >>>>> Why? Per RFC 8881: >>>>> >>>>> "An OPEN_DELEGATE_WRITE delegation allows the client to handle, on its >>>>> own, all opens." >>>>> >>>>> All opens. That includes read opens. >>>>> >>>>> An OPEN4_SHARE_ACCESS_WRITE open will succeed on a file to which the >>>>> user has no read permissions. Therefore, we can't grant a write >>>>> delegation since can't guarantee that the user is allowed to do that. >>>> If the server grants the write delegation on an OPEN with >>>> OPEN4_SHARE_ACCESS_WRITE on the file with WR-only access mode then >>>> why can't the server checks and denies the subsequent READ? >>>> >>>> Per RFC 8881, section 9.1.2: >>>> >>>> For delegation stateids, the access mode is based on the type of >>>> delegation. >>>> >>>> When a READ, WRITE, or SETATTR (that specifies the size attribute) >>>> operation is done, the operation is subject to checking against the >>>> access mode to verify that the operation is appropriate given the >>>> stateid with which the operation is associated. >>>> >>>> In the case of WRITE-type operations (i.e., WRITEs and SETATTRs that >>>> set size), the server MUST verify that the access mode allows writing >>>> and MUST return an NFS4ERR_OPENMODE error if it does not. In the case >>>> of READ, the server may perform the corresponding check on the access >>>> mode, or it may choose to allow READ on OPENs for OPEN4_SHARE_ACCESS_WRITE, >>>> to accommodate clients whose WRITE implementation may unavoidably do >>>> reads (e.g., due to buffer cache constraints). However, even if READs >>>> are allowed in these circumstances, the server MUST still check for >>>> locks that conflict with the READ (e.g., another OPEN specified >>>> OPEN4_SHARE_DENY_READ or OPEN4_SHARE_DENY_BOTH). Note that a server >>>> that does enforce the access mode check on READs need not explicitly >>>> check for conflicting share reservations since the existence of OPEN >>>> for OPEN4_SHARE_ACCESS_READ guarantees that no conflicting share >>>> reservation can exist. >>>> >>>> FWIW, The Solaris server grants write delegation on OPEN with >>>> OPEN4_SHARE_ACCESS_WRITE on file with access mode either RW or >>>> WR-only. Maybe this is a bug? or the spec is not clear? >>>> >>> I don't think that's necessarily a bug. >>> >>> It's not that the spec demands that we only hand out delegations on BOTH >>> opens. This is more of a quirk of the Linux implementation. Linux' >>> write delegations require an open O_RDWR file descriptor because we may >>> be called upon to do a read on its behalf. >>> >>> Technically, we could probably just have it check for >>> OPEN4_SHARE_ACCESS_WRITE, but in the case where READ isn't also set, >>> then you're unlikely to get a delegation. Either the O_RDWR descriptor >>> will be NULL, or there are other, conflicting opens already present. >>> >>> Solaris may have a completely different design that doesn't require >>> this. I haven't looked at its code to know. >> I'm comfortable for now with not handing out write delegations for >> SHARE_ACCESS_WRITE opens. I prefer that to permission checking on >> every READ operation. >> >> If we find that it's a significant performance issue, we can revisit. >> >> > Yeah. The thing to remember here is that delegations are optional. We > can always say no. > > One thing we could consider to allow this is trying to open O_RDWR > first, and then only fall back to doing an O_WRONLY open if that fails. > I'm not sure how that would work out in practice though. > > One thing that'd be interesting to know with Solaris (and maybe OnTap) > is whether they will give you a write delegation for an O_WRONLY open > when you don't have any read permissions on the file. Yes, the Solaris server does hand out the write delegation for a O_WRONLY open on a file with write only permission for the owner. -Dai > > If they do, then is the client expected to do permissions enforcement > for the cached open and reject local openers for read? I guess I ought > to be looking at the Linux client code for this... > >>>> It'd would be interesting to know how ONTAP server behaves in >>>> this scenario. >>>> >>> Indeed. Most likely it behaves more like Solaris does, but it'd nice to >>> know. >>> >>>>> >>>>>>> + } >>>>>>> + >>>>>>> + /* >>>>>>> + * If the file is being opened O_RDONLY or we couldn't get a O_RDWR >>>>>>> + * file for some reason, then try for a read deleg instead. >>>>>>> + */ >>>>>>> + if (!nf && (open->op_share_access & NFS4_SHARE_ACCESS_READ)) { >>>>>>> nf = find_readable_file(fp); >>>>>>> dl_type = NFS4_OPEN_DELEGATE_READ; >>>>>>> } >>>>>>> - if (!nf) { >>>>>>> - /* >>>>>>> - * We probably could attempt another open and get a read >>>>>>> - * delegation, but for now, don't bother until the >>>>>>> - * client actually sends us one. >>>>>>> - */ >>>>>>> + >>>>>>> + if (!nf) >>>>>>> return ERR_PTR(-EAGAIN); >>>>>>> - } >>>>>>> + >>>>>>> spin_lock(&state_lock); >>>>>>> spin_lock(&fp->fi_lock); >>>>>>> if (nfs4_delegation_exists(clp, fp)) >>>>>>> >>>>>>> --- >>>>>>> base-commit: a734662572708cf062e974f659ae50c24fc1ad17 >>>>>>> change-id: 20230731-wdeleg-bbdb6b25a3c6 >>>>>>> >>>>>>> Best regards, >>> -- >>> Jeff Layton <jlayton@kernel.org> >> -- >> Chuck Lever >> >>
On 8/2/23 2:22 PM, dai.ngo@oracle.com wrote: > > On 8/2/23 1:57 PM, Chuck Lever III wrote: >> >>> On Aug 2, 2023, at 4:48 PM, Jeff Layton <jlayton@kernel.org> wrote: >>> >>> On Wed, 2023-08-02 at 13:15 -0700, dai.ngo@oracle.com wrote: >>>> On 8/2/23 11:15 AM, Jeff Layton wrote: >>>>> On Wed, 2023-08-02 at 09:29 -0700, dai.ngo@oracle.com wrote: >>>>>> On 8/1/23 6:33 AM, Jeff Layton wrote: >>>>>>> I noticed that xfstests generic/001 was failing against >>>>>>> linux-next nfsd. >>>>>>> >>>>>>> The client would request a OPEN4_SHARE_ACCESS_WRITE open, and >>>>>>> the server >>>>>>> would hand out a write delegation. The client would then try to >>>>>>> use that >>>>>>> write delegation as the source stateid in a COPY >>>>>> not sure why the client opens the source file of a COPY operation >>>>>> with >>>>>> OPEN4_SHARE_ACCESS_WRITE? >>>>>> >>>>> It doesn't. The original open is to write the data for the file being >>>>> copied. It then opens the file again for READ, but since it has a >>>>> write >>>>> delegation, it doesn't need to talk to the server at all -- it can >>>>> just >>>>> use that stateid for later operations. >>>>> >>>>>>> or CLONE operation, and >>>>>>> the server would respond with NFS4ERR_STALE. >>>>>> If the server does not allow client to use write delegation for the >>>>>> READ, should the correct error return be NFS4ERR_OPENMODE? >>>>>> >>>>> The server must allow the client to use a write delegation for read >>>>> operations. It's required by the spec, AFAIU. >>>>> >>>>> The error in this case was just bogus. The vfs copy routine would >>>>> return >>>>> -EBADF since the file didn't have FMODE_READ, and the nfs server >>>>> would >>>>> translate that into NFS4ERR_STALE. >>>>> >>>>> Probably there is a better v4 error code that we could translate >>>>> EBADF >>>>> to, but with this patch it shouldn't be a problem any longer. >>>>> >>>>>>> The problem is that the struct file associated with the >>>>>>> delegation does >>>>>>> not necessarily have read permissions. It's handing out a write >>>>>>> delegation on what is effectively an O_WRONLY open. RFC 8881 >>>>>>> states: >>>>>>> >>>>>>> "An OPEN_DELEGATE_WRITE delegation allows the client to >>>>>>> handle, on its >>>>>>> own, all opens." >>>>>>> >>>>>>> Given that the client didn't request any read permissions, and >>>>>>> that nfsd >>>>>>> didn't check for any, it seems wrong to give out a write >>>>>>> delegation. >>>>>>> >>>>>>> Only hand out a write delegation if we have a O_RDWR descriptor >>>>>>> available. If it fails to find an appropriate write descriptor, go >>>>>>> ahead and try for a read delegation if NFS4_SHARE_ACCESS_READ was >>>>>>> requested. >>>>>>> >>>>>>> This fixes xfstest generic/001. >>>>>>> >>>>>>> Closes: https://bugzilla.linux-nfs.org/show_bug.cgi?id=412 >>>>>>> Signed-off-by: Jeff Layton <jlayton@kernel.org> >>>>>>> --- >>>>>>> Changes in v2: >>>>>>> - Rework the logic when finding struct file for the delegation. The >>>>>>> earlier patch might still have attached a O_WRONLY file to >>>>>>> the deleg >>>>>>> in some cases, and could still have handed out a write >>>>>>> delegation on >>>>>>> an O_WRONLY OPEN request in some cases. >>>>>>> --- >>>>>>> fs/nfsd/nfs4state.c | 29 ++++++++++++++++++----------- >>>>>>> 1 file changed, 18 insertions(+), 11 deletions(-) >>>>>>> >>>>>>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c >>>>>>> index ef7118ebee00..e79d82fd05e7 100644 >>>>>>> --- a/fs/nfsd/nfs4state.c >>>>>>> +++ b/fs/nfsd/nfs4state.c >>>>>>> @@ -5449,7 +5449,7 @@ nfs4_set_delegation(struct nfsd4_open >>>>>>> *open, struct nfs4_ol_stateid *stp, >>>>>>> struct nfs4_file *fp = stp->st_stid.sc_file; >>>>>>> struct nfs4_clnt_odstate *odstate = stp->st_clnt_odstate; >>>>>>> struct nfs4_delegation *dp; >>>>>>> - struct nfsd_file *nf; >>>>>>> + struct nfsd_file *nf = NULL; >>>>>>> struct file_lock *fl; >>>>>>> u32 dl_type; >>>>>>> >>>>>>> @@ -5461,21 +5461,28 @@ nfs4_set_delegation(struct nfsd4_open >>>>>>> *open, struct nfs4_ol_stateid *stp, >>>>>>> if (fp->fi_had_conflict) >>>>>>> return ERR_PTR(-EAGAIN); >>>>>>> >>>>>>> - if (open->op_share_access & NFS4_SHARE_ACCESS_WRITE) { >>>>>>> - nf = find_writeable_file(fp); >>>>>>> + /* >>>>>>> + * Try for a write delegation first. We need an O_RDWR file >>>>>>> + * since a write delegation allows the client to perform any open >>>>>>> + * from its cache. >>>>>>> + */ >>>>>>> + if ((open->op_share_access & NFS4_SHARE_ACCESS_BOTH) == >>>>>>> NFS4_SHARE_ACCESS_BOTH) { >>>>>>> + nf = nfsd_file_get(fp->fi_fds[O_RDWR]); >>>>>>> dl_type = NFS4_OPEN_DELEGATE_WRITE; >>>>>>> - } else { >>>>>> Does this mean OPEN4_SHARE_ACCESS_WRITE do not get a write >>>>>> delegation? >>>>>> It does not seem right. >>>>>> >>>>>> -Dai >>>>>> >>>>> Why? Per RFC 8881: >>>>> >>>>> "An OPEN_DELEGATE_WRITE delegation allows the client to handle, on >>>>> its >>>>> own, all opens." >>>>> >>>>> All opens. That includes read opens. >>>>> >>>>> An OPEN4_SHARE_ACCESS_WRITE open will succeed on a file to which the >>>>> user has no read permissions. Therefore, we can't grant a write >>>>> delegation since can't guarantee that the user is allowed to do that. >>>> If the server grants the write delegation on an OPEN with >>>> OPEN4_SHARE_ACCESS_WRITE on the file with WR-only access mode then >>>> why can't the server checks and denies the subsequent READ? >>>> >>>> Per RFC 8881, section 9.1.2: >>>> >>>> For delegation stateids, the access mode is based on the type of >>>> delegation. >>>> >>>> When a READ, WRITE, or SETATTR (that specifies the size >>>> attribute) >>>> operation is done, the operation is subject to checking >>>> against the >>>> access mode to verify that the operation is appropriate given the >>>> stateid with which the operation is associated. >>>> >>>> In the case of WRITE-type operations (i.e., WRITEs and >>>> SETATTRs that >>>> set size), the server MUST verify that the access mode allows >>>> writing >>>> and MUST return an NFS4ERR_OPENMODE error if it does not. In >>>> the case >>>> of READ, the server may perform the corresponding check on the >>>> access >>>> mode, or it may choose to allow READ on OPENs for >>>> OPEN4_SHARE_ACCESS_WRITE, >>>> to accommodate clients whose WRITE implementation may >>>> unavoidably do >>>> reads (e.g., due to buffer cache constraints). However, even >>>> if READs >>>> are allowed in these circumstances, the server MUST still >>>> check for >>>> locks that conflict with the READ (e.g., another OPEN specified >>>> OPEN4_SHARE_DENY_READ or OPEN4_SHARE_DENY_BOTH). Note that a >>>> server >>>> that does enforce the access mode check on READs need not >>>> explicitly >>>> check for conflicting share reservations since the existence >>>> of OPEN >>>> for OPEN4_SHARE_ACCESS_READ guarantees that no conflicting share >>>> reservation can exist. >>>> >>>> FWIW, The Solaris server grants write delegation on OPEN with >>>> OPEN4_SHARE_ACCESS_WRITE on file with access mode either RW or >>>> WR-only. Maybe this is a bug? or the spec is not clear? >>>> >>> I don't think that's necessarily a bug. >>> >>> It's not that the spec demands that we only hand out delegations on >>> BOTH >>> opens. This is more of a quirk of the Linux implementation. Linux' >>> write delegations require an open O_RDWR file descriptor because we may >>> be called upon to do a read on its behalf. >>> >>> Technically, we could probably just have it check for >>> OPEN4_SHARE_ACCESS_WRITE, but in the case where READ isn't also set, >>> then you're unlikely to get a delegation. Either the O_RDWR descriptor >>> will be NULL, or there are other, conflicting opens already present. >>> >>> Solaris may have a completely different design that doesn't require >>> this. I haven't looked at its code to know. >> I'm comfortable for now with not handing out write delegations for >> SHARE_ACCESS_WRITE opens. I prefer that to permission checking on >> every READ operation. > > I'm fine with just handling out write delegation for SHARE_ACCESS_BOTH > only. > > Just a concern about not checking for access at the time of READ > operation. or not checking file permission at the time WRITE. > If the file was opened with SHARE_ACCESS_WRITE (no write delegation > granted) > and the file access mode was changed to read-only, is it a correct > behavior > for the server to allow the READ to go through? I meant for the WRITE to go through. > > -Dai > >> >> If we find that it's a significant performance issue, we can revisit. >> >> >>>> It'd would be interesting to know how ONTAP server behaves in >>>> this scenario. >>>> >>> Indeed. Most likely it behaves more like Solaris does, but it'd nice to >>> know. >>> >>>>> >>>>>>> + } >>>>>>> + >>>>>>> + /* >>>>>>> + * If the file is being opened O_RDONLY or we couldn't get a >>>>>>> O_RDWR >>>>>>> + * file for some reason, then try for a read deleg instead. >>>>>>> + */ >>>>>>> + if (!nf && (open->op_share_access & NFS4_SHARE_ACCESS_READ)) { >>>>>>> nf = find_readable_file(fp); >>>>>>> dl_type = NFS4_OPEN_DELEGATE_READ; >>>>>>> } >>>>>>> - if (!nf) { >>>>>>> - /* >>>>>>> - * We probably could attempt another open and get a read >>>>>>> - * delegation, but for now, don't bother until the >>>>>>> - * client actually sends us one. >>>>>>> - */ >>>>>>> + >>>>>>> + if (!nf) >>>>>>> return ERR_PTR(-EAGAIN); >>>>>>> - } >>>>>>> + >>>>>>> spin_lock(&state_lock); >>>>>>> spin_lock(&fp->fi_lock); >>>>>>> if (nfs4_delegation_exists(clp, fp)) >>>>>>> >>>>>>> --- >>>>>>> base-commit: a734662572708cf062e974f659ae50c24fc1ad17 >>>>>>> change-id: 20230731-wdeleg-bbdb6b25a3c6 >>>>>>> >>>>>>> Best regards, >>> -- >>> Jeff Layton <jlayton@kernel.org> >> -- >> Chuck Lever >> >>
On Wed, 2023-08-02 at 14:32 -0700, dai.ngo@oracle.com wrote: > On 8/2/23 2:22 PM, dai.ngo@oracle.com wrote: > > > > On 8/2/23 1:57 PM, Chuck Lever III wrote: > > > > > > > On Aug 2, 2023, at 4:48 PM, Jeff Layton <jlayton@kernel.org> wrote: > > > > > > > > On Wed, 2023-08-02 at 13:15 -0700, dai.ngo@oracle.com wrote: > > > > > On 8/2/23 11:15 AM, Jeff Layton wrote: > > > > > > On Wed, 2023-08-02 at 09:29 -0700, dai.ngo@oracle.com wrote: > > > > > > > On 8/1/23 6:33 AM, Jeff Layton wrote: > > > > > > > > I noticed that xfstests generic/001 was failing against > > > > > > > > linux-next nfsd. > > > > > > > > > > > > > > > > The client would request a OPEN4_SHARE_ACCESS_WRITE open, and > > > > > > > > the server > > > > > > > > would hand out a write delegation. The client would then try to > > > > > > > > use that > > > > > > > > write delegation as the source stateid in a COPY > > > > > > > not sure why the client opens the source file of a COPY operation > > > > > > > with > > > > > > > OPEN4_SHARE_ACCESS_WRITE? > > > > > > > > > > > > > It doesn't. The original open is to write the data for the file being > > > > > > copied. It then opens the file again for READ, but since it has a > > > > > > write > > > > > > delegation, it doesn't need to talk to the server at all -- it can > > > > > > just > > > > > > use that stateid for later operations. > > > > > > > > > > > > > > or CLONE operation, and > > > > > > > > the server would respond with NFS4ERR_STALE. > > > > > > > If the server does not allow client to use write delegation for the > > > > > > > READ, should the correct error return be NFS4ERR_OPENMODE? > > > > > > > > > > > > > The server must allow the client to use a write delegation for read > > > > > > operations. It's required by the spec, AFAIU. > > > > > > > > > > > > The error in this case was just bogus. The vfs copy routine would > > > > > > return > > > > > > -EBADF since the file didn't have FMODE_READ, and the nfs server > > > > > > would > > > > > > translate that into NFS4ERR_STALE. > > > > > > > > > > > > Probably there is a better v4 error code that we could translate > > > > > > EBADF > > > > > > to, but with this patch it shouldn't be a problem any longer. > > > > > > > > > > > > > > The problem is that the struct file associated with the > > > > > > > > delegation does > > > > > > > > not necessarily have read permissions. It's handing out a write > > > > > > > > delegation on what is effectively an O_WRONLY open. RFC 8881 > > > > > > > > states: > > > > > > > > > > > > > > > > "An OPEN_DELEGATE_WRITE delegation allows the client to > > > > > > > > handle, on its > > > > > > > > own, all opens." > > > > > > > > > > > > > > > > Given that the client didn't request any read permissions, and > > > > > > > > that nfsd > > > > > > > > didn't check for any, it seems wrong to give out a write > > > > > > > > delegation. > > > > > > > > > > > > > > > > Only hand out a write delegation if we have a O_RDWR descriptor > > > > > > > > available. If it fails to find an appropriate write descriptor, go > > > > > > > > ahead and try for a read delegation if NFS4_SHARE_ACCESS_READ was > > > > > > > > requested. > > > > > > > > > > > > > > > > This fixes xfstest generic/001. > > > > > > > > > > > > > > > > Closes: https://bugzilla.linux-nfs.org/show_bug.cgi?id=412 > > > > > > > > Signed-off-by: Jeff Layton <jlayton@kernel.org> > > > > > > > > --- > > > > > > > > Changes in v2: > > > > > > > > - Rework the logic when finding struct file for the delegation. The > > > > > > > > earlier patch might still have attached a O_WRONLY file to > > > > > > > > the deleg > > > > > > > > in some cases, and could still have handed out a write > > > > > > > > delegation on > > > > > > > > an O_WRONLY OPEN request in some cases. > > > > > > > > --- > > > > > > > > fs/nfsd/nfs4state.c | 29 ++++++++++++++++++----------- > > > > > > > > 1 file changed, 18 insertions(+), 11 deletions(-) > > > > > > > > > > > > > > > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c > > > > > > > > index ef7118ebee00..e79d82fd05e7 100644 > > > > > > > > --- a/fs/nfsd/nfs4state.c > > > > > > > > +++ b/fs/nfsd/nfs4state.c > > > > > > > > @@ -5449,7 +5449,7 @@ nfs4_set_delegation(struct nfsd4_open > > > > > > > > *open, struct nfs4_ol_stateid *stp, > > > > > > > > struct nfs4_file *fp = stp->st_stid.sc_file; > > > > > > > > struct nfs4_clnt_odstate *odstate = stp->st_clnt_odstate; > > > > > > > > struct nfs4_delegation *dp; > > > > > > > > - struct nfsd_file *nf; > > > > > > > > + struct nfsd_file *nf = NULL; > > > > > > > > struct file_lock *fl; > > > > > > > > u32 dl_type; > > > > > > > > > > > > > > > > @@ -5461,21 +5461,28 @@ nfs4_set_delegation(struct nfsd4_open > > > > > > > > *open, struct nfs4_ol_stateid *stp, > > > > > > > > if (fp->fi_had_conflict) > > > > > > > > return ERR_PTR(-EAGAIN); > > > > > > > > > > > > > > > > - if (open->op_share_access & NFS4_SHARE_ACCESS_WRITE) { > > > > > > > > - nf = find_writeable_file(fp); > > > > > > > > + /* > > > > > > > > + * Try for a write delegation first. We need an O_RDWR file > > > > > > > > + * since a write delegation allows the client to perform any open > > > > > > > > + * from its cache. > > > > > > > > + */ > > > > > > > > + if ((open->op_share_access & NFS4_SHARE_ACCESS_BOTH) == > > > > > > > > NFS4_SHARE_ACCESS_BOTH) { > > > > > > > > + nf = nfsd_file_get(fp->fi_fds[O_RDWR]); > > > > > > > > dl_type = NFS4_OPEN_DELEGATE_WRITE; > > > > > > > > - } else { > > > > > > > Does this mean OPEN4_SHARE_ACCESS_WRITE do not get a write > > > > > > > delegation? > > > > > > > It does not seem right. > > > > > > > > > > > > > > -Dai > > > > > > > > > > > > > Why? Per RFC 8881: > > > > > > > > > > > > "An OPEN_DELEGATE_WRITE delegation allows the client to handle, on > > > > > > its > > > > > > own, all opens." > > > > > > > > > > > > All opens. That includes read opens. > > > > > > > > > > > > An OPEN4_SHARE_ACCESS_WRITE open will succeed on a file to which the > > > > > > user has no read permissions. Therefore, we can't grant a write > > > > > > delegation since can't guarantee that the user is allowed to do that. > > > > > If the server grants the write delegation on an OPEN with > > > > > OPEN4_SHARE_ACCESS_WRITE on the file with WR-only access mode then > > > > > why can't the server checks and denies the subsequent READ? > > > > > > > > > > Per RFC 8881, section 9.1.2: > > > > > > > > > > For delegation stateids, the access mode is based on the type of > > > > > delegation. > > > > > > > > > > When a READ, WRITE, or SETATTR (that specifies the size > > > > > attribute) > > > > > operation is done, the operation is subject to checking > > > > > against the > > > > > access mode to verify that the operation is appropriate given the > > > > > stateid with which the operation is associated. > > > > > > > > > > In the case of WRITE-type operations (i.e., WRITEs and > > > > > SETATTRs that > > > > > set size), the server MUST verify that the access mode allows > > > > > writing > > > > > and MUST return an NFS4ERR_OPENMODE error if it does not. In > > > > > the case > > > > > of READ, the server may perform the corresponding check on the > > > > > access > > > > > mode, or it may choose to allow READ on OPENs for > > > > > OPEN4_SHARE_ACCESS_WRITE, > > > > > to accommodate clients whose WRITE implementation may > > > > > unavoidably do > > > > > reads (e.g., due to buffer cache constraints). However, even > > > > > if READs > > > > > are allowed in these circumstances, the server MUST still > > > > > check for > > > > > locks that conflict with the READ (e.g., another OPEN specified > > > > > OPEN4_SHARE_DENY_READ or OPEN4_SHARE_DENY_BOTH). Note that a > > > > > server > > > > > that does enforce the access mode check on READs need not > > > > > explicitly > > > > > check for conflicting share reservations since the existence > > > > > of OPEN > > > > > for OPEN4_SHARE_ACCESS_READ guarantees that no conflicting share > > > > > reservation can exist. > > > > > > > > > > FWIW, The Solaris server grants write delegation on OPEN with > > > > > OPEN4_SHARE_ACCESS_WRITE on file with access mode either RW or > > > > > WR-only. Maybe this is a bug? or the spec is not clear? > > > > > > > > > I don't think that's necessarily a bug. > > > > > > > > It's not that the spec demands that we only hand out delegations on > > > > BOTH > > > > opens. This is more of a quirk of the Linux implementation. Linux' > > > > write delegations require an open O_RDWR file descriptor because we may > > > > be called upon to do a read on its behalf. > > > > > > > > Technically, we could probably just have it check for > > > > OPEN4_SHARE_ACCESS_WRITE, but in the case where READ isn't also set, > > > > then you're unlikely to get a delegation. Either the O_RDWR descriptor > > > > will be NULL, or there are other, conflicting opens already present. > > > > > > > > Solaris may have a completely different design that doesn't require > > > > this. I haven't looked at its code to know. > > > I'm comfortable for now with not handing out write delegations for > > > SHARE_ACCESS_WRITE opens. I prefer that to permission checking on > > > every READ operation. > > > > I'm fine with just handling out write delegation for SHARE_ACCESS_BOTH > > only. > > > > Just a concern about not checking for access at the time of READ > > operation. > or not checking file permission at the time WRITE. > > If the file was opened with SHARE_ACCESS_WRITE (no write delegation > > granted) > > and the file access mode was changed to read-only, is it a correct > > behavior > > for the server to allow the READ to go through? > I meant for the WRITE to go through. Yes: POSIX permissions enforcement is done at open time, not when doing actual reads and writes. If you open a file on (e.g.) xfs and start streaming writes to it, then you don't expect that you will lose the ability to write to that fd if the permissions change. In the old v2/3 days of stateless NFS, we had to check permissions on every READ or WRITE operation, but we generally did an open on every RPC too, so it just worked out that we checked permissions on each operation. With v4 we can better approximate POSIX semantics by just associating a stateid with an open file to allow the client to keep writing in this case. > > > > -Dai > > > > > > > > If we find that it's a significant performance issue, we can revisit. > > > > > > > > > > > It'd would be interesting to know how ONTAP server behaves in > > > > > this scenario. > > > > > > > > > Indeed. Most likely it behaves more like Solaris does, but it'd nice to > > > > know. > > > > > > > > > > > > > > > > > > + } > > > > > > > > + > > > > > > > > + /* > > > > > > > > + * If the file is being opened O_RDONLY or we couldn't get a > > > > > > > > O_RDWR > > > > > > > > + * file for some reason, then try for a read deleg instead. > > > > > > > > + */ > > > > > > > > + if (!nf && (open->op_share_access & NFS4_SHARE_ACCESS_READ)) { > > > > > > > > nf = find_readable_file(fp); > > > > > > > > dl_type = NFS4_OPEN_DELEGATE_READ; > > > > > > > > } > > > > > > > > - if (!nf) { > > > > > > > > - /* > > > > > > > > - * We probably could attempt another open and get a read > > > > > > > > - * delegation, but for now, don't bother until the > > > > > > > > - * client actually sends us one. > > > > > > > > - */ > > > > > > > > + > > > > > > > > + if (!nf) > > > > > > > > return ERR_PTR(-EAGAIN); > > > > > > > > - } > > > > > > > > + > > > > > > > > spin_lock(&state_lock); > > > > > > > > spin_lock(&fp->fi_lock); > > > > > > > > if (nfs4_delegation_exists(clp, fp)) > > > > > > > > > > > > > > > > --- > > > > > > > > base-commit: a734662572708cf062e974f659ae50c24fc1ad17 > > > > > > > > change-id: 20230731-wdeleg-bbdb6b25a3c6 > > > > > > > > > > > > > > > > Best regards, > > > > -- > > > > Jeff Layton <jlayton@kernel.org> > > > -- > > > Chuck Lever > > > > > >
On Wed, 2023-08-02 at 16:38 -0700, dai.ngo@oracle.com wrote: > > > > > > > > On 8/2/23 2:52 PM, Jeff Layton wrote: > > > > > > > > > > On Wed, 2023-08-02 at 14:32 -0700, dai.ngo@oracle.com wrote: > > > > > > > > > > > > > On 8/2/23 2:22 PM, dai.ngo@oracle.com wrote: > > > > > > > > > > > > > > > > > > On 8/2/23 1:57 PM, Chuck Lever III wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Aug 2, 2023, at 4:48 PM, Jeff Layton <jlayton@kernel.org> wrote: > > > > > > > > > > > > On Wed, 2023-08-02 at 13:15 -0700, dai.ngo@oracle.com wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On 8/2/23 11:15 AM, Jeff Layton wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, 2023-08-02 at 09:29 -0700, dai.ngo@oracle.com wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On 8/1/23 6:33 AM, Jeff Layton wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I noticed that xfstests generic/001 was failing against > > > > > > > > > > linux-next nfsd. > > > > > > > > > > > > > > > > > > > > The client would request a OPEN4_SHARE_ACCESS_WRITE open, and > > > > > > > > > > the server > > > > > > > > > > would hand out a write delegation. The client would then try to > > > > > > > > > > use that > > > > > > > > > > write delegation as the source stateid in a COPY > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > not sure why the client opens the source file of a COPY operation > > > > > > > > > with > > > > > > > > > OPEN4_SHARE_ACCESS_WRITE? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > It doesn't. The original open is to write the data for the file being > > > > > > > > copied. It then opens the file again for READ, but since it has a > > > > > > > > write > > > > > > > > delegation, it doesn't need to talk to the server at all -- it can > > > > > > > > just > > > > > > > > use that stateid for later operations. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > or CLONE operation, and > > > > > > > > > > the server would respond with NFS4ERR_STALE. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > If the server does not allow client to use write delegation for the > > > > > > > > > READ, should the correct error return be NFS4ERR_OPENMODE? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > The server must allow the client to use a write delegation for read > > > > > > > > operations. It's required by the spec, AFAIU. > > > > > > > > > > > > > > > > The error in this case was just bogus. The vfs copy routine would > > > > > > > > return > > > > > > > > -EBADF since the file didn't have FMODE_READ, and the nfs server > > > > > > > > would > > > > > > > > translate that into NFS4ERR_STALE. > > > > > > > > > > > > > > > > Probably there is a better v4 error code that we could translate > > > > > > > > EBADF > > > > > > > > to, but with this patch it shouldn't be a problem any longer. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > The problem is that the struct file associated with the > > > > > > > > > > delegation does > > > > > > > > > > not necessarily have read permissions. It's handing out a write > > > > > > > > > > delegation on what is effectively an O_WRONLY open. RFC 8881 > > > > > > > > > > states: > > > > > > > > > > > > > > > > > > > > "An OPEN_DELEGATE_WRITE delegation allows the client to > > > > > > > > > > handle, on its > > > > > > > > > > own, all opens." > > > > > > > > > > > > > > > > > > > > Given that the client didn't request any read permissions, and > > > > > > > > > > that nfsd > > > > > > > > > > didn't check for any, it seems wrong to give out a write > > > > > > > > > > delegation. > > > > > > > > > > > > > > > > > > > > Only hand out a write delegation if we have a O_RDWR descriptor > > > > > > > > > > available. If it fails to find an appropriate write descriptor, go > > > > > > > > > > ahead and try for a read delegation if NFS4_SHARE_ACCESS_READ was > > > > > > > > > > requested. > > > > > > > > > > > > > > > > > > > > This fixes xfstest generic/001. > > > > > > > > > > > > > > > > > > > > Closes: https://bugzilla.linux-nfs.org/show_bug.cgi?id=412 > > > > > > > > > > Signed-off-by: Jeff Layton <jlayton@kernel.org> > > > > > > > > > > --- > > > > > > > > > > Changes in v2: > > > > > > > > > > - Rework the logic when finding struct file for the delegation. The > > > > > > > > > > earlier patch might still have attached a O_WRONLY file to > > > > > > > > > > the deleg > > > > > > > > > > in some cases, and could still have handed out a write > > > > > > > > > > delegation on > > > > > > > > > > an O_WRONLY OPEN request in some cases. > > > > > > > > > > --- > > > > > > > > > > fs/nfsd/nfs4state.c | 29 ++++++++++++++++++----------- > > > > > > > > > > 1 file changed, 18 insertions(+), 11 deletions(-) > > > > > > > > > > > > > > > > > > > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c > > > > > > > > > > index ef7118ebee00..e79d82fd05e7 100644 > > > > > > > > > > --- a/fs/nfsd/nfs4state.c > > > > > > > > > > +++ b/fs/nfsd/nfs4state.c > > > > > > > > > > @@ -5449,7 +5449,7 @@ nfs4_set_delegation(struct nfsd4_open > > > > > > > > > > *open, struct nfs4_ol_stateid *stp, > > > > > > > > > > struct nfs4_file *fp = stp->st_stid.sc_file; > > > > > > > > > > struct nfs4_clnt_odstate *odstate = stp->st_clnt_odstate; > > > > > > > > > > struct nfs4_delegation *dp; > > > > > > > > > > - struct nfsd_file *nf; > > > > > > > > > > + struct nfsd_file *nf = NULL; > > > > > > > > > > struct file_lock *fl; > > > > > > > > > > u32 dl_type; > > > > > > > > > > > > > > > > > > > > @@ -5461,21 +5461,28 @@ nfs4_set_delegation(struct nfsd4_open > > > > > > > > > > *open, struct nfs4_ol_stateid *stp, > > > > > > > > > > if (fp->fi_had_conflict) > > > > > > > > > > return ERR_PTR(-EAGAIN); > > > > > > > > > > > > > > > > > > > > - if (open->op_share_access & NFS4_SHARE_ACCESS_WRITE) { > > > > > > > > > > - nf = find_writeable_file(fp); > > > > > > > > > > + /* > > > > > > > > > > + * Try for a write delegation first. We need an O_RDWR file > > > > > > > > > > + * since a write delegation allows the client to perform any open > > > > > > > > > > + * from its cache. > > > > > > > > > > + */ > > > > > > > > > > + if ((open->op_share_access & NFS4_SHARE_ACCESS_BOTH) == > > > > > > > > > > NFS4_SHARE_ACCESS_BOTH) { > > > > > > > > > > + nf = nfsd_file_get(fp->fi_fds[O_RDWR]); > > > > > > > > > > dl_type = NFS4_OPEN_DELEGATE_WRITE; > > > > > > > > > > - } else { > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Does this mean OPEN4_SHARE_ACCESS_WRITE do not get a write > > > > > > > > > delegation? > > > > > > > > > It does not seem right. > > > > > > > > > > > > > > > > > > -Dai > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Why? Per RFC 8881: > > > > > > > > > > > > > > > > "An OPEN_DELEGATE_WRITE delegation allows the client to handle, on > > > > > > > > its > > > > > > > > own, all opens." > > > > > > > > > > > > > > > > All opens. That includes read opens. > > > > > > > > > > > > > > > > An OPEN4_SHARE_ACCESS_WRITE open will succeed on a file to which the > > > > > > > > user has no read permissions. Therefore, we can't grant a write > > > > > > > > delegation since can't guarantee that the user is allowed to do that. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > If the server grants the write delegation on an OPEN with > > > > > > > OPEN4_SHARE_ACCESS_WRITE on the file with WR-only access mode then > > > > > > > why can't the server checks and denies the subsequent READ? > > > > > > > > > > > > > > Per RFC 8881, section 9.1.2: > > > > > > > > > > > > > > For delegation stateids, the access mode is based on the type of > > > > > > > delegation. > > > > > > > > > > > > > > When a READ, WRITE, or SETATTR (that specifies the size > > > > > > > attribute) > > > > > > > operation is done, the operation is subject to checking > > > > > > > against the > > > > > > > access mode to verify that the operation is appropriate given the > > > > > > > stateid with which the operation is associated. > > > > > > > > > > > > > > In the case of WRITE-type operations (i.e., WRITEs and > > > > > > > SETATTRs that > > > > > > > set size), the server MUST verify that the access mode allows > > > > > > > writing > > > > > > > and MUST return an NFS4ERR_OPENMODE error if it does not. In > > > > > > > the case > > > > > > > of READ, the server may perform the corresponding check on the > > > > > > > access > > > > > > > mode, or it may choose to allow READ on OPENs for > > > > > > > OPEN4_SHARE_ACCESS_WRITE, > > > > > > > to accommodate clients whose WRITE implementation may > > > > > > > unavoidably do > > > > > > > reads (e.g., due to buffer cache constraints). However, even > > > > > > > if READs > > > > > > > are allowed in these circumstances, the server MUST still > > > > > > > check for > > > > > > > locks that conflict with the READ (e.g., another OPEN specified > > > > > > > OPEN4_SHARE_DENY_READ or OPEN4_SHARE_DENY_BOTH). Note that a > > > > > > > server > > > > > > > that does enforce the access mode check on READs need not > > > > > > > explicitly > > > > > > > check for conflicting share reservations since the existence > > > > > > > of OPEN > > > > > > > for OPEN4_SHARE_ACCESS_READ guarantees that no conflicting share > > > > > > > reservation can exist. > > > > > > > > > > > > > > FWIW, The Solaris server grants write delegation on OPEN with > > > > > > > OPEN4_SHARE_ACCESS_WRITE on file with access mode either RW or > > > > > > > WR-only. Maybe this is a bug? or the spec is not clear? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I don't think that's necessarily a bug. > > > > > > > > > > > > It's not that the spec demands that we only hand out delegations on > > > > > > BOTH > > > > > > opens. This is more of a quirk of the Linux implementation. Linux' > > > > > > write delegations require an open O_RDWR file descriptor because we may > > > > > > be called upon to do a read on its behalf. > > > > > > > > > > > > Technically, we could probably just have it check for > > > > > > OPEN4_SHARE_ACCESS_WRITE, but in the case where READ isn't also set, > > > > > > then you're unlikely to get a delegation. Either the O_RDWR descriptor > > > > > > will be NULL, or there are other, conflicting opens already present. > > > > > > > > > > > > Solaris may have a completely different design that doesn't require > > > > > > this. I haven't looked at its code to know. > > > > > > > > > > > > > > > > > > > > > > > > > > > I'm comfortable for now with not handing out write delegations for > > > > > SHARE_ACCESS_WRITE opens. I prefer that to permission checking on > > > > > every READ operation. > > > > > > > > > > > > > > > > > > > > > > I'm fine with just handling out write delegation for SHARE_ACCESS_BOTH > > > > only. > > > > > > > > Just a concern about not checking for access at the time of READ > > > > operation. > > > > > > > > > > > > > > > > > or not checking file permission at the time WRITE. > > > > > > > > > > > > > > > > > > If the file was opened with SHARE_ACCESS_WRITE (no write delegation > > > > granted) > > > > and the file access mode was changed to read-only, is it a correct > > > > behavior > > > > for the server to allow the READ to go through? > > > > > > > > > > > > > > > > > I meant for the WRITE to go through. > > > > > > > > > > > > Yes: > > > > POSIX permissions enforcement is done at open time, not when doing > > actual reads and writes. If you open a file on (e.g.) xfs and start > > streaming writes to it, then you don't expect that you will lose the > > ability to write to that fd if the permissions change. > > > > In the old v2/3 days of stateless NFS, we had to check permissions on > > every READ or WRITE operation, but we generally did an open on every RPC > > too, so it just worked out that we checked permissions on each > > operation. > > > > With v4 we can better approximate POSIX semantics by just associating a > > stateid with an open file to allow the client to keep writing in this > > case. > > > > > > > Thanks Jeff, Don't thank me yet. I went back and looked at the code, and it looks like we still do check permissions on every READ/WRITE (see nfs4_check_file). I'm unclear on whether that's required, but it's probably safest to always check permissions like we are. That does mean that if the mode of the file changes after we open it we could end up being unable to read or write to it (much like with v2/3), but at this point most people are used to that sort of behavior on NFS, so I don't worry about it too much.
On 8/3/23 4:27 AM, Jeff Layton wrote: > On Wed, 2023-08-02 at 16:38 -0700, dai.ngo@oracle.com wrote: >> >> >> >> >> >> >> >> On 8/2/23 2:52 PM, Jeff Layton wrote: >> >> >> >> >>> >>> >>> On Wed, 2023-08-02 at 14:32 -0700, dai.ngo@oracle.com wrote: >>> >>> >>>> >>>> >>>> On 8/2/23 2:22 PM, dai.ngo@oracle.com wrote: >>>> >>>> >>>>> >>>>> >>>>> On 8/2/23 1:57 PM, Chuck Lever III wrote: >>>>> >>>>> >>>>>> >>>>>> >>>>>>> >>>>>>> >>>>>>> On Aug 2, 2023, at 4:48 PM, Jeff Layton <jlayton@kernel.org> wrote: >>>>>>> >>>>>>> On Wed, 2023-08-02 at 13:15 -0700, dai.ngo@oracle.com wrote: >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On 8/2/23 11:15 AM, Jeff Layton wrote: >>>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Wed, 2023-08-02 at 09:29 -0700, dai.ngo@oracle.com wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 8/1/23 6:33 AM, Jeff Layton wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> I noticed that xfstests generic/001 was failing against >>>>>>>>>>> linux-next nfsd. >>>>>>>>>>> >>>>>>>>>>> The client would request a OPEN4_SHARE_ACCESS_WRITE open, and >>>>>>>>>>> the server >>>>>>>>>>> would hand out a write delegation. The client would then try to >>>>>>>>>>> use that >>>>>>>>>>> write delegation as the source stateid in a COPY >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> not sure why the client opens the source file of a COPY operation >>>>>>>>>> with >>>>>>>>>> OPEN4_SHARE_ACCESS_WRITE? >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> It doesn't. The original open is to write the data for the file being >>>>>>>>> copied. It then opens the file again for READ, but since it has a >>>>>>>>> write >>>>>>>>> delegation, it doesn't need to talk to the server at all -- it can >>>>>>>>> just >>>>>>>>> use that stateid for later operations. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> or CLONE operation, and >>>>>>>>>>> the server would respond with NFS4ERR_STALE. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> If the server does not allow client to use write delegation for the >>>>>>>>>> READ, should the correct error return be NFS4ERR_OPENMODE? >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> The server must allow the client to use a write delegation for read >>>>>>>>> operations. It's required by the spec, AFAIU. >>>>>>>>> >>>>>>>>> The error in this case was just bogus. The vfs copy routine would >>>>>>>>> return >>>>>>>>> -EBADF since the file didn't have FMODE_READ, and the nfs server >>>>>>>>> would >>>>>>>>> translate that into NFS4ERR_STALE. >>>>>>>>> >>>>>>>>> Probably there is a better v4 error code that we could translate >>>>>>>>> EBADF >>>>>>>>> to, but with this patch it shouldn't be a problem any longer. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> The problem is that the struct file associated with the >>>>>>>>>>> delegation does >>>>>>>>>>> not necessarily have read permissions. It's handing out a write >>>>>>>>>>> delegation on what is effectively an O_WRONLY open. RFC 8881 >>>>>>>>>>> states: >>>>>>>>>>> >>>>>>>>>>> "An OPEN_DELEGATE_WRITE delegation allows the client to >>>>>>>>>>> handle, on its >>>>>>>>>>> own, all opens." >>>>>>>>>>> >>>>>>>>>>> Given that the client didn't request any read permissions, and >>>>>>>>>>> that nfsd >>>>>>>>>>> didn't check for any, it seems wrong to give out a write >>>>>>>>>>> delegation. >>>>>>>>>>> >>>>>>>>>>> Only hand out a write delegation if we have a O_RDWR descriptor >>>>>>>>>>> available. If it fails to find an appropriate write descriptor, go >>>>>>>>>>> ahead and try for a read delegation if NFS4_SHARE_ACCESS_READ was >>>>>>>>>>> requested. >>>>>>>>>>> >>>>>>>>>>> This fixes xfstest generic/001. >>>>>>>>>>> >>>>>>>>>>> Closes: https://bugzilla.linux-nfs.org/show_bug.cgi?id=412 >>>>>>>>>>> Signed-off-by: Jeff Layton <jlayton@kernel.org> >>>>>>>>>>> --- >>>>>>>>>>> Changes in v2: >>>>>>>>>>> - Rework the logic when finding struct file for the delegation. The >>>>>>>>>>> earlier patch might still have attached a O_WRONLY file to >>>>>>>>>>> the deleg >>>>>>>>>>> in some cases, and could still have handed out a write >>>>>>>>>>> delegation on >>>>>>>>>>> an O_WRONLY OPEN request in some cases. >>>>>>>>>>> --- >>>>>>>>>>> fs/nfsd/nfs4state.c | 29 ++++++++++++++++++----------- >>>>>>>>>>> 1 file changed, 18 insertions(+), 11 deletions(-) >>>>>>>>>>> >>>>>>>>>>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c >>>>>>>>>>> index ef7118ebee00..e79d82fd05e7 100644 >>>>>>>>>>> --- a/fs/nfsd/nfs4state.c >>>>>>>>>>> +++ b/fs/nfsd/nfs4state.c >>>>>>>>>>> @@ -5449,7 +5449,7 @@ nfs4_set_delegation(struct nfsd4_open >>>>>>>>>>> *open, struct nfs4_ol_stateid *stp, >>>>>>>>>>> struct nfs4_file *fp = stp->st_stid.sc_file; >>>>>>>>>>> struct nfs4_clnt_odstate *odstate = stp->st_clnt_odstate; >>>>>>>>>>> struct nfs4_delegation *dp; >>>>>>>>>>> - struct nfsd_file *nf; >>>>>>>>>>> + struct nfsd_file *nf = NULL; >>>>>>>>>>> struct file_lock *fl; >>>>>>>>>>> u32 dl_type; >>>>>>>>>>> >>>>>>>>>>> @@ -5461,21 +5461,28 @@ nfs4_set_delegation(struct nfsd4_open >>>>>>>>>>> *open, struct nfs4_ol_stateid *stp, >>>>>>>>>>> if (fp->fi_had_conflict) >>>>>>>>>>> return ERR_PTR(-EAGAIN); >>>>>>>>>>> >>>>>>>>>>> - if (open->op_share_access & NFS4_SHARE_ACCESS_WRITE) { >>>>>>>>>>> - nf = find_writeable_file(fp); >>>>>>>>>>> + /* >>>>>>>>>>> + * Try for a write delegation first. We need an O_RDWR file >>>>>>>>>>> + * since a write delegation allows the client to perform any open >>>>>>>>>>> + * from its cache. >>>>>>>>>>> + */ >>>>>>>>>>> + if ((open->op_share_access & NFS4_SHARE_ACCESS_BOTH) == >>>>>>>>>>> NFS4_SHARE_ACCESS_BOTH) { >>>>>>>>>>> + nf = nfsd_file_get(fp->fi_fds[O_RDWR]); >>>>>>>>>>> dl_type = NFS4_OPEN_DELEGATE_WRITE; >>>>>>>>>>> - } else { >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Does this mean OPEN4_SHARE_ACCESS_WRITE do not get a write >>>>>>>>>> delegation? >>>>>>>>>> It does not seem right. >>>>>>>>>> >>>>>>>>>> -Dai >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Why? Per RFC 8881: >>>>>>>>> >>>>>>>>> "An OPEN_DELEGATE_WRITE delegation allows the client to handle, on >>>>>>>>> its >>>>>>>>> own, all opens." >>>>>>>>> >>>>>>>>> All opens. That includes read opens. >>>>>>>>> >>>>>>>>> An OPEN4_SHARE_ACCESS_WRITE open will succeed on a file to which the >>>>>>>>> user has no read permissions. Therefore, we can't grant a write >>>>>>>>> delegation since can't guarantee that the user is allowed to do that. >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> If the server grants the write delegation on an OPEN with >>>>>>>> OPEN4_SHARE_ACCESS_WRITE on the file with WR-only access mode then >>>>>>>> why can't the server checks and denies the subsequent READ? >>>>>>>> >>>>>>>> Per RFC 8881, section 9.1.2: >>>>>>>> >>>>>>>> For delegation stateids, the access mode is based on the type of >>>>>>>> delegation. >>>>>>>> >>>>>>>> When a READ, WRITE, or SETATTR (that specifies the size >>>>>>>> attribute) >>>>>>>> operation is done, the operation is subject to checking >>>>>>>> against the >>>>>>>> access mode to verify that the operation is appropriate given the >>>>>>>> stateid with which the operation is associated. >>>>>>>> >>>>>>>> In the case of WRITE-type operations (i.e., WRITEs and >>>>>>>> SETATTRs that >>>>>>>> set size), the server MUST verify that the access mode allows >>>>>>>> writing >>>>>>>> and MUST return an NFS4ERR_OPENMODE error if it does not. In >>>>>>>> the case >>>>>>>> of READ, the server may perform the corresponding check on the >>>>>>>> access >>>>>>>> mode, or it may choose to allow READ on OPENs for >>>>>>>> OPEN4_SHARE_ACCESS_WRITE, >>>>>>>> to accommodate clients whose WRITE implementation may >>>>>>>> unavoidably do >>>>>>>> reads (e.g., due to buffer cache constraints). However, even >>>>>>>> if READs >>>>>>>> are allowed in these circumstances, the server MUST still >>>>>>>> check for >>>>>>>> locks that conflict with the READ (e.g., another OPEN specified >>>>>>>> OPEN4_SHARE_DENY_READ or OPEN4_SHARE_DENY_BOTH). Note that a >>>>>>>> server >>>>>>>> that does enforce the access mode check on READs need not >>>>>>>> explicitly >>>>>>>> check for conflicting share reservations since the existence >>>>>>>> of OPEN >>>>>>>> for OPEN4_SHARE_ACCESS_READ guarantees that no conflicting share >>>>>>>> reservation can exist. >>>>>>>> >>>>>>>> FWIW, The Solaris server grants write delegation on OPEN with >>>>>>>> OPEN4_SHARE_ACCESS_WRITE on file with access mode either RW or >>>>>>>> WR-only. Maybe this is a bug? or the spec is not clear? >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> >>>>>>> I don't think that's necessarily a bug. >>>>>>> >>>>>>> It's not that the spec demands that we only hand out delegations on >>>>>>> BOTH >>>>>>> opens. This is more of a quirk of the Linux implementation. Linux' >>>>>>> write delegations require an open O_RDWR file descriptor because we may >>>>>>> be called upon to do a read on its behalf. >>>>>>> >>>>>>> Technically, we could probably just have it check for >>>>>>> OPEN4_SHARE_ACCESS_WRITE, but in the case where READ isn't also set, >>>>>>> then you're unlikely to get a delegation. Either the O_RDWR descriptor >>>>>>> will be NULL, or there are other, conflicting opens already present. >>>>>>> >>>>>>> Solaris may have a completely different design that doesn't require >>>>>>> this. I haven't looked at its code to know. >>>>>>> >>>>>>> >>>>>> >>>>>> >>>>>> I'm comfortable for now with not handing out write delegations for >>>>>> SHARE_ACCESS_WRITE opens. I prefer that to permission checking on >>>>>> every READ operation. >>>>>> >>>>>> >>>>> >>>>> >>>>> I'm fine with just handling out write delegation for SHARE_ACCESS_BOTH >>>>> only. >>>>> >>>>> Just a concern about not checking for access at the time of READ >>>>> operation. >>>>> >>>>> >>>> >>>> >>>> or not checking file permission at the time WRITE. >>>> >>>> >>>>> >>>>> >>>>> If the file was opened with SHARE_ACCESS_WRITE (no write delegation >>>>> granted) >>>>> and the file access mode was changed to read-only, is it a correct >>>>> behavior >>>>> for the server to allow the READ to go through? >>>>> >>>>> >>>> >>>> >>>> I meant for the WRITE to go through. >>>> >>>> >>> >>> >>> Yes: >>> >>> POSIX permissions enforcement is done at open time, not when doing >>> actual reads and writes. If you open a file on (e.g.) xfs and start >>> streaming writes to it, then you don't expect that you will lose the >>> ability to write to that fd if the permissions change. >>> >>> In the old v2/3 days of stateless NFS, we had to check permissions on >>> every READ or WRITE operation, but we generally did an open on every RPC >>> too, so it just worked out that we checked permissions on each >>> operation. >>> >>> With v4 we can better approximate POSIX semantics by just associating a >>> stateid with an open file to allow the client to keep writing in this >>> case. >>> >>> >> >> >> Thanks Jeff, > Don't thank me yet. I went back and looked at the code, and it looks > like we still do check permissions on every READ/WRITE (see > nfs4_check_file). > > I'm unclear on whether that's required, but it's probably safest to > always check permissions like we are. That does mean that if the mode of > the file changes after we open it we could end up being unable to read > or write to it (much like with v2/3), but at this point most people are > used to that sort of behavior on NFS, so I don't worry about it too > much. It might not conform to Posix permissions enforcement but I like what the server is doing right now, correctness of permissions enforcement and consistent behavior of v2/3/4. -Dai
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c index ef7118ebee00..e79d82fd05e7 100644 --- a/fs/nfsd/nfs4state.c +++ b/fs/nfsd/nfs4state.c @@ -5449,7 +5449,7 @@ nfs4_set_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp, struct nfs4_file *fp = stp->st_stid.sc_file; struct nfs4_clnt_odstate *odstate = stp->st_clnt_odstate; struct nfs4_delegation *dp; - struct nfsd_file *nf; + struct nfsd_file *nf = NULL; struct file_lock *fl; u32 dl_type; @@ -5461,21 +5461,28 @@ nfs4_set_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp, if (fp->fi_had_conflict) return ERR_PTR(-EAGAIN); - if (open->op_share_access & NFS4_SHARE_ACCESS_WRITE) { - nf = find_writeable_file(fp); + /* + * Try for a write delegation first. We need an O_RDWR file + * since a write delegation allows the client to perform any open + * from its cache. + */ + if ((open->op_share_access & NFS4_SHARE_ACCESS_BOTH) == NFS4_SHARE_ACCESS_BOTH) { + nf = nfsd_file_get(fp->fi_fds[O_RDWR]); dl_type = NFS4_OPEN_DELEGATE_WRITE; - } else { + } + + /* + * If the file is being opened O_RDONLY or we couldn't get a O_RDWR + * file for some reason, then try for a read deleg instead. + */ + if (!nf && (open->op_share_access & NFS4_SHARE_ACCESS_READ)) { nf = find_readable_file(fp); dl_type = NFS4_OPEN_DELEGATE_READ; } - if (!nf) { - /* - * We probably could attempt another open and get a read - * delegation, but for now, don't bother until the - * client actually sends us one. - */ + + if (!nf) return ERR_PTR(-EAGAIN); - } + spin_lock(&state_lock); spin_lock(&fp->fi_lock); if (nfs4_delegation_exists(clp, fp))
I noticed that xfstests generic/001 was failing against linux-next nfsd. The client would request a OPEN4_SHARE_ACCESS_WRITE open, and the server would hand out a write delegation. The client would then try to use that write delegation as the source stateid in a COPY or CLONE operation, and the server would respond with NFS4ERR_STALE. The problem is that the struct file associated with the delegation does not necessarily have read permissions. It's handing out a write delegation on what is effectively an O_WRONLY open. RFC 8881 states: "An OPEN_DELEGATE_WRITE delegation allows the client to handle, on its own, all opens." Given that the client didn't request any read permissions, and that nfsd didn't check for any, it seems wrong to give out a write delegation. Only hand out a write delegation if we have a O_RDWR descriptor available. If it fails to find an appropriate write descriptor, go ahead and try for a read delegation if NFS4_SHARE_ACCESS_READ was requested. This fixes xfstest generic/001. Closes: https://bugzilla.linux-nfs.org/show_bug.cgi?id=412 Signed-off-by: Jeff Layton <jlayton@kernel.org> --- Changes in v2: - Rework the logic when finding struct file for the delegation. The earlier patch might still have attached a O_WRONLY file to the deleg in some cases, and could still have handed out a write delegation on an O_WRONLY OPEN request in some cases. --- fs/nfsd/nfs4state.c | 29 ++++++++++++++++++----------- 1 file changed, 18 insertions(+), 11 deletions(-) --- base-commit: a734662572708cf062e974f659ae50c24fc1ad17 change-id: 20230731-wdeleg-bbdb6b25a3c6 Best regards,