diff mbox series

nfs: block notification on fs with its own ->lock

Message ID 20211216172013.GA13418@fieldses.org (mailing list archive)
State New, archived
Headers show
Series nfs: block notification on fs with its own ->lock | expand

Commit Message

J. Bruce Fields Dec. 16, 2021, 5:20 p.m. UTC
From: "J. Bruce Fields" <bfields@redhat.com>

NFSv4.1 supports an optional lock notification feature which notifies
the client when a lock comes available.  (Normally NFSv4 clients just
poll for locks if necessary.)  To make that work, we need to request a
blocking lock from the filesystem.

We turned that off for NFS in f657f8eef3ff "nfs: don't atempt blocking
locks on nfs reexports" because it actually blocks the nfsd thread while
waiting for the lock.

Thanks to Vasily Averin for pointing out that NFS isn't the only
filesystem with that problem.

Any filesystem that leaves ->lock NULL will use posix_lock_file(), which
does the right thing.  Simplest is just to assume that any filesystem
that defines its own ->lock is not safe to request a blocking lock from.

So, this patch mostly reverts f657f8eef3ff and b840be2f00c0, and instead
uses a check of ->lock (Vasily's suggestion) to decide whether to
support blocking lock notifications on a given filesystem.  Also add a
little documentation.

Perhaps someday we could add back an export flag later to allow
filesystems with "good" ->lock methods to support blocking lock
notifications.

Reported-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/lockd/svclock.c          |  2 +-
 fs/nfs/export.c             |  2 +-
 fs/nfsd/nfs4state.c         | 18 ++++++++++++------
 include/linux/exportfs.h    |  2 --
 include/linux/lockd/lockd.h |  9 +++++++--
 5 files changed, 21 insertions(+), 12 deletions(-)

Comments

Vasily Averin Dec. 17, 2021, 6:41 a.m. UTC | #1
On 16.12.2021 20:20, J. Bruce Fields wrote:
> From: "J. Bruce Fields" <bfields@redhat.com>
> 
> NFSv4.1 supports an optional lock notification feature which notifies
> the client when a lock comes available.  (Normally NFSv4 clients just
> poll for locks if necessary.)  To make that work, we need to request a
> blocking lock from the filesystem.
> 
> We turned that off for NFS in f657f8eef3ff "nfs: don't atempt blocking
> locks on nfs reexports" because it actually blocks the nfsd thread while
> waiting for the lock.
> 
> Thanks to Vasily Averin for pointing out that NFS isn't the only
> filesystem with that problem.
> 
> Any filesystem that leaves ->lock NULL will use posix_lock_file(), which
> does the right thing.  Simplest is just to assume that any filesystem
> that defines its own ->lock is not safe to request a blocking lock from.
> 
> So, this patch mostly reverts f657f8eef3ff and b840be2f00c0, and instead
> uses a check of ->lock (Vasily's suggestion) to decide whether to
> support blocking lock notifications on a given filesystem.  Also add a
> little documentation.
> 
> Perhaps someday we could add back an export flag later to allow
> filesystems with "good" ->lock methods to support blocking lock
> notifications.
> 
> Reported-by: Vasily Averin <vvs@virtuozzo.com>
> Signed-off-by: J. Bruce Fields <bfields@redhat.com>

Reviewed-by: Vasily  Averin <vvs@virtuozzo.com>
Chuck Lever Dec. 17, 2021, 4:11 p.m. UTC | #2
> On Dec 17, 2021, at 1:41 AM, Vasily Averin <vvs@virtuozzo.com> wrote:
> 
> On 16.12.2021 20:20, J. Bruce Fields wrote:
>> From: "J. Bruce Fields" <bfields@redhat.com>
>> 
>> NFSv4.1 supports an optional lock notification feature which notifies
>> the client when a lock comes available.  (Normally NFSv4 clients just
>> poll for locks if necessary.)  To make that work, we need to request a
>> blocking lock from the filesystem.
>> 
>> We turned that off for NFS in f657f8eef3ff "nfs: don't atempt blocking
>> locks on nfs reexports" because it actually blocks the nfsd thread while
>> waiting for the lock.
>> 
>> Thanks to Vasily Averin for pointing out that NFS isn't the only
>> filesystem with that problem.
>> 
>> Any filesystem that leaves ->lock NULL will use posix_lock_file(), which
>> does the right thing.  Simplest is just to assume that any filesystem
>> that defines its own ->lock is not safe to request a blocking lock from.
>> 
>> So, this patch mostly reverts f657f8eef3ff and b840be2f00c0, and instead
>> uses a check of ->lock (Vasily's suggestion) to decide whether to
>> support blocking lock notifications on a given filesystem.  Also add a
>> little documentation.
>> 
>> Perhaps someday we could add back an export flag later to allow
>> filesystems with "good" ->lock methods to support blocking lock
>> notifications.
>> 
>> Reported-by: Vasily Averin <vvs@virtuozzo.com>
>> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> 
> Reviewed-by: Vasily  Averin <vvs@virtuozzo.com>

I've applied this with Vasily's R-b to for-next at

git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.git

I also cleaned up some checkpatch nits in the patch description.

It might be good for subsequent work in this area to be based
on the for-next branch so we can track what is done and what
is left to do.


--
Chuck Lever
Pratyush Yadav Dec. 21, 2021, 11:31 a.m. UTC | #3
Hi,

On 17/12/21 04:11PM, Chuck Lever III wrote:
> 
> > On Dec 17, 2021, at 1:41 AM, Vasily Averin <vvs@virtuozzo.com> wrote:
> > 
> > On 16.12.2021 20:20, J. Bruce Fields wrote:
> >> From: "J. Bruce Fields" <bfields@redhat.com>
> >> 
> >> NFSv4.1 supports an optional lock notification feature which notifies
> >> the client when a lock comes available.  (Normally NFSv4 clients just
> >> poll for locks if necessary.)  To make that work, we need to request a
> >> blocking lock from the filesystem.
> >> 
> >> We turned that off for NFS in f657f8eef3ff "nfs: don't atempt blocking
> >> locks on nfs reexports" because it actually blocks the nfsd thread while
> >> waiting for the lock.
> >> 
> >> Thanks to Vasily Averin for pointing out that NFS isn't the only
> >> filesystem with that problem.
> >> 
> >> Any filesystem that leaves ->lock NULL will use posix_lock_file(), which
> >> does the right thing.  Simplest is just to assume that any filesystem
> >> that defines its own ->lock is not safe to request a blocking lock from.
> >> 
> >> So, this patch mostly reverts f657f8eef3ff and b840be2f00c0, and instead
> >> uses a check of ->lock (Vasily's suggestion) to decide whether to
> >> support blocking lock notifications on a given filesystem.  Also add a
> >> little documentation.
> >> 
> >> Perhaps someday we could add back an export flag later to allow
> >> filesystems with "good" ->lock methods to support blocking lock
> >> notifications.
> >> 
> >> Reported-by: Vasily Averin <vvs@virtuozzo.com>
> >> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> > 
> > Reviewed-by: Vasily  Averin <vvs@virtuozzo.com>
> 
> I've applied this with Vasily's R-b to for-next at
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.git
> 
> I also cleaned up some checkpatch nits in the patch description.
> 
> It might be good for subsequent work in this area to be based
> on the for-next branch so we can track what is done and what
> is left to do.

This patch breaks LLVM build on linux-next for me:

fs/lockd/svclock.c:474:17: error: unused variable 'inode' [-Werror,-Wunused-variable]
        struct inode            *inode = nlmsvc_file_inode(file);

This is because now the only user of inode is the dprintk() call, and 
this is probably a noop when debug is disabled. I think you should wrap 
the declaration of inode under the same debug symbol used to select 
dprintk(). My LSP (ccls) is getting confused and can't point out where 
exactly this macro is declared, so I am not sure which symbol controls 
it (CONFIG_DEBUG? CONFIG_NFS_DEBUG?).
Chuck Lever Dec. 21, 2021, 3:13 p.m. UTC | #4
> On Dec 21, 2021, at 6:31 AM, Pratyush Yadav <p.yadav@ti.com> wrote:
> 
> Hi,
> 
> On 17/12/21 04:11PM, Chuck Lever III wrote:
>> 
>>> On Dec 17, 2021, at 1:41 AM, Vasily Averin <vvs@virtuozzo.com> wrote:
>>> 
>>> On 16.12.2021 20:20, J. Bruce Fields wrote:
>>>> From: "J. Bruce Fields" <bfields@redhat.com>
>>>> 
>>>> NFSv4.1 supports an optional lock notification feature which notifies
>>>> the client when a lock comes available.  (Normally NFSv4 clients just
>>>> poll for locks if necessary.)  To make that work, we need to request a
>>>> blocking lock from the filesystem.
>>>> 
>>>> We turned that off for NFS in f657f8eef3ff "nfs: don't atempt blocking
>>>> locks on nfs reexports" because it actually blocks the nfsd thread while
>>>> waiting for the lock.
>>>> 
>>>> Thanks to Vasily Averin for pointing out that NFS isn't the only
>>>> filesystem with that problem.
>>>> 
>>>> Any filesystem that leaves ->lock NULL will use posix_lock_file(), which
>>>> does the right thing.  Simplest is just to assume that any filesystem
>>>> that defines its own ->lock is not safe to request a blocking lock from.
>>>> 
>>>> So, this patch mostly reverts f657f8eef3ff and b840be2f00c0, and instead
>>>> uses a check of ->lock (Vasily's suggestion) to decide whether to
>>>> support blocking lock notifications on a given filesystem.  Also add a
>>>> little documentation.
>>>> 
>>>> Perhaps someday we could add back an export flag later to allow
>>>> filesystems with "good" ->lock methods to support blocking lock
>>>> notifications.
>>>> 
>>>> Reported-by: Vasily Averin <vvs@virtuozzo.com>
>>>> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
>>> 
>>> Reviewed-by: Vasily  Averin <vvs@virtuozzo.com>
>> 
>> I've applied this with Vasily's R-b to for-next at
>> 
>> git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.git
>> 
>> I also cleaned up some checkpatch nits in the patch description.
>> 
>> It might be good for subsequent work in this area to be based
>> on the for-next branch so we can track what is done and what
>> is left to do.
> 
> This patch breaks LLVM build on linux-next for me:
> 
> fs/lockd/svclock.c:474:17: error: unused variable 'inode' [-Werror,-Wunused-variable]
>        struct inode            *inode = nlmsvc_file_inode(file);
> 
> This is because now the only user of inode is the dprintk() call, and 
> this is probably a noop when debug is disabled. I think you should wrap 
> the declaration of inode under the same debug symbol used to select 
> dprintk(). My LSP (ccls) is getting confused and can't point out where 
> exactly this macro is declared, so I am not sure which symbol controls 
> it (CONFIG_DEBUG? CONFIG_NFS_DEBUG?).

I updated this patch in my for-next tree yesterday to take care of
the issue. The change has been merged into today's linux-next.

--
Chuck Lever
Pratyush Yadav Dec. 21, 2021, 7:30 p.m. UTC | #5
On 21/12/21 03:13PM, Chuck Lever III wrote:
> 
> 
> > On Dec 21, 2021, at 6:31 AM, Pratyush Yadav <p.yadav@ti.com> wrote:
> > 
> > Hi,
> > 
> > On 17/12/21 04:11PM, Chuck Lever III wrote:
> >> 
> >>> On Dec 17, 2021, at 1:41 AM, Vasily Averin <vvs@virtuozzo.com> wrote:
> >>> 
> >>> On 16.12.2021 20:20, J. Bruce Fields wrote:
> >>>> From: "J. Bruce Fields" <bfields@redhat.com>
> >>>> 
> >>>> NFSv4.1 supports an optional lock notification feature which notifies
> >>>> the client when a lock comes available.  (Normally NFSv4 clients just
> >>>> poll for locks if necessary.)  To make that work, we need to request a
> >>>> blocking lock from the filesystem.
> >>>> 
> >>>> We turned that off for NFS in f657f8eef3ff "nfs: don't atempt blocking
> >>>> locks on nfs reexports" because it actually blocks the nfsd thread while
> >>>> waiting for the lock.
> >>>> 
> >>>> Thanks to Vasily Averin for pointing out that NFS isn't the only
> >>>> filesystem with that problem.
> >>>> 
> >>>> Any filesystem that leaves ->lock NULL will use posix_lock_file(), which
> >>>> does the right thing.  Simplest is just to assume that any filesystem
> >>>> that defines its own ->lock is not safe to request a blocking lock from.
> >>>> 
> >>>> So, this patch mostly reverts f657f8eef3ff and b840be2f00c0, and instead
> >>>> uses a check of ->lock (Vasily's suggestion) to decide whether to
> >>>> support blocking lock notifications on a given filesystem.  Also add a
> >>>> little documentation.
> >>>> 
> >>>> Perhaps someday we could add back an export flag later to allow
> >>>> filesystems with "good" ->lock methods to support blocking lock
> >>>> notifications.
> >>>> 
> >>>> Reported-by: Vasily Averin <vvs@virtuozzo.com>
> >>>> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> >>> 
> >>> Reviewed-by: Vasily  Averin <vvs@virtuozzo.com>
> >> 
> >> I've applied this with Vasily's R-b to for-next at
> >> 
> >> git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.git
> >> 
> >> I also cleaned up some checkpatch nits in the patch description.
> >> 
> >> It might be good for subsequent work in this area to be based
> >> on the for-next branch so we can track what is done and what
> >> is left to do.
> > 
> > This patch breaks LLVM build on linux-next for me:
> > 
> > fs/lockd/svclock.c:474:17: error: unused variable 'inode' [-Werror,-Wunused-variable]
> >        struct inode            *inode = nlmsvc_file_inode(file);
> > 
> > This is because now the only user of inode is the dprintk() call, and 
> > this is probably a noop when debug is disabled. I think you should wrap 
> > the declaration of inode under the same debug symbol used to select 
> > dprintk(). My LSP (ccls) is getting confused and can't point out where 
> > exactly this macro is declared, so I am not sure which symbol controls 
> > it (CONFIG_DEBUG? CONFIG_NFS_DEBUG?).
> 
> I updated this patch in my for-next tree yesterday to take care of
> the issue. The change has been merged into today's linux-next.

Thanks. I updated to today's linux-next and I no longer see this error. 
The build still fails though, this time on some firmware driver. I'll go 
bother their maintainers about it now ;-)
diff mbox series

Patch

diff --git a/fs/lockd/svclock.c b/fs/lockd/svclock.c
index e9b85d8fd5fe..98e2f9b32e21 100644
--- a/fs/lockd/svclock.c
+++ b/fs/lockd/svclock.c
@@ -484,7 +484,7 @@  nlmsvc_lock(struct svc_rqst *rqstp, struct nlm_file *file,
 				(long long)lock->fl.fl_end,
 				wait);
 
-	if (inode->i_sb->s_export_op->flags & EXPORT_OP_SYNC_LOCKS) {
+	if (nlmsvc_file_file(file)->f_op->lock) {
 		async_block = wait;
 		wait = 0;
 	}
diff --git a/fs/nfs/export.c b/fs/nfs/export.c
index 171c424cb6d5..01596f2d0a1e 100644
--- a/fs/nfs/export.c
+++ b/fs/nfs/export.c
@@ -158,5 +158,5 @@  const struct export_operations nfs_export_ops = {
 	.fetch_iversion = nfs_fetch_iversion,
 	.flags = EXPORT_OP_NOWCC|EXPORT_OP_NOSUBTREECHK|
 		EXPORT_OP_CLOSE_BEFORE_UNLINK|EXPORT_OP_REMOTE_FS|
-		EXPORT_OP_NOATOMIC_ATTR|EXPORT_OP_SYNC_LOCKS,
+		EXPORT_OP_NOATOMIC_ATTR,
 };
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 1956d377d1a6..3317493d2750 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -6836,7 +6836,6 @@  nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	struct nfsd4_blocked_lock *nbl = NULL;
 	struct file_lock *file_lock = NULL;
 	struct file_lock *conflock = NULL;
-	struct super_block *sb;
 	__be32 status = 0;
 	int lkflg;
 	int err;
@@ -6858,7 +6857,6 @@  nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 		dprintk("NFSD: nfsd4_lock: permission denied!\n");
 		return status;
 	}
-	sb = cstate->current_fh.fh_dentry->d_sb;
 
 	if (lock->lk_is_new) {
 		if (nfsd4_has_session(cstate))
@@ -6910,8 +6908,7 @@  nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	fp = lock_stp->st_stid.sc_file;
 	switch (lock->lk_type) {
 		case NFS4_READW_LT:
-			if (nfsd4_has_session(cstate) &&
-			    !(sb->s_export_op->flags & EXPORT_OP_SYNC_LOCKS))
+			if (nfsd4_has_session(cstate))
 				fl_flags |= FL_SLEEP;
 			fallthrough;
 		case NFS4_READ_LT:
@@ -6923,8 +6920,7 @@  nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 			fl_type = F_RDLCK;
 			break;
 		case NFS4_WRITEW_LT:
-			if (nfsd4_has_session(cstate) &&
-			    !(sb->s_export_op->flags & EXPORT_OP_SYNC_LOCKS))
+			if (nfsd4_has_session(cstate))
 				fl_flags |= FL_SLEEP;
 			fallthrough;
 		case NFS4_WRITE_LT:
@@ -6940,6 +6936,16 @@  nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 		goto out;
 	}
 
+	/*
+	 * Most filesystems with their own ->lock operations will block
+	 * the nfsd thread waiting to acquire the lock.  That leads to
+	 * deadlocks (we don't want every nfsd thread tied up waiting
+	 * for file locks), so don't attempt blocking lock notifications
+	 * on those filesystems:
+	 */
+	if (nf->nf_file->f_op->lock)
+		fl_flags &= ~FL_SLEEP;
+
 	if (!nf) {
 		status = nfserr_openmode;
 		goto out;
diff --git a/include/linux/exportfs.h b/include/linux/exportfs.h
index 3260fe714846..fe848901fcc3 100644
--- a/include/linux/exportfs.h
+++ b/include/linux/exportfs.h
@@ -221,8 +221,6 @@  struct export_operations {
 #define EXPORT_OP_NOATOMIC_ATTR		(0x10) /* Filesystem cannot supply
 						  atomic attribute updates
 						*/
-#define EXPORT_OP_SYNC_LOCKS		(0x20) /* Filesystem can't do
-						  asychronous blocking locks */
 	unsigned long	flags;
 };
 
diff --git a/include/linux/lockd/lockd.h b/include/linux/lockd/lockd.h
index c4ae6506b8b3..fcef192e5e45 100644
--- a/include/linux/lockd/lockd.h
+++ b/include/linux/lockd/lockd.h
@@ -303,10 +303,15 @@  void		  nlmsvc_invalidate_all(void);
 int           nlmsvc_unlock_all_by_sb(struct super_block *sb);
 int           nlmsvc_unlock_all_by_ip(struct sockaddr *server_addr);
 
+static inline struct file *nlmsvc_file_file(struct nlm_file *file)
+{
+	return file->f_file[O_RDONLY] ?
+	       file->f_file[O_RDONLY] : file->f_file[O_WRONLY];
+}
+
 static inline struct inode *nlmsvc_file_inode(struct nlm_file *file)
 {
-	return locks_inode(file->f_file[O_RDONLY] ?
-			   file->f_file[O_RDONLY] : file->f_file[O_WRONLY]);
+	return locks_inode(nlmsvc_file_file(file));
 }
 
 static inline int __nlm_privileged_request4(const struct sockaddr *sap)