Message ID | 1409834323-7171-6-git-send-email-jlayton@primarydata.com (mailing list archive) |
---|---|
State | New, archived |
On Thu, Sep 04, 2014 at 08:38:31AM -0400, Jeff Layton wrote:
> Ensure that it's OK to pass in a NULL file_lock double pointer on
> a F_UNLCK request and convert the vfs_setlease F_UNLCK callers to
> do just that.
>
> Finally, turn the BUG_ON in generic_setlease into a WARN_ON_ONCE
> with an error return. That's a problem we can handle without
> crashing the box if it occurs.

Can we just make generic_delete_lease (maybe renamed to vfs_delete_lease)
the interface for deleting leases instead of going through a useless
multiplex and file operation?

On Thu, 4 Sep 2014 13:14:24 -0700
Christoph Hellwig <hch@infradead.org> wrote:

> On Thu, Sep 04, 2014 at 08:38:31AM -0400, Jeff Layton wrote:
> > Ensure that it's OK to pass in a NULL file_lock double pointer on
> > a F_UNLCK request and convert the vfs_setlease F_UNLCK callers to
> > do just that.
> >
> > Finally, turn the BUG_ON in generic_setlease into a WARN_ON_ONCE
> > with an error return. That's a problem we can handle without
> > crashing the box if it occurs.
>
> Can we just make generic_delete_lease (maybe renamed to vfs_delete_lease)
> the interface for deleting leases instead of going through a useless
> multiplex and file operation?
>

I'm not sure that change really makes sense to me at this point.

Suppose we have an exportable filesystem with a ->setlease
implementation [1]. We end up calling into it to set up a lease and it
calls generic_add_lease. If we make the change you're suggesting, then
we'll have no parallel to a ->setlease op when removing that lease.

We could of course make a ->dellease op or something, but I'd rather
not introduce that change until I've had a chance to do some other
cleanup to the file locking infrastructure.

So...I'm not opposed to doing what you suggest, but I'd rather not do
it just yet until I've gotten a little farther with some other cleanup
of how we deal with locks in general. I think it'll be easier to do
that once some other changes have gone in.

I'll post a draft patchset based on those changes "real soon now" as an
RFC. Hopefully at that point my rationale will make a bit more sense...

[1]: of course, only cifs has a non-trivial one for now and it's pretty
half-assed...

On Thu, 4 Sep 2014 08:38:31 -0400 Jeff Layton <jlayton@primarydata.com>
wrote:

> Ensure that it's OK to pass in a NULL file_lock double pointer on
> a F_UNLCK request and convert the vfs_setlease F_UNLCK callers to
> do just that.
>
> Finally, turn the BUG_ON in generic_setlease into a WARN_ON_ONCE
> with an error return. That's a problem we can handle without
> crashing the box if it occurs.
>
> Signed-off-by: Jeff Layton <jlayton@primarydata.com>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> ---
>  fs/locks.c                      | 34 ++++++++++++++--------------------
>  fs/nfsd/nfs4state.c             |  2 +-
>  include/trace/events/filelock.h | 14 +++++++-------
>  3 files changed, 22 insertions(+), 28 deletions(-)
>
> diff --git a/fs/locks.c b/fs/locks.c
> index 4031324e6cca..1289b74fffbf 100644
> --- a/fs/locks.c
> +++ b/fs/locks.c
> @@ -1637,22 +1637,23 @@ out:
>  	return error;
>  }
>
> -static int generic_delete_lease(struct file *filp, struct file_lock **flp)
> +static int generic_delete_lease(struct file *filp)
>  {
> +	int error = -EAGAIN;
>  	struct file_lock *fl, **before;
>  	struct dentry *dentry = filp->f_path.dentry;
>  	struct inode *inode = dentry->d_inode;
>
> -	trace_generic_delete_lease(inode, *flp);
> -
>  	for (before = &inode->i_flock;
>  			((fl = *before) != NULL) && IS_LEASE(fl);
>  			before = &fl->fl_next) {
> -		if (fl->fl_file != filp)
> -			continue;
> -		return (*flp)->fl_lmops->lm_change(before, F_UNLCK);
> +		if (fl->fl_file == filp)
> +			break;
>  	}
> -	return -EAGAIN;
> +	trace_generic_delete_lease(inode, fl);
> +	if (fl)
> +		error = fl->fl_lmops->lm_change(before, F_UNLCK);
> +	return error;
>  }

Hi Jeff,
 I have a report of a crash in 3.18 because fl->fl_lmops is NULL in the above.
 https://bugzilla.suse.com/show_bug.cgi?id=912569

I assume this happens because a file_lock is found which is not IS_LEASE.
When that happens, the loop will abort, but fl will not be NULL.
As non-LEASE locks have a NULL fl_lmops, we crash.

I would be inclined to put the code back the way it was, and just move the
trace_generic_delete_lease call.

Alternately we could make it

	if (fl && IS_LEASE(fl))
		error = fl->fl_lmops-> .....

What do you think?

NeilBrown

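To make the failure mode above concrete, here is a minimal user-space sketch, not kernel code. The struct layout, the is_lease flag (standing in for IS_LEASE()), MOCK_F_UNLCK, mock_lm_change() and delete_lease_guarded() are all hypothetical names made up for illustration; only the loop shape and the "fl && IS_LEASE(fl)" guard are taken from the thread.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (assumption, not the
 * real definitions from include/linux/fs.h). */
struct file_lock;

struct lock_manager_operations {
	int (*lm_change)(struct file_lock **before, int arg);
};

struct file_lock {
	struct file_lock *fl_next;
	const void *fl_file;	/* which open file owns this lock */
	int is_lease;		/* stand-in for IS_LEASE(fl) */
	const struct lock_manager_operations *fl_lmops; /* NULL for non-leases */
};

#define MOCK_F_UNLCK 2		/* placeholder value for this sketch */

/* Hypothetical lm_change: just unlink the lease from the list. */
static int mock_lm_change(struct file_lock **before, int arg)
{
	(void)arg;
	*before = (*before)->fl_next;
	return 0;
}

static const struct lock_manager_operations mock_lease_ops = {
	.lm_change = mock_lm_change,
};

/* Same loop as in the patch, plus the guard suggested in the mail. */
int delete_lease_guarded(struct file_lock **head, const void *filp)
{
	struct file_lock *fl, **before;
	int error = -EAGAIN;

	for (before = head;
	     ((fl = *before) != NULL) && fl->is_lease;
	     before = &fl->fl_next) {
		if (fl->fl_file == filp)
			break;
	}
	/*
	 * The loop can stop for three reasons: end of list (fl == NULL),
	 * a matching lease, or the first non-lease entry. In the last case
	 * fl is non-NULL but fl->fl_lmops is NULL, so a bare "if (fl)"
	 * would dereference a NULL ops pointer.
	 */
	if (fl && fl->is_lease)
		error = fl->fl_lmops->lm_change(before, MOCK_F_UNLCK);
	return error;
}
```

With a list of one lease followed by one non-lease lock, asking to delete a lease for a file that holds none walks onto the non-lease entry and, with the guard in place, returns -EAGAIN instead of crashing.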
On Tue, 13 Jan 2015 12:03:43 +1300
NeilBrown <neilb@suse.de> wrote:

> On Thu, 4 Sep 2014 08:38:31 -0400 Jeff Layton <jlayton@primarydata.com>
> wrote:
>
> > Ensure that it's OK to pass in a NULL file_lock double pointer on
> > a F_UNLCK request and convert the vfs_setlease F_UNLCK callers to
> > do just that.
> >
> > Finally, turn the BUG_ON in generic_setlease into a WARN_ON_ONCE
> > with an error return. That's a problem we can handle without
> > crashing the box if it occurs.
> >
> > Signed-off-by: Jeff Layton <jlayton@primarydata.com>
> > Reviewed-by: Christoph Hellwig <hch@lst.de>
> > ---
> >  fs/locks.c                      | 34 ++++++++++++++--------------------
> >  fs/nfsd/nfs4state.c             |  2 +-
> >  include/trace/events/filelock.h | 14 +++++++-------
> >  3 files changed, 22 insertions(+), 28 deletions(-)
> >
> > diff --git a/fs/locks.c b/fs/locks.c
> > index 4031324e6cca..1289b74fffbf 100644
> > --- a/fs/locks.c
> > +++ b/fs/locks.c
> > @@ -1637,22 +1637,23 @@ out:
> >  	return error;
> >  }
> >
> > -static int generic_delete_lease(struct file *filp, struct file_lock **flp)
> > +static int generic_delete_lease(struct file *filp)
> >  {
> > +	int error = -EAGAIN;
> >  	struct file_lock *fl, **before;
> >  	struct dentry *dentry = filp->f_path.dentry;
> >  	struct inode *inode = dentry->d_inode;
> >
> > -	trace_generic_delete_lease(inode, *flp);
> > -
> >  	for (before = &inode->i_flock;
> >  			((fl = *before) != NULL) && IS_LEASE(fl);
> >  			before = &fl->fl_next) {
> > -		if (fl->fl_file != filp)
> > -			continue;
> > -		return (*flp)->fl_lmops->lm_change(before, F_UNLCK);
> > +		if (fl->fl_file == filp)
> > +			break;
> >  	}
> > -	return -EAGAIN;
> > +	trace_generic_delete_lease(inode, fl);
> > +	if (fl)
> > +		error = fl->fl_lmops->lm_change(before, F_UNLCK);
> > +	return error;
> >  }
>
> Hi Jeff,
>  I have a report of a crash in 3.18 because fl->fl_lmops is NULL in the above.
>  https://bugzilla.suse.com/show_bug.cgi?id=912569
>
> I assume this happens because a file_lock is found which is not IS_LEASE.
> When that happens, the loop will abort, but fl will not be NULL.
> As non-LEASE locks have a NULL fl_lmops, we crash.
>
> I would be inclined to put the code back the way it was, and just move the
> trace_generic_delete_lease call.
>
> Alternately we could make it
>
> 	if (fl && IS_LEASE(fl))
> 		error = fl->fl_lmops-> .....
>
> What do you think?
>
> NeilBrown

Doh! Well spotted...

Either fix sounds fine as long as we don't make generic_delete_lease
require a "flp" arg again. IOW, if you do make the code work similarly
to how it did before, then we should do:

	return fl->fl_lmops->lm_change(before, F_UNLCK);

...rather than trying to use the ops from a completely different struct
file_lock argument that's passed in.

FWIW, I have an overhaul of the locking code that is queued for v3.20
that will also fix this (as we'll be moving all of the different locks
to separate lists), but we'll obviously need to queue up a patch for
stable for this in the interim.

Thanks!

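A rough user-space sketch of the "put the code back the way it was" variant with the adjustment described above: the loop returns from inside the body and calls lm_change through the lock it actually found, not through a separately passed-in file_lock. The struct layout, the is_lease flag (standing in for IS_LEASE()), MOCK_F_UNLCK, mock_lm_change() and delete_lease_old_shape() are hypothetical names for illustration only, not the kernel's definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (assumption). */
struct file_lock;

struct lock_manager_operations {
	int (*lm_change)(struct file_lock **before, int arg);
};

struct file_lock {
	struct file_lock *fl_next;
	const void *fl_file;	/* which open file owns this lock */
	int is_lease;		/* stand-in for IS_LEASE(fl) */
	const struct lock_manager_operations *fl_lmops; /* NULL for non-leases */
};

#define MOCK_F_UNLCK 2		/* placeholder value for this sketch */

/* Hypothetical lm_change: just unlink the lease from the list. */
static int mock_lm_change(struct file_lock **before, int arg)
{
	(void)arg;
	*before = (*before)->fl_next;
	return 0;
}

static const struct lock_manager_operations mock_lease_ops = {
	.lm_change = mock_lm_change,
};

/* Pre-patch loop shape, without any "flp" argument. */
int delete_lease_old_shape(struct file_lock **head, const void *filp)
{
	struct file_lock *fl, **before;

	for (before = head;
	     ((fl = *before) != NULL) && fl->is_lease;
	     before = &fl->fl_next) {
		if (fl->fl_file != filp)
			continue;
		/*
		 * fl passed the lease check in the loop condition, so its
		 * fl_lmops is the right ops table to use here.
		 */
		return fl->fl_lmops->lm_change(before, MOCK_F_UNLCK);
	}
	/* Fell off the lease portion of the list without a match. */
	return -EAGAIN;
}
```

Because the call happens only inside the loop body, it can never be reached with a non-lease entry, which is why this shape did not need an extra IS_LEASE check after the loop.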
On Mon, 12 Jan 2015 18:25:00 -0500 Jeff Layton <jeff.layton@primarydata.com>
wrote:

> On Tue, 13 Jan 2015 12:03:43 +1300
> NeilBrown <neilb@suse.de> wrote:
>
> > On Thu, 4 Sep 2014 08:38:31 -0400 Jeff Layton <jlayton@primarydata.com>
> > wrote:
> >
> > > Ensure that it's OK to pass in a NULL file_lock double pointer on
> > > a F_UNLCK request and convert the vfs_setlease F_UNLCK callers to
> > > do just that.
> > >
> > > Finally, turn the BUG_ON in generic_setlease into a WARN_ON_ONCE
> > > with an error return. That's a problem we can handle without
> > > crashing the box if it occurs.
> > >
> > > Signed-off-by: Jeff Layton <jlayton@primarydata.com>
> > > Reviewed-by: Christoph Hellwig <hch@lst.de>
> > > ---
> > >  fs/locks.c                      | 34 ++++++++++++++--------------------
> > >  fs/nfsd/nfs4state.c             |  2 +-
> > >  include/trace/events/filelock.h | 14 +++++++-------
> > >  3 files changed, 22 insertions(+), 28 deletions(-)
> > >
> > > diff --git a/fs/locks.c b/fs/locks.c
> > > index 4031324e6cca..1289b74fffbf 100644
> > > --- a/fs/locks.c
> > > +++ b/fs/locks.c
> > > @@ -1637,22 +1637,23 @@ out:
> > >  	return error;
> > >  }
> > >
> > > -static int generic_delete_lease(struct file *filp, struct file_lock **flp)
> > > +static int generic_delete_lease(struct file *filp)
> > >  {
> > > +	int error = -EAGAIN;
> > >  	struct file_lock *fl, **before;
> > >  	struct dentry *dentry = filp->f_path.dentry;
> > >  	struct inode *inode = dentry->d_inode;
> > >
> > > -	trace_generic_delete_lease(inode, *flp);
> > > -
> > >  	for (before = &inode->i_flock;
> > >  			((fl = *before) != NULL) && IS_LEASE(fl);
> > >  			before = &fl->fl_next) {
> > > -		if (fl->fl_file != filp)
> > > -			continue;
> > > -		return (*flp)->fl_lmops->lm_change(before, F_UNLCK);
> > > +		if (fl->fl_file == filp)
> > > +			break;
> > >  	}
> > > -	return -EAGAIN;
> > > +	trace_generic_delete_lease(inode, fl);
> > > +	if (fl)
> > > +		error = fl->fl_lmops->lm_change(before, F_UNLCK);
> > > +	return error;
> > >  }
> >
> > Hi Jeff,
> >  I have a report of a crash in 3.18 because fl->fl_lmops is NULL in the above.
> >  https://bugzilla.suse.com/show_bug.cgi?id=912569
> >
> > I assume this happens because a file_lock is found which is not IS_LEASE.
> > When that happens, the loop will abort, but fl will not be NULL.
> > As non-LEASE locks have a NULL fl_lmops, we crash.
> >
> > I would be inclined to put the code back the way it was, and just move the
> > trace_generic_delete_lease call.
> >
> > Alternately we could make it
> >
> > 	if (fl && IS_LEASE(fl))
> > 		error = fl->fl_lmops-> .....
> >
> > What do you think?
> >
> > NeilBrown
>
> Doh! Well spotted...
>
> Either fix sounds fine as long as we don't make generic_delete_lease
> require a "flp" arg again. IOW, if you do make the code work similarly
> to how it did before, then we should do:
>
> 	return fl->fl_lmops->lm_change(before, F_UNLCK);
>
> ...rather than trying to use the ops from a completely different struct
> file_lock argument that's passed in.
>
> FWIW, I have an overhaul of the locking code that is queued for v3.20
> that will also fix this (as we'll be moving all of the different locks
> to separate lists), but we'll obviously need to queue up a patch for
> stable for this in the interim.
>
> Thanks!

As you are going to re-write it all I won't try to make it elegant, just a
simple fix. I'll post shortly.

Thanks,
NeilBrown

diff --git a/fs/locks.c b/fs/locks.c
index 4031324e6cca..1289b74fffbf 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1637,22 +1637,23 @@ out:
 	return error;
 }
 
-static int generic_delete_lease(struct file *filp, struct file_lock **flp)
+static int generic_delete_lease(struct file *filp)
 {
+	int error = -EAGAIN;
 	struct file_lock *fl, **before;
 	struct dentry *dentry = filp->f_path.dentry;
 	struct inode *inode = dentry->d_inode;
 
-	trace_generic_delete_lease(inode, *flp);
-
 	for (before = &inode->i_flock;
 			((fl = *before) != NULL) && IS_LEASE(fl);
 			before = &fl->fl_next) {
-		if (fl->fl_file != filp)
-			continue;
-		return (*flp)->fl_lmops->lm_change(before, F_UNLCK);
+		if (fl->fl_file == filp)
+			break;
 	}
-	return -EAGAIN;
+	trace_generic_delete_lease(inode, fl);
+	if (fl)
+		error = fl->fl_lmops->lm_change(before, F_UNLCK);
+	return error;
 }
 
 /**
@@ -1682,13 +1683,15 @@ int generic_setlease(struct file *filp, long arg, struct file_lock **flp)
 
 	time_out_leases(inode);
 
-	BUG_ON(!(*flp)->fl_lmops->lm_break);
-
 	switch (arg) {
 	case F_UNLCK:
-		return generic_delete_lease(filp, flp);
+		return generic_delete_lease(filp);
 	case F_RDLCK:
 	case F_WRLCK:
+		if (!(*flp)->fl_lmops->lm_break) {
+			WARN_ON_ONCE(1);
+			return -ENOLCK;
+		}
 		return generic_add_lease(filp, arg, flp);
 	default:
 		return -EINVAL;
@@ -1744,15 +1747,6 @@ int vfs_setlease(struct file *filp, long arg, struct file_lock **lease)
 }
 EXPORT_SYMBOL_GPL(vfs_setlease);
 
-static int do_fcntl_delete_lease(struct file *filp)
-{
-	struct file_lock fl, *flp = &fl;
-
-	lease_init(filp, F_UNLCK, flp);
-
-	return vfs_setlease(filp, F_UNLCK, &flp);
-}
-
 static int do_fcntl_add_lease(unsigned int fd, struct file *filp, long arg)
 {
 	struct file_lock *fl, *ret;
@@ -1809,7 +1803,7 @@ out_unlock:
 int fcntl_setlease(unsigned int fd, struct file *filp, long arg)
 {
 	if (arg == F_UNLCK)
-		return do_fcntl_delete_lease(filp);
+		return vfs_setlease(filp, F_UNLCK, NULL);
 	return do_fcntl_add_lease(fd, filp, arg);
 }
 
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 29fac18d9102..0cd252916e1a 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -683,7 +683,7 @@ static void nfs4_put_deleg_lease(struct nfs4_file *fp)
 	if (!fp->fi_lease)
 		return;
 	if (atomic_dec_and_test(&fp->fi_delegees)) {
-		vfs_setlease(fp->fi_deleg_file, F_UNLCK, &fp->fi_lease);
+		vfs_setlease(fp->fi_deleg_file, F_UNLCK, NULL);
 		fp->fi_lease = NULL;
 		fput(fp->fi_deleg_file);
 		fp->fi_deleg_file = NULL;
diff --git a/include/trace/events/filelock.h b/include/trace/events/filelock.h
index 59d11c22f076..a0d008070962 100644
--- a/include/trace/events/filelock.h
+++ b/include/trace/events/filelock.h
@@ -53,15 +53,15 @@ DECLARE_EVENT_CLASS(filelock_lease,
 	),
 
 	TP_fast_assign(
-		__entry->fl = fl;
+		__entry->fl = fl ? fl : NULL;
 		__entry->s_dev = inode->i_sb->s_dev;
 		__entry->i_ino = inode->i_ino;
-		__entry->fl_next = fl->fl_next;
-		__entry->fl_owner = fl->fl_owner;
-		__entry->fl_flags = fl->fl_flags;
-		__entry->fl_type = fl->fl_type;
-		__entry->fl_break_time = fl->fl_break_time;
-		__entry->fl_downgrade_time = fl->fl_downgrade_time;
+		__entry->fl_next = fl ? fl->fl_next : NULL;
+		__entry->fl_owner = fl ? fl->fl_owner : NULL;
+		__entry->fl_flags = fl ? fl->fl_flags : 0;
+		__entry->fl_type = fl ? fl->fl_type : 0;
+		__entry->fl_break_time = fl ? fl->fl_break_time : 0;
+		__entry->fl_downgrade_time = fl ? fl->fl_downgrade_time : 0;
 	),
 
 	TP_printk("fl=0x%p dev=0x%x:0x%x ino=0x%lx fl_next=0x%p fl_owner=0x%p fl_flags=%s fl_type=%s fl_break_time=%lu fl_downgrade_time=%lu",
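
The reworked dispatch in the generic_setlease hunk above can be approximated in user space as follows. This is only a sketch under stated assumptions: the MOCK_F_* constants, the trimmed-down structs, mock_delete_lease(), mock_add_lease(), mock_break() and setlease_sketch() are hypothetical stand-ins, and the WARN_ON_ONCE is represented by a comment only. It shows two things the patch relies on: the F_UNLCK path never touches flp, so NULL is safe there, and a missing lm_break on the add path becomes an -ENOLCK return instead of a crash.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Placeholder command values for this sketch (assumption). */
enum { MOCK_F_RDLCK = 0, MOCK_F_WRLCK = 1, MOCK_F_UNLCK = 2 };

struct file_lock;

struct lock_manager_operations {
	void (*lm_break)(struct file_lock *fl);
};

struct file_lock {
	const struct lock_manager_operations *fl_lmops;
};

/* Hypothetical no-op break callback, used to build a valid ops table. */
void mock_break(struct file_lock *fl)
{
	(void)fl;
}

/* Stand-ins for generic_delete_lease()/generic_add_lease(). */
static int mock_delete_lease(void)
{
	return 0;
}

static int mock_add_lease(struct file_lock **flp)
{
	(void)flp;
	return 0;
}

int setlease_sketch(long arg, struct file_lock **flp)
{
	switch (arg) {
	case MOCK_F_UNLCK:
		/* flp is not dereferenced on this path, so NULL is fine. */
		return mock_delete_lease();
	case MOCK_F_RDLCK:
	case MOCK_F_WRLCK:
		if (!(*flp)->fl_lmops->lm_break) {
			/* The kernel patch does WARN_ON_ONCE(1) here. */
			return -ENOLCK;
		}
		return mock_add_lease(flp);
	default:
		return -EINVAL;
	}
}
```

Moving the lm_break check under the F_RDLCK/F_WRLCK cases is what makes the NULL argument legal for F_UNLCK: the old unconditional BUG_ON dereferenced *flp before the switch was even reached.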