diff mbox

locks: breaking read lease should not block read open

Message ID 20110609231606.GB22215@fieldses.org (mailing list archive)
State New, archived
Headers show

Commit Message

J. Bruce Fields June 9, 2011, 11:16 p.m. UTC
The lease code behavior during the lease-breaking process is strange.
Fixing it completely would be complicated by the fact that the current
code allows a lease break to downgrade the lease instead of necessarily
removing it.

But I can't see what the point of that feature is.  And googling around
and looking at the Samba code, I can't see any evidence that anyone uses
it.  Think we could just do away with removing the ability to downgrade
to satisfy a lease break?

--b.

commit 70fc3ee9d3e9d61203d4310db4a8128886747272
Author: J. Bruce Fields <bfields@redhat.com>
Date:   Thu Jun 9 09:31:30 2011 -0400

    locks: breaking read lease should not block read open
    
    Currently read opens block while a lease break is in progress, even if
    the lease being broken was only a read lease.
    
    This is an annoyance for v4, since clients may need to do read opens
    before they can return a delegation.
    
    This happens because we use fl_type on a breaking lease to track the
    type it will have when the break is finished (F_RDLCK or F_UNLCK) as
    opposed to the type it had before the break started, so we no longer
    even know at that point whether it was a write lease or a read lease.
    
    The only reason we do that is to allow a write lease broken by a read
    open to be downgraded to a read lease instead of removed completely.
    
    However, that doesn't seem like a useful feature--as far as I can tell,
    nobody uses it or likely ever will.
    
    So, just keep the original lease type in fl_type.
    
    Reported-by: Casey Bodley <cbodley@citi.umich.edu>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Volker Lendecke June 10, 2011, 7:56 a.m. UTC | #1
On Thu, Jun 09, 2011 at 07:16:06PM -0400, J. Bruce Fields wrote:
> The lease code behavior during the lease-breaking process is strange.
> Fixing it completely would be complicated by the fact that the current
> code allows a lease break to downgrade the lease instead of necessarily
> removing it.
> 
> But I can't see what the point of that feature is.  And googling around
> and looking at the Samba code, I can't see any evidence that anyone uses
> it.  Think we could just do away with removing the ability to downgrade
> to satisfy a lease break?

Without having looked too deeply, just let me point out that
Samba here has a plain flaw. Early Linux Kernel versions
that we programmed against did not properly support read
only leases, so we did not implement that initially. If I
remember correctly we never got around to finally do it once
it became available. Eventually we will probably, as read
only leases are a pretty important feature to present to
CIFS clients.

Volker
J. Bruce Fields June 10, 2011, 1:48 p.m. UTC | #2
On Fri, Jun 10, 2011 at 09:56:49AM +0200, Volker Lendecke wrote:
> On Thu, Jun 09, 2011 at 07:16:06PM -0400, J. Bruce Fields wrote:
> > The lease code behavior during the lease-breaking process is strange.
> > Fixing it completely would be complicated by the fact that the current
> > code allows a lease break to downgrade the lease instead of necessarily
> > removing it.
> > 
> > But I can't see what the point of that feature is.  And googling around
> > and looking at the Samba code, I can't see any evidence that anyone uses
> > it.  Think we could just do away with removing the ability to downgrade
> > to satisfy a lease break?
> 
> Without having looked too deeply, just let me point out that
> Samba here has a plain flaw. Early Linux Kernel versions
> that we programmed against did not properly support read
> only leases, so we did not implement that initially. If I
> remember correctly we never got around to finally do it once
> it became available. Eventually we will probably, as read
> only leases are a pretty important feature to present to
> CIFS clients.

Thanks, I didn't know that.  (Or I did, and I forgot.)

When you *do* implement that, is there any chance you'd have this need
to be able to downgrade to a read lease in the case of a conflict?

--b.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
J. Bruce Fields July 21, 2011, 12:07 a.m. UTC | #3
On Fri, Jun 10, 2011 at 09:48:59AM -0400, J. Bruce Fields wrote:
> On Fri, Jun 10, 2011 at 09:56:49AM +0200, Volker Lendecke wrote:
> > Without having looked too deeply, just let me point out that
> > Samba here has a plain flaw. Early Linux Kernel versions
> > that we programmed against did not properly support read
> > only leases, so we did not implement that initially. If I
> > remember correctly we never got around to finally do it once
> > it became available. Eventually we will probably, as read
> > only leases are a pretty important feature to present to
> > CIFS clients.
> 
> Thanks, I didn't know that.  (Or I did, and I forgot.)
> 
> When you *do* implement that, is there any chance you'd have this need
> to be able to downgrade to a read lease in the case of a conflict?

So it's a question about the protocols samba implements:

	- Do they allow an atomic downgrade from an exclusive to a
	  shared oplock?  (Or to a level 2 oplock, or whatever the right
	  term is).
	- If so, can that happen as a response to a conflicting open?
	  (So, if you're holding an exclusive oplock, and a conflicting
	  open comes in, can the server-to-client break message say "now
	  you're getting a shared oplock instead"?  Or is the client
	  left without any oplock until it requests a new one?)

I'm not sure how to approach the lease code.

On the one hand, I've never seen any evidence that anyone outside Samba
and NFSv4 has ever used it, and both currently make extremely limited
use of it.  So we could probably get away with "fixing" the lease code
to do whatever both of us need.

On the other hand, I don't know how to figure out what exactly Samba
will actually need.  And the conservative thing to do would be to leave
leases alone.

So maybe I'm better off with a new "NFSv4 delegation lock type" that
does exactly what I need it to, and that's only available from inside
the kernel for now.

Then later if it proves useful to Samba as well, we could figure out how
to export an interface for it to userspace.

I'm not sure.

--b.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jeremy Allison July 21, 2011, 12:15 a.m. UTC | #4
On Wed, Jul 20, 2011 at 08:07:58PM -0400, J. Bruce Fields wrote:
> On Fri, Jun 10, 2011 at 09:48:59AM -0400, J. Bruce Fields wrote:
> > On Fri, Jun 10, 2011 at 09:56:49AM +0200, Volker Lendecke wrote:
> > > Without having looked too deeply, just let me point out that
> > > Samba here has a plain flaw. Early Linux Kernel versions
> > > that we programmed against did not properly support read
> > > only leases, so we did not implement that initially. If I
> > > remember correctly we never got around to finally do it once
> > > it became available. Eventually we will probably, as read
> > > only leases are a pretty important feature to present to
> > > CIFS clients.
> > 
> > Thanks, I didn't know that.  (Or I did, and I forgot.)
> > 
> > When you *do* implement that, is there any chance you'd have this need
> > to be able to downgrade to a read lease in the case of a conflict?
> 
> So it's a question about the protocols samba implements:
> 
> 	- Do they allow an atomic downgrade from an exclusive to a
> 	  shared oplock?  (Or to a level 2 oplock, or whatever the right
> 	  term is).

Yes. Exclusive can go to level 2 - in fact that's the default
downgrade we do (unless an smb.conf option explicity denies it).

> 	- If so, can that happen as a response to a conflicting open?
> 	  (So, if you're holding an exclusive oplock, and a conflicting
> 	  open comes in, can the server-to-client break message say "now
> 	  you're getting a shared oplock instead"?  Or is the client
> 	  left without any oplock until it requests a new one?)

Yes, this can happen.

In SMB, we only break to no lease when a write request comes
in on a exclusive or level2 oplock (read-lease) handle.

Jeremy.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
J. Bruce Fields July 21, 2011, 4:35 p.m. UTC | #5
On Wed, Jul 20, 2011 at 05:15:42PM -0700, Jeremy Allison wrote:
> On Wed, Jul 20, 2011 at 08:07:58PM -0400, J. Bruce Fields wrote:
> > On Fri, Jun 10, 2011 at 09:48:59AM -0400, J. Bruce Fields wrote:
> > > On Fri, Jun 10, 2011 at 09:56:49AM +0200, Volker Lendecke wrote:
> > > > Without having looked too deeply, just let me point out that
> > > > Samba here has a plain flaw. Early Linux Kernel versions
> > > > that we programmed against did not properly support read
> > > > only leases, so we did not implement that initially. If I
> > > > remember correctly we never got around to finally do it once
> > > > it became available. Eventually we will probably, as read
> > > > only leases are a pretty important feature to present to
> > > > CIFS clients.
> > > 
> > > Thanks, I didn't know that.  (Or I did, and I forgot.)
> > > 
> > > When you *do* implement that, is there any chance you'd have this need
> > > to be able to downgrade to a read lease in the case of a conflict?
> > 
> > So it's a question about the protocols samba implements:
> > 
> > 	- Do they allow an atomic downgrade from an exclusive to a
> > 	  shared oplock?  (Or to a level 2 oplock, or whatever the right
> > 	  term is).
> 
> Yes. Exclusive can go to level 2 - in fact that's the default
> downgrade we do (unless an smb.conf option explicity denies it).
> 
> > 	- If so, can that happen as a response to a conflicting open?
> > 	  (So, if you're holding an exclusive oplock, and a conflicting
> > 	  open comes in, can the server-to-client break message say "now
> > 	  you're getting a shared oplock instead"?  Or is the client
> > 	  left without any oplock until it requests a new one?)
> 
> Yes, this can happen.
> 
> In SMB, we only break to no lease when a write request comes
> in on a exclusive or level2 oplock (read-lease) handle.

Ok, thanks, that means we need a more complicated fix here--I'll work on
that....

--b.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
J. Bruce Fields July 29, 2011, 2:27 a.m. UTC | #6
On Thu, Jul 21, 2011 at 12:35:20PM -0400, J. Bruce Fields wrote:
> On Wed, Jul 20, 2011 at 05:15:42PM -0700, Jeremy Allison wrote:
> > On Wed, Jul 20, 2011 at 08:07:58PM -0400, J. Bruce Fields wrote:
> > > So it's a question about the protocols samba implements:
> > > 
> > > 	- Do they allow an atomic downgrade from an exclusive to a
> > > 	  shared oplock?  (Or to a level 2 oplock, or whatever the right
> > > 	  term is).
> > 
> > Yes. Exclusive can go to level 2 - in fact that's the default
> > downgrade we do (unless an smb.conf option explicity denies it).
> > 
> > > 	- If so, can that happen as a response to a conflicting open?
> > > 	  (So, if you're holding an exclusive oplock, and a conflicting
> > > 	  open comes in, can the server-to-client break message say "now
> > > 	  you're getting a shared oplock instead"?  Or is the client
> > > 	  left without any oplock until it requests a new one?)
> > 
> > Yes, this can happen.
> > 
> > In SMB, we only break to no lease when a write request comes
> > in on a exclusive or level2 oplock (read-lease) handle.
> 
> Ok, thanks, that means we need a more complicated fix here--I'll work on
> that....

My attempt follows.  Lightly tested.

I'll probably try writing a test or two for it, then queueing up
something like this for 3.2, absent objections.

--b.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
J. Bruce Fields Aug. 19, 2011, 4:04 p.m. UTC | #7
On Thu, Jul 28, 2011 at 10:27:58PM -0400, J. Bruce Fields wrote:
> On Thu, Jul 21, 2011 at 12:35:20PM -0400, J. Bruce Fields wrote:
> > On Wed, Jul 20, 2011 at 05:15:42PM -0700, Jeremy Allison wrote:
> > > On Wed, Jul 20, 2011 at 08:07:58PM -0400, J. Bruce Fields wrote:
> > > > So it's a question about the protocols samba implements:
> > > > 
> > > > 	- Do they allow an atomic downgrade from an exclusive to a
> > > > 	  shared oplock?  (Or to a level 2 oplock, or whatever the right
> > > > 	  term is).
> > > 
> > > Yes. Exclusive can go to level 2 - in fact that's the default
> > > downgrade we do (unless an smb.conf option explicity denies it).
> > > 
> > > > 	- If so, can that happen as a response to a conflicting open?
> > > > 	  (So, if you're holding an exclusive oplock, and a conflicting
> > > > 	  open comes in, can the server-to-client break message say "now
> > > > 	  you're getting a shared oplock instead"?  Or is the client
> > > > 	  left without any oplock until it requests a new one?)
> > > 
> > > Yes, this can happen.
> > > 
> > > In SMB, we only break to no lease when a write request comes
> > > in on a exclusive or level2 oplock (read-lease) handle.
> > 
> > Ok, thanks, that means we need a more complicated fix here--I'll work on
> > that....
> 
> My attempt follows.  Lightly tested.
> 
> I'll probably try writing a test or two for it, then queueing up
> something like this for 3.2, absent objections.

Second take follows, this time after a little more testing
(lease_tests.c from git://linux-nfs.org/~bfields/lock-tests.git), some
bug fixes, and minor simplification of the logic.  This is what I intend
to queue up for 3.2.

--b.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jamie Lokier Aug. 19, 2011, 7:08 p.m. UTC | #8
J. Bruce Fields wrote:
> I'm not sure how to approach the lease code.
> 
> On the one hand, I've never seen any evidence that anyone outside Samba
> and NFSv4 has ever used it, and both currently make extremely limited
> use of it.  So we could probably get away with "fixing" the lease code
> to do whatever both of us need.

I've never used it, but I've _nearly_ used it (project took a
different direction), in a web application framework.

Pretty much the way CIFS/NFS use it, to keep other things (remote
state, database state, derived files) transactionally coherent with
changes to file contents by programs that only know about the files
they access.

The SIGIO stuff is a horrible interface.
I could still see me trying to use it sometime in the future.
In which case I really don't mind if you make the semantics saner :-)

Now we have fanotify which does something very similar and could have
generalised leases, but unfortunately fanotify came from such a
different motivation that it's not well suited for ordinary user
applications.

-- Jamie
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
J. Bruce Fields Aug. 21, 2011, 4:50 p.m. UTC | #9
On Fri, Aug 19, 2011 at 08:08:29PM +0100, Jamie Lokier wrote:
> J. Bruce Fields wrote:
> > I'm not sure how to approach the lease code.
> > 
> > On the one hand, I've never seen any evidence that anyone outside Samba
> > and NFSv4 has ever used it, and both currently make extremely limited
> > use of it.  So we could probably get away with "fixing" the lease code
> > to do whatever both of us need.
> 
> I've never used it, but I've _nearly_ used it (project took a
> different direction), in a web application framework.
> 
> Pretty much the way CIFS/NFS use it, to keep other things (remote
> state, database state, derived files) transactionally coherent with
> changes to file contents by programs that only know about the files
> they access.
> 
> The SIGIO stuff is a horrible interface.
> I could still see me trying to use it sometime in the future.
> In which case I really don't mind if you make the semantics saner :-)
> 
> Now we have fanotify which does something very similar and could have
> generalised leases, but unfortunately fanotify came from such a
> different motivation that it's not well suited for ordinary user
> applications.

I'm not sure what you mean by that--mainly just because I'm not as
familiar with fanotify as I should be.

For my case the important difference between leases and the various
notification interfaces is that leases are synchronous--the lease-holder
is notified and has a chance to clean up before releasing its lease and
allowing the conflicting operation to continue--whereas the the various
notification interfaces tell you "tough luck, something just happened".

--b.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/fs/locks.c b/fs/locks.c
index 0a4f50d..171391f 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1158,9 +1158,9 @@  static void time_out_leases(struct inode *inode)
 			before = &fl->fl_next;
 			continue;
 		}
-		lease_modify(before, fl->fl_type & ~F_INPROGRESS);
-		if (fl == *before)	/* lease_modify may have freed fl */
-			before = &fl->fl_next;
+		lease_modify(before, F_UNLCK);
+		/* lease_modify has freed fl */
+		before = &fl->fl_next;
 	}
 }
 
@@ -1176,7 +1176,7 @@  static void time_out_leases(struct inode *inode)
  */
 int __break_lease(struct inode *inode, unsigned int mode)
 {
-	int error = 0, future;
+	int error = 0;
 	struct file_lock *new_fl, *flock;
 	struct file_lock *fl;
 	unsigned long break_time;
@@ -1197,19 +1197,8 @@  int __break_lease(struct inode *inode, unsigned int mode)
 		if (fl->fl_owner == current->files)
 			i_have_this_lease = 1;
 
-	if (want_write) {
-		/* If we want write access, we have to revoke any lease. */
-		future = F_UNLCK | F_INPROGRESS;
-	} else if (flock->fl_type & F_INPROGRESS) {
-		/* If the lease is already being broken, we just leave it */
-		future = flock->fl_type;
-	} else if (flock->fl_type & F_WRLCK) {
-		/* Downgrade the exclusive lease to a read-only lease. */
-		future = F_RDLCK | F_INPROGRESS;
-	} else {
-		/* the existing lease was read-only, so we can read too. */
-		goto out;
-	}
+	if (!want_write && !(flock->fl_type & F_WRLCK))
+		goto out; /* no conflict */
 
 	if (IS_ERR(new_fl) && !i_have_this_lease
 			&& ((mode & O_NONBLOCK) == 0)) {
@@ -1225,8 +1214,8 @@  int __break_lease(struct inode *inode, unsigned int mode)
 	}
 
 	for (fl = flock; fl && IS_LEASE(fl); fl = fl->fl_next) {
-		if (fl->fl_type != future) {
-			fl->fl_type = future;
+		if (!(fl->fl_type & F_INPROGRESS)) {
+			fl->fl_type |= F_INPROGRESS;
 			fl->fl_break_time = break_time;
 			/* lease must have lmops break callback */
 			fl->fl_lmops->fl_break(fl);
@@ -1254,10 +1243,10 @@  restart:
 	if (error >= 0) {
 		if (error == 0)
 			time_out_leases(inode);
-		/* Wait for the next lease that has not been broken yet */
+		/* Wait for the next lease that conflicts */
 		for (flock = inode->i_flock; flock && IS_LEASE(flock);
 				flock = flock->fl_next) {
-			if (flock->fl_type & F_INPROGRESS)
+			if (want_write || flock->fl_type & F_WRLCK)
 				goto restart;
 		}
 		error = 0;
@@ -1390,11 +1379,10 @@  int generic_setlease(struct file *filp, long arg, struct file_lock **flp)
 			before = &fl->fl_next) {
 		if (fl->fl_file == filp)
 			my_before = before;
-		else if (fl->fl_type == (F_INPROGRESS | F_UNLCK))
+		else if (fl->fl_type & F_INPROGRESS)
 			/*
-			 * Someone is in the process of opening this
-			 * file for writing so we may not take an
-			 * exclusive lease on it.
+			 * Don't allow new leases while any lease is
+			 * being broken:
 			 */
 			wrlease_count++;
 		else