diff mbox

add OCFS2_LOCK_IGNORE_BLOCKED arg_flags to ocfs2_cluster_lock() to prevent hang

Message ID 1450828550-23168-1-git-send-email-tariq.x.saeed@oracle.com
State New, archived
Headers show

Commit Message

Tariq Saeed Dec. 22, 2015, 11:55 p.m. UTC
From: Linus Torvalds <torvalds@linux-foundation.org>

Orabug: 21793017

commit 743b5f1434f5 ("ocfs2: take inode lock in ocfs2_iop_set/get_acl()")
introduced this issue.  ocfs2_setattr called by chmod command
holds cluster wide inode lock (Orabug 21685187) when calling
posix_acl_chmod. This latter function in turn calls ocfs2_iop_get_acl
and ocfs2_iop_set_acl.  These two are also called directly from vfs layer
for getfacl/setfacl commands and therefore acquire the cluster wide inode
lock. If a remote conversion request comes after the first inode lock
in ocfs2_setattr, OCFS2_LOCK_BLOCKED will be set in l_flags. This will
cause the second call to inode lock from the  ocfs2_iop_get|set_acl()
to block indefinetly.  The new flag OCFS2_LOCK_IGNORE_BLOCKED will be
used to prevent this blocking.

Signed-off-by: Tariq Saeed <tariq.x.saeed@oracle.com> Signed-off-by:
Junxiao Bi <junxiao.bi@oracle.com>
---
 Makefile           |    2 +-
 fs/ocfs2/acl.c     |    7 +++++--
 fs/ocfs2/dlmglue.c |    3 ++-
 fs/ocfs2/dlmglue.h |    2 ++
 4 files changed, 10 insertions(+), 4 deletions(-)

Comments

Mark Fasheh Dec. 30, 2015, 12:34 a.m. UTC | #1
On Tue, Dec 22, 2015 at 03:55:50PM -0800, Tariq Saeed wrote:
> From: Linus Torvalds <torvalds@linux-foundation.org>
> 
> Orabug: 21793017
> 
> commit 743b5f1434f5 ("ocfs2: take inode lock in ocfs2_iop_set/get_acl()")
> introduced this issue.  ocfs2_setattr called by chmod command
> holds cluster wide inode lock (Orabug 21685187) when calling
> posix_acl_chmod. This latter function in turn calls ocfs2_iop_get_acl
> and ocfs2_iop_set_acl.  These two are also called directly from vfs layer
> for getfacl/setfacl commands and therefore acquire the cluster wide inode
> lock. If a remote conversion request comes after the first inode lock
> in ocfs2_setattr, OCFS2_LOCK_BLOCKED will be set in l_flags. This will
> cause the second call to inode lock from the  ocfs2_iop_get|set_acl()
> to block indefinetly.  The new flag OCFS2_LOCK_IGNORE_BLOCKED will be
> used to prevent this blocking.


NACK - as I explained earlier on this list we need to refactor the code to
avoid double locking the same resource. It is not acceptable to introduce
recursive locks into the code.
	--Mark

--
Mark Fasheh
Tariq Saeed Jan. 4, 2016, 8:57 p.m. UTC | #2
On 12/29/2015 04:34 PM, Mark Fasheh wrote:
> NACK - as I explained earlier on this list we need to refactor the code to
Not sure how things work in the open source world. Is refactoring to be 
picked by
somebody soon?
Thanks
-Tariq
Mark Fasheh Jan. 5, 2016, 12:35 a.m. UTC | #3
On Mon, Jan 04, 2016 at 12:57:37PM -0800, Tariq Saeed wrote:
> 
> On 12/29/2015 04:34 PM, Mark Fasheh wrote:
> > NACK - as I explained earlier on this list we need to refactor the code to
> Not sure how things work in the open source world. Is refactoring to be 
> picked by
> somebody soon?

No problem, A couple explanations  / notes:

1) CC me with your patches (someone had to point the last one out to me).

I'll get to reviewing most things but if you CC me then it can happen
sooner.


2) You are fixing the bug and sending the patches. We usually expect the
sender to incorporate feedback into their patch so the answer would be that
you get to to the refactoring :)

If you need to make small changes to the VFS to let this happen (sometimes
we might want 'nolock' variants of funtions, etc), do them in seperate
patches and send them along with your changes.

Hope this helps,
	--Mark

--
Mark Fasheh
Junxiao Bi Jan. 5, 2016, 1:46 a.m. UTC | #4
Hi Mark,

We may fix this issue in another way, without recursive locking and
without refactoring. We can check before asking for the second locking,
if lock is already hold, then skip the second locking, do you agree
fixing like this?

Thanks,
Junxiao.
On 01/05/2016 08:35 AM, Mark Fasheh wrote:
> On Mon, Jan 04, 2016 at 12:57:37PM -0800, Tariq Saeed wrote:
>>
>> On 12/29/2015 04:34 PM, Mark Fasheh wrote:
>>> NACK - as I explained earlier on this list we need to refactor the code to
>> Not sure how things work in the open source world. Is refactoring to be 
>> picked by
>> somebody soon?
> 
> No problem, A couple explanations  / notes:
> 
> 1) CC me with your patches (someone had to point the last one out to me).
> 
> I'll get to reviewing most things but if you CC me then it can happen
> sooner.
> 
> 
> 2) You are fixing the bug and sending the patches. We usually expect the
> sender to incorporate feedback into their patch so the answer would be that
> you get to to the refactoring :)
> 
> If you need to make small changes to the VFS to let this happen (sometimes
> we might want 'nolock' variants of funtions, etc), do them in seperate
> patches and send them along with your changes.
> 
> Hope this helps,
> 	--Mark
> 
> --
> Mark Fasheh
> 
> _______________________________________________
> Ocfs2-devel mailing list
> Ocfs2-devel@oss.oracle.com
> https://oss.oracle.com/mailman/listinfo/ocfs2-devel
>
Joel Becker Jan. 5, 2016, 11:47 p.m. UTC | #5
On Tue, Jan 05, 2016 at 09:46:49AM +0800, Junxiao Bi wrote:
> Hi Mark,
> 
> We may fix this issue in another way, without recursive locking and
> without refactoring. We can check before asking for the second locking,
> if lock is already hold, then skip the second locking, do you agree
> fixing like this?

Hey Junxiao,

What is to prevent the second lock from being dropped while in the
middle of the operation?

Joel
Junxiao Bi Jan. 6, 2016, 4:47 a.m. UTC | #6
Hi Joel,

On 01/06/2016 07:47 AM, Joel Becker wrote:
> On Tue, Jan 05, 2016 at 09:46:49AM +0800, Junxiao Bi wrote:
>> Hi Mark,
>>
>> We may fix this issue in another way, without recursive locking and
>> without refactoring. We can check before asking for the second locking,
>> if lock is already hold, then skip the second locking, do you agree
>> fixing like this?
> 
> Hey Junxiao,
> 
> What is to prevent the second lock from being dropped while in the
> middle of the operation?
Log processes id into lockres when lock is got. When do second locking,
if this lock is already got by this process, then skip it.

Thanks,
Junxiao.

> 
> Joel
>
diff mbox

Patch

diff --git a/Makefile b/Makefile
index 431067a..d5b3739 100644
--- a/Makefile
+++ b/Makefile
@@ -1,7 +1,7 @@ 
 VERSION = 4
 PATCHLEVEL = 3
 SUBLEVEL = 0
-EXTRAVERSION = -rc7
+EXTRAVERSION =
 NAME = Blurry Fish Butt
 
 # *DOCUMENTATION*
diff --git a/fs/ocfs2/acl.c b/fs/ocfs2/acl.c
index 0cdf497..4404d9a 100644
--- a/fs/ocfs2/acl.c
+++ b/fs/ocfs2/acl.c
@@ -37,6 +37,7 @@ 
 #include "xattr.h"
 #include "acl.h"
 
+
 /*
  * Convert from xattr value to acl struct.
  */
@@ -287,7 +288,8 @@  int ocfs2_iop_set_acl(struct inode *inode, struct posix_acl *acl, int type)
 	struct buffer_head *bh = NULL;
 	int status = 0;
 
-	status = ocfs2_inode_lock(inode, &bh, 1);
+	status = ocfs2_inode_lock_full(inode, &bh, 1,
+					OCFS2_LOCK_IGNORE_BLOCKED);
 	if (status < 0) {
 		if (status != -ENOENT)
 			mlog_errno(status);
@@ -309,7 +311,8 @@  struct posix_acl *ocfs2_iop_get_acl(struct inode *inode, int type)
 	osb = OCFS2_SB(inode->i_sb);
 	if (!(osb->s_mount_opt & OCFS2_MOUNT_POSIX_ACL))
 		return NULL;
-	ret = ocfs2_inode_lock(inode, &di_bh, 0);
+	ret = ocfs2_inode_lock_full(inode, &di_bh, 0,
+					OCFS2_LOCK_IGNORE_BLOCKED);
 	if (ret < 0) {
 		if (ret != -ENOENT)
 			mlog_errno(ret);
diff --git a/fs/ocfs2/dlmglue.c b/fs/ocfs2/dlmglue.c
index 1c91103..ea79e1b 100644
--- a/fs/ocfs2/dlmglue.c
+++ b/fs/ocfs2/dlmglue.c
@@ -1447,7 +1447,8 @@  again:
 	}
 
 	if (lockres->l_flags & OCFS2_LOCK_BLOCKED &&
-	    !ocfs2_may_continue_on_blocked_lock(lockres, level)) {
+	    !ocfs2_may_continue_on_blocked_lock(lockres, level) &&
+	    !(arg_flags & OCFS2_LOCK_IGNORE_BLOCKED)) {
 		/* is the lock is currently blocked on behalf of
 		 * another node */
 		lockres_add_mask_waiter(lockres, &mw, OCFS2_LOCK_BLOCKED, 0);
diff --git a/fs/ocfs2/dlmglue.h b/fs/ocfs2/dlmglue.h
index d293a22..845831f 100644
--- a/fs/ocfs2/dlmglue.h
+++ b/fs/ocfs2/dlmglue.h
@@ -78,6 +78,8 @@  struct ocfs2_orphan_scan_lvb {
 /* don't block waiting for the downconvert thread, instead return -EAGAIN */
 #define OCFS2_LOCK_NONBLOCK		(0x04)
 
+/* if requested level is <= l_level, ignore BLOCKED flag. */
+#define OCFS2_LOCK_IGNORE_BLOCKED		(0x08)
 /* Locking subclasses of inode cluster lock */
 enum {
 	OI_LS_NORMAL = 0,