ocfs2/dlm: call dlm_lockres_put without resource spinlock

Message ID 542526C3.2090302@huawei.com
State New, archived

Commit Message

AlexChen Sept. 26, 2014, 8:41 a.m. UTC
dlm_lockres_put should be called without &res->spinlock, otherwise a
deadlock case may happen.

spin_lock(&res->spinlock)
...
dlm_lockres_put
  ->dlm_lockres_release
    ->dlm_print_one_lock_resource
      ->spin_lock(&res->spinlock)

Signed-off-by: Alex Chen <alex.chen@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
---
 fs/ocfs2/dlm/dlmrecovery.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

Comments

Andrew Morton Oct. 1, 2014, 10:55 p.m. UTC | #1
On Fri, 26 Sep 2014 16:41:39 +0800 alex chen <alex.chen@huawei.com> wrote:

> dlm_lockres_put should be called without &res->spinlock, otherwise a
> deadlock case may happen.
> 
> spin_lock(&res->spinlock)
> ...
> dlm_lockres_put
>   ->dlm_lockres_release
>     ->dlm_print_one_lock_resource
>       ->spin_lock(&res->spinlock)
> 
> Signed-off-by: Alex Chen <alex.chen@huawei.com>
> Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
> ---
>  fs/ocfs2/dlm/dlmrecovery.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c
> index 45067fa..3365839 100644
> --- a/fs/ocfs2/dlm/dlmrecovery.c
> +++ b/fs/ocfs2/dlm/dlmrecovery.c
> @@ -1710,9 +1710,12 @@ int dlm_master_requery_handler(struct o2net_msg *msg, u32 len, void *data,
>  				BUG();

This code does a GFP_ATOMIC allocation attempt and if that fails, it
goes BUG().

Guys, GFP_ATOMIC is unreliable.  This isn't production quality code :(

Joseph Qi Oct. 8, 2014, 1:48 a.m. UTC | #2
On 2014/10/2 6:55, Andrew Morton wrote:
> On Fri, 26 Sep 2014 16:41:39 +0800 alex chen <alex.chen@huawei.com> wrote:
> 
>> dlm_lockres_put should be called without &res->spinlock, otherwise a
>> deadlock case may happen.
>>
>> spin_lock(&res->spinlock)
>> ...
>> dlm_lockres_put
>>   ->dlm_lockres_release
>>     ->dlm_print_one_lock_resource
>>       ->spin_lock(&res->spinlock)
>>
>> Signed-off-by: Alex Chen <alex.chen@huawei.com>
>> Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
>> ---
>>  fs/ocfs2/dlm/dlmrecovery.c | 7 +++++--
>>  1 file changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c
>> index 45067fa..3365839 100644
>> --- a/fs/ocfs2/dlm/dlmrecovery.c
>> +++ b/fs/ocfs2/dlm/dlmrecovery.c
>> @@ -1710,9 +1710,12 @@ int dlm_master_requery_handler(struct o2net_msg *msg, u32 len, void *data,
>>  				BUG();
> 
> This code does a GFP_ATOMIC allocation attempt and if that fails, it
> goes BUG().
> 
> Guys, GFP_ATOMIC is unreliable.  This isn't production quality code :(
> 
Last time we talked about this, Wengang suggested returning an
error to the sender and letting the sender retry.
I'll take this idea and send a patch.

Patch

diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c
index 45067fa..3365839 100644
--- a/fs/ocfs2/dlm/dlmrecovery.c
+++ b/fs/ocfs2/dlm/dlmrecovery.c
@@ -1710,9 +1710,12 @@  int dlm_master_requery_handler(struct o2net_msg *msg, u32 len, void *data,
 				BUG();
 			} else
 				__dlm_lockres_grab_inflight_worker(dlm, res);
-		} else /* put.. incase we are not the master */
+			spin_unlock(&res->spinlock);
+		} else {
+			/* put.. incase we are not the master */
+			spin_unlock(&res->spinlock);
 			dlm_lockres_put(res);
-		spin_unlock(&res->spinlock);
+		}
 	}
 	spin_unlock(&dlm->spinlock);