
[v2] ocfs2: fix possible memory leak in dlm_process_recovery_data

Message ID 51826292.2000001@huawei.com
State Changes Requested
Headers show

Commit Message

Joseph Qi May 2, 2013, 12:56 p.m. UTC
We found a possible memory leak in dlm_process_recovery_data during code
review. dlm_process_recovery_data allocates newlock on each iteration,
but does not free it when the lock turns out to be bad, which leads to a
memory leak.

Cc: stable@vger.kernel.org
Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Reviewed-by: Jie Liu <jeff.liu@oracle.com>

---
 fs/ocfs2/dlm/dlmrecovery.c |    4 ++++
 1 file changed, 4 insertions(+)

Comments

Sunil Mushran May 2, 2013, 5:19 p.m. UTC | #1
Do you know under what conditions it creates a new lock when it should
not?

This code should only trigger if the lockres is/was mastered on another
node. Meaning this node will not know about the newlock. Meaning that
code should never trigger.

1949                         if (lock->ml.cookie == ml->cookie) {
This if looks hacky.

If you have a reproducible case, it may be worthwhile to get some traces
to see under what conditions this happens. Should be straightforward
after that.

Thanks
Sunil



Joseph Qi May 4, 2013, 1:02 a.m. UTC | #2
Please consider the following case:
dlm_migrate_all_locks -> dlm_empty_lockres -> dlm_migrate_lockres ->
dlm_send_one_lockres. If we have already sent some locks, say
DLM_MAX_MIGRATABLE_LOCKS on the first attempt, and then
dlm_send_mig_lockres_msg fails because the network is down, it will redo
it. During the redo_bucket pass, the lockres can be hashed and migrated
again.

Sunil Mushran May 4, 2013, 1:05 a.m. UTC | #3
Any reason we cannot add the existence check before the create and thus
skip creation altogether?



Patch

diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c
index eeac97b..9f08523 100644
--- a/fs/ocfs2/dlm/dlmrecovery.c
+++ b/fs/ocfs2/dlm/dlmrecovery.c
@@ -1974,6 +1974,10 @@  skip_lvb:
 			     res->lockname.len, res->lockname.name, ml->node);
 			dlm_lockres_set_refmap_bit(dlm, res, ml->node);
 			added++;
+		} else {
+			/* Free the new lock if it is bad */
+			dlm_lock_put(newlock);
+			newlock = NULL;
 		}
 		spin_unlock(&res->spinlock);
 	}