diff mbox

答复: [RFC] unreleased lock after handler failure

Message ID OFF0B1B6BA.C0361FF8-ON4825804A.002E502E-4825804A.002F0026@zte.com.cn (mailing list archive)
State Not Applicable, archived
Delegated to: Mike Snitzer
Headers show

Commit Message

tang.junhui@zte.com.cn Oct. 12, 2016, 8:33 a.m. UTC
Hi, Michael Wang

This patch looks well for me, cleanup_lock() should be called without
affected by the return value of h->fn().

Regards,
Tang



发件人:         Michael Wang <yun.wang@profitbricks.com>
收件人:         christophe.varoqui@opensvc.com, dm-devel@redhat.com, 
日期:   2016/10/12 16:11
主题:   [dm-devel] [RFC] unreleased lock after handler failure
发件人: dm-devel-bounces@redhat.com



Hi, folks

While we are testing with the latest multipathd, we encounter issue that 
once
'del map' failed, all the following operation will show 'error -1 
receiving packet'
whatever how long the timeout has been set (we have applied the latest fix 
which
make sure the poll will last for 10 seconds).

After some debugging we found that inside parse_cmd(), the 
pthread_cleanup_pop()
rely on '!r' as indicator for locked or not, while this will be 
overwritten if
the handler return failed (1 in our case), and then the unlock will be 
missed.

After applied below fix the problem got solved, although the del map 
failed, the
other operation can still works.

Could you please help to review whether this is really the problem to be 
fixed
like that?

Regards,
Michael Wang



                pthread_cleanup_push(cleanup_lock, &vecs->lock);
@@ -491,10 +492,11 @@ parse_cmd (char * cmd, char ** reply, int * len, 
void * data, int timeout )
                        r = 0;
                }
                if (r == 0) {
+                       locked = 1;
                        pthread_testcancel();
                        r = h->fn(cmdvec, reply, len, data);
                }
-               pthread_cleanup_pop(!r);
+               pthread_cleanup_pop(locked);
        } else
                r = h->fn(cmdvec, reply, len, data);
        free_keys(cmdvec);

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

Comments

Michael Wang Oct. 12, 2016, 8:36 a.m. UTC | #1
Hi, Tang

On 10/12/2016 10:33 AM, tang.junhui@zte.com.cn wrote:
> Hi, Michael Wang
> 
> This patch looks well for me, cleanup_lock() should be called without
> affectedby the return value of h->fn().

Thanks for the quick respond, I'll submit a formal patch then :-)

Regards,
Michael Wang

> 
> Regards,
> Tang
> 
> 
> 
> 发件人:         Michael Wang <yun.wang@profitbricks.com>
> 收件人:         christophe.varoqui@opensvc.com, dm-devel@redhat.com,
> 日期:         2016/10/12 16:11
> 主题:        [dm-devel] [RFC] unreleased lock after handler failure
> 发件人:        dm-devel-bounces@redhat.com
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> 
> 
> 
> Hi, folks
> 
> While we are testing with the latest multipathd, we encounter issue that once
> 'del map' failed, all the following operation will show 'error -1 receiving packet'
> whatever how long the timeout has been set (we have applied the latest fix which
> make sure the poll will last for 10 seconds).
> 
> After some debugging we found that inside parse_cmd(), the pthread_cleanup_pop()
> rely on '!r' as indicator for locked or not, while this will be overwritten if
> the handler return failed (1 in our case), and then the unlock will be missed.
> 
> After applied below fix the problem got solved, although the del map failed, the
> other operation can still works.
> 
> Could you please help to review whether this is really the problem to be fixed
> like that?
> 
> Regards,
> Michael Wang
> 
> 
> 
> diff --git a/multipathd/cli.c b/multipathd/cli.c
> index e8a9384..50161be 100644
> --- a/multipathd/cli.c
> +++ b/multipathd/cli.c
> @@ -481,6 +481,7 @@ parse_cmd (char * cmd, char ** reply, int * len, void * data, int timeout )
>                tmo.tv_sec = 0;
>        }
>        if (h->locked) {
> +               int locked = 0;
>                struct vectors * vecs = (struct vectors *)data;
> 
>                pthread_cleanup_push(cleanup_lock, &vecs->lock);
> @@ -491,10 +492,11 @@ parse_cmd (char * cmd, char ** reply, int * len, void * data, int timeout )
>                        r = 0;
>                }
>                if (r == 0) {
> +                       locked = 1;
>                        pthread_testcancel();
>                        r = h->fn(cmdvec, reply, len, data);
>                }
> -               pthread_cleanup_pop(!r);
> +               pthread_cleanup_pop(locked);
>        } else
>                r = h->fn(cmdvec, reply, len, data);
>        free_keys(cmdvec);
> 
> --
> dm-devel mailing list
> dm-devel@redhat.com
> https://www.redhat.com/mailman/listinfo/dm-devel
> 

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
diff mbox

Patch

diff --git a/multipathd/cli.c b/multipathd/cli.c
index e8a9384..50161be 100644
--- a/multipathd/cli.c
+++ b/multipathd/cli.c
@@ -481,6 +481,7 @@  parse_cmd (char * cmd, char ** reply, int * len, void 
* data, int timeout )
                tmo.tv_sec = 0;
        }
        if (h->locked) {
+               int locked = 0;
                struct vectors * vecs = (struct vectors *)data;