From patchwork Fri Sep 21 23:05:09 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Benjamin Marzinski
X-Patchwork-Id: 10611175
X-Patchwork-Delegate: christophe.varoqui@free.fr
From: "Benjamin Marzinski"
To: device-mapper development
Date: Fri, 21 Sep 2018 18:05:09 -0500
Message-Id: <1537571127-10143-2-git-send-email-bmarzins@redhat.com>
In-Reply-To: <1537571127-10143-1-git-send-email-bmarzins@redhat.com>
References: <1537571127-10143-1-git-send-email-bmarzins@redhat.com>
Cc: Martin Wilck
Subject: [dm-devel] [PATCH v3 01/19] libmultipath: fix tur checker timeout

Previously, the code timed out if ct->thread was 0 but ct->running
wasn't. This combination never happens. The idea was to time out if,
for some reason, the path checker tried to kill the thread but it
didn't die. The correct thing to check for this is ct->holders.
ct->holders will always be at least one when libcheck_check() is called,
since libcheck_free() won't get called until the device is no longer
being checked.
So, if ct->holders is 2, that means that the tur thread has not shut
down yet.

Also, instead of returning PATH_TIMEOUT whenever the thread hasn't died,
it should only time out if the thread didn't successfully get a value,
which means the previous state was already PATH_TIMEOUT.

Signed-off-by: Benjamin Marzinski
---
 libmultipath/checkers/tur.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/libmultipath/checkers/tur.c b/libmultipath/checkers/tur.c
index bf8486d..275541f 100644
--- a/libmultipath/checkers/tur.c
+++ b/libmultipath/checkers/tur.c
@@ -355,12 +355,15 @@ int libcheck_check(struct checker * c)
 		}
 		pthread_mutex_unlock(&ct->lock);
 	} else {
-		if (uatomic_read(&ct->running) != 0) {
-			/* pthread cancel failed. continue in sync mode */
-			pthread_mutex_unlock(&ct->lock);
-			condlog(3, "%s: tur thread not responding",
-				tur_devt(devt, sizeof(devt), ct));
-			return PATH_TIMEOUT;
+		if (uatomic_read(&ct->holders) > 1) {
+			/* pthread cancel failed. If it didn't get the path
+			   state already, timeout */
+			if (ct->state == PATH_PENDING) {
+				pthread_mutex_unlock(&ct->lock);
+				condlog(3, "%s: tur thread not responding",
+					tur_devt(devt, sizeof(devt), ct));
+				return PATH_TIMEOUT;
+			}
 		}
 		/* Start new TUR checker */
 		ct->state = PATH_UNCHECKED;
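
For reviewers who want to reason about the new condition in isolation, here
is a minimal standalone sketch of the decision the hunk makes. It is not the
libmultipath code: the sketch_* names, the reduced struct and the enum are
hypothetical stand-ins, and a plain int replaces the uatomic_read() access
to the holders count.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the checker states named in the patch. */
enum sketch_state {
	SKETCH_PATH_UNCHECKED,
	SKETCH_PATH_PENDING,
	SKETCH_PATH_TIMEOUT,
	SKETCH_PATH_UP
};

/* Hypothetical, reduced view of the two fields the new check reads. */
struct sketch_ctx {
	int holders;             /* 1: checker only, 2: checker + tur thread */
	enum sketch_state state; /* last state recorded for the path */
};

/*
 * Returns true when the check should report a timeout instead of
 * starting a new tur thread: an old thread still holds a reference
 * (holders > 1) and it never recorded a result (state still pending).
 */
bool sketch_should_timeout(const struct sketch_ctx *ct)
{
	if (ct->holders > 1 && ct->state == SKETCH_PATH_PENDING)
		return true;
	return false;
}
```

With holders at 1 the function never reports a timeout, whatever the state.
With holders at 2 it reports a timeout only while the state is still
pending, and otherwise lets the caller go on to start a new checker thread.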