From patchwork Fri Feb 9 23:28:51 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Martin Wilck
X-Patchwork-Id: 10210101
X-Patchwork-Delegate: snitzer@redhat.com
Message-ID: <1518218931.2937.20.camel@suse.com>
From: Martin Wilck
To: Benjamin Marzinski
Date: Sat, 10 Feb 2018 00:28:51 +0100
In-Reply-To: <20180209230449.GW14513@octiron.msp.redhat.com>
References: <1518134167-15938-1-git-send-email-bmarzins@redhat.com>
 <1518134167-15938-2-git-send-email-bmarzins@redhat.com>
 <1518208256.2937.16.camel@suse.com>
 <20180209230449.GW14513@octiron.msp.redhat.com>
Cc: Bart Van Assche, device-mapper development
Subject: Re: [dm-devel] [PATCH v2 1/7] libmultipath: fix tur checker locking
List-Id: device-mapper development

On Fri, 2018-02-09 at 17:04 -0600, Benjamin Marzinski wrote:
> On Fri, Feb 09, 2018 at 09:30:56PM +0100, Martin Wilck wrote:
> > On Thu, 2018-02-08 at 17:56 -0600, Benjamin Marzinski wrote:
> > > ct->running is now an atomic variable. When the thread is started
> > > it is set to 1. When the checker wants to kill a thread, it
> > > atomically sets the value to 0 and reads the previous value. If it
> > > was 1, the checker cancels the thread. If it was 0, then nothing
> > > needs to be done. After the checker has dealt with the thread, it
> > > sets ct->thread to NULL.
> > >
> > > When the thread is done, it atomically sets the value of
> > > ct->running to 0 and reads the previous value. If it was 1, the
> > > thread just exits. If it was 0, then the checker is trying to
> > > cancel the thread, and so the thread calls pause(), which is a
> > > cancellation point.
> > >
> >
> > I'm missing one thing here. My poor brain is aching.
> >
> > cleanup_func() can be entered in two ways: a) if the thread has been
> > cancelled and passed a cancellation point already, or b) if it exits
> > normally and calls pthread_cleanup_pop().
> > In case b), waiting for the cancellation request by calling pause()
> > makes sense to me.
> > But in case a), the thread has already seen the
> > cancellation request - wouldn't calling pause() cause it to sleep
> > forever?
>
> Urgh. You're right. If it is in the cleanup helper because it already
> has been cancelled, then the pause isn't going to get cancelled. So
> much for my quick rewrite.

Maybe it's easier than we thought. Attached is a patch on top of yours
that I think might work, please have a look. It's quite late here, so
I'll need to ponder your alternatives below another day.

Cheers
Martin

>
> That leaves three options.
>
> 1. have either the thread or the checker detach the thread (depending
>    on which one exits first)
> 2. make the checker always cancel and detach the thread. This
>    simplifies the code, but there will be zombie threads hanging
>    around between calls to the checker.
> 3. just move the condlog
>
> I really don't care which one we pick anymore.
>
> -Ben
>
> >
> > Martin
> >
> > --
> > Dr. Martin Wilck, Tel. +49 (0)911 74053 2107
> > SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
> > HRB 21284 (AG Nürnberg)
> >

From 831ef27b41858fa248201b74f2dd8ea5b7c4aece Mon Sep 17 00:00:00 2001
From: Martin Wilck
Date: Sat, 10 Feb 2018 00:22:17 +0100
Subject: [PATCH] tur checker: make sure pthread_cancel isn't called for
 exited thread

If we enter the cleanup function as the result of a pthread_cancel by
another thread, we don't need to wait for a cancellation any more. If we
exit regularly, just tell the other thread not to try to cancel us.
---
 libmultipath/checkers/tur.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/libmultipath/checkers/tur.c b/libmultipath/checkers/tur.c
index 894ad41c89c3..31a87d2b5cf2 100644
--- a/libmultipath/checkers/tur.c
+++ b/libmultipath/checkers/tur.c
@@ -221,8 +221,6 @@ static void cleanup_func(void *data)
 	holders = uatomic_sub_return(&ct->holders, 1);
 	if (!holders)
 		cleanup_context(ct);
-	if (!running)
-		pause();
 }
 
 static int tur_running(struct tur_checker_context *ct)
@@ -266,6 +264,9 @@ static void *tur_thread(void *ctx)
 	pthread_cond_signal(&ct->active);
 	pthread_mutex_unlock(&ct->lock);
 
+	/* Tell main checker thread not to cancel us, as we exit anyway */
+	running = uatomic_xchg(&ct->running, 0);
+
 	condlog(3, "%s: tur checker finished, state %s",
 		tur_devt(devt, sizeof(devt), ct), checker_state_name(state));
 	tur_thread_cleanup_pop(ct);
-- 
2.16.1
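
For anyone who wants to try the handshake outside of multipathd, here is a
minimal standalone sketch of the idea, not the tur.c code itself. Everything
named demo_* is hypothetical. It assumes C11 <stdatomic.h> in place of
liburcu's uatomic_xchg(), and it uses a joinable thread so the result can be
checked deterministically. The worker clears "running" before it exits, the
checker only calls pthread_cancel() if it still reads 1, and the cleanup
handler never blocks.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical stand-in for struct tur_checker_context. */
struct demo_ctx {
	atomic_int running;	/* 1 while the checker may still cancel us */
	atomic_int cleaned;	/* number of cleanup handler invocations */
	bool hang;		/* true: simulate a command that never returns */
};

/* Runs on cancellation and on normal exit, so it must never block. */
static void demo_cleanup(void *data)
{
	struct demo_ctx *ct = data;

	atomic_fetch_add(&ct->cleaned, 1);
}

static void *demo_thread(void *arg)
{
	struct demo_ctx *ct = arg;

	pthread_cleanup_push(demo_cleanup, ct);
	while (ct->hang)
		pause();	/* cancellation point */
	/* Normal exit: tell the checker not to cancel us. */
	atomic_exchange(&ct->running, 0);
	pthread_cleanup_pop(1);
	return NULL;
}

/* Checker side: cancel only if the worker has not signed off yet. */
static bool demo_stop(struct demo_ctx *ct, pthread_t tid)
{
	bool cancelled = false;

	if (atomic_exchange(&ct->running, 0) == 1) {
		pthread_cancel(tid);
		cancelled = true;
	}
	pthread_join(tid, NULL);
	return cancelled;
}

/* Returns 1 if the worker had to be cancelled, 0 if not, -1 on error. */
static int demo_run(bool hang)
{
	struct demo_ctx ct = { .hang = hang };
	pthread_t tid;
	bool cancelled;

	atomic_store(&ct.running, 1);
	atomic_store(&ct.cleaned, 0);
	if (pthread_create(&tid, NULL, demo_thread, &ct) != 0)
		return -1;
	if (!hang) {
		/* Let the worker sign off before the checker looks. */
		while (atomic_load(&ct.running))
			sched_yield();
	}
	cancelled = demo_stop(&ct, tid);
	/* The cleanup handler ran exactly once on either path. */
	assert(atomic_load(&ct.cleaned) == 1);
	return cancelled ? 1 : 0;
}
```

Calling demo_run(false) exercises the normal-exit path (no pthread_cancel()
issued), and demo_run(true) exercises the cancellation path. With a detached
thread, as discussed above, pthread_join() would not be available and the
checker would have to rely on the exchange result alone.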