From patchwork Thu Dec 7 18:49:06 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Benjamin Marzinski X-Patchwork-Id: 10100867 X-Patchwork-Delegate: christophe.varoqui@free.fr Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id B3E5C60329 for ; Thu, 7 Dec 2017 18:49:48 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A5DAA28521 for ; Thu, 7 Dec 2017 18:49:48 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 9A70C2853C; Thu, 7 Dec 2017 18:49:48 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 11B1228521 for ; Thu, 7 Dec 2017 18:49:48 +0000 (UTC) Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 6094EC0641EC; Thu, 7 Dec 2017 18:49:47 +0000 (UTC) Received: from colo-mx.corp.redhat.com (colo-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.20]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 467ED7B8E9; Thu, 7 Dec 2017 18:49:47 +0000 (UTC) Received: from lists01.pubmisc.prod.ext.phx2.redhat.com (lists01.pubmisc.prod.ext.phx2.redhat.com [10.5.19.33]) by colo-mx.corp.redhat.com (Postfix) with ESMTP id 1E7F11800BDB; Thu, 7 Dec 2017 18:49:47 +0000 (UTC) Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) by lists01.pubmisc.prod.ext.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id vB7Injvs026335 for ; Thu, 7 Dec 2017 13:49:45 -0500 Received: by smtp.corp.redhat.com (Postfix) id DEDC469297; Thu, 7 Dec 2017 18:49:45 +0000 (UTC) Delivered-To: dm-devel@redhat.com Received: from redhat.com (octiron.msp.redhat.com [10.15.80.209]) by smtp.corp.redhat.com (Postfix) with SMTP id 932CA6929B; Thu, 7 Dec 2017 18:49:42 +0000 (UTC) Received: by redhat.com (sSMTP sendmail emulation); Thu, 07 Dec 2017 12:49:42 -0600 From: "Benjamin Marzinski" To: device-mapper development Date: Thu, 7 Dec 2017 12:49:06 -0600 Message-Id: <1512672546-12785-13-git-send-email-bmarzins@redhat.com> In-Reply-To: <1512672546-12785-1-git-send-email-bmarzins@redhat.com> References: <1512672546-12785-1-git-send-email-bmarzins@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-loop: dm-devel@redhat.com Cc: Guan Junxiong , Martin Wilck Subject: [dm-devel] [PATCH 12/12] multipathd: marginal path code fixes X-BeenThere: dm-devel@redhat.com X-Mailman-Version: 2.1.12 Precedence: junk List-Id: device-mapper development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Sender: dm-devel-bounces@redhat.com Errors-To: dm-devel-bounces@redhat.com X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.31]); Thu, 07 Dec 2017 18:49:47 +0000 (UTC) X-Virus-Scanned: ClamAV using ClamSMTP There are a couple of issues I noticed with the marginal paths code. In hit_io_err_recheck_time() there are some problems with the initial checks. We should always recover the path if there are no other usable paths to the device, so this check should be first. Also, we just checked that io_err_disable_reinstate isn't zero before calling this function, so we don't need to check again here (and it doesn't make any sense to continue disabling the path if io_err_disable_reinstate is set to zero). Finally, the only kind of errors we can get while calling clock_gettime() are going to happen on every call. So, if we can't get the time, assume that the timeout has passed. The multipath.conf.5 description of marginal_path_err_sample_time, states that sampling is stopped for marginal_path_err_rate_threshold seconds, when it should be marginal_path_err_recheck_gap_time seconds. Lastly, there is a race that can cause multipathd to access freed memory on shutdown. io_err_stat_thr is started as a detached thread. This means that stop_io_err_stat_thread() can't know when it has actually stopped, after pthread_cancel() and pthread_kill() are called. To be safe, it should not start the thread in a deteched state, and call join to verify that it has stopped before freeing the memory it uses. But more importantly, stop_io_err_stat_thread() was being called before the checker and uevent threads were being canceled. Both of these threads access data that is freed in stop_io_err_stat_thread(). To avoid the chance of these threads accessing freed memory, child() should wait until these threads are stopped before calling stop_io_err_stat_thread(). Cc: Guan Junxiong Signed-off-by: Benjamin Marzinski Reviewed-by: Martin Wilck --- libmultipath/io_err_stat.c | 12 +++++------- multipath/multipath.conf.5 | 2 +- multipathd/main.c | 6 +++--- 3 files changed, 9 insertions(+), 11 deletions(-) diff --git a/libmultipath/io_err_stat.c b/libmultipath/io_err_stat.c index 75a6df6..5b10f03 100644 --- a/libmultipath/io_err_stat.c +++ b/libmultipath/io_err_stat.c @@ -379,17 +379,14 @@ int hit_io_err_recheck_time(struct path *pp) struct timespec curr_time; int r; - if (pp->io_err_disable_reinstate == 0) - return 1; - if (clock_gettime(CLOCK_MONOTONIC, &curr_time) != 0) - return 1; - if (pp->io_err_pathfail_cnt != PATH_IO_ERR_IN_POLLING_RECHECK) - return 1; if (pp->mpp->nr_active <= 0) { io_err_stat_log(2, "%s: recover path early", pp->dev); goto recover; } - if ((curr_time.tv_sec - pp->io_err_dis_reinstate_time) > + if (pp->io_err_pathfail_cnt != PATH_IO_ERR_IN_POLLING_RECHECK) + return 1; + if (clock_gettime(CLOCK_MONOTONIC, &curr_time) != 0 || + (curr_time.tv_sec - pp->io_err_dis_reinstate_time) > pp->mpp->marginal_path_err_recheck_gap_time) { io_err_stat_log(4, "%s: reschedule checking after %d seconds", pp->dev, @@ -738,6 +735,7 @@ void stop_io_err_stat_thread(void) { pthread_cancel(io_err_stat_thr); pthread_kill(io_err_stat_thr, SIGUSR2); + pthread_join(io_err_stat_thr, NULL); free_io_err_pathvec(paths); io_destroy(ioctx); } diff --git a/multipath/multipath.conf.5 b/multipath/multipath.conf.5 index ba5d3bc..ab151e7 100644 --- a/multipath/multipath.conf.5 +++ b/multipath/multipath.conf.5 @@ -867,7 +867,7 @@ accounting process for the path will last for \fImarginal_path_err_sample_time\fR. If the rate of IO error on a particular path is greater than the \fImarginal_path_err_rate_threshold\fR, then the path will not reinstate for -\fImarginal_path_err_rate_threshold\fR seconds unless there is only one +\fImarginal_path_err_recheck_gap_time\fR seconds unless there is only one active path. After \fImarginal_path_err_recheck_gap_time\fR expires, the path will be requeueed for rechecking. If checking result is good enough, the path will be reinstated. diff --git a/multipathd/main.c b/multipathd/main.c index 16e4fdf..0703ca0 100644 --- a/multipathd/main.c +++ b/multipathd/main.c @@ -2332,7 +2332,7 @@ child (void * param) setup_thread_attr(&misc_attr, 64 * 1024, 0); setup_thread_attr(&uevent_attr, DEFAULT_UEVENT_STACKSIZE * 1024, 0); setup_thread_attr(&waiter_attr, 32 * 1024, 1); - setup_thread_attr(&io_err_stat_attr, 32 * 1024, 1); + setup_thread_attr(&io_err_stat_attr, 32 * 1024, 0); if (logsink == 1) { setup_thread_attr(&log_attr, 64 * 1024, 0); @@ -2508,8 +2508,6 @@ child (void * param) remove_maps_and_stop_waiters(vecs); unlock(&vecs->lock); - stop_io_err_stat_thread(); - pthread_cancel(check_thr); pthread_cancel(uevent_thr); pthread_cancel(uxlsnr_thr); @@ -2520,6 +2518,8 @@ child (void * param) pthread_join(uxlsnr_thr, NULL); pthread_join(uevq_thr, NULL); + stop_io_err_stat_thread(); + lock(&vecs->lock); free_pathvec(vecs->pathvec, FREE_PATHS); vecs->pathvec = NULL;