From patchwork Fri Dec 13 09:11:19 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hannes Reinecke
X-Patchwork-Id: 3339251
X-Patchwork-Delegate: christophe.varoqui@free.fr
Return-Path:
X-Original-To: patchwork-dm-devel@patchwork.kernel.org
Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org
Received: from mail.kernel.org (mail.kernel.org [198.145.19.201])
	by patchwork2.web.kernel.org (Postfix) with ESMTP id C1026C0D4A
	for ; Fri, 13 Dec 2013 09:19:29 +0000 (UTC)
Received: from mail.kernel.org (localhost [127.0.0.1])
	by mail.kernel.org (Postfix) with ESMTP id 91D8E20609
	for ; Fri, 13 Dec 2013 09:19:28 +0000 (UTC)
Received: from mx4-phx2.redhat.com (mx4-phx2.redhat.com [209.132.183.25])
	by mail.kernel.org (Postfix) with ESMTP id 9407820603
	for ; Fri, 13 Dec 2013 09:19:23 +0000 (UTC)
Received: from lists01.pubmisc.prod.ext.phx2.redhat.com
	(lists01.pubmisc.prod.ext.phx2.redhat.com [10.5.19.33])
	by mx4-phx2.redhat.com (8.13.8/8.13.8) with ESMTP id rBD9CukU016010;
	Fri, 13 Dec 2013 04:13:21 -0500
Received: from int-mx09.intmail.prod.int.phx2.redhat.com
	(int-mx09.intmail.prod.int.phx2.redhat.com [10.5.11.22])
	by lists01.pubmisc.prod.ext.phx2.redhat.com (8.13.8/8.13.8) with ESMTP
	id rBD9Bsbx030618 for ; Fri, 13 Dec 2013 04:11:54 -0500
Received: from mx1.redhat.com (ext-mx16.extmail.prod.ext.phx2.redhat.com
	[10.5.110.21])
	by int-mx09.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP
	id rBD9BpiN009138; Fri, 13 Dec 2013 04:11:53 -0500
Received: from mx2.suse.de (cantor2.suse.de [195.135.220.15])
	by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id rBD9BWm5021318;
	Fri, 13 Dec 2013 04:11:38 -0500
Received: from relay2.suse.de (charybdis-ext.suse.de [195.135.220.254])
	by mx2.suse.de (Postfix) with ESMTP id 46B35ABCC;
	Fri, 13 Dec 2013 09:11:31 +0000 (UTC)
From: Hannes Reinecke
To: Christophe Varoqui
Date: Fri, 13 Dec 2013 10:11:19 +0100
Message-Id: <1386925879-31225-4-git-send-email-hare@suse.de>
In-Reply-To: <1386925879-31225-1-git-send-email-hare@suse.de>
References: <1386925879-31225-1-git-send-email-hare@suse.de>
X-RedHat-Spam-Score: -6.9 (BAYES_00, RCVD_IN_DNSWL_HI, RP_MATCHES_RCVD,
	URIBL_BLOCKED)
X-Scanned-By: MIMEDefang 2.68 on 10.5.11.22
X-Scanned-By: MIMEDefang 2.68 on 10.5.110.21
X-loop: dm-devel@redhat.com
Cc: dm-devel@redhat.com
Subject: [dm-devel] [PATCH 3/3] multipathd: Implement systemd watchdog
	integration
X-BeenThere: dm-devel@redhat.com
X-Mailman-Version: 2.1.12
Precedence: junk
Reply-To: device-mapper development
List-Id: device-mapper development
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
MIME-Version: 1.0
Sender: dm-devel-bounces@redhat.com
Errors-To: dm-devel-bounces@redhat.com
X-Spam-Status: No, score=-6.9 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_HI,
	RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

In the past there have been several instances where multipathd would
hang in the checkerloop because a path checker did not return in time.
This patch activates the systemd watchdog feature to shut down (and
possibly restart) multipathd in these situations.

Due to a bug in systemd, watchdog integration only works correctly
with later versions (> 206), so it is disabled by default.

Signed-off-by: Hannes Reinecke
---
 libmultipath/config.h      |  3 +++
 multipath/multipath.conf.5 |  7 +++++--
 multipathd/main.c          | 23 ++++++++++++++++++++++-
 multipathd/multipathd.8    | 45 ++++++++++++++++++++++++++++++++++++++++++++-
 4 files changed, 74 insertions(+), 4 deletions(-)

diff --git a/libmultipath/config.h b/libmultipath/config.h
index 9c467e8..445525b 100644
--- a/libmultipath/config.h
+++ b/libmultipath/config.h
@@ -98,6 +98,9 @@ struct config {
 	int queue_without_daemon;
 	int checker_timeout;
 	int daemon;
+#ifdef USE_SYSTEMD
+	int watchdog;
+#endif
 	int flush_on_last_del;
 	int attribute_flags;
 	int fast_io_fail;
diff --git a/multipath/multipath.conf.5 b/multipath/multipath.conf.5
index 0fd3035..cf5bec0 100644
--- a/multipath/multipath.conf.5
+++ b/multipath/multipath.conf.5
@@ -71,8 +71,11 @@ section recognizes the following keywords:
 .B polling_interval
 interval between two path checks in seconds. For properly functioning paths,
 the interval between checks will gradually increase to
-.B max_polling_interval;
-default is
+.B max_polling_interval.
+This value will be overridden by the
+.B WatchdogSec
+setting in the multipathd.service definition if systemd is used.
+Default is
 .I 5
 .TP
 .B max_polling_interval
diff --git a/multipathd/main.c b/multipathd/main.c
index 49e74a6..3219511 100644
--- a/multipathd/main.c
+++ b/multipathd/main.c
@@ -1294,7 +1294,10 @@ checkerloop (void *ap)
 		lock(vecs->lock);
 		pthread_testcancel();
 		condlog(4, "tick");
-
+#ifdef USE_SYSTEMD
+		if (conf->watchdog)
+			sd_notify(0, "WATCHDOG=1");
+#endif
 		if (vecs->pathvec) {
 			vector_foreach_slot (vecs->pathvec, pp, i) {
 				num_paths += check_path(vecs, pp);
@@ -1616,6 +1619,9 @@ child (void * param)
 	struct vectors * vecs;
 	struct multipath * mpp;
 	int i;
+#ifdef USE_SYSTEMD
+	unsigned long checkint;
+#endif
 	int rc, pid_rc;
 	char *envp;
 
@@ -1696,6 +1702,21 @@ child (void * param)
 
 	conf->daemon = 1;
 	udev_set_sync_support(0);
+#ifdef USE_SYSTEMD
+	envp = getenv("WATCHDOG_USEC");
+	if (envp && sscanf(envp, "%lu", &checkint) == 1) {
+		/* Value is in microseconds */
+		conf->max_checkint = checkint / 1000000;
+		/* Rescale checkint */
+		if (conf->checkint > conf->max_checkint)
+			conf->checkint = conf->max_checkint;
+		else
+			conf->checkint = conf->max_checkint / 4;
+		condlog(3, "enabling watchdog, interval %d max %d",
+			conf->checkint, conf->max_checkint);
+		conf->watchdog = conf->checkint;
+	}
+#endif
 	/*
 	 * Start uevent listener early to catch events
 	 */
diff --git a/multipathd/multipathd.8 b/multipathd/multipathd.8
index 2aea150..5e35665 100644
--- a/multipathd/multipathd.8
+++ b/multipathd/multipathd.8
@@ -128,10 +128,53 @@ Restore queuing on multipahted map $map
 .B quit|exit
 End interactive session.
 
+.SH "SYSTEMD INTEGRATION"
+When compiled with systemd support two systemd service files are
+installed,
+.I multipathd.service
+and
+.I multipathd.socket
+The
+.I multipathd.socket
+service instructs systemd to intercept the CLI command socket, so
+that any call to the CLI interface will start-up the daemon if
+required.
+The
+.I multipathd.service
+file carries the definitions for controlling the multipath daemon.
+The daemon itself uses the
+.B sd_notify(3)
+interface to communicate with systemd. The following unit keywords are
+recognized:
+.TP
+.I WatchdogSec=
+Enables the internal watchdog from systemd. multipath will send a
+notification via
+.B sd_notify(3)
+to systemd to reset the watchdog. If specified the
+.I polling_interval
+and
+.I max_polling_interval
+settings will be overridden by the watchdog settings.
+
+Please note that systemd prior to version 207 has issues which prevent
+the systemd-provided watchdog from working correctly. So the watchdog
+is not enabled per default, but has to be enabled manually by updating
+the multipathd.service file.
+.TP
+.I OOMScoreAdjust=
+Overrides the internal OOM adjust mechanism
+.TP
+.I LimitNOFILE=
+Overrides the
+.I max_fds
+configuration setting.
+
 .SH "SEE ALSO"
 .BR multipath (8)
 .BR kpartx (8)
-.BR hotplug (8)
+.BR sd_notify (3)
+.BR systemd.service (5)
 .SH "AUTHORS"
 .B multipathd
 was developed by Christophe Varoqui, and others.
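
Review note: the interval rescaling in the child() hunk is easier to
check in isolation. The following is a minimal sketch, not part of the
patch. struct wd_intervals and wd_rescale() are hypothetical names
introduced here for illustration; the three fields merely stand in for
conf->checkint, conf->max_checkint and conf->watchdog, and the
arithmetic is copied from the hunk above.

```c
#include <stdio.h>

/* Hypothetical stand-in for the three struct config fields the patch touches. */
struct wd_intervals {
	unsigned int checkint;     /* polling_interval, in seconds */
	unsigned int max_checkint; /* max_polling_interval, in seconds */
	int watchdog;              /* non-zero once the watchdog is active */
};

/*
 * Mirror the rescaling done in child(): usec_str is the content of the
 * WATCHDOG_USEC environment variable (may be NULL when systemd did not
 * set it). Returns 1 if the watchdog ends up enabled, 0 otherwise.
 */
static int wd_rescale(const char *usec_str, struct wd_intervals *iv)
{
	unsigned long usec;

	if (!usec_str || sscanf(usec_str, "%lu", &usec) != 1)
		return 0; /* no usable value: leave the intervals untouched */

	/* Value is in microseconds; integer division truncates to seconds */
	iv->max_checkint = usec / 1000000;

	/* Rescale checkint exactly as the patch does */
	if (iv->checkint > iv->max_checkint)
		iv->checkint = iv->max_checkint;
	else
		iv->checkint = iv->max_checkint / 4;

	iv->watchdog = iv->checkint;
	return iv->watchdog != 0;
}
```

In the daemon this would be called as
wd_rescale(getenv("WATCHDOG_USEC"), &iv). With WatchdogSec=20, systemd
exports WATCHDOG_USEC=20000000, so a polling_interval of 5 gives a
maximum of 20 and an interval of 5. A watchdog timeout below one second
truncates max_checkint to 0, which leaves watchdog at 0, so the
heartbeat in checkerloop() would never be sent; it may be worth
handling that case explicitly.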