From patchwork Wed Aug 7 15:23:52 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Sage Weil
X-Patchwork-Id: 2840356
Date: Wed, 7 Aug 2013 08:23:52 -0700 (PDT)
From: Sage Weil
To: James Harper
cc: "ceph-devel@vger.kernel.org"
Subject: RE: bug in /etc/init.d/ceph debian
In-Reply-To: <6035A0D088A63A46850C3988ED045A4B62E82A96@BITCOM1.int.sbss.com.au>
References: <6035A0D088A63A46850C3988ED045A4B62E7617E@BITCOM1.int.sbss.com.au>
 <6035A0D088A63A46850C3988ED045A4B62E82A96@BITCOM1.int.sbss.com.au>
List-ID: ceph-devel@vger.kernel.org

Hi James,

Here is a somewhat simpler patch; does this work for you? Note that if
you run something like

 /etc/init.d/ceph status osd.123

where osd.123 isn't in ceph.conf then you get a status 1 instead of 3.
But for the

 /etc/init.d/ceph status mds

(or osd or mon) case where there are no daemons of a particular type it
works.

Perhaps the "does not exist" check should also be modified to return 3?

sage


On Wed, 7 Aug 2013, James Harper wrote:

> >
> > I'm running ceph 0.61.7-1~bpo70+1 and I think there is a bug in
> > /etc/init.d/ceph
> >
> > The heartbeat RA expects that the init.d script will return 3 for "not running",
> > but if there is no agent (eg mds) defined for that host it will return 0 instead,
> > so pacemaker thinks the agent is running on a node where it isn't even
> > defined and presumably would then start doing stonith when it finds it
> > remains running after a stop command.
> >
> > Or maybe that is the correct behaviour of the init.d script and the RA needs
> > to be modified?
> >
>
> Nobody interested in this?
>
> My proposed fix follows this email. Return status is:
> 0 - everything tested is running
> 1 - something wrong
> 3 - something tested is stopped
>
> Without this patch, the resource agents report that the service is running if the service is not defined on the host.
>
> I'm not sure though if this is the right approach. Maybe the /etc/init.d/ceph should return 0 when checking the status of (say) mon, when there are no mons defined on this host?
>
> James
>
> --- ceph.orig 2013-08-07 13:28:25.000000000 +1000
> +++ ceph 2013-08-07 13:32:37.000000000 +1000
> @@ -170,6 +170,9 @@
>  get_local_name_list
>  get_name_list "$@"
>  
> +running=0
> +dead=0
> +stopped=0
>  for name in $what; do
>      type=`echo $name | cut -c 1-3` # e.g. 'mon', if $item is 'mon1'
>      id=`echo $name | cut -c 4- | sed 's/^\\.//'`
> @@ -375,14 +378,15 @@
>              if daemon_is_running $name ceph-$type $id $pid_file; then
>                  echo -n "$name: running "
>                  do_cmd "$BINDIR/ceph --admin-daemon $asok version 2>/dev/null" || echo unknown
> +                running=1
>              elif [ -e "$pid_file" ]; then
>                  # daemon is dead, but pid file still exists
>                  echo "$name: dead."
> -                EXIT_STATUS=1
> +                dead=1
>              else
>                  # daemon is dead, and pid file is gone
>                  echo "$name: not running."
> -                EXIT_STATUS=3
> +                stopped=1
>              fi
>              ;;
>  
> @@ -430,6 +434,16 @@
>      esac
>  done
>  
> +if [ "$command" = "status" ]; then
> +    if [ "$dead" = "1" ]; then
> +        EXIT_STATUS=1
> +    elif [ "$running" = "1" ]; then
> +        EXIT_STATUS=0
> +    else
> +        EXIT_STATUS=3
> +    fi
> +fi
> +
>  # activate latent osds?
>  if [ "$command" = "start" ]; then
>      if [ "$*" = "" ] || echo $* | grep -q ^osd\$ ; then

---
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

diff --git a/src/init-ceph.in b/src/init-ceph.in
index 8eb02f8..be5565c 100644
--- a/src/init-ceph.in
+++ b/src/init-ceph.in
@@ -165,6 +165,12 @@ verify_conf
 command=$1
 [ -n "$*" ] && shift
 
+if [ "$command" = "status" ]; then
+    # nothing defined for this host => not running; we'll use this if we
+    # don't check anything below.
+    EXIT_STATUS=3
+fi
+
 get_local_name_list
 get_name_list "$@"
 
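
For reference, the status precedence in James's quoted patch can be tried
outside the init script. This is only a rough sketch: aggregate_status is a
hypothetical helper, not part of init-ceph, and the state names passed to it
(running, dead, stopped) are assumptions standing in for the three branches
of the "status" case. It follows the quoted patch, not the smaller
init-ceph.in diff.

```shell
#!/bin/sh
# Hypothetical helper (not part of init-ceph): combine per-daemon states
# into one exit status, using the precedence from the quoted patch.
# Each argument is assumed to be one of: running, dead, stopped.
aggregate_status() {
    running=0
    dead=0
    for state in "$@"; do
        case "$state" in
            running) running=1 ;;
            dead)    dead=1 ;;
            stopped) ;;   # nothing to record; 3 is the fallback below
        esac
    done

    if [ "$dead" = "1" ]; then
        echo 1    # something wrong: pid file exists but daemon is dead
    elif [ "$running" = "1" ]; then
        echo 0    # at least one daemon running and none dead
    else
        echo 3    # nothing running, or nothing defined on this host
    fi
}

aggregate_status                    # no daemons defined for this host
aggregate_status running stopped    # mixed running and stopped
aggregate_status running dead       # a dead daemon takes precedence
```

With no arguments the helper prints 3, which is the value the heartbeat RA
expects for "not running". Note that under this precedence a mix of running
and stopped daemons still yields 0.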