From patchwork Tue May 17 22:32:19 2011 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jim Schutt X-Patchwork-Id: 792572 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by demeter1.kernel.org (8.14.4/8.14.3) with ESMTP id p4HMWs8i030875 for ; Tue, 17 May 2011 22:32:56 GMT Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756706Ab1EQWcx (ORCPT ); Tue, 17 May 2011 18:32:53 -0400 Received: from sentry-two.sandia.gov ([132.175.109.14]:33132 "EHLO sentry-two.sandia.gov" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1756695Ab1EQWcw (ORCPT ); Tue, 17 May 2011 18:32:52 -0400 X-WSS-ID: 0LLD2MQ-0B-1VH-02 X-M-MSG: Received: from interceptor1.sandia.gov (interceptor1.sandia.gov [132.175.109.5]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sentry-two.sandia.gov (Postfix) with ESMTP id 14C8F5686BD; Tue, 17 May 2011 16:32:50 -0600 (MDT) Received: from sentry.sandia.gov (sentry.sandia.gov [132.175.109.21]) by interceptor1.sandia.gov (RSA Interceptor); Tue, 17 May 2011 16:25:11 -0600 Received: from [132.175.109.4] by sentry.sandia.gov with ESMTP (SMTP Relay 01 (Email Firewall v6.3.2)); Tue, 17 May 2011 16:32:26 -0600 X-Server-Uuid: 6BFC7783-7E22-49B4-B610-66D6BE496C0E Received: from mail.sandia.gov (exch03.sandia.gov [134.253.103.3] (may be forged)) by mailgate2.sandia.gov (8.14.4/8.14.4) with ESMTP id p4HMWP3O013681; Tue, 17 May 2011 16:32:25 -0600 Received: from [134.253.4.20] (134.253.95.179) by mail.sandia.gov ( 134.253.103.3) with Microsoft SMTP Server (TLS) id 14.1.270.1; Tue, 17 May 2011 16:32:25 -0600 Message-ID: <4DD2F773.30508@sandia.gov> Date: Tue, 17 May 2011 16:32:19 -0600 From: "Jim Schutt" User-Agent: Thunderbird 2.0.0.24 (X11/20110128) MIME-Version: 1.0 To: "Sage Weil" cc: ceph-devel@vger.kernel.org Subject: Re: [PATCH 2/2] libceph: fix handle_timeout() racing with con_work()/try_write() References: <1305235954-9860-2-git-send-email-jaschut@sandia.gov> In-Reply-To: X-Originating-IP: [134.253.95.179] X-PMX-Version: 5.6.0.2009776, Antispam-Engine: 2.7.2.376379, Antispam-Data: 2011.5.17.222116 X-PMX-Spam: Gauge=IIIIIIII, Probability=8%, Report=' BODY_SIZE_3000_3999 0, BODY_SIZE_5000_LESS 0, BODY_SIZE_7000_LESS 0, DATE_TZ_NA 0, NO_URI_FOUND 0, WEBMAIL_SOURCE 0, WEBMAIL_XOIP 0, WEBMAIL_X_IP_HDR 0, __BOUNCE_CHALLENGE_SUBJ 0, __BOUNCE_NDR_SUBJ_EXEMPT 0, __CT 0, __CTE 0, __CT_TEXT_PLAIN 0, __HAS_MSGID 0, __HAS_XOIP 0, __MIME_TEXT_ONLY 0, __MIME_VERSION 0, __MOZILLA_MSGID 0, __RATWARE_X_MAILER_CS_B 0, __SANE_MSGID 0, __TO_MALFORMED_2 0, __USER_AGENT 0' X-TMWD-Spam-Summary: TS=20110517223229; ID=1; SEV=2.3.1; DFV=B2011051723; IFV=NA; AIF=B2011051723; RPD=5.03.0010; ENG=NA; RPDID=7374723D303030312E30413031303230342E34444432463737442E303033343A534346535441543838363133332C73733D312C6667733D30; CAT=NONE; CON=NONE; SIG=AAABAJsKIgAAAAAAAAAAAAAAAAAAAH0= X-MMS-Spam-Filter-ID: B2011051723_5.03.0010 X-WSS-ID: 61CC28F02K44030446-01-01 X-RSA-Inspected: yes X-RSA-Classifications: public X-RSA-Action: allow Sender: ceph-devel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org X-Greylist: IP, sender and recipient auto-whitelisted, not delayed by milter-greylist-4.2.6 (demeter1.kernel.org [140.211.167.41]); Tue, 17 May 2011 22:32:58 +0000 (UTC) Hi Sage, Sage Weil wrote: > Hi Jim, > This picks up the second resent message. But and OPENING is set.. so how > is it that we were in try_write but we didn't pass through the OPENING > test in con_work? > >>>> con_work()/try_write() start again on new message. >>>> Note that con_work() did not notice that the connection was marked to >>>> be closed/reopened! This is where things went bad. >>>> Note the connection state: 8220 = 0x201c, and 0x2000 is OPENING. >> [ 2412.643178] libceph: messenger.c:1969 : con_work ffff8801955c0030 OPENING >> [ 2412.643182] libceph: messenger.c:292 : con_close_socket on ffff8801955c0030 sock ffff8801825a8c00 >> [ 2412.643197] libceph: messenger.c:162 : ceph_state_change ffff8801955c0030 state = 2076 sk_state = 4 > > Ok, now we see it, but this should have executed before we ever got to > try_write! > > There are a handful of places that drop and re-take con->mutex... I think > it has to be one of those. They are: > > - prepare_connect_authorizer (before calling ->prepare_authorizer > callback) > - fail_protocol (before calling ->bad_proto callback) > - process_connect RESETSESSION case (before calling ->reset callback) > - process_message (before calling ->dispatch callback) > > I think after retaking the lock in each of these cases we need to test for > the CLOSED or OPENING bits and, if set, bail out so that con_work can > restart. > > Does that make sense? Can you confirm which case of lock droppage inside > con_work is at fault? (Maybe adding a printk at the top of con_work after > taking the lock too would make it more clear what's going on; currently it > only prints in all the random test_bit cases). So I've got logs now that show that connection state changes across a call to process_message. That seems like it proves the concept. > > I'm thinking something like the below... does that seem reasonable? I > only did the prepare_authorizer case to start, but the others should be > fixed up too. Compile tested only. > Is there any reason not to fix all the cases at once, in try_read? It's less code, but far away from where the mutex dropping occurs, so we add a comment: What am I missing? -- Jim --- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c index a9bd424..11d6a61 100644 --- a/net/ceph/messenger.c +++ b/net/ceph/messenger.c @@ -1816,10 +1820,19 @@ static int try_read(struct ceph_connection *con) dout("try_read start on %p state %lu\n", con, con->state); more: dout("try_read %p state %lu tag %d in_base_pos %d\n", con, con->state, (int)con->in_tag, con->in_base_pos); + + /* process_connect, process_message need to drop con->mutex, + * so connection state might have changed on us. Check it, + * leave if necessary so new state can be processed. + */ + if (test_bit(CLOSED, &con->state) || + test_bit(OPENING, &con->state)) + goto out; + if (test_bit(CONNECTING, &con->state)) { if (!test_bit(NEGOTIATING, &con->state)) { dout("try_read %p connecting\n", con); ret = read_partial_banner(con); if (ret <= 0)