From patchwork Thu Jul 20 21:09:43 2017
X-Patchwork-Submitter: Trond Myklebust
X-Patchwork-Id: 9855725
From: Trond Myklebust
To: "kolga@netapp.com"
CC: "linux-nfs@vger.kernel.org"
Subject: Re: [PATCH v3 1/1] PNFS fix dangling DS mount
Date: Thu, 20 Jul 2017 21:09:43 +0000
Message-ID: <1500584980.6577.4.camel@primarydata.com>
References: <20170630195201.95597-1-kolga@netapp.com>
 <1500580582.5457.1.camel@primarydata.com>
 <4C765F1F-4958-4D1B-922A-95AE9CB69288@netapp.com>
In-Reply-To: <4C765F1F-4958-4D1B-922A-95AE9CB69288@netapp.com>
X-Mailing-List: linux-nfs@vger.kernel.org

On Thu, 2017-07-20 at 16:14 -0400, Olga Kornievskaia wrote:
> > On Jul 20, 2017, at 3:56 PM, Trond Myklebust wrote:
> >
> > Hi Olga,
> >
> > Apologies for missing this patch. It was hiding in my
> > 'linux-fsdevel' mailbox, so I didn't recognise it as a NFS patch.
> >
>
> Yeah after I mailed it out I realized I cc-ed fsdevel incorrectly.
>
> > On Fri, 2017-06-30 at 15:52 -0400, Olga Kornievskaia wrote:
> > > There is a regression by commit 8d40b0f14846 ("NFS
> > > filelayout:call GETDEVICEINFO after pnfs_layout_process
> > > completes"). It leaves the DS mount dangling.
> > >
> > > Previously, filelayout_alloc_sec() would call
> > > filelayout_check_layout() which would call
> > > nfs4_find_get_deviceid which ups the count on the device_id.
> > > It's only called once and it's matched by the
> > > filelayout_free_lseg() that calls nfs4_fl_put_deviceid().
> > >
> > > After that patch, each read/write ends up calling
> > > nfs4_find_get_deviceid and there is no balance for that.
> > > Instead, do nfs4_fl_put_deviceid() in the filelayout's
> > > .pg_cleanup and remove it from filelayout_free_lseg.
> > >
> > > But we still need a reference to hold over the lifetime of the
> > > segment. For every new lseg that's created we need to take a
> > > reference on deviceid that uses it. It will be released in the
> > > "free_lseg" routine.
> >
> > This is what I'm not understanding. If you have a reference in the
> > layout segment, then why do you need to call
> > nfs4_find_get_deviceid() in the read/write code?
>
> I think I’m probably misunderstanding the question.
> It sounds to me that you are asking me why commit 8d40b0f14846 was
> done the way it was done (I would say it was done as per your
> suggestion).
>
> I would say the call to nfs4_find_get_deviceid() has always been in
> the read/write code. It was part of pnfs_update_layout(), but
> commit 8d40b0f14846 moved it out (so that the layoutget was
> complete) and then the call to getdeviceinfo would be done.
>
> > Isn't it sufficient to change the "pg_init" calls to check whether
> > or not the struct nfs4_filelayout_segment has set a value for
> > dsaddr (that needs to be done with care to avoid races - cmpxchg()
> > is your friend), and then rely on that reference being set for the
> > remainder of the layout segment lifetime?
>
> Are you suggesting to change when getdeviceinfo is supposed to be
> called?
>

No. Now that I'm looking at filelayout_check_deviceid(), here is what I
suggest: we need to ensure that filelayout_check_deviceid() sets the
deviceid once, and only once!

How about something like the following (untested) patch?

8<-------------------------------------------------------
From 1e40ee13950d03ad2a54a0d1dba35f2a04c28ca0 Mon Sep 17 00:00:00 2001
From: Trond Myklebust
Date: Thu, 20 Jul 2017 17:00:02 -0400
Subject: [PATCH] NFS/filelayout: Fix racy setting of fl->dsaddr in
 filelayout_check_deviceid()

We must set fl->dsaddr once, and once only, even if there are multiple
processes calling filelayout_check_deviceid() for the same layout
segment.

Reported-by: Olga Kornievskaia
Signed-off-by: Trond Myklebust
---
 fs/nfs/filelayout/filelayout.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

-- 
2.13.3

-- 
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.myklebust@primarydata.com

diff --git a/fs/nfs/filelayout/filelayout.c b/fs/nfs/filelayout/filelayout.c
index 080fc6b278bd..44c638b7876c 100644
--- a/fs/nfs/filelayout/filelayout.c
+++ b/fs/nfs/filelayout/filelayout.c
@@ -542,6 +542,10 @@ filelayout_check_deviceid(struct pnfs_layout_hdr *lo,
 	struct nfs4_file_layout_dsaddr *dsaddr;
 	int status = -EINVAL;
 
+	/* Is the deviceid already set? If so, we're good. */
+	if (fl->dsaddr != NULL)
+		return 0;
+
 	/* find and reference the deviceid */
 	d = nfs4_find_get_deviceid(NFS_SERVER(lo->plh_inode), &fl->deviceid,
 			lo->plh_lc_cred, gfp_flags);
@@ -553,8 +557,6 @@ filelayout_check_deviceid(struct pnfs_layout_hdr *lo,
 	if (filelayout_test_devid_unavailable(&dsaddr->id_node))
 		goto out_put;
 
-	fl->dsaddr = dsaddr;
-
 	if (fl->first_stripe_index >= dsaddr->stripe_count) {
 		dprintk("%s Bad first_stripe_index %u\n",
 				__func__, fl->first_stripe_index);
@@ -570,6 +572,13 @@ filelayout_check_deviceid(struct pnfs_layout_hdr *lo,
 		goto out_put;
 	}
 	status = 0;
+
+	/*
+	 * Atomic compare and xchange to ensure we don't scribble
+	 * over a non-NULL pointer.
+	 */
+	if (cmpxchg(&fl->dsaddr, NULL, dsaddr) != NULL)
+		goto out_put;
 out:
 	return status;
 out_put:
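
For anyone who wants to play with the "set once, drop the extra reference
if you lose" idea outside the kernel, here is a rough userspace sketch
using C11 atomics in place of the kernel's cmpxchg(). It is not the
filelayout code: struct device, struct segment, device_get(),
device_put(), segment_check_device(), segment_free() and demo_refcount()
are all hypothetical names made up for illustration, and the demo only
exercises the single-threaded path.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted device ID cache entry. */
struct device {
	atomic_int refcount;
};

/* Hypothetical stand-in for a layout segment caching its device. */
struct segment {
	_Atomic(struct device *) dev;
};

/* Take a reference, as a "find and get" style lookup would. */
static struct device *device_get(struct device *d)
{
	atomic_fetch_add(&d->refcount, 1);
	return d;
}

/* Drop a reference previously taken with device_get(). */
static void device_put(struct device *d)
{
	int old = atomic_fetch_sub(&d->refcount, 1);

	assert(old > 0);
	(void)old;
}

/*
 * Attach @d to @seg at most once. Callers that find the pointer
 * already set, or that lose the compare-and-exchange, end up holding
 * no extra reference.
 */
static int segment_check_device(struct segment *seg, struct device *d)
{
	struct device *expected = NULL;

	/* Fast path: the pointer has already been published. */
	if (atomic_load(&seg->dev) != NULL)
		return 0;

	device_get(d);

	/* Publish only if the slot is still NULL; a loser drops its ref. */
	if (!atomic_compare_exchange_strong(&seg->dev, &expected, d))
		device_put(d);
	return 0;
}

/* Drop the single reference owned by the segment, if there is one. */
static void segment_free(struct segment *seg)
{
	struct device *d = atomic_exchange(&seg->dev, NULL);

	if (d != NULL)
		device_put(d);
}

/*
 * Demo helper: run @calls checks against one segment, optionally free
 * the segment, and report the device refcount. The count starts at 1
 * to model the reference held by the cache itself.
 */
static int demo_refcount(int calls, int free_segment)
{
	struct device d;
	struct segment seg;
	int i;

	atomic_init(&d.refcount, 1);
	atomic_init(&seg.dev, NULL);
	for (i = 0; i < calls; i++)
		segment_check_device(&seg, &d);
	if (free_segment)
		segment_free(&seg);
	return atomic_load(&d.refcount);
}
```

In this sketch, calling demo_refcount() with any number of checks leaves
exactly one segment-owned reference, and freeing the segment releases it,
which is the balance the thread above is after.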