From patchwork Tue Jun 20 23:35:37 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Trond Myklebust
X-Patchwork-Id: 9800423
From: Trond Myklebust
To: Anna Schumaker
Cc: linux-nfs@vger.kernel.org
Subject: [PATCH RESEND 2/3] NFS: Ensure we commit after writeback is complete
Date: Tue, 20 Jun 2017 19:35:37 -0400
Message-Id: <20170620233539.22417-3-trond.myklebust@primarydata.com>
X-Mailer: git-send-email 2.9.4
In-Reply-To: <20170620233539.22417-2-trond.myklebust@primarydata.com>
References: <20170620233539.22417-1-trond.myklebust@primarydata.com>
 <20170620233539.22417-2-trond.myklebust@primarydata.com>
Sender: linux-nfs-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-nfs@vger.kernel.org

If the page cache is being flushed, then we want to ensure that we do
start a commit once the pages are done being flushed.
If we just wait until all I/O is done to that file, we can end up
livelocking until the balance_dirty_pages() mechanism puts its foot
down and forces I/O to stop.
So instead we do more or less the same thing that O_DIRECT does, and
set up a counter to tell us when the flush is done.

Signed-off-by: Trond Myklebust
---
 fs/nfs/pagelist.c        |  3 +++
 fs/nfs/write.c           | 57 ++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/nfs_page.h |  1 +
 include/linux/nfs_xdr.h  |  2 ++
 4 files changed, 63 insertions(+)

diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c
index de107d866297..83d2918f1b13 100644
--- a/fs/nfs/pagelist.c
+++ b/fs/nfs/pagelist.c
@@ -50,6 +50,7 @@ void nfs_pgheader_init(struct nfs_pageio_descriptor *desc,
 	hdr->cred = hdr->req->wb_context->cred;
 	hdr->io_start = req_offset(hdr->req);
 	hdr->good_bytes = mirror->pg_count;
+	hdr->io_completion = desc->pg_io_completion;
 	hdr->dreq = desc->pg_dreq;
 	hdr->release = release;
 	hdr->completion_ops = desc->pg_completion_ops;
@@ -709,6 +710,7 @@ void nfs_pageio_init(struct nfs_pageio_descriptor *desc,
 	desc->pg_ioflags = io_flags;
 	desc->pg_error = 0;
 	desc->pg_lseg = NULL;
+	desc->pg_io_completion = NULL;
 	desc->pg_dreq = NULL;
 	desc->pg_bsize = bsize;
 
@@ -1231,6 +1233,7 @@ int nfs_pageio_resend(struct nfs_pageio_descriptor *desc,
 {
 	LIST_HEAD(failed);
 
+	desc->pg_io_completion = hdr->io_completion;
 	desc->pg_dreq = hdr->dreq;
 	while (!list_empty(&hdr->pages)) {
 		struct nfs_page *req = nfs_list_entry(hdr->pages.next);
diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index db7ba542559e..051197cb9195 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -40,6 +40,12 @@
 #define MIN_POOL_WRITE (32)
 #define MIN_POOL_COMMIT (4)
 
+struct nfs_io_completion {
+	void (*complete)(void *data);
+	void *data;
+	struct kref refcount;
+};
+
 /*
  * Local function declarations
  */
@@ -108,6 +114,39 @@ static void nfs_writehdr_free(struct nfs_pgio_header *hdr)
 	mempool_free(hdr, nfs_wdata_mempool);
 }
 
+static struct nfs_io_completion *nfs_io_completion_alloc(gfp_t gfp_flags)
+{
+	return kmalloc(sizeof(struct nfs_io_completion), gfp_flags);
+}
+
+static void nfs_io_completion_init(struct nfs_io_completion *ioc,
+		void (*complete)(void *), void *data)
+{
+	ioc->complete = complete;
+	ioc->data = data;
+	kref_init(&ioc->refcount);
+}
+
+static void nfs_io_completion_release(struct kref *kref)
+{
+	struct nfs_io_completion *ioc = container_of(kref,
+			struct nfs_io_completion, refcount);
+	ioc->complete(ioc->data);
+	kfree(ioc);
+}
+
+static void nfs_io_completion_get(struct nfs_io_completion *ioc)
+{
+	if (ioc != NULL)
+		kref_get(&ioc->refcount);
+}
+
+static void nfs_io_completion_put(struct nfs_io_completion *ioc)
+{
+	if (ioc != NULL)
+		kref_put(&ioc->refcount, nfs_io_completion_release);
+}
+
 static void nfs_context_set_write_error(struct nfs_open_context *ctx, int error)
 {
 	ctx->error = error;
@@ -681,18 +720,29 @@ static int nfs_writepages_callback(struct page *page, struct writeback_control *
 	return ret;
 }
 
+static void nfs_io_completion_commit(void *inode)
+{
+	nfs_commit_inode(inode, 0);
+}
+
 int nfs_writepages(struct address_space *mapping, struct writeback_control *wbc)
 {
 	struct inode *inode = mapping->host;
 	struct nfs_pageio_descriptor pgio;
+	struct nfs_io_completion *ioc = nfs_io_completion_alloc(GFP_NOFS);
 	int err;
 
 	nfs_inc_stats(inode, NFSIOS_VFSWRITEPAGES);
 
+	if (ioc)
+		nfs_io_completion_init(ioc, nfs_io_completion_commit, inode);
+
 	nfs_pageio_init_write(&pgio, inode, wb_priority(wbc), false,
 				&nfs_async_write_completion_ops);
+	pgio.pg_io_completion = ioc;
 	err = write_cache_pages(mapping, wbc, nfs_writepages_callback, &pgio);
 	nfs_pageio_complete(&pgio);
+	nfs_io_completion_put(ioc);
 
 	if (err < 0)
 		goto out_err;
@@ -940,6 +990,11 @@ int nfs_write_need_commit(struct nfs_pgio_header *hdr)
 	return hdr->verf.committed != NFS_FILE_SYNC;
 }
 
+static void nfs_async_write_init(struct nfs_pgio_header *hdr)
+{
+	nfs_io_completion_get(hdr->io_completion);
+}
+
 static void nfs_write_completion(struct nfs_pgio_header *hdr)
 {
 	struct nfs_commit_info cinfo;
@@ -973,6 +1028,7 @@ static void nfs_write_completion(struct nfs_pgio_header *hdr)
 		nfs_release_request(req);
 	}
 out:
+	nfs_io_completion_put(hdr->io_completion);
 	hdr->release(hdr);
 }
 
@@ -1378,6 +1434,7 @@ static void nfs_async_write_reschedule_io(struct nfs_pgio_header *hdr)
 }
 
 static const struct nfs_pgio_completion_ops nfs_async_write_completion_ops = {
+	.init_hdr = nfs_async_write_init,
 	.error_cleanup = nfs_async_write_error,
 	.completion = nfs_write_completion,
 	.reschedule_io = nfs_async_write_reschedule_io,
diff --git a/include/linux/nfs_page.h b/include/linux/nfs_page.h
index 6138cf91346b..abbee2d15dce 100644
--- a/include/linux/nfs_page.h
+++ b/include/linux/nfs_page.h
@@ -93,6 +93,7 @@ struct nfs_pageio_descriptor {
 	const struct rpc_call_ops *pg_rpc_callops;
 	const struct nfs_pgio_completion_ops *pg_completion_ops;
 	struct pnfs_layout_segment *pg_lseg;
+	struct nfs_io_completion *pg_io_completion;
 	struct nfs_direct_req *pg_dreq;
 	unsigned int pg_bsize; /* default bsize for mirrors */
 
diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
index ee998ed08cf6..ab6a8381459f 100644
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -1422,6 +1422,7 @@ enum {
 	NFS_IOHDR_STAT,
 };
 
+struct nfs_io_completion;
 struct nfs_pgio_header {
 	struct inode *inode;
 	struct rpc_cred *cred;
@@ -1435,6 +1436,7 @@ struct nfs_pgio_header {
 	void (*release) (struct nfs_pgio_header *hdr);
 	const struct nfs_pgio_completion_ops *completion_ops;
 	const struct nfs_rw_ops *rw_ops;
+	struct nfs_io_completion *io_completion;
 	struct nfs_direct_req *dreq;
 	spinlock_t lock;
 	/* fields protected by lock */
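
For reviewers who want to see the counting scheme in isolation, here is
a rough userspace sketch of the same idea. It is not kernel code and is
not part of the patch: a C11 atomic_int stands in for struct kref, and
io_completion, count_commit() and simulate_flush() are hypothetical
names invented for this illustration. The submitter holds the initial
reference, each in-flight WRITE header takes one more, and whichever
put drops the count to zero fires the callback exactly once.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Userspace stand-in for struct nfs_io_completion (atomic_int replaces kref). */
struct io_completion {
	void (*complete)(void *data);
	void *data;
	atomic_int refcount;
};

static struct io_completion *io_completion_alloc(void (*complete)(void *),
		void *data)
{
	struct io_completion *ioc = malloc(sizeof(*ioc));

	if (ioc == NULL)
		return NULL;
	ioc->complete = complete;
	ioc->data = data;
	atomic_init(&ioc->refcount, 1);	/* like kref_init(): creator holds one ref */
	return ioc;
}

static void io_completion_get(struct io_completion *ioc)
{
	if (ioc != NULL)
		atomic_fetch_add(&ioc->refcount, 1);
}

static void io_completion_put(struct io_completion *ioc)
{
	/* atomic_fetch_sub() returns the old value; 1 means we held the last ref. */
	if (ioc != NULL && atomic_fetch_sub(&ioc->refcount, 1) == 1) {
		ioc->complete(ioc->data);
		free(ioc);
	}
}

/* Hypothetical callback: counts how often a "commit" would be issued. */
static void count_commit(void *data)
{
	++*(int *)data;
}

/*
 * Simulate one flush with nr_writes WRITEs in flight and return how many
 * times the completion callback fired (-1 on allocation failure).
 */
static int simulate_flush(int nr_writes)
{
	int commits = 0;
	struct io_completion *ioc = io_completion_alloc(count_commit, &commits);
	int i;

	if (ioc == NULL)
		return -1;
	for (i = 0; i < nr_writes; i++)
		io_completion_get(ioc);	/* one reference per WRITE header */
	io_completion_put(ioc);		/* submitter has finished queuing I/O */
	/* The callback must not fire while WRITEs are still outstanding. */
	assert(nr_writes == 0 || commits == 0);
	for (i = 0; i < nr_writes; i++)
		io_completion_put(ioc);	/* each WRITE completes */
	return commits;
}
```

Calling simulate_flush() from a small main() with any non-negative
count should report a single callback invocation, including the case
where no WRITEs were queued at all and the submitter's own put is the
last one. As in the patch, get and put tolerate a NULL pointer, so a
failed allocation simply means no completion callback is run.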