From patchwork Tue Nov 16 02:44:04 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: NeilBrown X-Patchwork-Id: 12621149 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 71CD4C433EF for ; Tue, 16 Nov 2021 02:46:55 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 1ED416322A for ; Tue, 16 Nov 2021 02:46:55 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 1ED416322A Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=suse.de Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id B5AA66B00AF; Mon, 15 Nov 2021 21:46:54 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id B09816B00BB; Mon, 15 Nov 2021 21:46:54 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 983A56B00BC; Mon, 15 Nov 2021 21:46:54 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0090.hostedemail.com [216.40.44.90]) by kanga.kvack.org (Postfix) with ESMTP id 88A566B00AF for ; Mon, 15 Nov 2021 21:46:54 -0500 (EST) Received: from smtpin17.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 51E438249980 for ; Tue, 16 Nov 2021 02:46:54 +0000 (UTC) X-FDA: 78813255906.17.339377B Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.220.29]) by imf03.hostedemail.com (Postfix) with ESMTP id 18DE230000B7 for ; Tue, 16 Nov 2021 02:46:43 +0000 (UTC) Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id C08E01FD6E; Tue, 16 Nov 2021 02:46:52 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1637030812; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ilqBwRNcKCWrWikYfICWiUGsbiDnIfpp6dIzn05L/u0=; b=bxOgH2KDZq5WM3FoP6Hh8QNS3dDmYdj+VdHSkOCn5TaCua9nhQnZJ5j0OMTAs7i9ksqJfQ 7r2QXkHt7QPW4DPKehojOONQMslkVY3Bs2DqimQIn7QxwvcxtVtGafL9N0oUEMggYfIW6e induzPCcnkKaCwsfjr5mBC7mqtYGRSE= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1637030812; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ilqBwRNcKCWrWikYfICWiUGsbiDnIfpp6dIzn05L/u0=; b=J6AaypphTXdJwcS+m7MJGN9qLmx78KXfjbO2OjSvoBgV5QKOlL2pGzUnKSOEtDgXjBZvFh JDE7foDThmS2U/Aw== Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id 6A52913B70; Tue, 16 Nov 2021 02:46:50 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id t4CfCpobk2EmCQAAMHmgww (envelope-from ); Tue, 16 Nov 2021 02:46:50 +0000 Subject: [PATCH 11/13] NFS: swap-out must always use STABLE writes. From: NeilBrown To: Trond Myklebust , Anna Schumaker , Chuck Lever , Andrew Morton , Mel Gorman Cc: linux-nfs@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org Date: Tue, 16 Nov 2021 13:44:04 +1100 Message-ID: <163703064457.25805.5324231084334985723.stgit@noble.brown> In-Reply-To: <163702956672.25805.16457749992977493579.stgit@noble.brown> References: <163702956672.25805.16457749992977493579.stgit@noble.brown> User-Agent: StGit/0.23 MIME-Version: 1.0 X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 18DE230000B7 X-Stat-Signature: 5jygroebsadcig3nicp4rfxhx9a6bnxb Authentication-Results: imf03.hostedemail.com; dkim=pass header.d=suse.de header.s=susede2_rsa header.b=bxOgH2KD; dkim=pass header.d=suse.de header.s=susede2_ed25519 header.b=J6Aaypph; spf=pass (imf03.hostedemail.com: domain of neilb@suse.de designates 195.135.220.29 as permitted sender) smtp.mailfrom=neilb@suse.de; dmarc=pass (policy=none) header.from=suse.de X-HE-Tag: 1637030803-805241 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: The commit handling code is not safe against memory-pressure deadlocks when writing to swap. In particular, nfs_commitdata_alloc() blocks indefinitely waiting for memory, and this can consume all available workqueue threads. swap-out most likely uses STABLE writes anyway as COND_STABLE indicates that a stable write should be used if the write fits in a single request, and it normally does. However if we ever swap with a small wsize, or gather unusually large numbers of pages for a single write, this might change. For safety, make it explicit in the code that direct writes use for swap must always use FLUSH_COND_STABLE. Signed-off-by: NeilBrown --- fs/nfs/direct.c | 12 +++++++----- fs/nfs/file.c | 2 +- include/linux/nfs_fs.h | 3 ++- 3 files changed, 10 insertions(+), 7 deletions(-) diff --git a/fs/nfs/direct.c b/fs/nfs/direct.c index 1e80d243ba25..8d3b12402725 100644 --- a/fs/nfs/direct.c +++ b/fs/nfs/direct.c @@ -173,7 +173,7 @@ ssize_t nfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter) if (iov_iter_rw(iter) == READ) return nfs_file_direct_read(iocb, iter); - return nfs_file_direct_write(iocb, iter); + return nfs_file_direct_write(iocb, iter, FLUSH_STABLE); } static void nfs_direct_release_pages(struct page **pages, unsigned int npages) @@ -789,7 +789,7 @@ static const struct nfs_pgio_completion_ops nfs_direct_write_completion_ops = { */ static ssize_t nfs_direct_write_schedule_iovec(struct nfs_direct_req *dreq, struct iov_iter *iter, - loff_t pos) + loff_t pos, int ioflags) { struct nfs_pageio_descriptor desc; struct inode *inode = dreq->inode; @@ -797,7 +797,7 @@ static ssize_t nfs_direct_write_schedule_iovec(struct nfs_direct_req *dreq, size_t requested_bytes = 0; size_t wsize = max_t(size_t, NFS_SERVER(inode)->wsize, PAGE_SIZE); - nfs_pageio_init_write(&desc, inode, FLUSH_COND_STABLE, false, + nfs_pageio_init_write(&desc, inode, ioflags, false, &nfs_direct_write_completion_ops); desc.pg_dreq = dreq; get_dreq(dreq); @@ -875,6 +875,7 @@ static ssize_t nfs_direct_write_schedule_iovec(struct nfs_direct_req *dreq, * nfs_file_direct_write - file direct write operation for NFS files * @iocb: target I/O control block * @iter: vector of user buffers from which to write data + * @ioflags: flags for nfs_pageio_init_write() * * We use this function for direct writes instead of calling * generic_file_aio_write() in order to avoid taking the inode @@ -891,7 +892,8 @@ static ssize_t nfs_direct_write_schedule_iovec(struct nfs_direct_req *dreq, * Note that O_APPEND is not supported for NFS direct writes, as there * is no atomic O_APPEND write facility in the NFS protocol. */ -ssize_t nfs_file_direct_write(struct kiocb *iocb, struct iov_iter *iter) +ssize_t nfs_file_direct_write(struct kiocb *iocb, struct iov_iter *iter, + int ioflags) { ssize_t result, requested; size_t count; @@ -935,7 +937,7 @@ ssize_t nfs_file_direct_write(struct kiocb *iocb, struct iov_iter *iter) nfs_start_io_direct(inode); - requested = nfs_direct_write_schedule_iovec(dreq, iter, pos); + requested = nfs_direct_write_schedule_iovec(dreq, iter, pos, ioflags); if (mapping->nrpages) { invalidate_inode_pages2_range(mapping, diff --git a/fs/nfs/file.c b/fs/nfs/file.c index 59c271f42ea5..878a6a510a5e 100644 --- a/fs/nfs/file.c +++ b/fs/nfs/file.c @@ -630,7 +630,7 @@ ssize_t nfs_file_write(struct kiocb *iocb, struct iov_iter *from) result = generic_write_checks(iocb, from); if (result <= 0) return result; - return nfs_file_direct_write(iocb, from); + return nfs_file_direct_write(iocb, from, FLUSH_COND_STABLE); } dprintk("NFS: write(%pD2, %zu@%Ld)\n", diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h index 5a605e51f4b1..ca312aea6bec 100644 --- a/include/linux/nfs_fs.h +++ b/include/linux/nfs_fs.h @@ -509,7 +509,8 @@ extern ssize_t nfs_direct_IO(struct kiocb *, struct iov_iter *); extern ssize_t nfs_file_direct_read(struct kiocb *iocb, struct iov_iter *iter); extern ssize_t nfs_file_direct_write(struct kiocb *iocb, - struct iov_iter *iter); + struct iov_iter *iter, + int ioflags); /* * linux/fs/nfs/dir.c