From patchwork Tue Jun 6 03:33:58 2023
X-Patchwork-Submitter: NeilBrown
X-Patchwork-Id: 13268205
From: "NeilBrown"
To: Trond Myklebust, Anna Schumaker
Cc: Linux NFS Mailing List
Subject: [PATCH/RFC] NFS: fix use-after-free with O_DIRECT write
Date: Tue, 06 Jun 2023 13:33:58 +1000
Message-id: <168602243849.29198.7621580343986486904@noble.neil.brown.name>
X-Mailing-List: linux-nfs@vger.kernel.org

If the page size is greater than the wsize, O_DIRECT writes are broken
up into multiple sub-requests (subreqs) per page.

If there are two subreqs for a given page, one (not the head) succeeds
and the other succeeds but sees a write verifier mis-match, we get a
problem.

The first subreq will have been released (nfs_release_request) and will
have a refcount of zero and PG_TEARDOWN will be set.
The remainder of the request will be passed to
nfs_direct_write_reschedule() and thence to nfs_direct_join_group().

nfs_direct_join_group() calls nfs_release_request() on each non-head
subreq, and this results in a refcounter underflow. The list is then
passed to nfs_join_page_group() which should probably ignore these
completed subreqs too, though there is no serious problem caused by not
skipping them.
It finally gets to nfs_destroy_unlinked_subrequests() which handles
these unrefed subreqs correctly.

This behaviour has been seen on a ppc64 machine with 64K page size,
mounting with NFSv3 and rsize=wsize=32768. The server-side filesystem
fills up causing the Linux NFS server to report ENOSPC and to update the
write verifier.

This patch adds tests on PG_TEARDOWN and skips those subrequests in
nfs_direct_join_group(). It doesn't make changes to
nfs_join_page_group().

If a "head" subreq succeeds but other subreqs fail,
nfs_direct_join_group() will not join those subreqs back together. I
doubt this is correct, but I haven't yet demonstrated a crash, or worked
through all the consequences.

Signed-off-by: NeilBrown
---
 fs/nfs/direct.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/nfs/direct.c b/fs/nfs/direct.c
index 9a18c5a69ace..d41d4b869d42 100644
--- a/fs/nfs/direct.c
+++ b/fs/nfs/direct.c
@@ -483,8 +483,10 @@ nfs_direct_join_group(struct list_head *list, struct inode *inode)
 		for (next = req->wb_this_page;
 				next != req->wb_head;
 				next = next->wb_this_page) {
-			nfs_list_remove_request(next);
-			nfs_release_request(next);
+			if (!test_bit(PG_TEARDOWN, &next->wb_flags)) {
+				nfs_list_remove_request(next);
+				nfs_release_request(next);
+			}
 		}
 		nfs_join_page_group(req, inode);
 	}
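
For readers who want to see the underflow in isolation, here is a
minimal user-space sketch of the scenario described in the commit
message. It is not kernel code: struct subreq, model_release(),
model_join_group() and run_scenario() are hypothetical stand-ins for
struct nfs_page, nfs_release_request() and the loop in
nfs_direct_join_group(), and a plain int and bool replace the real
refcount and the PG_TEARDOWN bit. It only models the counting, under the
assumption that the completed non-head subreq already has a refcount of
zero.

```c
#include <stdbool.h>

/* Hypothetical user-space stand-in for struct nfs_page; not kernel code. */
struct subreq {
	int refcount;              /* models the request's reference count */
	bool teardown;             /* models PG_TEARDOWN in wb_flags */
	struct subreq *this_page;  /* models wb_this_page (circular list) */
	struct subreq *head;       /* models wb_head */
};

/* Drop one reference; return true if the count fell below zero. */
static bool model_release(struct subreq *r)
{
	r->refcount--;
	return r->refcount < 0;
}

/*
 * Walk the non-head subreqs of a page group, mirroring the shape of the
 * loop in the patch. With skip_teardown set, subreqs that were already
 * torn down are left alone, which is the check the patch adds.
 * Returns the number of refcount underflows observed.
 */
static int model_join_group(struct subreq *req, bool skip_teardown)
{
	int underflows = 0;
	struct subreq *next;

	for (next = req->this_page; next != req->head; next = next->this_page) {
		if (skip_teardown && next->teardown)
			continue;
		if (model_release(next))
			underflows++;
	}
	return underflows;
}

/* Two subreqs on one page: a live head and a completed non-head. */
static int run_scenario(bool skip_teardown)
{
	struct subreq head = { .refcount = 1, .teardown = false };
	struct subreq done = { .refcount = 0, .teardown = true };

	head.head = &head;
	head.this_page = &done;
	done.head = &head;
	done.this_page = &head;

	return model_join_group(&head, skip_teardown);
}
```

In this model run_scenario(false) reports one underflow, because the
already-released subreq is released a second time, while
run_scenario(true) reports none. The sketch says nothing about the open
question of a "head" subreq that succeeds while others fail.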