From patchwork Thu Jul 11 07:17:02 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 13730080 Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4D09B3BBED for ; Thu, 11 Jul 2024 07:17:07 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.137.202.133 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1720682230; cv=none; b=n2oazEaxFRdv71GFaVw6ySf7gVv2o3D/EpY10bSWtaRBPmzV2V+Z9G3sW+HrH7d3/3AiXDMW6VVaH4le22cDjWkUPDUQPhn1pTcU/02UUjUCL7CSbROoZftnVZhcxKyZQQgbOOQ3GssUFwjFa822NejrDAOyqtIlYUzeXc/ePgI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1720682230; c=relaxed/simple; bh=5nE0R4Mohkzdd8E/rUcww3ZyElGXjl20M1q1HQFxegc=; h=From:To:Cc:Subject:Date:Message-ID:MIME-Version:Content-Type; b=glA40/AkgSnFAWOpP61XmhVC/AA7Mnp5elXpk2OkBKpAjYBclgpiDb1baVsQLJ26Wh6TR41/kQQxLl2wtivHVZl5bpAxC0USwq9f6mcQh8266Qt/1JrsaT0MdviZ/cgomKFSgcEKZS02bk8n+NNEDTcxNDJ/yv+Iva34oXABF+4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=lst.de; spf=none smtp.mailfrom=bombadil.srs.infradead.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b=ncqZsT/l; arc=none smtp.client-ip=198.137.202.133 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=lst.de Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=bombadil.srs.infradead.org Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="ncqZsT/l" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20210309; h=Content-Transfer-Encoding: Content-Type:MIME-Version:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To: Content-ID:Content-Description:In-Reply-To:References; bh=UovDSEY4DWvNJN/jjp/wUPmVPbR6pyVi2zyhJofNI3I=; b=ncqZsT/lTr6PB0wLsB9oyMlhmt NhjEFzdM5TOhoF1e8C97AEtVgzjM6qKi+zUNieOHEAsH0GgBUHfXEjTRyDluzkYSKPhVupZE+5zcm B9QLVRn84btHK0i0Hl1Y5qjrAhJogPseeju3uGjIajuwlS9scjqXkGCvtBged7Nd8SXZe2Wxfn3tC dG6clt06hXTLStoOStrrVMK0UAv+81q/rMDkPxxyEOZ+gGevDSXXtIcyZFdDFEpQZi1SmJU0sx2vH vK5m7s7jk19uwblvpzt/y5VGWky91ur/OAfpL2VNWDkHUeFA/zMqkxBab+uNBGmfU+6Jd5BuJd2LP WNxY0+8w==; Received: from 2a02-8389-2341-5b80-0976-dc3d-d348-fed7.cable.dynamic.v6.surfer.at ([2a02:8389:2341:5b80:976:dc3d:d348:fed7] helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.97.1 #2 (Red Hat Linux)) id 1sRo38-0000000Czey-1NbZ; Thu, 11 Jul 2024 07:17:06 +0000 From: Christoph Hellwig To: trondmy@kernel.org, anna@kernel.org Cc: chuck.lever@oracle.com, linux-nfs@vger.kernel.org Subject: [PATCH 1/2] nfs: pass explicit offset/count to trace events Date: Thu, 11 Jul 2024 09:17:02 +0200 Message-ID: <20240711071703.65793-1-hch@lst.de> X-Mailer: git-send-email 2.43.0 Precedence: bulk X-Mailing-List: linux-nfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html nfs_folio_length is unsafe to use without having the folio locked and a check for a NULL ->f_mapping that protects against truncations and can lead to kernel crashes. E.g. when running xfstests generic/065 with all nfs trace points enabled. Follow the model of the XFS trace points and pass in an explŃ–cit offset and length. This has the additional benefit that these values can be more accurate as some of the users touch partial folio ranges. Fixes: eb5654b3b89d ("NFS: Enable tracing of nfs_invalidate_folio() and nfs_launder_folio()") Reported-by: Chuck Lever Signed-off-by: Christoph Hellwig Reviewed-by: Sagi Grimberg --- fs/nfs/file.c | 5 +++-- fs/nfs/nfstrace.h | 36 ++++++++++++++++++++---------------- fs/nfs/read.c | 8 +++++--- fs/nfs/write.c | 8 ++++---- 4 files changed, 32 insertions(+), 25 deletions(-) diff --git a/fs/nfs/file.c b/fs/nfs/file.c index 0e2f87120cb840..9aa2ab218c0ac5 100644 --- a/fs/nfs/file.c +++ b/fs/nfs/file.c @@ -436,7 +436,7 @@ static void nfs_invalidate_folio(struct folio *folio, size_t offset, /* Cancel any unstarted writes on this page */ nfs_wb_folio_cancel(inode, folio); folio_wait_private_2(folio); /* [DEPRECATED] */ - trace_nfs_invalidate_folio(inode, folio); + trace_nfs_invalidate_folio(inode, folio_pos(folio) + offset, length); } /* @@ -504,7 +504,8 @@ static int nfs_launder_folio(struct folio *folio) folio_wait_private_2(folio); /* [DEPRECATED] */ ret = nfs_wb_folio(inode, folio); - trace_nfs_launder_folio_done(inode, folio, ret); + trace_nfs_launder_folio_done(inode, folio_pos(folio), + folio_size(folio), ret); return ret; } diff --git a/fs/nfs/nfstrace.h b/fs/nfs/nfstrace.h index 1e710654af1173..352fdaed407541 100644 --- a/fs/nfs/nfstrace.h +++ b/fs/nfs/nfstrace.h @@ -939,10 +939,11 @@ TRACE_EVENT(nfs_sillyrename_unlink, DECLARE_EVENT_CLASS(nfs_folio_event, TP_PROTO( const struct inode *inode, - struct folio *folio + loff_t offset, + size_t count ), - TP_ARGS(inode, folio), + TP_ARGS(inode, offset, count), TP_STRUCT__entry( __field(dev_t, dev) @@ -950,7 +951,7 @@ DECLARE_EVENT_CLASS(nfs_folio_event, __field(u64, fileid) __field(u64, version) __field(loff_t, offset) - __field(u32, count) + __field(size_t, count) ), TP_fast_assign( @@ -960,13 +961,13 @@ DECLARE_EVENT_CLASS(nfs_folio_event, __entry->fileid = nfsi->fileid; __entry->fhandle = nfs_fhandle_hash(&nfsi->fh); __entry->version = inode_peek_iversion_raw(inode); - __entry->offset = folio_file_pos(folio); - __entry->count = nfs_folio_length(folio); + __entry->offset = offset, + __entry->count = count; ), TP_printk( "fileid=%02x:%02x:%llu fhandle=0x%08x version=%llu " - "offset=%lld count=%u", + "offset=%lld count=%zu", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long long)__entry->fileid, __entry->fhandle, __entry->version, @@ -978,18 +979,20 @@ DECLARE_EVENT_CLASS(nfs_folio_event, DEFINE_EVENT(nfs_folio_event, name, \ TP_PROTO( \ const struct inode *inode, \ - struct folio *folio \ + loff_t offset, \ + size_t count \ ), \ - TP_ARGS(inode, folio)) + TP_ARGS(inode, offset, count)) DECLARE_EVENT_CLASS(nfs_folio_event_done, TP_PROTO( const struct inode *inode, - struct folio *folio, + loff_t offset, + size_t count, int ret ), - TP_ARGS(inode, folio, ret), + TP_ARGS(inode, offset, count, ret), TP_STRUCT__entry( __field(dev_t, dev) @@ -998,7 +1001,7 @@ DECLARE_EVENT_CLASS(nfs_folio_event_done, __field(u64, fileid) __field(u64, version) __field(loff_t, offset) - __field(u32, count) + __field(size_t, count) ), TP_fast_assign( @@ -1008,14 +1011,14 @@ DECLARE_EVENT_CLASS(nfs_folio_event_done, __entry->fileid = nfsi->fileid; __entry->fhandle = nfs_fhandle_hash(&nfsi->fh); __entry->version = inode_peek_iversion_raw(inode); - __entry->offset = folio_file_pos(folio); - __entry->count = nfs_folio_length(folio); + __entry->offset = offset, + __entry->count = count, __entry->ret = ret; ), TP_printk( "fileid=%02x:%02x:%llu fhandle=0x%08x version=%llu " - "offset=%lld count=%u ret=%d", + "offset=%lld count=%zu ret=%d", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long long)__entry->fileid, __entry->fhandle, __entry->version, @@ -1027,10 +1030,11 @@ DECLARE_EVENT_CLASS(nfs_folio_event_done, DEFINE_EVENT(nfs_folio_event_done, name, \ TP_PROTO( \ const struct inode *inode, \ - struct folio *folio, \ + loff_t offset, \ + size_t count, \ int ret \ ), \ - TP_ARGS(inode, folio, ret)) + TP_ARGS(inode, offset, count, ret)) DEFINE_NFS_FOLIO_EVENT(nfs_aop_readpage); DEFINE_NFS_FOLIO_EVENT_DONE(nfs_aop_readpage_done); diff --git a/fs/nfs/read.c b/fs/nfs/read.c index 036ede4875cab6..571a5d654cf547 100644 --- a/fs/nfs/read.c +++ b/fs/nfs/read.c @@ -333,13 +333,15 @@ int nfs_read_add_folio(struct nfs_pageio_descriptor *pgio, int nfs_read_folio(struct file *file, struct folio *folio) { struct inode *inode = file_inode(file); + loff_t pos = folio_file_pos(folio); + size_t len = folio_size(folio); struct nfs_pageio_descriptor pgio; struct nfs_open_context *ctx; int ret; - trace_nfs_aop_readpage(inode, folio); + trace_nfs_aop_readpage(inode, pos, len); nfs_inc_stats(inode, NFSIOS_VFSREADPAGE); - task_io_account_read(folio_size(folio)); + task_io_account_read(len); /* * Try to flush any pending writes to the file.. @@ -383,7 +385,7 @@ int nfs_read_folio(struct file *file, struct folio *folio) out_put: put_nfs_open_context(ctx); out: - trace_nfs_aop_readpage_done(inode, folio, ret); + trace_nfs_aop_readpage_done(inode, pos, len, ret); return ret; out_unlock: folio_unlock(folio); diff --git a/fs/nfs/write.c b/fs/nfs/write.c index acf2d942d78fe2..54eab063b67851 100644 --- a/fs/nfs/write.c +++ b/fs/nfs/write.c @@ -2064,16 +2064,16 @@ int nfs_wb_folio_cancel(struct inode *inode, struct folio *folio) int nfs_wb_folio(struct inode *inode, struct folio *folio) { loff_t range_start = folio_file_pos(folio); - loff_t range_end = range_start + (loff_t)folio_size(folio) - 1; + size_t len = folio_size(folio); struct writeback_control wbc = { .sync_mode = WB_SYNC_ALL, .nr_to_write = 0, .range_start = range_start, - .range_end = range_end, + .range_end = range_start + len - 1, }; int ret; - trace_nfs_writeback_folio(inode, folio); + trace_nfs_writeback_folio(inode, range_start, len); for (;;) { folio_wait_writeback(folio); @@ -2091,7 +2091,7 @@ int nfs_wb_folio(struct inode *inode, struct folio *folio) goto out_error; } out_error: - trace_nfs_writeback_folio_done(inode, folio, ret); + trace_nfs_writeback_folio_done(inode, range_start, len, ret); return ret; } From patchwork Thu Jul 11 07:17:03 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christoph Hellwig X-Patchwork-Id: 13730081 Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 294FA770E6 for ; Thu, 11 Jul 2024 07:17:10 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.137.202.133 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1720682231; cv=none; b=X+E5V/APN3oDG9354HAofgACcrQHP1haRcmXnzJbLN6YA1YkjtldWa0Ulg1pbvjGRsrcVKpSQrod3w2kfxAlXcLbvfcg0SFSLM7BmTE28rnf7s736s5AKi3Xz/ljCPDRUnJVJ7osAdXvGWQfyOLxvVd2ANgQA1hVtC8MJgkwMyE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1720682231; c=relaxed/simple; bh=3blsRq1CiYO3V5Ti2bsFnsuZ0M9CFI7EWxXX68OVCRU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=t9J6rcgtWVQlL0VC4/s6vYAR3n85eNohb44gFSmFPzAmlWoIQooumKfvYDcP/zflEOUiDQqNmdp4KbrZnzDQAyq2qpk5/glscyFZ54V4Ge/p54U1OLgTP23MXmuzI+zip33dju2epvl74qK6TXQP9lpO8i+ExcHt+Gf45rabSwU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=lst.de; spf=none smtp.mailfrom=bombadil.srs.infradead.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b=Ttxjb+uh; arc=none smtp.client-ip=198.137.202.133 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=lst.de Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=bombadil.srs.infradead.org Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="Ttxjb+uh" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20210309; h=Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender :Reply-To:Content-Type:Content-ID:Content-Description; bh=OeAbGG4uDiFVrxeq5hxYdHeUQgrppVqgh11wh4YpQfI=; b=Ttxjb+uhjHHf7tXJRp6w+gLwr6 3YkKIeIN/8hDnl/K49GZK3vSgOI+JkaXec/uRYyg30se+CPl6apQ+z4e9k5DLU8v2RgIuEd2qb8SY QRpOHL3ci04bROS/9OJO4kSXy+rX3mXeR9C5n3WNT+aHbwiggOPgOfJQ2Vo3hE3O6vlV6rw7E9O8u basbRMW4GBfHfIaH7dZkyyI4IFnqg0h8oCYdPJUQ3OzniDQP2lR/PuqL/DNzHx8Ams7qvoQVDxMzz KEi9g9DKDEkmqeFc4x3Jvx4XQSM/3ZWFOkvWyOTjHWgq4Lp0knDDuLsyLuQvaztUQQdM7fDRflZzR +4g8MJlQ==; Received: from 2a02-8389-2341-5b80-0976-dc3d-d348-fed7.cable.dynamic.v6.surfer.at ([2a02:8389:2341:5b80:976:dc3d:d348:fed7] helo=localhost) by bombadil.infradead.org with esmtpsa (Exim 4.97.1 #2 (Red Hat Linux)) id 1sRo3B-0000000CzfF-0e1N; Thu, 11 Jul 2024 07:17:09 +0000 From: Christoph Hellwig To: trondmy@kernel.org, anna@kernel.org Cc: chuck.lever@oracle.com, linux-nfs@vger.kernel.org Subject: [PATCH 2/2] nfs: split nfs_read_folio Date: Thu, 11 Jul 2024 09:17:03 +0200 Message-ID: <20240711071703.65793-2-hch@lst.de> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240711071703.65793-1-hch@lst.de> References: <20240711071703.65793-1-hch@lst.de> Precedence: bulk X-Mailing-List: linux-nfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html nfs_read_folio is a bit hard to follow because it mixes highlevel logic with the actual data read. Split the latter into a helper and update the comments to be more accurate. Signed-off-by: Christoph Hellwig Reviewed-by: Sagi Grimberg --- fs/nfs/read.c | 69 ++++++++++++++++++++++++++++++--------------------- 1 file changed, 41 insertions(+), 28 deletions(-) diff --git a/fs/nfs/read.c b/fs/nfs/read.c index 571a5d654cf547..4b767d5a3524b3 100644 --- a/fs/nfs/read.c +++ b/fs/nfs/read.c @@ -325,18 +325,52 @@ int nfs_read_add_folio(struct nfs_pageio_descriptor *pgio, } /* - * Read a page over NFS. - * We read the page synchronously in the following case: - * - The error flag is set for this page. This happens only when a - * previous async read operation failed. + * Actually read a folio over the wire. + */ +static int nfs_do_read_folio(struct file *file, struct folio *folio) +{ + struct inode *inode = file_inode(file); + struct nfs_pageio_descriptor pgio; + struct nfs_open_context *ctx; + int ret; + + ctx = get_nfs_open_context(nfs_file_open_context(file)); + + xchg(&ctx->error, 0); + nfs_pageio_init_read(&pgio, inode, false, + &nfs_async_read_completion_ops); + + ret = nfs_read_add_folio(&pgio, ctx, folio); + if (ret) + goto out_put; + + nfs_pageio_complete_read(&pgio); + nfs_update_delegated_atime(inode); + if (pgio.pg_error < 0) { + ret = pgio.pg_error; + goto out_put; + } + + ret = folio_wait_locked_killable(folio); + if (!folio_test_uptodate(folio) && !ret) + ret = xchg(&ctx->error, 0); + +out_put: + put_nfs_open_context(ctx); + return ret; +} + +/* + * Synchronously read a folio. + * + * This is not heavily used as most users to try an asynchronous + * large read through ->readahead first. */ int nfs_read_folio(struct file *file, struct folio *folio) { struct inode *inode = file_inode(file); loff_t pos = folio_file_pos(folio); size_t len = folio_size(folio); - struct nfs_pageio_descriptor pgio; - struct nfs_open_context *ctx; int ret; trace_nfs_aop_readpage(inode, pos, len); @@ -361,29 +395,8 @@ int nfs_read_folio(struct file *file, struct folio *folio) goto out_unlock; ret = nfs_netfs_read_folio(file, folio); - if (!ret) - goto out; - - ctx = get_nfs_open_context(nfs_file_open_context(file)); - - xchg(&ctx->error, 0); - nfs_pageio_init_read(&pgio, inode, false, - &nfs_async_read_completion_ops); - - ret = nfs_read_add_folio(&pgio, ctx, folio); if (ret) - goto out_put; - - nfs_pageio_complete_read(&pgio); - nfs_update_delegated_atime(inode); - ret = pgio.pg_error < 0 ? pgio.pg_error : 0; - if (!ret) { - ret = folio_wait_locked_killable(folio); - if (!folio_test_uptodate(folio) && !ret) - ret = xchg(&ctx->error, 0); - } -out_put: - put_nfs_open_context(ctx); + ret = nfs_do_read_folio(file, folio); out: trace_nfs_aop_readpage_done(inode, pos, len, ret); return ret;