diff mbox series

[AUTOSEL,4.19,52/68] iomap: sub-block dio needs to zeroout beyond EOF

Message ID 20181129055559.159228-52-sashal@kernel.org (mailing list archive)
State New, archived
Headers show
Series None | expand

Commit Message

Sasha Levin Nov. 29, 2018, 5:55 a.m. UTC
From: Dave Chinner <dchinner@redhat.com>

[ Upstream commit b450672fb66b4a991a5b55ee24209ac7ae7690ce ]

If we are doing sub-block dio that extends EOF, we need to zero
the unused tail of the block to initialise the data in it it. If we
do not zero the tail of the block, then an immediate mmap read of
the EOF block will expose stale data beyond EOF to userspace. Found
with fsx running sub-block DIO sizes vs MAPREAD/MAPWRITE operations.

Fix this by detecting if the end of the DIO write is beyond EOF
and zeroing the tail if necessary.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/iomap.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

Comments

Dave Chinner Nov. 29, 2018, 12:19 p.m. UTC | #1
On Thu, Nov 29, 2018 at 12:55:43AM -0500, Sasha Levin wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> [ Upstream commit b450672fb66b4a991a5b55ee24209ac7ae7690ce ]
> 
> If we are doing sub-block dio that extends EOF, we need to zero
> the unused tail of the block to initialise the data in it it. If we
> do not zero the tail of the block, then an immediate mmap read of
> the EOF block will expose stale data beyond EOF to userspace. Found
> with fsx running sub-block DIO sizes vs MAPREAD/MAPWRITE operations.
> 
> Fix this by detecting if the end of the DIO write is beyond EOF
> and zeroing the tail if necessary.
> 
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> Signed-off-by: Sasha Levin <sashal@kernel.org>
> ---
>  fs/iomap.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/iomap.c b/fs/iomap.c
> index fa46e3ed8f53..82e35265679d 100644
> --- a/fs/iomap.c
> +++ b/fs/iomap.c
> @@ -1678,7 +1678,14 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length,
>  		dio->submit.cookie = submit_bio(bio);
>  	} while (nr_pages);
>  
> -	if (need_zeroout) {
> +	/*
> +	 * We need to zeroout the tail of a sub-block write if the extent type
> +	 * requires zeroing or the write extends beyond EOF. If we don't zero
> +	 * the block tail in the latter case, we can expose stale data via mmap
> +	 * reads of the EOF block.
> +	 */
> +	if (need_zeroout ||
> +	    ((dio->flags & IOMAP_DIO_WRITE) && pos >= i_size_read(inode))) {
>  		/* zero out from the end of the write to the end of the block */
>  		pad = pos & (fs_block_size - 1);
>  		if (pad)

Same again - what's the test plan for these cherry-picked data
corruption fixes?

Cheers,

Dave.
Amir Goldstein Nov. 29, 2018, 12:36 p.m. UTC | #2
> Same again - what's the test plan for these cherry-picked data
> corruption fixes?
>

Dave,

Just to check if we are on the same page, if this was the test plan:
https://www.spinics.net/lists/linux-xfs/msg20639.html

Would "no regressions from baseline" have been sufficient to validate
those specific patches are solid for stable?

Thanks,
Amir.
Dave Chinner Nov. 29, 2018, 10:43 p.m. UTC | #3
On Thu, Nov 29, 2018 at 02:36:50PM +0200, Amir Goldstein wrote:
> > Same again - what's the test plan for these cherry-picked data
> > corruption fixes?
> >
> 
> Dave,
> 
> Just to check if we are on the same page, if this was the test plan:
> https://www.spinics.net/lists/linux-xfs/msg20639.html
> 
> Would "no regressions from baseline" have been sufficient to validate
> those specific patches are solid for stable?

No, not in this case, because fsx in fstests only does 100k ops at
most - it's a smoke test. This isn't sufficient to regression test
fixes that, in some cases, took hundreds of millions of fsx ops to
expose.

Cheers,

Dave.
diff mbox series

Patch

diff --git a/fs/iomap.c b/fs/iomap.c
index fa46e3ed8f53..82e35265679d 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -1678,7 +1678,14 @@  iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length,
 		dio->submit.cookie = submit_bio(bio);
 	} while (nr_pages);
 
-	if (need_zeroout) {
+	/*
+	 * We need to zeroout the tail of a sub-block write if the extent type
+	 * requires zeroing or the write extends beyond EOF. If we don't zero
+	 * the block tail in the latter case, we can expose stale data via mmap
+	 * reads of the EOF block.
+	 */
+	if (need_zeroout ||
+	    ((dio->flags & IOMAP_DIO_WRITE) && pos >= i_size_read(inode))) {
 		/* zero out from the end of the write to the end of the block */
 		pad = pos & (fs_block_size - 1);
 		if (pad)