Message ID | 158984935136.619853.1558687512700172480.stgit@magnolia (mailing list archive) |
---|---|
State | Superseded |
Series | xfs: fix stale disk exposure after crash |
On Mon, May 18, 2020 at 05:49:11PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
>
> When writing to a delalloc region in the data fork, commit the new
> allocations (of the da reservation) as unwritten so that the mappings
> are only marked written once writeback completes successfully. This
> fixes the problem of stale data exposure if the system goes down during
> targeted writeback of a specific region of a file, as tested by
> generic/042.
>

We could probably add generic/042 into the auto group once this patch
lands.

> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> ---
>  fs/xfs/libxfs/xfs_bmap.c | 28 +++++++++++++++++-----------
>  1 file changed, 17 insertions(+), 11 deletions(-)
>
>
> diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
> index fda13cd7add0..825d170e1503 100644
> --- a/fs/xfs/libxfs/xfs_bmap.c
> +++ b/fs/xfs/libxfs/xfs_bmap.c
...
> @@ -4611,8 +4601,24 @@ xfs_bmapi_convert_delalloc(
>  	bma.offset = bma.got.br_startoff;
>  	bma.length = max_t(xfs_filblks_t, bma.got.br_blockcount, MAXEXTLEN);
>  	bma.minleft = xfs_bmapi_minleft(tp, ip, whichfork);
> +
> +	/*
> +	 * When we're converting the delalloc reservations backing dirty pages
> +	 * in the page cache, we must be careful about how we create the new
> +	 * extents:
> +	 *
> +	 * New CoW fork extents are created unwritten, turned into real extents
> +	 * when we're about to write the data to disk, and mapped into the data
> +	 * fork after the write finishes.  End of story.
> +	 *
> +	 * New data fork extents must be mapped in as unwritten and converted
> +	 * to real extents after the write succeeds to avoid exposing stale
> +	 * disk contents if we crash.
> +	 */
>  	if (whichfork == XFS_COW_FORK)
>  		bma.flags = XFS_BMAPI_COWFORK | XFS_BMAPI_PREALLOC;
> +	else
> +		bma.flags = XFS_BMAPI_PREALLOC;

The following seems a bit cleaner:

	bma.flags = XFS_BMAPI_PREALLOC;
	if (whichfork == XFS_COW_FORK)
		bma.flags |= XFS_BMAPI_COWFORK;

... but nit aside, LGTM:

Reviewed-by: Brian Foster <bfoster@redhat.com>

>
>  	if (!xfs_iext_peek_prev_extent(ifp, &bma.icur, &bma.prev))
>  		bma.prev.br_startoff = NULLFILEOFF;
>
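For anyone wanting to convince themselves that the form suggested in the review is purely cosmetic, here is a minimal userspace sketch comparing the two ways of building the flags. The `DEMO_` flag values and both function names are placeholders invented for this illustration; they are not taken from the kernel headers, and the real code assigns to `bma.flags` rather than returning a value.

```c
#include <stdbool.h>

/* Placeholder flag bits for illustration only; not copied from kernel headers. */
#define DEMO_BMAPI_PREALLOC	0x008u
#define DEMO_BMAPI_COWFORK	0x200u

/* Shape used in the posted patch: explicit if/else assignment. */
static unsigned int flags_as_posted(bool is_cow_fork)
{
	unsigned int flags;

	if (is_cow_fork)
		flags = DEMO_BMAPI_COWFORK | DEMO_BMAPI_PREALLOC;
	else
		flags = DEMO_BMAPI_PREALLOC;
	return flags;
}

/* Shape suggested in the review: set the common flag, then OR in the extra one. */
static unsigned int flags_as_suggested(bool is_cow_fork)
{
	unsigned int flags = DEMO_BMAPI_PREALLOC;

	if (is_cow_fork)
		flags |= DEMO_BMAPI_COWFORK;
	return flags;
}
```

Both helpers return the same value for either fork, so the choice between them is a matter of readability: the second form makes it visible at a glance that the prealloc flag is now set unconditionally.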
diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index fda13cd7add0..825d170e1503 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -4193,17 +4193,7 @@ xfs_bmapi_allocate(
 	bma->got.br_blockcount = bma->length;
 	bma->got.br_state = XFS_EXT_NORM;
 
-	/*
-	 * In the data fork, a wasdelay extent has been initialized, so
-	 * shouldn't be flagged as unwritten.
-	 *
-	 * For the cow fork, however, we convert delalloc reservations
-	 * (extents allocated for speculative preallocation) to
-	 * allocated unwritten extents, and only convert the unwritten
-	 * extents to real extents when we're about to write the data.
-	 */
-	if ((!bma->wasdel || (bma->flags & XFS_BMAPI_COWFORK)) &&
-	    (bma->flags & XFS_BMAPI_PREALLOC))
+	if (bma->flags & XFS_BMAPI_PREALLOC)
 		bma->got.br_state = XFS_EXT_UNWRITTEN;
 
 	if (bma->wasdel)
@@ -4611,8 +4601,24 @@ xfs_bmapi_convert_delalloc(
 	bma.offset = bma.got.br_startoff;
 	bma.length = max_t(xfs_filblks_t, bma.got.br_blockcount, MAXEXTLEN);
 	bma.minleft = xfs_bmapi_minleft(tp, ip, whichfork);
+
+	/*
+	 * When we're converting the delalloc reservations backing dirty pages
+	 * in the page cache, we must be careful about how we create the new
+	 * extents:
+	 *
+	 * New CoW fork extents are created unwritten, turned into real extents
+	 * when we're about to write the data to disk, and mapped into the data
+	 * fork after the write finishes.  End of story.
+	 *
+	 * New data fork extents must be mapped in as unwritten and converted
+	 * to real extents after the write succeeds to avoid exposing stale
+	 * disk contents if we crash.
+	 */
 	if (whichfork == XFS_COW_FORK)
 		bma.flags = XFS_BMAPI_COWFORK | XFS_BMAPI_PREALLOC;
+	else
+		bma.flags = XFS_BMAPI_PREALLOC;
 
 	if (!xfs_iext_peek_prev_extent(ifp, &bma.icur, &bma.prev))
 		bma.prev.br_startoff = NULLFILEOFF;
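To see what the first hunk changes in isolation, the old and new conditions that pick the extent state can be modelled as two small predicates. This is a rough userspace sketch, not kernel code: the `demo_` and `DEMO_` names and the flag values are made up for illustration, and `wasdel` stands in for `bma->wasdel`.

```c
#include <stdbool.h>

/* Placeholder flag bits for illustration only; not copied from kernel headers. */
#define DEMO_BMAPI_PREALLOC	0x008u
#define DEMO_BMAPI_COWFORK	0x200u

enum demo_ext_state { DEMO_EXT_NORM, DEMO_EXT_UNWRITTEN };

/* Condition removed by the patch: delalloc conversions in the data fork stay NORM. */
static enum demo_ext_state state_before(bool wasdel, unsigned int flags)
{
	if ((!wasdel || (flags & DEMO_BMAPI_COWFORK)) &&
	    (flags & DEMO_BMAPI_PREALLOC))
		return DEMO_EXT_UNWRITTEN;
	return DEMO_EXT_NORM;
}

/* Condition added by the patch: the prealloc flag alone decides. */
static enum demo_ext_state state_after(unsigned int flags)
{
	if (flags & DEMO_BMAPI_PREALLOC)
		return DEMO_EXT_UNWRITTEN;
	return DEMO_EXT_NORM;
}

/* True when the old and new conditions pick different states for the same input. */
static bool rules_differ(bool wasdel, unsigned int flags)
{
	return state_before(wasdel, flags) != state_after(flags);
}
```

Under this model the two conditions disagree only when `wasdel` is set, the prealloc flag is set, and the CoW fork flag is clear. That is the data fork delalloc conversion, which is the case the second hunk starts requesting by setting `XFS_BMAPI_PREALLOC` for the data fork as well.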