Message ID | 20230404145319.2057051-1-aalbersh@redhat.com (mailing list archive) |
---|---|
Headers | show |
Series | fs-verity support for XFS | expand |
On Tue, Apr 04, 2023 at 04:52:56PM +0200, Andrey Albershteyn wrote: > Hi all, > > This is V2 of fs-verity support in XFS. In this series I did > numerous changes from V1 which are described below. > > This patchset introduces fs-verity [5] support for XFS. This > implementation utilizes extended attributes to store fs-verity > metadata. The Merkle tree blocks are stored in the remote extended > attributes. > > A few key points: > - fs-verity metadata is stored in extended attributes > - Direct path and DAX are disabled for inodes with fs-verity > - Pages are verified in iomap's read IO path (offloaded to > workqueue) > - New workqueue for verification processing > - New ro-compat flag > - Inodes with fs-verity have new on-disk diflag > - xfs_attr_get() can return buffer with the attribute > > The patchset is tested with xfstests -g auto on xfs_1k, xfs_4k, > xfs_1k_quota, and xfs_4k_quota. Haven't found any major failures. > > Patches [6/23] and [7/23] touch ext4, f2fs, btrfs, and patch [8/23] > touches erofs, gfs2, and zonefs. > > The patchset consist of four parts: > - [1..4]: Patches from Parent Pointer patchset which add binary > xattr names with a few deps > - [5..7]: Improvements to core fs-verity > - [8..9]: Add read path verification to iomap > - [10..23]: Integration of fs-verity to xfs > > Changes from V1: > - Added parent pointer patches for easier testing > - Many issues and refactoring points fixed from the V1 review > - Adjusted for recent changes in fs-verity core (folios, non-4k) > - Dropped disabling of large folios > - Completely new fsverity patches (fix, callout, log_blocksize) > - Change approach to verification in iomap to the same one as in > write path. Callouts to fs instead of direct fs-verity use. > - New XFS workqueue for post read folio verification > - xfs_attr_get() can return underlying xfs_buf > - xfs_bufs are marked with XBF_VERITY_CHECKED to track verified > blocks > > kernel: > [1]: https://github.com/alberand/linux/tree/xfs-verity-v2 > > xfsprogs: > [2]: https://github.com/alberand/xfsprogs/tree/fsverity-v2 Will there any means for xfs_repair to check the merkle tree contents? Should it clear the ondisk inode flag if it decides to trash the xattr structure, or is it ok to let the kernel deal with flag set and no verity data? --D > xfstests: > [3]: https://github.com/alberand/xfstests/tree/fsverity-v2 > > v1: > [4]: https://lore.kernel.org/linux-xfs/20221213172935.680971-1-aalbersh@redhat.com/ > > fs-verity: > [5]: https://www.kernel.org/doc/html/latest/filesystems/fsverity.html > > Thanks, > Andrey > > Allison Henderson (4): > xfs: Add new name to attri/d > xfs: add parent pointer support to attribute code > xfs: define parent pointer xattr format > xfs: Add xfs_verify_pptr > > Andrey Albershteyn (19): > fsverity: make fsverity_verify_folio() accept folio's offset and size > fsverity: add drop_page() callout > fsverity: pass Merkle tree block size to ->read_merkle_tree_page() > iomap: hoist iomap_readpage_ctx from the iomap_readahead/_folio > iomap: allow filesystem to implement read path verification > xfs: add XBF_VERITY_CHECKED xfs_buf flag > xfs: add XFS_DA_OP_BUFFER to make xfs_attr_get() return buffer > xfs: introduce workqueue for post read IO work > xfs: add iomap's readpage operations > xfs: add attribute type for fs-verity > xfs: add fs-verity ro-compat flag > xfs: add inode on-disk VERITY flag > xfs: initialize fs-verity on file open and cleanup on inode > destruction > xfs: don't allow to enable DAX on fs-verity sealsed inode > xfs: disable direct read path for fs-verity sealed files > xfs: add fs-verity support > xfs: handle merkle tree block size != fs blocksize != PAGE_SIZE > xfs: add fs-verity ioctls > xfs: enable ro-compat fs-verity flag > > fs/btrfs/verity.c | 15 +- > fs/erofs/data.c | 12 +- > fs/ext4/verity.c | 9 +- > fs/f2fs/verity.c | 9 +- > fs/gfs2/aops.c | 10 +- > fs/ioctl.c | 4 + > fs/iomap/buffered-io.c | 89 ++++++----- > fs/verity/read_metadata.c | 7 +- > fs/verity/verify.c | 9 +- > fs/xfs/Makefile | 1 + > fs/xfs/libxfs/xfs_attr.c | 81 +++++++++- > fs/xfs/libxfs/xfs_attr.h | 7 +- > fs/xfs/libxfs/xfs_attr_leaf.c | 7 + > fs/xfs/libxfs/xfs_attr_remote.c | 13 +- > fs/xfs/libxfs/xfs_da_btree.h | 7 +- > fs/xfs/libxfs/xfs_da_format.h | 46 +++++- > fs/xfs/libxfs/xfs_format.h | 14 +- > fs/xfs/libxfs/xfs_log_format.h | 8 +- > fs/xfs/libxfs/xfs_sb.c | 2 + > fs/xfs/scrub/attr.c | 4 +- > fs/xfs/xfs_aops.c | 61 +++++++- > fs/xfs/xfs_attr_item.c | 142 +++++++++++++++--- > fs/xfs/xfs_attr_item.h | 1 + > fs/xfs/xfs_attr_list.c | 17 ++- > fs/xfs/xfs_buf.h | 17 ++- > fs/xfs/xfs_file.c | 22 ++- > fs/xfs/xfs_inode.c | 2 + > fs/xfs/xfs_inode.h | 3 +- > fs/xfs/xfs_ioctl.c | 22 +++ > fs/xfs/xfs_iomap.c | 14 ++ > fs/xfs/xfs_iops.c | 4 + > fs/xfs/xfs_linux.h | 1 + > fs/xfs/xfs_mount.h | 3 + > fs/xfs/xfs_ondisk.h | 4 + > fs/xfs/xfs_super.c | 19 +++ > fs/xfs/xfs_trace.h | 1 + > fs/xfs/xfs_verity.c | 256 ++++++++++++++++++++++++++++++++ > fs/xfs/xfs_verity.h | 27 ++++ > fs/xfs/xfs_xattr.c | 9 ++ > fs/zonefs/file.c | 12 +- > include/linux/fsverity.h | 18 ++- > include/linux/iomap.h | 39 ++++- > include/uapi/linux/fs.h | 1 + > 43 files changed, 923 insertions(+), 126 deletions(-) > create mode 100644 fs/xfs/xfs_verity.c > create mode 100644 fs/xfs/xfs_verity.h > > -- > 2.38.4 >
On Tue, Apr 04, 2023 at 04:52:56PM +0200, Andrey Albershteyn wrote: > The patchset is tested with xfstests -g auto on xfs_1k, xfs_4k, > xfs_1k_quota, and xfs_4k_quota. Haven't found any major failures. Just to double check, did you verify that the tests in the "verity" group are running, and were not skipped? - Eric
On Tue, Apr 04, 2023 at 04:37:13PM -0700, Eric Biggers wrote: > On Tue, Apr 04, 2023 at 04:52:56PM +0200, Andrey Albershteyn wrote: > > The patchset is tested with xfstests -g auto on xfs_1k, xfs_4k, > > xfs_1k_quota, and xfs_4k_quota. Haven't found any major failures. > > Just to double check, did you verify that the tests in the "verity" group are > running, and were not skipped? Yes, the linked xfstests in cover-letter has patch enabling xfs (xfsprogs also needed). > > - Eric >
Hi Darrick, On Tue, Apr 04, 2023 at 09:39:42AM -0700, Darrick J. Wong wrote: > On Tue, Apr 04, 2023 at 04:52:56PM +0200, Andrey Albershteyn wrote: > > Hi all, > > > > This is V2 of fs-verity support in XFS. In this series I did > > numerous changes from V1 which are described below. > > > > This patchset introduces fs-verity [5] support for XFS. This > > implementation utilizes extended attributes to store fs-verity > > metadata. The Merkle tree blocks are stored in the remote extended > > attributes. > > > > A few key points: > > - fs-verity metadata is stored in extended attributes > > - Direct path and DAX are disabled for inodes with fs-verity > > - Pages are verified in iomap's read IO path (offloaded to > > workqueue) > > - New workqueue for verification processing > > - New ro-compat flag > > - Inodes with fs-verity have new on-disk diflag > > - xfs_attr_get() can return buffer with the attribute > > > > The patchset is tested with xfstests -g auto on xfs_1k, xfs_4k, > > xfs_1k_quota, and xfs_4k_quota. Haven't found any major failures. > > > > Patches [6/23] and [7/23] touch ext4, f2fs, btrfs, and patch [8/23] > > touches erofs, gfs2, and zonefs. > > > > The patchset consist of four parts: > > - [1..4]: Patches from Parent Pointer patchset which add binary > > xattr names with a few deps > > - [5..7]: Improvements to core fs-verity > > - [8..9]: Add read path verification to iomap > > - [10..23]: Integration of fs-verity to xfs > > > > Changes from V1: > > - Added parent pointer patches for easier testing > > - Many issues and refactoring points fixed from the V1 review > > - Adjusted for recent changes in fs-verity core (folios, non-4k) > > - Dropped disabling of large folios > > - Completely new fsverity patches (fix, callout, log_blocksize) > > - Change approach to verification in iomap to the same one as in > > write path. Callouts to fs instead of direct fs-verity use. > > - New XFS workqueue for post read folio verification > > - xfs_attr_get() can return underlying xfs_buf > > - xfs_bufs are marked with XBF_VERITY_CHECKED to track verified > > blocks > > > > kernel: > > [1]: https://github.com/alberand/linux/tree/xfs-verity-v2 > > > > xfsprogs: > > [2]: https://github.com/alberand/xfsprogs/tree/fsverity-v2 > > Will there any means for xfs_repair to check the merkle tree contents? > Should it clear the ondisk inode flag if it decides to trash the xattr > structure, or is it ok to let the kernel deal with flag set and no > verity data? The fsverity-util can calculate merkle tree offline, so, it's possible for xfs_repair to do the same and compare, also it can check that all merkle tree blocks are there. The flag without tree is probably bad as all reading ops will fail and it won't be possible to regenerate the tree (enable also checks for flag).
Dave is going to hate me for this, but.. I've been looking over some of the interfaces here, and I'm starting to very seriously questioning the design decisions of storing the fsverity hashes in xattrs. Yes, storing them beyond i_size in the file is a bit of a hack, but it allows to reuse a lot of the existing infrastructure, and much of fsverity is based around it. So storing them in an xattrs causes a lot of churn in the interface. And the XFS side with special casing xattr indices also seems not exactly nice.
On Mon, Apr 10, 2023 at 10:19:46PM -0700, Christoph Hellwig wrote: > Dave is going to hate me for this, but.. > > I've been looking over some of the interfaces here, and I'm starting > to very seriously questioning the design decisions of storing the > fsverity hashes in xattrs. > > Yes, storing them beyond i_size in the file is a bit of a hack, but > it allows to reuse a lot of the existing infrastructure, and much > of fsverity is based around it. So storing them in an xattrs causes > a lot of churn in the interface. And the XFS side with special > casing xattr indices also seems not exactly nice. It seems it's really just the Merkle tree caching interface that is causing problems, as it's currently too closely tied to the page cache? That is just an implementation detail that could be reworked along the lines of what is being discussed. But anyway, it is up to the XFS folks. Keep in mind there is also the option of doing what btrfs is doing, where it stores the Merkle tree separately from the file data stream, but caches it past i_size in the page cache at runtime. I guess there is also the issue of encryption, which hasn't come up yet since we're talking about fsverity support only. The Merkle tree (including the fsverity_descriptor) is supposed to be encrypted, just like the file contents are. Having it be stored after the file contents accomplishes that easily... Of course, it doesn't have to be that way; a separate key could be derived, or the Merkle tree blocks could be encrypted with the file contents key using indices past i_size, without them physically being stored in the data stream. - Eric
On Tue, Apr 11, 2023 at 07:33:19PM -0700, Eric Biggers wrote: > On Mon, Apr 10, 2023 at 10:19:46PM -0700, Christoph Hellwig wrote: > > Dave is going to hate me for this, but.. > > > > I've been looking over some of the interfaces here, and I'm starting > > to very seriously questioning the design decisions of storing the > > fsverity hashes in xattrs. > > > > Yes, storing them beyond i_size in the file is a bit of a hack, but > > it allows to reuse a lot of the existing infrastructure, and much > > of fsverity is based around it. So storing them in an xattrs causes > > a lot of churn in the interface. And the XFS side with special > > casing xattr indices also seems not exactly nice. > > It seems it's really just the Merkle tree caching interface that is causing > problems, as it's currently too closely tied to the page cache? That is just an > implementation detail that could be reworked along the lines of what is being > discussed. > > But anyway, it is up to the XFS folks. Keep in mind there is also the option of > doing what btrfs is doing, where it stores the Merkle tree separately from the > file data stream, but caches it past i_size in the page cache at runtime. Right. It's not entirely simple to store metadata on disk beyond EOF in XFS because of all the assumptions throughout the IO path and allocator interfaces that it can allocate space beyond EOF at will and something else will clean it up later if it is not needed. This impacts on truncate, delayed allocation, writeback, IO completion, EOF block removal on file close, background garbage collection, ENOSPC/EDQUOT driven space freeing, etc. Some of these things cross over into iomap infrastructure, too. AFAIC, it's far more intricate, complex and risky to try to store merkle tree data beyond EOF than it is to put it in an xattr namespace because IO path EOF handling bugs result in user data corruption. This happens over and over again, no matter how careful we are about these aspects of user data handling. OTOH, putting the merkle tree data in a different namespace avoids these issues completely. Yes, we now have to solve an API mismatch, but we aren't risking the addition of IO path data corruption bugs to every non-fsverity filesystem in production... Hence I think copying the btrfs approach (i.e. only caching the merkle tree data in the page cache beyond EOF) would be as far as I think we'd want to go. Realistically, there would be little practical difference between btrfs storing the merkle tree blocks in a separate internal btree and XFS storing them in an internal private xattr btree namespace. I would, however, prefer not to have to do this at all if we could simply map the blocks directly out of the xattr buffers as we already do internally for all the XFS code... > I guess there is also the issue of encryption, which hasn't come up yet since > we're talking about fsverity support only. The Merkle tree (including the > fsverity_descriptor) is supposed to be encrypted, just like the file contents > are. Having it be stored after the file contents accomplishes that easily... > Of course, it doesn't have to be that way; a separate key could be derived, or > the Merkle tree blocks could be encrypted with the file contents key using > indices past i_size, without them physically being stored in the data stream. I'm expecting that fscrypt for XFS will include encryption of the xattr names and values (just like we will need to do for directory names) except for the xattrs that hold the encryption keys themselves. That means the merkle tree blocks should get encrypted without any extra work needing to be done anywhere. This will simply require the fscrypt keys to be held in a private internal xattr namespace that isn't encrypted, but that's realtively trivial to do... Cheers, Dave.
On Tue, Apr 11, 2023 at 07:33:19PM -0700, Eric Biggers wrote: > It seems it's really just the Merkle tree caching interface that is causing > problems, as it's currently too closely tied to the page cache? That is just an > implementation detail that could be reworked along the lines of what is being > discussed. Well, that and some of the XFS internal changes that seem a bit ugly. But it's not only very much tied to the page cache, but also to page aligned data, which is really part of the problem. > But anyway, it is up to the XFS folks. Keep in mind there is also the option of > doing what btrfs is doing, where it stores the Merkle tree separately from the > file data stream, but caches it past i_size in the page cache at runtime. That seems to be the worst of both worlds. > I guess there is also the issue of encryption, which hasn't come up yet since > we're talking about fsverity support only. The Merkle tree (including the > fsverity_descriptor) is supposed to be encrypted, just like the file contents > are. Having it be stored after the file contents accomplishes that easily... > Of course, it doesn't have to be that way; a separate key could be derived, or > the Merkle tree blocks could be encrypted with the file contents key using > indices past i_size, without them physically being stored in the data stream. xattrs contents better be encrypted as well, fsverity or not.
On Wed, Apr 12, 2023 at 01:18:26PM +1000, Dave Chinner wrote: > Right. It's not entirely simple to store metadata on disk beyond EOF > in XFS because of all the assumptions throughout the IO path and > allocator interfaces that it can allocate space beyond EOF at will > and something else will clean it up later if it is not needed. This > impacts on truncate, delayed allocation, writeback, IO completion, > EOF block removal on file close, background garbage collection, > ENOSPC/EDQUOT driven space freeing, etc. Some of these things cross > over into iomap infrastructure, too. To me that actually makes it easier to support the metadata beyond i_size. Remember that the file is immutable after add fsverity hash is added. So basically we just need to skip freeing the eofblocks if that flag is set.