[v4,5/5] btrfs: verity metadata orphan items

Message ID 8e7e0d3dd84f729d86e7f1a466fe8828f0e7ba58.1620241221.git.boris@bur.io (mailing list archive)
State Superseded
Series btrfs: support fsverity

Commit Message

Boris Burkov May 5, 2021, 7:20 p.m. UTC
If we don't finish creating fsverity metadata for a file, or fail to
clean up already created metadata after a failure, we could leak the
verity items.

To address this issue, we use the orphan mechanism. When we start
enabling verity on a file, we also add an orphan item for that inode.
When we are finished, we delete the orphan. However, if we are
interrupted midway, the orphan will be present at mount and we can
clean up the half-formed verity state.
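
As a rough illustration of the lifecycle described above, here is a toy
userspace model. It is only a sketch: struct toy_inode and all toy_*
functions are hypothetical names invented for this example, not btrfs
APIs, and "disk" state is just a struct in memory.

```c
#include <stdbool.h>

/* Toy in-memory stand-in for on-disk state; not a btrfs structure. */
struct toy_inode {
	bool orphan;		/* orphan item present */
	int verity_items;	/* verity items written so far */
	bool verity_enabled;	/* verity fully enabled */
};

/* Start enabling verity: drop stale items, then add the orphan. */
static void toy_begin_enable(struct toy_inode *ino)
{
	ino->verity_items = 0;
	ino->orphan = true;
}

/* Write n verity items (Merkle tree blocks, descriptor, ...). */
static void toy_write_items(struct toy_inode *ino, int n)
{
	ino->verity_items += n;
}

/* Finish successfully: mark verity enabled and delete the orphan. */
static void toy_end_enable(struct toy_inode *ino)
{
	ino->verity_enabled = true;
	ino->orphan = false;
}

/* Mount-time cleanup: an orphan means the metadata is half-formed. */
static void toy_mount_cleanup(struct toy_inode *ino)
{
	if (!ino->orphan)
		return;
	ino->verity_items = 0;
	ino->orphan = false;
}
```

Calling toy_begin_enable() and toy_write_items() and then going straight
to toy_mount_cleanup() models an interruption: the orphan is still set,
so the partial items are dropped. If toy_end_enable() runs first, the
cleanup finds no orphan and leaves the items alone.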

There is a possible race with a normal unlink operation: if unlink and
verity run on the same file in parallel, it is possible for verity to
succeed and delete the still legitimate orphan added by unlink. Then, if
we are interrupted and mount in that state, we will never clean up the
inode properly. This is also possible for a file created with O_TMPFILE.
Check nlink==0 before deleting to avoid this race.
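
The race can be sketched with another toy userspace model. The names
(struct race_inode, race_unlink, race_verity_del_orphan) and the
check_nlink switch are hypothetical and exist only for this example. The
single orphan flag is shared by unlink and verity, mirroring the fact
that the orphans are not reference counted.

```c
#include <stdbool.h>

/* Toy model: one orphan flag shared by unlink and verity. */
struct race_inode {
	unsigned int nlink;
	bool orphan;
};

/* Unlink drops a link; at zero links the inode needs an orphan. */
static void race_unlink(struct race_inode *ino)
{
	if (ino->nlink > 0)
		ino->nlink--;
	if (ino->nlink == 0)
		ino->orphan = true;
}

/*
 * Verity finishing up. With check_nlink set, an inode with no links
 * keeps its orphan, because that orphan now belongs to unlink (or to
 * O_TMPFILE). Without the check, the orphan is deleted unconditionally.
 */
static void race_verity_del_orphan(struct race_inode *ino, bool check_nlink)
{
	if (check_nlink && ino->nlink == 0)
		return;
	ino->orphan = false;
}
```

With check_nlink false, an unlink that lands between the start and end
of enabling verity loses its orphan. With the check, the orphan survives
and the inode can still be cleaned up at mount.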

A final thing to note is that this is a resurrection of using orphans to
signal orphaned metadata that isn't the inode itself. This makes the
comment discussing deprecating that concept a bit messy in full context.

Signed-off-by: Boris Burkov <boris@bur.io>
---
 fs/btrfs/inode.c  | 15 +++++++--
 fs/btrfs/verity.c | 79 ++++++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 87 insertions(+), 7 deletions(-)

Comments

David Sterba May 12, 2021, 5:48 p.m. UTC | #1
On Wed, May 05, 2021 at 12:20:43PM -0700, Boris Burkov wrote:
> +/*
> + * Helper to manage the transaction for adding an orphan item.
> + */
> +static int add_orphan(struct btrfs_inode *inode)

I wonder if this helper is useful, it's used only once and the code is
not long. Simply wrapping btrfs_orphan_add into a transaction is short
enough to be in btrfs_begin_enable_verity.

> +{
> +	struct btrfs_trans_handle *trans;
> +	struct btrfs_root *root = inode->root;
> +	int ret = 0;
> +
> +	trans = btrfs_start_transaction(root, 1);
> +	if (IS_ERR(trans)) {
> +		ret = PTR_ERR(trans);
> +		goto out;
> +	}
> +	ret = btrfs_orphan_add(trans, inode);
> +	if (ret) {
> +		btrfs_abort_transaction(trans, ret);
> +		goto out;
> +	}
> +	btrfs_end_transaction(trans);
> +
> +out:
> +	return ret;
> +}
> +
> +/*
> + * Helper to manage the transaction for deleting an orphan item.
> + */
> +static int del_orphan(struct btrfs_inode *inode)

Same here.

> +{
> +	struct btrfs_trans_handle *trans;
> +	struct btrfs_root *root = inode->root;
> +	int ret;
> +
> +	/*
> +	 * If the inode has no links, it is either already unlinked, or was
> +	 * created with O_TMPFILE. In either case, it should have an orphan from
> +	 * that other operation. Rather than reference count the orphans, we
> +	 * simply ignore them here, because we only invoke the verity path in
> +	 * the orphan logic when i_nlink is nonzero.
> +	 */
> +	if (!inode->vfs_inode.i_nlink)
> +		return 0;
> +
> +	trans = btrfs_start_transaction(root, 1);
> +	if (IS_ERR(trans))
> +		return PTR_ERR(trans);
> +
> +	ret = btrfs_del_orphan_item(trans, root, btrfs_ino(inode));
> +	if (ret) {
> +		btrfs_abort_transaction(trans, ret);
> +		return ret;
> +	}
> +
> +	btrfs_end_transaction(trans);
> +	return ret;
> +}
Boris Burkov May 12, 2021, 6:08 p.m. UTC | #2
On Wed, May 12, 2021 at 07:48:27PM +0200, David Sterba wrote:
> On Wed, May 05, 2021 at 12:20:43PM -0700, Boris Burkov wrote:
> > +/*
> > + * Helper to manage the transaction for adding an orphan item.
> > + */
> > +static int add_orphan(struct btrfs_inode *inode)
> 
> I wonder if this helper is useful, it's used only once and the code is
> not long. Simply wrapping btrfs_orphan_add into a transaction is short
> enough to be in btrfs_begin_enable_verity.
> 

I agree that just the plain transaction logic is not a big deal, and I
couldn't figure out how to phrase the comment so I left it at that,
which is unhelpful.

With that said, I found that pulling it out into a helper function
significantly reduced the gross-ness of the error handling in the
callsites. Especially for del_orphan in end verity which tries to
handle failures deleting the orphans, which quickly got tangled up with
other errors in the function and the possible transaction errors.

Honestly, I was surprised just how much it helped, and couldn't really
figure out why. If a helper being really beneficial is abnormal, I can
try again to figure out a clean way to write the code with the
transaction in-line.
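
For reference, the decision made at the end of enabling verity can be
modelled in userspace roughly as below. This is a simplified sketch, not
the kernel code: struct end_state, toy_drop_items and toy_end_enable are
hypothetical names, the -5 error value is arbitrary, and the nlink check
inside del_orphan is left out.

```c
#include <stdbool.h>

/* Toy state; not a btrfs structure. */
struct end_state {
	bool orphan;		/* orphan item still present */
	bool items_present;	/* verity items still present */
};

/* Pretend to drop the verity items; fail on request. */
static int toy_drop_items(struct end_state *st, bool fail)
{
	if (fail)
		return -5;	/* arbitrary error code for the sketch */
	st->items_present = false;
	return 0;
}

/*
 * Model of the tail of the end-enable path: on failure, drop the items,
 * and delete the orphan only if nothing is left to clean up later.
 */
static int toy_end_enable(struct end_state *st, bool have_desc, int ret,
			  bool drop_fails)
{
	int keep_orphan = 0;

	if (!have_desc || ret)
		keep_orphan = toy_drop_items(st, drop_fails);
	if (!keep_orphan)
		st->orphan = false;
	return ret;
}
```

The one-line orphan delete at the end stands in for the del_orphan
helper. In this model the caller only has to track the original ret and
the keep_orphan result, which is the untangling the helper is meant to
buy.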

> > +{
> > +	struct btrfs_trans_handle *trans;
> > +	struct btrfs_root *root = inode->root;
> > +	int ret = 0;
> > +
> > +	trans = btrfs_start_transaction(root, 1);
> > +	if (IS_ERR(trans)) {
> > +		ret = PTR_ERR(trans);
> > +		goto out;
> > +	}
> > +	ret = btrfs_orphan_add(trans, inode);
> > +	if (ret) {
> > +		btrfs_abort_transaction(trans, ret);
> > +		goto out;
> > +	}
> > +	btrfs_end_transaction(trans);
> > +
> > +out:
> > +	return ret;
> > +}
> > +
> > +/*
> > + * Helper to manage the transaction for deleting an orphan item.
> > + */
> > +static int del_orphan(struct btrfs_inode *inode)
> 
> Same here.

My comment is dumb again, but the nlink check does make this function
marginally more useful for re-use/correctness.

> 
> > +{
> > +	struct btrfs_trans_handle *trans;
> > +	struct btrfs_root *root = inode->root;
> > +	int ret;
> > +
> > +	/*
> > +	 * If the inode has no links, it is either already unlinked, or was
> > +	 * created with O_TMPFILE. In either case, it should have an orphan from
> > +	 * that other operation. Rather than reference count the orphans, we
> > +	 * simply ignore them here, because we only invoke the verity path in
> > +	 * the orphan logic when i_nlink is nonzero.
> > +	 */
> > +	if (!inode->vfs_inode.i_nlink)
> > +		return 0;
> > +
> > +	trans = btrfs_start_transaction(root, 1);
> > +	if (IS_ERR(trans))
> > +		return PTR_ERR(trans);
> > +
> > +	ret = btrfs_del_orphan_item(trans, root, btrfs_ino(inode));
> > +	if (ret) {
> > +		btrfs_abort_transaction(trans, ret);
> > +		return ret;
> > +	}
> > +
> > +	btrfs_end_transaction(trans);
> > +	return ret;
> > +}
David Sterba May 12, 2021, 11:36 p.m. UTC | #3
On Wed, May 12, 2021 at 11:08:57AM -0700, Boris Burkov wrote:
> On Wed, May 12, 2021 at 07:48:27PM +0200, David Sterba wrote:
> > On Wed, May 05, 2021 at 12:20:43PM -0700, Boris Burkov wrote:
> > > +/*
> > > + * Helper to manage the transaction for adding an orphan item.
> > > + */
> > > +static int add_orphan(struct btrfs_inode *inode)
> > 
> > I wonder if this helper is useful, it's used only once and the code is
> > not long. Simply wrapping btrfs_orphan_add into a transaction is short
> > enough to be in btrfs_begin_enable_verity.
> 
> I agree that just the plain transaction logic is not a big deal, and I
> couldn't figure out how to phrase the comment so I left it at that,
> which is unhelpful.
> 
> With that said, I found that pulling it out into a helper function
> significantly reduced the gross-ness of the error handling in the
> callsites. Especially for del_orphan in end verity which tries to
> handle failures deleting the orphans, which quickly got tangled up with
> other errors in the function and the possible transaction errors.
> 
> Honestly, I was surprised just how much it helped, and couldn't really
> figure out why. If a helper being really beneficial is abnormal, I can
> try again to figure out a clean way to write the code with the
> transaction in-line.

This gives me an impression that the helper in your view helps
readability and that's something I'm fine with. In the past we got
cleanups that remove one time helpers so I'm affected by that. Also the
helpers hide some details like the transaction start that could be
considered heavy so the helper kind of obscures that. But there's
another aspect, again readability, "do that and the caller does not need
to care", and when the helper is static in the same file it's easy to
look up and not a big deal.

> > > +{
> > > +	struct btrfs_trans_handle *trans;
> > > +	struct btrfs_root *root = inode->root;
> > > +	int ret = 0;
> > > +
> > > +	trans = btrfs_start_transaction(root, 1);
> > > +	if (IS_ERR(trans)) {
> > > +		ret = PTR_ERR(trans);
> > > +		goto out;
> > > +	}
> > > +	ret = btrfs_orphan_add(trans, inode);
> > > +	if (ret) {
> > > +		btrfs_abort_transaction(trans, ret);
> > > +		goto out;
> > > +	}
> > > +	btrfs_end_transaction(trans);
> > > +
> > > +out:
> > > +	return ret;
> > > +}
> > > +
> > > +/*
> > > + * Helper to manage the transaction for deleting an orphan item.
> > > + */
> > > +static int del_orphan(struct btrfs_inode *inode)
> > 
> > Same here.
> 
> My comment is dumb again, but the nlink check does make this function
> marginally more useful for re-use/correctness.

I don't think it's dumb, the nlink check is one line with several lines
of comment explaining and also described in the changelog as a corner
case and it's not obvious. For that reason a helper is fine and let's
keep the helpers as they are, so it's consistent. It's just when I'm
reading the code I'm questioning everything but it does not mean that
all of that needs to be done the way I see it, in the comments I'm
just exploring the possibility to do so.

Patch

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 1b1101369777..67eba8db4b65 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -3419,7 +3419,9 @@  int btrfs_orphan_cleanup(struct btrfs_root *root)
 
 		/*
 		 * If we have an inode with links, there are a couple of
-		 * possibilities. Old kernels (before v3.12) used to create an
+		 * possibilities:
+		 *
+		 * 1. Old kernels (before v3.12) used to create an
 		 * orphan item for truncate indicating that there were possibly
 		 * extent items past i_size that needed to be deleted. In v3.12,
 		 * truncate was changed to update i_size in sync with the extent
@@ -3432,13 +3434,22 @@  int btrfs_orphan_cleanup(struct btrfs_root *root)
 		 * slim, and it's a pain to do the truncate now, so just delete
 		 * the orphan item.
 		 *
+		 * 2. We were halfway through creating fsverity metadata for the
+		 * file. In that case, the orphan item represents incomplete
+		 * fsverity metadata which must be cleaned up with
+		 * btrfs_drop_verity_items.
+		 *
 		 * It's also possible that this orphan item was supposed to be
 		 * deleted but wasn't. The inode number may have been reused,
 		 * but either way, we can delete the orphan item.
 		 */
 		if (ret == -ENOENT || inode->i_nlink) {
-			if (!ret)
+			if (!ret) {
+				ret = btrfs_drop_verity_items(BTRFS_I(inode));
 				iput(inode);
+				if (ret)
+					goto out;
+			}
 			trans = btrfs_start_transaction(root, 1);
 			if (IS_ERR(trans)) {
 				ret = PTR_ERR(trans);
diff --git a/fs/btrfs/verity.c b/fs/btrfs/verity.c
index feaf5908b3d3..3a115cdca018 100644
--- a/fs/btrfs/verity.c
+++ b/fs/btrfs/verity.c
@@ -362,6 +362,64 @@  static ssize_t read_key_bytes(struct btrfs_inode *inode, u8 key_type, u64 offset
 	return ret;
 }
 
+/*
+ * Helper to manage the transaction for adding an orphan item.
+ */
+static int add_orphan(struct btrfs_inode *inode)
+{
+	struct btrfs_trans_handle *trans;
+	struct btrfs_root *root = inode->root;
+	int ret = 0;
+
+	trans = btrfs_start_transaction(root, 1);
+	if (IS_ERR(trans)) {
+		ret = PTR_ERR(trans);
+		goto out;
+	}
+	ret = btrfs_orphan_add(trans, inode);
+	if (ret) {
+		btrfs_abort_transaction(trans, ret);
+		goto out;
+	}
+	btrfs_end_transaction(trans);
+
+out:
+	return ret;
+}
+
+/*
+ * Helper to manage the transaction for deleting an orphan item.
+ */
+static int del_orphan(struct btrfs_inode *inode)
+{
+	struct btrfs_trans_handle *trans;
+	struct btrfs_root *root = inode->root;
+	int ret;
+
+	/*
+	 * If the inode has no links, it is either already unlinked, or was
+	 * created with O_TMPFILE. In either case, it should have an orphan from
+	 * that other operation. Rather than reference count the orphans, we
+	 * simply ignore them here, because we only invoke the verity path in
+	 * the orphan logic when i_nlink is nonzero.
+	 */
+	if (!inode->vfs_inode.i_nlink)
+		return 0;
+
+	trans = btrfs_start_transaction(root, 1);
+	if (IS_ERR(trans))
+		return PTR_ERR(trans);
+
+	ret = btrfs_del_orphan_item(trans, root, btrfs_ino(inode));
+	if (ret) {
+		btrfs_abort_transaction(trans, ret);
+		return ret;
+	}
+
+	btrfs_end_transaction(trans);
+	return ret;
+}
+
 /*
  * Drop verity items from the btree and from the page cache
  *
@@ -399,11 +457,12 @@  static int btrfs_begin_enable_verity(struct file *filp)
 		return -EBUSY;
 
 	set_bit(BTRFS_INODE_VERITY_IN_PROGRESS, &BTRFS_I(inode)->runtime_flags);
-	ret = drop_verity_items(BTRFS_I(inode), BTRFS_VERITY_DESC_ITEM_KEY);
+
+	ret = btrfs_drop_verity_items(BTRFS_I(inode));
 	if (ret)
 		goto err;
 
-	ret = drop_verity_items(BTRFS_I(inode), BTRFS_VERITY_MERKLE_ITEM_KEY);
+	ret = add_orphan(BTRFS_I(inode));
 	if (ret)
 		goto err;
 
@@ -430,6 +489,7 @@  static int btrfs_end_enable_verity(struct file *filp, const void *desc,
 	struct btrfs_root *root = BTRFS_I(inode)->root;
 	struct btrfs_verity_descriptor_item item;
 	int ret;
+	int keep_orphan = 0;
 
 	if (desc != NULL) {
 		/* write out the descriptor item */
@@ -461,11 +521,20 @@  static int btrfs_end_enable_verity(struct file *filp, const void *desc,
 
 out:
 	if (desc == NULL || ret) {
-		/* If we failed, drop all the verity items */
-		drop_verity_items(BTRFS_I(inode), BTRFS_VERITY_DESC_ITEM_KEY);
-		drop_verity_items(BTRFS_I(inode), BTRFS_VERITY_MERKLE_ITEM_KEY);
+		/*
+		 * If verity failed (here or in the generic code), drop all the
+		 * verity items.
+		 */
+		keep_orphan = btrfs_drop_verity_items(BTRFS_I(inode));
 	} else
 		btrfs_set_fs_compat_ro(root->fs_info, VERITY);
+	/*
+	 * If we are handling an error, but failed to drop the verity items,
+	 * we still need the orphan.
+	 */
+	if (!keep_orphan)
+		del_orphan(BTRFS_I(inode));
+
 	clear_bit(BTRFS_INODE_VERITY_IN_PROGRESS, &BTRFS_I(inode)->runtime_flags);
 	return ret;
 }