Btrfs: fix wrong dentries after fsync of file that got its parent replaced
diff mbox series

Message ID 20181009140529.21196-1-fdmanana@kernel.org
State New
Headers show
Series
  • Btrfs: fix wrong dentries after fsync of file that got its parent replaced
Related show

Commit Message

Filipe Manana Oct. 9, 2018, 2:05 p.m. UTC
From: Filipe Manana <fdmanana@suse.com>

In a scenario like the following:

  mkdir /mnt/A               # inode 258
  mkdir /mnt/B               # inode 259
  touch /mnt/B/bar           # inode 260

  sync

  mv /mnt/B/bar /mnt/A/bar
  mv -T /mnt/A /mnt/B
  fsync /mnt/B/bar

  <power fail>

After replaying the log we end up with file bar having 2 hard links, both
with the name 'bar' and one in the directory with inode number 258 and the
other in the directory with inode number 259. Also, we end up with the
directory inode 259 still existing and with the directory inode 258 still
named as 'A', instead of 'B'. In this scenario, file 'bar' should only
have one hard link, located at directory inode 258, the directory inode
259 should not exist anymore and the name for directory inode 258 should
be 'B'.

This incorrect behaviour happens because when attempting to log the old
parents of an inode, we skip any parents that no longer exist. Fix this
by forcing a full commit if an old parent no longer exists.

A test case for fstests follows soon.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/tree-log.c | 30 +++++++++++++++++++++++++++---
 1 file changed, 27 insertions(+), 3 deletions(-)

Comments

David Sterba Oct. 9, 2018, 4:18 p.m. UTC | #1
On Tue, Oct 09, 2018 at 03:05:29PM +0100, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
> 
> In a scenario like the following:
> 
>   mkdir /mnt/A               # inode 258
>   mkdir /mnt/B               # inode 259
>   touch /mnt/B/bar           # inode 260
> 
>   sync
> 
>   mv /mnt/B/bar /mnt/A/bar
>   mv -T /mnt/A /mnt/B
>   fsync /mnt/B/bar
> 
>   <power fail>
> 
> After replaying the log we end up with file bar having 2 hard links, both
> with the name 'bar' and one in the directory with inode number 258 and the
> other in the directory with inode number 259. Also, we end up with the
> directory inode 259 still existing and with the directory inode 258 still
> named as 'A', instead of 'B'. In this scenario, file 'bar' should only
> have one hard link, located at directory inode 258, the directory inode
> 259 should not exist anymore and the name for directory inode 258 should
> be 'B'.
> 
> This incorrect behaviour happens because when attempting to log the old
> parents of an inode, we skip any parents that no longer exist. Fix this
> by forcing a full commit if an old parent no longer exists.
> 
> A test case for fstests follows soon.
> 
> Signed-off-by: Filipe Manana <fdmanana@suse.com>

Thanks, queued for 4.20 with CC: stable 4.4+ .

Patch
diff mbox series

diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 5e83991eb064..eeeeed455dba 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -5580,9 +5580,33 @@  static int btrfs_log_all_parents(struct btrfs_trans_handle *trans,
 
 			dir_inode = btrfs_iget(fs_info->sb, &inode_key,
 					       root, NULL);
-			/* If parent inode was deleted, skip it. */
-			if (IS_ERR(dir_inode))
-				continue;
+			/*
+			 * If the parent inode was deleted, return an error to
+			 * fallback to a transaction commit. This is to prevent
+			 * getting an inode that was moved from one parent A to
+			 * a parent B, got its former parent A deleted and then
+			 * it got fsync'ed, from existing at both parents after
+			 * a log replay (and the old parent still existing).
+			 * Example:
+			 *
+			 * mkdir /mnt/A
+			 * mkdir /mnt/B
+			 * touch /mnt/B/bar
+			 * sync
+			 * mv /mnt/B/bar /mnt/A/bar
+			 * mv -T /mnt/A /mnt/B
+			 * fsync /mnt/B/bar
+			 * <power fail>
+			 *
+			 * If we ignore the old parent B which got deleted,
+			 * after a log replay we would have file bar linked
+			 * at both parents and the old parent B would still
+			 * exist.
+			 */
+			if (IS_ERR(dir_inode)) {
+				ret = PTR_ERR(dir_inode);
+				goto out;
+			}
 
 			if (ctx)
 				ctx->log_new_dentries = false;