diff mbox

btrfs: enhance superblock checks

Message ID 1362585400-13379-1-git-send-email-dsterba@suse.cz (mailing list archive)
State New, archived
Headers show

Commit Message

David Sterba March 6, 2013, 3:56 p.m. UTC
The superblock checksum is not verified upon mount. <awkward silence>

Add that check and also reorder existing checks to a more logical
order.

Current mkfs.btrfs does not calculate the correct checksum of
super_block and thus a freshly created filesytem will fail to mount when
this patch is applied.

First transaction commit calculates correct superblock checksum and
saves it to disk.

Reproducer:
$ mfks.btrfs /dev/sda
$ mount /dev/sda /mnt
$ btrfs scrub start /mnt
$ btrfs scrub status /mnt
... super:2 ...

CC: stable@kernel.org
Signed-off-by: David Sterba <dsterba@suse.cz>
---
 fs/btrfs/disk-io.c |   80 ++++++++++++++++++++++++++++++++++++++++++----------
 1 files changed, 65 insertions(+), 15 deletions(-)

Comments

Blair Zajac March 6, 2013, 5:14 p.m. UTC | #1
On 03/06/2013 07:56 AM, David Sterba wrote:
> The superblock checksum is not verified upon mount. <awkward silence>
>
> Add that check and also reorder existing checks to a more logical
> order.
>
> Current mkfs.btrfs does not calculate the correct checksum of
> super_block and thus a freshly created filesytem will fail to mount when
> this patch is applied.
>
> First transaction commit calculates correct superblock checksum and
> saves it to disk.

> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index 7d84651..d5c710c 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -354,6 +354,42 @@ out:
>   }
>
>   /*
> + * Return 0 if the superblock checksum type matches the checksum value of that
> + * alghorithm. Pass the raw disk superblock data.

I'm not familiar with the review policy on this list, but here's a minor 
one:

s/alghorithm/algorithm/

Blair


--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Sterba March 6, 2013, 6:33 p.m. UTC | #2
On Wed, Mar 06, 2013 at 09:14:45AM -0800, Blair Zajac wrote:
> On 03/06/2013 07:56 AM, David Sterba wrote:
> >  /*
> >+ * Return 0 if the superblock checksum type matches the checksum value of that
> >+ * alghorithm. Pass the raw disk superblock data.
> 
> I'm not familiar with the review policy on this list, but here's a minor
> one:
> 
> s/alghorithm/algorithm/

Thanks, review(er)s are always welcome.

Josef, I'll not resend the patch, please pull it from branch
dev/check-super in my git repo.

david
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Zach Brown March 6, 2013, 6:56 p.m. UTC | #3
On Wed, Mar 06, 2013 at 04:56:40PM +0100, David Sterba wrote:
> The superblock checksum is not verified upon mount. <awkward silence>

Hah!

>  /*
> + * Return 0 if the superblock checksum type matches the checksum value of that
> + * alghorithm. Pass the raw disk superblock data.
> + */
> +static int btrfs_check_super_csum(char *raw_disk_sb)

I'd have it return 0 or -errno and print warnings with additional info
so that each caller doesn't have to.

> +{
> +	struct btrfs_super_block *disk_sb =
> +		(struct btrfs_super_block *)raw_disk_sb;
> +	u16 csum_type = btrfs_super_csum_type(disk_sb);

> +	if (csum_type >= ARRAY_SIZE(btrfs_csum_sizes)) {
> +		printk(KERN_ERR "btrfs: unsupported checksum algorithm %u\n",
> +				csum_type);
> +		return 1;
> +	}

Does this mean we can get rid of that BUG_ON in btrfs_super_csum_size()?

And you can move this check down in an else after the CRC32 test to get
rid of the extra exit path.  Have each case ret = , the end of the
function returns ret.

> +	if (csum_type == BTRFS_CSUM_TYPE_CRC32) {
> [...]
> +	}
> +
> +	return 1;

	int ret = 0

	if (csum_type == BTRFS_CSUM_TYPE_CRC32) {
		[..]
		if (memcmp())
			ret = -EIO; /* or whatever */
	} else if (type > array_size() {
		printk("I'm sad.");
		ret = -EBOOHOO;
	}

	return ret;

- z
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Sterba March 7, 2013, 9:42 a.m. UTC | #4
I've missed an important detail in the reproducer, sorry

On Wed, Mar 06, 2013 at 04:56:40PM +0100, David Sterba wrote:
> Reproducer:
> $ mfks.btrfs /dev/sda
> $ mount /dev/sda /mnt
> $ btrfs scrub start /mnt

sleep 5

> $ btrfs scrub status /mnt
> ... super:2 ...

otherwise the scrub status is not updated yet.

$ scrub start
scrub started on /mnt/test, fsid afd3a16e-509b-451d-8608-4d47a7584917 (pid=5029)

$ scrub status
scrub status for afd3a16e-509b-451d-8608-4d47a7584917
        no stats available
        total bytes scrubbed: 0.00 with 0 errors
$ sleep 5

$ scrub status
scrub status for afd3a16e-509b-451d-8608-4d47a7584917
        scrub started at Thu Mar  7 10:37:16 2013 and finished after 1 seconds
        total bytes scrubbed: 56.00KB with 2 errors
        error details: super=2
        corrected errors: 0, uncorrectable errors: 0, unverified errors: 0
---
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Sterba March 7, 2013, 2:52 p.m. UTC | #5
On Wed, Mar 06, 2013 at 10:56:37AM -0800, Zach Brown wrote:
> > +static int btrfs_check_super_csum(char *raw_disk_sb)
> 
> I'd have it return 0 or -errno and print warnings with additional info
> so that each caller doesn't have to.

What errno values do you suggest? For me it's 'checksum is correct:
yes/no', hence return 1/0.

I see an -EIO below, but that does not seem right here. There's a call
to btrfs_read_dev_super that would indicate an unreadable block. We
could use EFSCORRUPTED as xfs does (with value of EUCLEAN = 117) though.

As implemented now, EINVAL will make 'mount' print the instructive
message "look into the syslog", I'm not sure if we wouldn't need to
update userspace to reflect the fine grained error codes.

Ad message printk: it's a simple helper, potentially usable in other
places, I think it's better to let the caller decide whether to print
anything or not.

> > +{
> > +	struct btrfs_super_block *disk_sb =
> > +		(struct btrfs_super_block *)raw_disk_sb;
> > +	u16 csum_type = btrfs_super_csum_type(disk_sb);
> 
> > +	if (csum_type >= ARRAY_SIZE(btrfs_csum_sizes)) {
> > +		printk(KERN_ERR "btrfs: unsupported checksum algorithm %u\n",
> > +				csum_type);
> > +		return 1;
> > +	}
> 
> Does this mean we can get rid of that BUG_ON in btrfs_super_csum_size()?

Yes.

> And you can move this check down in an else after the CRC32 test to get
> rid of the extra exit path.  Have each case ret = , the end of the
> function returns ret.
> 
> > +	if (csum_type == BTRFS_CSUM_TYPE_CRC32) {
> > [...]
> > +	}
> > +
> > +	return 1;
> 
> 	int ret = 0
> 
> 	if (csum_type == BTRFS_CSUM_TYPE_CRC32) {
> 		[..]
> 		if (memcmp())
> 			ret = -EIO; /* or whatever */
> 	} else if (type > array_size() {
> 		printk("I'm sad.");
> 		ret = -EBOOHOO;
> 	}
> 
> 	return ret;

Ok, looks better structured.

david
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Zach Brown March 7, 2013, 5 p.m. UTC | #6
> What errno values do you suggest? For me it's 'checksum is correct:
> yes/no', hence return 1/0.

Oh, I have no strong preerence here.

> I see an -EIO below, but that does not seem right here. There's a call
> to btrfs_read_dev_super that would indicate an unreadable block. We
> could use EFSCORRUPTED as xfs does (with value of EUCLEAN = 117) though.
> 
> As implemented now, EINVAL will make 'mount' print the instructive
> message "look into the syslog", I'm not sure if we wouldn't need to
> update userspace to reflect the fine grained error codes.

Yeah, EINVAL seems reasonable.  mount(2):

       EINVAL source had an invalid superblock.

Documented (ish) and everything!

- z
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 7d84651..d5c710c 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -354,6 +354,42 @@  out:
 }
 
 /*
+ * Return 0 if the superblock checksum type matches the checksum value of that
+ * alghorithm. Pass the raw disk superblock data.
+ */
+static int btrfs_check_super_csum(char *raw_disk_sb)
+{
+	struct btrfs_super_block *disk_sb =
+		(struct btrfs_super_block *)raw_disk_sb;
+	u16 csum_type = btrfs_super_csum_type(disk_sb);
+
+	if (csum_type >= ARRAY_SIZE(btrfs_csum_sizes)) {
+		printk(KERN_ERR "btrfs: unsupported checksum algorithm %u\n",
+				csum_type);
+		return 1;
+	}
+
+	if (csum_type == BTRFS_CSUM_TYPE_CRC32) {
+		const int csum_size = btrfs_csum_sizes[csum_type];
+		u32 crc = ~(u32)0;
+		char result[csum_size];
+
+		/*
+		 * The super_block structure does not span the whole
+		 * BTRFS_SUPER_INFO_SIZE range, we expect that the unused space
+		 * is filled with zeros and is included in the checkum.
+		 */
+		crc = btrfs_csum_data(NULL, raw_disk_sb + BTRFS_CSUM_SIZE,
+				crc, BTRFS_SUPER_INFO_SIZE - BTRFS_CSUM_SIZE);
+		btrfs_csum_final(crc, result);
+
+		return !!memcmp(raw_disk_sb, result, csum_size);
+	}
+
+	return 1;
+}
+
+/*
  * helper to read a given tree block, doing retries as required when
  * the checksums don't match and we have alternate mirrors to try.
  */
@@ -2205,12 +2241,31 @@  int open_ctree(struct super_block *sb,
 		     fs_info, BTRFS_ROOT_TREE_OBJECTID);
 
 	invalidate_bdev(fs_devices->latest_bdev);
+
+	/*
+	 * Read super block and check the signature bytes only
+	 */
 	bh = btrfs_read_dev_super(fs_devices->latest_bdev);
 	if (!bh) {
 		err = -EINVAL;
 		goto fail_alloc;
 	}
 
+	/*
+	 * We want to check superblock checksum, the type is stored inside.
+	 * Pass the whole disk block of size BTRFS_SUPER_INFO_SIZE (4k).
+	 */
+	if (btrfs_check_super_csum(bh->b_data)) {
+		printk(KERN_ERR "btrfs: superblock checksum mismatch\n");
+		err = -EINVAL;
+		goto fail_alloc;
+	}
+
+	/*
+	 * super_copy is zeroed at allocation time and we never touch the
+	 * following bytes up to INFO_SIZE, the checksum is calculated from
+	 * the whole block of INFO_SIZE
+	 */
 	memcpy(fs_info->super_copy, bh->b_data, sizeof(*fs_info->super_copy));
 	memcpy(fs_info->super_for_commit, fs_info->super_copy,
 	       sizeof(*fs_info->super_for_commit));
@@ -2218,6 +2273,13 @@  int open_ctree(struct super_block *sb,
 
 	memcpy(fs_info->fsid, fs_info->super_copy->fsid, BTRFS_FSID_SIZE);
 
+	ret = btrfs_check_super_valid(fs_info, sb->s_flags & MS_RDONLY);
+	if (ret) {
+		printk(KERN_ERR "btrfs: superblock contains fatal errors\n");
+		err = -EINVAL;
+		goto fail_alloc;
+	}
+
 	disk_super = fs_info->super_copy;
 	if (!btrfs_super_root(disk_super))
 		goto fail_alloc;
@@ -2226,13 +2288,6 @@  int open_ctree(struct super_block *sb,
 	if (btrfs_super_flags(disk_super) & BTRFS_SUPER_FLAG_ERROR)
 		set_bit(BTRFS_FS_STATE_ERROR, &fs_info->fs_state);
 
-	ret = btrfs_check_super_valid(fs_info, sb->s_flags & MS_RDONLY);
-	if (ret) {
-		printk(KERN_ERR "btrfs: superblock contains fatal errors\n");
-		err = ret;
-		goto fail_alloc;
-	}
-
 	/*
 	 * run through our array of backup supers and setup
 	 * our ring pointer to the oldest one
@@ -3561,14 +3616,9 @@  int btrfs_read_buffer(struct extent_buffer *buf, u64 parent_transid)
 static int btrfs_check_super_valid(struct btrfs_fs_info *fs_info,
 			      int read_only)
 {
-	if (btrfs_super_csum_type(fs_info->super_copy) >= ARRAY_SIZE(btrfs_csum_sizes)) {
-		printk(KERN_ERR "btrfs: unsupported checksum algorithm\n");
-		return -EINVAL;
-	}
-
-	if (read_only)
-		return 0;
-
+	/*
+	 * Placeholder for checks
+	 */
 	return 0;
 }