fixes for btrfs check --repair

Message ID ora96m3r2v.fsf@free.home (mailing list archive)
State Accepted

Commit Message

Alexandre Oliva Aug. 30, 2014, 4 p.m. UTC
A faulty memory module ran in one of my servers for a while,
corrupting a number of filesystems on it.  Most of the corruption is
long gone, as the filesystems (ceph osds) were reconstructed, but I
tried really hard to avoid rebuilding one 4TB filesystem from scratch,
since it was still fully operational.  I failed, but in the process I
ran into and fixed two btrfs check --repair bugs.

I gave up when removing an old snapshot caused delayed refs processing
to abort because it couldn't find a ref to delete, even though btrfs
check --repair completed successfully without fixing anything.
Mounting the apparently-clean filesystem still ran into the same
delayed refs error, and trying to map the logical extent back to a
file produced an error as well.  The filesystem was far too big to
preserve, even as metadata only, so I didn't keep a copy and proceeded
to mkfs.btrfs right away.

Here are the patches.

Comments

David Sterba Sept. 23, 2014, 3:54 p.m. UTC | #1
Hi,

On Sat, Aug 30, 2014 at 01:00:40PM -0300, Alexandre Oliva wrote:
> Here are the patches.

thanks, I've put them into the queue (branch with other fsck fixes).

Patch

check: do not dereference tree_refs as data_refs

From: Alexandre Oliva <oliva@gnu.org>

In a filesystem corrupted by a faulty memory module, btrfsck would get
very confused by accessing backrefs that weren't data backrefs as if
they were.  Besides invoking undefined behavior by accessing
potentially-uninitialized data past the end of objects, or objects
whose dynamic types are unrelated to the static types used to access
the corresponding memory, it used offsets and lengths from such fields
that did not correspond to anything in the filesystem proper.

Moving the test for full backrefs earlier, and also checking that each
backref is a data backref before casting it, avoided the crash I was
running into, but that was not enough for the repair to complete
successfully.
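
As an illustration of the pattern this change enforces, here is a
minimal standalone sketch, not the btrfs-progs code itself.  The
struct layouts are simplified stand-ins, the singly linked list
replaces list_head, and the helpers checked_data_backref() and
sum_found_data_bytes() are hypothetical names introduced only for this
example.  Only the field names is_data, full_backref, found_ref and
bytes are taken from the diff below.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the backref records (assumed layout). */
struct extent_backref {
	struct extent_backref *next;	/* plain list link, not list_head */
	unsigned int is_data:1;		/* set only for data backrefs */
	unsigned int full_backref:1;
};

struct data_backref {
	struct extent_backref node;	/* first member, so the cast is valid */
	uint64_t bytes;
	uint32_t found_ref;
};

struct tree_backref {
	struct extent_backref node;	/* same header, different payload */
	uint64_t root;
};

/*
 * Hypothetical helper: return the record as a data backref, or NULL
 * when the patch's combined test says it must be skipped.  The tag is
 * read from the common header before any data_backref field is touched.
 */
static struct data_backref *checked_data_backref(struct extent_backref *back)
{
	if (back->full_backref || !back->is_data)
		return NULL;
	return (struct data_backref *)back;
}

/* Hypothetical walker: add up bytes of data backrefs with a found ref. */
static uint64_t sum_found_data_bytes(struct extent_backref *head)
{
	struct extent_backref *back;
	uint64_t total = 0;

	for (back = head; back; back = back->next) {
		struct data_backref *dback = checked_data_backref(back);

		if (!dback || dback->found_ref == 0)
			continue;
		total += dback->bytes;
	}
	return total;
}
```

With the test placed after the cast, as in the old code, reading
found_ref from a tree backref would access memory that is not part of
that object.  Testing the common header first keeps every later field
access within an object of the right type.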

Signed-off-by: Alexandre Oliva <oliva@gnu.org>
---
 cmds-check.c |   19 ++++++++++++-------
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/cmds-check.c b/cmds-check.c
index 66c982f..319dd2b 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -4781,15 +4781,17 @@  static int verify_backrefs(struct btrfs_trans_handle *trans,
 		return 0;
 
 	list_for_each_entry(back, &rec->backrefs, list) {
+		if (back->full_backref || !back->is_data)
+			continue;
+
 		dback = (struct data_backref *)back;
+
 		/*
 		 * We only pay attention to backrefs that we found a real
 		 * backref for.
 		 */
 		if (dback->found_ref == 0)
 			continue;
-		if (back->full_backref)
-			continue;
 
 		/*
 		 * For now we only catch when the bytes don't match, not the
@@ -4905,6 +4907,9 @@  static int verify_backrefs(struct btrfs_trans_handle *trans,
 	 * references and fix up the ones that don't match.
 	 */
 	list_for_each_entry(back, &rec->backrefs, list) {
+		if (back->full_backref || !back->is_data)
+			continue;
+
 		dback = (struct data_backref *)back;
 
 		/*
@@ -4913,8 +4918,6 @@  static int verify_backrefs(struct btrfs_trans_handle *trans,
 		 */
 		if (dback->found_ref == 0)
 			continue;
-		if (back->full_backref)
-			continue;
 
 		if (dback->bytes == best->bytes &&
 		    dback->disk_bytenr == best->bytenr)
@@ -5134,14 +5137,16 @@  static int find_possible_backrefs(struct btrfs_trans_handle *trans,
 	int ret;
 
 	list_for_each_entry(back, &rec->backrefs, list) {
+		/* Don't care about full backrefs (poor unloved backrefs) */
+		if (back->full_backref || !back->is_data)
+			continue;
+
 		dback = (struct data_backref *)back;
 
 		/* We found this one, we don't need to do a lookup */
 		if (dback->found_ref)
 			continue;
-		/* Don't care about full backrefs (poor unloved backrefs) */
-		if (back->full_backref)
-			continue;
+
 		key.objectid = dback->root;
 		key.type = BTRFS_ROOT_ITEM_KEY;
 		key.offset = (u64)-1;