F21 fails to mount root part, btrfs check: Couldn't open file system

Message ID CACPiFCKx7G3pYdSs_MBne5Z6XP+zbgQK_iaNWvMQPs76R=VQNg@mail.gmail.com (mailing list archive)
State New, archived

Commit Message

Martin Langhoff April 1, 2015, 5:26 p.m. UTC
On Wed, Apr 1, 2015 at 1:03 PM, Chris Murphy <lists@colorremedies.com> wrote:
> mount /dev/sda6 /mnt
> btrfs inspect-internal inode-resolve 39841 /mnt

on the booted system...
# uname -a
Linux tp-martin.remote-learner.net 3.18.9-200.fc21.x86_64 #1 SMP Mon Mar 9 15:10:50 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
# btrfs inspect-internal inode-resolve 39841 /
//etc/shadow-
# diff -u /etc/shadow{,-}

Bizarre.

cheers,



m

Comments

Chris Murphy April 1, 2015, 6:04 p.m. UTC | #1
On Wed, Apr 1, 2015 at 11:26 AM, Martin Langhoff
<martin.langhoff@gmail.com> wrote:
> On Wed, Apr 1, 2015 at 1:03 PM, Chris Murphy <lists@colorremedies.com> wrote:
>> mount /dev/sda6 /mnt
>> btrfs inspect-internal inode-resolve 39841 /mnt
>
> on the booted system...
> # uname -a
> Linux tp-martin.remote-learner.net 3.18.9-200.fc21.x86_64 #1 SMP Mon
> Mar 9 15:10:50 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
> # btrfs inspect-internal inode-resolve 39841 /
> //etc/shadow-
> # diff -u /etc/shadow{,-}
> --- /etc/shadow 2015-03-04 02:26:59.478255332 -0500
> +++ /etc/shadow-        2015-03-04 02:26:59.000000000 -0500
> @@ -42,4 +42,3 @@
>  systemd-timesync:!!:16498::::::
>  systemd-network:!!:16498::::::
>  systemd-resolve:!!:16498::::::
> -systemd-bus-proxy:!!:16498::::::
>
> Bizarre.

When I had this same btrfs check error, it was the exact inode number
and same /etc/shadow file. I didn't diff the two shadow files, but I
did the cp mv rm routine, and then the system booted. Goofy cakes.
It's almost like an April Fools joke.
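
One plausible reading of the "cp mv rm routine" mentioned above, as a minimal sketch: copy the affected file so its contents land in a new inode, rename the copy over the original path, and remove the temporary copy if the rename fails. The function name `replace_file` and the `.new` suffix are illustrative assumptions; the exact commands used are not given in this thread.

```shell
#!/bin/sh
# Sketch only: rewrite a file into a fresh inode.
# replace_file and the ".new" suffix are hypothetical names.
replace_file() {
    target=$1
    tmp="$target.new"
    # cp: create a new file (new inode), keeping mode and timestamps
    cp -p -- "$target" "$tmp" || return 1
    # mv: rename the copy over the original path, dropping the old inode;
    # rm: clean up the temporary copy if the rename fails
    mv -f -- "$tmp" "$target" || { rm -f -- "$tmp"; return 1; }
}
```

It would be called as `replace_file /etc/shadow-`, as root, and from a rescue environment if the affected root filesystem does not boot.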
Martin Langhoff April 1, 2015, 6:16 p.m. UTC | #2
On Wed, Apr 1, 2015 at 2:04 PM, Chris Murphy <lists@colorremedies.com> wrote:
> When I had this same btrfs check error, it was the exact inode number
> and same /etc/shadow file. I didn't diff the two shadow files, but I

That's too bizarre for words. Two folks, on two different systems,
getting btrfs problems on similar kernels on the exact same filepath.
In my case, the file was last frobbed by yum/rpm. Do we have a strange
interaction between a kernel regression and yum/rpm rubbing the
filesystem the wrong way?

BTW, I did not change/touch the file at all. My only "fix" action was
the btrfs check --repair mentioned earlier. Right now, on the booted
system I did

# uname -a
Linux tp-martin.remote-learner.net 3.18.9-200.fc21.x86_64 #1 SMP Mon Mar 9 15:10:50 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
# btrfs scrub start -BrR   /
scrub done for 94637b35-a294-4be2-aa47-82c52d6d53ef
        scrub started at Wed Apr  1 13:46:20 2015 and finished after 266 seconds
        data_extents_scrubbed: 344155
        tree_extents_scrubbed: 58048
        data_bytes_scrubbed: 11896840192
        tree_bytes_scrubbed: 951058432
        read_errors: 0
        csum_errors: 0
        verify_errors: 0
        no_csum: 20268
        csum_discards: 254459
        super_errors: 0
        malloc_errors: 0
        uncorrectable_errors: 0
        unverified_errors: 0
        corrected_errors: 0
        last_physical: 23928504320

cheers,



m
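
The raw statistics printed by the scrub above can be reduced to a single number for a quick pass/fail check. This is a minimal sketch that sums every counter whose name ends in `_errors`; the helper name `scrub_error_total` is an assumption for illustration, and the input is assumed to have the `name: value` layout shown in the output above.

```shell
#!/bin/sh
# Sketch only: sum every "*_errors" counter from raw scrub statistics on stdin.
# scrub_error_total is a hypothetical helper name.
scrub_error_total() {
    # Split each line at the colon; only counters ending in "_errors" are added.
    awk -F': *' '/_errors:/ { total += $2 } END { print total + 0 }'
}
```

A possible use is `btrfs scrub start -BrR / | scrub_error_total`. As the rest of this thread shows, a total of zero here did not mean that `btrfs check` came back clean.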
Chris Murphy April 1, 2015, 6:20 p.m. UTC | #3
On Wed, Apr 1, 2015 at 12:16 PM, Martin Langhoff
<martin.langhoff@gmail.com> wrote:
> On Wed, Apr 1, 2015 at 2:04 PM, Chris Murphy <lists@colorremedies.com> wrote:
>> When I had this same btrfs check error, it was the exact inode number
>> and same /etc/shadow file. I didn't diff the two shadow files, but I
>
> That's too bizarre for words. Two folks, on two different systems,
> getting btrfs problems on similar kernels on the exact same filepath.
> In my case, the file was last frobbed by yum/rpm. Do we have a strange
> interaction between a kernel regression and yum/rpm rubbing the
> filesystem the wrong way?

No idea, but it happened to me more than once, same inode number, same file.



> BTW, I did not change/touch the file at all. My only "fix" action was
> the btrfs check --repair mentioned earlier.

That won't fix it. Once errors 400 appears, at this point you have to
replace the affected file.
Martin Langhoff April 1, 2015, 6:29 p.m. UTC | #4
On Wed, Apr 1, 2015 at 2:20 PM, Chris Murphy <lists@colorremedies.com> wrote:
> That won't fix it. Once errors 400 appears, at this point you have to
> replace the affected file.

Interesting.

Right now I am booting without problems. I have no evidence of
continued problems. What would I do to check whether I see an error
similar to yours on this fs?

Trying to ascertain whether my fs is cured, and whether we can learn
something else about this oddity...

cheers,


m
Chris Murphy April 1, 2015, 6:54 p.m. UTC | #5
On Wed, Apr 1, 2015 at 12:29 PM, Martin Langhoff
<martin.langhoff@gmail.com> wrote:
> On Wed, Apr 1, 2015 at 2:20 PM, Chris Murphy <lists@colorremedies.com> wrote:
>> That won't fix it. Once errors 400 appears, at this point you have to
>> replace the affected file.
>
> Interesting.
>
> Right now I am booting without problems. I have no evidence of
> continued problems. What would I do to check whether I see an error
> similar to yours on this fs?
>
> Trying to ascertain whether my fs is cured, and whether we can learn
> something else about this oddity...

Re-run the btrfs check. The error is still there even after a --repair.
Martin Langhoff April 1, 2015, 7:23 p.m. UTC | #6
On Wed, Apr 1, 2015 at 2:54 PM, Chris Murphy <lists@colorremedies.com> wrote:
> Re-run the btrfs check. The error is still there even after a --repair.

Bingo! You are right, the error persists.

It has no effect on my use of the system right now. Is anyone
interested in debugging this further?

cheers,



martin
Chris Murphy April 1, 2015, 8:14 p.m. UTC | #7
On Wed, Apr 1, 2015 at 1:23 PM, Martin Langhoff
<martin.langhoff@gmail.com> wrote:
> On Wed, Apr 1, 2015 at 2:54 PM, Chris Murphy <lists@colorremedies.com> wrote:
>> Re-run the btrfs check. The error is still there even after a --repair.
>
> Bingo! You are right the error persists.
>
> It has no effect on my use of the system right now. Is anyone
> interested in debugging this further?

400 errors, nbytes wrong, isn't repaired by current btrfs check
https://bugzilla.kernel.org/show_bug.cgi?id=90071

Here's what's interesting in that bug report, which I'd forgotten about:

># btrfs inspect inode 804 /mnt/root
>/mnt/root/etc/shadow-

Different inode number, but the shadow file is affected. In every
single case I've had now (about half a dozen) with this errors 400
message, it's involved the shadow file. I have no idea what's going on
between Btrfs and the shadow file, but something seems to be. Or it's
quite a coincidence.
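
To see whether other systems show the same pattern, the flagged inode numbers can be pulled out of saved `btrfs check` output and then resolved to paths. This is a minimal sketch under one loud assumption: that each finding is a line containing the words `inode <N> errors 400`. That line format is assumed here, not quoted from this thread, and the helper name `inodes_with_errors_400` is hypothetical.

```shell
#!/bin/sh
# Sketch only: print inode numbers that btrfs check flags with "errors 400".
# ASSUMPTION: each finding is a line containing "inode <N> errors 400".
# inodes_with_errors_400 is a hypothetical helper name.
inodes_with_errors_400() {
    awk '{
        for (i = 1; i + 3 <= NF; i++) {
            if ($i == "inode" && $(i + 2) == "errors") {
                code = $(i + 3)
                sub(/,$/, "", code)      # drop a trailing comma, if any
                if (code == "400") print $(i + 1)
            }
        }
    }' | sort -un
}
```

A possible use is `btrfs check /dev/sda6 2>&1 | inodes_with_errors_400` on the unmounted device, followed by `btrfs inspect-internal inode-resolve <N> /mnt` for each number once the filesystem is mounted, as done earlier in this thread.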
Chris Murphy April 1, 2015, 8:21 p.m. UTC | #8
Related bugs:

https://bugzilla.kernel.org/show_bug.cgi?id=68411
https://bugzilla.redhat.com/show_bug.cgi?id=1037963

The RHBZ one also mentioned the shadow file.

Anyway, it seems to be a somewhat known problem, but it's just not
known yet what causes it.

Chris Murphy
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Chris Murphy April 2, 2015, 3:29 a.m. UTC | #9
Whenever I have these boot problems, I'm noticing that sometimes the
device, /dev/sda5, is showing up with lsblk (libblkid) as
/dev/block/8:5 while everything else (not-Btrfs) on that device shows
up as /dev/sdaX. Does anyone know what that might mean?


Chris Murphy

Patch

--- /etc/shadow 2015-03-04 02:26:59.478255332 -0500
+++ /etc/shadow-        2015-03-04 02:26:59.000000000 -0500
@@ -42,4 +42,3 @@ 
 systemd-timesync:!!:16498::::::
 systemd-network:!!:16498::::::
 systemd-resolve:!!:16498::::::
-systemd-bus-proxy:!!:16498::::::