[ping] btrfs: warn about RAID5/6 being experimental at mount time
diff mbox

Message ID 20170419210745.15263-1-kilobyte@angband.pl
State New
Headers show

Commit Message

Adam Borowski April 19, 2017, 9:07 p.m. UTC
Too many people come complaining about losing their data -- and indeed,
there's no warning outside a wiki and the mailing list tribal knowledge.
Message severity chosen for consistency with XFS -- "alert" makes dmesg
produce nice red background which should get the point across.

Signed-off-by: Adam Borowski <kilobyte@angband.pl>
---
Resent alone, the other patch in the original series (dropping the incompat
flag when no longer needed) is pointless on its own.

I intend to ask for inclusion of this one (or an equivalent) in 4.9, either
in Debian or via GregKH -- while for us kernels "that old" are history,
regular users expect stable releases to be free of known serious data loss
bugs.


 fs/btrfs/disk-io.c | 8 ++++++++
 1 file changed, 8 insertions(+)

Comments

Duncan April 20, 2017, 8:13 p.m. UTC | #1
Adam Borowski posted on Wed, 19 Apr 2017 23:07:45 +0200 as excerpted:

> Too many people come complaining about losing their data -- and indeed,
> there's no warning outside a wiki and the mailing list tribal knowledge.
> Message severity chosen for consistency with XFS -- "alert" makes dmesg
> produce nice red background which should get the point across.

Commenting on the idea and comment, because as a non-coder list regular, 
that's what I can evaluate fairly. =:^)

A kernel dmesg warning like this makes more sense to me than trying to 
put it in, for instance, mkfs.btrfs, because the instability is primarily 
kernel code and at least the message can stay synced with it, being 
removed when considered appropriate, unlike userspace code which can't, 
because people often run userspace and kernelspace versions well out of 
sync with each other.

> I intend to ask for inclusion of this one (or an equivalent) in 4.9,
> either in Debian or via GregKH -- while for us kernels "that old" are
> history, regular users expect stable releases to be free of known
> serious data loss bugs.

Arguably it should go in the LTS-4.4 series as well, because we at least 
try to support the last two LTS series on-list, more or less giving up 
beyond that, and that's the relatively new 4.9 and the now going stale 
but we really should be still trying to support it 4.4.  Older than that, 
4.1 was the only LTS after initial code completion, but since it should 
be simple enough even before that and certainly would be true, queuing 
the patch for any still being updated LTS back to initial partial support 
(3.9 IIRC) is arguably worthwhile.
Sargun Dhillon April 20, 2017, 8:16 p.m. UTC | #2
On Thu, Apr 20, 2017 at 3:13 PM, Duncan <1i5t5.duncan@cox.net> wrote:
> Adam Borowski posted on Wed, 19 Apr 2017 23:07:45 +0200 as excerpted:
>
>> Too many people come complaining about losing their data -- and indeed,
>> there's no warning outside a wiki and the mailing list tribal knowledge.
>> Message severity chosen for consistency with XFS -- "alert" makes dmesg
>> produce nice red background which should get the point across.
>
> Commenting on the idea and comment, because as a non-coder list regular,
> that's what I can evaluate fairly. =:^)
>
> A kernel dmesg warning like this makes more sense to me than trying to
> put it in, for instance, mkfs.btrfs, because the instability is primarily
> kernel code and at least the message can stay synced with it, being
> removed when considered appropriate, unlike userspace code which can't,
> because people often run userspace and kernelspace versions well out of
> sync with each other.
>
Seconded. As someone who's been trying to get BtrFS adopted, the
biggest hurdle has been around perception. Rarely do people use the
userspace tools directly, but rather through multiple layers of
abstraction where they don't see any warnings coming from it. I think
adding these warnings to kernel logs is an excellent suggestion.

>> I intend to ask for inclusion of this one (or an equivalent) in 4.9,
>> either in Debian or via GregKH -- while for us kernels "that old" are
>> history, regular users expect stable releases to be free of known
>> serious data loss bugs.
>
> Arguably it should go in the LTS-4.4 series as well, because we at least
> try to support the last two LTS series on-list, more or less giving up
> beyond that, and that's the relatively new 4.9 and the now going stale
> but we really should be still trying to support it 4.4.  Older than that,
> 4.1 was the only LTS after initial code completion, but since it should
> be simple enough even before that and certainly would be true, queuing
> the patch for any still being updated LTS back to initial partial support
> (3.9 IIRC) is arguably worthwhile.
>
> --
> Duncan - List replies preferred.   No HTML msgs.
> "Every nonfree program has a lord, a master --
> and if you use the program, he is your master."  Richard Stallman
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Adam Borowski April 26, 2017, 2:14 a.m. UTC | #3
On Wed, Apr 19, 2017 at 11:07:45PM +0200, Adam Borowski wrote:
> Too many people come complaining about losing their data -- and indeed,
> there's no warning outside a wiki and the mailing list tribal knowledge.
> Message severity chosen for consistency with XFS -- "alert" makes dmesg
> produce nice red background which should get the point across.
...
> I intend to ask for inclusion of this one (or an equivalent) in 4.9, either
> in Debian or via GregKH -- while for us kernels "that old" are history,
> regular users expect stable releases to be free of known serious data loss
> bugs.

Hi guys, could you please comment?  While there's only relatively little
urgency for mainline (heck, it'd be best if the warning was not needed at
all!), there's a Debian release close by, and it's be grossly inresponsible
to not let people know that a feature advertised in the documentation is in
an unusable state (especially as of 4.9).  For you, filesystem developers,
a way of thinking that "the user should do research" might be acceptable,
but once it filters down to a stable release, the user expects no known
serious bugs.

And here the severity is "critical -- causes serious data loss".

> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index e54844767fe5..e7f91f70e149 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -3083,6 +3083,14 @@ int open_ctree(struct super_block *sb,
>  		btrfs_set_opt(fs_info->mount_opt, SSD);
>  	}
>  
> +	if ((fs_info->avail_data_alloc_bits |
> +	     fs_info->avail_metadata_alloc_bits |
> +	     fs_info->avail_system_alloc_bits) &
> +	    BTRFS_BLOCK_GROUP_RAID56_MASK) {
> +		btrfs_alert(fs_info,
> +		"btrfs RAID5/6 is EXPERIMENTAL and has known data-loss bugs");
> +	}
> +
>  	/*
>  	 * Mount does not set all options immediately, we can do it now and do
>  	 * not have to wait for transaction commit
> -- 

Doing this in the kernel should be better than in userspace (like
https://patchwork.kernel.org/patch/9450035/) as it can deal with a future
kernel with working RAID5/6 on old -progs; but if you prefer, I can finish
that patch and request its inclusion in Debian stretch -progs instead or in
addition to the above warning in the kernel.

ᛗᛖᛟᚹ!
David Sterba May 5, 2017, 7:45 p.m. UTC | #4
On Wed, Apr 26, 2017 at 04:14:16AM +0200, Adam Borowski wrote:
> On Wed, Apr 19, 2017 at 11:07:45PM +0200, Adam Borowski wrote:
> > Too many people come complaining about losing their data -- and indeed,
> > there's no warning outside a wiki and the mailing list tribal knowledge.
> > Message severity chosen for consistency with XFS -- "alert" makes dmesg
> > produce nice red background which should get the point across.
> ...
> > I intend to ask for inclusion of this one (or an equivalent) in 4.9, either
> > in Debian or via GregKH -- while for us kernels "that old" are history,
> > regular users expect stable releases to be free of known serious data loss
> > bugs.
> 
> Hi guys, could you please comment?  While there's only relatively little
> urgency for mainline (heck, it'd be best if the warning was not needed at
> all!), there's a Debian release close by, and it's be grossly inresponsible
> to not let people know that a feature advertised in the documentation is in
> an unusable state (especially as of 4.9).  For you, filesystem developers,
> a way of thinking that "the user should do research" might be acceptable,
> but once it filters down to a stable release, the user expects no known
> serious bugs.

There are some raid56 fixes in 4.12, but IIRC the write hole is still
unfixed so the warning is still valid even for 4.12. It would be easier
to get the patch to 4.4 or 4.9 once it's in Linus tree.

> 
> And here the severity is "critical -- causes serious data loss".
> 
> > diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> > index e54844767fe5..e7f91f70e149 100644
> > --- a/fs/btrfs/disk-io.c
> > +++ b/fs/btrfs/disk-io.c
> > @@ -3083,6 +3083,14 @@ int open_ctree(struct super_block *sb,
> >  		btrfs_set_opt(fs_info->mount_opt, SSD);
> >  	}
> >  
> > +	if ((fs_info->avail_data_alloc_bits |
> > +	     fs_info->avail_metadata_alloc_bits |
> > +	     fs_info->avail_system_alloc_bits) &
> > +	    BTRFS_BLOCK_GROUP_RAID56_MASK) {
> > +		btrfs_alert(fs_info,
> > +		"btrfs RAID5/6 is EXPERIMENTAL and has known data-loss bugs");
> > +	}
> > +
> >  	/*
> >  	 * Mount does not set all options immediately, we can do it now and do
> >  	 * not have to wait for transaction commit
> > -- 
> 
> Doing this in the kernel should be better than in userspace (like
> https://patchwork.kernel.org/patch/9450035/) as it can deal with a future
> kernel with working RAID5/6 on old -progs; but if you prefer, I can finish
> that patch and request its inclusion in Debian stretch -progs instead or in
> addition to the above warning in the kernel.

I'll have look again.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Adam Borowski May 9, 2017, 1:49 a.m. UTC | #5
On Fri, May 05, 2017 at 09:45:30PM +0200, David Sterba wrote:
> On Wed, Apr 26, 2017 at 04:14:16AM +0200, Adam Borowski wrote:
> > On Wed, Apr 19, 2017 at 11:07:45PM +0200, Adam Borowski wrote:
> > > Too many people come complaining about losing their data -- and indeed,
> > > there's no warning outside a wiki and the mailing list tribal knowledge.
> > > Message severity chosen for consistency with XFS -- "alert" makes dmesg
> > > produce nice red background which should get the point across.
> > ...
> > > I intend to ask for inclusion of this one (or an equivalent) in 4.9, either
> > > in Debian or via GregKH -- while for us kernels "that old" are history,
> > > regular users expect stable releases to be free of known serious data loss
> > > bugs.
> > 
> There are some raid56 fixes in 4.12, but IIRC the write hole is still
> unfixed so the warning is still valid even for 4.12. It would be easier
> to get the patch to 4.4 or 4.9 once it's in Linus tree.

I've taken pre-4.12 for a spin (mason/for-linus-4.12 atop of midway merge
window v4.11-10603-g13e098814037), and it indeed succeeds on a number of
tests I've thrown at it that 4.11 fails.  My tests were not exhaustive
(corruption at rest only, with no unclean shutdowns) but it looks good so
far.  So implementation bugs are getting ironed out; 4.9..4.11 had no
improvements but 4.12 is a nice step forward.

Still dies horribly on 32-bit.

Write hole is pretty nasty for metadata (likely to cause total filesystem
loss) but when on -draid{5,6} -mraid{1,10} it's nowhere as bad.  So for 4.12
it might be ok to put up big warnings only for metadata.  On the other hand,
data loss limited to 1-2 files is still data loss -- CoW is supposed to
never damage files already written.

A real fix is obviously better than slapping on warnings.

> > And here the severity is "critical -- causes serious data loss".

As for 4.9, though, the Debian release is coming very soon, and kernel
updates there require far more effort than the usual every 4-8 days GregKH
point release.  So I'd need to harass Ben Hutchings really soon (and
possibly it's already too late).

> > > +	if ((fs_info->avail_data_alloc_bits |
> > > +	     fs_info->avail_metadata_alloc_bits |
> > > +	     fs_info->avail_system_alloc_bits) &
> > > +	    BTRFS_BLOCK_GROUP_RAID56_MASK) {
> > > +		btrfs_alert(fs_info,
> > > +		"btrfs RAID5/6 is EXPERIMENTAL and has known data-loss bugs");

> > Doing this in the kernel should be better than in userspace (like
> > https://patchwork.kernel.org/patch/9450035/) as it can deal with a future
> > kernel with working RAID5/6 on old -progs; but if you prefer, I can finish
> > that patch and request its inclusion in Debian stretch -progs instead or in
> > addition to the above warning in the kernel.
> 
> I'll have look again.

I haven't addresses the concerns for the -progs patch -- at this moment
having you look there would be a waste of time.  So the question is: do you
want the warning to be in kernel (where it won't get out of sync), progs
(where it might be easier to notice) or both?


Meow!
Goffredo Baroncelli May 9, 2017, 7:34 p.m. UTC | #6
Hi,

On 2017-05-09 03:49, Adam Borowski wrote:
> Write hole is pretty nasty for metadata (likely to cause total filesystem
> loss) but when on -draid{5,6} -mraid{1,10} it's nowhere as bad.  So for 4.12
> it might be ok to put up big warnings only for metadata.  On the other hand,
> data loss limited to 1-2 files is still data loss -- CoW is supposed to
> never damage files already written.
> 
> A real fix is obviously better than slapping on warnings.

The write hole is a real concern ? Only in the last year the linux MD raid implementation has gained a journal to avoid this problem. This means the in the last 15-years (or even more) this problem was here but its severity was "acceptable".

What I am trying to say, is that until the kernel 4.12, btrfs  had several bugs which prevent to work even the basic raid5/6 functionalities (i.e. rebuild). This is a thing that the user should be warned. Because these kinds of bugs are unexpected by a "stable filesystem".

But the raid5/6 write hole is "defect" of all raid5/6 implementation. Until ZFS and the last iteration of MD, the real/only mitigation was a battery backup. In this BTRFS is not worse (nor better) than its competitor (xfs/ext3,4....). I am inclined to think that a warning for the write hole is a bit excessive.

BR
G.Baroncelli

Patch
diff mbox

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index e54844767fe5..e7f91f70e149 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3083,6 +3083,14 @@  int open_ctree(struct super_block *sb,
 		btrfs_set_opt(fs_info->mount_opt, SSD);
 	}
 
+	if ((fs_info->avail_data_alloc_bits |
+	     fs_info->avail_metadata_alloc_bits |
+	     fs_info->avail_system_alloc_bits) &
+	    BTRFS_BLOCK_GROUP_RAID56_MASK) {
+		btrfs_alert(fs_info,
+		"btrfs RAID5/6 is EXPERIMENTAL and has known data-loss bugs");
+	}
+
 	/*
 	 * Mount does not set all options immediately, we can do it now and do
 	 * not have to wait for transaction commit