btrfs-progs: balance: Sync the fs before balancing metadata chunks

Message ID 20190129065739.31707-1-wqu@suse.com
State New, archived

Commit Message

Qu Wenruo Jan. 29, 2019, 6:57 a.m. UTC
[BUG]
Btrfs may report a false ENOSPC when balancing metadata chunks.
The following script can easily reproduce it:

  #!/bin/bash
  dev=/dev/test/test
  mnt=/mnt/btrfs

  umount $dev &> /dev/null
  umount $mnt &> /dev/null
  mkfs.btrfs -f $dev

  mount $dev $mnt
  btrfs subv create $mnt/subv
  for ((i = 0; i < 1024; i++)) do
  	xfs_io -f -c "pwrite 0 4k" $mnt/subv/file_$i > /dev/null
  done
  btrfs balance start -m $mnt

[CAUSE]
The metadata space_info::bytes_may_use is causing the problem.
In the case above, enough metadata space has to be reserved for all the
small files created.

[FIX]
The most straightforward fix is to sync the fs before balancing metadata
chunks.

We could enhance the kernel bytes_may_use calculation, but I have doubts
about the complexity.
So I take the easy fix to reduce the false ENOSPC reports.
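
As an illustration of the cause and the fix described above, here is a toy model of the accounting. It is only a sketch: the struct borrows field names from the kernel's space_info, but `space_info_model`, `can_reserve()` and `model_sync()` are hypothetical helpers, the real kernel check involves more counters, and the byte counts used with it are made up.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for the metadata space_info (assumption: only
 * three counters matter for this illustration). */
struct space_info_model {
	uint64_t total_bytes;   /* size of all metadata chunks */
	uint64_t bytes_used;    /* space holding committed metadata */
	uint64_t bytes_may_use; /* outstanding, possibly over-sized, reservations */
};

/* A new reservation only fits if used + may_use + request <= total. */
static bool can_reserve(const struct space_info_model *s, uint64_t bytes)
{
	return s->bytes_used + s->bytes_may_use + bytes <= s->total_bytes;
}

/* Model of a sync: the part of the reservation that was really needed
 * becomes used space, the over-reserved remainder is released. */
static void model_sync(struct space_info_model *s, uint64_t actually_needed)
{
	if (actually_needed > s->bytes_may_use)
		actually_needed = s->bytes_may_use;
	s->bytes_used += actually_needed;
	s->bytes_may_use = 0;
}
```

With a made-up 256MiB of metadata space, 16MiB used and 200MiB in bytes_may_use, a 64MiB request is rejected by can_reserve(). After model_sync() turns 4MiB of the reservation into used space and drops the rest, the same request fits.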

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 cmds-balance.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

Comments

David Sterba Feb. 25, 2019, 5:21 p.m. UTC | #1
On Tue, Jan 29, 2019 at 02:57:39PM +0800, Qu Wenruo wrote:
> The most straightforward is to sync the fs before balancing metadata
> chunks.
> 
> We could enhance the kernel bytes_may_use calculation, but I doubt about
> the complexity.
> So I take the easy fix to reduce the false ENOSPC reports.

Agreed.

> +	/*
> +	 * There may be many over-reserved space for metadata block groups,
> +	 * especially for inlined file extents.
> +	 *
> +	 * Do a sync here will free those over-reserved space and hugely
> +	 * reduce the possibility of some false ENOSPC
> +	 */
> +	if (args->flags & BTRFS_BALANCE_METADATA) {
> +		ret = btrfs_util_sync(path);

As the fd is already open, we should use the _fd version,

> +		if (ret) {
> +			error("failed to sync the fs before balance: %m");
> +			ret = -errno;
> +			goto out;

and possibly only warn if there's an error returned as the sync failure
is not a critical condition.

Qu Wenruo Feb. 26, 2019, 5:55 a.m. UTC | #2
On 2019/2/26 1:21 AM, David Sterba wrote:
> On Tue, Jan 29, 2019 at 02:57:39PM +0800, Qu Wenruo wrote:
>> The most straightforward is to sync the fs before balancing metadata
>> chunks.
>>
>> We could enhance the kernel bytes_may_use calculation, but I doubt about
>> the complexity.
>> So I take the easy fix to reduce the false ENOSPC reports.
> 
> Agreed.
> 
>> +	/*
>> +	 * There may be many over-reserved space for metadata block groups,
>> +	 * especially for inlined file extents.
>> +	 *
>> +	 * Do a sync here will free those over-reserved space and hugely
>> +	 * reduce the possibility of some false ENOSPC
>> +	 */
>> +	if (args->flags & BTRFS_BALANCE_METADATA) {
>> +		ret = btrfs_util_sync(path);
> 
> As the fd is already open, we should use the _fd version,
> 
>> +		if (ret) {
>> +			error("failed to sync the fs before balance: %m");
>> +			ret = -errno;
>> +			goto out;
> 
> and possibly only warn if there's an error returned as the sync failure
> is not a critical condition.

AFAIK if we can't even sync the fs, the balance is definitely going to
fail, as the most common failure mode for syncfs is an RO fs caused by
an aborted transaction.

Thus I still think we should error out. Or is there some other
non-critical failure mode I missed?

Thanks,
Qu

David Sterba Feb. 27, 2019, 4:25 p.m. UTC | #3
On Tue, Feb 26, 2019 at 01:55:06PM +0800, Qu Wenruo wrote:
> 
> 
> On 2019/2/26 1:21 AM, David Sterba wrote:
> > On Tue, Jan 29, 2019 at 02:57:39PM +0800, Qu Wenruo wrote:
> >> The most straightforward is to sync the fs before balancing metadata
> >> chunks.
> >>
> >> We could enhance the kernel bytes_may_use calculation, but I doubt about
> >> the complexity.
> >> So I take the easy fix to reduce the false ENOSPC reports.
> > 
> > Agreed.
> > 
> >> +	/*
> >> +	 * There may be many over-reserved space for metadata block groups,
> >> +	 * especially for inlined file extents.
> >> +	 *
> >> +	 * Do a sync here will free those over-reserved space and hugely
> >> +	 * reduce the possibility of some false ENOSPC
> >> +	 */
> >> +	if (args->flags & BTRFS_BALANCE_METADATA) {
> >> +		ret = btrfs_util_sync(path);
> > 
> > As the fd is already open, we should use the _fd version,
> > 
> >> +		if (ret) {
> >> +			error("failed to sync the fs before balance: %m");
> >> +			ret = -errno;
> >> +			goto out;
> > 
> > and possibly only warn if there's an error returned as the sync failure
> > is not a critical condition.
> 
> AFAIK if we can't even sync the fs, the balance is definitely going to
> fail, as the most common failure mode for syncfs is RO fs, caused by
> aborted transaction.
> 
> Thus I still think we should error out. Or is there some other
> non-critical failure mode I missed?

A read-only filesystem is checked for when balance starts; that is
where it gets reported.
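
The two positions in this thread (error out on a failed sync, or only warn and let balance report the problem) can be compared with a small sketch. It is not btrfs-progs code: `step_fn`, `sync_policy`, `sync_then_balance()` and the two stubs are hypothetical names. The stubs stand in for the sync call on the already-open fd (the "_fd version" mentioned above) and for the balance ioctl, and they assume a read-only filesystem where both steps fail with EROFS.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical callback type: returns 0 or a negative errno. */
typedef int (*step_fn)(int fd);

enum sync_policy {
	SYNC_FAIL_WARN,  /* reviewer's suggestion: warn and continue */
	SYNC_FAIL_ABORT, /* behaviour of the posted patch: error out */
};

/* Run the sync step, then the balance step, applying the chosen policy
 * when the sync step fails. */
static int sync_then_balance(int fd, step_fn do_sync, step_fn do_balance,
			     enum sync_policy policy)
{
	int ret = do_sync(fd);

	if (ret < 0) {
		if (policy == SYNC_FAIL_ABORT) {
			fprintf(stderr,
				"ERROR: failed to sync the fs before balance: %s\n",
				strerror(-ret));
			return ret;
		}
		fprintf(stderr,
			"WARNING: failed to sync the fs before balance: %s\n",
			strerror(-ret));
	}
	return do_balance(fd);
}

/* Stubs modelling a read-only filesystem (assumption for this sketch). */
static int balance_calls;

static int stub_sync_rofs(int fd)
{
	(void)fd;
	return -EROFS;
}

static int stub_balance_rofs(int fd)
{
	(void)fd;
	balance_calls++;
	return -EROFS;
}
```

In this model both policies end with -EROFS. The difference is which step reports it: with SYNC_FAIL_WARN the balance step still runs and produces the final error, while with SYNC_FAIL_ABORT it is never reached.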

Patch

diff --git a/cmds-balance.c b/cmds-balance.c
index 15dc385e..a617a1d2 100644
--- a/cmds-balance.c
+++ b/cmds-balance.c
@@ -24,6 +24,7 @@ 
 #include <sys/stat.h>
 #include <fcntl.h>
 #include <errno.h>
+#include <btrfsutil.h>
 
 #include "kerncompat.h"
 #include "ctree.h"
@@ -32,6 +33,7 @@ 
 
 #include "commands.h"
 #include "utils.h"
+#include "utils.h"
 #include "help.h"
 
 static const char * const balance_cmd_group_usage[] = {
@@ -455,6 +457,22 @@  static int do_balance(const char *path, struct btrfs_ioctl_balance_args *args,
 		printf("\nStarting balance without any filters.\n");
 	}
 
+	/*
+	 * There may be a lot of over-reserved space for metadata block groups,
+	 * especially for inlined file extents.
+	 *
+	 * Doing a sync here will free that over-reserved space and greatly
+	 * reduce the possibility of a false ENOSPC.
+	 */
+	if (args->flags & BTRFS_BALANCE_METADATA) {
+		ret = btrfs_util_sync(path);
+		if (ret) {
+			error("failed to sync the fs before balance: %m");
+			ret = -errno;
+			goto out;
+		}
+	}
+
 	ret = ioctl(fd, BTRFS_IOC_BALANCE_V2, args);
 	if (ret < 0) {
 		/*