Message ID | 20190402180956.28893-1-jeffm@suse.com (mailing list archive) |
---|---|
State | New, archived |
Series | [1/2] btrfs-progs: check: run delayed refs after writing out dirty block groups |
On Tue, Apr 2, 2019 at 7:29 PM <jeffm@suse.com> wrote:
>
> From: Jeff Mahoney <jeffm@suse.com>
>
> When repairing the extent tree, it's possible for delayed extents to
> be created when running btrfs_write_dirty_block_groups. We run
> delayed refs one last time in the kernel but that is missing in
> the userspace tools.
>
> That results in delayed refs getting dropped on the floor, the extent
> records not getting created, and in the next transaction, when the
> extent tree is CoW'd again, we hit the BUG_ON when we can't find
> the extent record.
>
> We can fix this by running the delayed refs after writing out the
> dirty block groups.
>
> Signed-off-by: Jeff Mahoney <jeffm@suse.com>
> ---
>  transaction.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/transaction.c b/transaction.c
> index e756db33..2f19e9c8 100644
> --- a/transaction.c
> +++ b/transaction.c
> @@ -194,6 +194,8 @@ commit_tree:
>  	ret = btrfs_run_delayed_refs(trans, -1);
>  	BUG_ON(ret);
>  	btrfs_write_dirty_block_groups(trans);
> +	ret = btrfs_run_delayed_refs(trans, -1);
> +	BUG_ON(ret);

And running delayed refs can dirty more block groups as well.
At this point shouldn't we loop running delayed refs until no more
dirty block groups exist? Just like in the kernel.

thanks

>  	__commit_transaction(trans, root);
>  	if (ret < 0)
>  		goto out;
> --
> 2.16.4
>
On 4/2/19 3:19 PM, Filipe Manana wrote:
> On Tue, Apr 2, 2019 at 7:29 PM <jeffm@suse.com> wrote:
>>
>> From: Jeff Mahoney <jeffm@suse.com>
>>
>> When repairing the extent tree, it's possible for delayed extents to
>> be created when running btrfs_write_dirty_block_groups. We run
>> delayed refs one last time in the kernel but that is missing in
>> the userspace tools.
>>
>> That results in delayed refs getting dropped on the floor, the extent
>> records not getting created, and in the next transaction, when the
>> extent tree is CoW'd again, we hit the BUG_ON when we can't find
>> the extent record.
>>
>> We can fix this by running the delayed refs after writing out the
>> dirty block groups.
>>
>> Signed-off-by: Jeff Mahoney <jeffm@suse.com>
>> ---
>>  transaction.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/transaction.c b/transaction.c
>> index e756db33..2f19e9c8 100644
>> --- a/transaction.c
>> +++ b/transaction.c
>> @@ -194,6 +194,8 @@ commit_tree:
>>  	ret = btrfs_run_delayed_refs(trans, -1);
>>  	BUG_ON(ret);
>>  	btrfs_write_dirty_block_groups(trans);
>> +	ret = btrfs_run_delayed_refs(trans, -1);
>> +	BUG_ON(ret);
>
> And running delayed refs can dirty more block groups as well.
> At this point shouldn't we loop running delayed refs until no more
> dirty block groups exist? Just like in the kernel.

Right. This is another argument for code sharing between the kernel and
userspace.

-Jeff

> thanks
>
>>  	__commit_transaction(trans, root);
>>  	if (ret < 0)
>>  		goto out;
>> --
>> 2.16.4
>>
>
>
On Wed, Apr 03, 2019 at 10:38:09PM -0400, Jeff Mahoney wrote:
> On 4/2/19 3:19 PM, Filipe Manana wrote:
> > On Tue, Apr 2, 2019 at 7:29 PM <jeffm@suse.com> wrote:
> >>
> >> From: Jeff Mahoney <jeffm@suse.com>
> >>
> >> When repairing the extent tree, it's possible for delayed extents to
> >> be created when running btrfs_write_dirty_block_groups. We run
> >> delayed refs one last time in the kernel but that is missing in
> >> the userspace tools.
> >>
> >> That results in delayed refs getting dropped on the floor, the extent
> >> records not getting created, and in the next transaction, when the
> >> extent tree is CoW'd again, we hit the BUG_ON when we can't find
> >> the extent record.
> >>
> >> We can fix this by running the delayed refs after writing out the
> >> dirty block groups.
> >>
> >> Signed-off-by: Jeff Mahoney <jeffm@suse.com>
> >> ---
> >>  transaction.c | 2 ++
> >>  1 file changed, 2 insertions(+)
> >>
> >> diff --git a/transaction.c b/transaction.c
> >> index e756db33..2f19e9c8 100644
> >> --- a/transaction.c
> >> +++ b/transaction.c
> >> @@ -194,6 +194,8 @@ commit_tree:
> >>  	ret = btrfs_run_delayed_refs(trans, -1);
> >>  	BUG_ON(ret);
> >>  	btrfs_write_dirty_block_groups(trans);
> >> +	ret = btrfs_run_delayed_refs(trans, -1);
> >> +	BUG_ON(ret);
> >
> > And running delayed refs can dirty more block groups as well.
> > At this point shouldn't we loop running delayed refs until no more
> > dirty block groups exist? Just like in the kernel.
>
> Right. This is another argument for code sharing between the kernel and
> userspace.

Sharing code in this function would be really hard, I've implemented the
loop in commit in progs.
On Wed, May 15, 2019 at 3:15 PM David Sterba <dsterba@suse.cz> wrote:
>
> On Wed, Apr 03, 2019 at 10:38:09PM -0400, Jeff Mahoney wrote:
> > On 4/2/19 3:19 PM, Filipe Manana wrote:
> > > On Tue, Apr 2, 2019 at 7:29 PM <jeffm@suse.com> wrote:
> > >>
> > >> From: Jeff Mahoney <jeffm@suse.com>
> > >>
> > >> When repairing the extent tree, it's possible for delayed extents to
> > >> be created when running btrfs_write_dirty_block_groups. We run
> > >> delayed refs one last time in the kernel but that is missing in
> > >> the userspace tools.
> > >>
> > >> That results in delayed refs getting dropped on the floor, the extent
> > >> records not getting created, and in the next transaction, when the
> > >> extent tree is CoW'd again, we hit the BUG_ON when we can't find
> > >> the extent record.
> > >>
> > >> We can fix this by running the delayed refs after writing out the
> > >> dirty block groups.
> > >>
> > >> Signed-off-by: Jeff Mahoney <jeffm@suse.com>
> > >> ---
> > >>  transaction.c | 2 ++
> > >>  1 file changed, 2 insertions(+)
> > >>
> > >> diff --git a/transaction.c b/transaction.c
> > >> index e756db33..2f19e9c8 100644
> > >> --- a/transaction.c
> > >> +++ b/transaction.c
> > >> @@ -194,6 +194,8 @@ commit_tree:
> > >>  	ret = btrfs_run_delayed_refs(trans, -1);
> > >>  	BUG_ON(ret);
> > >>  	btrfs_write_dirty_block_groups(trans);
> > >> +	ret = btrfs_run_delayed_refs(trans, -1);
> > >> +	BUG_ON(ret);
> > >
> > > And running delayed refs can dirty more block groups as well.
> > > At this point shouldn't we loop running delayed refs until no more
> > > dirty block groups exist? Just like in the kernel.
> >
> > Right. This is another argument for code sharing between the kernel and
> > userspace.
>
> Sharing code in this function would be really hard, I've implemented the
> loop in commit in progs.

Shouldn't the new patch version be sent to the list for review?
It doesn't seem to be a trivial change on first thought.

Thanks.
On Wed, May 15, 2019 at 03:45:13PM +0100, Filipe Manana wrote:
> > > > And running delayed refs can dirty more block groups as well.
> > > > At this point shouldn't we loop running delayed refs until no more
> > > > dirty block groups exist? Just like in the kernel.
> > >
> > > Right. This is another argument for code sharing between the kernel and
> > > userspace.
> >
> > Sharing code in this function would be really hard, I've implemented the
> > loop in commit in progs.
>
> Shouldn't the new patch version be sent to the list for review?
> It doesn't seem to be a trivial change on first thought.

Ok, I've removed the patches from devel and will send them once the
release is done.
On 5/17/19 9:12 AM, David Sterba wrote:
> On Wed, May 15, 2019 at 03:45:13PM +0100, Filipe Manana wrote:
>>>>> And running delayed refs can dirty more block groups as well.
>>>>> At this point shouldn't we loop running delayed refs until no more
>>>>> dirty block groups exist? Just like in the kernel.
>>>>
>>>> Right. This is another argument for code sharing between the kernel and
>>>> userspace.
>>>
>>> Sharing code in this function would be really hard, I've implemented the
>>> loop in commit in progs.
>>
>> Shouldn't the new patch version be sent to the list for review?
>> It doesn't seem to be a trivial change on first thought.
>
> Ok, I've removed the patches from devel and will send them once the
> release is done.
>

Hi Dave -

Did this ever get revisited?

-Jeff
On Wed, Jul 24, 2019 at 09:53:33AM -0400, Jeff Mahoney wrote:
> On 5/17/19 9:12 AM, David Sterba wrote:
> > On Wed, May 15, 2019 at 03:45:13PM +0100, Filipe Manana wrote:
> >>>>> And running delayed refs can dirty more block groups as well.
> >>>>> At this point shouldn't we loop running delayed refs until no more
> >>>>> dirty block groups exist? Just like in the kernel.
> >>>>
> >>>> Right. This is another argument for code sharing between the kernel and
> >>>> userspace.
> >>>
> >>> Sharing code in this function would be really hard, I've implemented the
> >>> loop in commit in progs.
> >>
> >> Shouldn't the new patch version be sent to the list for review?
> >> It doesn't seem to be a trivial change on first thought.
> >
> > Ok, I've removed the patches from devel and will send them once the
> > release is done.
> >
>
> Hi Dave -
>
> Did this ever get revisited?

No, but Qu sent the fix, that's now in devel.
diff --git a/transaction.c b/transaction.c
index e756db33..2f19e9c8 100644
--- a/transaction.c
+++ b/transaction.c
@@ -194,6 +194,8 @@ commit_tree:
 	ret = btrfs_run_delayed_refs(trans, -1);
 	BUG_ON(ret);
 	btrfs_write_dirty_block_groups(trans);
+	ret = btrfs_run_delayed_refs(trans, -1);
+	BUG_ON(ret);
 	__commit_transaction(trans, root);
 	if (ret < 0)
 		goto out;
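
The review point raised in the thread (running delayed refs can dirty more block groups, and writing block groups can queue more delayed refs) can be illustrated with a small toy model. This is only a sketch under stated assumptions: it is not btrfs-progs code and not the fix that was eventually merged. Every name in it (struct toy_trans, the toy_* functions, the budget counter that makes the cascade die out) is hypothetical and exists only for illustration.

```c
#include <assert.h>

/* Toy model only: none of these names exist in btrfs-progs. */
struct toy_trans {
	int delayed_refs;       /* delayed refs still queued */
	int dirty_block_groups; /* block groups not yet written out */
	int budget;             /* how many more times writeback queues a ref */
};

/* Assumption: running refs changes space accounting, dirtying a block group. */
static void toy_run_delayed_refs(struct toy_trans *t)
{
	if (t->delayed_refs == 0)
		return;
	t->delayed_refs = 0;
	if (t->budget > 0)
		t->dirty_block_groups++;
}

/* Assumption: writing block groups CoWs the extent tree, queuing a new ref. */
static void toy_write_dirty_block_groups(struct toy_trans *t)
{
	if (t->dirty_block_groups == 0)
		return;
	t->dirty_block_groups = 0;
	if (t->budget > 0) {
		t->delayed_refs++;
		t->budget--;
	}
}

/* Work that would be lost if the transaction were committed right now. */
static int toy_leftover(const struct toy_trans *t)
{
	return t->delayed_refs + t->dirty_block_groups;
}

/* Before the patch: run refs once, then write block groups once. */
static int toy_commit_single_pass(struct toy_trans t)
{
	toy_run_delayed_refs(&t);
	toy_write_dirty_block_groups(&t);
	return toy_leftover(&t);
}

/* With the posted patch: one extra run of delayed refs after writeback. */
static int toy_commit_extra_run(struct toy_trans t)
{
	toy_run_delayed_refs(&t);
	toy_write_dirty_block_groups(&t);
	toy_run_delayed_refs(&t);
	return toy_leftover(&t);
}

/* The loop suggested in review: repeat until nothing is pending. */
static int toy_commit_loop(struct toy_trans t)
{
	int rounds = 0;

	do {
		toy_run_delayed_refs(&t);
		toy_write_dirty_block_groups(&t);
		rounds++;
		assert(rounds < 1000); /* the toy budget guarantees termination */
	} while (toy_leftover(&t) > 0);
	return toy_leftover(&t);
}
```

Starting from one queued ref, one dirty block group and a budget of 2, the single pass and the extra-run variant each leave one item pending at commit time in this model, while the loop finishes with nothing pending. Real code would also have to propagate errors from each step, which the toy model ignores.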