Message ID | 20181116110845.28561-1-fdmanana@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Btrfs: bring back key search optimization to btrfs_search_old_slot() | expand |
On Fri, Nov 16, 2018 at 11:09 AM <fdmanana@kernel.org> wrote: > > From: Filipe Manana <fdmanana@suse.com> > > Commit d7396f07358a ("Btrfs: optimize key searches in btrfs_search_slot"), > dated from August 2013, introduced an optimization to search for keys in a > node/leaf to both btrfs_search_slot() and btrfs_search_old_slot(). For the > later, it ended up being reverted in commit d4b4087c43cc ("Btrfs: do a > full search everytime in btrfs_search_old_slot"), from September 2013, > because the content of extent buffers were often inconsistent during > replay. It turned out that the reason why they were often inconsistent was > because the extent buffer replay stopped being done atomically, and got > broken after commit c8cc63416537 ("Btrfs: stop using GFP_ATOMIC for the > tree mod log allocations"), introduced in July 2013. The extent buffer > replay issue was then found and fixed by commit 5de865eebb83 ("Btrfs: fix > tree mod logging"), dated from December 2013. > > So bring back the optimization to btrfs_search_old_slot() as skipping it > and its comment about disabling it confusing. After all, if unwinding > extent buffers resulted in some inconsistency, the normal searches (binary > searches) would also not always work. > > Signed-off-by: Filipe Manana <fdmanana@suse.com> David, please remove this change from the integration branch. It turns out after 3 weeks of stress tests it finally triggered an assertion failure (hard to hit) and it's indeed not reliable to use the search optimization because of how the mod log tree currently works. The idea was just to not make it different from btrfs_search_slot(). Use of the mod log tree is limited to some cases where occasional faster search wouldn't bring much benefits. Thanks. > --- > fs/btrfs/ctree.c | 8 ++------ > 1 file changed, 2 insertions(+), 6 deletions(-) > > diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c > index 089b46c4d97f..cf5487a79c96 100644 > --- a/fs/btrfs/ctree.c > +++ b/fs/btrfs/ctree.c > @@ -2966,7 +2966,7 @@ int btrfs_search_old_slot(struct btrfs_root *root, const struct btrfs_key *key, > int level; > int lowest_unlock = 1; > u8 lowest_level = 0; > - int prev_cmp = -1; > + int prev_cmp; > > lowest_level = p->lowest_level; > WARN_ON(p->nodes[0] != NULL); > @@ -2977,6 +2977,7 @@ int btrfs_search_old_slot(struct btrfs_root *root, const struct btrfs_key *key, > } > > again: > + prev_cmp = -1; > b = get_old_root(root, time_seq); > level = btrfs_header_level(b); > p->locks[level] = BTRFS_READ_LOCK; > @@ -2994,11 +2995,6 @@ int btrfs_search_old_slot(struct btrfs_root *root, const struct btrfs_key *key, > */ > btrfs_unlock_up_safe(p, level + 1); > > - /* > - * Since we can unwind ebs we want to do a real search every > - * time. > - */ > - prev_cmp = -1; > ret = key_search(b, key, level, &prev_cmp, &slot); > > if (level != 0) { > -- > 2.11.0 >
On Mon, Nov 26, 2018 at 04:53:11PM +0000, Filipe Manana wrote: > On Fri, Nov 16, 2018 at 11:09 AM <fdmanana@kernel.org> wrote: > > > > From: Filipe Manana <fdmanana@suse.com> > > > > Commit d7396f07358a ("Btrfs: optimize key searches in btrfs_search_slot"), > > dated from August 2013, introduced an optimization to search for keys in a > > node/leaf to both btrfs_search_slot() and btrfs_search_old_slot(). For the > > later, it ended up being reverted in commit d4b4087c43cc ("Btrfs: do a > > full search everytime in btrfs_search_old_slot"), from September 2013, > > because the content of extent buffers were often inconsistent during > > replay. It turned out that the reason why they were often inconsistent was > > because the extent buffer replay stopped being done atomically, and got > > broken after commit c8cc63416537 ("Btrfs: stop using GFP_ATOMIC for the > > tree mod log allocations"), introduced in July 2013. The extent buffer > > replay issue was then found and fixed by commit 5de865eebb83 ("Btrfs: fix > > tree mod logging"), dated from December 2013. > > > > So bring back the optimization to btrfs_search_old_slot() as skipping it > > and its comment about disabling it confusing. After all, if unwinding > > extent buffers resulted in some inconsistency, the normal searches (binary > > searches) would also not always work. > > > > Signed-off-by: Filipe Manana <fdmanana@suse.com> > > David, please remove this change from the integration branch. > > It turns out after 3 weeks of stress tests it finally triggered an > assertion failure (hard to hit) and > it's indeed not reliable to use the search optimization because of how > the mod log tree currently works. > The idea was just to not make it different from btrfs_search_slot(). > Use of the mod log tree is limited > to some cases where occasional faster search wouldn't bring much benefits. Understood, thanks. Patch removed from misc-next.
diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c index 089b46c4d97f..cf5487a79c96 100644 --- a/fs/btrfs/ctree.c +++ b/fs/btrfs/ctree.c @@ -2966,7 +2966,7 @@ int btrfs_search_old_slot(struct btrfs_root *root, const struct btrfs_key *key, int level; int lowest_unlock = 1; u8 lowest_level = 0; - int prev_cmp = -1; + int prev_cmp; lowest_level = p->lowest_level; WARN_ON(p->nodes[0] != NULL); @@ -2977,6 +2977,7 @@ int btrfs_search_old_slot(struct btrfs_root *root, const struct btrfs_key *key, } again: + prev_cmp = -1; b = get_old_root(root, time_seq); level = btrfs_header_level(b); p->locks[level] = BTRFS_READ_LOCK; @@ -2994,11 +2995,6 @@ int btrfs_search_old_slot(struct btrfs_root *root, const struct btrfs_key *key, */ btrfs_unlock_up_safe(p, level + 1); - /* - * Since we can unwind ebs we want to do a real search every - * time. - */ - prev_cmp = -1; ret = key_search(b, key, level, &prev_cmp, &slot); if (level != 0) {