diff mbox

Btrfs: try to avoid doing a search in btrfs_next_leaf

Message ID 1348516979-2343-1-git-send-email-jbacik@fusionio.com (mailing list archive)
State New, archived
Headers show

Commit Message

Josef Bacik Sept. 24, 2012, 8:02 p.m. UTC
Things like btrfs_drop_extents call btrfs_next_leaf to see if there is
anything else they need on the next leaf.  This will result in a re-search,
but if we are already at the last leaf in the tree or if the first item in
the next leaf doesn't match the objectid of the one we want we can just
return without doing the search at all.  This helps in things like fsync()
where our tree is pretty shallow and we're likely to be on the last leaf
often.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
---
 fs/btrfs/ctree.c |   27 +++++++++++++++++++++++++++
 fs/btrfs/ctree.h |    1 +
 2 files changed, 28 insertions(+), 0 deletions(-)

Comments

Alex Lyakas Sept. 30, 2012, 11:28 a.m. UTC | #1
Hi Josef,
I guess I am missing something, but I thought btrfs_next_leaf() should
just jump to the next leaf (or item, if it was added meanwhile)
irrespective of the key that is in the last slot of the current leaf.
The change you added is effective when there is a next leaf, but you
refuse to go there unless its first key has the same objectid. (I
think you use the ctree property that the key in the first node/leaf
of a tree block is equal to its parent's key).
Can you pls explain why you insist on the same objectid?

Thanks,
Alex.


On Mon, Sep 24, 2012 at 10:02 PM, Josef Bacik <jbacik@fusionio.com> wrote:
> Things like btrfs_drop_extents call btrfs_next_leaf to see if there is
> anything else they need on the next leaf.  This will result in a re-search,
> but if we are already at the last leaf in the tree or if the first item in
> the next leaf doesn't match the objectid of the one we want we can just
> return without doing the search at all.  This helps in things like fsync()
> where our tree is pretty shallow and we're likely to be on the last leaf
> often.  Thanks,
>
> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
> ---
>  fs/btrfs/ctree.c |   27 +++++++++++++++++++++++++++
>  fs/btrfs/ctree.h |    1 +
>  2 files changed, 28 insertions(+), 0 deletions(-)
>
> diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
> index 6d183f6..64ea61c 100644
> --- a/fs/btrfs/ctree.c
> +++ b/fs/btrfs/ctree.c
> @@ -2441,6 +2441,7 @@ int btrfs_search_slot(struct btrfs_trans_handle *trans, struct btrfs_root
>         lowest_level = p->lowest_level;
>         WARN_ON(lowest_level && ins_len > 0);
>         WARN_ON(p->nodes[0] != NULL);
> +       p->shecantgoanyfarthercapt = 1;
>
>         if (ins_len < 0) {
>                 lowest_unlock = 2;
> @@ -2568,6 +2569,13 @@ cow_done:
>
>                 if (level != 0) {
>                         int dec = 0;
> +
> +                       /*
> +                        * Slot is not the last in the node, we can go farther
> +                        * capt.
> +                        */
> +                       if (slot < btrfs_header_nritems(b))
> +                               p->shecantgoanyfarthercapt = 0;
>                         if (ret && slot > 0) {
>                                 dec = 1;
>                                 slot -= 1;
> @@ -5612,8 +5620,27 @@ int btrfs_next_old_leaf(struct btrfs_root *root, struct btrfs_path *path,
>         nritems = btrfs_header_nritems(path->nodes[0]);
>         if (nritems == 0)
>                 return 1;
> +       if (path->shecantgoanyfarthercapt)
> +               return 1;
> +       if (!path->nodes[1])
> +               return 1;
>
>         btrfs_item_key_to_cpu(path->nodes[0], &key, nritems - 1);
> +
> +       /*
> +        * If we have the level above us locked already just check and see if
> +        * the key in the next leaf even has the same objectid, and if not
> +        * return 1 and avoid the search.
> +        */
> +       if (path->locks[1] &&
> +           path->slots[1] + 1 < btrfs_header_nritems(path->nodes[1])) {
> +               struct btrfs_key tmp;
> +
> +               btrfs_node_key_to_cpu(path->nodes[1], &tmp,
> +                                     path->slots[1] + 1);
> +               if (key.objectid != tmp.objectid)
> +                       return 1;
> +       }
>  again:
>         level = 1;
>         next = NULL;
> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> index 6f2e7e6..2e5c6c5 100644
> --- a/fs/btrfs/ctree.h
> +++ b/fs/btrfs/ctree.h
> @@ -571,6 +571,7 @@ struct btrfs_path {
>         unsigned int skip_locking:1;
>         unsigned int leave_spinning:1;
>         unsigned int search_commit_root:1;
> +       unsigned int shecantgoanyfarthercapt:1;
>  };
>
>  /*
> --
> 1.7.7.6
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Josef Bacik Oct. 1, 2012, 4:45 p.m. UTC | #2
On Sun, Sep 30, 2012 at 05:28:23AM -0600, Alex Lyakas wrote:
> Hi Josef,
> I guess I am missing something, but I thought btrfs_next_leaf() should
> just jump to the next leaf (or item, if it was added meanwhile)
> irrespective of the key that is in the last slot of the current leaf.
> The change you added is effective when there is a next leaf, but you
> refuse to go there unless its first key has the same objectid. (I
> think you use the ctree property that the key in the first node/leaf
> of a tree block is equal to its parent's key).
> Can you pls explain why you insist on the same objectid?
> 

It's to avoid another search forward.  Unless I missed something everybody who
calls btrfs_next_leaf() only wants to walk forward based on a particular item.
If I've missed something and that's not the case then this needs to be dropped,
or we can set some flag in path to ignore this logic, either way.  Thanks,

Josef
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alex Lyakas Oct. 2, 2012, 8:17 a.m. UTC | #3
Hi Josef,
thanks for clarifying.
By "wants to walk forward based on a particular item" you mean - find
all items that have the same objectid, until objectid changes? I wrote
some code once (for educational purposes) that just walked the tree
irrespective of a particular objectid, so that's why I was somewhat
puzzled by your change. If you decide to keep your change, can you pls
update the comment of btrfs_next_leaf()/next_old_leaf().

Thanks again,
Alex.


On Mon, Oct 1, 2012 at 6:45 PM, Josef Bacik <jbacik@fusionio.com> wrote:
> On Sun, Sep 30, 2012 at 05:28:23AM -0600, Alex Lyakas wrote:
>> Hi Josef,
>> I guess I am missing something, but I thought btrfs_next_leaf() should
>> just jump to the next leaf (or item, if it was added meanwhile)
>> irrespective of the key that is in the last slot of the current leaf.
>> The change you added is effective when there is a next leaf, but you
>> refuse to go there unless its first key has the same objectid. (I
>> think you use the ctree property that the key in the first node/leaf
>> of a tree block is equal to its parent's key).
>> Can you pls explain why you insist on the same objectid?
>>
>
> It's to avoid another search forward.  Unless I missed something everybody who
> calls btrfs_next_leaf() only wants to walk forward based on a particular item.
> If I've missed something and that's not the case then this needs to be dropped,
> or we can set some flag in path to ignore this logic, either way.  Thanks,
>
> Josef
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Sterba Oct. 2, 2012, 2:25 p.m. UTC | #4
Hi,

On Mon, Sep 24, 2012 at 04:02:59PM -0400, Josef Bacik wrote:
> --- a/fs/btrfs/ctree.h
> +++ b/fs/btrfs/ctree.h
> @@ -571,6 +571,7 @@ struct btrfs_path {
>  	unsigned int skip_locking:1;
>  	unsigned int leave_spinning:1;
>  	unsigned int search_commit_root:1;
> +	unsigned int shecantgoanyfarthercapt:1;

so you did not make it to LWN's quote of the week, can you please rename
it to something sensible?

thanks,
david
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Josef Bacik Oct. 2, 2012, 2:32 p.m. UTC | #5
On Tue, Oct 02, 2012 at 08:25:38AM -0600, David Sterba wrote:
> Hi,
> 
> On Mon, Sep 24, 2012 at 04:02:59PM -0400, Josef Bacik wrote:
> > --- a/fs/btrfs/ctree.h
> > +++ b/fs/btrfs/ctree.h
> > @@ -571,6 +571,7 @@ struct btrfs_path {
> >  	unsigned int skip_locking:1;
> >  	unsigned int leave_spinning:1;
> >  	unsigned int search_commit_root:1;
> > +	unsigned int shecantgoanyfarthercapt:1;
> 
> so you did not make it to LWN's quote of the week, can you please rename
> it to something sensible?
> 

I don't see what's unsensible about it.

Josef
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Sterba Oct. 2, 2012, 3:05 p.m. UTC | #6
On Tue, Oct 02, 2012 at 10:32:32AM -0400, Josef Bacik wrote:
> On Tue, Oct 02, 2012 at 08:25:38AM -0600, David Sterba wrote:
> > Hi,
> > 
> > On Mon, Sep 24, 2012 at 04:02:59PM -0400, Josef Bacik wrote:
> > > --- a/fs/btrfs/ctree.h
> > > +++ b/fs/btrfs/ctree.h
> > > @@ -571,6 +571,7 @@ struct btrfs_path {
> > >  	unsigned int skip_locking:1;
> > >  	unsigned int leave_spinning:1;
> > >  	unsigned int search_commit_root:1;
> > > +	unsigned int shecantgoanyfarthercapt:1;
> > 
> > so you did not make it to LWN's quote of the week, can you please rename
> > it to something sensible?
> > 
> 
> I don't see what's unsensible about it.

Itsfunnybuthardtoreadanddoesnothelpunderstandingthecodewhereitsused.

david
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Josef Bacik Oct. 2, 2012, 3:27 p.m. UTC | #7
On Tue, Oct 02, 2012 at 09:05:43AM -0600, David Sterba wrote:
> On Tue, Oct 02, 2012 at 10:32:32AM -0400, Josef Bacik wrote:
> > On Tue, Oct 02, 2012 at 08:25:38AM -0600, David Sterba wrote:
> > > Hi,
> > > 
> > > On Mon, Sep 24, 2012 at 04:02:59PM -0400, Josef Bacik wrote:
> > > > --- a/fs/btrfs/ctree.h
> > > > +++ b/fs/btrfs/ctree.h
> > > > @@ -571,6 +571,7 @@ struct btrfs_path {
> > > >  	unsigned int skip_locking:1;
> > > >  	unsigned int leave_spinning:1;
> > > >  	unsigned int search_commit_root:1;
> > > > +	unsigned int shecantgoanyfarthercapt:1;
> > > 
> > > so you did not make it to LWN's quote of the week, can you please rename
> > > it to something sensible?
> > > 
> > 
> > I don't see what's unsensible about it.
> 
> Itsfunnybuthardtoreadanddoesnothelpunderstandingthecodewhereitsused.
> 

Idisagreeithinkithelpstremendouslywiththeunderstandingofthecode,youcantgoanyfarther,butfineineedotaddacommenttobtrfs_next_leafanywaysoi'llnameitsomethingelse,anysuggestions?

Josef
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Arne Jansen Oct. 2, 2012, 3:30 p.m. UTC | #8
On 02.10.2012 17:27, Josef Bacik wrote:
> On Tue, Oct 02, 2012 at 09:05:43AM -0600, David Sterba wrote:
>> On Tue, Oct 02, 2012 at 10:32:32AM -0400, Josef Bacik wrote:
>>> On Tue, Oct 02, 2012 at 08:25:38AM -0600, David Sterba wrote:
>>>> Hi,
>>>>
>>>> On Mon, Sep 24, 2012 at 04:02:59PM -0400, Josef Bacik wrote:
>>>>> --- a/fs/btrfs/ctree.h
>>>>> +++ b/fs/btrfs/ctree.h
>>>>> @@ -571,6 +571,7 @@ struct btrfs_path {
>>>>>  	unsigned int skip_locking:1;
>>>>>  	unsigned int leave_spinning:1;
>>>>>  	unsigned int search_commit_root:1;
>>>>> +	unsigned int shecantgoanyfarthercapt:1;
>>>>
>>>> so you did not make it to LWN's quote of the week, can you please rename
>>>> it to something sensible?
>>>>
>>>
>>> I don't see what's unsensible about it.
>>
>> Itsfunnybuthardtoreadanddoesnothelpunderstandingthecodewhereitsused.
>>
> 
> Idisagreeithinkithelpstremendouslywiththeunderstandingofthecode,youcantgoanyfarther,butfineineedotaddacommenttobtrfs_next_leafanywaysoi'llnameitsomethingelse,anysuggestions?
> 

wecantgoanyfarther

> Josef
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alex Lyakas Oct. 3, 2012, 12:57 p.m. UTC | #9
Hi Josef,
in send code, in full_send_tree() there is code like this:
                key.objectid = found_key.objectid;
                key.type = found_key.type;
                key.offset = found_key.offset + 1;

                ret = btrfs_next_item(send_root, path);
                if (ret < 0)
                        goto out;
                if (ret) {
                        ret  = 0;
                        break;

It wants to jump to the next greater key in the tree, whatever
objectid it has (the reason is that here we need to traverse the whole
file tree). I believe with your change it might get broken... What do
you think?

Thanks,
Alex.




On Tue, Oct 2, 2012 at 5:30 PM, Arne Jansen <sensille@gmx.net> wrote:
> On 02.10.2012 17:27, Josef Bacik wrote:
>> On Tue, Oct 02, 2012 at 09:05:43AM -0600, David Sterba wrote:
>>> On Tue, Oct 02, 2012 at 10:32:32AM -0400, Josef Bacik wrote:
>>>> On Tue, Oct 02, 2012 at 08:25:38AM -0600, David Sterba wrote:
>>>>> Hi,
>>>>>
>>>>> On Mon, Sep 24, 2012 at 04:02:59PM -0400, Josef Bacik wrote:
>>>>>> --- a/fs/btrfs/ctree.h
>>>>>> +++ b/fs/btrfs/ctree.h
>>>>>> @@ -571,6 +571,7 @@ struct btrfs_path {
>>>>>>   unsigned int skip_locking:1;
>>>>>>   unsigned int leave_spinning:1;
>>>>>>   unsigned int search_commit_root:1;
>>>>>> + unsigned int shecantgoanyfarthercapt:1;
>>>>>
>>>>> so you did not make it to LWN's quote of the week, can you please rename
>>>>> it to something sensible?
>>>>>
>>>>
>>>> I don't see what's unsensible about it.
>>>
>>> Itsfunnybuthardtoreadanddoesnothelpunderstandingthecodewhereitsused.
>>>
>>
>> Idisagreeithinkithelpstremendouslywiththeunderstandingofthecode,youcantgoanyfarther,butfineineedotaddacommenttobtrfs_next_leafanywaysoi'llnameitsomethingelse,anysuggestions?
>>
>
> wecantgoanyfarther
>
>> Josef
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Josef Bacik Oct. 3, 2012, 1:32 p.m. UTC | #10
On Wed, Oct 03, 2012 at 06:57:26AM -0600, Alex Lyakas wrote:
> Hi Josef,
> in send code, in full_send_tree() there is code like this:
>                 key.objectid = found_key.objectid;
>                 key.type = found_key.type;
>                 key.offset = found_key.offset + 1;
> 
>                 ret = btrfs_next_item(send_root, path);
>                 if (ret < 0)
>                         goto out;
>                 if (ret) {
>                         ret  = 0;
>                         break;
> 
> It wants to jump to the next greater key in the tree, whatever
> objectid it has (the reason is that here we need to traverse the whole
> file tree). I believe with your change it might get broken... What do
> you think?

Yeah you are right, I'll hold off on this for now.  Thanks,

Josef 
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
index 6d183f6..64ea61c 100644
--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -2441,6 +2441,7 @@  int btrfs_search_slot(struct btrfs_trans_handle *trans, struct btrfs_root
 	lowest_level = p->lowest_level;
 	WARN_ON(lowest_level && ins_len > 0);
 	WARN_ON(p->nodes[0] != NULL);
+	p->shecantgoanyfarthercapt = 1;
 
 	if (ins_len < 0) {
 		lowest_unlock = 2;
@@ -2568,6 +2569,13 @@  cow_done:
 
 		if (level != 0) {
 			int dec = 0;
+
+			/*
+			 * Slot is not the last in the node, we can go farther
+			 * capt.
+			 */
+			if (slot < btrfs_header_nritems(b))
+				p->shecantgoanyfarthercapt = 0;
 			if (ret && slot > 0) {
 				dec = 1;
 				slot -= 1;
@@ -5612,8 +5620,27 @@  int btrfs_next_old_leaf(struct btrfs_root *root, struct btrfs_path *path,
 	nritems = btrfs_header_nritems(path->nodes[0]);
 	if (nritems == 0)
 		return 1;
+	if (path->shecantgoanyfarthercapt)
+		return 1;
+	if (!path->nodes[1])
+		return 1;
 
 	btrfs_item_key_to_cpu(path->nodes[0], &key, nritems - 1);
+
+	/*
+	 * If we have the level above us locked already just check and see if
+	 * the key in the next leaf even has the same objectid, and if not
+	 * return 1 and avoid the search.
+	 */
+	if (path->locks[1] &&
+	    path->slots[1] + 1 < btrfs_header_nritems(path->nodes[1])) {
+		struct btrfs_key tmp;
+
+		btrfs_node_key_to_cpu(path->nodes[1], &tmp,
+				      path->slots[1] + 1);
+		if (key.objectid != tmp.objectid)
+			return 1;
+	}
 again:
 	level = 1;
 	next = NULL;
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 6f2e7e6..2e5c6c5 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -571,6 +571,7 @@  struct btrfs_path {
 	unsigned int skip_locking:1;
 	unsigned int leave_spinning:1;
 	unsigned int search_commit_root:1;
+	unsigned int shecantgoanyfarthercapt:1;
 };
 
 /*