IPoIB header prefetch for fragmented SKB

Message ID CANn89iLwF9LUohz_c765c4b+6DC0gTH77bN5QzbAkR-4LCcUog@mail.gmail.com (mailing list archive)
State Rejected

Commit Message

Eric Dumazet April 17, 2013, 7:10 p.m. UTC
Please try :

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
index 2cfa76f..9bdff21 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
@@ -112,6 +112,8 @@ static void ipoib_ud_skb_put_frags(struct ipoib_dev_priv *priv,
        if (ipoib_ud_need_sg(priv->max_ib_mtu)) {
                skb_frag_t *frag = &skb_shinfo(skb)->frags[0];
                unsigned int size;
+
+               prefetch(page_address(frag->page.p));
                /*
                 * There is only two buffers needed for max_payload = 4K,
                 * first buf size is IPOIB_UD_HEAD_SIZE

On Wed, Apr 17, 2013 at 11:38 AM, Markus Stockhausen
<markus.stockhausen@gmx.de> wrote:
>>
>>
>>
>>That's probably because of a cache line miss.
>>
>>The thing I don't really understand is that normally, the first cache
>>line (64 bytes) contains both the Ethernet header and IPv4 header.
>>
>>So what does this adapter do in this respect?
>>
>>I guess you should try to use IPOIB_UD_HEAD_SIZE=64 to use the whole
>>cache line.
>>
>>Many drivers use prefetch() to make sure cpu starts to bring this
>>cache line into cache as soon as possible.
>>
>>A single prefetch() call at the right place might help a lot.
>
> Hello,
>
> @Eric: Thanks for the tip.
>
> In the 4K MAX MTU IPoIB driver path ipoib_ib_handle_rx_wc() will
> produce an empty skb linear part with the whole data placed into
> the first fragment. napi_gro_receive() finally pulls the IP
> header out of the fragment into the linear part.
>
> As far as I understand, the pull out of the fragment should come
> at no additional cost if one issues a prefetch "long" before
> the skb_pull().
>
> I'm willing to check this out but I'm unsure if the IP header
> is aligned to a cache line of 64 bytes. As a simple guess I
> would implement the prefetch here:
>
> static void ipoib_ib_handle_rx_wc(...)
>   ...
>   skb_pull(skb, IB_GRH_BYTES);
>
>   skb->protocol = ((struct ipoib_header *) skb->data)->proto;
>   skb_reset_mac_header(skb);
>   skb_pull(skb, IPOIB_ENCAP_LEN);
> +
> + if (ipoib_ud_need_sg(priv->max_ib_mtu))
> +   prefetch(<whatever address>);
>   ...
>
> Can you give me a hint what address one should put into the call?
>
> Thanks in advance.
>
> Markus
>
>
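The patch above answers the question: the address worth prefetching is the
kernel virtual address of the first fragment's page, since in the
scatter-gather path the linear part is empty and the IP header sits at the
start of fragment 0. A minimal sketch of the same idea at the spot Markus
proposed (hypothetical placement; Eric's patch instead hooks
ipoib_ud_skb_put_frags(), which runs earlier on the receive path):

    /* Sketch only: Markus's proposed spot in ipoib_ib_handle_rx_wc().
     * Assumes the received skb already carries its page fragment.
     */
    if (ipoib_ud_need_sg(priv->max_ib_mtu)) {
            skb_frag_t *frag = &skb_shinfo(skb)->frags[0];

            /* start fetching the cache line holding the IP header */
            prefetch(page_address(frag->page.p));
    }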

Comments

Roland Dreier April 17, 2013, 9:49 p.m. UTC | #1
On Wed, Apr 17, 2013 at 12:10 PM, Eric Dumazet <edumazet@google.com> wrote:
> +               prefetch(page_address(frag->page.p));

I guess we would have to change our allocation to __get_free_page()
(instead of alloc_page()), so that we make sure never to get a highmem
page?

Other than that, seems fine.

By the way, is there any value to header splitting -- having the
adapter put the TCP/IP headers in the linear part and the payload in
the fragment page?  How about in the case where we have a small enough
MTU that we could put the whole packet in the linear part?

 - R.
Eric Dumazet April 17, 2013, 10:04 p.m. UTC | #2
alloc_page(GFP_ATOMIC) is safe: it doesn't give you a highmem page.

Not sure about header splitting. If done, it probably should be smart
as you said (allowing small frames to be in the first part).

On Wed, Apr 17, 2013 at 2:49 PM, Roland Dreier <roland@kernel.org> wrote:
> I guess we would have to change our allocation to __get_free_page()
> (instead of alloc_page()), so that we make sure never to get a highmem
> page?
> [...]
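For reference, Eric's point is that GFP_ATOMIC does not include
__GFP_HIGHMEM, so a page from alloc_page(GFP_ATOMIC) always has a permanent
kernel mapping and page_address() on it is valid without kmap(). A hedged
sketch of that reasoning:

    /* Sketch: why page_address() needs no kmap() here.
     * GFP_ATOMIC lacks __GFP_HIGHMEM, so the page below is
     * always a lowmem page with a kernel virtual address.
     */
    struct page *page = alloc_page(GFP_ATOMIC);

    if (page)
            prefetch(page_address(page));   /* valid: never highmem */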
Markus Stockhausen April 19, 2013, 6:07 p.m. UTC | #3
On 17.04.13 21:10, "Eric Dumazet" <edumazet@google.com> wrote:

>Please try :
>
>diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
>b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
>index 2cfa76f..9bdff21 100644
>--- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
>+++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
>@@ -112,6 +112,8 @@ static void ipoib_ud_skb_put_frags(struct
>ipoib_dev_priv *priv,
>        if (ipoib_ud_need_sg(priv->max_ib_mtu)) {
>                skb_frag_t *frag = &skb_shinfo(skb)->frags[0];
>                unsigned int size;
>+
>+               prefetch(page_address(frag->page.p));
>                /*
>                 * There is only two buffers needed for max_payload = 4K,
>                 * first buf size is IPOIB_UD_HEAD_SIZE

Thanks to both of you for spending time on this.

Sorry to tell you that the above patch does not give any performance
improvement. So I put a debug message inside pskb_may_pull(), just in
front of __pskb_pull_tail(), to get an idea of what happens.

During a netserver/netperf run with an MTU of 2044 bytes I can read
the following repeating messages:

[   52.909805] 0 bytes inline, 1988 bytes in fragment, pull 20 bytes
[   52.909807] 20 bytes inline, 1968 bytes in fragment, pull 40 bytes
[   52.909809] 40 bytes inline, 1948 bytes in fragment, pull 52 bytes
[   52.909812] 0 bytes inline, 1988 bytes in fragment, pull 20 bytes
[   52.909814] 20 bytes inline, 1968 bytes in fragment, pull 40 bytes
[   52.909816] 40 bytes inline, 1948 bytes in fragment, pull 52 bytes


Now I'm getting lost somehow. From 2044 bytes, 1988 are left in the
fragment and 112 have to be pulled into the linear part. This is
achieved in 3 steps. So we are far beyond the cache line size of 64
bytes.

Any ideas?

Markus
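Markus did not post the debug patch itself; the following is a sketch of
instrumentation that would produce output in the above format, assuming it
sits in pskb_may_pull() just ahead of the __pskb_pull_tail() call (the
printk format string is hypothetical):

static inline int pskb_may_pull(struct sk_buff *skb, unsigned int len)
{
	if (likely(len <= skb_headlen(skb)))
		return 1;
	if (unlikely(len > skb->len))
		return 0;
	/* debug: how much is linear, how much sits in the fragments,
	 * and how far this caller wants to pull
	 */
	printk(KERN_DEBUG "%u bytes inline, %u bytes in fragment, pull %u bytes\n",
	       skb_headlen(skb), skb->len - skb_headlen(skb), len);
	return __pskb_pull_tail(skb, len - skb_headlen(skb)) != NULL;
}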


Eric Dumazet April 19, 2013, 6:09 p.m. UTC | #4
I don't think so: a total of 52 bytes are pulled.

(20 bytes for IPv4 header, 32 bytes for TCP header (TS))

On Fri, Apr 19, 2013 at 11:07 AM, Markus Stockhausen
<markus.stockhausen@gmx.de> wrote:
> [...]
> Now I'm getting lost somehow. From 2044 bytes, 1988 are left in the
> fragment and 112 have to be pulled into the linear part. This is
> achieved in 3 steps. So we are far beyond the cache line size of 64
> bytes.
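In other words, the three pull targets in the log are cumulative header
lengths, not three separate copies. A hedged breakdown (the call sites shown
are illustrative, not lifted from an actual stack trace):

    pskb_may_pull(skb, 20);   /* IPv4 header: 20 bytes */
    pskb_may_pull(skb, 40);   /* + base TCP header: 20 + 20 = 40 */
    pskb_may_pull(skb, 52);   /* + TCP timestamp option: 40 + 12 = 52 */

Each call pulls only the part that is not yet linear, so the total copied
into the linear area is 52 bytes, not 20 + 40 + 52 = 112.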
Eric Dumazet April 19, 2013, 6:13 p.m. UTC | #5
These pulls cannot explain a major performance problem.

It's a memcpy(... 20 bytes). No more, no less.

I have NICs here reaching 20 Gbps on a single TCP flow, where the pull is done.

Something else is going on.

On Fri, Apr 19, 2013 at 11:09 AM, Eric Dumazet <edumazet@google.com> wrote:
> I don't think so: a total of 52 bytes are pulled.
>
> (20 bytes for IPv4 header, 32 bytes for TCP header (TS))
> [...]
Roland Dreier April 20, 2013, 12:18 a.m. UTC | #6
On Wed, Apr 17, 2013 at 12:10 PM, Eric Dumazet <edumazet@google.com> wrote:
> +               prefetch(page_address(frag->page.p));

Would it also make sense to do a

    prefetchw(skb->data);

here?  (That's where we'll copy the header, right?)
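A minimal sketch of the two prefetches combined, at the hunk the posted
patch touches (the prefetchw() line is Roland's suggestion here, not part of
the posted patch):

    if (ipoib_ud_need_sg(priv->max_ib_mtu)) {
            skb_frag_t *frag = &skb_shinfo(skb)->frags[0];

            prefetch(page_address(frag->page.p)); /* read: headers land in frag 0 */
            prefetchw(skb->data);                 /* write: where the pull copies them */
            /* ... */
    }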
Eric Dumazet April 20, 2013, 12:46 a.m. UTC | #7
Almost no driver cares to do that, but you could try anyway ;)


On Fri, Apr 19, 2013 at 5:18 PM, Roland Dreier <roland@kernel.org> wrote:
> On Wed, Apr 17, 2013 at 12:10 PM, Eric Dumazet <edumazet@google.com> wrote:
>> +               prefetch(page_address(frag->page.p));
>
> Would it also make sense to do a
>
>     prefetchw(skb->data);
>
> here?  (That's where we'll copy the header, right?)

Patch

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
index 2cfa76f..9bdff21 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
@@ -112,6 +112,8 @@ static void ipoib_ud_skb_put_frags(struct ipoib_dev_priv *priv,
        if (ipoib_ud_need_sg(priv->max_ib_mtu)) {
                skb_frag_t *frag = &skb_shinfo(skb)->frags[0];
                unsigned int size;
+
+               prefetch(page_address(frag->page.p));
                /*
                 * There is only two buffers needed for max_payload = 4K,