Message ID | 20211026131859.59114-1-kerneljasonxing@gmail.com (mailing list archive) |
---|---|
State | Rejected |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net] net: gro: set the last skb->next to NULL when it get merged | expand |
Context | Check | Description |
---|---|---|
netdev/cover_letter | success | Single patches do not need cover letters |
netdev/fixes_present | fail | Series targets non-next tree, but doesn't contain any Fixes tags |
netdev/patch_count | success | Link |
netdev/tree_selection | success | Clearly marked for net |
netdev/subject_prefix | success | Link |
netdev/cc_maintainers | success | CCed 9 of 9 maintainers |
netdev/source_inline | success | Was 0 now: 0 |
netdev/verify_signedoff | success | Signed-off-by tag matches author and committer |
netdev/module_param | success | Was 0 now: 0 |
netdev/build_32bit | success | Errors and warnings before: 1 this patch: 1 |
netdev/kdoc | success | Errors and warnings before: 0 this patch: 0 |
netdev/verify_fixes | success | No Fixes tag |
netdev/checkpatch | success | total: 0 errors, 0 warnings, 0 checks, 7 lines checked |
netdev/build_allmodconfig_warn | success | Errors and warnings before: 1 this patch: 1 |
netdev/header_inline | success | No static functions without inline keyword in header files |
On Tue, Oct 26, 2021 at 9:19 PM <kerneljasonxing@gmail.com> wrote: > > From: Jason Xing <xingwanli@kuaishou.com> > > Setting the @next of the last skb to NULL to prevent the panic in future > when someone does something to the last of the gro list but its @next is > invalid. > > For example, without the fix (commit: ece23711dd95), a panic could happen > with the clsact loaded when skb is redirected and then validated in > validate_xmit_skb_list() which could access the error addr of the @next > of the last skb. Thus, "general protection fault" would appear after that. > > Signed-off-by: Jason Xing <xingwanli@kuaishou.com> > --- > net/core/skbuff.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/net/core/skbuff.c b/net/core/skbuff.c > index 2170bea..7b248f1 100644 > --- a/net/core/skbuff.c > +++ b/net/core/skbuff.c > @@ -4396,6 +4396,7 @@ int skb_gro_receive(struct sk_buff *p, struct sk_buff *skb) > skb_shinfo(p)->frag_list = skb; > else > NAPI_GRO_CB(p)->last->next = skb; > + skb->next = NULL; > NAPI_GRO_CB(p)->last = skb; Besides, I'm a little bit confused that this operation inserts the newest skb into the tail of the flow, so the tail of flow is the newest, head oldest. The patch (commit: 600adc18) introduces the flush of the oldest when the flow is full to lower the latency, but actually it fetches the tail of the flow. Do I get something wrong here? I feel it is really odd. Thanks, Jason > __skb_header_release(skb); > lp = p; > -- > 1.8.3.1 >
On Wed, Oct 27, 2021 at 3:23 PM Jason Xing <kerneljasonxing@gmail.com> wrote: > > On Tue, Oct 26, 2021 at 9:19 PM <kerneljasonxing@gmail.com> wrote: > > > > From: Jason Xing <xingwanli@kuaishou.com> > > > > Setting the @next of the last skb to NULL to prevent the panic in future > > when someone does something to the last of the gro list but its @next is > > invalid. > > > > For example, without the fix (commit: ece23711dd95), a panic could happen > > with the clsact loaded when skb is redirected and then validated in > > validate_xmit_skb_list() which could access the error addr of the @next > > of the last skb. Thus, "general protection fault" would appear after that. > > > > Signed-off-by: Jason Xing <xingwanli@kuaishou.com> > > --- > > net/core/skbuff.c | 1 + > > 1 file changed, 1 insertion(+) > > > > diff --git a/net/core/skbuff.c b/net/core/skbuff.c > > index 2170bea..7b248f1 100644 > > --- a/net/core/skbuff.c > > +++ b/net/core/skbuff.c > > @@ -4396,6 +4396,7 @@ int skb_gro_receive(struct sk_buff *p, struct sk_buff *skb) > > skb_shinfo(p)->frag_list = skb; > > else > > NAPI_GRO_CB(p)->last->next = skb; > > + skb->next = NULL; > > NAPI_GRO_CB(p)->last = skb; > > Besides, I'm a little bit confused that this operation inserts the > newest skb into the tail of the flow, so the tail of flow is the > newest, head oldest. The patch (commit: 600adc18) introduces the flush > of the oldest when the flow is full to lower the latency, but actually > it fetches the tail of the flow. Do I get something wrong here? I feel I have to update this part. The commit 600adc18 evicts and flushes the oldest flow. But for the current kernel, when "napi->gro_hash[hash].count >= MAX_GRO_SKBS" happens, the gro_flush_oldest() flushes the oldest skb of one certain flow, actually it is the newest skb because it is at the end of the list. > it is really odd. > > Thanks, > Jason > > > __skb_header_release(skb); > > lp = p; > > -- > > 1.8.3.1 > >
On Wed, Oct 27, 2021 at 4:07 PM Jason Xing <kerneljasonxing@gmail.com> wrote: > > On Wed, Oct 27, 2021 at 3:23 PM Jason Xing <kerneljasonxing@gmail.com> wrote: > > > > On Tue, Oct 26, 2021 at 9:19 PM <kerneljasonxing@gmail.com> wrote: > > > > > > From: Jason Xing <xingwanli@kuaishou.com> > > > > > > Setting the @next of the last skb to NULL to prevent the panic in future > > > when someone does something to the last of the gro list but its @next is > > > invalid. > > > > > > For example, without the fix (commit: ece23711dd95), a panic could happen > > > with the clsact loaded when skb is redirected and then validated in > > > validate_xmit_skb_list() which could access the error addr of the @next > > > of the last skb. Thus, "general protection fault" would appear after that. > > > > > > Signed-off-by: Jason Xing <xingwanli@kuaishou.com> > > > --- > > > net/core/skbuff.c | 1 + > > > 1 file changed, 1 insertion(+) > > > > > > diff --git a/net/core/skbuff.c b/net/core/skbuff.c > > > index 2170bea..7b248f1 100644 > > > --- a/net/core/skbuff.c > > > +++ b/net/core/skbuff.c > > > @@ -4396,6 +4396,7 @@ int skb_gro_receive(struct sk_buff *p, struct sk_buff *skb) > > > skb_shinfo(p)->frag_list = skb; > > > else > > > NAPI_GRO_CB(p)->last->next = skb; > > > + skb->next = NULL; > > > NAPI_GRO_CB(p)->last = skb; > > > > Besides, I'm a little bit confused that this operation inserts the > > newest skb into the tail of the flow, so the tail of flow is the > > newest, head oldest. The patch (commit: 600adc18) introduces the flush > > of the oldest when the flow is full to lower the latency, but actually > > it fetches the tail of the flow. Do I get something wrong here? I feel > > I have to update this part. The commit 600adc18 evicts and flushes the > oldest flow. But for the current kernel, when > "napi->gro_hash[hash].count >= MAX_GRO_SKBS" happens, the > gro_flush_oldest() flushes the oldest skb of one certain flow, > actually it is the newest skb because it is at the end of the list. I just submitted another patch to explain how it happens, please help me review both patches. Link: https://lore.kernel.org/lkml/20211027084944.4508-1-kerneljasonxing@gmail.com/ Thanks again, Jason > > > it is really odd. > > > > Thanks, > > Jason > > > > > __skb_header_release(skb); > > > lp = p; > > > -- > > > 1.8.3.1 > > >
On 2021/10/27 16:56, Jason Xing wrote: > On Wed, Oct 27, 2021 at 4:07 PM Jason Xing <kerneljasonxing@gmail.com> wrote: >> >> On Wed, Oct 27, 2021 at 3:23 PM Jason Xing <kerneljasonxing@gmail.com> wrote: >>> >>> On Tue, Oct 26, 2021 at 9:19 PM <kerneljasonxing@gmail.com> wrote: >>>> >>>> From: Jason Xing <xingwanli@kuaishou.com> >>>> >>>> Setting the @next of the last skb to NULL to prevent the panic in future >>>> when someone does something to the last of the gro list but its @next is >>>> invalid. >>>> >>>> For example, without the fix (commit: ece23711dd95), a panic could happen >>>> with the clsact loaded when skb is redirected and then validated in >>>> validate_xmit_skb_list() which could access the error addr of the @next >>>> of the last skb. Thus, "general protection fault" would appear after that. >>>> >>>> Signed-off-by: Jason Xing <xingwanli@kuaishou.com> >>>> --- >>>> net/core/skbuff.c | 1 + >>>> 1 file changed, 1 insertion(+) >>>> >>>> diff --git a/net/core/skbuff.c b/net/core/skbuff.c >>>> index 2170bea..7b248f1 100644 >>>> --- a/net/core/skbuff.c >>>> +++ b/net/core/skbuff.c >>>> @@ -4396,6 +4396,7 @@ int skb_gro_receive(struct sk_buff *p, struct sk_buff *skb) >>>> skb_shinfo(p)->frag_list = skb; >>>> else >>>> NAPI_GRO_CB(p)->last->next = skb; >>>> + skb->next = NULL; >>>> NAPI_GRO_CB(p)->last = skb; >>> >>> Besides, I'm a little bit confused that this operation inserts the >>> newest skb into the tail of the flow, so the tail of flow is the >>> newest, head oldest. The patch (commit: 600adc18) introduces the flush >>> of the oldest when the flow is full to lower the latency, but actually >>> it fetches the tail of the flow. Do I get something wrong here? I feel >> >> I have to update this part. The commit 600adc18 evicts and flushes the >> oldest flow. But for the current kernel, when >> "napi->gro_hash[hash].count >= MAX_GRO_SKBS" happens, the >> gro_flush_oldest() flushes the oldest skb of one certain flow, >> actually it is the newest skb because it is at the end of the list. it seems the below is more matched with the gro_flush_oldest() instead of the above code block: https://elixir.bootlin.com/linux/v5.15-rc3/source/net/core/dev.c#L6118 > > I just submitted another patch to explain how it happens, please help > me review both patches. > > Link: https://lore.kernel.org/lkml/20211027084944.4508-1-kerneljasonxing@gmail.com/ > > Thanks again, > Jason > >> >>> it is really odd. >>> >>> Thanks, >>> Jason >>> >>>> __skb_header_release(skb); >>>> lp = p; >>>> -- >>>> 1.8.3.1 >>>> > . >
On Wed, Oct 27, 2021 at 8:40 PM Yunsheng Lin <linyunsheng@huawei.com> wrote: > > On 2021/10/27 16:56, Jason Xing wrote: > > On Wed, Oct 27, 2021 at 4:07 PM Jason Xing <kerneljasonxing@gmail.com> wrote: > >> > >> On Wed, Oct 27, 2021 at 3:23 PM Jason Xing <kerneljasonxing@gmail.com> wrote: > >>> > >>> On Tue, Oct 26, 2021 at 9:19 PM <kerneljasonxing@gmail.com> wrote: > >>>> > >>>> From: Jason Xing <xingwanli@kuaishou.com> > >>>> > >>>> Setting the @next of the last skb to NULL to prevent the panic in future > >>>> when someone does something to the last of the gro list but its @next is > >>>> invalid. > >>>> > >>>> For example, without the fix (commit: ece23711dd95), a panic could happen > >>>> with the clsact loaded when skb is redirected and then validated in > >>>> validate_xmit_skb_list() which could access the error addr of the @next > >>>> of the last skb. Thus, "general protection fault" would appear after that. > >>>> > >>>> Signed-off-by: Jason Xing <xingwanli@kuaishou.com> > >>>> --- > >>>> net/core/skbuff.c | 1 + > >>>> 1 file changed, 1 insertion(+) > >>>> > >>>> diff --git a/net/core/skbuff.c b/net/core/skbuff.c > >>>> index 2170bea..7b248f1 100644 > >>>> --- a/net/core/skbuff.c > >>>> +++ b/net/core/skbuff.c > >>>> @@ -4396,6 +4396,7 @@ int skb_gro_receive(struct sk_buff *p, struct sk_buff *skb) > >>>> skb_shinfo(p)->frag_list = skb; > >>>> else > >>>> NAPI_GRO_CB(p)->last->next = skb; > >>>> + skb->next = NULL; > >>>> NAPI_GRO_CB(p)->last = skb; > >>> > >>> Besides, I'm a little bit confused that this operation inserts the > >>> newest skb into the tail of the flow, so the tail of flow is the > >>> newest, head oldest. The patch (commit: 600adc18) introduces the flush > >>> of the oldest when the flow is full to lower the latency, but actually > >>> it fetches the tail of the flow. Do I get something wrong here? I feel > >> > >> I have to update this part. The commit 600adc18 evicts and flushes the > >> oldest flow. But for the current kernel, when > >> "napi->gro_hash[hash].count >= MAX_GRO_SKBS" happens, the > >> gro_flush_oldest() flushes the oldest skb of one certain flow, > >> actually it is the newest skb because it is at the end of the list. > > it seems the below is more matched with the gro_flush_oldest() instead > of the above code block: > https://elixir.bootlin.com/linux/v5.15-rc3/source/net/core/dev.c#L6118 > What you said is the @skb->list but not the list between skbs which is connected by skb->next when the new incoming skb needs to get merged. The @skb->list->next/prev is not the same as @skb->next. > > > > I just submitted another patch to explain how it happens, please help > > me review both patches. > > > > Link: https://lore.kernel.org/lkml/20211027084944.4508-1-kerneljasonxing@gmail.com/ > > > > Thanks again, > > Jason > > > >> > >>> it is really odd. > >>> > >>> Thanks, > >>> Jason > >>> > >>>> __skb_header_release(skb); > >>>> lp = p; > >>>> -- > >>>> 1.8.3.1 > >>>> > > . > >
On Wed, Oct 27, 2021 at 8:54 PM Jason Xing <kerneljasonxing@gmail.com> wrote: > > On Wed, Oct 27, 2021 at 8:40 PM Yunsheng Lin <linyunsheng@huawei.com> wrote: > > > > On 2021/10/27 16:56, Jason Xing wrote: > > > On Wed, Oct 27, 2021 at 4:07 PM Jason Xing <kerneljasonxing@gmail.com> wrote: > > >> > > >> On Wed, Oct 27, 2021 at 3:23 PM Jason Xing <kerneljasonxing@gmail.com> wrote: > > >>> > > >>> On Tue, Oct 26, 2021 at 9:19 PM <kerneljasonxing@gmail.com> wrote: > > >>>> > > >>>> From: Jason Xing <xingwanli@kuaishou.com> > > >>>> > > >>>> Setting the @next of the last skb to NULL to prevent the panic in future > > >>>> when someone does something to the last of the gro list but its @next is > > >>>> invalid. > > >>>> > > >>>> For example, without the fix (commit: ece23711dd95), a panic could happen > > >>>> with the clsact loaded when skb is redirected and then validated in > > >>>> validate_xmit_skb_list() which could access the error addr of the @next > > >>>> of the last skb. Thus, "general protection fault" would appear after that. > > >>>> > > >>>> Signed-off-by: Jason Xing <xingwanli@kuaishou.com> > > >>>> --- > > >>>> net/core/skbuff.c | 1 + > > >>>> 1 file changed, 1 insertion(+) > > >>>> > > >>>> diff --git a/net/core/skbuff.c b/net/core/skbuff.c > > >>>> index 2170bea..7b248f1 100644 > > >>>> --- a/net/core/skbuff.c > > >>>> +++ b/net/core/skbuff.c > > >>>> @@ -4396,6 +4396,7 @@ int skb_gro_receive(struct sk_buff *p, struct sk_buff *skb) > > >>>> skb_shinfo(p)->frag_list = skb; > > >>>> else > > >>>> NAPI_GRO_CB(p)->last->next = skb; > > >>>> + skb->next = NULL; > > >>>> NAPI_GRO_CB(p)->last = skb; > > >>> > > >>> Besides, I'm a little bit confused that this operation inserts the > > >>> newest skb into the tail of the flow, so the tail of flow is the > > >>> newest, head oldest. The patch (commit: 600adc18) introduces the flush > > >>> of the oldest when the flow is full to lower the latency, but actually > > >>> it fetches the tail of the flow. Do I get something wrong here? I feel > > >> > > >> I have to update this part. The commit 600adc18 evicts and flushes the > > >> oldest flow. But for the current kernel, when > > >> "napi->gro_hash[hash].count >= MAX_GRO_SKBS" happens, the > > >> gro_flush_oldest() flushes the oldest skb of one certain flow, > > >> actually it is the newest skb because it is at the end of the list. > > > > it seems the below is more matched with the gro_flush_oldest() instead > > of the above code block: > > https://elixir.bootlin.com/linux/v5.15-rc3/source/net/core/dev.c#L6118 > > > > What you said is the @skb->list but not the list between skbs which is > connected by skb->next when the new incoming skb needs to get merged. > The @skb->list->next/prev is not the same as @skb->next. > > > > > > > I just submitted another patch to explain how it happens, please help > > > me review both patches. > > > > > > Link: https://lore.kernel.org/lkml/20211027084944.4508-1-kerneljasonxing@gmail.com/ > > > Emm, I think you're right, Yunsheng. The gro_flush_oldest() fetches the list of @skb->list. Do you think the tail of skb's next pointer should be set to NULL? Thanks, Jason > > > Thanks again, > > > Jason > > > > > >> > > >>> it is really odd. > > >>> > > >>> Thanks, > > >>> Jason > > >>> > > >>>> __skb_header_release(skb); > > >>>> lp = p; > > >>>> -- > > >>>> 1.8.3.1 > > >>>> > > > . > > >
On 10/26/21 6:18 AM, kerneljasonxing@gmail.com wrote: > From: Jason Xing <xingwanli@kuaishou.com> > > Setting the @next of the last skb to NULL to prevent the panic in future > when someone does something to the last of the gro list but its @next is > invalid. > > For example, without the fix (commit: ece23711dd95), a panic could happen > with the clsact loaded when skb is redirected and then validated in > validate_xmit_skb_list() which could access the error addr of the @next > of the last skb. Thus, "general protection fault" would appear after that. > If this a bug, please provide a Fixes: tag > Signed-off-by: Jason Xing <xingwanli@kuaishou.com> > --- > net/core/skbuff.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/net/core/skbuff.c b/net/core/skbuff.c > index 2170bea..7b248f1 100644 > --- a/net/core/skbuff.c > +++ b/net/core/skbuff.c > @@ -4396,6 +4396,7 @@ int skb_gro_receive(struct sk_buff *p, struct sk_buff *skb) > skb_shinfo(p)->frag_list = skb; > else > NAPI_GRO_CB(p)->last->next = skb; > + skb->next = NULL; Really at this point skb->next must be already NULL. Please provide a stack trace so that we fix the caller instead of adding more code in GRO layer. > NAPI_GRO_CB(p)->last = skb; > __skb_header_release(skb); > lp = p; >
On 10/27/21 1:07 AM, Jason Xing wrote: > On Wed, Oct 27, 2021 at 3:23 PM Jason Xing <kerneljasonxing@gmail.com> wrote: >> >> On Tue, Oct 26, 2021 at 9:19 PM <kerneljasonxing@gmail.com> wrote: >>> >>> From: Jason Xing <xingwanli@kuaishou.com> >>> >>> Setting the @next of the last skb to NULL to prevent the panic in future >>> when someone does something to the last of the gro list but its @next is >>> invalid. >>> >>> For example, without the fix (commit: ece23711dd95), a panic could happen >>> with the clsact loaded when skb is redirected and then validated in >>> validate_xmit_skb_list() which could access the error addr of the @next >>> of the last skb. Thus, "general protection fault" would appear after that. >>> >>> Signed-off-by: Jason Xing <xingwanli@kuaishou.com> >>> --- >>> net/core/skbuff.c | 1 + >>> 1 file changed, 1 insertion(+) >>> >>> diff --git a/net/core/skbuff.c b/net/core/skbuff.c >>> index 2170bea..7b248f1 100644 >>> --- a/net/core/skbuff.c >>> +++ b/net/core/skbuff.c >>> @@ -4396,6 +4396,7 @@ int skb_gro_receive(struct sk_buff *p, struct sk_buff *skb) >>> skb_shinfo(p)->frag_list = skb; >>> else >>> NAPI_GRO_CB(p)->last->next = skb; >>> + skb->next = NULL; >>> NAPI_GRO_CB(p)->last = skb; >> >> Besides, I'm a little bit confused that this operation inserts the >> newest skb into the tail of the flow, so the tail of flow is the >> newest, head oldest. The patch (commit: 600adc18) introduces the flush >> of the oldest when the flow is full to lower the latency, but actually >> it fetches the tail of the flow. Do I get something wrong here? I feel > > I have to update this part. The commit 600adc18 evicts and flushes the > oldest flow. But for the current kernel, when > "napi->gro_hash[hash].count >= MAX_GRO_SKBS" happens, the > gro_flush_oldest() flushes the oldest skb of one certain flow, > actually it is the newest skb because it is at the end of the list. GRO only keeps one skb per flow in the main hash/lru. I think you are not understanding GRO correctly.
diff --git a/net/core/skbuff.c b/net/core/skbuff.c index 2170bea..7b248f1 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -4396,6 +4396,7 @@ int skb_gro_receive(struct sk_buff *p, struct sk_buff *skb) skb_shinfo(p)->frag_list = skb; else NAPI_GRO_CB(p)->last->next = skb; + skb->next = NULL; NAPI_GRO_CB(p)->last = skb; __skb_header_release(skb); lp = p;