Message ID | 20211202024723.76257-3-xiangxia.m.yue@gmail.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | BPF |
Series | Fix bpf_redirect to ifb netdev |
Context | Check | Description |
---|---|---|
netdev/tree_selection | success | Clearly marked for net |
netdev/fixes_present | success | Fixes tag present in non-next series |
netdev/subject_prefix | success | Link |
netdev/cover_letter | success | Series has a cover letter |
netdev/patch_count | success | Link |
netdev/header_inline | success | No static functions without inline keyword in header files |
netdev/build_32bit | success | Errors and warnings before: 18 this patch: 18 |
netdev/cc_maintainers | warning | 1 maintainers not CCed: bpf@vger.kernel.org |
netdev/build_clang | success | Errors and warnings before: 20 this patch: 20 |
netdev/module_param | success | Was 0 now: 0 |
netdev/verify_signedoff | success | Signed-off-by tag matches author and committer |
netdev/verify_fixes | success | No Fixes tag |
netdev/build_allmodconfig_warn | success | Errors and warnings before: 9 this patch: 9 |
netdev/checkpatch | success | total: 0 errors, 0 warnings, 0 checks, 9 lines checked |
netdev/kdoc | success | Errors and warnings before: 0 this patch: 0 |
netdev/source_inline | success | Was 0 now: 0 |
bpf/vmtest-bpf-next-PR | success | PR summary |
bpf/vmtest-bpf-next | success | VM_Test |
On 12/2/21 3:47 AM, xiangxia.m.yue@gmail.com wrote:
> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
>
> Try to resolve the issues as below:
> * We look up and then check tc_skip_classify flag in net
> sched layer, even though skb don't want to be classified.
> That case may consume a lot of cpu cycles.
>
> Install the rules as below:
> $ for id in $(seq 1 10000); do
> $ tc filter add ... egress prio $id ... action mirred egress redirect dev ifb0
> $ done
>
> netperf:
> $ taskset -c 1 netperf -t TCP_RR -H ip -- -r 32,32
> $ taskset -c 1 netperf -t TCP_STREAM -H ip -- -m 32
>
> Before: 152.04 tps, 0.58 Mbit/s
> After: 303.07 tps, 1.51 Mbit/s
> For TCP_RR, there are 99.3% improvement, TCP_STREAM 160.3%.

As it was pointed out earlier by Eric in v3, these numbers are moot since no one is realistically running such a setup in practice with 10k linear rules.
On Fri, Dec 3, 2021 at 1:35 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 12/2/21 3:47 AM, xiangxia.m.yue@gmail.com wrote:
> > From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> >
> > Try to resolve the issues as below:
> > * We look up and then check tc_skip_classify flag in net
> > sched layer, even though skb don't want to be classified.
> > That case may consume a lot of cpu cycles.
> >
> > Install the rules as below:
> > $ for id in $(seq 1 10000); do
> > $ tc filter add ... egress prio $id ... action mirred egress redirect dev ifb0
> > $ done
> >
> > netperf:
> > $ taskset -c 1 netperf -t TCP_RR -H ip -- -r 32,32
> > $ taskset -c 1 netperf -t TCP_STREAM -H ip -- -m 32
> >
> > Before: 152.04 tps, 0.58 Mbit/s
> > After: 303.07 tps, 1.51 Mbit/s
> > For TCP_RR, there are 99.3% improvement, TCP_STREAM 160.3%.
>
> As it was pointed out earlier by Eric in v3, these numbers are moot since noone
> is realistically running such a setup in practice with 10k linear rules.

Yes, I am so sorry that I used a sarcastic comment. I really should have asked if a real-world case was using a lot of filters. If so, maybe we can do something about that, for packets actually going through these filters.

Thanks
On Sat, Dec 4, 2021 at 5:35 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 12/2/21 3:47 AM, xiangxia.m.yue@gmail.com wrote:
> > From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> >
> > Try to resolve the issues as below:
> > * We look up and then check tc_skip_classify flag in net
> > sched layer, even though skb don't want to be classified.
> > That case may consume a lot of cpu cycles.
> >
> > Install the rules as below:
> > $ for id in $(seq 1 10000); do
> > $ tc filter add ... egress prio $id ... action mirred egress redirect dev ifb0
> > $ done
> >
> > netperf:
> > $ taskset -c 1 netperf -t TCP_RR -H ip -- -r 32,32
> > $ taskset -c 1 netperf -t TCP_STREAM -H ip -- -m 32
> >
> > Before: 152.04 tps, 0.58 Mbit/s
> > After: 303.07 tps, 1.51 Mbit/s
> > For TCP_RR, there are 99.3% improvement, TCP_STREAM 160.3%.
>
> As it was pointed out earlier by Eric in v3, these numbers are moot since noone
> is realistically running such a setup in practice with 10k linear rules.

Yes. As I said in v1, in production we use 5+ prios; with this patch the improvement is small, ~1.x%.

This patch also fixes packet loopback when we use bpf_redirect to ifb in the egress path.
On Sat, Dec 4, 2021 at 9:42 AM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
>
> On Sat, Dec 4, 2021 at 5:35 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
> >
> > On 12/2/21 3:47 AM, xiangxia.m.yue@gmail.com wrote:
> > > From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> > >
> > > Try to resolve the issues as below:
> > > * We look up and then check tc_skip_classify flag in net
> > > sched layer, even though skb don't want to be classified.
> > > That case may consume a lot of cpu cycles.
> > >
> > > Install the rules as below:
> > > $ for id in $(seq 1 10000); do
> > > $ tc filter add ... egress prio $id ... action mirred egress redirect dev ifb0
> > > $ done
> > >
> > > netperf:
> > > $ taskset -c 1 netperf -t TCP_RR -H ip -- -r 32,32
> > > $ taskset -c 1 netperf -t TCP_STREAM -H ip -- -m 32
> > >
> > > Before: 152.04 tps, 0.58 Mbit/s
> > > After: 303.07 tps, 1.51 Mbit/s
> > > For TCP_RR, there are 99.3% improvement, TCP_STREAM 160.3%.
> >
> > As it was pointed out earlier by Eric in v3, these numbers are moot since noone
> > is realistically running such a setup in practice with 10k linear rules.
> Yes. As I said in v1, in production, we use the 5+ prio. With this
> patch, little improvements, 1.x%
>
> This patch also fixes the packets loopback, if we use the bpf_redirect
> to ifb in egress path.

Hi Daniel, Eric

What should I do next? This patch tries to fix the bug and improve the performance (~1% in production). Should I update the commit message and send v5?

> --
> Best regards, Tonghao
On Mon, Dec 6, 2021 at 6:17 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
>
> On Sat, Dec 4, 2021 at 9:42 AM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
> >
> > On Sat, Dec 4, 2021 at 5:35 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
> > >
> > > On 12/2/21 3:47 AM, xiangxia.m.yue@gmail.com wrote:
> > > > From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> > > >
> > > > Try to resolve the issues as below:
> > > > * We look up and then check tc_skip_classify flag in net
> > > > sched layer, even though skb don't want to be classified.
> > > > That case may consume a lot of cpu cycles.
> > > >
> > > > Install the rules as below:
> > > > $ for id in $(seq 1 10000); do
> > > > $ tc filter add ... egress prio $id ... action mirred egress redirect dev ifb0
> > > > $ done
> > > >
> > > > netperf:
> > > > $ taskset -c 1 netperf -t TCP_RR -H ip -- -r 32,32
> > > > $ taskset -c 1 netperf -t TCP_STREAM -H ip -- -m 32
> > > >
> > > > Before: 152.04 tps, 0.58 Mbit/s
> > > > After: 303.07 tps, 1.51 Mbit/s
> > > > For TCP_RR, there are 99.3% improvement, TCP_STREAM 160.3%.
> > >
> > > As it was pointed out earlier by Eric in v3, these numbers are moot since noone
> > > is realistically running such a setup in practice with 10k linear rules.
> > Yes. As I said in v1, in production, we use the 5+ prio. With this
> > patch, little improvements, 1.x%
> >
> > This patch also fixes the packets loopback, if we use the bpf_redirect
> > to ifb in egress path.
> Hi Daniel, Eric
> What should I do next? This patch try to fix the bug, and improve the
> performance (~1% in production).
> Should I update the commit message and send v5?

This is adding yet another bit in skb :/ We also have another patch series today adding one bit in skb. For me, this is really an issue; it should be a last resort.

For example, xmit_more is no longer a field in skb. Why? Because most probably this property does not need to stick to the skb, but to the context of execution.
Also, it is not clear how stacked devices (bonding/team/tunnels) will be handled.
diff --git a/net/core/dev.c b/net/core/dev.c
index d30adecc2bb2..10bad44e2ec4 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3823,6 +3823,9 @@ sch_handle_egress(struct sk_buff *skb, int *ret, struct net_device *dev)
 	if (!miniq)
 		return skb;
 
+	if (skb_skip_tc_classify(skb))
+		return skb;
+
 	/* qdisc_skb_cb(skb)->pkt_len was already set by the caller. */
 	qdisc_skb_cb(skb)->mru = 0;
 	qdisc_skb_cb(skb)->post_ct = false;