Message ID | 20230328142112.12493-1-kerneljasonxing@gmail.com (mailing list archive)
---|---
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Series | [v2,net] net: rps: avoid raising a softirq on the current cpu when scheduling napi
On Tue, Mar 28, 2023 at 4:21 PM Jason Xing <kerneljasonxing@gmail.com> wrote:
>
> From: Jason Xing <kernelxing@tencent.com>
>
> When we are scheduling napi and RPS then decides to put the skb into
> a backlog queue of another cpu, we shouldn't raise the softirq for
> the current cpu. Whether to raise a softirq depends on whether there is
> more data left to process later, but as far as the current cpu is
> concerned nothing more has been enqueued, so the raise is not needed.
> After enqueuing to another cpu, net_rx_action() or process_backlog()
> will send an IPI and the other cpu will raise the softirq as expected.
>
> Also, raising more softirqs, which sets the corresponding pending bit,
> can make the softirq machinery think we probably need to start ksoftirqd
> on the current cpu. That shouldn't happen.
>
> Here is the code path showing how it can trigger ksoftirqd:
> __do_softirq()
> [1] net_rx_action() -> enqueue_to_backlog() -> raise NET_RX_SOFTIRQ
> [2] check if pending is set again -> wakeup_softirqd
>
> Comments on the above:
> [1] when RPS chooses another cpu to enqueue the skb
> [2] in __do_softirq() it will wait a little bit of time, around 2 jiffies
>
> With this patch, raising the softirq is avoided when RPS enqueues the skb
> into another cpu's backlog queue rather than the current one.
>
> I captured some data while running one iperf3 process and found we can
> avoid at least ~1500 calls/sec to __raise_softirq_irqoff().
>
> Fixes: 0a9627f2649a ("rps: Receive Packet Steering")

No Fixes: tag when you are trying to optimize things, and so far fail at this.

> Signed-off-by: Jason Xing <kernelxing@tencent.com>
> ---
> v2:
> 1) change the title and add more details.
> 2) add one parameter to recognise whether it is the napi or non-napi case,
> as suggested by Eric.
> Link: https://lore.kernel.org/lkml/20230325152417.5403-1-kerneljasonxing@gmail.com/
> ---

Wrong again.

I think I will send a series, instead of you trying so hard to break the stack.

You have not considered busy polling, and that the netif_receive_skb()
contract does not enforce it to be called from net_rx_action().
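The [1]/[2] path quoted in the changelog can be modeled with a short userspace sketch. This is a simplified model of the __do_softirq() restart handling, not the kernel code: the NET_RX_SOFTIRQ bit value and the MAX_SOFTIRQ_RESTART budget below are illustrative, and the real loop additionally enforces a time budget of roughly 2 jiffies before it gives up.

```c
#include <stdbool.h>
#include <stdio.h>

#define NET_RX_SOFTIRQ_BIT  (1u << 3)   /* illustrative bit value */
#define MAX_SOFTIRQ_RESTART 10          /* mirrors the kernel's restart budget */

static unsigned int pending;

/* Stand-in for net_rx_action(): when RPS steers the skb to another cpu,
 * the pre-patch code re-raises NET_RX_SOFTIRQ on the current cpu ([1]).
 */
static void net_rx_action_model(bool raise_on_remote_enqueue)
{
	if (raise_on_remote_enqueue)
		pending |= NET_RX_SOFTIRQ_BIT;
}

/* Simplified model of the __do_softirq() restart loop; the real loop also
 * checks a ~2 jiffies time budget and need_resched() before giving up ([2]).
 */
static void do_softirq_model(bool raise_on_remote_enqueue)
{
	int restart = MAX_SOFTIRQ_RESTART;

	pending = NET_RX_SOFTIRQ_BIT;
	while (pending) {
		pending = 0;
		net_rx_action_model(raise_on_remote_enqueue);
		if (pending && --restart)
			continue;               /* budget left: run handlers again */
		if (pending)
			printf("pending still set -> wakeup_softirqd()\n");
		break;
	}
	if (!pending)
		printf("loop drained, no ksoftirqd wakeup\n");
}

int main(void)
{
	do_softirq_model(true);   /* raise repeated on the RPS cpu: budget runs out */
	do_softirq_model(false);  /* raise skipped: loop drains and exits */
	return 0;
}
```

In the first call the model keeps re-raising its own pending bit until the restart budget is exhausted and ksoftirqd would be woken; once the raise is suppressed, the loop simply drains.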
```diff
diff --git a/net/core/dev.c b/net/core/dev.c
index 1518a366783b..504dc3fc09b1 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4586,7 +4586,7 @@ static void trigger_rx_softirq(void *data)
  * If yes, queue it to our IPI list and return 1
  * If no, return 0
  */
-static int napi_schedule_rps(struct softnet_data *sd)
+static int napi_schedule_rps(struct softnet_data *sd, bool napi)
 {
 	struct softnet_data *mysd = this_cpu_ptr(&softnet_data);
 
@@ -4594,8 +4594,9 @@ static int napi_schedule_rps(struct softnet_data *sd)
 	if (sd != mysd) {
 		sd->rps_ipi_next = mysd->rps_ipi_list;
 		mysd->rps_ipi_list = sd;
+		if (!napi)
+			__raise_softirq_irqoff(NET_RX_SOFTIRQ);
 
-		__raise_softirq_irqoff(NET_RX_SOFTIRQ);
 		return 1;
 	}
 #endif /* CONFIG_RPS */
@@ -4648,7 +4649,7 @@ static bool skb_flow_limit(struct sk_buff *skb, unsigned int qlen)
  * queue (may be a remote CPU queue).
  */
 static int enqueue_to_backlog(struct sk_buff *skb, int cpu,
-			      unsigned int *qtail)
+			      unsigned int *qtail, bool napi)
 {
 	enum skb_drop_reason reason;
 	struct softnet_data *sd;
@@ -4675,7 +4676,7 @@ static int enqueue_to_backlog(struct sk_buff *skb, int cpu,
 		 * We can use non atomic operation since we own the queue lock
 		 */
 		if (!__test_and_set_bit(NAPI_STATE_SCHED, &sd->backlog.state))
-			napi_schedule_rps(sd);
+			napi_schedule_rps(sd, napi);
 		goto enqueue;
 	}
 	reason = SKB_DROP_REASON_CPU_BACKLOG;
@@ -4933,7 +4934,7 @@ static int netif_rx_internal(struct sk_buff *skb)
 		if (cpu < 0)
 			cpu = smp_processor_id();
 
-		ret = enqueue_to_backlog(skb, cpu, &rflow->last_qtail);
+		ret = enqueue_to_backlog(skb, cpu, &rflow->last_qtail, false);
 
 		rcu_read_unlock();
 	} else
@@ -4941,7 +4942,7 @@
 	{
 		unsigned int qtail;
 
-		ret = enqueue_to_backlog(skb, smp_processor_id(), &qtail);
+		ret = enqueue_to_backlog(skb, smp_processor_id(), &qtail, false);
 	}
 	return ret;
 }
@@ -5670,7 +5671,7 @@ static int netif_receive_skb_internal(struct sk_buff *skb)
 		int cpu = get_rps_cpu(skb->dev, skb, &rflow);
 
 		if (cpu >= 0) {
-			ret = enqueue_to_backlog(skb, cpu, &rflow->last_qtail);
+			ret = enqueue_to_backlog(skb, cpu, &rflow->last_qtail, false);
 			rcu_read_unlock();
 			return ret;
 		}
@@ -5705,7 +5706,7 @@ void netif_receive_skb_list_internal(struct list_head *head)
 			if (cpu >= 0) {
 				/* Will be handled, remove from list */
 				skb_list_del_init(skb);
-				enqueue_to_backlog(skb, cpu, &rflow->last_qtail);
+				enqueue_to_backlog(skb, cpu, &rflow->last_qtail, true);
 			}
 		}
 	}
```
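The busy-polling objection above is about context detection: a bool threaded through the enqueue_to_backlog() call sites cannot say whether the cpu is actually running inside net_rx_action() at that moment. The sketch below is a hypothetical illustration of keeping that state per cpu instead of in a parameter; the softnet_data_model struct, the in_net_rx_action field, and the *_model helpers are made up for illustration and are not part of this patch.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical per-cpu state; the in_net_rx_action field is illustrative
 * and not taken from this patch.
 */
struct softnet_data_model {
	bool in_net_rx_action;
};

/* Raise NET_RX_SOFTIRQ only when the cpu is not already inside
 * net_rx_action(), no matter which entry point enqueued the skb.
 */
static void napi_schedule_rps_model(const struct softnet_data_model *mysd)
{
	if (!mysd->in_net_rx_action)
		printf("raise NET_RX_SOFTIRQ\n");
	else
		printf("skip the raise: net_rx_action() will flush the IPI list\n");
}

int main(void)
{
	struct softnet_data_model sd = { .in_net_rx_action = false };

	/* e.g. netif_rx() from process context, or busy polling */
	napi_schedule_rps_model(&sd);

	/* e.g. enqueue_to_backlog() reached from within net_rx_action() */
	sd.in_net_rx_action = true;
	napi_schedule_rps_model(&sd);

	return 0;
}
```

With this shape the decision depends on where the cpu currently is, not on which entry point happened to enqueue the skb.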