Message ID | 20220516042456.3014395-2-eric.dumazet@gmail.com |
---|---|
State | Accepted |
Commit | 97e719a82b43c6c2bb5eebdb3c5d479a332ac2ac |
Delegated to: | Netdev Maintainers |
Series | net: polish skb defer freeing |
On Sun, 15 May 2022 21:24:53 -0700 Eric Dumazet wrote:
> A cpu can observe sd->defer_count reaching 128,
> and call smp_call_function_single_async()
>
> Problem is that the remote CPU can clear sd->defer_count
> before the IPI is run/acknowledged.
>
> Other cpus can queue more packets and also decide
> to call smp_call_function_single_async() while the pending
> IPI was not yet delivered.
>
> This is a common issue with smp_call_function_single_async().
> Callers must ensure correct synchronization and serialization.
>
> I triggered this issue while experimenting smaller threshold.
> Performing the call to smp_call_function_single_async()
> under sd->defer_lock protection did not solve the problem.
>
> Commit 5a18ceca6350 ("smp: Allow smp_call_function_single_async()
> to insert locked csd") replaced an informative WARN_ON_ONCE()
> with a return of -EBUSY, which is often ignored.
> Test of CSD_FLAG_LOCK presence is racy anyway.

If I'm reading this right this is useful for backports but in net-next
it really is a noop? The -EBUSY would be perfectly safe to ignore?
Just checking.
On Mon, May 16, 2022 at 11:16 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Sun, 15 May 2022 21:24:53 -0700 Eric Dumazet wrote:
> > A cpu can observe sd->defer_count reaching 128,
> > and call smp_call_function_single_async()
> >
> > Problem is that the remote CPU can clear sd->defer_count
> > before the IPI is run/acknowledged.
> >
> > Other cpus can queue more packets and also decide
> > to call smp_call_function_single_async() while the pending
> > IPI was not yet delivered.
> >
> > This is a common issue with smp_call_function_single_async().
> > Callers must ensure correct synchronization and serialization.
> >
> > I triggered this issue while experimenting smaller threshold.
> > Performing the call to smp_call_function_single_async()
> > under sd->defer_lock protection did not solve the problem.
> >
> > Commit 5a18ceca6350 ("smp: Allow smp_call_function_single_async()
> > to insert locked csd") replaced an informative WARN_ON_ONCE()
> > with a return of -EBUSY, which is often ignored.
> > Test of CSD_FLAG_LOCK presence is racy anyway.
>
> If I'm reading this right this is useful for backports but in net-next
> it really is a noop? The -EBUSY would be perfectly safe to ignore?
> Just checking.

Not sure I understand the question.

trigger_rx_softirq() and friends were only in net-next, so there is no
backport needed.

Are you talking of calls from net_rps_send_ipi() ?
These are fine, because we own an atomic bit (NAPI_STATE_SCHED).
On Mon, 16 May 2022 11:24:40 -0700 Eric Dumazet wrote:
> On Mon, May 16, 2022 at 11:16 AM Jakub Kicinski <kuba@kernel.org> wrote:
> > If I'm reading this right this is useful for backports but in net-next
> > it really is a noop? The -EBUSY would be perfectly safe to ignore?
> > Just checking.
>
> Not sure I understand the question.
>
> trigger_rx_softirq() and friends were only in net-next, so there is no
> backport needed.
>
> Are you talking of calls from net_rps_send_ipi() ?
> These are fine, because we own an atomic bit (NAPI_STATE_SCHED).

Ah, I think I get it now. It was unclear what problem this patch is
solving; this part of the commit message really is key:

> This is a common issue with smp_call_function_single_async().
> Callers must ensure correct synchronization and serialization.

smp_call_function_single_async() does not protect its own internal state,
so we need to wrap it in our own locking (of some form thereof).

Sorry for the noise.
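To make that requirement concrete, here is a minimal sketch of the caller-side serialization that the patch below applies to defer_csd: a flag owned by the callers, set with cmpxchg() before kicking the IPI and cleared with smp_store_release() by the handler once its work is done. The names my_target, ipi_scheduled, my_ipi_handler and my_kick are illustrative only, not kernel APIs.

```c
#include <linux/smp.h>
#include <linux/atomic.h>

/* Hypothetical per-CPU target state; only the flag and the csd matter here. */
struct my_target {
	int			ipi_scheduled;	/* 0: idle, 1: IPI in flight */
	call_single_data_t	csd;		/* set up with INIT_CSD(&csd, my_ipi_handler, t) */
};

/* Runs in hardirq (IPI) context on the remote CPU. */
static void my_ipi_handler(void *data)
{
	struct my_target *t = data;

	/* ... do the remote work, e.g. raise a softirq ... */

	/* Clear the flag only after the work above is done. */
	smp_store_release(&t->ipi_scheduled, 0);
}

/* Caller side: at most one csd may be in flight at any time. */
static void my_kick(int cpu, struct my_target *t)
{
	/*
	 * cmpxchg() lets exactly one caller win the 0 -> 1 transition,
	 * so smp_call_function_single_async() is never invoked while a
	 * previous IPI for this csd is still pending, and its -EBUSY
	 * return can safely be ignored.
	 */
	if (!cmpxchg(&t->ipi_scheduled, 0, 1))
		smp_call_function_single_async(cpu, &t->csd);
}
```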
```diff
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 536321691c725ebd311088f4654dd04b9abbaaef..89699a299ba1b0b544b7abb782708d827abe55ac 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3127,6 +3127,7 @@ struct softnet_data {
 	/* Another possibly contended cache line */
 	spinlock_t		defer_lock ____cacheline_aligned_in_smp;
 	int			defer_count;
+	int			defer_ipi_scheduled;
 	struct sk_buff		*defer_list;
 	call_single_data_t	defer_csd;
 };
diff --git a/net/core/dev.c b/net/core/dev.c
index a601da3b4a7c800801f763f097f00f3a3b591107..d708f95356e0b03a61e8211adcf6d272dfa322b5 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4581,9 +4581,12 @@ static void rps_trigger_softirq(void *data)
 #endif /* CONFIG_RPS */
 
 /* Called from hardirq (IPI) context */
-static void trigger_rx_softirq(void *data __always_unused)
+static void trigger_rx_softirq(void *data)
 {
+	struct softnet_data *sd = data;
+
 	__raise_softirq_irqoff(NET_RX_SOFTIRQ);
+	smp_store_release(&sd->defer_ipi_scheduled, 0);
 }
 
 /*
@@ -11381,7 +11384,7 @@ static int __init net_dev_init(void)
 		INIT_CSD(&sd->csd, rps_trigger_softirq, sd);
 		sd->cpu = i;
 #endif
-		INIT_CSD(&sd->defer_csd, trigger_rx_softirq, NULL);
+		INIT_CSD(&sd->defer_csd, trigger_rx_softirq, sd);
 		spin_lock_init(&sd->defer_lock);
 
 		init_gro_hash(&sd->backlog);
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index bd16e158b3668f496a9d2c8e8b6f3433a326314c..1e2180682f2e94c45e3f26059af6d18be2d9f9d3 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -6514,8 +6514,7 @@ void skb_attempt_defer_free(struct sk_buff *skb)
 	sd->defer_count++;
 
 	/* kick every time queue length reaches 128.
-	 * This should avoid blocking in smp_call_function_single_async().
-	 * This condition should hardly be bit under normal conditions,
+	 * This condition should hardly be hit under normal conditions,
 	 * unless cpu suddenly stopped to receive NIC interrupts.
 	 */
 	kick = sd->defer_count == 128;
@@ -6525,6 +6524,6 @@ void skb_attempt_defer_free(struct sk_buff *skb)
 	/* Make sure to trigger NET_RX_SOFTIRQ on the remote CPU
 	 * if we are unlucky enough (this seems very unlikely).
 	 */
-	if (unlikely(kick))
+	if (unlikely(kick) && !cmpxchg(&sd->defer_ipi_scheduled, 0, 1))
 		smp_call_function_single_async(cpu, &sd->defer_csd);
 }
```
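A note on the ordering, as I read the patch rather than as stated in the changelog: cmpxchg() allows only one 0 -> 1 transition, so at most one defer_csd IPI is ever in flight per softnet_data, and trigger_rx_softirq() clears defer_ipi_scheduled with smp_store_release() only after NET_RX_SOFTIRQ has been raised. A CPU that skips the kick because the flag is still set has already queued its packet on defer_list under defer_lock, so the softirq raised by the pending IPI will flush it.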