Message ID | 167395854720.539380.12918805302179692095.stgit@firesoul (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Series | [net-next] net: avoid irqsave in skb_defer_free_flush |
On 1/17/2023 4:29 AM, Jesper Dangaard Brouer wrote:
> The spin_lock irqsave/restore API variant in skb_defer_free_flush can
> be replaced with the faster spin_lock irq variant, which doesn't need
> to read and restore the CPU flags.
>
> Using the unconditional irq "disable/enable" API variant is safe,
> because the skb_defer_free_flush() function is only called during
> NAPI-RX processing in net_rx_action(), where it is known the IRQs
> are enabled.
>

Did you mean disabled here? If IRQs are enabled, that would mean the
interrupt could be triggered and we would need to irqsave, no?

> Expected gain is 14 cycles from avoiding reading and restoring CPU
> flags in a spin_lock_irqsave/restore operation, measured via a
> microbenchmark kernel module[1] on CPU E5-1650 v4 @ 3.60GHz.
>
> Microbenchmark overhead of spin_lock+unlock:
>  - spin_lock_unlock_irq cost: 34 cycles(tsc) 9.486 ns
>  - spin_lock_unlock_irqsave cost: 48 cycles(tsc) 13.567 ns
>

Fairly minor change in perf, and..

> We don't expect to see a measurable packet performance gain, as
> skb_defer_free_flush() is called infrequently once per NIC device NAPI
> bulk cycle and conditionally only if SKBs have been deferred by other
> CPUs via skb_attempt_defer_free().
>

Not really measurable as it's not called enough, but..

> [1] https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/lib/time_bench_sample.c
>
> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
> ---
>  net/core/dev.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index cf78f35bc0b9..9c60190fe352 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -6616,17 +6616,16 @@ static int napi_threaded_poll(void *data)
>  static void skb_defer_free_flush(struct softnet_data *sd)
>  {
>  	struct sk_buff *skb, *next;
> -	unsigned long flags;
>
>  	/* Paired with WRITE_ONCE() in skb_attempt_defer_free() */
>  	if (!READ_ONCE(sd->defer_list))
>  		return;
>
> -	spin_lock_irqsave(&sd->defer_lock, flags);
> +	spin_lock_irq(&sd->defer_lock);
>  	skb = sd->defer_list;
>  	sd->defer_list = NULL;
>  	sd->defer_count = 0;
> -	spin_unlock_irqrestore(&sd->defer_lock, flags);
> +	spin_unlock_irq(&sd->defer_lock);
>

It's also less code and makes it more clear what dependency this section
has.

Seems ok to me, with the minor nit I think in the commit message:

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>

> 	while (skb != NULL) {
> 		next = skb->next;
>
On 17/01/2023 20.29, Jacob Keller wrote:
>
> On 1/17/2023 4:29 AM, Jesper Dangaard Brouer wrote:
>> The spin_lock irqsave/restore API variant in skb_defer_free_flush can
>> be replaced with the faster spin_lock irq variant, which doesn't need
>> to read and restore the CPU flags.
>>
>> Using the unconditional irq "disable/enable" API variant is safe,
>> because the skb_defer_free_flush() function is only called during
>> NAPI-RX processing in net_rx_action(), where it is known the IRQs
>> are enabled.
>>
>
> Did you mean disabled here? If IRQs are enabled that would mean the
> interrupt could be triggered and we would need to irqsave, no?

I do mean 'enabled' in the text here.

As you can see in net_rx_action() we are allowed to perform code like:

	local_irq_disable();
	list_splice_init(&sd->poll_list, &list);
	local_irq_enable();

Disabling local IRQs without saving 'flags' and unconditionally enabling
local IRQs again. Thus, in skb_defer_free_flush() we can do the same,
without saving 'flags'. Hope that makes it more clear.

>> Expected gain is 14 cycles from avoiding reading and restoring CPU
>> flags in a spin_lock_irqsave/restore operation, measured via a
>> microbenchmark kernel module[1] on CPU E5-1650 v4 @ 3.60GHz.
>>
>> Microbenchmark overhead of spin_lock+unlock:
>>  - spin_lock_unlock_irq cost: 34 cycles(tsc) 9.486 ns
>>  - spin_lock_unlock_irqsave cost: 48 cycles(tsc) 13.567 ns
>>
>
> Fairly minor change in perf, and..
>
>> We don't expect to see a measurable packet performance gain, as
>> skb_defer_free_flush() is called infrequently once per NIC device NAPI
>> bulk cycle and conditionally only if SKBs have been deferred by other
>> CPUs via skb_attempt_defer_free().
>>
>
> Not really measurable as it's not called enough, but..
>
>> [1] https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/lib/time_bench_sample.c
>>
>> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
>> ---
>>  net/core/dev.c | 5 ++---
>>  1 file changed, 2 insertions(+), 3 deletions(-)
>>
>> diff --git a/net/core/dev.c b/net/core/dev.c
>> index cf78f35bc0b9..9c60190fe352 100644
>> --- a/net/core/dev.c
>> +++ b/net/core/dev.c
>> @@ -6616,17 +6616,16 @@ static int napi_threaded_poll(void *data)
>>  static void skb_defer_free_flush(struct softnet_data *sd)
>>  {
>>  	struct sk_buff *skb, *next;
>> -	unsigned long flags;
>>
>>  	/* Paired with WRITE_ONCE() in skb_attempt_defer_free() */
>>  	if (!READ_ONCE(sd->defer_list))
>>  		return;
>>
>> -	spin_lock_irqsave(&sd->defer_lock, flags);
>> +	spin_lock_irq(&sd->defer_lock);
>>  	skb = sd->defer_list;
>>  	sd->defer_list = NULL;
>>  	sd->defer_count = 0;
>> -	spin_unlock_irqrestore(&sd->defer_lock, flags);
>> +	spin_unlock_irq(&sd->defer_lock);
>>
>
> It's also less code and makes it more clear what dependency this section
> has.
>
> Seems ok to me, with the minor nit I think in the commit message:
>
> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>

Thanks for the review.
--Jesper
On 1/18/2023 11:19 AM, Jesper Dangaard Brouer wrote:
>
> On 17/01/2023 20.29, Jacob Keller wrote:
>>
>> On 1/17/2023 4:29 AM, Jesper Dangaard Brouer wrote:
>>> The spin_lock irqsave/restore API variant in skb_defer_free_flush can
>>> be replaced with the faster spin_lock irq variant, which doesn't need
>>> to read and restore the CPU flags.
>>>
>>> Using the unconditional irq "disable/enable" API variant is safe,
>>> because the skb_defer_free_flush() function is only called during
>>> NAPI-RX processing in net_rx_action(), where it is known the IRQs
>>> are enabled.
>>>
>>
>> Did you mean disabled here? If IRQs are enabled that would mean the
>> interrupt could be triggered and we would need to irqsave, no?
>
> I do mean 'enabled' in the text here.
>
> As you can see in net_rx_action() we are allowed to perform code like:
>
> 	local_irq_disable();
> 	list_splice_init(&sd->poll_list, &list);
> 	local_irq_enable();
>
> Disabling local IRQs without saving 'flags' and unconditionally enabling
> local IRQs again. Thus, in skb_defer_free_flush() we can do the same,
> without saving 'flags'. Hope that makes it more clear.
>

Ahh, that makes sense. In that case, no further nits and:

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
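For readers not following along in net/core/dev.c, the exchange above refers to the structure of net_rx_action(). The condensed sketch below is an illustration of that structure around the time of this patch, not a verbatim copy of the upstream function; budget/time-limit handling and re-queueing of unfinished NAPI instances are elided:

	/* Sketch of net_rx_action(): IRQs are disabled only around the
	 * poll_list splice, so the NAPI poll loop, including the call to
	 * skb_defer_free_flush(), runs with local IRQs enabled again.
	 */
	static __latent_entropy void net_rx_action(struct softirq_action *h)
	{
		struct softnet_data *sd = this_cpu_ptr(&softnet_data);
		LIST_HEAD(list);
		LIST_HEAD(repoll);

		local_irq_disable();
		list_splice_init(&sd->poll_list, &list);
		local_irq_enable();		/* IRQs enabled from here on */

		for (;;) {
			struct napi_struct *n;

			skb_defer_free_flush(sd);	/* called with IRQs enabled */

			if (list_empty(&list))
				break;

			n = list_first_entry(&list, struct napi_struct, poll_list);
			napi_poll(n, &repoll);
			/* budget and time_limit handling elided */
		}
		/* re-queueing of unfinished NAPI instances elided */
	}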
diff --git a/net/core/dev.c b/net/core/dev.c
index cf78f35bc0b9..9c60190fe352 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6616,17 +6616,16 @@ static int napi_threaded_poll(void *data)
 static void skb_defer_free_flush(struct softnet_data *sd)
 {
 	struct sk_buff *skb, *next;
-	unsigned long flags;
 
 	/* Paired with WRITE_ONCE() in skb_attempt_defer_free() */
 	if (!READ_ONCE(sd->defer_list))
 		return;
 
-	spin_lock_irqsave(&sd->defer_lock, flags);
+	spin_lock_irq(&sd->defer_lock);
 	skb = sd->defer_list;
 	sd->defer_list = NULL;
 	sd->defer_count = 0;
-	spin_unlock_irqrestore(&sd->defer_lock, flags);
+	spin_unlock_irq(&sd->defer_lock);
 
 	while (skb != NULL) {
 		next = skb->next;
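The diff hinges on a well-known distinction between the two lock APIs: spin_lock_irqsave() records the caller's IRQ state in 'flags' and spin_unlock_irqrestore() puts back exactly that state, so the pair is safe regardless of the caller's context, whereas spin_lock_irq()/spin_unlock_irq() unconditionally disables and then re-enables local IRQs and is only correct when the caller is known to run with IRQs enabled. A minimal illustrative snippet (hypothetical lock and functions, not taken from this patch):

	static DEFINE_SPINLOCK(example_lock);	/* hypothetical lock */

	/* Caller IRQ state unknown (could even be hard-IRQ context):
	 * flags must be saved and restored.
	 */
	static void locked_from_any_context(void)
	{
		unsigned long flags;

		spin_lock_irqsave(&example_lock, flags);
		/* critical section */
		spin_unlock_irqrestore(&example_lock, flags);
	}

	/* Caller known to run with local IRQs enabled, as
	 * skb_defer_free_flush() is when called from net_rx_action():
	 * the cheaper unconditional variant is sufficient.
	 */
	static void locked_with_irqs_known_enabled(void)
	{
		spin_lock_irq(&example_lock);
		/* critical section */
		spin_unlock_irq(&example_lock);
	}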
The spin_lock irqsave/restore API variant in skb_defer_free_flush can
be replaced with the faster spin_lock irq variant, which doesn't need
to read and restore the CPU flags.

Using the unconditional irq "disable/enable" API variant is safe,
because the skb_defer_free_flush() function is only called during
NAPI-RX processing in net_rx_action(), where it is known the IRQs
are enabled.

Expected gain is 14 cycles from avoiding reading and restoring CPU
flags in a spin_lock_irqsave/restore operation, measured via a
microbenchmark kernel module[1] on CPU E5-1650 v4 @ 3.60GHz.

Microbenchmark overhead of spin_lock+unlock:
 - spin_lock_unlock_irq cost: 34 cycles(tsc) 9.486 ns
 - spin_lock_unlock_irqsave cost: 48 cycles(tsc) 13.567 ns

We don't expect to see a measurable packet performance gain, as
skb_defer_free_flush() is called infrequently once per NIC device NAPI
bulk cycle and conditionally only if SKBs have been deferred by other
CPUs via skb_attempt_defer_free().

[1] https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/lib/time_bench_sample.c

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 net/core/dev.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)
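For reference, the cycle counts above come from the time_bench_sample module in [1]. The sketch below is a minimal, hypothetical kernel module showing the same kind of measurement; it is not the code from [1], and the loop count, use of get_cycles() and reporting format are illustrative assumptions only:

	/* Hypothetical micro-benchmark module, NOT the actual
	 * time_bench_sample.c from [1]; it only illustrates how the
	 * per-operation cost of the two lock variants can be compared.
	 */
	#include <linux/module.h>
	#include <linux/spinlock.h>
	#include <linux/timex.h>	/* get_cycles() */

	static DEFINE_SPINLOCK(bench_lock);

	static void bench_lock_variants(void)
	{
		const int loops = 100000;
		unsigned long flags;
		cycles_t t0, t1, t2;
		int i;

		preempt_disable();

		t0 = get_cycles();
		for (i = 0; i < loops; i++) {
			spin_lock_irq(&bench_lock);
			spin_unlock_irq(&bench_lock);
		}
		t1 = get_cycles();
		for (i = 0; i < loops; i++) {
			spin_lock_irqsave(&bench_lock, flags);
			spin_unlock_irqrestore(&bench_lock, flags);
		}
		t2 = get_cycles();

		preempt_enable();

		pr_info("spin_lock_unlock_irq:     %llu cycles/op\n",
			(unsigned long long)(t1 - t0) / loops);
		pr_info("spin_lock_unlock_irqsave: %llu cycles/op\n",
			(unsigned long long)(t2 - t1) / loops);
	}

	static int __init bench_init(void)
	{
		bench_lock_variants();
		return 0;
	}

	static void __exit bench_exit(void)
	{
	}

	module_init(bench_init);
	module_exit(bench_exit);
	MODULE_LICENSE("GPL");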