Message ID | 20240823173103.94978-4-jdamato@fastly.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | Suspend IRQs during application busy periods | expand |
On Fri, Aug 23, 2024 at 7:31 PM Joe Damato <jdamato@fastly.com> wrote: > > From: Martin Karsten <mkarsten@uwaterloo.ca> > > The napi_suspend_irqs routine bootstraps irq suspension by elongating > the defer timeout to irq_suspend_timeout. > > The napi_resume_irqs routine effectly cancels irq suspension by forcing > the napi to be scheduled immediately. > > Signed-off-by: Martin Karsten <mkarsten@uwaterloo.ca> > Co-developed-by: Joe Damato <jdamato@fastly.com> > Signed-off-by: Joe Damato <jdamato@fastly.com> > Tested-by: Joe Damato <jdamato@fastly.com> > Tested-by: Martin Karsten <mkarsten@uwaterloo.ca> > --- You have not CC me on all the patches in the series, making the review harder then necessary.
On Fri, Aug 23, 2024 at 07:56:32PM +0200, Eric Dumazet wrote: > On Fri, Aug 23, 2024 at 7:31 PM Joe Damato <jdamato@fastly.com> wrote: > > > > From: Martin Karsten <mkarsten@uwaterloo.ca> > > > > The napi_suspend_irqs routine bootstraps irq suspension by elongating > > the defer timeout to irq_suspend_timeout. > > > > The napi_resume_irqs routine effectly cancels irq suspension by forcing > > the napi to be scheduled immediately. > > > > Signed-off-by: Martin Karsten <mkarsten@uwaterloo.ca> > > Co-developed-by: Joe Damato <jdamato@fastly.com> > > Signed-off-by: Joe Damato <jdamato@fastly.com> > > Tested-by: Joe Damato <jdamato@fastly.com> > > Tested-by: Martin Karsten <mkarsten@uwaterloo.ca> > > --- > > You have not CC me on all the patches in the series, making the review > harder then necessary. My sincere apologies, Eric, and thank you for your time reviewing this. I used a script I'd been using for a while to generate the CC list, but it clearly has a bug. For any future revisions I will be sure to explicitly include you.
On Fri, Aug 23, 2024 at 7:31 PM Joe Damato <jdamato@fastly.com> wrote: > > From: Martin Karsten <mkarsten@uwaterloo.ca> > > The napi_suspend_irqs routine bootstraps irq suspension by elongating > the defer timeout to irq_suspend_timeout. > > The napi_resume_irqs routine effectly cancels irq suspension by forcing > the napi to be scheduled immediately. > > Signed-off-by: Martin Karsten <mkarsten@uwaterloo.ca> > Co-developed-by: Joe Damato <jdamato@fastly.com> > Signed-off-by: Joe Damato <jdamato@fastly.com> > Tested-by: Joe Damato <jdamato@fastly.com> > Tested-by: Martin Karsten <mkarsten@uwaterloo.ca> > --- > include/net/busy_poll.h | 3 +++ > net/core/dev.c | 33 +++++++++++++++++++++++++++++++++ > 2 files changed, 36 insertions(+) > > diff --git a/include/net/busy_poll.h b/include/net/busy_poll.h > index 9b09acac538e..f095b2bdeee1 100644 > --- a/include/net/busy_poll.h > +++ b/include/net/busy_poll.h > @@ -52,6 +52,9 @@ void napi_busy_loop_rcu(unsigned int napi_id, > bool (*loop_end)(void *, unsigned long), > void *loop_end_arg, bool prefer_busy_poll, u16 budget); > > +void napi_suspend_irqs(unsigned int napi_id); > +void napi_resume_irqs(unsigned int napi_id); > + > #else /* CONFIG_NET_RX_BUSY_POLL */ > static inline unsigned long net_busy_loop_on(void) > { > diff --git a/net/core/dev.c b/net/core/dev.c > index 74060ba866d4..4de0dfc86e21 100644 > --- a/net/core/dev.c > +++ b/net/core/dev.c > @@ -6507,6 +6507,39 @@ void napi_busy_loop(unsigned int napi_id, > } > EXPORT_SYMBOL(napi_busy_loop); > > +void napi_suspend_irqs(unsigned int napi_id) > +{ > + struct napi_struct *napi; > + > + rcu_read_lock(); > + napi = napi_by_id(napi_id); > + if (napi) { > + unsigned long timeout = READ_ONCE(napi->dev->irq_suspend_timeout); > + > + if (timeout) > + hrtimer_start(&napi->timer, ns_to_ktime(timeout), HRTIMER_MODE_REL_PINNED); > + } > + rcu_read_unlock(); > +} > +EXPORT_SYMBOL(napi_suspend_irqs); > + > +void napi_resume_irqs(unsigned int napi_id) > +{ > + struct napi_struct *napi; > + > + rcu_read_lock(); > + napi = napi_by_id(napi_id); > + if (napi) { > + if (READ_ONCE(napi->dev->irq_suspend_timeout)) { Since we'll read irq_suspend_timeout twice, we could have a situation where the napi_schedule() will not be done if another thread changes irq_suspend_timeout ? If this is fine, a comment would be nice :) The thing is that the kernel can not trust the user (think of syzbot) > + local_bh_disable(); > + napi_schedule(napi); > + local_bh_enable(); > + } > + } > + rcu_read_unlock(); > +} > +EXPORT_SYMBOL(napi_resume_irqs); > + > #endif /* CONFIG_NET_RX_BUSY_POLL */ > > static void napi_hash_add(struct napi_struct *napi) > -- > 2.25.1 >
On 2024-08-23 14:14, Eric Dumazet wrote: > On Fri, Aug 23, 2024 at 7:31 PM Joe Damato <jdamato@fastly.com> wrote: >> >> From: Martin Karsten <mkarsten@uwaterloo.ca> >> >> The napi_suspend_irqs routine bootstraps irq suspension by elongating >> the defer timeout to irq_suspend_timeout. >> >> The napi_resume_irqs routine effectly cancels irq suspension by forcing >> the napi to be scheduled immediately. >> >> Signed-off-by: Martin Karsten <mkarsten@uwaterloo.ca> >> Co-developed-by: Joe Damato <jdamato@fastly.com> >> Signed-off-by: Joe Damato <jdamato@fastly.com> >> Tested-by: Joe Damato <jdamato@fastly.com> >> Tested-by: Martin Karsten <mkarsten@uwaterloo.ca> >> --- >> include/net/busy_poll.h | 3 +++ >> net/core/dev.c | 33 +++++++++++++++++++++++++++++++++ >> 2 files changed, 36 insertions(+) >> >> diff --git a/include/net/busy_poll.h b/include/net/busy_poll.h >> index 9b09acac538e..f095b2bdeee1 100644 >> --- a/include/net/busy_poll.h >> +++ b/include/net/busy_poll.h >> @@ -52,6 +52,9 @@ void napi_busy_loop_rcu(unsigned int napi_id, >> bool (*loop_end)(void *, unsigned long), >> void *loop_end_arg, bool prefer_busy_poll, u16 budget); >> >> +void napi_suspend_irqs(unsigned int napi_id); >> +void napi_resume_irqs(unsigned int napi_id); >> + >> #else /* CONFIG_NET_RX_BUSY_POLL */ >> static inline unsigned long net_busy_loop_on(void) >> { >> diff --git a/net/core/dev.c b/net/core/dev.c >> index 74060ba866d4..4de0dfc86e21 100644 >> --- a/net/core/dev.c >> +++ b/net/core/dev.c >> @@ -6507,6 +6507,39 @@ void napi_busy_loop(unsigned int napi_id, >> } >> EXPORT_SYMBOL(napi_busy_loop); >> >> +void napi_suspend_irqs(unsigned int napi_id) >> +{ >> + struct napi_struct *napi; >> + >> + rcu_read_lock(); >> + napi = napi_by_id(napi_id); >> + if (napi) { >> + unsigned long timeout = READ_ONCE(napi->dev->irq_suspend_timeout); >> + >> + if (timeout) >> + hrtimer_start(&napi->timer, ns_to_ktime(timeout), HRTIMER_MODE_REL_PINNED); >> + } >> + rcu_read_unlock(); >> +} >> +EXPORT_SYMBOL(napi_suspend_irqs); >> + >> +void napi_resume_irqs(unsigned int napi_id) >> +{ >> + struct napi_struct *napi; >> + >> + rcu_read_lock(); >> + napi = napi_by_id(napi_id); >> + if (napi) { >> + if (READ_ONCE(napi->dev->irq_suspend_timeout)) { > > > Since we'll read irq_suspend_timeout twice, we could have a situation > where the napi_schedule() will not be done > if another thread changes irq_suspend_timeout ? > > If this is fine, a comment would be nice :) > > The thing is that the kernel can not trust the user (think of syzbot) Yes, this should be fine. Calling napi_resume_irqs is done to restart irq processing right away. In case irq_suspend_timeout is set to 0 between suspend and resume, the original value of irq_suspend_timeout (when napi_suspend_irqs was called) determines the safety timeout as intended and the watchdog will restart irq processing. We will a add comment to make this clear. Thanks, Martin
diff --git a/include/net/busy_poll.h b/include/net/busy_poll.h index 9b09acac538e..f095b2bdeee1 100644 --- a/include/net/busy_poll.h +++ b/include/net/busy_poll.h @@ -52,6 +52,9 @@ void napi_busy_loop_rcu(unsigned int napi_id, bool (*loop_end)(void *, unsigned long), void *loop_end_arg, bool prefer_busy_poll, u16 budget); +void napi_suspend_irqs(unsigned int napi_id); +void napi_resume_irqs(unsigned int napi_id); + #else /* CONFIG_NET_RX_BUSY_POLL */ static inline unsigned long net_busy_loop_on(void) { diff --git a/net/core/dev.c b/net/core/dev.c index 74060ba866d4..4de0dfc86e21 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -6507,6 +6507,39 @@ void napi_busy_loop(unsigned int napi_id, } EXPORT_SYMBOL(napi_busy_loop); +void napi_suspend_irqs(unsigned int napi_id) +{ + struct napi_struct *napi; + + rcu_read_lock(); + napi = napi_by_id(napi_id); + if (napi) { + unsigned long timeout = READ_ONCE(napi->dev->irq_suspend_timeout); + + if (timeout) + hrtimer_start(&napi->timer, ns_to_ktime(timeout), HRTIMER_MODE_REL_PINNED); + } + rcu_read_unlock(); +} +EXPORT_SYMBOL(napi_suspend_irqs); + +void napi_resume_irqs(unsigned int napi_id) +{ + struct napi_struct *napi; + + rcu_read_lock(); + napi = napi_by_id(napi_id); + if (napi) { + if (READ_ONCE(napi->dev->irq_suspend_timeout)) { + local_bh_disable(); + napi_schedule(napi); + local_bh_enable(); + } + } + rcu_read_unlock(); +} +EXPORT_SYMBOL(napi_resume_irqs); + #endif /* CONFIG_NET_RX_BUSY_POLL */ static void napi_hash_add(struct napi_struct *napi)