Message ID | 87wnrs9tvp.ffs@nanos.tec.linutronix.de (mailing list archive) |
---|---|
State | Not Applicable |
Delegated to: | Netdev Maintainers |
Series | genirq: Provide new interfaces for affinity hints |
Context | Check | Description |
---|---|---|
netdev/tree_selection | success | Not a local patch |
On Fri, May 21, 2021 at 7:48 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> The discussion about removing the side effect of irq_set_affinity_hint() of
> actually applying the cpumask (if not NULL) as affinity to the interrupt,
> unearthed a few unpleasantries:
>
>   1) The modular perf drivers rely on the current behaviour for the very
>      wrong reasons.
>
>   2) While none of the other drivers prevents user space from changing
>      the affinity, a cursorily inspection shows that there are at least
>      expectations in some drivers.
>
> #1 needs to be cleaned up anyway, so that's not a problem
>
> #2 might result in subtle regressions especially when irqbalanced (which
>    nowadays ignores the affinity hint) is disabled.
>
> Provide new interfaces:
>
>   irq_update_affinity_hint() - Only sets the affinity hint pointer
>   irq_apply_affinity_hint()  - Set the pointer and apply the affinity to
>                                the interrupt
>
> Make irq_set_affinity_hint() a wrapper around irq_apply_affinity_hint() and
> document it to be phased out.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Link: https://lore.kernel.org/r/20210501021832.743094-1-jesse.brandeburg@intel.com
> ---
> Applies on:
>    git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git irq/core
> ---
>  include/linux/interrupt.h |   41 ++++++++++++++++++++++++++++++++++++++++-
>  kernel/irq/manage.c       |    8 ++++----
>  2 files changed, 44 insertions(+), 5 deletions(-)
>
> --- a/include/linux/interrupt.h
> +++ b/include/linux/interrupt.h
> @@ -328,7 +328,46 @@ extern int irq_force_affinity(unsigned i
>  extern int irq_can_set_affinity(unsigned int irq);
>  extern int irq_select_affinity(unsigned int irq);
>
> -extern int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m);
> +extern int __irq_apply_affinity_hint(unsigned int irq, const struct cpumask *m,
> +				     bool setaffinity);
> +
> +/**
> + * irq_update_affinity_hint - Update the affinity hint
> + * @irq:	Interrupt to update
> + * @cpumask:	cpumask pointer (NULL to clear the hint)
> + *
> + * Updates the affinity hint, but does not change the affinity of the interrupt.
> + */
> +static inline int
> +irq_update_affinity_hint(unsigned int irq, const struct cpumask *m)
> +{
> +	return __irq_apply_affinity_hint(irq, m, true);
> +}

Should it be:
return __irq_apply_affinity_hint(irq, m, false);
here?

> +
> +/**
> + * irq_apply_affinity_hint - Update the affinity hint and apply the provided
> + *			     cpumask to the interrupt
> + * @irq:	Interrupt to update
> + * @cpumask:	cpumask pointer (NULL to clear the hint)
> + *
> + * Updates the affinity hint and if @cpumask is not NULL it applies it as
> + * the affinity of that interrupt.
> + */
> +static inline int
> +irq_apply_affinity_hint(unsigned int irq, const struct cpumask *m)
> +{
> +	return __irq_apply_affinity_hint(irq, m, true);
> +}
> +
> +/*
> + * Deprecated. Use irq_update_affinity_hint() or irq_apply_affinity_hint()
> + * instead.
> + */
> +static inline int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m)
> +{
> +	return irq_apply_affinity_hint(irq, cpumask);
> +}
> +
>  extern int irq_update_affinity_desc(unsigned int irq,
>  				    struct irq_affinity_desc *affinity);
>
> --- a/kernel/irq/manage.c
> +++ b/kernel/irq/manage.c
> @@ -487,7 +487,8 @@ int irq_force_affinity(unsigned int irq,
>  }
>  EXPORT_SYMBOL_GPL(irq_force_affinity);
>
> -int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m)
> +int __irq_apply_affinity_hint(unsigned int irq, const struct cpumask *m,
> +			      bool setaffinity)
>  {
>  	unsigned long flags;
>  	struct irq_desc *desc = irq_get_desc_lock(irq, &flags, IRQ_GET_DESC_CHECK_GLOBAL);
> @@ -496,12 +497,11 @@ int irq_set_affinity_hint(unsigned int i
>  		return -EINVAL;
>  	desc->affinity_hint = m;
>  	irq_put_desc_unlock(desc, flags);
> -	/* set the initial affinity to prevent every interrupt being on CPU0 */
> -	if (m)
> +	if (m && setaffinity)
>  		__irq_set_affinity(irq, m, false);
>  	return 0;
>  }
> -EXPORT_SYMBOL_GPL(irq_set_affinity_hint);
> +EXPORT_SYMBOL_GPL(__irq_apply_affinity_hint);
>
>  static void irq_affinity_notify(struct work_struct *work)
>  {

On Fri, May 21, 2021 at 8:03 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> Provide new interfaces:
>
>   irq_update_affinity_hint() - Only sets the affinity hint pointer
>   irq_apply_affinity_hint()  - Set the pointer and apply the affinity to
>                                the interrupt
>

Any reason why you ruled out the usage of irq_set_affinity_and_hint()?
IMHO the latter makes it very clear what the function is meant to do.

> Make irq_set_affinity_hint() a wrapper around irq_apply_affinity_hint() and
> document it to be phased out.

Right, so eventually we will be left with only the following APIs for
drivers to use:

irq_set_affinity() - for drivers that only want to set the affinity mask
irq_apply_affinity_hint()/irq_set_affinity_and_hint() - for drivers that want
to set the same affinity and hint mask
irq_update_affinity_hint() - for drivers that only want to update the hint mask

Thanks for clearing this.

--
Nitesh

On Fri, May 21 2021 at 10:45, Lijun Pan wrote:
> On Fri, May 21, 2021 at 7:48 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>> +/**
>> + * irq_update_affinity_hint - Update the affinity hint
>> + * @irq:	Interrupt to update
>> + * @cpumask:	cpumask pointer (NULL to clear the hint)
>> + *
>> + * Updates the affinity hint, but does not change the affinity of the interrupt.
>> + */
>> +static inline int
>> +irq_update_affinity_hint(unsigned int irq, const struct cpumask *m)
>> +{
>> +	return __irq_apply_affinity_hint(irq, m, true);
>> +}
>
> Should it be:
> return __irq_apply_affinity_hint(irq, m, false);
> here?

Of course. Copy & Pasta should be forbidden.

Thanks for spotting it!

       tglx

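The difference between the two wrappers, with the `false` fix confirmed above applied, can be illustrated with a small userspace model. This is only a sketch, not kernel code: `struct model_irq` and every `model_*` name are hypothetical stand-ins, and a single `unsigned long` plays the role of a cpumask.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the two irq_desc fields involved. */
struct model_irq {
	const unsigned long *affinity_hint;	/* pointer exposed as the hint */
	unsigned long affinity;			/* mask the interrupt is routed to */
};

/* Same shape as __irq_apply_affinity_hint() in the patch (assumed model). */
int model_apply_hint(struct model_irq *irq, const unsigned long *m,
		     bool setaffinity)
{
	if (!irq)
		return -1;	/* the kernel function returns -EINVAL here */
	irq->affinity_hint = m;
	if (m && setaffinity)
		irq->affinity = *m;
	return 0;
}

/* Hint only: passes false, so the routed mask is left untouched. */
int model_update_affinity_hint(struct model_irq *irq, const unsigned long *m)
{
	return model_apply_hint(irq, m, false);
}

/* Hint and affinity: passes true, so a non-NULL mask is also applied. */
int model_apply_affinity_hint(struct model_irq *irq, const unsigned long *m)
{
	return model_apply_hint(irq, m, true);
}
```

With the copy-and-paste bug left in (both wrappers passing `true`), the two model functions would behave identically, which is presumably why the mistake is easy to miss in review.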
On Fri, May 21 2021 at 12:13, Nitesh Lal wrote:
> On Fri, May 21, 2021 at 8:03 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>> Provide new interfaces:
>>
>>   irq_update_affinity_hint() - Only sets the affinity hint pointer
>>   irq_apply_affinity_hint()  - Set the pointer and apply the affinity to
>>                                the interrupt
>>
>
> Any reason why you ruled out the usage of irq_set_affinity_and_hint()?
> IMHO the latter makes it very clear what the function is meant to do.

You're right. I was trying to phase the existing hint setter out, but
that's probably pointless overengineering for no real value. Let me redo
that.

Thanks,

       tglx

Hi,

On Fri, May 21, 2021 at 02:03:06PM +0200, Thomas Gleixner wrote:
> Provide new interfaces:
>
>   irq_update_affinity_hint() - Only sets the affinity hint pointer
>   irq_apply_affinity_hint()  - Set the pointer and apply the affinity to
>                                the interrupt
>
> Make irq_set_affinity_hint() a wrapper around irq_apply_affinity_hint() and
> document it to be phased out.

Is there a recommended way to retrieve the CPU number that the interrupt has
affinity to?

Previously a driver (I'm looking at drivers/net/ethernet/amazon/ena) that
uses irq_set_affinity_hint() to spread out IRQs knows the corresponding CPU
number, since it uses its own spreading scheme. Now, phasing out
irq_set_affinity_hint(), and thus relying on request_irq() to spread the
load instead, there doesn't seem to be an easy way to get the CPU number.

In theory the following could work, but including irq.h does not look like a
good idea, given that the comment in it explicitly asks that it not be
included in generic code.

    #include <linux/irq.h>

    request_irq(irq, ...);
    struct irq_data *data = irq_get_irq_data(irq);
    struct cpumask *mask = irq_data_get_effective_affinity_mask(data);
    int cpu = cpumask_first(mask);

Any suggestions?

Thanks,
Shung-Hsi

On Thu, May 27, 2021 at 06:03:29PM +0800, Shung-Hsi Yu wrote:
> Is there a recommended way to retrieve the CPU number that the interrupt has
> affinity to?
>
> Previously a driver (I'm looking at drivers/net/ethernet/amazon/ena) that
> uses irq_set_affinity_hint() to spread out IRQs knows the corresponding CPU
> number, since it uses its own spreading scheme. Now, phasing out
> irq_set_affinity_hint(), and thus relying on request_irq() to spread the
> load instead, there doesn't seem to be an easy way to get the CPU number.

I should add that the main use case for retrieving the CPU number seems to be
ensuring that memory is allocated on the same NUMA node that serves the
interrupt (again, looking at the ena driver only, I haven't checked others
yet).

    int cpu = i % num_online_cpus();
    cpumask_set_cpu(cpu, &affinity_hint_mask);
    request_irq(irq, ...);
    irq_set_affinity_hint(irq, &affinity_hint_mask);

    int node = cpu_to_node(cpu);
    buffer = vzalloc_node(size, node);

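The NUMA use case described above boils down to "take the first CPU in the interrupt's effective mask, map it to a node, allocate there". A minimal userspace sketch of that lookup follows. It makes several assumptions that are not from the thread: a 64-bit integer stands in for `struct cpumask`, the topology is a made-up one where a fixed number of consecutive CPUs share a node, and all `model_*` names are hypothetical. It says nothing about which in-kernel interface a driver should use to obtain the effective mask, which is the open question in this thread.

```c
#include <stdint.h>

#define MODEL_NR_CPUS 64

/*
 * Lowest set bit in the mask, or MODEL_NR_CPUS if the mask is empty.
 * This mirrors the cpumask_first() convention of returning a value
 * >= nr_cpu_ids when no CPU is set.
 */
unsigned int model_cpumask_first(uint64_t mask)
{
	unsigned int cpu;

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (mask & (UINT64_C(1) << cpu))
			return cpu;
	return MODEL_NR_CPUS;
}

/* Assumed topology: cpus_per_node consecutive CPUs belong to one node. */
int model_cpu_to_node(unsigned int cpu, unsigned int cpus_per_node)
{
	if (cpu >= MODEL_NR_CPUS || cpus_per_node == 0)
		return -1;	/* plays the role of NUMA_NO_NODE */
	return (int)(cpu / cpus_per_node);
}

/* Node serving the interrupt, derived from its effective affinity mask. */
int model_irq_node(uint64_t effective_mask, unsigned int cpus_per_node)
{
	return model_cpu_to_node(model_cpumask_first(effective_mask),
				 cpus_per_node);
}
```

The empty-mask case is handled explicitly because an effective mask is not guaranteed to be populated at every point in time; a real driver would need a fallback node in that situation.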
On Thu, May 27, 2021 at 6:03 AM Shung-Hsi Yu <shung-hsi.yu@suse.com> wrote:
>
> Previously a driver (I'm looking at drivers/net/ethernet/amazon/ena) that
> uses irq_set_affinity_hint() to spread out IRQs knows the corresponding CPU
> number, since it uses its own spreading scheme. Now, phasing out
> irq_set_affinity_hint(), and thus relying on request_irq() to spread the
> load instead, there doesn't seem to be an easy way to get the CPU number.
>

Drivers that don't want to rely on request_irq() for spreading and want to
force their own affinity mask can use irq_set_affinity(), which is an
exported interface now [1] and clearly indicates the purpose of the usage.

As Thomas suggested, we are still keeping irq_set_affinity_hint() as a
wrapper until we make appropriate changes in the individual drivers that use
this API for different reasons. Please feel free to send out a patch for
this driver once the changes are merged.

[1] https://lkml.org/lkml/2021/5/18/271

On Thu, May 27, 2021 at 09:06:04AM -0400, Nitesh Lal wrote:
> On Thu, May 27, 2021 at 6:03 AM Shung-Hsi Yu <shung-hsi.yu@suse.com> wrote:
> > Now, phasing out irq_set_affinity_hint(), and thus relying on
> > request_irq() to spread the load instead, there doesn't seem to be an
> > easy way to get the CPU number.
> >
>
> Drivers that don't want to rely on request_irq() for spreading and want
> to force their own affinity mask can use irq_set_affinity()

I *do* want the driver to rely on request_irq() for spreading. It is
retrieving the effective affinity after the request_irq() call that I can't
seem to figure out.

Thanks,
Shung-Hsi

On Fri, May 21, 2021 at 5:48 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Fri, May 21 2021 at 12:13, Nitesh Lal wrote:
> > On Fri, May 21, 2021 at 8:03 AM Thomas Gleixner <tglx@linutronix.de> wrote:
> >> Provide new interfaces:
> >>
> >>   irq_update_affinity_hint() - Only sets the affinity hint pointer
> >>   irq_apply_affinity_hint()  - Set the pointer and apply the affinity to
> >>                                the interrupt
> >>
> >
> > Any reason why you ruled out the usage of irq_set_affinity_and_hint()?
> > IMHO the latter makes it very clear what the function is meant to do.
>
> You're right. I was trying to phase the existing hint setter out, but
> that's probably pointless overengineering for no real value. Let me redo
> that.
>

Thomas, are you planning to send a v2 for this soon or did I somehow miss
it?

Since your other patch "genirq: Export affinity setter for modules" is
already in linux-next, I have started looking into the drivers where we can
use that.

On thinking about this whole chunk a little more, I do wonder about the
reason why we are still sticking with the hints.

The two reasons that I could come up with are:
- We are not entirely sure if irqbalance still consumes this or not
- For future use by some other userspace daemon (?)

Does that sound right?

On Fri, May 21, 2021 at 8:03 AM Thomas Gleixner <tglx@linutronix.de> wrote:
> +/*
> + * Deprecated. Use irq_update_affinity_hint() or irq_apply_affinity_hint()
> + * instead.
> + */
> +static inline int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m)
> +{
> +	return irq_apply_affinity_hint(irq, cpumask);

Another change required here, the above should be 'm' instead of 'cpumask'.

> +}

On Mon, Jun 7, 2021 at 1:00 PM Nitesh Lal <nilal@redhat.com> wrote:
>
> On Fri, May 21, 2021 at 8:03 AM Thomas Gleixner <tglx@linutronix.de> wrote:
> > +/*
> > + * Deprecated. Use irq_update_affinity_hint() or irq_apply_affinity_hint()
> > + * instead.
> > + */
> > +static inline int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m)
> > +{
> > +	return irq_apply_affinity_hint(irq, cpumask);
>
> Another change required here, the above should be 'm' instead of 'cpumask'.

I am going to make the suggested changes to this patch and will post it
with the driver patches. Please let me know if there are any objections to
that.

--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -328,7 +328,46 @@ extern int irq_force_affinity(unsigned i
 extern int irq_can_set_affinity(unsigned int irq);
 extern int irq_select_affinity(unsigned int irq);
 
-extern int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m);
+extern int __irq_apply_affinity_hint(unsigned int irq, const struct cpumask *m,
+				     bool setaffinity);
+
+/**
+ * irq_update_affinity_hint - Update the affinity hint
+ * @irq:	Interrupt to update
+ * @cpumask:	cpumask pointer (NULL to clear the hint)
+ *
+ * Updates the affinity hint, but does not change the affinity of the interrupt.
+ */
+static inline int
+irq_update_affinity_hint(unsigned int irq, const struct cpumask *m)
+{
+	return __irq_apply_affinity_hint(irq, m, true);
+}
+
+/**
+ * irq_apply_affinity_hint - Update the affinity hint and apply the provided
+ *			     cpumask to the interrupt
+ * @irq:	Interrupt to update
+ * @cpumask:	cpumask pointer (NULL to clear the hint)
+ *
+ * Updates the affinity hint and if @cpumask is not NULL it applies it as
+ * the affinity of that interrupt.
+ */
+static inline int
+irq_apply_affinity_hint(unsigned int irq, const struct cpumask *m)
+{
+	return __irq_apply_affinity_hint(irq, m, true);
+}
+
+/*
+ * Deprecated. Use irq_update_affinity_hint() or irq_apply_affinity_hint()
+ * instead.
+ */
+static inline int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m)
+{
+	return irq_apply_affinity_hint(irq, cpumask);
+}
+
 extern int irq_update_affinity_desc(unsigned int irq,
 				    struct irq_affinity_desc *affinity);
 
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -487,7 +487,8 @@ int irq_force_affinity(unsigned int irq,
 }
 EXPORT_SYMBOL_GPL(irq_force_affinity);
 
-int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m)
+int __irq_apply_affinity_hint(unsigned int irq, const struct cpumask *m,
+			      bool setaffinity)
 {
 	unsigned long flags;
 	struct irq_desc *desc = irq_get_desc_lock(irq, &flags, IRQ_GET_DESC_CHECK_GLOBAL);
@@ -496,12 +497,11 @@ int irq_set_affinity_hint(unsigned int i
 		return -EINVAL;
 	desc->affinity_hint = m;
 	irq_put_desc_unlock(desc, flags);
-	/* set the initial affinity to prevent every interrupt being on CPU0 */
-	if (m)
+	if (m && setaffinity)
 		__irq_set_affinity(irq, m, false);
 	return 0;
 }
-EXPORT_SYMBOL_GPL(irq_set_affinity_hint);
+EXPORT_SYMBOL_GPL(__irq_apply_affinity_hint);
 
 static void irq_affinity_notify(struct work_struct *work)
 {
The discussion about removing the side effect of irq_set_affinity_hint() of
actually applying the cpumask (if not NULL) as affinity to the interrupt,
unearthed a few unpleasantries:

  1) The modular perf drivers rely on the current behaviour for the very
     wrong reasons.

  2) While none of the other drivers prevents user space from changing
     the affinity, a cursory inspection shows that there are at least
     expectations in some drivers.

#1 needs to be cleaned up anyway, so that's not a problem.

#2 might result in subtle regressions especially when irqbalanced (which
   nowadays ignores the affinity hint) is disabled.

Provide new interfaces:

  irq_update_affinity_hint() - Only sets the affinity hint pointer
  irq_apply_affinity_hint()  - Set the pointer and apply the affinity to
                               the interrupt

Make irq_set_affinity_hint() a wrapper around irq_apply_affinity_hint() and
document it to be phased out.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210501021832.743094-1-jesse.brandeburg@intel.com
---
Applies on:
   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git irq/core
---
 include/linux/interrupt.h |   41 ++++++++++++++++++++++++++++++++++++++++-
 kernel/irq/manage.c       |    8 ++++----
 2 files changed, 44 insertions(+), 5 deletions(-)
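
For anyone following along, the intended split between the two new helpers
can be sketched in plain userspace C. This is not kernel code and only a
rough model: struct cpumask is reduced to a single bitmask word, mock_irqs[]
stands in for the irq descriptors, locking is left out, and every mock_*
name is hypothetical. It also assumes the two corrections raised in review
are applied, i.e. the update helper passes 'false' and the deprecated
wrapper forwards 'm'.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in: the kernel's struct cpumask is a bitmap, not one word. */
struct cpumask {
	unsigned long bits;
};

/* Hypothetical stand-in for the two irq_desc fields this patch touches. */
struct mock_irq {
	const struct cpumask *affinity_hint;	/* pointer only, like desc->affinity_hint */
	unsigned long affinity;			/* effective affinity, modeled as a bitmask */
};

#define MOCK_NR_IRQS 4
struct mock_irq mock_irqs[MOCK_NR_IRQS];

/* Models __irq_apply_affinity_hint(): always store the hint, optionally apply it. */
int mock_apply_hint_common(unsigned int irq, const struct cpumask *m,
			   bool setaffinity)
{
	if (irq >= MOCK_NR_IRQS)
		return -EINVAL;

	mock_irqs[irq].affinity_hint = m;
	if (m && setaffinity)
		mock_irqs[irq].affinity = m->bits;
	return 0;
}

/* Models irq_update_affinity_hint(): hint pointer only, affinity untouched. */
int mock_irq_update_affinity_hint(unsigned int irq, const struct cpumask *m)
{
	return mock_apply_hint_common(irq, m, false);
}

/* Models irq_apply_affinity_hint(): hint pointer plus affinity. */
int mock_irq_apply_affinity_hint(unsigned int irq, const struct cpumask *m)
{
	return mock_apply_hint_common(irq, m, true);
}

/* Models the deprecated irq_set_affinity_hint() wrapper, forwarding 'm'. */
int mock_irq_set_affinity_hint(unsigned int irq, const struct cpumask *m)
{
	return mock_irq_apply_affinity_hint(irq, m);
}
```

In this model, mock_irq_update_affinity_hint(1, &mask) records the pointer
and leaves mock_irqs[1].affinity alone, while
mock_irq_apply_affinity_hint(1, &mask) also copies mask.bits into it.
Passing NULL clears the hint and leaves the affinity untouched in both
cases, which mirrors the 'if (m && setaffinity)' check in the patch.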