
[v4,06/13] x86/IOMMU: don't restrict IRQ affinities to online CPUs

Message ID 923083ba-66f9-a88b-8909-a2f5e2808a69@suse.com (mailing list archive)
State New, archived
Series [v4,01/13] x86/IRQ: deal with move-in-progress state in fixup_irqs()

Commit Message

Jan Beulich July 16, 2019, 7:40 a.m. UTC
In line with "x86/IRQ: desc->affinity should strictly represent the
requested value" the internally used IRQ(s) also shouldn't be restricted
to online ones. Make set_desc_affinity() (set_msi_affinity() then does
by implication) cope with a NULL mask being passed (just like
assign_irq_vector() does), and have IOMMU code pass NULL instead of
&cpu_online_map (when, for VT-d, there's no NUMA node information
available).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v4: New.
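The new NULL-mask semantics described above can be sketched with a toy model. This is a minimal, hypothetical illustration only: it uses 8-bit masks and invented names (`irq_model`, `set_affinity_sketch`, `ONLINE_MASK`) in place of Xen's `cpumask_t`, `irq_desc`, and `BAD_APICID`, and it omits the vector-assignment step entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy 8-CPU model: bit i represents CPU i. */
#define ONLINE_MASK 0x0fu            /* CPUs 0-3 online */
#define ALL_CPUS    0xffu

struct irq_model {
    uint32_t affinity;   /* requested affinity (cf. desc->affinity) */
    uint32_t cpu_mask;   /* CPUs the vector is bound to (cf. desc->arch.cpu_mask) */
};

/*
 * Sketch of the reworked logic: a NULL mask means "no restriction",
 * so the recorded affinity becomes "all CPUs", while the destination
 * mask is still clipped to online CPUs at the end.  Returns 0 as a
 * stand-in for BAD_APICID on a mask with no online CPUs.
 */
static uint32_t set_affinity_sketch(struct irq_model *d, const uint32_t *mask)
{
    uint32_t dest;

    if ( mask && !(*mask & ONLINE_MASK) )
        return 0;                     /* requested CPUs are all offline */

    if ( mask )
    {
        d->affinity = *mask;          /* cpumask_copy(desc->affinity, mask) */
        dest = *mask & d->cpu_mask;   /* cpumask_and(&dest_mask, mask, ...) */
    }
    else
    {
        d->affinity = ALL_CPUS;       /* cpumask_setall(desc->affinity) */
        dest = d->cpu_mask;           /* cpumask_copy(&dest_mask, ...) */
    }

    return dest & ONLINE_MASK;        /* final clip to online CPUs */
}
```

With a NULL mask the recorded affinity covers every CPU, yet the delivered destination stays within the online map, which is exactly the split the patch is after.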

Comments

Roger Pau Monné July 16, 2019, 9:12 a.m. UTC | #1
On Tue, Jul 16, 2019 at 07:40:57AM +0000, Jan Beulich wrote:
> In line with "x86/IRQ: desc->affinity should strictly represent the
> requested value" the internally used IRQ(s) also shouldn't be restricted
> to online ones. Make set_desc_affinity() (set_msi_affinity() then does
> by implication) cope with a NULL mask being passed (just like
> assign_irq_vector() does), and have IOMMU code pass NULL instead of
> &cpu_online_map (when, for VT-d, there's no NUMA node information
> available).
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

LGTM, just one patch style comment and one code comment:

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

> ---
> v4: New.
> 
> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -796,18 +796,26 @@ unsigned int set_desc_affinity(struct ir
>       unsigned long flags;
>       cpumask_t dest_mask;
>   
> -    if (!cpumask_intersects(mask, &cpu_online_map))
> +    if ( mask && !cpumask_intersects(mask, &cpu_online_map) )
>           return BAD_APICID;
>   
>       spin_lock_irqsave(&vector_lock, flags);
> -    ret = _assign_irq_vector(desc, mask);
> +    ret = _assign_irq_vector(desc, mask ?: TARGET_CPUS);
>       spin_unlock_irqrestore(&vector_lock, flags);

I think the patch is somehow mangled at least on my end, there's one
prepended extra space in the non-modified lines AFAICT.

>   
> -    if (ret < 0)
> +    if ( ret < 0 )
>           return BAD_APICID;
>   
> -    cpumask_copy(desc->affinity, mask);

AFAICT you could also avoid the if and just do the same as in the
assign_irq_vector call above and pass TARGET_CPUS if mask is NULL?

Thanks, Roger.
Jan Beulich July 16, 2019, 10:20 a.m. UTC | #2
On 16.07.2019 11:12, Roger Pau Monné wrote:
> On Tue, Jul 16, 2019 at 07:40:57AM +0000, Jan Beulich wrote:
>> In line with "x86/IRQ: desc->affinity should strictly represent the
>> requested value" the internally used IRQ(s) also shouldn't be restricted
>> to online ones. Make set_desc_affinity() (set_msi_affinity() then does
>> by implication) cope with a NULL mask being passed (just like
>> assign_irq_vector() does), and have IOMMU code pass NULL instead of
>> &cpu_online_map (when, for VT-d, there's no NUMA node information
>> available).
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> LGTM, just one patch style comment and one code comment:
> 
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks.

>> --- a/xen/arch/x86/irq.c
>> +++ b/xen/arch/x86/irq.c
>> @@ -796,18 +796,26 @@ unsigned int set_desc_affinity(struct ir
>>        unsigned long flags;
>>        cpumask_t dest_mask;
>>    
>> -    if (!cpumask_intersects(mask, &cpu_online_map))
>> +    if ( mask && !cpumask_intersects(mask, &cpu_online_map) )
>>            return BAD_APICID;
>>    
>>        spin_lock_irqsave(&vector_lock, flags);
>> -    ret = _assign_irq_vector(desc, mask);
>> +    ret = _assign_irq_vector(desc, mask ?: TARGET_CPUS);
>>        spin_unlock_irqrestore(&vector_lock, flags);
> 
> I think the patch is somehow mangled at least on my end, there's one
> prepended extra space in the non-modified lines AFAICT.

Well, yes, hence the last sentence in the cover letter and the attached
patches there. It is the mail system (more likely server than client)
over here which causes this issue (everywhere for me).

>>    
>> -    if (ret < 0)
>> +    if ( ret < 0 )
>>            return BAD_APICID;
>>    
>> -    cpumask_copy(desc->affinity, mask);
> 
> AFAICT you could also avoid the if and just do the same as in the
> assign_irq_vector call above and pass TARGET_CPUS if mask is NULL?

Are you talking about the if() in context above, or the one you've
stripped (immediately following the last quoted line of the patch)?
For the one in context I don't see how the rest of your remark is
related. For the other one - no, strictly not, as that would be
against the purpose of this change: We specifically want to _not_
restrict desc->affinity to online CPUs only (yet that's what
TARGET_CPUS resolves to).

Jan
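For readers unfamiliar with the `mask ?: TARGET_CPUS` expression in the patch: it relies on the GNU C conditional-with-omitted-operand extension (supported by GCC and Clang), where `a ?: b` yields `a` if it is non-zero/non-NULL and `b` otherwise, evaluating `a` only once. A minimal sketch, with a hypothetical `online_cpus` array standing in for what `TARGET_CPUS` resolves to:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the online-CPU set that TARGET_CPUS resolves to. */
static const int online_cpus[4] = { 0, 1, 2, 3 };

/*
 * GNU extension: "mask ?: online_cpus" is equivalent to
 * "mask ? mask : online_cpus", but evaluates mask only once.
 */
static const int *pick_mask(const int *mask)
{
    return mask ?: online_cpus;
}
```

This is what lets `_assign_irq_vector()` still receive an online-restricted mask for vector assignment even when the caller passed NULL, while `desc->affinity` itself is no longer narrowed.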
Roger Pau Monné July 16, 2019, 11:17 a.m. UTC | #3
On Tue, Jul 16, 2019 at 10:20:10AM +0000, Jan Beulich wrote:
> On 16.07.2019 11:12, Roger Pau Monné wrote:
> > On Tue, Jul 16, 2019 at 07:40:57AM +0000, Jan Beulich wrote:
> >> In line with "x86/IRQ: desc->affinity should strictly represent the
> >> requested value" the internally used IRQ(s) also shouldn't be restricted
> >> to online ones. Make set_desc_affinity() (set_msi_affinity() then does
> >> by implication) cope with a NULL mask being passed (just like
> >> assign_irq_vector() does), and have IOMMU code pass NULL instead of
> >> &cpu_online_map (when, for VT-d, there's no NUMA node information
> >> available).
> >>
> >> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> > 
> > LGTM, just one patch style comment and one code comment:
> > 
> > Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
> 
> Thanks.
> 
> >> --- a/xen/arch/x86/irq.c
> >> +++ b/xen/arch/x86/irq.c
> >> @@ -796,18 +796,26 @@ unsigned int set_desc_affinity(struct ir
> >>        unsigned long flags;
> >>        cpumask_t dest_mask;
> >>    
> >> -    if (!cpumask_intersects(mask, &cpu_online_map))
> >> +    if ( mask && !cpumask_intersects(mask, &cpu_online_map) )
> >>            return BAD_APICID;
> >>    
> >>        spin_lock_irqsave(&vector_lock, flags);
> >> -    ret = _assign_irq_vector(desc, mask);
> >> +    ret = _assign_irq_vector(desc, mask ?: TARGET_CPUS);
> >>        spin_unlock_irqrestore(&vector_lock, flags);
> > 
> > I think the patch is somehow mangled at least on my end, there's one
> > prepended extra space in the non-modified lines AFAICT.
> 
> Well, yes, hence the last sentence in the cover letter and the attached
> patches there. It is the mail system (more likely server than client)
> over here which causes this issue (everywhere for me).

Oh, sorry to hear that. Hope you get that sorted out; I guess it's
causing quite a lot of pain for other people at SUSE as well.

> >>    
> >> -    if (ret < 0)
> >> +    if ( ret < 0 )
> >>            return BAD_APICID;
> >>    
> >> -    cpumask_copy(desc->affinity, mask);
> > 
> > AFAICT you could also avoid the if and just do the same as in the
> > assign_irq_vector call above and pass TARGET_CPUS if mask is NULL?
> 
> Are you talking about the if() in context above, or the one you've
> stripped (immediately following the last quoted line of the patch)?
> For the one in context I don't see how the rest of your remark is
> related. For the other one - no, strictly not, as that would be
> against the purpose of this change: We specifically want to _not_
> restrict desc->affinity to online CPUs only (yet that's what
> TARGET_CPUS resolves to).

Yes, that was my remark - which is wrong as you pointed out. I guess
you could use cpu_possible_map, but anyway the current approach is OK
IMO.

Thanks, Roger.
Andrew Cooper July 19, 2019, 1:27 p.m. UTC | #4
On 16/07/2019 08:40, Jan Beulich wrote:
> In line with "x86/IRQ: desc->affinity should strictly represent the
> requested value" the internally used IRQ(s) also shouldn't be restricted
> to online ones. Make set_desc_affinity() (set_msi_affinity() then does
> by implication) cope with a NULL mask being passed (just like
> assign_irq_vector() does), and have IOMMU code pass NULL instead of
> &cpu_online_map (when, for VT-d, there's no NUMA node information
> available).
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Tian, Kevin July 24, 2019, 12:53 a.m. UTC | #5
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Tuesday, July 16, 2019 3:41 PM
> 
> In line with "x86/IRQ: desc->affinity should strictly represent the
> requested value" the internally used IRQ(s) also shouldn't be restricted
> to online ones. Make set_desc_affinity() (set_msi_affinity() then does
> by implication) cope with a NULL mask being passed (just like
> assign_irq_vector() does), and have IOMMU code pass NULL instead of
> &cpu_online_map (when, for VT-d, there's no NUMA node information
> available).
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Woods, Brian July 24, 2019, 7:53 p.m. UTC | #6
On Tue, Jul 16, 2019 at 07:40:57AM +0000, Jan Beulich wrote:
> In line with "x86/IRQ: desc->affinity should strictly represent the
> requested value" the internally used IRQ(s) also shouldn't be restricted
> to online ones. Make set_desc_affinity() (set_msi_affinity() then does
> by implication) cope with a NULL mask being passed (just like
> assign_irq_vector() does), and have IOMMU code pass NULL instead of
> &cpu_online_map (when, for VT-d, there's no NUMA node information
> available).
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Brian Woods <brian.woods@amd.com>

> ---
> v4: New.
> 
> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -796,18 +796,26 @@ unsigned int set_desc_affinity(struct ir
>       unsigned long flags;
>       cpumask_t dest_mask;
>   
> -    if (!cpumask_intersects(mask, &cpu_online_map))
> +    if ( mask && !cpumask_intersects(mask, &cpu_online_map) )
>           return BAD_APICID;
>   
>       spin_lock_irqsave(&vector_lock, flags);
> -    ret = _assign_irq_vector(desc, mask);
> +    ret = _assign_irq_vector(desc, mask ?: TARGET_CPUS);
>       spin_unlock_irqrestore(&vector_lock, flags);
>   
> -    if (ret < 0)
> +    if ( ret < 0 )
>           return BAD_APICID;
>   
> -    cpumask_copy(desc->affinity, mask);
> -    cpumask_and(&dest_mask, mask, desc->arch.cpu_mask);
> +    if ( mask )
> +    {
> +        cpumask_copy(desc->affinity, mask);
> +        cpumask_and(&dest_mask, mask, desc->arch.cpu_mask);
> +    }
> +    else
> +    {
> +        cpumask_setall(desc->affinity);
> +        cpumask_copy(&dest_mask, desc->arch.cpu_mask);
> +    }
>       cpumask_and(&dest_mask, &dest_mask, &cpu_online_map);
>   
>       return cpu_mask_to_apicid(&dest_mask);
> --- a/xen/drivers/passthrough/amd/iommu_init.c
> +++ b/xen/drivers/passthrough/amd/iommu_init.c
> @@ -888,7 +888,7 @@ static void enable_iommu(struct amd_iomm
>   
>       desc = irq_to_desc(iommu->msi.irq);
>       spin_lock(&desc->lock);
> -    set_msi_affinity(desc, &cpu_online_map);
> +    set_msi_affinity(desc, NULL);
>       spin_unlock(&desc->lock);
>   
>       amd_iommu_msi_enable(iommu, IOMMU_CONTROL_ENABLED);
> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -2133,11 +2133,11 @@ static void adjust_irq_affinity(struct a
>       const struct acpi_rhsa_unit *rhsa = drhd_to_rhsa(drhd);
>       unsigned int node = rhsa ? pxm_to_node(rhsa->proximity_domain)
>                                : NUMA_NO_NODE;
> -    const cpumask_t *cpumask = &cpu_online_map;
> +    const cpumask_t *cpumask = NULL;
>       struct irq_desc *desc;
>   
>       if ( node < MAX_NUMNODES && node_online(node) &&
> -         cpumask_intersects(&node_to_cpumask(node), cpumask) )
> +         cpumask_intersects(&node_to_cpumask(node), &cpu_online_map) )
>           cpumask = &node_to_cpumask(node);
>   
>       desc = irq_to_desc(drhd->iommu->msi.irq);

Patch

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -796,18 +796,26 @@ unsigned int set_desc_affinity(struct ir
     unsigned long flags;
     cpumask_t dest_mask;
 
-    if (!cpumask_intersects(mask, &cpu_online_map))
+    if ( mask && !cpumask_intersects(mask, &cpu_online_map) )
         return BAD_APICID;
 
     spin_lock_irqsave(&vector_lock, flags);
-    ret = _assign_irq_vector(desc, mask);
+    ret = _assign_irq_vector(desc, mask ?: TARGET_CPUS);
     spin_unlock_irqrestore(&vector_lock, flags);
 
-    if (ret < 0)
+    if ( ret < 0 )
         return BAD_APICID;
 
-    cpumask_copy(desc->affinity, mask);
-    cpumask_and(&dest_mask, mask, desc->arch.cpu_mask);
+    if ( mask )
+    {
+        cpumask_copy(desc->affinity, mask);
+        cpumask_and(&dest_mask, mask, desc->arch.cpu_mask);
+    }
+    else
+    {
+        cpumask_setall(desc->affinity);
+        cpumask_copy(&dest_mask, desc->arch.cpu_mask);
+    }
     cpumask_and(&dest_mask, &dest_mask, &cpu_online_map);
 
     return cpu_mask_to_apicid(&dest_mask);
--- a/xen/drivers/passthrough/amd/iommu_init.c
+++ b/xen/drivers/passthrough/amd/iommu_init.c
@@ -888,7 +888,7 @@ static void enable_iommu(struct amd_iomm
 
     desc = irq_to_desc(iommu->msi.irq);
     spin_lock(&desc->lock);
-    set_msi_affinity(desc, &cpu_online_map);
+    set_msi_affinity(desc, NULL);
     spin_unlock(&desc->lock);
 
     amd_iommu_msi_enable(iommu, IOMMU_CONTROL_ENABLED);
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -2133,11 +2133,11 @@ static void adjust_irq_affinity(struct a
     const struct acpi_rhsa_unit *rhsa = drhd_to_rhsa(drhd);
     unsigned int node = rhsa ? pxm_to_node(rhsa->proximity_domain)
                              : NUMA_NO_NODE;
-    const cpumask_t *cpumask = &cpu_online_map;
+    const cpumask_t *cpumask = NULL;
     struct irq_desc *desc;
 
     if ( node < MAX_NUMNODES && node_online(node) &&
-         cpumask_intersects(&node_to_cpumask(node), cpumask) )
+         cpumask_intersects(&node_to_cpumask(node), &cpu_online_map) )
         cpumask = &node_to_cpumask(node);
 
     desc = irq_to_desc(drhd->iommu->msi.irq);
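The decision made by the VT-d `adjust_irq_affinity()` hunk above can be sketched in isolation. This is a toy model under stated assumptions: 8-bit masks, an invented `node_cpus[]` table, and simplified `node_online()`; only the names `MAX_NUMNODES` and `NUMA_NO_NODE` mirror the real code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_NUMNODES 4
#define NUMA_NO_NODE 0xffu

/* Toy topology: bit i of a mask represents CPU i. */
static const uint32_t online_map = 0x0fu;                     /* CPUs 0-3 */
static const uint32_t node_cpus[MAX_NUMNODES] = { 0x03u, 0x0cu, 0x30u, 0 };
static int node_online(unsigned int node) { return node < 3; }

/*
 * Sketch of the reworked choice: prefer the IOMMU's NUMA-local CPUs
 * when the node is known, online, and overlaps the online map;
 * otherwise return NULL, i.e. "no restriction" (previously this
 * fell back to &cpu_online_map).
 */
static const uint32_t *pick_affinity(unsigned int node)
{
    const uint32_t *cpumask = NULL;

    if ( node < MAX_NUMNODES && node_online(node) &&
         (node_cpus[node] & online_map) )
        cpumask = &node_cpus[node];

    return cpumask;
}
```

A node whose CPUs are all offline (node 2 here) falls through to NULL, so the MSI affinity ends up unrestricted rather than pinned to the online map.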