diff mbox series

[for-4.13] AMD/IOMMU: honour IR setting while pre-filling DTEs

Message ID 1574715937-13565-1-git-send-email-igor.druzhinin@citrix.com (mailing list archive)
State Superseded

Commit Message

Igor Druzhinin Nov. 25, 2019, 9:05 p.m. UTC
The IV bit shouldn't be set in a DTE if interrupt remapping is not
enabled. This was traced to be a root cause of an assertion triggering
in the interrupt handling code on Lisbon.

Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
---
 xen/drivers/passthrough/amd/iommu_init.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Jan Beulich Nov. 26, 2019, 8:42 a.m. UTC | #1
On 25.11.2019 22:05, Igor Druzhinin wrote:
> --- a/xen/drivers/passthrough/amd/iommu_init.c
> +++ b/xen/drivers/passthrough/amd/iommu_init.c
> @@ -1279,7 +1279,7 @@ static int __init amd_iommu_setup_device_table(
>          for ( bdf = 0, size /= sizeof(*dt); bdf < size; ++bdf )
>              dt[bdf] = (struct amd_iommu_dte){
>                            .v = true,
> -                          .iv = true,
> +                          .iv = iommu_intremap,

This was very intentionally "true", and ignoring "iommu_intremap":
We're _pre_-filling DTEs here. Their actual values will be
established by the loop further down in the function, and just
for those devices that actually exist. By unilaterally setting IV
here we make sure that all interrupt requests from devices we
don't recognize get blocked rather than allowed through in an
un-remapped fashion.

The question continues to be which specific DTE the loop below
may wrongly leave untouched. Even if the IDE device of the
chipset has no MSI/MSI-X, amd_iommu_set_intremap_table() at
the bottom of the loop should still get invoked, and hence IV
should still get set to false there when !iommu_intremap. There's
further investigation necessary, I'm afraid.
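
To illustrate the two-phase setup described above, here is a minimal
sketch in C. It is not the Xen code: struct dte_model, prefill_table()
and setup_known_devices() are hypothetical names, the structure keeps
only the two bits under discussion, and the second phase stands in for
what the loop ending in amd_iommu_set_intremap_table() is expected to
do for devices that actually exist.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for struct amd_iommu_dte. */
struct dte_model {
    bool v;   /* entry valid */
    bool iv;  /* interrupt map valid */
};

/* Phase 1: pre-fill every entry with V=1, IV=1 (the pre-patch values). */
static void prefill_table(struct dte_model *dt, size_t nr)
{
    for ( size_t bdf = 0; bdf < nr; ++bdf )
        dt[bdf] = (struct dte_model){ .v = true, .iv = true };
}

/*
 * Phase 2 (assumed behaviour): only entries of known devices receive
 * their final IV value, which follows the interrupt remapping setting.
 */
static void setup_known_devices(struct dte_model *dt, const size_t *known,
                                size_t nr_known, bool intremap)
{
    for ( size_t i = 0; i < nr_known; ++i )
        dt[known[i]].iv = intremap;
}
```

Under this model, with remapping disabled, every known device ends up
with IV clear, while entries for unrecognized devices keep IV set from
the pre-fill. A device whose entry the second phase skips would keep
IV set as well, which is the case the question above is after.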

Jan
Andrew Cooper Nov. 26, 2019, 12:25 p.m. UTC | #2
On 26/11/2019 08:42, Jan Beulich wrote:
> On 25.11.2019 22:05, Igor Druzhinin wrote:
>> --- a/xen/drivers/passthrough/amd/iommu_init.c
>> +++ b/xen/drivers/passthrough/amd/iommu_init.c
>> @@ -1279,7 +1279,7 @@ static int __init amd_iommu_setup_device_table(
>>          for ( bdf = 0, size /= sizeof(*dt); bdf < size; ++bdf )
>>              dt[bdf] = (struct amd_iommu_dte){
>>                            .v = true,
>> -                          .iv = true,
>> +                          .iv = iommu_intremap,
> This was very intentionally "true", and ignoring "iommu_intremap":

Deliberate or not, it is a regression from 4.12.

Booting with iommu=no-intremap is a common debugging technique, and that
means no interrupt remapping anywhere in the system, even for
supposedly-unused DTEs.

~Andrew
Jan Beulich Nov. 26, 2019, 2:14 p.m. UTC | #3
On 26.11.2019 13:25, Andrew Cooper wrote:
> On 26/11/2019 08:42, Jan Beulich wrote:
>> On 25.11.2019 22:05, Igor Druzhinin wrote:
>>> --- a/xen/drivers/passthrough/amd/iommu_init.c
>>> +++ b/xen/drivers/passthrough/amd/iommu_init.c
>>> @@ -1279,7 +1279,7 @@ static int __init amd_iommu_setup_device_table(
>>>          for ( bdf = 0, size /= sizeof(*dt); bdf < size; ++bdf )
>>>              dt[bdf] = (struct amd_iommu_dte){
>>>                            .v = true,
>>> -                          .iv = true,
>>> +                          .iv = iommu_intremap,
>> This was very intentionally "true", and ignoring "iommu_intremap":
> 
> Deliberate or not, it is a regression from 4.12.

I accept it's a regression (which wants fixing), but I don't think
this is the way to address it. I could be convinced by good
arguments, though.

> Booting with iommu=no-intremap is a common debugging technique, and that
> means no interrupt remapping anywhere in the system, even for
> supposedly-unused DTEs.

Whether IV=1 or IV=0, there's no interrupt _remapping_ with this
option specified. There's some interrupt _blocking_, yes. It's
not immediately clear to me whether this is a good or a bad thing.
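
The distinction drawn here between remapping and blocking can be
sketched as a small decision function. This models only the statements
made in this thread, not the hardware specification: the enum, the
function name classify_irq() and the has_table flag (meaning "a usable
remapping table is installed for this device") are all hypothetical.

```c
#include <stdbool.h>

/* Possible fates of an interrupt request, as discussed in this thread. */
enum irq_outcome {
    IRQ_PASSED_UNREMAPPED,  /* IV=0: request goes through untouched */
    IRQ_BLOCKED,            /* IV=1 but nothing to remap through */
    IRQ_REMAPPED,           /* IV=1 with a usable remapping table */
};

/* has_table is an assumption of this sketch, not a DTE field name. */
static enum irq_outcome classify_irq(bool iv, bool has_table)
{
    if ( !iv )
        return IRQ_PASSED_UNREMAPPED;

    return has_table ? IRQ_REMAPPED : IRQ_BLOCKED;
}
```

With "iommu=no-intremap" no tables are installed, so in this model the
only two reachable outcomes are IRQ_PASSED_UNREMAPPED (IV=0) and
IRQ_BLOCKED (IV=1); IRQ_REMAPPED never occurs.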

Jan
Igor Druzhinin Nov. 26, 2019, 2:24 p.m. UTC | #4
On 26/11/2019 14:14, Jan Beulich wrote:
> On 26.11.2019 13:25, Andrew Cooper wrote:
>> On 26/11/2019 08:42, Jan Beulich wrote:
>>> On 25.11.2019 22:05, Igor Druzhinin wrote:
>>>> --- a/xen/drivers/passthrough/amd/iommu_init.c
>>>> +++ b/xen/drivers/passthrough/amd/iommu_init.c
>>>> @@ -1279,7 +1279,7 @@ static int __init amd_iommu_setup_device_table(
>>>>          for ( bdf = 0, size /= sizeof(*dt); bdf < size; ++bdf )
>>>>              dt[bdf] = (struct amd_iommu_dte){
>>>>                            .v = true,
>>>> -                          .iv = true,
>>>> +                          .iv = iommu_intremap,
>>> This was very intentionally "true", and ignoring "iommu_intremap":
>>
>> Deliberate or not, it is a regression from 4.12.
> 
> I accept it's a regression (which wants fixing), but I don't think
> this is the way to address it. I could be convinced by good
> arguments, though.

Do you have any suggestions how to address that?

>> Booting with iommu=no-intremap is a common debugging technique, and that
>> means no interrupt remapping anywhere in the system, even for
>> supposedly-unused DTEs.
> 
> Whether IV=1 or IV=0, there's no interrupt _remapping_ with this
> option specified. There's some interrupt _blocking_, yes. It's
> not immediately clear to me whether this is a good or a bad thing.

From a user's point of view, if I supply "iommu=no-intremap" I'm not
expecting any interrupts in the system to be blocked either. And
as Andrew said we frequently use this option for debugging which
means we expect this functionality to be off completely.

Igor
Jan Beulich Nov. 26, 2019, 2:29 p.m. UTC | #5
On 26.11.2019 15:24, Igor Druzhinin wrote:
> On 26/11/2019 14:14, Jan Beulich wrote:
>> On 26.11.2019 13:25, Andrew Cooper wrote:
>>> On 26/11/2019 08:42, Jan Beulich wrote:
>>>> On 25.11.2019 22:05, Igor Druzhinin wrote:
>>>>> --- a/xen/drivers/passthrough/amd/iommu_init.c
>>>>> +++ b/xen/drivers/passthrough/amd/iommu_init.c
>>>>> @@ -1279,7 +1279,7 @@ static int __init amd_iommu_setup_device_table(
>>>>>          for ( bdf = 0, size /= sizeof(*dt); bdf < size; ++bdf )
>>>>>              dt[bdf] = (struct amd_iommu_dte){
>>>>>                            .v = true,
>>>>> -                          .iv = true,
>>>>> +                          .iv = iommu_intremap,
>>>> This was very intentionally "true", and ignoring "iommu_intremap":
>>>
>>> Deliberate or not, it is a regression from 4.12.
>>
>> I accept it's a regression (which wants fixing), but I don't think
>> this is the way to address it. I could be convinced by good
>> arguments, though.
> 
> Do you have any suggestions how to address that?

I'd like to reply in the other context, after a little more
thinking about the situation. I think I see an oversight of
mine.

Jan
Andrew Cooper Nov. 26, 2019, 2:34 p.m. UTC | #6
On 26/11/2019 14:14, Jan Beulich wrote:
> On 26.11.2019 13:25, Andrew Cooper wrote:
>> On 26/11/2019 08:42, Jan Beulich wrote:
>>> On 25.11.2019 22:05, Igor Druzhinin wrote:
>>>> --- a/xen/drivers/passthrough/amd/iommu_init.c
>>>> +++ b/xen/drivers/passthrough/amd/iommu_init.c
>>>> @@ -1279,7 +1279,7 @@ static int __init amd_iommu_setup_device_table(
>>>>          for ( bdf = 0, size /= sizeof(*dt); bdf < size; ++bdf )
>>>>              dt[bdf] = (struct amd_iommu_dte){
>>>>                            .v = true,
>>>> -                          .iv = true,
>>>> +                          .iv = iommu_intremap,
>>> This was very intentionally "true", and ignoring "iommu_intremap":
>> Deliberate or not, it is a regression from 4.12.
> I accept it's a regression (which wants fixing), but I don't think
> this is the way to address it. I could be convinced by good
> arguments, though.
>
>> Booting with iommu=no-intremap is a common debugging technique, and that
>> means no interrupt remapping anywhere in the system, even for
>> supposedly-unused DTEs.
> Whether IV=1 or IV=0, there's no interrupt _remapping_ with this
> option specified. There's some interrupt _blocking_, yes. It's
> not immediately clear to me whether this is a good or a bad thing.

You're attempting to argue semantics.  Blocking is a special case of remapping.

"iommu=no-intremap" (for better or worse, naming wise) refers to the
interrupt mediation functionality in the IOMMU, and means "don't use any
of it".  Any other behaviour is a regression.

~Andrew
Jan Beulich Nov. 26, 2019, 3:16 p.m. UTC | #7
On 26.11.2019 15:34, Andrew Cooper wrote:
> On 26/11/2019 14:14, Jan Beulich wrote:
>> On 26.11.2019 13:25, Andrew Cooper wrote:
>>> On 26/11/2019 08:42, Jan Beulich wrote:
>>>> On 25.11.2019 22:05, Igor Druzhinin wrote:
>>>>> --- a/xen/drivers/passthrough/amd/iommu_init.c
>>>>> +++ b/xen/drivers/passthrough/amd/iommu_init.c
>>>>> @@ -1279,7 +1279,7 @@ static int __init amd_iommu_setup_device_table(
>>>>>          for ( bdf = 0, size /= sizeof(*dt); bdf < size; ++bdf )
>>>>>              dt[bdf] = (struct amd_iommu_dte){
>>>>>                            .v = true,
>>>>> -                          .iv = true,
>>>>> +                          .iv = iommu_intremap,
>>>> This was very intentionally "true", and ignoring "iommu_intremap":
>>> Deliberate or not, it is a regression from 4.12.
>> I accept it's a regression (which wants fixing), but I don't think
>> this is the way to address it. I could be convinced by good
>> arguments, though.
>>
>>> Booting with iommu=no-intremap is a common debugging technique, and that
>>> means no interrupt remapping anywhere in the system, even for
>>> supposedly-unused DTEs.
>> Whether IV=1 or IV=0, there's no interrupt _remapping_ with this
>> option specified. There's some interrupt _blocking_, yes. It's
>> not immediately clear to me whether this is a good or a bad thing.
> 
> You're attempting to argue semantics.  Blocking is a special case of remapping.
> 
> "iommu=no-intremap" (for better or worse, naming wise) refers to the
> interrupt mediation functionality in the IOMMU, and means "don't use any
> of it".  Any other behaviour is a regression.

I can accept this pov. Nevertheless I'd like to first see whether
we can't address the issue at hand with a less big hammer solution.
We can then always decide to still put in this change.

Jan

Patch

diff --git a/xen/drivers/passthrough/amd/iommu_init.c b/xen/drivers/passthrough/amd/iommu_init.c
index 16e84d4..2b81e38 100644
--- a/xen/drivers/passthrough/amd/iommu_init.c
+++ b/xen/drivers/passthrough/amd/iommu_init.c
@@ -1279,7 +1279,7 @@  static int __init amd_iommu_setup_device_table(
         for ( bdf = 0, size /= sizeof(*dt); bdf < size; ++bdf )
             dt[bdf] = (struct amd_iommu_dte){
                           .v = true,
-                          .iv = true,
+                          .iv = iommu_intremap,
                       };
     }