diff mbox

[v1,06/10] iommu: Add extra use_iommu argument to iommu_domain_init()

Message ID 1494424994-26232-7-git-send-email-olekstysh@gmail.com (mailing list archive)
State New, archived
Headers show

Commit Message

Oleksandr Tyshchenko May 10, 2017, 2:03 p.m. UTC
From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

The presence of this flag lets us know that the guest
has devices which will most likely be used for passthrough
and as the result the use of IOMMU is expected for this domain.
In that case we have to call iommu_construct(), actually
what the real assign_device call usually does.

As iommu_domain_init() is called with use_iommu flag being forced
to false for now, no functional change is intended for both ARM and x86.

Basically, this patch is needed for non-shared IOMMUs on ARM only
since the non-shared IOMMUs on x86 are ok if iommu_construct() is called
later. But, in order to be more generic and for possible future optimization
make this change appliable for both platforms.

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Jan Beulich <jbeulich@suse.com>

---
   Changes in v1:
      - Clarify patch subject/description.
      - s/bool_t/bool/
---
 xen/arch/arm/domain.c           |  2 +-
 xen/arch/x86/domain.c           |  2 +-
 xen/drivers/passthrough/iommu.c | 11 +++++++++--
 xen/include/xen/iommu.h         |  2 +-
 4 files changed, 12 insertions(+), 5 deletions(-)

Comments

Jan Beulich May 12, 2017, 2:31 p.m. UTC | #1
>>> On 10.05.17 at 16:03, <olekstysh@gmail.com> wrote:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> The presence of this flag lets us know that the guest
> has devices which will most likely be used for passthrough
> and as the result the use of IOMMU is expected for this domain.
> In that case we have to call iommu_construct(), actually
> what the real assign_device call usually does.
> 
> As iommu_domain_init() is called with use_iommu flag being forced
> to false for now, no functional change is intended for both ARM and x86.
> 
> Basically, this patch is needed for non-shared IOMMUs on ARM only
> since the non-shared IOMMUs on x86 are ok if iommu_construct() is called
> later. But, in order to be more generic and for possible future optimization
> make this change appliable for both platforms.

I continue to be unconvinced that this is wanted / needed, as I
continue to not see why shared vs unshared really matters here.
After all we have both modes working on x86 without this flag.

> @@ -142,7 +142,14 @@ int iommu_domain_init(struct domain *d)
>          return 0;
>  
>      hd->platform_ops = iommu_get_ops();
> -    return hd->platform_ops->init(d);
> +    ret = hd->platform_ops->init(d);
> +    if ( ret )
> +        return ret;
> +
> +    if ( use_iommu && !is_hardware_domain(d) )
> +        ret = iommu_construct(d);

You don't handle the -ERESTART you may (and likely will) get here
or in the caller.

Jan
Oleksandr Tyshchenko May 12, 2017, 5 p.m. UTC | #2
Hi, Jan.

On Fri, May 12, 2017 at 5:31 PM, Jan Beulich <JBeulich@suse.com> wrote:
>>>> On 10.05.17 at 16:03, <olekstysh@gmail.com> wrote:
>> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>>
>> The presence of this flag lets us know that the guest
>> has devices which will most likely be used for passthrough
>> and as the result the use of IOMMU is expected for this domain.
>> In that case we have to call iommu_construct(), actually
>> what the real assign_device call usually does.
>>
>> As iommu_domain_init() is called with use_iommu flag being forced
>> to false for now, no functional change is intended for both ARM and x86.
>>
>> Basically, this patch is needed for non-shared IOMMUs on ARM only
>> since the non-shared IOMMUs on x86 are ok if iommu_construct() is called
>> later. But, in order to be more generic and for possible future optimization
>> make this change appliable for both platforms.
>
> I continue to be unconvinced that this is wanted / needed, as I
> continue to not see why shared vs unshared really matters here.
> After all we have both modes working on x86 without this flag.
I know. But, for ARM we need this hint. We can't reuse the "retrieving
mapping" code I moved to x86-specific part in patch #8 (due to lack of
M2P on ARM) .

>
>> @@ -142,7 +142,14 @@ int iommu_domain_init(struct domain *d)
>>          return 0;
>>
>>      hd->platform_ops = iommu_get_ops();
>> -    return hd->platform_ops->init(d);
>> +    ret = hd->platform_ops->init(d);
>> +    if ( ret )
>> +        return ret;
>> +
>> +    if ( use_iommu && !is_hardware_domain(d) )
>> +        ret = iommu_construct(d);
>
> You don't handle the -ERESTART you may (and likely will) get here
> or in the caller.
Indeed. I forgot about it.

I most likely rework this patch not to call iommu_construct at all.
But, there are an open questions here and in patch #7 I would like to
clarify the first.
1. Are you against extra arguments at all?
2. If this question still makes sense, Shall I add extra need_iommu
argument to "init" callback too and just pass thought
    incoming flag to IOMMU drivers? This change wouldn't affect x86 at
all since we set this flag for ARM only (see patch #9).
    For x86 this flag would be always treated as unused.

>
> Jan
>
Jan Beulich May 15, 2017, 7:27 a.m. UTC | #3
>>> On 12.05.17 at 19:00, <olekstysh@gmail.com> wrote:
> On Fri, May 12, 2017 at 5:31 PM, Jan Beulich <JBeulich@suse.com> wrote:
>>>>> On 10.05.17 at 16:03, <olekstysh@gmail.com> wrote:
>>> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>>>
>>> The presence of this flag lets us know that the guest
>>> has devices which will most likely be used for passthrough
>>> and as the result the use of IOMMU is expected for this domain.
>>> In that case we have to call iommu_construct(), actually
>>> what the real assign_device call usually does.
>>>
>>> As iommu_domain_init() is called with use_iommu flag being forced
>>> to false for now, no functional change is intended for both ARM and x86.
>>>
>>> Basically, this patch is needed for non-shared IOMMUs on ARM only
>>> since the non-shared IOMMUs on x86 are ok if iommu_construct() is called
>>> later. But, in order to be more generic and for possible future optimization
>>> make this change appliable for both platforms.
>>
>> I continue to be unconvinced that this is wanted / needed, as I
>> continue to not see why shared vs unshared really matters here.
>> After all we have both modes working on x86 without this flag.
> I know. But, for ARM we need this hint. We can't reuse the "retrieving
> mapping" code I moved to x86-specific part in patch #8 (due to lack of
> M2P on ARM) .

Well, see the reply to 08/10 I've sent a minute ago.

>>> @@ -142,7 +142,14 @@ int iommu_domain_init(struct domain *d)
>>>          return 0;
>>>
>>>      hd->platform_ops = iommu_get_ops();
>>> -    return hd->platform_ops->init(d);
>>> +    ret = hd->platform_ops->init(d);
>>> +    if ( ret )
>>> +        return ret;
>>> +
>>> +    if ( use_iommu && !is_hardware_domain(d) )
>>> +        ret = iommu_construct(d);
>>
>> You don't handle the -ERESTART you may (and likely will) get here
>> or in the caller.
> Indeed. I forgot about it.
> 
> I most likely rework this patch not to call iommu_construct at all.
> But, there are an open questions here and in patch #7 I would like to
> clarify the first.
> 1. Are you against extra arguments at all?

No, extra arguments aren't the point. The kind of information passed
is the questionable thing.

> 2. If this question still makes sense, Shall I add extra need_iommu
> argument to "init" callback too and just pass thought
>     incoming flag to IOMMU drivers? This change wouldn't affect x86 at
> all since we set this flag for ARM only (see patch #9).
>     For x86 this flag would be always treated as unused.

I won't give a firm answer here considering the wider question on
M2P, but I'd like to say that always-unused flags on one architecture
are often (but not always, namely not when there's a hardware
feature one architecture has but others don't) a sign of a design/
abstraction problem.

Jan
Julien Grall May 17, 2017, 7:52 p.m. UTC | #4
Hi Jan,

On 05/12/2017 03:31 PM, Jan Beulich wrote:
>>>> On 10.05.17 at 16:03, <olekstysh@gmail.com> wrote:
>> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>>
>> The presence of this flag lets us know that the guest
>> has devices which will most likely be used for passthrough
>> and as the result the use of IOMMU is expected for this domain.
>> In that case we have to call iommu_construct(), actually
>> what the real assign_device call usually does.
>>
>> As iommu_domain_init() is called with use_iommu flag being forced
>> to false for now, no functional change is intended for both ARM and x86.
>>
>> Basically, this patch is needed for non-shared IOMMUs on ARM only
>> since the non-shared IOMMUs on x86 are ok if iommu_construct() is called
>> later. But, in order to be more generic and for possible future optimization
>> make this change appliable for both platforms.
>
> I continue to be unconvinced that this is wanted / needed, as I
> continue to not see why shared vs unshared really matters here.
> After all we have both modes working on x86 without this flag.

Well on x86 you allocate the page table on the fly in the unsharing 
case. This is only useful if you don't know whether a domain will have 
device assigned (e.g hotplug case).

When you know that the domain will have device pass-throughed, you can 
populate the IOMMU page tables before hand avoiding to have to go 
through the list of page at the first assigned device.

In embedded platform hotplug is likely to be inexistent. For servers, I 
don't know but likely page tables are going to be shared (or as I 
mentioned earlier partially shared).

So I don't see any benefit of the current code over populating the IOMMU 
page tables from the beginning.

Cheers,
Jan Beulich May 18, 2017, 8:38 a.m. UTC | #5
>>> On 17.05.17 at 21:52, <julien.grall@arm.com> wrote:
> On 05/12/2017 03:31 PM, Jan Beulich wrote:
>>>>> On 10.05.17 at 16:03, <olekstysh@gmail.com> wrote:
>>> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>>>
>>> The presence of this flag lets us know that the guest
>>> has devices which will most likely be used for passthrough
>>> and as the result the use of IOMMU is expected for this domain.
>>> In that case we have to call iommu_construct(), actually
>>> what the real assign_device call usually does.
>>>
>>> As iommu_domain_init() is called with use_iommu flag being forced
>>> to false for now, no functional change is intended for both ARM and x86.
>>>
>>> Basically, this patch is needed for non-shared IOMMUs on ARM only
>>> since the non-shared IOMMUs on x86 are ok if iommu_construct() is called
>>> later. But, in order to be more generic and for possible future optimization
>>> make this change appliable for both platforms.
>>
>> I continue to be unconvinced that this is wanted / needed, as I
>> continue to not see why shared vs unshared really matters here.
>> After all we have both modes working on x86 without this flag.
> 
> Well on x86 you allocate the page table on the fly in the unsharing 
> case. This is only useful if you don't know whether a domain will have 
> device assigned (e.g hotplug case).
> 
> When you know that the domain will have device pass-throughed, you can 
> populate the IOMMU page tables before hand avoiding to have to go 
> through the list of page at the first assigned device.
> 
> In embedded platform hotplug is likely to be inexistent. For servers, I 
> don't know but likely page tables are going to be shared (or as I 
> mentioned earlier partially shared).
> 
> So I don't see any benefit of the current code over populating the IOMMU 
> page tables from the beginning.

Interesting. To me, the primary benefit is that we wouldn't need to
introduce new code to handle yet another case specially. Anyway,
the changes in this patch are simple enough, so I don't mean to
block it despite being unconvinced of the basic idea.

Jan
Oleksandr Tyshchenko May 18, 2017, 5:41 p.m. UTC | #6
Hi, all.

On Thu, May 18, 2017 at 11:38 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>> On 17.05.17 at 21:52, <julien.grall@arm.com> wrote:
>> On 05/12/2017 03:31 PM, Jan Beulich wrote:
>>>>>> On 10.05.17 at 16:03, <olekstysh@gmail.com> wrote:
>>>> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>>>>
>>>> The presence of this flag lets us know that the guest
>>>> has devices which will most likely be used for passthrough
>>>> and as the result the use of IOMMU is expected for this domain.
>>>> In that case we have to call iommu_construct(), actually
>>>> what the real assign_device call usually does.
>>>>
>>>> As iommu_domain_init() is called with use_iommu flag being forced
>>>> to false for now, no functional change is intended for both ARM and x86.
>>>>
>>>> Basically, this patch is needed for non-shared IOMMUs on ARM only
>>>> since the non-shared IOMMUs on x86 are ok if iommu_construct() is called
>>>> later. But, in order to be more generic and for possible future optimization
>>>> make this change appliable for both platforms.
>>>
>>> I continue to be unconvinced that this is wanted / needed, as I
>>> continue to not see why shared vs unshared really matters here.
>>> After all we have both modes working on x86 without this flag.
>>
>> Well on x86 you allocate the page table on the fly in the unsharing
>> case. This is only useful if you don't know whether a domain will have
>> device assigned (e.g hotplug case).
>>
>> When you know that the domain will have device pass-throughed, you can
>> populate the IOMMU page tables before hand avoiding to have to go
>> through the list of page at the first assigned device.
>>
>> In embedded platform hotplug is likely to be inexistent. For servers, I
>> don't know but likely page tables are going to be shared (or as I
>> mentioned earlier partially shared).
>>
>> So I don't see any benefit of the current code over populating the IOMMU
>> page tables from the beginning.
>
> Interesting. To me, the primary benefit is that we wouldn't need to
> introduce new code to handle yet another case specially. Anyway,
> the changes in this patch are simple enough, so I don't mean to
> block it despite being unconvinced of the basic idea.

Thank you for your comments.
I would like to say that I share Julien's opinion, but understand
Jan's points too.
I think some mutually agreeable solution should be worked out.

It is not completely clear to me what I have to do with patches #6-#8.
So, I will try to summarize thoughts regarding them. Please, correct
me if I am wrong.

patch #6: As for the current patch, likely the "init" platform
callback should be extended with
extra "use_iommu" argument as well as the "iommu_domain_init" API. In
that case we
would just pass thought incoming flag to the IOMMU drivers followed by
updating "need_iommu" domain flag.

Something like this:
...
int iommu_domain_init(struct domain *d, bool use_iommu)
{
    struct domain_iommu *hd = dom_iommu(d);
    int ret = 0;

    ret = arch_iommu_domain_init(d);
    if ( ret )
        return ret;

    if ( !iommu_enabled )
        return 0;

    hd->platform_ops = iommu_get_ops();
    ret = hd->platform_ops->init(d, use_iommu);
    if ( ret )
        return ret;

    d->need_iommu = !!use_iommu;

    return 0;
}
...

patch #7: This patch should be just dropped.

patch #8: As we always allocate the page table for hardware domain,
this patch should be reworked.
The use_iommu flag should be set for both archs in case of hardware
domain. Having d->need_iommu set at the early stage we won't skip
IOMMU
mapping updates anymore. And as the result the existing in
iommu_hwdom_init() code that goes through the list of page and tries
to retrieve mapping could be just dropped
instead of moving it to the arch-specific part.

So, what we would have as the final result:
1. In case of hardware domain "use_iommu" flag is always set for both
ARM and x86.
2. For other domains the "use_iommu" flag is always unset for x86
only, but the real value is passed for ARM
according to the libxl expectation about IOMMU usage.
This would allow us to allocate the page table in advance on ARM and
retain the current behavior for x86 (allocating the page table
on-demand).

What do you think?

>
> Jan
>
Jan Beulich May 19, 2017, 6:30 a.m. UTC | #7
>>> On 18.05.17 at 19:41, <olekstysh@gmail.com> wrote:
> patch #6: As for the current patch, likely the "init" platform
> callback should be extended with
> extra "use_iommu" argument as well as the "iommu_domain_init" API. In
> that case we
> would just pass thought incoming flag to the IOMMU drivers followed by
> updating "need_iommu" domain flag.
> 
> Something like this:
> ...
> int iommu_domain_init(struct domain *d, bool use_iommu)
> {
>     struct domain_iommu *hd = dom_iommu(d);
>     int ret = 0;
> 
>     ret = arch_iommu_domain_init(d);
>     if ( ret )
>         return ret;
> 
>     if ( !iommu_enabled )
>         return 0;
> 
>     hd->platform_ops = iommu_get_ops();
>     ret = hd->platform_ops->init(d, use_iommu);
>     if ( ret )
>         return ret;
> 
>     d->need_iommu = !!use_iommu;
> 
>     return 0;
> }
> ...

The final shape of this primarily depends on ARM side needs.
However, you need to be careful to make sure the final setting
of d->need_iommu then is no different than it is today on at
least x86. I'm mentioning this in particular because of e.g.

    d->need_iommu = !!iommu_dom0_strict;

in iommu_hwdom_init().

Also as a minor remark note that in your new code the !! would
not be needed.

> patch #7: This patch should be just dropped.
> 
> patch #8: As we always allocate the page table for hardware domain,
> this patch should be reworked.
> The use_iommu flag should be set for both archs in case of hardware
> domain. Having d->need_iommu set at the early stage we won't skip
> IOMMU
> mapping updates anymore. And as the result the existing in
> iommu_hwdom_init() code that goes through the list of page and tries
> to retrieve mapping could be just dropped
> instead of moving it to the arch-specific part.

And again, careful here: There are three command line options
influencing which pages do actually get mapped, and in which way
(iommu=dom0-passthrough, iommu=dom0-strict, and VT-d's
iommu_inclusive_mapping). The behavior after your change must
not differ from current behavior regardless of which of these
options may be used.

Jan
Oleksandr Tyshchenko May 19, 2017, 8:56 a.m. UTC | #8
Hi, Jan

On Fri, May 19, 2017 at 9:30 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>> On 18.05.17 at 19:41, <olekstysh@gmail.com> wrote:
>> patch #6: As for the current patch, likely the "init" platform
>> callback should be extended with
>> extra "use_iommu" argument as well as the "iommu_domain_init" API. In
>> that case we
>> would just pass thought incoming flag to the IOMMU drivers followed by
>> updating "need_iommu" domain flag.
>>
>> Something like this:
>> ...
>> int iommu_domain_init(struct domain *d, bool use_iommu)
>> {
>>     struct domain_iommu *hd = dom_iommu(d);
>>     int ret = 0;
>>
>>     ret = arch_iommu_domain_init(d);
>>     if ( ret )
>>         return ret;
>>
>>     if ( !iommu_enabled )
>>         return 0;
>>
>>     hd->platform_ops = iommu_get_ops();
>>     ret = hd->platform_ops->init(d, use_iommu);
>>     if ( ret )
>>         return ret;
>>
>>     d->need_iommu = !!use_iommu;
>>
>>     return 0;
>> }
>> ...
>
> The final shape of this primarily depends on ARM side needs.
> However, you need to be careful to make sure the final setting
> of d->need_iommu then is no different than it is today on at
> least x86. I'm mentioning this in particular because of e.g.
>
>     d->need_iommu = !!iommu_dom0_strict;
>
> in iommu_hwdom_init().
Yes, sure, I will take it into the account.

>
> Also as a minor remark note that in your new code the !! would
> not be needed.
>
>> patch #7: This patch should be just dropped.
>>
>> patch #8: As we always allocate the page table for hardware domain,
>> this patch should be reworked.
>> The use_iommu flag should be set for both archs in case of hardware
>> domain. Having d->need_iommu set at the early stage we won't skip
>> IOMMU
>> mapping updates anymore. And as the result the existing in
>> iommu_hwdom_init() code that goes through the list of page and tries
>> to retrieve mapping could be just dropped
>> instead of moving it to the arch-specific part.
>
> And again, careful here: There are three command line options
> influencing which pages do actually get mapped, and in which way
> (iommu=dom0-passthrough, iommu=dom0-strict, and VT-d's
> iommu_inclusive_mapping). The behavior after your change must
> not differ from current behavior regardless of which of these
> options may be used.
Yes, sure. This is my target not to brake things.

>
> Jan
>
diff mbox

Patch

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index 76310ed..ec19310 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -569,7 +569,7 @@  int arch_domain_create(struct domain *d, unsigned int domcr_flags,
     ASSERT(config != NULL);
 
     /* p2m_init relies on some value initialized by the IOMMU subsystem */
-    if ( (rc = iommu_domain_init(d)) != 0 )
+    if ( (rc = iommu_domain_init(d, false)) != 0 )
         goto fail;
 
     if ( (rc = p2m_init(d)) != 0 )
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 90e2b1f..54037af 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -641,7 +641,7 @@  int arch_domain_create(struct domain *d, unsigned int domcr_flags,
         if ( (rc = init_domain_irq_mapping(d)) != 0 )
             goto fail;
 
-        if ( (rc = iommu_domain_init(d)) != 0 )
+        if ( (rc = iommu_domain_init(d, false)) != 0 )
             goto fail;
     }
     spin_lock_init(&d->arch.e820_lock);
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 3e9e4c3..c85f7b4 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -129,7 +129,7 @@  static void __init parse_iommu_param(char *s)
     } while ( ss );
 }
 
-int iommu_domain_init(struct domain *d)
+int iommu_domain_init(struct domain *d, bool use_iommu)
 {
     struct domain_iommu *hd = dom_iommu(d);
     int ret = 0;
@@ -142,7 +142,14 @@  int iommu_domain_init(struct domain *d)
         return 0;
 
     hd->platform_ops = iommu_get_ops();
-    return hd->platform_ops->init(d);
+    ret = hd->platform_ops->init(d);
+    if ( ret )
+        return ret;
+
+    if ( use_iommu && !is_hardware_domain(d) )
+        ret = iommu_construct(d);
+
+    return ret;
 }
 
 static void __hwdom_init check_hwdom_reqs(struct domain *d)
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 3297998..3afbc3b 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -56,7 +56,7 @@  int iommu_setup(void);
 int iommu_add_device(struct pci_dev *pdev);
 int iommu_enable_device(struct pci_dev *pdev);
 int iommu_remove_device(struct pci_dev *pdev);
-int iommu_domain_init(struct domain *d);
+int iommu_domain_init(struct domain *d, bool use_iommu);
 void iommu_hwdom_init(struct domain *d);
 void iommu_domain_destroy(struct domain *d);
 int deassign_device(struct domain *d, u16 seg, u8 bus, u8 devfn);