x86/mtrr: recalculate P2M type for domains with iocaps

Message ID	1556019478-10835-1-git-send-email-igor.druzhinin@citrix.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <xen-devel-bounces@lists.xenproject.org> From: Igor Druzhinin <igor.druzhinin@citrix.com> To: <xen-devel@lists.xenproject.org> Date: Tue, 23 Apr 2019 12:37:58 +0100 Message-ID: <1556019478-10835-1-git-send-email-igor.druzhinin@citrix.com> MIME-Version: 1.0 Subject: [Xen-devel] [PATCH] x86/mtrr: recalculate P2M type for domains with iocaps Precedence: list Cc: andrew.cooper3@citrix.com, Igor Druzhinin <igor.druzhinin@citrix.com>, wei.liu2@citrix.com, jbeulich@suse.com, roger.pau@citrix.com Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 Errors-To: xen-devel-bounces@lists.xenproject.org Sender: "Xen-devel" <xen-devel-bounces@lists.xenproject.org>
Series	x86/mtrr: recalculate P2M type for domains with iocaps \| expand x86/mtrr: recalculate P2M type for domains with iocaps

Igor Druzhinin April 23, 2019, 11:37 a.m. UTC

This change reflects the logic in epte_get_entry_emt() and allows
changes in guest MTTRs to be reflected in EPT for domains having
direct access to certain hardware memory regions but without IOMMU
context assigned (e.g. XenGT).

Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
---
 xen/arch/x86/hvm/mtrr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Jan Beulich April 25, 2019, 1:30 p.m. UTC | #1

>>> On 23.04.19 at 13:37, <igor.druzhinin@citrix.com> wrote:
> This change reflects the logic in epte_get_entry_emt() and allows
> changes in guest MTTRs to be reflected in EPT for domains having
> direct access to certain hardware memory regions but without IOMMU
> context assigned (e.g. XenGT).
> 
> Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>

Fundamentally I'm happy to get both in sync, so
Reviewed-by: Jan Beulich <jbeulich@suse.com>

But I have a question:

>  void memory_type_changed(struct domain *d)
>  {
> -    if ( has_iommu_pt(d) && d->vcpu && d->vcpu[0] )
> +    if ( (has_iommu_pt(d) || cache_flush_permitted(d)) && d->vcpu && d->vcpu[0] )
>      {
>          p2m_memory_type_changed(d);
>          flush_all(FLUSH_CACHE);

Wouldn't cache_flush_permitted() alone suffice, both here and
there? Even if anyone was to pass-through a device without any
MMIO or I/O port BAR, the memory type (which is a CPU side
thing only, i.e. doesn't affect DMA by the device) shouldn't
matter in that case (leaving aside the fact that a BAR-less
device is unlikely to be DMA-capable, unless the programming
of the DMA operations was to happen through vendor specific
config space accesses).

Jan

Igor Druzhinin April 25, 2019, 2:50 p.m. UTC | #2

On 25/04/2019 14:30, Jan Beulich wrote:
>>>> On 23.04.19 at 13:37, <igor.druzhinin@citrix.com> wrote:
>> This change reflects the logic in epte_get_entry_emt() and allows
>> changes in guest MTTRs to be reflected in EPT for domains having
>> direct access to certain hardware memory regions but without IOMMU
>> context assigned (e.g. XenGT).
>>
>> Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
> 
> Fundamentally I'm happy to get both in sync, so
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> 
> But I have a question:
> 
>>  void memory_type_changed(struct domain *d)
>>  {
>> -    if ( has_iommu_pt(d) && d->vcpu && d->vcpu[0] )
>> +    if ( (has_iommu_pt(d) || cache_flush_permitted(d)) && d->vcpu && d->vcpu[0] )
>>      {
>>          p2m_memory_type_changed(d);
>>          flush_all(FLUSH_CACHE);
> 
> Wouldn't cache_flush_permitted() alone suffice, both here and
> there? Even if anyone was to pass-through a device without any
> MMIO or I/O port BAR, the memory type (which is a CPU side
> thing only, i.e. doesn't affect DMA by the device) shouldn't
> matter in that case (leaving aside the fact that a BAR-less
> device is unlikely to be DMA-capable, unless the programming
> of the DMA operations was to happen through vendor specific
> config space accesses).
> 

I don't think it's correct in case of EPT sharing with IOMMU as the
section 3.6.5 of VT-d spec implies there is direct effect of caching
information present in IOMMU page table on device accesses within
processor coherency domain. So even in case there are no BARs passed to
the domain, drivers inside it and devices operation on its memory could
still have assumptions on device memory access properties.

Igor

Jan Beulich April 25, 2019, 3:49 p.m. UTC | #3

>>> On 25.04.19 at 16:50, <igor.druzhinin@citrix.com> wrote:
> On 25/04/2019 14:30, Jan Beulich wrote:
>>>>> On 23.04.19 at 13:37, <igor.druzhinin@citrix.com> wrote:
>>>  void memory_type_changed(struct domain *d)
>>>  {
>>> -    if ( has_iommu_pt(d) && d->vcpu && d->vcpu[0] )
>>> +    if ( (has_iommu_pt(d) || cache_flush_permitted(d)) && d->vcpu && d->vcpu[0] )
>>>      {
>>>          p2m_memory_type_changed(d);
>>>          flush_all(FLUSH_CACHE);
>> 
>> Wouldn't cache_flush_permitted() alone suffice, both here and
>> there? Even if anyone was to pass-through a device without any
>> MMIO or I/O port BAR, the memory type (which is a CPU side
>> thing only, i.e. doesn't affect DMA by the device) shouldn't
>> matter in that case (leaving aside the fact that a BAR-less
>> device is unlikely to be DMA-capable, unless the programming
>> of the DMA operations was to happen through vendor specific
>> config space accesses).
> 
> I don't think it's correct in case of EPT sharing with IOMMU as the
> section 3.6.5 of VT-d spec implies there is direct effect of caching
> information present in IOMMU page table on device accesses within
> processor coherency domain. So even in case there are no BARs passed to
> the domain, drivers inside it and devices operation on its memory could
> still have assumptions on device memory access properties.

But that's a First-Level Translation subsection. For us right now
section 3.7.4 is relevant, while in the future 3.8.4 may become
relevant too. As per 3.7.4 all leaf page (actual data) accesses
are of WB type anyway.

Once 3.8.4 becomes applicable, we'd likely have to support IOMMU
side MTRRs as well, or else the MTRR imposed memory type would
be uniformly UC, allowing for only UC and WC effective memory
types.

Jan

Igor Druzhinin April 25, 2019, 4:22 p.m. UTC | #4

On 25/04/2019 16:49, Jan Beulich wrote:
>>>> On 25.04.19 at 16:50, <igor.druzhinin@citrix.com> wrote:
>> On 25/04/2019 14:30, Jan Beulich wrote:
>>>>>> On 23.04.19 at 13:37, <igor.druzhinin@citrix.com> wrote:
>>>>  void memory_type_changed(struct domain *d)
>>>>  {
>>>> -    if ( has_iommu_pt(d) && d->vcpu && d->vcpu[0] )
>>>> +    if ( (has_iommu_pt(d) || cache_flush_permitted(d)) && d->vcpu && d->vcpu[0] )
>>>>      {
>>>>          p2m_memory_type_changed(d);
>>>>          flush_all(FLUSH_CACHE);
>>>
>>> Wouldn't cache_flush_permitted() alone suffice, both here and
>>> there? Even if anyone was to pass-through a device without any
>>> MMIO or I/O port BAR, the memory type (which is a CPU side
>>> thing only, i.e. doesn't affect DMA by the device) shouldn't
>>> matter in that case (leaving aside the fact that a BAR-less
>>> device is unlikely to be DMA-capable, unless the programming
>>> of the DMA operations was to happen through vendor specific
>>> config space accesses).
>>
>> I don't think it's correct in case of EPT sharing with IOMMU as the
>> section 3.6.5 of VT-d spec implies there is direct effect of caching
>> information present in IOMMU page table on device accesses within
>> processor coherency domain. So even in case there are no BARs passed to
>> the domain, drivers inside it and devices operation on its memory could
>> still have assumptions on device memory access properties.
> 
> But that's a First-Level Translation subsection. For us right now
> section 3.7.4 is relevant, while in the future 3.8.4 may become
> relevant too. As per 3.7.4 all leaf page (actual data) accesses
> are of WB type anyway.
> 
> Once 3.8.4 becomes applicable, we'd likely have to support IOMMU
> side MTRRs as well, or else the MTRR imposed memory type would
> be uniformly UC, allowing for only UC and WC effective memory
> types.
> 

Oh yes, sorry that I mixed them up. In that case, probably yes - simply
cache_flush_permitted() would suffice. Although there are things that
I'm not sure about:
- whether iocaps are always used by software in Dom0 when passing
through certain physical memory and whether their usage is enforced by Xen
- there are could be assumptions elsewhere that with IOMMU context
assigned guest caching choice is in effect - since that fact has been
perpetuated in Xen code for very long time

Igor

Jan Beulich April 26, 2019, 7:09 a.m. UTC | #5

>>> On 25.04.19 at 18:22, <igor.druzhinin@citrix.com> wrote:
> Although there are things that I'm not sure about:
> - whether iocaps are always used by software in Dom0 when passing
> through certain physical memory and whether their usage is enforced by Xen

I'm afraid I'm struggling to understand what exactly you mean.
Passing through a device without also registering the BARs
for use by the target guest simply wouldn't yield a functioning
device (in the guest). Arguably it shouldn't be a separate
domctl, but that's how the people originally implementing
pass-through chose to structure things. It has the advantage
of devices working where Dom0 knows about extra resources
(MMIO or I/O ports) when Xen doesn't. It has the
disadvantage of being dependent on Xen, Dom0 kernel, and
the tool stack interacting correctly (rather than all logic being
in a single component).

There was a similarly odd arrangement in the original
implementation of MSI-X, where Xen depended upon Dom0
reporting the MSI-X table base. Nowadays we read the BAR
ourselves and simply check that the value matches what
Dom0 has passed.

> - there are could be assumptions elsewhere that with IOMMU context
> assigned guest caching choice is in effect - since that fact has been
> perpetuated in Xen code for very long time

Agreed - there's a certain chance that other places in Xen have
similar logic. But there's an equal chance that other places in
Xen similarly check only one of the two, when, in the spirit of your
patch, they really mean to check both. This could only be clarified
by doing a full audit.

Jan

x86/mtrr: recalculate P2M type for domains with iocaps

Commit Message

Comments

Patch