diff mbox

[2/2] arm/virt: Mark pcie controller node as dma-coherent

Message ID 1464870382-385-3-git-send-email-bogdan.purcareata@nxp.com (mailing list archive)
State New, archived
Headers show

Commit Message

Bogdan Purcareata June 2, 2016, 12:26 p.m. UTC
A PCI device is marked either as coherent or non-coherent based on the pcie
controller "dma-coherent" property. This is further used when configuring the
IOMMU ops for the device DMA resources (e.g. descriptor rings, for e1000e).

This dma-coherent property needs to be configured in the guest environment,
in case there's a directly assigned VFIO PCI device. Since the guest only
receives one emulated pcie controller bus - regardless of the host
configuration - add this property if there's at least one host pcie host
controller that is DMA coherent (this implies that the host interconnect
is coherent as well).

Signed-off-by: Bogdan Purcareata <bogdan.purcareata@nxp.com>
---
 hw/arm/virt.c | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

Comments

Mihai Claudiu Caraman June 3, 2016, 2:22 p.m. UTC | #1
> -----Original Message-----

> From: Qemu-devel [mailto:qemu-devel-bounces+mihai.caraman=freescale.com@nongnu.org] On Behalf Of Peter Maydell

> Sent: Thursday, June 02, 2016 3:33 PM

> To: Bogdan Purcareata <bogdan.purcareata@nxp.com>

> Cc: QEMU Developers <qemu-devel@nongnu.org>; Peter Crosthwaite <crosthwaite.peter@gmail.com>; Alexander Graf <agraf@suse.de>; qemu-arm <qemu-arm@nongnu.org>; Eric Auger <eric.auger@linaro.org>

> Subject: Re: [Qemu-devel] [PATCH 2/2] arm/virt: Mark pcie controller node as dma-coherent

> 

> On 2 June 2016 at 13:26, Bogdan Purcareata <bogdan.purcareata@nxp.com> wrote:

> > A PCI device is marked either as coherent or non-coherent based on the 

> > pcie controller "dma-coherent" property. This is further used when 

> > configuring the IOMMU ops for the device DMA resources (e.g. descriptor rings, for e1000e).

> >

> > This dma-coherent property needs to be configured in the guest 

> > environment, in case there's a directly assigned VFIO PCI device. 

> > Since the guest only receives one emulated pcie controller bus - 

> > regardless of the host configuration - add this property if there's at 

> > least one host pcie host controller that is DMA coherent (this implies 

> > that the host interconnect is coherent as well).

> 

> This patch seems to change the property of the emulated PCIe controller based on the host PCIe controller even if we're not doing any PCIe passthrough at all. That seems definitely wrong to me.

> 

> (Should the purely-emulated case be marked DMA-coherent anyway?

> I forget the fiddly details...)

> 

> thanks

> -- PMM


In particular for virtual and emulated devices the host CPU behaves as a DMA coherent 'device'. This should have been stated in patch description.

Regards,
Mike
Peter Maydell June 3, 2016, 2:37 p.m. UTC | #2
On 3 June 2016 at 15:22, Mihai Claudiu Caraman <mike.caraman@nxp.com> wrote:
> In particular for virtual and emulated devices the host CPU behaves
> as a DMA coherent 'device'. This should have been stated in patch
> description.

Wouldn't that imply that we should just always have the "dma-coherent"
property set, and we don't need to do any of the messing around looking
at the host sysfs ?

thanks
-- PMM
Alexander Graf June 3, 2016, 3:02 p.m. UTC | #3
On 2016-06-03 16:37, Peter Maydell wrote:
> On 3 June 2016 at 15:22, Mihai Claudiu Caraman <mike.caraman@nxp.com> 
> wrote:
>> In particular for virtual and emulated devices the host CPU behaves
>> as a DMA coherent 'device'. This should have been stated in patch
>> description.
> 
> Wouldn't that imply that we should just always have the "dma-coherent"
> property set, and we don't need to do any of the messing around looking
> at the host sysfs ?

What happens with VFIO targets that don't have coherent DMA on their PCI 
bus then? I think eventually we'll need to be able to spawn a second bus 
with explicit coherency setting.


Alex
Mihai Claudiu Caraman June 3, 2016, 3:16 p.m. UTC | #4
> -----Original Message-----

> From: Peter Maydell [mailto:peter.maydell@linaro.org] 

> Sent: Friday, June 03, 2016 5:38 PM

> To: Mihai Claudiu Caraman <mike.caraman@nxp.com>

> Cc: Bogdan Purcareata <bogdan.purcareata@nxp.com>; QEMU Developers <qemu-devel@nongnu.org>; Peter Crosthwaite <crosthwaite.peter@gmail.com>; Alexander Graf <agraf@suse.de>; qemu-arm <qemu-arm@nongnu.org>; Eric Auger <eric.auger@linaro.org>

> Subject: Re: [Qemu-devel] [PATCH 2/2] arm/virt: Mark pcie controller node as dma-coherent

> 

> On 3 June 2016 at 15:22, Mihai Claudiu Caraman <mike.caraman@nxp.com> wrote:

> > In particular for virtual and emulated devices the host CPU behaves as 

> > a DMA coherent 'device'. This should have been stated in patch 

> > description.

> 

> Wouldn't that imply that we should just always have the "dma-coherent"

> property set, and we don't need to do any of the messing around looking at the host sysfs ?

> 

> thanks

> -- PMM


We can always set "dma-coherent" for virtual and emulated devices but not for passthrough devices. So we can't have one PCIe controller for all devices marked as "dma-coherent".

Regards,
Mike
Alexander Graf June 3, 2016, 3:25 p.m. UTC | #5
On 06/03/2016 05:16 PM, Mihai Claudiu Caraman wrote:
>> -----Original Message-----
>> From: Peter Maydell [mailto:peter.maydell@linaro.org]
>> Sent: Friday, June 03, 2016 5:38 PM
>> To: Mihai Claudiu Caraman <mike.caraman@nxp.com>
>> Cc: Bogdan Purcareata <bogdan.purcareata@nxp.com>; QEMU Developers <qemu-devel@nongnu.org>; Peter Crosthwaite <crosthwaite.peter@gmail.com>; Alexander Graf <agraf@suse.de>; qemu-arm <qemu-arm@nongnu.org>; Eric Auger <eric.auger@linaro.org>
>> Subject: Re: [Qemu-devel] [PATCH 2/2] arm/virt: Mark pcie controller node as dma-coherent
>>
>> On 3 June 2016 at 15:22, Mihai Claudiu Caraman <mike.caraman@nxp.com> wrote:
>>> In particular for virtual and emulated devices the host CPU behaves as
>>> a DMA coherent 'device'. This should have been stated in patch
>>> description.
>> Wouldn't that imply that we should just always have the "dma-coherent"
>> property set, and we don't need to do any of the messing around looking at the host sysfs ?
>>
>> thanks
>> -- PMM
> We can always set "dma-coherent" for virtual and emulated devices but not for passthrough devices. So we can't have one PCIe controller for all devices marked as "dma-coherent".

The original patch is about the case where PCI is already cache coherent 
on the host.

I think at the end of the day this is simply outside of QEMU's scope to 
decide. What we can do is set dma-coherent per default (if Will and Ard 
agree) on the default PCIe bus and add code that allows to spawn a 
secondary PCIe bus which can then have different dma-coherent attributes 
and that you can then plug your non-coherent vfio devices into.


Alex
Mihai Claudiu Caraman June 3, 2016, 4:44 p.m. UTC | #6
-----Original Message-----
From: Alexander Graf [mailto:agraf@suse.de] 

Sent: Friday, June 03, 2016 6:26 PM
To: Mihai Claudiu Caraman <mike.caraman@nxp.com>; Peter Maydell <peter.maydell@linaro.org>
> Cc: Bogdan Purcareata <bogdan.purcareata@nxp.com>; QEMU Developers <qemu-devel@nongnu.org>; Peter Crosthwaite <crosthwaite.peter@gmail.com>; qemu-arm <qemu-arm@nongnu.org>; Eric Auger <eric.auger@linaro.org>

> Subject: Re: [Qemu-devel] [PATCH 2/2] arm/virt: Mark pcie controller node as dma-coherent

> 

> On 06/03/2016 05:16 PM, Mihai Claudiu Caraman wrote:

> >> -----Original Message-----

> >> From: Peter Maydell [mailto:peter.maydell@linaro.org]

> >> Sent: Friday, June 03, 2016 5:38 PM

> >> To: Mihai Claudiu Caraman <mike.caraman@nxp.com>

> >> Cc: Bogdan Purcareata <bogdan.purcareata@nxp.com>; QEMU Developers 

> >> <qemu-devel@nongnu.org>; Peter Crosthwaite 

> >> <crosthwaite.peter@gmail.com>; Alexander Graf <agraf@suse.de>; 

> >> qemu-arm <qemu-arm@nongnu.org>; Eric Auger <eric.auger@linaro.org>

> >> Subject: Re: [Qemu-devel] [PATCH 2/2] arm/virt: Mark pcie controller 

> >> node as dma-coherent

> >>

> >> On 3 June 2016 at 15:22, Mihai Claudiu Caraman <mike.caraman@nxp.com> wrote:

> >>> In particular for virtual and emulated devices the host CPU behaves 

> >>> as a DMA coherent 'device'. This should have been stated in patch 

> >>> description.

> >> Wouldn't that imply that we should just always have the "dma-coherent"

> >> property set, and we don't need to do any of the messing around looking at the host sysfs ?

> >>

> >> thanks

> >> -- PMM

> > We can always set "dma-coherent" for virtual and emulated devices but not for passthrough devices. So we can't have one PCIe controller for all devices marked as "dma-coherent".

> 

> The original patch is about the case where PCI is already cache coherent on the host.

>

> I think at the end of the day this is simply outside of QEMU's scope to decide. What we can do is set dma-coherent per default (if Will and Ard

> agree) on the default PCIe bus and add code that allows to spawn a secondary PCIe bus which can then have different dma-coherent attributes and that you can then plug your non-coherent vfio devices into. 


A dma-coherent default PCIe bus looks fine.

Thanks,
Mike
Eric Auger June 7, 2016, 7:58 a.m. UTC | #7
Hi,
Le 02/06/2016 à 14:45, Alexander Graf a écrit :
> 
> 
> On 02.06.16 14:32, Peter Maydell wrote:
>> On 2 June 2016 at 13:26, Bogdan Purcareata <bogdan.purcareata@nxp.com> wrote:
>>> A PCI device is marked either as coherent or non-coherent based on the pcie
>>> controller "dma-coherent" property. This is further used when configuring the
>>> IOMMU ops for the device DMA resources (e.g. descriptor rings, for e1000e).
>>>
>>> This dma-coherent property needs to be configured in the guest environment,
>>> in case there's a directly assigned VFIO PCI device. Since the guest only
>>> receives one emulated pcie controller bus - regardless of the host
>>> configuration - add this property if there's at least one host pcie host
>>> controller that is DMA coherent (this implies that the host interconnect
>>> is coherent as well).
>>
>> This patch seems to change the property of the emulated PCIe controller
>> based on the host PCIe controller even if we're not doing any PCIe
>> passthrough at all. That seems definitely wrong to me.
>>
>> (Should the purely-emulated case be marked DMA-coherent anyway?
>> I forget the fiddly details...)
> 
> I do too, let's involve a few people who know :). Not exposing it as
> coherent is definitely wrong, but whether "dma-coherent" is the right
> choice I don't know.

For information, on AMD overdrive the PCI host controller is marked as
dma-coherent too but I do not observe any issue with 82574L (e1000e)
and it worked fine as well on X540T1 Intel NIC. I just tested with
e1000e and dma-coherent guest PCI controller and works fine too. Don't
have x540t1 anymore so can't test with that PCIe CARD.

With Intel I35OT1 and guest non dma-coherent PCIe controller I cope with
some issues when assigning the PF but I don't know yet if there is any
relationship.

Best Regards

Eric
> 
> 
> Alex
>
Ard Biesheuvel June 16, 2016, 1:58 p.m. UTC | #8
On 2 June 2016 at 14:45, Alexander Graf <agraf@suse.de> wrote:
>
>
> On 02.06.16 14:32, Peter Maydell wrote:
>> On 2 June 2016 at 13:26, Bogdan Purcareata <bogdan.purcareata@nxp.com> wrote:
>>> A PCI device is marked either as coherent or non-coherent based on the pcie
>>> controller "dma-coherent" property. This is further used when configuring the
>>> IOMMU ops for the device DMA resources (e.g. descriptor rings, for e1000e).
>>>
>>> This dma-coherent property needs to be configured in the guest environment,
>>> in case there's a directly assigned VFIO PCI device. Since the guest only
>>> receives one emulated pcie controller bus - regardless of the host
>>> configuration - add this property if there's at least one host pcie host
>>> controller that is DMA coherent (this implies that the host interconnect
>>> is coherent as well).
>>
>> This patch seems to change the property of the emulated PCIe controller
>> based on the host PCIe controller even if we're not doing any PCIe
>> passthrough at all. That seems definitely wrong to me.
>>
>> (Should the purely-emulated case be marked DMA-coherent anyway?
>> I forget the fiddly details...)
>
> I do too, let's involve a few people who know :). Not exposing it as
> coherent is definitely wrong, but whether "dma-coherent" is the right
> choice I don't know.
>

As far as I understand it, the purely emulated case should be marked
DMA coherent, since otherwise, guest drivers may perform cache
maintenance that the host is not expecting. This is especially harmful
if the guest invalidates the caches after a device to memory transfer,
which may result in data being lost if the data was only present in
the caches to begin with (which is the case for devices that are
emulated by the host)
Peter Maydell June 28, 2016, 2:16 p.m. UTC | #9
On 16 June 2016 at 14:58, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
> On 2 June 2016 at 14:45, Alexander Graf <agraf@suse.de> wrote:
>> On 02.06.16 14:32, Peter Maydell wrote:
>>> This patch seems to change the property of the emulated PCIe controller
>>> based on the host PCIe controller even if we're not doing any PCIe
>>> passthrough at all. That seems definitely wrong to me.
>>>
>>> (Should the purely-emulated case be marked DMA-coherent anyway?
>>> I forget the fiddly details...)
>>
>> I do too, let's involve a few people who know :). Not exposing it as
>> coherent is definitely wrong, but whether "dma-coherent" is the right
>> choice I don't know.

> As far as I understand it, the purely emulated case should be marked
> DMA coherent, since otherwise, guest drivers may perform cache
> maintenance that the host is not expecting. This is especially harmful
> if the guest invalidates the caches after a device to memory transfer,
> which may result in data being lost if the data was only present in
> the caches to begin with (which is the case for devices that are
> emulated by the host)

So the consensus seems to be that:
 * emulated PCI devices definitely need dma-coherent
 * passthrough devices where the host controller is dma-coherent
   also need dma-coherent
 * passthrough devices where the host controller is not dma-coherent
   don't want dma-coherent, but we have to set things per-PCI-controller

Would somebody like to write a patch which just unconditionally
sets the dma-coherent property on our PCI controller dt node?
That seems a clear improvement on what we have at the moment.
We can look at whether we want to support passthrough from a
non-dma-coherent host pci controller (via a 2nd guest pci controller?)
later...

thanks
-- PMM
Bogdan Purcareata June 29, 2016, 7:52 a.m. UTC | #10
On 28.06.2016 17:16, Peter Maydell wrote:
> On 16 June 2016 at 14:58, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:

>> On 2 June 2016 at 14:45, Alexander Graf <agraf@suse.de> wrote:

>>> On 02.06.16 14:32, Peter Maydell wrote:

>>>> This patch seems to change the property of the emulated PCIe controller

>>>> based on the host PCIe controller even if we're not doing any PCIe

>>>> passthrough at all. That seems definitely wrong to me.

>>>>

>>>> (Should the purely-emulated case be marked DMA-coherent anyway?

>>>> I forget the fiddly details...)

>>>

>>> I do too, let's involve a few people who know :). Not exposing it as

>>> coherent is definitely wrong, but whether "dma-coherent" is the right

>>> choice I don't know.

>

>> As far as I understand it, the purely emulated case should be marked

>> DMA coherent, since otherwise, guest drivers may perform cache

>> maintenance that the host is not expecting. This is especially harmful

>> if the guest invalidates the caches after a device to memory transfer,

>> which may result in data being lost if the data was only present in

>> the caches to begin with (which is the case for devices that are

>> emulated by the host)

>

> So the consensus seems to be that:

>  * emulated PCI devices definitely need dma-coherent

>  * passthrough devices where the host controller is dma-coherent

>    also need dma-coherent

>  * passthrough devices where the host controller is not dma-coherent

>    don't want dma-coherent, but we have to set things per-PCI-controller

>

> Would somebody like to write a patch which just unconditionally

> sets the dma-coherent property on our PCI controller dt node?


I will send this patch later today.

Best regards,
Bogdan P.

> That seems a clear improvement on what we have at the moment.

> We can look at whether we want to support passthrough from a

> non-dma-coherent host pci controller (via a 2nd guest pci controller?)

> later...

>

> thanks

> -- PMM

>
diff mbox

Patch

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 371e3a7..b640174 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -867,6 +867,32 @@  static void create_pcie_irq_map(const VirtBoardInfo *vbi, uint32_t gic_phandle,
                            0x7           /* PCI irq */);
 }
 
+static bool is_host_pcie_dma_coherent(void)
+{
+    void *host_fdt;
+    char **node_path;
+    bool ret = false;
+
+    host_fdt = load_device_tree_from_sysfs();
+
+    node_path = qemu_fdt_node_path_prop(host_fdt, "pcie", "dma-coherent",
+            NULL, 0, &error_fatal);
+
+    /* no pcie controllers found on host, therefore non dma-coherent */
+    if (!node_path || !node_path[0]) {
+        goto out;
+    }
+
+    /* for now, if at least one pcie node is dma-coherent,
+     * it is considered that the host is dma-coherent with respect to pcie */
+    ret = true;
+
+out:
+    g_strfreev(node_path);
+    g_free(host_fdt);
+    return ret;
+}
+
 static void create_pcie(const VirtBoardInfo *vbi, qemu_irq *pic,
                         bool use_highmem)
 {
@@ -981,6 +1007,11 @@  static void create_pcie(const VirtBoardInfo *vbi, qemu_irq *pic,
     }
 
     qemu_fdt_setprop_cell(vbi->fdt, nodename, "#interrupt-cells", 1);
+
+    if (is_host_pcie_dma_coherent()) {
+        qemu_fdt_setprop(vbi->fdt, nodename, "dma-coherent", NULL, 0);
+    }
+
     create_pcie_irq_map(vbi, vbi->gic_phandle, irq, nodename);
 
     g_free(nodename);