[v3] passthrough/vtd: Don't DMA to the stack in queue_invalidate_wait()

Message ID 20190716162355.1321-1-andrew.cooper3@citrix.com
State New, archived

Commit Message

Andrew Cooper July 16, 2019, 4:23 p.m. UTC
DMA-ing to the stack is considered bad practice.  In this case, if a
timeout occurs because of a sluggish device which is processing the
request, the completion notification will corrupt the stack of a
subsequent deeper call tree.
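
As an illustration of the failure mode, consider the following minimal
userspace sketch (not Xen code; the function names and the simulated
"device" are purely hypothetical stand-ins for the wait descriptor and a
later call path):

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    static uint32_t *dma_target;        /* address programmed into the device */

    static void issue_wait_descriptor(void)
    {
        uint32_t poll_slot = 0;         /* lives only in this stack frame */

        dma_target = &poll_slot;        /* device will write here... eventually */
        /* ... poll, hit the timeout, return with the write still pending ... */
    }

    static void deeper_call_tree(void)
    {
        uint32_t local = 42;            /* may occupy the recycled stack slot */

        /* Deliberate undefined behaviour, standing in for the hardware's
         * late completion write through the stale address. */
        *dma_target = 1;

        printf("local = %" PRIu32 "\n", local); /* may observe 1, not 42 */
    }

    int main(void)
    {
        issue_wait_descriptor();
        deeper_call_tree();
        return 0;
    }

Whether the corruption is actually visible depends on the exact stack
layout at the time the write lands, which is what makes the real bug so
hard to diagnose.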

Place the poll_slot in a percpu area and DMA to that instead.

Fix the declaration of saddr in struct qinval_entry, to avoid a shift by
two.  The requirement here is that the DMA address is dword aligned,
which is covered by poll_slot's type.
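
The equivalence of the two encodings can be sanity-checked outside Xen
with a short standalone sketch (plain C, illustrative only; it merely
demonstrates the arithmetic argued above):

    #include <assert.h>
    #include <stdint.h>

    int main(void)
    {
        static uint32_t poll_slot;      /* stands in for the per-CPU slot */
        uint64_t addr = (uint64_t)(uintptr_t)&poll_slot;

        /* A uint32_t object is dword aligned, so the low two bits are 0. */
        assert((addr & 3) == 0);

        uint64_t old_layout = (addr >> 2) << 2; /* saddr:62 above res_1:2 == 0 */
        uint64_t new_layout = addr;             /* full-width 64-bit saddr */

        assert(old_layout == new_layout);
        return 0;
    }

In other words, with res_1 fixed at zero, storing addr >> 2 in a 62-bit
field at bit 2 yields exactly the same descriptor bits as storing the
full dword-aligned address in a plain u64.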

This change does not address other issues.  Correlating completions
after a timeout with their request is a more complicated change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Kevin Tian <kevin.tian@intel.com>

It turns out that this has been pending since 4.10, and is grossly late.

v3:
 * Fix saddr declaration to drop a shift-by-two.
 * Drop volatile attribute.  Use ACCESS_ONCE() instead.
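
For reference, Xen's ACCESS_ONCE() is the classic one-liner (modulo
details, something along the lines of):

    /* Force exactly one access, through a volatile-qualified lvalue. */
    #define ACCESS_ONCE(x) (*(volatile typeof(x) *)&(x))

This confines the volatile semantics to the two accesses that actually
need them (the initialising write and the polled read), instead of
making every access to poll_slot volatile.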
---
 xen/drivers/passthrough/vtd/iommu.h  | 3 +--
 xen/drivers/passthrough/vtd/qinval.c | 9 +++++----
 2 files changed, 6 insertions(+), 6 deletions(-)

Comments

Jan Beulich July 16, 2019, 4:33 p.m. UTC | #1
On 16.07.2019 18:23, Andrew Cooper wrote:
> [...]
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <JBeulich@suse.com>

Must have been quite some time since v2 ...

Jan
Andrew Cooper July 16, 2019, 4:42 p.m. UTC | #2
On 16/07/2019 17:33, Jan Beulich wrote:
> On 16.07.2019 18:23, Andrew Cooper wrote:
>> [...]
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Reviewed-by: Jan Beulich <JBeulich@suse.com>
>
> Must have been quite some time since v2 ...

Oct. 19, 2017, 4:22 p.m. UTC according to patchwork.

And now I can talk about it: that's when I was getting really stuck into
trying to fix Spectre/Meltdown.

~Andrew
Tian, Kevin July 24, 2019, 12:52 a.m. UTC | #3
> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> Sent: Wednesday, July 17, 2019 12:24 AM
> 
> [...]
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

Patch

diff --git a/xen/drivers/passthrough/vtd/iommu.h b/xen/drivers/passthrough/vtd/iommu.h
index 1a992f72d6..c9290a3996 100644
--- a/xen/drivers/passthrough/vtd/iommu.h
+++ b/xen/drivers/passthrough/vtd/iommu.h
@@ -444,8 +444,7 @@ struct qinval_entry {
                     sdata   : 32;
             }lo;
             struct {
-                u64 res_1   : 2,
-                    saddr   : 62;
+                u64 saddr;
             }hi;
         }inv_wait_dsc;
     }q;
diff --git a/xen/drivers/passthrough/vtd/qinval.c b/xen/drivers/passthrough/vtd/qinval.c
index 01447cf9a8..09cbd36ebb 100644
--- a/xen/drivers/passthrough/vtd/qinval.c
+++ b/xen/drivers/passthrough/vtd/qinval.c
@@ -147,13 +147,15 @@ static int __must_check queue_invalidate_wait(struct iommu *iommu,
                                               u8 iflag, u8 sw, u8 fn,
                                               bool_t flush_dev_iotlb)
 {
-    volatile u32 poll_slot = QINVAL_STAT_INIT;
+    static DEFINE_PER_CPU(uint32_t, poll_slot);
     unsigned int index;
     unsigned long flags;
     u64 entry_base;
     struct qinval_entry *qinval_entry, *qinval_entries;
+    uint32_t *this_poll_slot = &this_cpu(poll_slot);
 
     spin_lock_irqsave(&iommu->register_lock, flags);
+    ACCESS_ONCE(*this_poll_slot) = QINVAL_STAT_INIT;
     index = qinval_next_index(iommu);
     entry_base = iommu_qi_ctrl(iommu)->qinval_maddr +
                  ((index >> QINVAL_ENTRY_ORDER) << PAGE_SHIFT);
@@ -166,8 +168,7 @@ static int __must_check queue_invalidate_wait(struct iommu *iommu,
     qinval_entry->q.inv_wait_dsc.lo.fn = fn;
     qinval_entry->q.inv_wait_dsc.lo.res_1 = 0;
     qinval_entry->q.inv_wait_dsc.lo.sdata = QINVAL_STAT_DONE;
-    qinval_entry->q.inv_wait_dsc.hi.res_1 = 0;
-    qinval_entry->q.inv_wait_dsc.hi.saddr = virt_to_maddr(&poll_slot) >> 2;
+    qinval_entry->q.inv_wait_dsc.hi.saddr = virt_to_maddr(this_poll_slot);
 
     unmap_vtd_domain_page(qinval_entries);
     qinval_update_qtail(iommu, index);
@@ -182,7 +183,7 @@ static int __must_check queue_invalidate_wait(struct iommu *iommu,
         timeout = NOW() + MILLISECS(flush_dev_iotlb ?
                                     iommu_dev_iotlb_timeout : VTD_QI_TIMEOUT);
 
-        while ( poll_slot != QINVAL_STAT_DONE )
+        while ( ACCESS_ONCE(*this_poll_slot) != QINVAL_STAT_DONE )
         {
             if ( NOW() > timeout )
             {