[v7,07/11] IOMMU: propagate IOMMU Device-TLB flush error up to IOMMU suspending (top level ones)

Message ID 1465376344-28290-8-git-send-email-quan.xu@intel.com (mailing list archive)
State New, archived

Commit Message

Quan Xu June 8, 2016, 8:59 a.m. UTC
From: Quan Xu <quan.xu@intel.com>

Signed-off-by: Quan Xu <quan.xu@intel.com>

CC: Jan Beulich <jbeulich@suse.com>
CC: Liu Jinsong <jinsong.liu@alibaba-inc.com>
CC: Keir Fraser <keir@xen.org>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien.grall@arm.com>
CC: Kevin Tian <kevin.tian@intel.com>
CC: Feng Wu <feng.wu@intel.com>

v7:
  1. return SAVED_ALL at the bottom of device_power_down(), instead
     of SAVED_NONE.
  2. drop the 'if ( error > 0 )', calling device_power_up(error)
     without any if().
  3. for vtd_suspend():
       - drop pointless initializer.
       - return 0 at the bottom to make obvious that no error path
         comes there.
---
 xen/arch/x86/acpi/power.c                     | 73 ++++++++++++++++++++-------
 xen/drivers/passthrough/amd/iommu_init.c      |  9 +++-
 xen/drivers/passthrough/amd/pci_amd_iommu.c   |  2 +-
 xen/drivers/passthrough/iommu.c               |  6 ++-
 xen/drivers/passthrough/vtd/iommu.c           | 35 +++++++++----
 xen/include/asm-x86/hvm/svm/amd-iommu-proto.h |  2 +-
 xen/include/xen/iommu.h                       |  4 +-
 7 files changed, 96 insertions(+), 35 deletions(-)

Comments

Jan Beulich June 8, 2016, 2:51 p.m. UTC | #1
>>> On 08.06.16 at 10:59, <quan.xu@intel.com> wrote:
> @@ -169,6 +203,7 @@ static int enter_state(u32 state)

Right above here we have

    if ( (error = device_power_down()) )

which is now wrong as long as SAVED_ALL is not zero.

>      {
>          printk(XENLOG_ERR "Some devices failed to power down.");
>          system_state = SYS_STATE_resume;
> +        device_power_up(error);
>          goto done;

For the goto you need to adjust "error", or else you return
something meaningless (a sort of random positive number) to your
caller.

Jan
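
The concern about the unchanged `if ( (error = device_power_down()) )` test can be illustrated with a small standalone sketch. Everything named `demo_*` below is hypothetical and not taken from the patch: the enum is a shortened stand-in for `enum dev_power_saved`, and `fail_at` only simulates which step fails. The one property it shares with the patch is the ordering, with the first enumerator equal to 0 and the "all saved" enumerator non-zero.

```c
/* Hypothetical, shortened version of the patch's enum: the first
 * enumerator is 0 and the "everything saved" enumerator is non-zero. */
enum demo_saved {
    DEMO_SAVED_NONE,     /* 0: nothing was suspended */
    DEMO_SAVED_CONSOLE,  /* 1: only the console was suspended */
    DEMO_SAVED_ALL,      /* 2: every step succeeded */
};

/* fail_at: 1 = first step fails, 2 = second step fails, 0 = no failure. */
static enum demo_saved demo_power_down(int fail_at)
{
    if ( fail_at == 1 )
        return DEMO_SAVED_NONE;
    if ( fail_at == 2 )
        return DEMO_SAVED_CONSOLE;
    return DEMO_SAVED_ALL;
}

/* The unchanged caller-side test: any non-zero value means "failed". */
static int demo_truthiness_reports_failure(int fail_at)
{
    return demo_power_down(fail_at) != 0;
}

/* Comparing against the success enumerator instead. */
static int demo_compare_reports_failure(int fail_at)
{
    return demo_power_down(fail_at) != DEMO_SAVED_ALL;
}
```

Under these assumptions the truthiness test reports the full-success case as a failure and, because the first enumerator is 0, treats a failure of the very first step as success. The explicit comparison classifies both cases correctly.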
Suravee Suthikulpanit June 9, 2016, 6:58 p.m. UTC | #2
On 6/8/2016 3:59 AM, Xu, Quan wrote:
> From: Quan Xu <quan.xu@intel.com>
>
> Signed-off-by: Quan Xu <quan.xu@intel.com>
>
> CC: Jan Beulich <jbeulich@suse.com>
> CC: Liu Jinsong <jinsong.liu@alibaba-inc.com>
> CC: Keir Fraser <keir@xen.org>
> CC: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> CC: Stefano Stabellini <sstabellini@kernel.org>
> CC: Julien Grall <julien.grall@arm.com>
> CC: Kevin Tian <kevin.tian@intel.com>
> CC: Feng Wu <feng.wu@intel.com>
>
> v7:
>   1. return SAVED_ALL at the bottom of device_power_down(), instead
>      of SAVED_NONE.
>   2. drop the 'if ( error > 0 )', calling device_power_up(error)
>      without any if().
>   3. for vtd_suspend():
>        - drop pointless initializer.
>        - return 0 at the bottom to make obvious that no error path
>          comes there.

Shouldn't the change log for v7 go ...
> ---
... HERE instead so that we don't get this in the commit log.


For AMD part,
Acked-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

Thanks,
Suravee
Tian, Kevin June 12, 2016, 6:58 a.m. UTC | #3
> From: Xu, Quan
> Sent: Wednesday, June 08, 2016 4:59 PM
> 
> From: Quan Xu <quan.xu@intel.com>
> 
> Signed-off-by: Quan Xu <quan.xu@intel.com>
> 
> CC: Jan Beulich <jbeulich@suse.com>
> CC: Liu Jinsong <jinsong.liu@alibaba-inc.com>
> CC: Keir Fraser <keir@xen.org>
> CC: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> CC: Stefano Stabellini <sstabellini@kernel.org>
> CC: Julien Grall <julien.grall@arm.com>
> CC: Kevin Tian <kevin.tian@intel.com>
> CC: Feng Wu <feng.wu@intel.com>
> 

Acked-by: Kevin Tian <kevin.tian@intel.com>
Quan Xu June 12, 2016, 7:42 a.m. UTC | #4
> -----Original Message-----
> From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of Jan
> Beulich
> Sent: Wednesday, June 08, 2016 10:52 PM
> To: Xu, Quan <quan.xu@intel.com>
> Cc: Tian, Kevin <kevin.tian@intel.com>; Stefano Stabellini
> <sstabellini@kernel.org>; Wu, Feng <feng.wu@intel.com>; Liu Jinsong
> <jinsong.liu@alibaba-inc.com>; dario.faggioli@citrix.com;
> xen-devel@lists.xen.org; Julien Grall <julien.grall@arm.com>; Suravee
> Suthikulpanit <suravee.suthikulpanit@amd.com>; Andrew Cooper
> <andrew.cooper3@citrix.com>; Keir Fraser <keir@xen.org>
> Subject: Re: [Xen-devel] [PATCH v7 07/11] IOMMU: propagate IOMMU
> Device-TLB flush error up to IOMMU suspending (top level ones)
>
> >>> On 08.06.16 at 10:59, <quan.xu@intel.com> wrote:
> > @@ -169,6 +203,7 @@ static int enter_state(u32 state)
>
> Right above here we have
>
>     if ( (error = device_power_down()) )
>
> which is now wrong as long as SAVED_ALL is not zero.
>
> >      {
> >          printk(XENLOG_ERR "Some devices failed to power down.");
> >          system_state = SYS_STATE_resume;
> > +        device_power_up(error);
> >          goto done;
>
> For the goto you need to adjust "error", or else you return something
> meaningless (a sort of random positive number) to your caller.
>

Yes, it is still not correct. Could I change it as follows:


-    if ( (error = device_power_down()) )
+    if ( (error = device_power_down()) != SAVED_ALL )
     {
         printk(XENLOG_ERR "Some devices failed to power down.");
         system_state = SYS_STATE_resume;
+        device_power_up(error);
+        error = -EIO;
         goto done;
     }

Quan
Jan Beulich June 13, 2016, 9:25 a.m. UTC | #5
>>> On 12.06.16 at 09:42, <quan.xu@intel.com> wrote:

> 
>> -----Original Message-----
>> From: Xen-devel [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of Jan
>> Beulich
>> Sent: Wednesday, June 08, 2016 10:52 PM
>> To: Xu, Quan <quan.xu@intel.com>
>> Cc: Tian, Kevin <kevin.tian@intel.com>; Stefano Stabellini
>> <sstabellini@kernel.org>; Wu, Feng <feng.wu@intel.com>; Liu Jinsong
>> <jinsong.liu@alibaba-inc.com>; dario.faggioli@citrix.com; xen-
>> devel@lists.xen.org; Julien Grall <julien.grall@arm.com>; Suravee
>> Suthikulpanit <suravee.suthikulpanit@amd.com>; Andrew Cooper
>> <andrew.cooper3@citrix.com>; Keir Fraser <keir@xen.org>
>> Subject: Re: [Xen-devel] [PATCH v7 07/11] IOMMU: propagate IOMMU
>> Device-TLB flush error up to IOMMU suspending (top level ones)
>> 
> 
>> >>> On 08.06.16 at 10:59, <quan.xu@intel.com> wrote:
>> > @@ -169,6 +203,7 @@ static int enter_state(u32 state)
>> 
>> Right above here we have
>> 
>>     if ( (error = device_power_down()) )
>> 
>> which is now wrong as long as SAVED_ALL is not zero.
>> 
>> >      {
>> >          printk(XENLOG_ERR "Some devices failed to power down.");
>> >          system_state = SYS_STATE_resume;
>> > +        device_power_up(error);
>> >          goto done;
>> 
>> For the goto you need to adjust "error", or else you return something
>> meaningless (a sort of random positive number) to your caller.
>> 
> 
> Yes, it is still not correct. Could I change it as following: 
> 
> 
> -    if ( (error = device_power_down()) )
> +    if ( (error = device_power_down()) != SAVED_ALL )
>      {
>          printk(XENLOG_ERR "Some devices failed to power down.");
>          system_state = SYS_STATE_resume;
> +        device_power_up(error);
> +        error = -EIO;
>          goto done;
>      }

This would address only part of the issue afaict - SAVED_ALL
not being zero would still result in the function returning a
positive value instead of zero in the success case. But to be
honest I don't see why an issue this simple to solve requires
any kind of discussion on how to deal with it.

Jan
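
One possible way to cover both halves of the issue (a meaningful negative error on failure, zero on success) is to translate the enum into the usual errno convention at a single point. The sketch below is only an illustration under stated assumptions: all `sketch_*` names are hypothetical, the counter stands in for the real resume calls, and `-EIO` is borrowed from the proposal quoted above rather than from any final version of the patch.

```c
#include <errno.h>

/* Hypothetical, shortened stand-in for enum dev_power_saved. */
enum sketch_saved {
    SKETCH_SAVED_NONE,     /* nothing to undo */
    SKETCH_SAVED_CONSOLE,  /* console suspended, later step failed */
    SKETCH_SAVED_ALL,      /* every step succeeded */
};

/* Counts how many steps were undone; stands in for the resume calls. */
static int sketch_resumed;

/* Undo only the steps that were completed, in reverse order. */
static void sketch_power_up(enum sketch_saved saved)
{
    switch ( saved )
    {
    case SKETCH_SAVED_ALL:
        sketch_resumed++;      /* would undo the last step */
        /* fall through */
    case SKETCH_SAVED_CONSOLE:
        sketch_resumed++;      /* would undo the console step */
        /* fall through */
    case SKETCH_SAVED_NONE:
        break;
    }
}

/* Map the enum to "0 on success, negative errno on failure",
 * rolling back any partially completed power-down first. */
static int sketch_finish_power_down(enum sketch_saved saved)
{
    if ( saved == SKETCH_SAVED_ALL )
        return 0;

    sketch_power_up(saved);
    return -EIO;
}
```

With a helper of this shape, the caller never sees the enum value as its return code: success yields 0, and any partial power-down is rolled back before a negative error is reported.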

Patch

diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c
index 2885e31..717a809 100644
--- a/xen/arch/x86/acpi/power.c
+++ b/xen/arch/x86/acpi/power.c
@@ -43,36 +43,70 @@  struct acpi_sleep_info acpi_sinfo;
 
 void do_suspend_lowlevel(void);
 
+enum dev_power_saved
+{
+    SAVED_NONE,
+    SAVED_CONSOLE,
+    SAVED_TIME,
+    SAVED_I8259A,
+    SAVED_IOAPIC,
+    SAVED_IOMMU,
+    SAVED_LAPIC,
+    SAVED_ALL,
+};
+
 static int device_power_down(void)
 {
-    console_suspend();
+    if ( console_suspend() )
+        return SAVED_NONE;
 
-    time_suspend();
+    if ( time_suspend() )
+        return SAVED_CONSOLE;
 
-    i8259A_suspend();
+    if ( i8259A_suspend() )
+        return SAVED_TIME;
 
+    /* ioapic_suspend cannot fail */
     ioapic_suspend();
 
-    iommu_suspend();
+    if ( iommu_suspend() )
+        return SAVED_IOAPIC;
 
-    lapic_suspend();
+    if ( lapic_suspend() )
+        return SAVED_IOMMU;
 
-    return 0;
+    return SAVED_ALL;
 }
 
-static void device_power_up(void)
+static void device_power_up(enum dev_power_saved saved)
 {
-    lapic_resume();
-
-    iommu_resume();
-
-    ioapic_resume();
-
-    i8259A_resume();
-
-    time_resume();
-
-    console_resume();
+    switch ( saved )
+    {
+    case SAVED_ALL:
+    case SAVED_LAPIC:
+        lapic_resume();
+        /* fall through */
+    case SAVED_IOMMU:
+        iommu_resume();
+        /* fall through */
+    case SAVED_IOAPIC:
+        ioapic_resume();
+        /* fall through */
+    case SAVED_I8259A:
+        i8259A_resume();
+        /* fall through */
+    case SAVED_TIME:
+        time_resume();
+        /* fall through */
+    case SAVED_CONSOLE:
+        console_resume();
+        /* fall through */
+    case SAVED_NONE:
+        break;
+    default:
+        BUG();
+        break;
+    }
 }
 
 static void freeze_domains(void)
@@ -169,6 +203,7 @@  static int enter_state(u32 state)
     {
         printk(XENLOG_ERR "Some devices failed to power down.");
         system_state = SYS_STATE_resume;
+        device_power_up(error);
         goto done;
     }
 
@@ -196,7 +231,7 @@  static int enter_state(u32 state)
     write_cr4(cr4 & ~X86_CR4_MCE);
     write_efer(read_efer());
 
-    device_power_up();
+    device_power_up(SAVED_ALL);
 
     mcheck_init(&boot_cpu_data, 0);
     write_cr4(cr4);
diff --git a/xen/drivers/passthrough/amd/iommu_init.c b/xen/drivers/passthrough/amd/iommu_init.c
index 4536106..0b68596 100644
--- a/xen/drivers/passthrough/amd/iommu_init.c
+++ b/xen/drivers/passthrough/amd/iommu_init.c
@@ -1339,7 +1339,14 @@  static void invalidate_all_devices(void)
     iterate_ivrs_mappings(_invalidate_all_devices);
 }
 
-void amd_iommu_suspend(void)
+int amd_iommu_suspend(void)
+{
+    amd_iommu_crash_shutdown();
+
+    return 0;
+}
+
+void amd_iommu_crash_shutdown(void)
 {
     struct amd_iommu *iommu;
 
diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c b/xen/drivers/passthrough/amd/pci_amd_iommu.c
index 4a860af..7761241 100644
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -633,6 +633,6 @@  const struct iommu_ops amd_iommu_ops = {
     .suspend = amd_iommu_suspend,
     .resume = amd_iommu_resume,
     .share_p2m = amd_iommu_share_p2m,
-    .crash_shutdown = amd_iommu_suspend,
+    .crash_shutdown = amd_iommu_crash_shutdown,
     .dump_p2m_table = amd_dump_p2m_table,
 };
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 3a73fab..a9898fc 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -379,10 +379,12 @@  int __init iommu_setup(void)
     return rc;
 }
 
-void iommu_suspend()
+int iommu_suspend()
 {
     if ( iommu_enabled )
-        iommu_get_ops()->suspend();
+        return iommu_get_ops()->suspend();
+
+    return 0;
 }
 
 void iommu_resume()
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 5366267..0f17afb 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -541,7 +541,7 @@  static int iommu_flush_iotlb_psi(
     return status;
 }
 
-static void iommu_flush_all(void)
+static int __must_check iommu_flush_all(void)
 {
     struct acpi_drhd_unit *drhd;
     struct iommu *iommu;
@@ -555,6 +555,8 @@  static void iommu_flush_all(void)
         flush_dev_iotlb = find_ats_dev_drhd(iommu) ? 1 : 0;
         iommu_flush_iotlb_global(iommu, 0, flush_dev_iotlb);
     }
+
+    return 0;
 }
 
 static void __intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn,
@@ -1259,7 +1261,9 @@  static void __hwdom_init intel_iommu_hwdom_init(struct domain *d)
     setup_hwdom_pci_devices(d, setup_hwdom_device);
     setup_hwdom_rmrr(d);
 
-    iommu_flush_all();
+    if ( iommu_flush_all() )
+        printk(XENLOG_WARNING VTDPREFIX
+               " IOMMU flush all failed for hardware domain\n");
 
     for_each_drhd_unit ( drhd )
     {
@@ -2001,7 +2005,7 @@  int adjust_vtd_irq_affinities(void)
 }
 __initcall(adjust_vtd_irq_affinities);
 
-static int init_vtd_hw(void)
+static int __must_check init_vtd_hw(void)
 {
     struct acpi_drhd_unit *drhd;
     struct iommu *iommu;
@@ -2099,8 +2103,8 @@  static int init_vtd_hw(void)
             return -EIO;
         }
     }
-    iommu_flush_all();
-    return 0;
+
+    return iommu_flush_all();
 }
 
 static void __hwdom_init setup_hwdom_rmrr(struct domain *d)
@@ -2389,16 +2393,25 @@  static int intel_iommu_group_id(u16 seg, u8 bus, u8 devfn)
 }
 
 static u32 iommu_state[MAX_IOMMUS][MAX_IOMMU_REGS];
-static void vtd_suspend(void)
+
+static int __must_check vtd_suspend(void)
 {
     struct acpi_drhd_unit *drhd;
     struct iommu *iommu;
     u32    i;
+    int rc;
 
     if ( !iommu_enabled )
-        return;
+        return 0;
 
-    iommu_flush_all();
+    rc = iommu_flush_all();
+    if ( unlikely(rc) )
+    {
+        printk(XENLOG_WARNING VTDPREFIX
+               " suspend: IOMMU flush all failed: %d\n", rc);
+
+        return rc;
+    }
 
     for_each_drhd_unit ( drhd )
     {
@@ -2427,6 +2440,8 @@  static void vtd_suspend(void)
         if ( !iommu_intremap && iommu_qinval )
             disable_qinval(iommu);
     }
+
+    return 0;
 }
 
 static void vtd_crash_shutdown(void)
@@ -2437,7 +2452,9 @@  static void vtd_crash_shutdown(void)
     if ( !iommu_enabled )
         return;
 
-    iommu_flush_all();
+    if ( iommu_flush_all() )
+        printk(XENLOG_WARNING VTDPREFIX
+               " crash shutdown: IOMMU flush all failed\n");
 
     for_each_drhd_unit ( drhd )
     {
diff --git a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
index ac9f036..d08dc0b 100644
--- a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
+++ b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
@@ -119,7 +119,7 @@  extern unsigned long *shared_intremap_inuse;
 
 /* power management support */
 void amd_iommu_resume(void);
-void amd_iommu_suspend(void);
+int __must_check amd_iommu_suspend(void);
 void amd_iommu_crash_shutdown(void);
 
 /* guest iommu support */
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 57c9fbc..6535937 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -175,7 +175,7 @@  struct iommu_ops {
     unsigned int (*read_apic_from_ire)(unsigned int apic, unsigned int reg);
     int (*setup_hpet_msi)(struct msi_desc *);
 #endif /* CONFIG_X86 */
-    void (*suspend)(void);
+    int __must_check (*suspend)(void);
     void (*resume)(void);
     void (*share_p2m)(struct domain *d);
     void (*crash_shutdown)(void);
@@ -185,7 +185,7 @@  struct iommu_ops {
     void (*dump_p2m_table)(struct domain *d);
 };
 
-void iommu_suspend(void);
+int __must_check iommu_suspend(void);
 void iommu_resume(void);
 void iommu_crash_shutdown(void);
 int iommu_get_reserved_device_memory(iommu_grdm_t *, void *);
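
The propagation pattern the patch applies to `iommu_suspend()` can be sketched in isolation as follows. All `demo_*` names are hypothetical, the ops table is passed as a parameter instead of being fetched with `iommu_get_ops()`, and `demo_must_check` is assumed to behave like `__must_check`, here expressed with the GCC/Clang `warn_unused_result` attribute.

```c
#include <errno.h>

/* Assumed equivalent of __must_check: warn if the result is ignored. */
#define demo_must_check __attribute__((warn_unused_result))

/* Hypothetical, minimal ops table with an int-returning suspend hook. */
struct demo_iommu_ops {
    int (*suspend)(void);
};

static int demo_suspend_ok(void)   { return 0; }
static int demo_suspend_fail(void) { return -EIO; }

static const struct demo_iommu_ops demo_ok_ops   = { .suspend = demo_suspend_ok };
static const struct demo_iommu_ops demo_fail_ops = { .suspend = demo_suspend_fail };

/* Stands in for the global iommu_enabled flag. */
static int demo_iommu_enabled = 1;

/* Forward the vendor hook's result instead of discarding it;
 * a disabled IOMMU has nothing to suspend and reports success. */
static demo_must_check int demo_iommu_suspend(const struct demo_iommu_ops *ops)
{
    if ( demo_iommu_enabled )
        return ops->suspend();

    return 0;
}
```

Marking the wrapper this way means a caller that drops the return value gets a compiler warning, which is what forces each level of the suspend path to either handle or pass on the error.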