Message ID | 1458197676-60696-3-git-send-email-quan.xu@intel.com (mailing list archive)
---|---
State | New, archived
> From: Xu, Quan
> Sent: Thursday, March 17, 2016 2:55 PM
>
> Current code would panic() when a VT-d Device-TLB flush timed out.
> The panic() is going to be eliminated, so we must check all kinds of
> errors all the way up the call trees.

Sorry, I'm unclear what the criteria are for defining high-level and
low-level functions across the two patches. Could you elaborate? At first
I thought you might mean common code vs. vendor-specific code, however...

> Signed-off-by: Quan Xu <quan.xu@intel.com>
>
> CC: Jun Nakajima <jun.nakajima@intel.com>
> CC: Kevin Tian <kevin.tian@intel.com>
> CC: George Dunlap <george.dunlap@eu.citrix.com>
> CC: Keir Fraser <keir@xen.org>
> CC: Jan Beulich <jbeulich@suse.com>
> CC: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> CC: Stefano Stabellini <stefano.stabellini@citrix.com>
> CC: Julien Grall <julien.grall@arm.com>
> CC: Feng Wu <feng.wu@intel.com>
> CC: Dario Faggioli <dario.faggioli@citrix.com>
> ---
>  xen/arch/x86/mm/p2m-ept.c                     |   2 +-
>  xen/drivers/passthrough/amd/iommu_init.c      |  12 ++-
>  xen/drivers/passthrough/amd/pci_amd_iommu.c   |   2 +-
>  xen/drivers/passthrough/arm/smmu.c            |  10 ++-
>  xen/drivers/passthrough/iommu.c               |  17 ++--
>  xen/drivers/passthrough/vtd/extern.h          |   2 +-
>  xen/drivers/passthrough/vtd/iommu.c           | 120 ++++++++++++++++++--------
>  xen/drivers/passthrough/vtd/quirks.c          |  26 +++---
>  xen/include/asm-x86/hvm/svm/amd-iommu-proto.h |   2 +-
>  xen/include/asm-x86/iommu.h                   |   2 +-
>  xen/include/xen/iommu.h                       |   6 +-

Above you have the general passthrough/iommu.c, though most of the others
are vendor-specific.

Then in PATCH [1/2], you have:

 xen/arch/x86/acpi/power.c             | 14 +++++++++++++-
 xen/arch/x86/mm.c                     | 13 ++++++++-----
 xen/arch/x86/mm/p2m-ept.c             | 10 +++++++++-
 xen/arch/x86/mm/p2m-pt.c              | 12 ++++++++++--
 xen/common/grant_table.c              |  5 +++--
 xen/common/memory.c                   |  5 +++--
 xen/drivers/passthrough/iommu.c       | 16 +++++++++++-----
 xen/drivers/passthrough/vtd/x86/vtd.c |  7 +++++--
 xen/drivers/passthrough/x86/iommu.c   |  6 +++++-
 xen/include/xen/iommu.h               |  6 +++---

They are also mixed.

Thanks
Kevin
On Thu, Mar 17, 2016 at 6:54 AM, Quan Xu <quan.xu@intel.com> wrote:
> Current code would panic() when a VT-d Device-TLB flush timed out.
> The panic() is going to be eliminated, so we must check all kinds of
> errors all the way up the call trees.
>
> Signed-off-by: Quan Xu <quan.xu@intel.com>
>
> CC: Jun Nakajima <jun.nakajima@intel.com>
> CC: Kevin Tian <kevin.tian@intel.com>
> CC: George Dunlap <george.dunlap@eu.citrix.com>
> CC: Keir Fraser <keir@xen.org>
> CC: Jan Beulich <jbeulich@suse.com>
> CC: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> CC: Stefano Stabellini <stefano.stabellini@citrix.com>
> CC: Julien Grall <julien.grall@arm.com>
> CC: Feng Wu <feng.wu@intel.com>
> CC: Dario Faggioli <dario.faggioli@citrix.com>
> ---
>  xen/arch/x86/mm/p2m-ept.c                     |   2 +-
>  xen/drivers/passthrough/amd/iommu_init.c      |  12 ++-
>  xen/drivers/passthrough/amd/pci_amd_iommu.c   |   2 +-
>  xen/drivers/passthrough/arm/smmu.c            |  10 ++-
>  xen/drivers/passthrough/iommu.c               |  17 ++--
>  xen/drivers/passthrough/vtd/extern.h          |   2 +-
>  xen/drivers/passthrough/vtd/iommu.c           | 120 ++++++++++++++++++--------
>  xen/drivers/passthrough/vtd/quirks.c          |  26 +++---
>  xen/include/asm-x86/hvm/svm/amd-iommu-proto.h |   2 +-
>  xen/include/asm-x86/iommu.h                   |   2 +-
>  xen/include/xen/iommu.h                       |   6 +-
>  11 files changed, 133 insertions(+), 68 deletions(-)
>
> diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
> index f9bcce7..fa6c710 100644
> --- a/xen/arch/x86/mm/p2m-ept.c
> +++ b/xen/arch/x86/mm/p2m-ept.c
> @@ -825,7 +825,7 @@ out:
>           need_modify_vtd_table )
>      {
>          if ( iommu_hap_pt_share )
> -            iommu_pte_flush(d, gfn, &ept_entry->epte, order, vtd_pte_present);
> +            rc = iommu_pte_flush(d, gfn, &ept_entry->epte, order, vtd_pte_present);

So this sort of changes the meaning of the "rc" check near the end of the
function, where we check whether we want to update the altp2m.

As it happens, I *think* it doesn't matter, because you can't have altp2m
and passthrough enabled at the same time, right? If so, this at least
merits a comment above the altp2m check; something like:

"NB that if altp2m is enabled, rc cannot be non-zero here due to
iommu_pte_flush, since you can't have altp2m and pass-through enabled at
the same time."

If you *can* have both altp2m and pass-through, then we need to make sure
that the altp2m gets updated when the hostp2m is updated, even if the
iommu flush fails.

 -George
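George's concern can be reduced to a control-flow stub. This is only an illustration of the semantic change being discussed (stand-in types and a hypothetical error value, not the real ept_set_entry() code): once `rc` captures the IOMMU flush result, a later `rc == 0` gate on the altp2m update silently starts depending on the flush having succeeded.

```c
#include <stdbool.h>

/* Hypothetical stand-in for iommu_pte_flush(): returns 0 on success,
 * a negative errno-style value on failure. */
static int iommu_pte_flush_stub(bool fail)
{
    return fail ? -5 /* e.g. -EIO */ : 0;
}

/* Returns whether the (pre-existing) "if ( rc == 0 )" altp2m check at the
 * end of the function would still let the altp2m update run. */
static bool altp2m_updated_after_set(bool share_pt, bool flush_fails)
{
    int rc = 0;

    if ( share_pt )
        rc = iommu_pte_flush_stub(flush_fails);  /* newly assigned to rc */

    /* Before the patch, a flush failure could not reach this check. */
    return rc == 0;
}
```

With pass-through enabled and a failing flush, the altp2m propagation would now be skipped, which is exactly the interaction George asks about.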
On March 17, 2016 3:38pm, Tian, Kevin <kevin.tian@intel.com> wrote:
> > From: Xu, Quan
> > Sent: Thursday, March 17, 2016 2:55 PM
> >
> > Current code would panic() when a VT-d Device-TLB flush timed out.
> > The panic() is going to be eliminated, so we must check all kinds of
> > errors all the way up the call trees.
>
> Sorry, I'm unclear what the criteria are for defining high-level and
> low-level functions across the two patches. Could you elaborate? At
> first I thought you might mean common code vs. vendor-specific code,
> however...
>

In this patch set, I am adjusting the top-level functions of the call
trees first, then working my way down to the leaf ones. I tried to define
the top level as mainly about the MMU, and the low level as mainly about
the IOMMU. The mixed files are there for compilation reasons and for
simplicity. For the top level of these call trees, IMO it is a reasonable
attempt at splitting things.

> > xen/arch/x86/mm/p2m-ept.c                     |   2 +-
> > xen/drivers/passthrough/amd/iommu_init.c      |  12 ++-
> > xen/drivers/passthrough/amd/pci_amd_iommu.c   |   2 +-
> > xen/drivers/passthrough/arm/smmu.c            |  10 ++-
> > xen/drivers/passthrough/iommu.c               |  17 ++--
> > xen/drivers/passthrough/vtd/extern.h          |   2 +-
> > xen/drivers/passthrough/vtd/iommu.c           | 120 ++++++++++++++++++--------
> > xen/drivers/passthrough/vtd/quirks.c          |  26 +++---
> > xen/include/asm-x86/hvm/svm/amd-iommu-proto.h |   2 +-
> > xen/include/asm-x86/iommu.h                   |   2 +-
> > xen/include/xen/iommu.h                       |   6 +-
>
> Above you have the general passthrough/iommu.c, though most of the
> others are vendor-specific.
>

E.g., 'struct iommu_ops' is a common structure for arm/amd/intel.

> Then in PATCH [1/2], you have:
>
> xen/arch/x86/acpi/power.c             | 14 +++++++++++++-
> xen/arch/x86/mm.c                     | 13 ++++++++-----
> xen/arch/x86/mm/p2m-ept.c             | 10 +++++++++-
> xen/arch/x86/mm/p2m-pt.c              | 12 ++++++++++--
> xen/common/grant_table.c              |  5 +++--
> xen/common/memory.c                   |  5 +++--
> xen/drivers/passthrough/iommu.c       | 16 +++++++++++-----
> xen/drivers/passthrough/vtd/x86/vtd.c |  7 +++++--
> xen/drivers/passthrough/x86/iommu.c   |  6 +++++-
> xen/include/xen/iommu.h               |  6 +++---
>
> They are also mixed.
>

That is for compilation reasons and for simplicity. E.g., for this call
tree:

    ... -- iommu_suspend() -- device_power_down() -- ...

device_power_down() is in xen/arch/x86/acpi/power.c, and iommu_suspend()
is in xen/drivers/passthrough/iommu.c. When I tried to return an error
code from iommu_suspend(), which currently has a 'void' return type, I
needed to change it from 'void' to 'int', so it was unavoidable to mix
things.

Any good idea? To be honest, I am very tired of splitting things like
this :).

Quan
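The iommu_suspend() / device_power_down() example Quan gives boils down to a plain signature change. A minimal sketch of that pattern (stand-in bodies and a hypothetical error value, not the real Xen functions):

```c
/* Sketch: iommu_suspend() goes from void to int so its caller,
 * device_power_down(), can propagate the error instead of having it
 * silently swallowed. */
static int iommu_suspend_stub(int vtd_flush_err)
{
    /* Previously "void iommu_suspend(void)", which dropped this result. */
    return vtd_flush_err;   /* e.g. -ETIMEDOUT from a Device-TLB flush */
}

static int device_power_down_stub(int vtd_flush_err)
{
    int err = iommu_suspend_stub(vtd_flush_err);

    if ( err )
        return err;         /* caller can now unwind instead of panicking */

    /* ... power down remaining devices ... */
    return 0;
}
```

This is why a change confined "logically" to passthrough/iommu.c still has to touch arch code: every caller of the re-typed function must start consuming the return value.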
On March 17, 2016 11:31pm, <George.Dunlap@eu.citrix.com> wrote: > On Thu, Mar 17, 2016 at 6:54 AM, Quan Xu <quan.xu@intel.com> wrote: > > Current code would be panic(), when VT-d Device-TLB flush timed out. > > the panic() is going to be eliminated, so we must check all kinds of > > error and all the way up the call trees. > > > > Signed-off-by: Quan Xu <quan.xu@intel.com> > > > > CC: Jun Nakajima <jun.nakajima@intel.com> > > CC: Kevin Tian <kevin.tian@intel.com> > > CC: George Dunlap <george.dunlap@eu.citrix.com> > > CC: Keir Fraser <keir@xen.org> > > CC: Jan Beulich <jbeulich@suse.com> > > CC: Andrew Cooper <andrew.cooper3@citrix.com> > > CC: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> > > CC: Stefano Stabellini <stefano.stabellini@citrix.com> > > CC: Julien Grall <julien.grall@arm.com> > > CC: Feng Wu <feng.wu@intel.com> > > CC: Dario Faggioli <dario.faggioli@citrix.com> > > --- > > xen/arch/x86/mm/p2m-ept.c | 2 +- > > xen/drivers/passthrough/amd/iommu_init.c | 12 ++- > > xen/drivers/passthrough/amd/pci_amd_iommu.c | 2 +- > > xen/drivers/passthrough/arm/smmu.c | 10 ++- > > xen/drivers/passthrough/iommu.c | 17 ++-- > > xen/drivers/passthrough/vtd/extern.h | 2 +- > > xen/drivers/passthrough/vtd/iommu.c | 120 > ++++++++++++++++++-------- > > xen/drivers/passthrough/vtd/quirks.c | 26 +++--- > > xen/include/asm-x86/hvm/svm/amd-iommu-proto.h | 2 +- > > xen/include/asm-x86/iommu.h | 2 +- > > xen/include/xen/iommu.h | 6 +- > > 11 files changed, 133 insertions(+), 68 deletions(-) > > > > diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c > > index f9bcce7..fa6c710 100644 > > --- a/xen/arch/x86/mm/p2m-ept.c > > +++ b/xen/arch/x86/mm/p2m-ept.c > > @@ -825,7 +825,7 @@ out: > > need_modify_vtd_table ) > > { > > if ( iommu_hap_pt_share ) > > - iommu_pte_flush(d, gfn, &ept_entry->epte, order, > vtd_pte_present); > > + rc = iommu_pte_flush(d, gfn, &ept_entry->epte, order, > > + vtd_pte_present); > > So this sort changes the meaning of the "rc" check 
> near the end of the function, when we check whether we want to update
> the altp2m.
>
> As it happens, I *think* it doesn't matter, because you can't have
> altp2m and passthrough enabled at the same time, right?

I think that is _not_ the case. I have gone through the altp2m design and
its design discussion. It looks like Xen _can_ have altp2m and passthrough
enabled at the same time. Also, you can refer to
http://lists.xenproject.org/archives/html/xen-devel/2015-01/msg01963.html

Andrew, could you help me double-check it?

> If so, this at least merits a comment above the altp2m check; something
> like:
>
> "NB that if altp2m is enabled, rc cannot be non-zero here due to
> iommu_pte_flush, since you can't have altp2m and pass-through enabled at
> the same time."
>
> If you *can* have both altp2m and pass-through, then we need to make
> sure that the altp2m gets updated when the hostp2m is updated, even if
> the iommu flush fails.
>

I think it _can_; let's wait for the above check first.

Quan
>>> On 18.03.16 at 03:30, <quan.xu@intel.com> wrote:
> Any good idea? To be honest, I am very tired of splitting things like
> this :).

I understand that this is a tedious task; the code should have been
propagating errors from the beginning.

The most natural way of splitting things would be to go function by
function, top-level ones first and leaf ones last, one function per patch
(maybe pairs of functions, as in the map/unmap case). Such a model would
be problematic (almost) only when there's recursion at some point, which I
don't think is the case anywhere here. As mentioned before, the
__must_check annotation is a great help in not missing any callers - both
to you as you put things together and to the reviewers, to ascertain that
nothing was missed.

Jan
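The __must_check annotation Jan mentions can be demonstrated in isolation. In Xen (as in Linux) it expands, IIRC, to GCC's `warn_unused_result` attribute, so any call site that silently drops the return value produces a compile-time warning; under -Werror, the build fails until every caller consumes or propagates the error. A minimal sketch:

```c
/* What Xen's __must_check is assumed to expand to. */
#define xen_must_check __attribute__((warn_unused_result))

static xen_must_check int iommu_flush_stub(int fail)
{
    return fail ? -1 : 0;
}

static int caller(int fail)
{
    /* A bare "iommu_flush_stub(fail);" here would warn: the compiler
     * flags the ignored result, which is how missed callers are caught
     * mechanically rather than by review alone. */
    int rc = iommu_flush_stub(fail);

    return rc;
}
```

This is why adding the annotation early in the series helps: the compiler enumerates every call site that still ignores errors.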
>>> On 17.03.16 at 07:54, <quan.xu@intel.com> wrote:
> --- a/xen/drivers/passthrough/amd/iommu_init.c
> +++ b/xen/drivers/passthrough/amd/iommu_init.c
> @@ -1339,12 +1339,14 @@ static void invalidate_all_devices(void)
>      iterate_ivrs_mappings(_invalidate_all_devices);
>  }
>
> -void amd_iommu_suspend(void)
> +int amd_iommu_suspend(void)
>  {
>      struct amd_iommu *iommu;
>
>      for_each_amd_iommu ( iommu )
>          disable_iommu(iommu);
> +
> +    return 0;
>  }
>
>  void amd_iommu_resume(void)
> @@ -1368,3 +1370,11 @@ void amd_iommu_resume(void)
>          invalidate_all_domain_pages();
>      }
>  }
> +
> +void amd_iommu_crash_shutdown(void)
> +{
> +    struct amd_iommu *iommu;
> +
> +    for_each_amd_iommu ( iommu )
> +        disable_iommu(iommu);
> +}

One of the two should clearly call the other - no need to have the same
code twice.

> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -182,7 +182,11 @@ void __hwdom_init iommu_hwdom_init(struct domain *d)
>                    ((page->u.inuse.type_info & PGT_type_mask)
>                     == PGT_writable_page) )
>                  mapping |= IOMMUF_writable;
> -            hd->platform_ops->map_page(d, gfn, mfn, mapping);
> +            if ( hd->platform_ops->map_page(d, gfn, mfn, mapping) )
> +                printk(XENLOG_G_ERR
> +                       "IOMMU: Map page gfn: 0x%lx(mfn: 0x%lx) failed.\n",
> +                       gfn, mfn);
> +

Printing one message here is certainly necessary, but what if the failure
repeats for very many pages? Also %#lx instead of 0x%lx please, and a
blank before the opening parenthesis.

> @@ -554,11 +555,24 @@ static void iommu_flush_all(void)
>          iommu = drhd->iommu;
>          iommu_flush_context_global(iommu, 0);
>          flush_dev_iotlb = find_ats_dev_drhd(iommu) ? 1 : 0;
> -        iommu_flush_iotlb_global(iommu, 0, flush_dev_iotlb);
> +        rc = iommu_flush_iotlb_global(iommu, 0, flush_dev_iotlb);
> +
> +        if ( rc > 0 )
> +        {
> +            iommu_flush_write_buffer(iommu);

Why is this needed all of the sudden?

(Note that if you did a more fine grained split, it might also be easier
for you to note/explain all the not directly related changes in the
respective commit messages. Unless of course they fix actual bugs, in
which case they should be split out anyway; such individual fixes would
also likely have a much faster route to commit, relieving you earlier from
the burden of at least some of the changes you have to carry and re-base.)

> +            rc = 0;
> +        }
> +        else if ( rc < 0 )
> +        {
> +            printk(XENLOG_G_ERR "IOMMU: IOMMU flush all failed.\n");
> +            break;
> +        }

Is a log message really advisable here?

> -static void __intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn,
> +static int __intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn,

While I'm not VT-d maintainer, I think changes like this would be a good
opportunity to also drop the stray double underscores: You need to touch
all callers anyway.

> @@ -584,37 +599,40 @@ static void __intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn,
>              continue;
>
>          if ( page_count != 1 || gfn == INVALID_GFN )
> -        {
> -            if ( iommu_flush_iotlb_dsi(iommu, iommu_domid,
> -                                       0, flush_dev_iotlb) )
> -                iommu_flush_write_buffer(iommu);
> -        }
> +            rc = iommu_flush_iotlb_dsi(iommu, iommu_domid,
> +                                       0, flush_dev_iotlb);
>          else
> +            rc = iommu_flush_iotlb_psi(iommu, iommu_domid,
> +                                       (paddr_t)gfn << PAGE_SHIFT_4K, 0,
> +                                       !dma_old_pte_present,
> +                                       flush_dev_iotlb);
> +        if ( rc > 0 )
>          {
> -            if ( iommu_flush_iotlb_psi(iommu, iommu_domid,
> -                                       (paddr_t)gfn << PAGE_SHIFT_4K, PAGE_ORDER_4K,

Note how this used PAGE_ORDER_4K so far?

> -                                       !dma_old_pte_present, flush_dev_iotlb) )
> -                iommu_flush_write_buffer(iommu);
> +            iommu_flush_write_buffer(iommu);

Same question again: Why is this all of the sudden needed on both paths?

> @@ -622,7 +640,7 @@ static void dma_pte_clear_one(struct domain *domain, u64 addr)
>      if ( pg_maddr == 0 )
>      {
>          spin_unlock(&hd->arch.mapping_lock);
> -        return;
> +        return -ENOMEM;
>      }

addr_to_dma_page_maddr() gets called with "alloc" being false, so there
can't be any memory allocation failure here. There simply is nothing to do
in this case.

> -void me_wifi_quirk(struct domain *domain, u8 bus, u8 devfn, int map)
> +int me_wifi_quirk(struct domain *domain, u8 bus, u8 devfn, int map)
>  {
>      u32 id;
> +    int rc = 0;
>
>      id = pci_conf_read32(0, 0, 0, 0, 0);
>      if ( IS_CTG(id) )
>      {
>          /* quit if ME does not exist */
>          if ( pci_conf_read32(0, 0, 3, 0, 0) == 0xffffffff )
> -            return;
> +            return -ENOENT;

Is this really an error? IOW, do all systems which satisfy IS_CTG() have
such a device?

Jan
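Jan's "one of the two should clearly call the other" suggestion about amd_iommu_suspend() and amd_iommu_crash_shutdown() can be sketched with stubs (a hardware-free stand-in using a counter, not the actual AMD IOMMU code):

```c
/* Both functions just disable every IOMMU, so the crash path can reuse
 * the suspend path instead of duplicating the loop. */
static int disabled_count;   /* stands in for "IOMMUs actually disabled" */

static int amd_iommu_suspend_stub(void)
{
    /* for_each_amd_iommu ( iommu )
     *     disable_iommu(iommu); */
    disabled_count++;

    return 0;
}

static void amd_iommu_crash_shutdown_stub(void)
{
    /* Reuse the suspend path; on a crash path the return value can
     * legitimately be ignored. */
    (void)amd_iommu_suspend_stub();
}
```

The duplicated loop body then lives in exactly one place, so a later change (say, quiescing in-flight commands first) cannot diverge between the two paths.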
> From: Jan Beulich [mailto:JBeulich@suse.com] > Sent: Friday, March 18, 2016 4:06 PM > > >>> On 18.03.16 at 03:30, <quan.xu@intel.com> wrote: > > Any good idea? To be honest, I am very tired to at splitting things like > > this :). > > I understand that this is a tedious task; the code should have been > propagating errors from the beginning. > > The most natural way of splitting things would be to go function by > function, top level ones first, and leaf ones last, one function per > patch (maybe pairs of functions, as in the map/unmap case). Such > model would be problematic (almost) only when there's recursion at > some point, which I don't think would be the case anywhere here. > As mentioned before, the __must_check annotation is a great help > to not miss any callers - both to you as you put things together and > to the reviewers to be ascertained that nothing was missed. > Agree with this suggestion. It also allows different maintainers to focus on changes they really need to care about. Thanks Kevin
On March 18, 2016 6:20pm, <JBeulich@suse.com> wrote:
> >>> On 17.03.16 at 07:54, <quan.xu@intel.com> wrote:
> > --- a/xen/drivers/passthrough/amd/iommu_init.c
> > +++ b/xen/drivers/passthrough/amd/iommu_init.c
> > @@ -1339,12 +1339,14 @@ static void invalidate_all_devices(void)
> >      iterate_ivrs_mappings(_invalidate_all_devices);
> >  }
> >
> > -void amd_iommu_suspend(void)
> > +int amd_iommu_suspend(void)
> >  {
> >      struct amd_iommu *iommu;
> >
> >      for_each_amd_iommu ( iommu )
> >          disable_iommu(iommu);
> > +
> > +    return 0;
> >  }
> >
> >  void amd_iommu_resume(void)
> > @@ -1368,3 +1370,11 @@ void amd_iommu_resume(void)
> >          invalidate_all_domain_pages();
> >      }
> >  }
> > +
> > +void amd_iommu_crash_shutdown(void)
> > +{
> > +    struct amd_iommu *iommu;
> > +
> > +    for_each_amd_iommu ( iommu )
> > +        disable_iommu(iommu);
> > +}
>
> One of the two should clearly call the other - no need to have the same
> code twice.

Good idea.

> > --- a/xen/drivers/passthrough/iommu.c
> > +++ b/xen/drivers/passthrough/iommu.c
> > @@ -182,7 +182,11 @@ void __hwdom_init iommu_hwdom_init(struct domain *d)
> >                    ((page->u.inuse.type_info & PGT_type_mask)
> >                     == PGT_writable_page) )
> >                  mapping |= IOMMUF_writable;
> > -            hd->platform_ops->map_page(d, gfn, mfn, mapping);
> > +            if ( hd->platform_ops->map_page(d, gfn, mfn, mapping) )
> > +                printk(XENLOG_G_ERR
> > +                       "IOMMU: Map page gfn: 0x%lx(mfn: 0x%lx) failed.\n",
> > +                       gfn, mfn);
> > +
>
> Printing one message here is certainly necessary, but what if the
> failure repeats for very many pages?

Yes; to me it is OK as is, but I am open to your suggestions.

> Also %#lx instead of 0x%lx please, and a blank before the opening
> parenthesis.

OK, just to check:

    "IOMMU: Map page gfn: %#lx (mfn: %#lx) failed.\n"

Right?

> > @@ -554,11 +555,24 @@ static void iommu_flush_all(void)
> >          iommu = drhd->iommu;
> >          iommu_flush_context_global(iommu, 0);
> >          flush_dev_iotlb = find_ats_dev_drhd(iommu) ? 1 : 0;
> > -        iommu_flush_iotlb_global(iommu, 0, flush_dev_iotlb);
> > +        rc = iommu_flush_iotlb_global(iommu, 0, flush_dev_iotlb);
> > +
> > +        if ( rc > 0 )
> > +        {
> > +            iommu_flush_write_buffer(iommu);
>
> Why is this needed all of the sudden?

As there may be multiple IOMMUs. E.g., there are 2 IOMMUs in my machine,
and I can find the following log messages:

"""
(XEN) Intel VT-d iommu 0 supported page sizes: 4kB, 2MB, 1GB.
(XEN) Intel VT-d iommu 1 supported page sizes: 4kB, 2MB, 1GB.
"""

IIUC, iommu_flush_write_buffer() is per IOMMU, so it should be called to
flush every IOMMU.

> (Note that if you did a more fine grained split, it might also be easier
> for you to note/explain all the not directly related changes in the
> respective commit messages. Unless of course they fix actual bugs, in
> which case they should be split out anyway; such individual fixes would
> also likely have a much faster route to commit, relieving you earlier
> from the burden of at least some of the changes you have to carry and
> re-base.)
>
> > +            rc = 0;
> > +        }
> > +        else if ( rc < 0 )
> > +        {
> > +            printk(XENLOG_G_ERR "IOMMU: IOMMU flush all failed.\n");
> > +            break;
> > +        }
>
> Is a log message really advisable here?

To me, it looks tricky too; I was struggling to make a decision. For
scheme B, I would try to do as below:

    if ( iommu_flush_all() )
        printk("... nnn ...");

but there are 4 such call sites, so to me it looks redundant. Or could I
drop the printout when iommu_flush_all() fails?

> > -static void __intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn,
> > +static int __intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn,
>
> While I'm not VT-d maintainer, I think changes like this would be a good
> opportunity to also drop the stray double underscores: You need to touch
> all callers anyway.

I think this is optional.

> > @@ -584,37 +599,40 @@ static void __intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn,
> >              continue;
> >
> >          if ( page_count != 1 || gfn == INVALID_GFN )
> > -        {
> > -            if ( iommu_flush_iotlb_dsi(iommu, iommu_domid,
> > -                                       0, flush_dev_iotlb) )
> > -                iommu_flush_write_buffer(iommu);
> > -        }
> > +            rc = iommu_flush_iotlb_dsi(iommu, iommu_domid,
> > +                                       0, flush_dev_iotlb);
> >          else
> > +            rc = iommu_flush_iotlb_psi(iommu, iommu_domid,
> > +                                       (paddr_t)gfn << PAGE_SHIFT_4K, 0,
> > +                                       !dma_old_pte_present,
> > +                                       flush_dev_iotlb);
> > +        if ( rc > 0 )
> >          {
> > -            if ( iommu_flush_iotlb_psi(iommu, iommu_domid,
> > -                                       (paddr_t)gfn << PAGE_SHIFT_4K, PAGE_ORDER_4K,
>
> Note how this used PAGE_ORDER_4K so far?

Sorry, this is a rebasing mistake.

> > -                                       !dma_old_pte_present, flush_dev_iotlb) )
> > -                iommu_flush_write_buffer(iommu);
> > +            iommu_flush_write_buffer(iommu);
>
> Same question again: Why is this all of the sudden needed on both paths?

The same as the question above; please hold on for now.

> > @@ -622,7 +640,7 @@ static void dma_pte_clear_one(struct domain *domain, u64 addr)
> >      if ( pg_maddr == 0 )
> >      {
> >          spin_unlock(&hd->arch.mapping_lock);
> > -        return;
> > +        return -ENOMEM;
> >      }
>
> addr_to_dma_page_maddr() gets called with "alloc" being false, so there
> can't be any memory allocation failure here. There simply is nothing to
> do in this case.

I copied it from iommu_map_page().

Good; then the error from iommu_unmap_page() can only come from the flush
(the crash is at least obvious), so the error handling can be lighter
weight: we may return an error, but not roll back the failed operation.
Right?

> > -void me_wifi_quirk(struct domain *domain, u8 bus, u8 devfn, int map)
> > +int me_wifi_quirk(struct domain *domain, u8 bus, u8 devfn, int map)
> >  {
> >      u32 id;
> > +    int rc = 0;
> >
> >      id = pci_conf_read32(0, 0, 0, 0, 0);
> >      if ( IS_CTG(id) )
> >      {
> >          /* quit if ME does not exist */
> >          if ( pci_conf_read32(0, 0, 3, 0, 0) == 0xffffffff )
> > -            return;
> > +            return -ENOENT;
>
> Is this really an error? IOW, do all systems which satisfy IS_CTG() have
> such a device?

To be honest, I don't know much about me_wifi_quirk. For now, IMO I don't
need to deal with me_wifi_quirk().

Quan
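One possible answer to Jan's "what if the failure repeats for very many pages?" concern is to report the first failure in full and merely count the rest. This once-plus-count scheme is my illustration of the pattern, not what the patch or Xen actually does:

```c
#include <stdio.h>

/* Count map failures; only the first one produces a log line. A summary
 * ("N further mappings failed") could be emitted once at the end of
 * iommu_hwdom_init(). */
static unsigned long map_failures;

static void report_map_failure(unsigned long gfn, unsigned long mfn)
{
    if ( map_failures++ == 0 )
        printf("IOMMU: mapping gfn %#lx (mfn %#lx) failed\n", gfn, mfn);
    /* subsequent failures are counted but not printed, so a systematic
     * failure across millions of hardware-domain pages cannot flood
     * the console */
}
```

The trade-off is losing the per-page detail of later failures; in exchange, a systematic error no longer turns boot into a console flood.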
>>> On 25.03.16 at 10:27, <quan.xu@intel.com> wrote: > On March 18, 2016 6:20pm, <JBeulich@suse.com> wrote: >> >>> On 17.03.16 at 07:54, <quan.xu@intel.com> wrote: >> > --- a/xen/drivers/passthrough/iommu.c >> > +++ b/xen/drivers/passthrough/iommu.c >> > @@ -182,7 +182,11 @@ void __hwdom_init iommu_hwdom_init(struct >> domain *d) >> > ((page->u.inuse.type_info & PGT_type_mask) >> > == PGT_writable_page) ) >> > mapping |= IOMMUF_writable; >> > - hd->platform_ops->map_page(d, gfn, mfn, mapping); >> > + if ( hd->platform_ops->map_page(d, gfn, mfn, mapping) ) >> > + printk(XENLOG_G_ERR >> > + "IOMMU: Map page gfn: 0x%lx(mfn: 0x%lx) >> failed.\n", >> > + gfn, mfn); >> > + >> >> Printing one message here is certainly necessary, but what if the failure > repeats >> for very many pages? > > Yes, to me, it is ok, but I am open to your suggestion. > >> Also %#lx instead of 0x%lx please, and a blank before the >> opening parenthesis. >> > OK, just check it: > > .. > "IOMMU: Map page gfn: %#lx (mfn: %#lx) failed.\n" > .. > > Right? Almost: Generally no full stop in log messages please. And I think the word "page" is redundant here. As rule of thumb: Log messages should give as much as possible useful information (which includes the requirement for distinct ones to be distinguishable in resulting logs) with as little as possible verbosity. >> > @@ -554,11 +555,24 @@ static void iommu_flush_all(void) >> > iommu = drhd->iommu; >> > iommu_flush_context_global(iommu, 0); >> > flush_dev_iotlb = find_ats_dev_drhd(iommu) ? 1 : 0; >> > - iommu_flush_iotlb_global(iommu, 0, flush_dev_iotlb); >> > + rc = iommu_flush_iotlb_global(iommu, 0, flush_dev_iotlb); >> > + >> > + if ( rc > 0 ) >> > + { >> > + iommu_flush_write_buffer(iommu); >> >> Why is this needed all of the sudden? > > As there may be multiple IOMMUs. .e.g, there are 2 IOMMUs in my machine, and > I can find the following log message: > """ > (XEN) Intel VT-d iommu 0 supported page sizes: 4kB, 2MB, 1GB. 
> (XEN) Intel VT-d iommu 1 supported page sizes: 4kB, 2MB, 1GB. > """ > __iiuc__, iommu_flush_write_buffer() is per IOMMU, so It should be called to > flush every IOMMU. For one what you say suggests that right now this is being done for some (one?) IOMMU(s), which I don't see being the case. And then what you say _still_ doesn't say _why_ this is now needed all of the sudden. If, in the course of doing your re-work here, you find pre-existing issues with the code, please split the necessary fixes out of your re-work and submit them separately with proper explanations in their commit messages. >> > + rc = 0; >> > + } >> > + else if ( rc < 0 ) >> > + { >> > + printk(XENLOG_G_ERR "IOMMU: IOMMU flush all failed.\n"); >> > + break; >> > + } >> >> Is a log message really advisable here? >> > > To me, It looks tricky too. I was struggling to make decision. For scheme B, > I would try to do as below: > > if ( iommu_flush_all() ) > printk("... nnn ..."); > > but there are 4 function calls, if so, to me, it looks redundant. > > Or, could I ignore the print out for iommu_flush_all() failed? Directing the question back is not helpful: You should know better than me why you had added a log message there in the first place. And it is this "why" which is to judge about whether it should stay there, move elsewhere, or get dropped altogether. >> > @@ -622,7 +640,7 @@ static void dma_pte_clear_one(struct domain >> *domain, u64 addr) >> > if ( pg_maddr == 0 ) >> > { >> > spin_unlock(&hd->arch.mapping_lock); >> > - return; >> > + return -ENOMEM; >> > } >> >> addr_to_dma_page_maddr() gets called with "alloc" being false, so there can't >> be any memory allocation failure here. There simply is nothing to do in this >> case. >> > > I copy it from iommu_map_page(). > > Good, then the error of iommu_unmap_page() looks only from flush (the crash > is at least obvious), then error handling can be lighter weight-- > We may return an error, but don't roll back the failed operation. > Right? 
I don't think so, and I can only re-iterate: There can't be any error here, so there's no error code to forward up the call tree. IOW the "pg_maddr == 0" case simply means "nothing to do" here. >> > -void me_wifi_quirk(struct domain *domain, u8 bus, u8 devfn, int map) >> > +int me_wifi_quirk(struct domain *domain, u8 bus, u8 devfn, int map) >> > { >> > u32 id; >> > + int rc = 0; >> > >> > id = pci_conf_read32(0, 0, 0, 0, 0); >> > if ( IS_CTG(id) ) >> > { >> > /* quit if ME does not exist */ >> > if ( pci_conf_read32(0, 0, 3, 0, 0) == 0xffffffff ) >> > - return; >> > + return -ENOENT; >> >> Is this really an error? IOW, do all systems which satisfy IS_CTG() have such a >> device? >> > To be honest, I didn't know much about me_wifi_quirk. In such a case - how about asking the maintainers, who are your colleagues? And that of course after having looked at the history in an attempt to gain some understanding. > Now, IMO I don't need to deal with me_wifi_quirk(). Well, you clearly can't ignore it. Jan
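Jan's point about dma_pte_clear_one() can be reduced to a few lines: with alloc == false, a pg_maddr of 0 from addr_to_dma_page_maddr() means the page-table page was never allocated, i.e. no mapping exists. That is a successful no-op, not an allocation failure, so the right result is 0 rather than -ENOMEM. A stub sketch (not the real VT-d code):

```c
/* pg_maddr stands in for the result of addr_to_dma_page_maddr(..., 0):
 * zero means "the containing page-table page was never allocated". */
static int dma_pte_clear_one_stub(unsigned long pg_maddr)
{
    if ( pg_maddr == 0 )
        return 0;   /* no mapping exists: nothing to clear, success */

    /* ... clear the PTE, then flush the IOTLB; only the flush can
     * actually fail, and that is the error worth propagating ... */
    return 0;
}
```

Distinguishing "nothing to do" from a real error keeps callers from unwinding work that never needed unwinding.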
On March 29, 2016 3:37pm, Jan Beulich <JBeulich@suse.com> wrote:
> >>> On 25.03.16 at 10:27, <quan.xu@intel.com> wrote:
> > On March 18, 2016 6:20pm, <JBeulich@suse.com> wrote:
> >> >>> On 17.03.16 at 07:54, <quan.xu@intel.com> wrote:
> >> > --- a/xen/drivers/passthrough/iommu.c
> >> > +++ b/xen/drivers/passthrough/iommu.c
> >> > @@ -554,11 +555,24 @@ static void iommu_flush_all(void)
> >> >          iommu = drhd->iommu;
> >> >          iommu_flush_context_global(iommu, 0);
> >> >          flush_dev_iotlb = find_ats_dev_drhd(iommu) ? 1 : 0;
> >> > -        iommu_flush_iotlb_global(iommu, 0, flush_dev_iotlb);
> >> > +        rc = iommu_flush_iotlb_global(iommu, 0, flush_dev_iotlb);
> >> > +
> >> > +        if ( rc > 0 )
> >> > +        {
> >> > +            iommu_flush_write_buffer(iommu);
> >>
> >> Why is this needed all of the sudden?
> >
> > As there may be multiple IOMMUs. E.g., there are 2 IOMMUs in my
> > machine, and I can find the following log messages:
> > """
> > (XEN) Intel VT-d iommu 0 supported page sizes: 4kB, 2MB, 1GB.
> > (XEN) Intel VT-d iommu 1 supported page sizes: 4kB, 2MB, 1GB.
> > """
> > IIUC, iommu_flush_write_buffer() is per IOMMU, so it should be called
> > to flush every IOMMU.
>
> For one, what you say suggests that right now this is being done for
> some (one?) IOMMU(s), which I don't see being the case. And then what
> you say _still_ doesn't say _why_ this is now needed all of the sudden.
> If, in the course of doing your re-work here, you find pre-existing
> issues with the code, please split the necessary fixes out of your
> re-work and submit them separately with proper explanations in their
> commit messages.
>

I found that no modification is needed for this function. I had overlooked
the parameter 'flush_non_present_entry', which is '0' for the call to
iommu_flush_iotlb_global(). At the bottom of the call tree, the existing
code in flush->iotlb() is:

    {
        ....
        if ( flush_non_present_entry )
        {
            if ( !cap_caching_mode(iommu->cap) )
                return 1;
            else
                did = 0;
        }
        ....
    }

so it is impossible for it to return '1', which is what indicates that the
caller needs to flush the cache.

Quan
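Quan's conclusion can be illustrated by stubbing out the flush hook's return convention (stub logic, not the real VT-d driver): the hook returns 1 ("caller must flush the write buffer") only on the flush_non_present_entry path when the IOMMU lacks caching mode, and iommu_flush_all() always passes flush_non_present_entry == 0, so the added `rc > 0` branch there was unreachable.

```c
/* Minimal model of the flush->iotlb() return convention discussed in
 * the thread: 1 = nothing was sent to hardware, caller should flush the
 * write buffer instead; 0 = invalidation was issued. */
static int flush_iotlb_stub(int flush_non_present_entry, int caching_mode)
{
    if ( flush_non_present_entry )
    {
        if ( !caching_mode )
            return 1;   /* the only way to get a positive return */
        /* else: did = 0 and fall through to a real flush */
    }

    /* ... issue the IOTLB invalidation ... */
    return 0;
}
```

Since iommu_flush_all() is on the `flush_non_present_entry == 0` side, only 0 (or, in the real driver, a negative error) can come back, which is why the iommu_flush_write_buffer() call could be dropped and only error propagation kept.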
On April 11, 2016 11:10am, <quan.xu@intel.com> wrote:
> On March 29, 2016 3:37pm, Jan Beulich <JBeulich@suse.com> wrote:
> > >>> On 25.03.16 at 10:27, <quan.xu@intel.com> wrote:
> > > On March 18, 2016 6:20pm, <JBeulich@suse.com> wrote:
> > >> >>> On 17.03.16 at 07:54, <quan.xu@intel.com> wrote:
> > >> > --- a/xen/drivers/passthrough/iommu.c
> > >> > +++ b/xen/drivers/passthrough/iommu.c
> > >> > @@ -554,11 +555,24 @@ static void iommu_flush_all(void)
> > >> >          iommu = drhd->iommu;
> > >> >          iommu_flush_context_global(iommu, 0);
> > >> >          flush_dev_iotlb = find_ats_dev_drhd(iommu) ? 1 : 0;
> > >> > -        iommu_flush_iotlb_global(iommu, 0, flush_dev_iotlb);
> > >> > +        rc = iommu_flush_iotlb_global(iommu, 0, flush_dev_iotlb);
> > >> > +
> > >> > +        if ( rc > 0 )
> > >> > +        {
> > >> > +            iommu_flush_write_buffer(iommu);
> > >>
> > >> Why is this needed all of the sudden?
> > >
> > > As there may be multiple IOMMUs. E.g., there are 2 IOMMUs in my
> > > machine, and I can find the following log messages:
> > > """
> > > (XEN) Intel VT-d iommu 0 supported page sizes: 4kB, 2MB, 1GB.
> > > (XEN) Intel VT-d iommu 1 supported page sizes: 4kB, 2MB, 1GB.
> > > """
> > > IIUC, iommu_flush_write_buffer() is per IOMMU, so it should be
> > > called to flush every IOMMU.
> >
> > For one, what you say suggests that right now this is being done for
> > some (one?) IOMMU(s), which I don't see being the case. And then what
> > you say _still_ doesn't say _why_ this is now needed all of the
> > sudden. If, in the course of doing your re-work here, you find
> > pre-existing issues with the code, please split the necessary fixes
> > out of your re-work and submit them separately with proper
> > explanations in their commit messages.
> >
>
> I found that no modification is needed for this function.

Sorry - the modification I was referring to is:

    + if ( rc > 0 )
    + {
    +     iommu_flush_write_buffer(iommu);
    +     rc = 0;
    + }

At the least, errors still need to be propagated.

Quan

> I had overlooked the parameter 'flush_non_present_entry', which is '0'
> for the call to iommu_flush_iotlb_global().
> At the bottom of the call tree, the existing code in flush->iotlb() is:
>
>     {
>         ....
>         if ( flush_non_present_entry )
>         {
>             if ( !cap_caching_mode(iommu->cap) )
>                 return 1;
>             else
>                 did = 0;
>         }
>         ....
>     }
>
> so it is impossible for it to return '1', which is what indicates that
> the caller needs to flush the cache.
>>> On 11.04.16 at 05:27, <quan.xu@intel.com> wrote: > On April 11, 2016 11:10am, <quan.xu@intel.com> wrote: >> On March 29, 2016 3:37pm, Jan Beulich <JBeulich@suse.com> wrote: >> > >>> On 25.03.16 at 10:27, <quan.xu@intel.com> wrote: >> > > On March 18, 2016 6:20pm, <JBeulich@suse.com> wrote: >> > >> >>> On 17.03.16 at 07:54, <quan.xu@intel.com> wrote: >> > >> > --- a/xen/drivers/passthrough/iommu.c >> > >> > +++ b/xen/drivers/passthrough/iommu.c >> > >> > @@ -554,11 +555,24 @@ static void iommu_flush_all(void) >> > >> > iommu = drhd->iommu; >> > >> > iommu_flush_context_global(iommu, 0); >> > >> > flush_dev_iotlb = find_ats_dev_drhd(iommu) ? 1 : 0; >> > >> > - iommu_flush_iotlb_global(iommu, 0, flush_dev_iotlb); >> > >> > + rc = iommu_flush_iotlb_global(iommu, 0, >> > >> > + flush_dev_iotlb); >> > >> > + >> > >> > + if ( rc > 0 ) >> > >> > + { >> > >> > + iommu_flush_write_buffer(iommu); >> > >> >> > >> Why is this needed all of the sudden? >> > > >> > > As there may be multiple IOMMUs. .e.g, there are 2 IOMMUs in my >> > > machine, and I can find the following log message: >> > > """ >> > > (XEN) Intel VT-d iommu 0 supported page sizes: 4kB, 2MB, 1GB. >> > > (XEN) Intel VT-d iommu 1 supported page sizes: 4kB, 2MB, 1GB. >> > > """ >> > > __iiuc__, iommu_flush_write_buffer() is per IOMMU, so It should be >> > > called to flush every IOMMU. >> > >> > For one what you say suggests that right now this is being done for >> > some (one?) IOMMU(s), which I don't see being the case. And then what >> > you say _still_ doesn't say _why_ this is now needed all of the >> > sudden. If, in the course of doing your re-work here, you find >> > pre-existing issues with the code, please split the necessary fixes >> > out of your re-work and submit them separately with proper explanations in >> their commit messages. >> > >> >> I find out it is no need modification for this function. 
> Sorry, this modification refers to as:
> "
> +        if ( rc > 0 )
> +        {
> +            iommu_flush_write_buffer(iommu);
> +            rc = 0;
> +        }
> "
>
> at least errors need to be propagated.

How does error propagation correlate with suddenly calling
iommu_flush_write_buffer()?

Jan
On April 12, 2016 12:35am, Jan Beulich <JBeulich@suse.com> wrote:
> >>> On 11.04.16 at 05:27, <quan.xu@intel.com> wrote:
> > On April 11, 2016 11:10am, <quan.xu@intel.com> wrote:
> >> On March 29, 2016 3:37pm, Jan Beulich <JBeulich@suse.com> wrote:
> >> > >>> On 25.03.16 at 10:27, <quan.xu@intel.com> wrote:
> >> > > On March 18, 2016 6:20pm, <JBeulich@suse.com> wrote:
> >> > >> >>> On 17.03.16 at 07:54, <quan.xu@intel.com> wrote:
> >> > >> > --- a/xen/drivers/passthrough/iommu.c
> >> > >> > +++ b/xen/drivers/passthrough/iommu.c
> >> > >> > @@ -554,11 +555,24 @@ static void iommu_flush_all(void)
> >> > >> >          iommu = drhd->iommu;
> >> > >> >          iommu_flush_context_global(iommu, 0);
> >> > >> >          flush_dev_iotlb = find_ats_dev_drhd(iommu) ? 1 : 0;
> >> > >> > -        iommu_flush_iotlb_global(iommu, 0, flush_dev_iotlb);
> >> > >> > +        rc = iommu_flush_iotlb_global(iommu, 0, flush_dev_iotlb);
> >> > >> > +
> >> > >> > +        if ( rc > 0 )
> >> > >> > +        {
> >> > >> > +            iommu_flush_write_buffer(iommu);
> >> > >>
> >> > >> Why is this needed all of the sudden?
> >> > >
> >> > > As there may be multiple IOMMUs. .e.g, there are 2 IOMMUs in my
> >> > > machine, and I can find the following log message:
> >> > > """
> >> > > (XEN) Intel VT-d iommu 0 supported page sizes: 4kB, 2MB, 1GB.
> >> > > (XEN) Intel VT-d iommu 1 supported page sizes: 4kB, 2MB, 1GB.
> >> > > """
> >> > > __iiuc__, iommu_flush_write_buffer() is per IOMMU, so It should
> >> > > be called to flush every IOMMU.
> >> >
> >> > For one what you say suggests that right now this is being done for
> >> > some (one?) IOMMU(s), which I don't see being the case. And then
> >> > what you say _still_ doesn't say _why_ this is now needed all of
> >> > the sudden. If, in the course of doing your re-work here, you find
> >> > pre-existing issues with the code, please split the necessary fixes
> >> > out of your re-work and submit them separately with proper
> >> > explanations in
> >> their commit messages.
> >> >
> >> >> I find out it is no need modification for this function.
> > Sorry, this modification refers to as:
> > "
> > +        if ( rc > 0 )
> > +        {
> > +            iommu_flush_write_buffer(iommu);
> > +            rc = 0;
> > +        }
> > "
> >

My bad description, what I mean is that I will drop above modification, then..

> > at least errors need to be propagated.
>
> How does error propagation correlate with suddenly calling
> iommu_flush_write_buffer()?
>

iommu_flush_write_buffer() is no longer called suddenly in this function.

Quan
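For readers skimming the thread before the patch below: the convention the discussion keeps returning to is that the low-level flush helpers return a negative errno on failure, 0 on success, and a positive value when the caller must fall back to flushing the IOMMU write buffer. A minimal, self-contained sketch of the caller-side pattern (hypothetical names, not the Xen functions) looks like this:

```c
#include <assert.h>

static int caching_mode;          /* models cap_caching_mode(iommu->cap) */
static int write_buffer_flushed;  /* records the hypothetical buffer flush */

/* Stand-in flush helper: <0 error, 0 success, >0 caller must flush buffer. */
static int flush_iotlb(int flush_non_present_entry)
{
    if ( flush_non_present_entry && !caching_mode )
        return 1;   /* nothing flushed: ask caller to flush write buffer */
    return 0;       /* flush performed */
}

static void flush_write_buffer(void)
{
    write_buffer_flushed = 1;
}

/* The caller-side pattern used throughout the patch: fold rc > 0 into a
 * write-buffer flush (and treat it as success), propagate rc < 0 upward. */
static int flush_all(int flush_non_present_entry)
{
    int rc = flush_iotlb(flush_non_present_entry);

    if ( rc > 0 )
    {
        flush_write_buffer();
        rc = 0;
    }
    return rc;
}
```

This is only a model of the return-code plumbing being debated; the real code additionally iterates over all DRHD units and logs errors.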
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index f9bcce7..fa6c710 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -825,7 +825,7 @@ out:
          need_modify_vtd_table )
     {
         if ( iommu_hap_pt_share )
-            iommu_pte_flush(d, gfn, &ept_entry->epte, order, vtd_pte_present);
+            rc = iommu_pte_flush(d, gfn, &ept_entry->epte, order, vtd_pte_present);
         else
         {
             if ( iommu_flags )
diff --git a/xen/drivers/passthrough/amd/iommu_init.c b/xen/drivers/passthrough/amd/iommu_init.c
index 4536106..8bb47b2 100644
--- a/xen/drivers/passthrough/amd/iommu_init.c
+++ b/xen/drivers/passthrough/amd/iommu_init.c
@@ -1339,12 +1339,14 @@ static void invalidate_all_devices(void)
     iterate_ivrs_mappings(_invalidate_all_devices);
 }
 
-void amd_iommu_suspend(void)
+int amd_iommu_suspend(void)
 {
     struct amd_iommu *iommu;
 
     for_each_amd_iommu ( iommu )
         disable_iommu(iommu);
+
+    return 0;
 }
 
 void amd_iommu_resume(void)
@@ -1368,3 +1370,11 @@ void amd_iommu_resume(void)
         invalidate_all_domain_pages();
     }
 }
+
+void amd_iommu_crash_shutdown(void)
+{
+    struct amd_iommu *iommu;
+
+    for_each_amd_iommu ( iommu )
+        disable_iommu(iommu);
+}
diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c b/xen/drivers/passthrough/amd/pci_amd_iommu.c
index c8ee8dc..fb8e816 100644
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -628,6 +628,6 @@ const struct iommu_ops amd_iommu_ops = {
     .suspend = amd_iommu_suspend,
     .resume = amd_iommu_resume,
     .share_p2m = amd_iommu_share_p2m,
-    .crash_shutdown = amd_iommu_suspend,
+    .crash_shutdown = amd_iommu_crash_shutdown,
     .dump_p2m_table = amd_dump_p2m_table,
 };
diff --git a/xen/drivers/passthrough/arm/smmu.c b/xen/drivers/passthrough/arm/smmu.c
index 62da087..36c2c83 100644
--- a/xen/drivers/passthrough/arm/smmu.c
+++ b/xen/drivers/passthrough/arm/smmu.c
@@ -2540,7 +2540,7 @@ static int force_stage = 2;
  */
 static u32 platform_features = ARM_SMMU_FEAT_COHERENT_WALK;
 
-static void arm_smmu_iotlb_flush_all(struct domain *d)
+static int arm_smmu_iotlb_flush_all(struct domain *d)
 {
 	struct arm_smmu_xen_domain *smmu_domain = domain_hvm_iommu(d)->arch.priv;
 	struct iommu_domain *cfg;
@@ -2557,13 +2557,15 @@ static void arm_smmu_iotlb_flush_all(struct domain *d)
 		arm_smmu_tlb_inv_context(cfg->priv);
 	}
 	spin_unlock(&smmu_domain->lock);
+
+	return 0;
 }
 
-static void arm_smmu_iotlb_flush(struct domain *d, unsigned long gfn,
-                                 unsigned int page_count)
+static int arm_smmu_iotlb_flush(struct domain *d, unsigned long gfn,
+                                unsigned int page_count)
 {
 	/* ARM SMMU v1 doesn't have flush by VMA and VMID */
-	arm_smmu_iotlb_flush_all(d);
+	return arm_smmu_iotlb_flush_all(d);
 }
 
 static struct iommu_domain *arm_smmu_get_domain(struct domain *d,
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 29efbfe..08bb6f7 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -182,7 +182,11 @@ void __hwdom_init iommu_hwdom_init(struct domain *d)
                  ((page->u.inuse.type_info & PGT_type_mask)
                   == PGT_writable_page) )
                 mapping |= IOMMUF_writable;
-            hd->platform_ops->map_page(d, gfn, mfn, mapping);
+            if ( hd->platform_ops->map_page(d, gfn, mfn, mapping) )
+                printk(XENLOG_G_ERR
+                       "IOMMU: Map page gfn: 0x%lx(mfn: 0x%lx) failed.\n",
+                       gfn, mfn);
+
             if ( !(i++ & 0xfffff) )
                 process_pending_softirqs();
         }
@@ -284,9 +288,7 @@ int iommu_iotlb_flush(struct domain *d, unsigned long gfn, unsigned int page_count)
     if ( !iommu_enabled || !hd->platform_ops || !hd->platform_ops->iotlb_flush )
         return 0;
 
-    hd->platform_ops->iotlb_flush(d, gfn, page_count);
-
-    return 0;
+    return hd->platform_ops->iotlb_flush(d, gfn, page_count);
 }
 
 int iommu_iotlb_flush_all(struct domain *d)
@@ -296,9 +298,7 @@ int iommu_iotlb_flush_all(struct domain *d)
     if ( !iommu_enabled || !hd->platform_ops || !hd->platform_ops->iotlb_flush_all )
         return 0;
 
-    hd->platform_ops->iotlb_flush_all(d);
-
-    return 0;
+    return hd->platform_ops->iotlb_flush_all(d);
 }
 
 int __init iommu_setup(void)
@@ -375,8 +375,9 @@ int iommu_do_domctl(
 int iommu_suspend()
 {
     const struct iommu_ops *ops = iommu_get_ops();
+
     if ( iommu_enabled )
-        ops->suspend();
+        return ops->suspend();
 
     return 0;
 }
diff --git a/xen/drivers/passthrough/vtd/extern.h b/xen/drivers/passthrough/vtd/extern.h
index cbe0286..d4d37c3 100644
--- a/xen/drivers/passthrough/vtd/extern.h
+++ b/xen/drivers/passthrough/vtd/extern.h
@@ -91,7 +91,7 @@ int is_igd_vt_enabled_quirk(void);
 void platform_quirks_init(void);
 void vtd_ops_preamble_quirk(struct iommu* iommu);
 void vtd_ops_postamble_quirk(struct iommu* iommu);
-void me_wifi_quirk(struct domain *domain, u8 bus, u8 devfn, int map);
+int me_wifi_quirk(struct domain *domain, u8 bus, u8 devfn, int map);
 void pci_vtd_quirk(const struct pci_dev *);
 bool_t platform_supports_intremap(void);
 bool_t platform_supports_x2apic(void);
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 5ad25dc..4f8b63a 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -542,11 +542,12 @@ static int iommu_flush_iotlb_psi(
     return status;
 }
 
-static void iommu_flush_all(void)
+static int iommu_flush_all(void)
 {
     struct acpi_drhd_unit *drhd;
     struct iommu *iommu;
     int flush_dev_iotlb;
+    int rc = 0;
 
     flush_all_cache();
     for_each_drhd_unit ( drhd )
@@ -554,11 +555,24 @@ static void iommu_flush_all(void)
         iommu = drhd->iommu;
         iommu_flush_context_global(iommu, 0);
         flush_dev_iotlb = find_ats_dev_drhd(iommu) ? 1 : 0;
-        iommu_flush_iotlb_global(iommu, 0, flush_dev_iotlb);
+        rc = iommu_flush_iotlb_global(iommu, 0, flush_dev_iotlb);
+
+        if ( rc > 0 )
+        {
+            iommu_flush_write_buffer(iommu);
+            rc = 0;
+        }
+        else if ( rc < 0 )
+        {
+            printk(XENLOG_G_ERR "IOMMU: IOMMU flush all failed.\n");
+            break;
+        }
     }
+
+    return rc;
 }
 
-static void __intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn,
+static int __intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn,
         int dma_old_pte_present, unsigned int page_count)
 {
     struct hvm_iommu *hd = domain_hvm_iommu(d);
@@ -566,6 +580,7 @@ static void __intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn,
     struct iommu *iommu;
     int flush_dev_iotlb;
     int iommu_domid;
+    int rc = 0;
 
     /*
      * No need pcideves_lock here because we have flush
@@ -584,37 +599,40 @@ static void __intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn,
             continue;
 
         if ( page_count != 1 || gfn == INVALID_GFN )
-        {
-            if ( iommu_flush_iotlb_dsi(iommu, iommu_domid,
-                        0, flush_dev_iotlb) )
-                iommu_flush_write_buffer(iommu);
-        }
+            rc = iommu_flush_iotlb_dsi(iommu, iommu_domid,
+                        0, flush_dev_iotlb);
         else
+            rc = iommu_flush_iotlb_psi(iommu, iommu_domid,
+                        (paddr_t)gfn << PAGE_SHIFT_4K, 0,
+                        !dma_old_pte_present,
+                        flush_dev_iotlb);
+
+        if ( rc > 0 )
         {
-            if ( iommu_flush_iotlb_psi(iommu, iommu_domid,
-                        (paddr_t)gfn << PAGE_SHIFT_4K, PAGE_ORDER_4K,
-                        !dma_old_pte_present, flush_dev_iotlb) )
-                iommu_flush_write_buffer(iommu);
+            iommu_flush_write_buffer(iommu);
+            rc = 0;
         }
     }
+
+    return rc;
 }
 
-static void intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn, unsigned int page_count)
+static int intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn, unsigned int page_count)
 {
-    __intel_iommu_iotlb_flush(d, gfn, 1, page_count);
+    return __intel_iommu_iotlb_flush(d, gfn, 1, page_count);
 }
 
-static void intel_iommu_iotlb_flush_all(struct domain *d)
+static int intel_iommu_iotlb_flush_all(struct domain *d)
 {
-    __intel_iommu_iotlb_flush(d, INVALID_GFN, 0, 0);
+    return __intel_iommu_iotlb_flush(d, INVALID_GFN, 0, 0);
 }
 
 /* clear one page's page table */
-static void dma_pte_clear_one(struct domain *domain, u64 addr)
+static int dma_pte_clear_one(struct domain *domain, u64 addr)
 {
     struct hvm_iommu *hd = domain_hvm_iommu(domain);
     struct dma_pte *page = NULL, *pte = NULL;
     u64 pg_maddr;
+    int rc = 0;
 
     spin_lock(&hd->arch.mapping_lock);
     /* get last level pte */
@@ -622,7 +640,7 @@ static void dma_pte_clear_one(struct domain *domain, u64 addr)
     if ( pg_maddr == 0 )
     {
         spin_unlock(&hd->arch.mapping_lock);
-        return;
+        return -ENOMEM;
     }
 
     page = (struct dma_pte *)map_vtd_domain_page(pg_maddr);
@@ -632,7 +650,7 @@ static void dma_pte_clear_one(struct domain *domain, u64 addr)
     {
         spin_unlock(&hd->arch.mapping_lock);
         unmap_vtd_domain_page(page);
-        return;
+        return 0;
     }
 
     dma_clear_pte(*pte);
@@ -640,9 +658,11 @@ static void dma_pte_clear_one(struct domain *domain, u64 addr)
     iommu_flush_cache_entry(pte, sizeof(struct dma_pte));
 
     if ( !this_cpu(iommu_dont_flush_iotlb) )
-        __intel_iommu_iotlb_flush(domain, addr >> PAGE_SHIFT_4K, 1, 1);
+        rc = __intel_iommu_iotlb_flush(domain, addr >> PAGE_SHIFT_4K, 1, 1);
 
     unmap_vtd_domain_page(page);
+
+    return rc;
 }
 
 static void iommu_free_pagetable(u64 pt_maddr, int level)
@@ -1281,6 +1301,7 @@ int domain_context_mapping_one(
     u64 maddr, pgd_maddr;
     u16 seg = iommu->intel->drhd->segment;
     int agaw;
+    int rc = 0;
 
     ASSERT(pcidevs_locked());
     spin_lock(&iommu->lock);
@@ -1400,7 +1421,14 @@ int domain_context_mapping_one(
     else
     {
         int flush_dev_iotlb = find_ats_dev_drhd(iommu) ? 1 : 0;
-        iommu_flush_iotlb_dsi(iommu, 0, 1, flush_dev_iotlb);
+
+        rc = iommu_flush_iotlb_dsi(iommu, 0, 1, flush_dev_iotlb);
+
+        if ( rc > 0 )
+        {
+            iommu_flush_write_buffer(iommu);
+            rc = 0;
+        }
     }
 
     set_bit(iommu->index, &hd->arch.iommu_bitmap);
@@ -1410,7 +1438,7 @@ int domain_context_mapping_one(
     if ( !seg )
         me_wifi_quirk(domain, bus, devfn, MAP_ME_PHANTOM_FUNC);
 
-    return 0;
+    return rc;
 }
 
 static int domain_context_mapping(
@@ -1505,6 +1533,7 @@ int domain_context_unmap_one(
     struct context_entry *context, *context_entries;
     u64 maddr;
     int iommu_domid;
+    int rc = 0;
 
     ASSERT(pcidevs_locked());
     spin_lock(&iommu->lock);
@@ -1539,7 +1568,14 @@ int domain_context_unmap_one(
     else
     {
         int flush_dev_iotlb = find_ats_dev_drhd(iommu) ? 1 : 0;
-        iommu_flush_iotlb_dsi(iommu, iommu_domid, 0, flush_dev_iotlb);
+
+        rc = iommu_flush_iotlb_dsi(iommu, iommu_domid, 0, flush_dev_iotlb);
+
+        if ( rc > 0 )
+        {
+            iommu_flush_write_buffer(iommu);
+            rc = 0;
+        }
     }
 
     spin_unlock(&iommu->lock);
@@ -1548,7 +1584,7 @@ int domain_context_unmap_one(
     if ( !iommu->intel->drhd->segment )
         me_wifi_quirk(domain, bus, devfn, UNMAP_ME_PHANTOM_FUNC);
 
-    return 0;
+    return rc;
 }
 
 static int domain_context_unmap(
@@ -1738,7 +1774,7 @@ static int intel_iommu_map_page(
     unmap_vtd_domain_page(page);
 
     if ( !this_cpu(iommu_dont_flush_iotlb) )
-        __intel_iommu_iotlb_flush(d, gfn, dma_pte_present(old), 1);
+        return __intel_iommu_iotlb_flush(d, gfn, dma_pte_present(old), 1);
 
     return 0;
 }
@@ -1749,19 +1785,18 @@ static int intel_iommu_unmap_page(struct domain *d, unsigned long gfn)
     if ( iommu_passthrough && is_hardware_domain(d) )
         return 0;
 
-    dma_pte_clear_one(d, (paddr_t)gfn << PAGE_SHIFT_4K);
-
-    return 0;
+    return dma_pte_clear_one(d, (paddr_t)gfn << PAGE_SHIFT_4K);
 }
 
-void iommu_pte_flush(struct domain *d, u64 gfn, u64 *pte,
-                     int order, int present)
+int iommu_pte_flush(struct domain *d, u64 gfn, u64 *pte,
+                    int order, int present)
 {
     struct acpi_drhd_unit *drhd;
     struct iommu *iommu = NULL;
     struct hvm_iommu *hd = domain_hvm_iommu(d);
     int flush_dev_iotlb;
     int iommu_domid;
+    int rc = 0;
 
     iommu_flush_cache_entry(pte, sizeof(struct dma_pte));
 
@@ -1775,11 +1810,17 @@ void iommu_pte_flush(struct domain *d, u64 gfn, u64 *pte,
         iommu_domid= domain_iommu_domid(d, iommu);
         if ( iommu_domid == -1 )
             continue;
-        if ( iommu_flush_iotlb_psi(iommu, iommu_domid,
+        rc = iommu_flush_iotlb_psi(iommu, iommu_domid,
                                    (paddr_t)gfn << PAGE_SHIFT_4K,
-                                   order, !present, flush_dev_iotlb) )
+                                   order, !present, flush_dev_iotlb);
+        if ( rc > 0 )
+        {
             iommu_flush_write_buffer(iommu);
+            rc = 0;
+        }
     }
+
+    return rc;
 }
 
 static int __init vtd_ept_page_compatible(struct iommu *iommu)
@@ -2099,8 +2140,8 @@ static int init_vtd_hw(void)
             return -EIO;
         }
     }
-    iommu_flush_all();
-    return 0;
+
+    return iommu_flush_all();
 }
 
 static void __hwdom_init setup_hwdom_rmrr(struct domain *d)
@@ -2389,16 +2430,19 @@ static int intel_iommu_group_id(u16 seg, u8 bus, u8 devfn)
 }
 
 static u32 iommu_state[MAX_IOMMUS][MAX_IOMMU_REGS];
-static void vtd_suspend(void)
+static int vtd_suspend(void)
 {
     struct acpi_drhd_unit *drhd;
     struct iommu *iommu;
     u32 i;
+    int rc;
 
     if ( !iommu_enabled )
-        return;
+        return 0;
 
-    iommu_flush_all();
+    rc = iommu_flush_all();
+    if ( rc )
+        return rc;
 
     for_each_drhd_unit ( drhd )
     {
@@ -2427,6 +2471,8 @@ static void vtd_suspend(void)
         if ( !iommu_intremap && iommu_qinval )
             disable_qinval(iommu);
     }
+
+    return 0;
 }
 
 static void vtd_crash_shutdown(void)
diff --git a/xen/drivers/passthrough/vtd/quirks.c b/xen/drivers/passthrough/vtd/quirks.c
index 49df41d..a230b68 100644
--- a/xen/drivers/passthrough/vtd/quirks.c
+++ b/xen/drivers/passthrough/vtd/quirks.c
@@ -332,10 +332,11 @@ void __init platform_quirks_init(void)
  * assigning Intel integrated wifi device to a guest.
  */
-static void map_me_phantom_function(struct domain *domain, u32 dev, int map)
+static int map_me_phantom_function(struct domain *domain, u32 dev, int map)
 {
     struct acpi_drhd_unit *drhd;
     struct pci_dev *pdev;
+    int rc;
 
     /* find ME VT-d engine base on a real ME device */
     pdev = pci_get_pdev(0, 0, PCI_DEVFN(dev, 0));
@@ -343,23 +344,26 @@ static void map_me_phantom_function(struct domain *domain, u32 dev, int map)
 
     /* map or unmap ME phantom function */
     if ( map )
-        domain_context_mapping_one(domain, drhd->iommu, 0,
-                                   PCI_DEVFN(dev, 7), NULL);
+        rc = domain_context_mapping_one(domain, drhd->iommu, 0,
+                                        PCI_DEVFN(dev, 7), NULL);
     else
-        domain_context_unmap_one(domain, drhd->iommu, 0,
-                                 PCI_DEVFN(dev, 7));
+        rc = domain_context_unmap_one(domain, drhd->iommu, 0,
+                                      PCI_DEVFN(dev, 7));
+
+    return rc;
 }
 
-void me_wifi_quirk(struct domain *domain, u8 bus, u8 devfn, int map)
+int me_wifi_quirk(struct domain *domain, u8 bus, u8 devfn, int map)
 {
     u32 id;
+    int rc = 0;
 
     id = pci_conf_read32(0, 0, 0, 0, 0);
     if ( IS_CTG(id) )
     {
         /* quit if ME does not exist */
         if ( pci_conf_read32(0, 0, 3, 0, 0) == 0xffffffff )
-            return;
+            return -ENOENT;
 
         /* if device is WLAN device, map ME phantom device 0:3.7 */
         id = pci_conf_read32(0, bus, PCI_SLOT(devfn), PCI_FUNC(devfn), 0);
@@ -373,7 +377,7 @@ void me_wifi_quirk(struct domain *domain, u8 bus, u8 devfn, int map)
             case 0x423b8086:
             case 0x423c8086:
             case 0x423d8086:
-                map_me_phantom_function(domain, 3, map);
+                rc = map_me_phantom_function(domain, 3, map);
                 break;
             default:
                 break;
@@ -383,7 +387,7 @@ void me_wifi_quirk(struct domain *domain, u8 bus, u8 devfn, int map)
     {
         /* quit if ME does not exist */
         if ( pci_conf_read32(0, 0, 22, 0, 0) == 0xffffffff )
-            return;
+            return -ENOENT;
 
         /* if device is WLAN device, map ME phantom device 0:22.7 */
         id = pci_conf_read32(0, bus, PCI_SLOT(devfn), PCI_FUNC(devfn), 0);
@@ -399,12 +403,14 @@ void me_wifi_quirk(struct domain *domain, u8 bus, u8 devfn, int map)
             case 0x42388086:    /* Puma Peak */
             case 0x422b8086:
             case 0x422c8086:
-                map_me_phantom_function(domain, 22, map);
+                rc = map_me_phantom_function(domain, 22, map);
                 break;
             default:
                 break;
         }
     }
+
+    return rc;
 }
 
 void pci_vtd_quirk(const struct pci_dev *pdev)
diff --git a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
index 9c51172..f540fc8 100644
--- a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
+++ b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
@@ -119,7 +119,7 @@ extern unsigned long *shared_intremap_inuse;
 
 /* power management support */
 void amd_iommu_resume(void);
-void amd_iommu_suspend(void);
+int amd_iommu_suspend(void);
 void amd_iommu_crash_shutdown(void);
 
 /* guest iommu support */
diff --git a/xen/include/asm-x86/iommu.h b/xen/include/asm-x86/iommu.h
index f22b3a5..790701e 100644
--- a/xen/include/asm-x86/iommu.h
+++ b/xen/include/asm-x86/iommu.h
@@ -26,7 +26,7 @@ int iommu_setup_hpet_msi(struct msi_desc *);
 /* While VT-d specific, this must get declared in a generic header. */
 int adjust_vtd_irq_affinities(void);
 
-void iommu_pte_flush(struct domain *d, u64 gfn, u64 *pte, int order, int present);
+int iommu_pte_flush(struct domain *d, u64 gfn, u64 *pte, int order, int present);
 bool_t iommu_supports_eim(void);
 int iommu_enable_x2apic_IR(void);
 void iommu_disable_x2apic_IR(void);
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index d6d489a..8710dfe 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -157,12 +157,12 @@ struct iommu_ops {
     unsigned int (*read_apic_from_ire)(unsigned int apic, unsigned int reg);
     int (*setup_hpet_msi)(struct msi_desc *);
 #endif /* CONFIG_X86 */
-    void (*suspend)(void);
+    int (*suspend)(void);
     void (*resume)(void);
     void (*share_p2m)(struct domain *d);
     void (*crash_shutdown)(void);
-    void (*iotlb_flush)(struct domain *d, unsigned long gfn, unsigned int page_count);
-    void (*iotlb_flush_all)(struct domain *d);
+    int (*iotlb_flush)(struct domain *d, unsigned long gfn, unsigned int page_count);
+    int (*iotlb_flush_all)(struct domain *d);
     int (*get_reserved_device_memory)(iommu_grdm_t *, void *);
     void (*dump_p2m_table)(struct domain *d);
 };
Current code would panic() when a VT-d Device-TLB flush timed out.
The panic() is going to be eliminated, so we must check all kinds of
errors and propagate them all the way up the call trees.

Signed-off-by: Quan Xu <quan.xu@intel.com>

CC: Jun Nakajima <jun.nakajima@intel.com>
CC: Kevin Tian <kevin.tian@intel.com>
CC: George Dunlap <george.dunlap@eu.citrix.com>
CC: Keir Fraser <keir@xen.org>
CC: Jan Beulich <jbeulich@suse.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
CC: Stefano Stabellini <stefano.stabellini@citrix.com>
CC: Julien Grall <julien.grall@arm.com>
CC: Feng Wu <feng.wu@intel.com>
CC: Dario Faggioli <dario.faggioli@citrix.com>
---
 xen/arch/x86/mm/p2m-ept.c                     |   2 +-
 xen/drivers/passthrough/amd/iommu_init.c      |  12 ++-
 xen/drivers/passthrough/amd/pci_amd_iommu.c   |   2 +-
 xen/drivers/passthrough/arm/smmu.c            |  10 ++-
 xen/drivers/passthrough/iommu.c               |  17 ++--
 xen/drivers/passthrough/vtd/extern.h          |   2 +-
 xen/drivers/passthrough/vtd/iommu.c           | 120 ++++++++++++++++++--------
 xen/drivers/passthrough/vtd/quirks.c          |  26 +++---
 xen/include/asm-x86/hvm/svm/amd-iommu-proto.h |   2 +-
 xen/include/asm-x86/iommu.h                   |   2 +-
 xen/include/xen/iommu.h                       |   6 +-
 11 files changed, 133 insertions(+), 68 deletions(-)