Message ID | 1459522059-102365-2-git-send-email-quan.xu@intel.com (mailing list archive) |
---|---|
State | New, archived |
>>> On 01.04.16 at 16:47, <quan.xu@intel.com> wrote:

The subject should mention "timeout", perhaps either in addition to or in
place of "command line".

> --- a/docs/misc/xen-command-line.markdown
> +++ b/docs/misc/xen-command-line.markdown
> @@ -1532,6 +1532,24 @@ Note that if **watchdog** option is also specified vpmu will be turned off.
>  As the virtualisation is not 100% safe, don't use the vpmu flag on
>  production systems (see http://xenbits.xen.org/xsa/advisory-163.html)!
>
> +### vtd\_qi\_timeout (VT-d)
> +> `= <integer>`
> +
> +> Default: `1`
> +
> +Specify the timeout of the VT-d Queued Invalidation in milliseconds.
> +By default, the spin timeout is 1ms, which can be boot-time changed.

Especially the part after the comma makes little sense considering which
file we're in.

> +In current code, VT-d Queued Invalidation includes Device-TLB, IOTLB,
> +Context and IEC flush. If Device-TLB flush timed out, we would hide
> +the target ATS device and crash the domain owning this ATS device.
> +If impacted domain is hardware domain, just throw out a warning (done
> +in queue\_invalidate\_wait). IOTLB, Context and IEC flush timeout are
> +still in TODO-list.

Much of this doesn't seem to belong here either.

> +When you see error 'Queue invalidate wait descriptor timed out', try
> +increasing the vtd\_qi\_timeout to 10ms or more.

Why 10ms? (If there's no specific reason, I think you'd better drop any
explicit number.) Also there's no reason to spell out the command line
option again here - the context makes clear which value needs increasing.

Jan

On April 05, 2016 5:09pm, <JBeulich@suse.com> wrote:
> >>> On 01.04.16 at 16:47, <quan.xu@intel.com> wrote:
>
> The subject should mention "timeout", perhaps either in addition to or in
> place of "command line".

I prefer "VT-d: add a timeout parameter for Queued Invalidation".

> > --- a/docs/misc/xen-command-line.markdown
> > +++ b/docs/misc/xen-command-line.markdown
> > @@ -1532,6 +1532,24 @@ Note that if **watchdog** option is also specified vpmu will be turned off.
> >  As the virtualisation is not 100% safe, don't use the vpmu flag on
> >  production systems (see http://xenbits.xen.org/xsa/advisory-163.html)!
> >
> > +### vtd\_qi\_timeout (VT-d)
> > +> `= <integer>`
> > +
> > +> Default: `1`
> > +
> > +Specify the timeout of the VT-d Queued Invalidation in milliseconds.
> > +By default, the spin timeout is 1ms, which can be boot-time changed.
>
> Especially the part after the comma makes little sense considering which
> file we're in.

Agreed.

> > +In current code, VT-d Queued Invalidation includes Device-TLB, IOTLB,
> > +Context and IEC flush. If Device-TLB flush timed out, we would hide
> > +the target ATS device and crash the domain owning this ATS device.
> > +If impacted domain is hardware domain, just throw out a warning (done
> > +in queue\_invalidate\_wait). IOTLB, Context and IEC flush timeout are
> > +still in TODO-list.
>
> Much of this doesn't seem to belong here either.

Could I drop it?

> > +When you see error 'Queue invalidate wait descriptor timed out', try
> > +increasing the vtd\_qi\_timeout to 10ms or more.
>
> Why 10ms? (If there's no specific reason, I think you'd better drop any
> explicit number.)

Yes, no specific reason.

> Also there's no reason to spell out the command line option again here -
> the context makes clear which value needs increasing.

Agreed.

Then, the new description in xen-command-line.markdown:

+### vtd\_qi\_timeout (VT-d)
+> `= <integer>`
+
+> Default: `1`
+
+Specify the timeout of the VT-d Queued Invalidation in milliseconds.
+By default, the timeout is 1ms.
+
+When you see error 'Queue invalidate wait descriptor timed out', try
+increasing this value.
+

Any more suggestions?

Quan

> From: Xu, Quan
> Sent: Thursday, April 07, 2016 9:49 AM
>
> > > +In current code, VT-d Queued Invalidation includes Device-TLB, IOTLB,
> > > +Context and IEC flush. If Device-TLB flush timed out, we would hide
> > > +the target ATS device and crash the domain owning this ATS device.
> > > +If impacted domain is hardware domain, just throw out a warning (done
> > > +in queue\_invalidate\_wait). IOTLB, Context and IEC flush timeout are
> > > +still in TODO-list.
> >
> > Much of this doesn't seem to belong here either.
>
> Could I drop it?

Yes, please do. The above belongs in either the patch's commit message or a
code comment.

Thanks
Kevin

diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
index ca77e3b..5a7ed5d 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -1532,6 +1532,24 @@ Note that if **watchdog** option is also specified vpmu will be turned off.
 As the virtualisation is not 100% safe, don't use the vpmu flag on
 production systems (see http://xenbits.xen.org/xsa/advisory-163.html)!

+### vtd\_qi\_timeout (VT-d)
+> `= <integer>`
+
+> Default: `1`
+
+Specify the timeout of the VT-d Queued Invalidation in milliseconds.
+By default, the spin timeout is 1ms, which can be boot-time changed.
+
+In current code, VT-d Queued Invalidation includes Device-TLB, IOTLB,
+Context and IEC flush. If Device-TLB flush timed out, we would hide
+the target ATS device and crash the domain owning this ATS device.
+If impacted domain is hardware domain, just throw out a warning (done
+in queue\_invalidate\_wait). IOTLB, Context and IEC flush timeout are
+still in TODO-list.
+
+When you see error 'Queue invalidate wait descriptor timed out', try
+increasing the vtd\_qi\_timeout to 10ms or more.
+
 ### watchdog
 > `= force | <boolean>`

diff --git a/xen/drivers/passthrough/vtd/qinval.c b/xen/drivers/passthrough/vtd/qinval.c
index b81b0bd..52ba2c2 100644
--- a/xen/drivers/passthrough/vtd/qinval.c
+++ b/xen/drivers/passthrough/vtd/qinval.c
@@ -28,6 +28,11 @@
 #include "vtd.h"
 #include "extern.h"

+static unsigned int __read_mostly vtd_qi_timeout = 1;
+integer_param("vtd_qi_timeout", vtd_qi_timeout);
+
+#define IOMMU_QI_TIMEOUT (vtd_qi_timeout * MILLISECS(1))
+
 static void print_qi_regs(struct iommu *iommu)
 {
     u64 val;
@@ -130,10 +135,10 @@ static void queue_invalidate_iotlb(struct iommu *iommu,
     spin_unlock_irqrestore(&iommu->register_lock, flags);
 }

-static int queue_invalidate_wait(struct iommu *iommu,
+static int __must_check queue_invalidate_wait(struct iommu *iommu,
     u8 iflag, u8 sw, u8 fn)
 {
-    s_time_t start_time;
+    s_time_t timeout;
     volatile u32 poll_slot = QINVAL_STAT_INIT;
     unsigned int index;
     unsigned long flags;
@@ -164,13 +169,15 @@ static int queue_invalidate_wait(struct iommu *iommu,
     if ( sw )
     {
         /* In case all wait descriptor writes to same addr with same data */
-        start_time = NOW();
+        timeout = NOW() + IOMMU_QI_TIMEOUT;
         while ( poll_slot != QINVAL_STAT_DONE )
         {
-            if ( NOW() > (start_time + DMAR_OPERATION_TIMEOUT) )
+            if ( NOW() > timeout )
             {
                 print_qi_regs(iommu);
-                panic("queue invalidate wait descriptor was not executed");
+                printk(XENLOG_WARNING VTDPREFIX
+                       "Queue invalidate wait descriptor timed out.\n");
+                return -ETIMEDOUT;
             }
             cpu_relax();
         }

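For readers who want to try the polling change outside the hypervisor, here is a minimal userspace sketch of the same pattern: compute an absolute deadline once, spin on a status word, and return -ETIMEDOUT rather than panicking. Everything in it is an assumption made for illustration: `now_ns()` stands in for Xen's `NOW()`, `MS_TO_NS()` for `MILLISECS()`, and `qi_timeout_ms` and `wait_for_status()` are hypothetical names that do not appear in the patch.

```c
#define _POSIX_C_SOURCE 199309L

#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical stand-in for Xen's NOW(): monotonic time in nanoseconds. */
static int64_t now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Hypothetical stand-in for MILLISECS(): milliseconds to nanoseconds. */
#define MS_TO_NS(ms) ((int64_t)(ms) * 1000000LL)

/* Mirrors the role of vtd_qi_timeout; the default of 1 ms is from the patch. */
static unsigned int qi_timeout_ms = 1;

/*
 * Spin until *slot equals done, or until the deadline passes.
 * Returns 0 on completion and -ETIMEDOUT on timeout, so the caller
 * decides how to react instead of the whole system going down.
 */
static int wait_for_status(volatile uint32_t *slot, uint32_t done)
{
    /* Compute the deadline once, before entering the loop. */
    int64_t deadline = now_ns() + MS_TO_NS(qi_timeout_ms);

    while ( *slot != done )
    {
        if ( now_ns() > deadline )
        {
            fprintf(stderr, "wait descriptor timed out\n");
            return -ETIMEDOUT;
        }
    }

    return 0;
}
```

Calling `wait_for_status()` on a slot that never changes should return -ETIMEDOUT after roughly `qi_timeout_ms` milliseconds; a slot that already holds the done value returns 0 immediately.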
The command line parameter 'vtd_qi_timeout' specifies the timeout of the
VT-d Queued Invalidation in milliseconds. By default, the timeout is 1ms,
which can be changed at boot time.

Add a __must_check annotation. The followup patch titled
'VT-d IOTLB/Context/IEC flush issue' addresses the __must_check. That is,
the other callers of this routine (two or three levels up) ignore the
return code. This patch does not address this but the other does.

Signed-off-by: Quan Xu <quan.xu@intel.com>
---
 docs/misc/xen-command-line.markdown  | 18 ++++++++++++++++++
 xen/drivers/passthrough/vtd/qinval.c | 17 ++++++++++++-----
 2 files changed, 30 insertions(+), 5 deletions(-)
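
To illustrate what the annotation asks of callers, the sketch below shows a must-check function and a caller that propagates its return code. It is a rough illustration under stated assumptions, not the followup patch: the `must_check` macro is assumed to map to the GCC/Clang `warn_unused_result` attribute, and `flush_wait()` and `flush_sync()` are hypothetical names.

```c
#include <errno.h>
#include <stdio.h>

/* Assumption: a must-check annotation maps to this GCC/Clang attribute. */
#define must_check __attribute__((warn_unused_result))

/* Hypothetical leaf routine: reports -ETIMEDOUT when asked to simulate one. */
static int must_check flush_wait(int simulate_timeout)
{
    return simulate_timeout ? -ETIMEDOUT : 0;
}

/*
 * Hypothetical caller one level up. It stores and returns the result,
 * which satisfies the attribute; calling flush_wait() as a bare statement
 * would draw a compiler warning instead.
 */
static int must_check flush_sync(int simulate_timeout)
{
    int rc = flush_wait(simulate_timeout);

    if ( rc )
        fprintf(stderr, "flush failed: %d\n", rc);

    return rc;
}
```

Because `flush_sync()` is annotated as well, the obligation to look at the error travels up the call chain one level at a time, which appears to be the situation the commit message describes for the callers two or three levels up.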