Message ID | 1461322453-29216-2-git-send-email-quan.xu@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
>>> On 22.04.16 at 12:54, <quan.xu@intel.com> wrote: > --- a/docs/misc/xen-command-line.markdown > +++ b/docs/misc/xen-command-line.markdown > @@ -1532,6 +1532,16 @@ Note that if **watchdog** option is also specified vpmu will be turned off. > As the virtualisation is not 100% safe, don't use the vpmu flag on > production systems (see http://xenbits.xen.org/xsa/advisory-163.html)! > > +### vtd\_qi\_timeout (VT-d) > +> `= <integer>` > + > +> Default: `1` > + > +Specify the timeout of the VT-d Queued Invalidation in milliseconds. > + > +By default, the timeout is 1ms. When you see error 'Queue invalidate wait > +descriptor timed out', try increasing this value. So when someone enables ATS, will the 1ms timeout apply to the dev iotlb invalidations too? If so, that's surely too short, and would ideally be adjusted automatically, but the need for a higher timeout in that case should in any event be mentioned here. Apart from that aspect this patch seems to be ready, but will clearly need a VT-d maintainer's ack. Jan
On May 13, 2016 11:28 PM, Jan Beulich <JBeulich@suse.com> wrote: > >>> On 22.04.16 at 12:54, <quan.xu@intel.com> wrote: > > --- a/docs/misc/xen-command-line.markdown > > +++ b/docs/misc/xen-command-line.markdown > > @@ -1532,6 +1532,16 @@ Note that if **watchdog** option is also > specified vpmu will be turned off. > > As the virtualisation is not 100% safe, don't use the vpmu flag on > > production systems (see http://xenbits.xen.org/xsa/advisory-163.html)! > > > > +### vtd\_qi\_timeout (VT-d) > > +> `= <integer>` > > + > > +> Default: `1` > > + > > +Specify the timeout of the VT-d Queued Invalidation in milliseconds. > > + > > +By default, the timeout is 1ms. When you see error 'Queue invalidate > > +wait descriptor timed out', try increasing this value. > > So when someone enables ATS, will the 1ms timeout apply to the dev iotlb > invalidations too? Yes, The timeout is the same for IOTLB, Context, IEC and Device-TLB invalidation. > If so, that's surely too short, and would ideally be adjusted > automatically, but the need for a higher timeout in that case should in any > event be mentioned here. I can try to use 1ms for IOTLB, Context and IEC invalidation. As mentioned, 1 ms is enough for IOTLB, Context and IEC invalidation. What about 10 ms for Device-TLB (10 ms is just a higher timeout, no specific meaning)? > > Apart from that aspect this patch seems to be ready, but will clearly need a VT- > d maintainer's ack. > Thanks for your review. I will also test this patch against the last commit ( I am still out of office and I will do it around this Wednesday). Quan
> From: Xu, Quan > Sent: Monday, May 16, 2016 11:26 PM > > On May 13, 2016 11:28 PM, Jan Beulich <JBeulich@suse.com> wrote: > > >>> On 22.04.16 at 12:54, <quan.xu@intel.com> wrote: > > > --- a/docs/misc/xen-command-line.markdown > > > +++ b/docs/misc/xen-command-line.markdown > > > @@ -1532,6 +1532,16 @@ Note that if **watchdog** option is also > > specified vpmu will be turned off. > > > As the virtualisation is not 100% safe, don't use the vpmu flag on > > > production systems (see http://xenbits.xen.org/xsa/advisory-163.html)! > > > > > > +### vtd\_qi\_timeout (VT-d) > > > +> `= <integer>` > > > + > > > +> Default: `1` > > > + > > > +Specify the timeout of the VT-d Queued Invalidation in milliseconds. > > > + > > > +By default, the timeout is 1ms. When you see error 'Queue invalidate > > > +wait descriptor timed out', try increasing this value. > > > > So when someone enables ATS, will the 1ms timeout apply to the dev iotlb > > invalidations too? > > Yes, > The timeout is the same for IOTLB, Context, IEC and Device-TLB invalidation. > > > If so, that's surely too short, and would ideally be adjusted > > automatically, but the need for a higher timeout in that case should in any > > event be mentioned here. > > I can try to use 1ms for IOTLB, Context and IEC invalidation. As mentioned, 1 ms is > enough for IOTLB, Context and IEC invalidation. > What about 10 ms for Device-TLB (10 ms is just a higher timeout, no specific meaning)? I remember in earlier discussion we agreed to use 1ms as the default for both IOMMU-side and device-side flushes. For device-side flushes, we checked internal HW team that 1ms is a reasonable threshold for integrated devices. It's likely insufficient for discrete devices. We may check any automatic adjustment method later when it becomes a real problem. For now, please elaborate above information in the text. > > > > > Apart from that aspect this patch seems to be ready, but will clearly need a VT- > > d maintainer's ack. > > > > Thanks for your review. I will also test this patch against the last commit ( I am still out > of office and I will do it around this Wednesday). > Jan, do you want me to ack this series now, or wait until previous series handling error is checked in? (I thought you wanted the later in one of your replies earlier). Either way is OK to me. :-) Thanks Kevin
>>> On 17.05.16 at 05:19, <kevin.tian@intel.com> wrote: >> From: Xu, Quan >> Sent: Monday, May 16, 2016 11:26 PM >> >> On May 13, 2016 11:28 PM, Jan Beulich <JBeulich@suse.com> wrote: >> > >>> On 22.04.16 at 12:54, <quan.xu@intel.com> wrote: >> > > --- a/docs/misc/xen-command-line.markdown >> > > +++ b/docs/misc/xen-command-line.markdown >> > > @@ -1532,6 +1532,16 @@ Note that if **watchdog** option is also >> > specified vpmu will be turned off. >> > > As the virtualisation is not 100% safe, don't use the vpmu flag on >> > > production systems (see http://xenbits.xen.org/xsa/advisory-163.html)! >> > > >> > > +### vtd\_qi\_timeout (VT-d) >> > > +> `= <integer>` >> > > + >> > > +> Default: `1` >> > > + >> > > +Specify the timeout of the VT-d Queued Invalidation in milliseconds. >> > > + >> > > +By default, the timeout is 1ms. When you see error 'Queue invalidate >> > > +wait descriptor timed out', try increasing this value. >> > >> > So when someone enables ATS, will the 1ms timeout apply to the dev iotlb >> > invalidations too? >> >> Yes, >> The timeout is the same for IOTLB, Context, IEC and Device-TLB invalidation. >> >> > If so, that's surely too short, and would ideally be adjusted >> > automatically, but the need for a higher timeout in that case should in any >> > event be mentioned here. >> >> I can try to use 1ms for IOTLB, Context and IEC invalidation. As mentioned, 1 ms is >> enough for IOTLB, Context and IEC invalidation. >> What about 10 ms for Device-TLB (10 ms is just a higher timeout, no specific meaning)? > > I remember in earlier discussion we agreed to use 1ms as the default for both > IOMMU-side and device-side flushes. For device-side flushes, we checked internal > HW team that 1ms is a reasonable threshold for integrated devices. It's likely > insufficient for discrete devices. We may check any automatic adjustment method > later when it becomes a real problem. For now, please elaborate above information > in the text. Well, taking care of automation later is fine with me, but tying everything to a single timeout, when device iotlb invalidation may require a much larger value, isn't. >> > Apart from that aspect this patch seems to be ready, but will clearly need a VT- >> > d maintainer's ack. >> > >> >> Thanks for your review. I will also test this patch against the last commit ( I am still out >> of office and I will do it around this Wednesday). >> > > Jan, do you want me to ack this series now, or wait until previous series > handling error is checked in? (I thought you wanted the later in one of > your replies earlier). Either way is OK to me. :-) Well, I'm not to tell you when to ack a certain series. You should ack it once it meets the criteria for you to ack it. The dependency on the other series is what should be spelled out explicitly in this series, or even better this series should be merged with the other one (to the dependency make most obvious). But I'm repeating myself... Jan
On May 17, 2016 3:48 PM, Jan Beulich <JBeulich@suse.com> wrote: > >>> On 17.05.16 at 05:19, <kevin.tian@intel.com> wrote: > >> From: Xu, Quan > >> Sent: Monday, May 16, 2016 11:26 PM > >> > >> On May 13, 2016 11:28 PM, Jan Beulich <JBeulich@suse.com> wrote: > >> > >>> On 22.04.16 at 12:54, <quan.xu@intel.com> wrote: > >> > > --- a/docs/misc/xen-command-line.markdown > >> > > +++ b/docs/misc/xen-command-line.markdown > >> > > @@ -1532,6 +1532,16 @@ Note that if **watchdog** option is also > >> > specified vpmu will be turned off. > >> > > As the virtualisation is not 100% safe, don't use the vpmu flag > >> > > on production systems (see http://xenbits.xen.org/xsa/advisory- > 163.html)! > >> > > > >> > > +### vtd\_qi\_timeout (VT-d) > >> > > +> `= <integer>` > >> > > + > >> > > +> Default: `1` > >> > > + > >> > > +Specify the timeout of the VT-d Queued Invalidation in milliseconds. > >> > > + > >> > > +By default, the timeout is 1ms. When you see error 'Queue > >> > > +invalidate wait descriptor timed out', try increasing this value. > >> > > >> > So when someone enables ATS, will the 1ms timeout apply to the dev > >> > iotlb invalidations too? > >> > >> Yes, > >> The timeout is the same for IOTLB, Context, IEC and Device-TLB invalidation. > >> > >> > If so, that's surely too short, and would ideally be adjusted > >> > automatically, but the need for a higher timeout in that case > >> > should in any event be mentioned here. > >> > >> I can try to use 1ms for IOTLB, Context and IEC invalidation. As > >> mentioned, 1 ms is enough for IOTLB, Context and IEC invalidation. > >> What about 10 ms for Device-TLB (10 ms is just a higher timeout, no > specific meaning)? > > > > I remember in earlier discussion we agreed to use 1ms as the default > > for both IOMMU-side and device-side flushes. For device-side flushes, > > we checked internal HW team that 1ms is a reasonable threshold for > > integrated devices. It's likely insufficient for discrete devices. We > > may check any automatic adjustment method later when it becomes a real > > problem. For now, please elaborate above information in the text. > > Well, taking care of automation later is fine with me, > but tying everything to a > single timeout, when device iotlb invalidation may require a much larger value, > isn't. > A little bit confused. Check it -- could I leave patch 1/3 as is? btw, I have tested it against the last commit, no conflict. Quan
>>> On 18.05.16 at 14:53, <quan.xu@intel.com> wrote: > On May 17, 2016 3:48 PM, Jan Beulich <JBeulich@suse.com> wrote: >> >>> On 17.05.16 at 05:19, <kevin.tian@intel.com> wrote: >> >> From: Xu, Quan >> >> Sent: Monday, May 16, 2016 11:26 PM >> >> >> >> On May 13, 2016 11:28 PM, Jan Beulich <JBeulich@suse.com> wrote: >> >> > >>> On 22.04.16 at 12:54, <quan.xu@intel.com> wrote: >> >> > > --- a/docs/misc/xen-command-line.markdown >> >> > > +++ b/docs/misc/xen-command-line.markdown >> >> > > @@ -1532,6 +1532,16 @@ Note that if **watchdog** option is also >> >> > specified vpmu will be turned off. >> >> > > As the virtualisation is not 100% safe, don't use the vpmu flag >> >> > > on production systems (see http://xenbits.xen.org/xsa/advisory- >> 163.html)! >> >> > > >> >> > > +### vtd\_qi\_timeout (VT-d) >> >> > > +> `= <integer>` >> >> > > + >> >> > > +> Default: `1` >> >> > > + >> >> > > +Specify the timeout of the VT-d Queued Invalidation in milliseconds. >> >> > > + >> >> > > +By default, the timeout is 1ms. When you see error 'Queue >> >> > > +invalidate wait descriptor timed out', try increasing this value. >> >> > >> >> > So when someone enables ATS, will the 1ms timeout apply to the dev >> >> > iotlb invalidations too? >> >> >> >> Yes, >> >> The timeout is the same for IOTLB, Context, IEC and Device-TLB invalidation. >> >> >> >> > If so, that's surely too short, and would ideally be adjusted >> >> > automatically, but the need for a higher timeout in that case >> >> > should in any event be mentioned here. >> >> >> >> I can try to use 1ms for IOTLB, Context and IEC invalidation. As >> >> mentioned, 1 ms is enough for IOTLB, Context and IEC invalidation. >> >> What about 10 ms for Device-TLB (10 ms is just a higher timeout, no >> specific meaning)? >> > >> > I remember in earlier discussion we agreed to use 1ms as the default >> > for both IOMMU-side and device-side flushes. For device-side flushes, >> > we checked internal HW team that 1ms is a reasonable threshold for >> > integrated devices. It's likely insufficient for discrete devices. We >> > may check any automatic adjustment method later when it becomes a real >> > problem. For now, please elaborate above information in the text. >> >> Well, taking care of automation later is fine with me, >> but tying everything to a >> single timeout, when device iotlb invalidation may require a much larger value, >> isn't. >> > > A little bit confused. Check it -- could I leave patch 1/3 as is? The patch can imo remain as is only if the new default timeout is large enough for all possible cases (including those users who are adventurous enough to turn on ATS). > btw, I have tested it against the last commit, no conflict. No idea what you mean to say with this. Jan
> From: Jan Beulich [mailto:JBeulich@suse.com] > Sent: Wednesday, May 18, 2016 11:05 PM > > >>> On 18.05.16 at 14:53, <quan.xu@intel.com> wrote: > > On May 17, 2016 3:48 PM, Jan Beulich <JBeulich@suse.com> wrote: > >> >>> On 17.05.16 at 05:19, <kevin.tian@intel.com> wrote: > >> >> From: Xu, Quan > >> >> Sent: Monday, May 16, 2016 11:26 PM > >> >> > >> >> On May 13, 2016 11:28 PM, Jan Beulich <JBeulich@suse.com> wrote: > >> >> > >>> On 22.04.16 at 12:54, <quan.xu@intel.com> wrote: > >> >> > > --- a/docs/misc/xen-command-line.markdown > >> >> > > +++ b/docs/misc/xen-command-line.markdown > >> >> > > @@ -1532,6 +1532,16 @@ Note that if **watchdog** option is also > >> >> > specified vpmu will be turned off. > >> >> > > As the virtualisation is not 100% safe, don't use the vpmu flag > >> >> > > on production systems (see http://xenbits.xen.org/xsa/advisory- > >> 163.html)! > >> >> > > > >> >> > > +### vtd\_qi\_timeout (VT-d) > >> >> > > +> `= <integer>` > >> >> > > + > >> >> > > +> Default: `1` > >> >> > > + > >> >> > > +Specify the timeout of the VT-d Queued Invalidation in milliseconds. > >> >> > > + > >> >> > > +By default, the timeout is 1ms. When you see error 'Queue > >> >> > > +invalidate wait descriptor timed out', try increasing this value. > >> >> > > >> >> > So when someone enables ATS, will the 1ms timeout apply to the dev > >> >> > iotlb invalidations too? > >> >> > >> >> Yes, > >> >> The timeout is the same for IOTLB, Context, IEC and Device-TLB invalidation. > >> >> > >> >> > If so, that's surely too short, and would ideally be adjusted > >> >> > automatically, but the need for a higher timeout in that case > >> >> > should in any event be mentioned here. > >> >> > >> >> I can try to use 1ms for IOTLB, Context and IEC invalidation. As > >> >> mentioned, 1 ms is enough for IOTLB, Context and IEC invalidation. > >> >> What about 10 ms for Device-TLB (10 ms is just a higher timeout, no > >> specific meaning)? > >> > > >> > I remember in earlier discussion we agreed to use 1ms as the default > >> > for both IOMMU-side and device-side flushes. For device-side flushes, > >> > we checked internal HW team that 1ms is a reasonable threshold for > >> > integrated devices. It's likely insufficient for discrete devices. We > >> > may check any automatic adjustment method later when it becomes a real > >> > problem. For now, please elaborate above information in the text. > >> > >> Well, taking care of automation later is fine with me, > >> but tying everything to a > >> single timeout, when device iotlb invalidation may require a much larger value, > >> isn't. > >> > > > > A little bit confused. Check it -- could I leave patch 1/3 as is? > > The patch can imo remain as is only if the new default timeout > is large enough for all possible cases (including those users > who are adventurous enough to turn on ATS). > > > btw, I have tested it against the last commit, no conflict. > > No idea what you mean to say with this. > A single default value for both IOMMU-side and device-side is anyway not optimal. What about introducing a new knob e.g. vtd_qi_device_timeout specifically for device-side flush while using vtd_qi_timeout for other places? If device-side timeout is not specified, it is then default to vtd_qi_timeout. Thanks Kevin
On May 19, 2016 8:33 AM, Tian, Kevin <kevin.tian@intel.com> wrote: > > From: Jan Beulich [mailto:JBeulich@suse.com] > > Sent: Wednesday, May 18, 2016 11:05 PM > > > > >>> On 18.05.16 at 14:53, <quan.xu@intel.com> wrote: > > > On May 17, 2016 3:48 PM, Jan Beulich <JBeulich@suse.com> wrote: > > >> >>> On 17.05.16 at 05:19, <kevin.tian@intel.com> wrote: > > >> >> From: Xu, Quan > > >> >> Sent: Monday, May 16, 2016 11:26 PM > > >> >> > > >> >> On May 13, 2016 11:28 PM, Jan Beulich <JBeulich@suse.com> wrote: > > >> >> > >>> On 22.04.16 at 12:54, <quan.xu@intel.com> wrote: > > >> >> > > --- a/docs/misc/xen-command-line.markdown > > >> >> > > +++ b/docs/misc/xen-command-line.markdown > > >> >> > > @@ -1532,6 +1532,16 @@ Note that if **watchdog** option is > > >> >> > > also > > >> >> > specified vpmu will be turned off. > > >> >> > > As the virtualisation is not 100% safe, don't use the vpmu > > >> >> > > flag on production systems (see > > >> >> > > http://xenbits.xen.org/xsa/advisory- > > >> 163.html)! > > >> >> > > > > >> >> > > +### vtd\_qi\_timeout (VT-d) > > >> >> > > +> `= <integer>` > > >> >> > > + > > >> >> > > +> Default: `1` > > >> >> > > + > > >> >> > > +Specify the timeout of the VT-d Queued Invalidation in > milliseconds. > > >> >> > > + > > >> >> > > +By default, the timeout is 1ms. When you see error 'Queue > > >> >> > > +invalidate wait descriptor timed out', try increasing this value. > > >> >> > > > >> >> > So when someone enables ATS, will the 1ms timeout apply to the > > >> >> > dev iotlb invalidations too? > > >> >> > > >> >> Yes, > > >> >> The timeout is the same for IOTLB, Context, IEC and Device-TLB > invalidation. > > >> >> > > >> >> > If so, that's surely too short, and would ideally be adjusted > > >> >> > automatically, but the need for a higher timeout in that case > > >> >> > should in any event be mentioned here. > > >> >> > > >> >> I can try to use 1ms for IOTLB, Context and IEC invalidation. > > >> >> As mentioned, 1 ms is enough for IOTLB, Context and IEC invalidation. > > >> >> What about 10 ms for Device-TLB (10 ms is just a higher timeout, > > >> >> no > > >> specific meaning)? > > >> > > > >> > I remember in earlier discussion we agreed to use 1ms as the > > >> > default for both IOMMU-side and device-side flushes. For > > >> > device-side flushes, we checked internal HW team that 1ms is a > > >> > reasonable threshold for integrated devices. It's likely > > >> > insufficient for discrete devices. We may check any automatic > > >> > adjustment method later when it becomes a real problem. For now, > please elaborate above information in the text. > > >> > > >> Well, taking care of automation later is fine with me, but tying > > >> everything to a single timeout, when device iotlb invalidation may > > >> require a much larger value, isn't. > > >> > > > > > > A little bit confused. Check it -- could I leave patch 1/3 as is? > > > > The patch can imo remain as is only if the new default timeout is > > large enough for all possible cases (including those users who are > > adventurous enough to turn on ATS). > > Jan, I only have an ATS device (MYRICOM Inc. Myri-10G Dual-Protocol NIC). 1 ms is large enough for invalidation so far. Any suggestion for this new default timeout? > A single default value for both IOMMU-side and device-side is anyway not > optimal. What about introducing a new knob e.g. vtd_qi_device_timeout > specifically for device-side flush while using vtd_qi_timeout for other places? If > device-side timeout is not specified, it is then default to vtd_qi_timeout. > IMO, we are better to introduce only one variable for VT-d invalidation. The users may be not interested in such a detailed VT-d knowledge. Also this is taking into consideration of consistency. :) Quan
>>> "Xu, Quan" <quan.xu@intel.com> 05/19/16 3:35 AM >>> >On May 19, 2016 8:33 AM, Tian, Kevin <kevin.tian@intel.com> wrote: >> > From: Jan Beulich [mailto:JBeulich@suse.com] >> > Sent: Wednesday, May 18, 2016 11:05 PM >> > >>> On 18.05.16 at 14:53, <quan.xu@intel.com> wrote: >> > The patch can imo remain as is only if the new default timeout is >> > large enough for all possible cases (including those users who are >> > adventurous enough to turn on ATS). > >I only have an ATS device (MYRICOM Inc. Myri-10G Dual-Protocol NIC). 1 ms is large enough for invalidation so far. >Any suggestion for this new default timeout? Unless you have theoretical proof that a lower than the current value would suffice (as you do for the IOMMU side flushes), I think the default needs to remain the same as it is right now (which iirc is already _much_ lower than the real theoretical one). >> A single default value for both IOMMU-side and device-side is anyway not >> optimal. What about introducing a new knob e.g. vtd_qi_device_timeout >> specifically for device-side flush while using vtd_qi_timeout for other places? If >> device-side timeout is not specified, it is then default to vtd_qi_timeout. There should imo be a single command line option, allowing for two values to be passed (e.g. comma-separated). Jan
On May 19, 2016 2:13 PM, Jan Beulich <jbeulich@suse.com> wrote: > >>> "Xu, Quan" <quan.xu@intel.com> 05/19/16 3:35 AM >>> > >On May 19, 2016 8:33 AM, Tian, Kevin <kevin.tian@intel.com> wrote: > >> A single default value for both IOMMU-side and device-side is anyway > >> not optimal. What about introducing a new knob e.g. > >> vtd_qi_device_timeout specifically for device-side flush while using > >> vtd_qi_timeout for other places? If device-side timeout is not specified, it is > then default to vtd_qi_timeout. > > There should imo be a single command line option, allowing for two values to > be passed (e.g. comma-separated). > As mentioned, 1 ms is enough for VT-d IOTLB/Context/IEC invalidation, so we are no need to increasing the value of timeout or introduce a boot-time changed parameter. What about a constant (e.g. a macro), 1 ms, for VT-d IOTLB/Context/IEC invalidation timeout. For Device-TLB invalidation, we can introduce 'vtd_qi_device_timeout', which is boot-time changed, and 1 ms by default. Quan
>>> On 19.05.16 at 13:26, <quan.xu@intel.com> wrote: > On May 19, 2016 2:13 PM, Jan Beulich <jbeulich@suse.com> wrote: >> >>> "Xu, Quan" <quan.xu@intel.com> 05/19/16 3:35 AM >>> >> >On May 19, 2016 8:33 AM, Tian, Kevin <kevin.tian@intel.com> wrote: >> >> A single default value for both IOMMU-side and device-side is anyway >> >> not optimal. What about introducing a new knob e.g. >> >> vtd_qi_device_timeout specifically for device-side flush while using >> >> vtd_qi_timeout for other places? If device-side timeout is not specified, it is >> then default to vtd_qi_timeout. >> >> There should imo be a single command line option, allowing for two values to >> be passed (e.g. comma-separated). > > As mentioned, 1 ms is enough for VT-d IOTLB/Context/IEC invalidation, so we > are no need to increasing the value of timeout or introduce a boot-time > changed parameter. > What about a constant (e.g. a macro), 1 ms, for VT-d IOTLB/Context/IEC > invalidation timeout. If you're absolutely certain no-one will ever find a need to increase that value - sure. > For Device-TLB invalidation, we can introduce 'vtd_qi_device_timeout', which > is boot-time changed, and 1 ms by default. One question is whether making this a VT-d specific command line option is a good idea: Other IOMMU implementations ought to be in need of doing device IOTLB invalidation too, at least sooner or later. The other question is whether any timeout lower than the current one can be considered safe (and hence is usable as a default). Jan
On May 19, 2016 7:36 PM, Jan Beulich <JBeulich@suse.com> wrote: > >>> On 19.05.16 at 13:26, <quan.xu@intel.com> wrote: > > On May 19, 2016 2:13 PM, Jan Beulich <jbeulich@suse.com> wrote: > >> >>> "Xu, Quan" <quan.xu@intel.com> 05/19/16 3:35 AM >>> > >> >On May 19, 2016 8:33 AM, Tian, Kevin <kevin.tian@intel.com> wrote: > >> >> A single default value for both IOMMU-side and device-side is > >> >> anyway not optimal. What about introducing a new knob e.g. > >> >> vtd_qi_device_timeout specifically for device-side flush while > >> >> using vtd_qi_timeout for other places? If device-side timeout is > >> >> not specified, it is > >> then default to vtd_qi_timeout. > >> > >> There should imo be a single command line option, allowing for two > >> values to be passed (e.g. comma-separated). > > > > As mentioned, 1 ms is enough for VT-d IOTLB/Context/IEC invalidation, > > so we are no need to increasing the value of timeout or introduce a > > boot-time changed parameter. > > What about a constant (e.g. a macro), 1 ms, for VT-d IOTLB/Context/IEC > > invalidation timeout. > > If you're absolutely certain no-one will ever find a need to increase that value - > sure. > Sure. Also this was mentioned in Kevin's ' Revisit VT-d asynchronous flush issue ' , http://lists.xenproject.org/archives/html/xen-devel/2015-11/msg00041.html """-For context/iotlb/iec flush, our measurements show worst cases <10us. We also confirmed with hardware team, that 1ms is large enough for IOMMU internal flush. """ > > For Device-TLB invalidation, we can introduce 'vtd_qi_device_timeout', > > which is boot-time changed, and 1 ms by default. > > One question is whether making this a VT-d specific command line option is a > good idea: Other IOMMU implementations ought to be in need of doing > device IOTLB invalidation too, at least sooner or later. > This is indeed farsighted. Agreed. > The other question is whether any timeout lower than the current one can be > considered safe (and hence is usable as a default). > Actually the criticality, 1 ms, is from hardware team. IOW, Our hardware team confirmed that 1ms should be enough for integrated PCI devices with ATS. for discrete PCI devices with ATS, it's uncertain whether 1ms or 10ms is too restrictive to them, but there are only a few devices now in the market. Quan
diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown index ca77e3b..0b83068 100644 --- a/docs/misc/xen-command-line.markdown +++ b/docs/misc/xen-command-line.markdown @@ -1532,6 +1532,16 @@ Note that if **watchdog** option is also specified vpmu will be turned off. As the virtualisation is not 100% safe, don't use the vpmu flag on production systems (see http://xenbits.xen.org/xsa/advisory-163.html)! +### vtd\_qi\_timeout (VT-d) +> `= <integer>` + +> Default: `1` + +Specify the timeout of the VT-d Queued Invalidation in milliseconds. + +By default, the timeout is 1ms. When you see error 'Queue invalidate wait +descriptor timed out', try increasing this value. + ### watchdog > `= force | <boolean>` diff --git a/xen/drivers/passthrough/vtd/qinval.c b/xen/drivers/passthrough/vtd/qinval.c index b81b0bd..52ba2c2 100644 --- a/xen/drivers/passthrough/vtd/qinval.c +++ b/xen/drivers/passthrough/vtd/qinval.c @@ -28,6 +28,11 @@ #include "vtd.h" #include "extern.h" +static unsigned int __read_mostly vtd_qi_timeout = 1; +integer_param("vtd_qi_timeout", vtd_qi_timeout); + +#define IOMMU_QI_TIMEOUT (vtd_qi_timeout * MILLISECS(1)) + static void print_qi_regs(struct iommu *iommu) { u64 val; @@ -130,10 +135,10 @@ static void queue_invalidate_iotlb(struct iommu *iommu, spin_unlock_irqrestore(&iommu->register_lock, flags); } -static int queue_invalidate_wait(struct iommu *iommu, +static int __must_check queue_invalidate_wait(struct iommu *iommu, u8 iflag, u8 sw, u8 fn) { - s_time_t start_time; + s_time_t timeout; volatile u32 poll_slot = QINVAL_STAT_INIT; unsigned int index; unsigned long flags; @@ -164,13 +169,15 @@ static int queue_invalidate_wait(struct iommu *iommu, if ( sw ) { /* In case all wait descriptor writes to same addr with same data */ - start_time = NOW(); + timeout = NOW() + IOMMU_QI_TIMEOUT; while ( poll_slot != QINVAL_STAT_DONE ) { - if ( NOW() > (start_time + DMAR_OPERATION_TIMEOUT) ) + if ( NOW() > timeout ) { print_qi_regs(iommu); - panic("queue invalidate wait descriptor was not executed"); + printk(XENLOG_WARNING VTDPREFIX + "Queue invalidate wait descriptor timed out.\n"); + return -ETIMEDOUT; } cpu_relax(); }
The parameter 'vtd_qi_timeout' specifies the timeout of the VT-d Queued Invalidation in milliseconds. By default, the timeout is 1ms, which can be boot-time changed. Add a __must_check annotation. The followup patch titled 'VT-d IOTLB/Context/IEC flush issue' addresses the __mustcheck. That is the other callers of this routine (two or three levels up) ignore the return code. This patch does not address this but the other does. Signed-off-by: Quan Xu <quan.xu@intel.com> --- docs/misc/xen-command-line.markdown | 10 ++++++++++ xen/drivers/passthrough/vtd/qinval.c | 17 ++++++++++++----- 2 files changed, 22 insertions(+), 5 deletions(-)