Message ID | 20160803160037.ydi3ztjgjtbygxg6@mac (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
>>> On 03.08.16 at 18:00, <roger.pau@citrix.com> wrote: > On Wed, Aug 03, 2016 at 09:37:41AM -0600, Jan Beulich wrote: >> >>> On 03.08.16 at 17:28, <george.dunlap@citrix.com> wrote: >> > On 03/08/16 16:25, Jan Beulich wrote: >> >>>>> On 03.08.16 at 17:11, <George.Dunlap@eu.citrix.com> wrote: >> >>> On Tue, Aug 2, 2016 at 5:12 PM, Jan Beulich <JBeulich@suse.com> wrote: >> >>>>>>> On 02.08.16 at 17:49, <roger.pau@citrix.com> wrote: >> >>>>> On Tue, Aug 02, 2016 at 11:47:22AM +0200, Roger Pau Monne wrote: >> >>>>>> On Fri, Jul 29, 2016 at 05:47:24PM +0100, Andrew Cooper wrote: >> >>>>>>> As this is for the construction of dom0, it would be better to take a >> >>>>>>> preemptible pointer, loop in construct_dom0(), with a >> >>>>>>> process_pending_softirqs() call in between. >> >>>>>> >> >>>>>> Now fixed. >> >>>>> >> >>>>> Hm, I have to stand corrected, using hypercall_preempt_check (as >> >>>>> any of the *_set_allocation function use), is not safe at this point: >> >>>>> >> >>>>> (XEN) ----[ Xen-4.8-unstable x86_64 debug=y Tainted: C ]---- >> >>>>> (XEN) CPU: 0 >> >>>>> (XEN) RIP: e008:[<ffff82d08022fd47>] >> >>> hap.c#local_events_need_delivery+0x27/0x40 >> >>>>> (XEN) RFLAGS: 0000000000010246 CONTEXT: hypervisor >> >>>>> (XEN) rax: 0000000000000000 rbx: ffff83023f5a5000 rcx: ffff82d080312900 >> >>>>> (XEN) rdx: 0000000000000001 rsi: ffff83023f5a56c8 rdi: ffff8300b213d000 >> >>>>> (XEN) rbp: ffff82d080307cc8 rsp: ffff82d080307cc8 r8: 0180000000000000 >> >>>>> (XEN) r9: 0000000000000000 r10: 0000000000247000 r11: ffff82d08029a5b0 >> >>>>> (XEN) r12: 0000000000000011 r13: 00000000000023ac r14: ffff82d080307d4c >> >>>>> (XEN) r15: ffff83023f5a56c8 cr0: 000000008005003b cr4: 00000000001526e0 >> >>>>> (XEN) cr3: 00000000b20fc000 cr2: 0000000000000000 >> >>>>> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008 >> >>>>> (XEN) Xen code around <ffff82d08022fd47> >> >>> (hap.c#local_events_need_delivery+0x27/0x40): >> >>>>> (XEN) 0d ad fa ff 48 8b 47 08 <80> 38 00 74 09 80 78 01 00 0f 94 c0 eb 02 31 > >> >>> c0 >> >>>>> (XEN) Xen stack trace from rsp=ffff82d080307cc8: >> >>>>> (XEN) ffff82d080307d08 ffff82d08022fc47 0000000000000000 > ffff83023f5a5000 >> >>>>> (XEN) ffff83023f5a5648 0000000000000000 ffff82d080307d4c > 0000000000002400 >> >>>>> (XEN) ffff82d080307d38 ffff82d08020008c 00000000000ffffd > ffff8300b1efd000 >> >>>>> (XEN) ffff83023f5a5000 ffff82d080307d4c ffff82d080307d78 > ffff82d0802cad30 >> >>>>> (XEN) 0000000000203000 ffff83023f5a5000 ffff82d0802bf860 > 0000000000000000 >> >>>>> (XEN) 0000000000000001 ffff83000008bef0 ffff82d080307de8 > ffff82d0802c91e0 >> >>>>> (XEN) ffff82d080307de8 ffff82d080143900 ffff82d080307de8 > 0000000000000000 >> >>>>> (XEN) ffff83000008bf00 ffff82d0802eb480 ffff82d080307dc4 > ffff82d08028b1cd >> >>>>> (XEN) ffff83000008bf00 0000000000000000 0000000000000001 > ffff83023f5a5000 >> >>>>> (XEN) ffff82d080307f08 ffff82d0802bf0c9 0000000000000000 > 0000000000000000 >> >>>>> (XEN) 0000000000000000 ffff82d080307f18 ffff83000008bee0 > 0000000000000001 >> >>>>> (XEN) 0000000000000001 0000000000000001 0000000000000000 > 0000000000100000 >> >>>>> (XEN) 0000000000000001 0000000000247000 ffff83000008bef4 > 0000000000100000 >> >>>>> (XEN) ffff830100000000 0000000000247001 0000000000000014 > 0000000100000000 >> >>>>> (XEN) ffff8300ffffffec ffff83000008bef0 ffff82d0802e0640 > ffff83000008bfb0 >> >>>>> (XEN) 0000000000000000 0000000000000000 0000000000000111 > 0000000800000000 >> >>>>> (XEN) 000000010000006e 0000000000000003 00000000000002f8 > 0000000000000000 >> >>>>> (XEN) 00000000ad5c0bd0 0000000000000000 0000000000000001 > 0000000000000008 >> >>>>> (XEN) 0000000000000000 ffff82d080100073 0000000000000000 > 0000000000000000 >> >>>>> (XEN) 0000000000000000 0000000000000000 0000000000000000 > 0000000000000000 >> >>>>> (XEN) Xen call trace: >> >>>>> (XEN) [<ffff82d08022fd47>] hap.c#local_events_need_delivery+0x27/0x40 >> >>>>> (XEN) [<ffff82d08022fc47>] hap_set_allocation+0x107/0x130 >> >>>>> (XEN) [<ffff82d08020008c>] paging_set_allocation+0x4c/0x80 >> >>>>> (XEN) [<ffff82d0802cad30>] domain_build.c#hvm_setup_p2m+0x70/0x1a0 >> >>>>> (XEN) [<ffff82d0802c91e0>] domain_build.c#construct_dom0_hvm+0x60/0x120 >> >>>>> (XEN) [<ffff82d0802bf0c9>] __start_xen+0x1ea9/0x23a0 >> >>>>> (XEN) [<ffff82d080100073>] __high_start+0x53/0x60 >> >>>>> (XEN) >> >>>>> (XEN) Pagetable walk from 0000000000000000: >> >>>> >> >>>> Sadly you don't make clear what pointer it is that is NULL at that point. >> >>> >> >>> It sounds from what he says in the following paragraph like current is > NULL. >> >> >> >> I don't recall us re-setting current to this late in the boot process. >> >> Even during early boot we set it to a bogus non-NULL value rather >> >> than NULL. >> >> >> >>>>> I've tried setting current to d->vcpu[0], but that just makes the call to >> >>>>> hypercall_preempt_check crash in some scheduler assert. In any case, I've >> >>>>> added the preempt parameter to the paging_set_allocation function, but I >> >>>>> don't plan to use it in the domain builder for the time being. Does that >> >>>>> sound right? >> >>>> >> >>>> Not really, new huge latency issues like this shouldn't be reintroduced; >> >>>> we've been fighting hard to get rid of those (and we still occasionally >> >>>> find some no-one had noticed before). >> >>> >> >>> You mean latency in processing softirqs? >> >>> >> >>> Maybe what we need to do is to make local_events_need_delivery() safe >> >>> to call at this point by having it return 0 if current is NULL rather >> >>> than crashing? >> >> >> >> That would have the same effect - no softirq processing, and hence >> >> possible time issues on huge systems. >> > >> > No, local_events_delivery() only checks to see if the current vcpu has >> > outstanding virtual interrupts. The other half of >> > hypercall_preempt_check() checks for softirqs, which doesn't appear to >> > rely on having current available. >> >> Good point, but >> - current should nevertheless not be NULL (afaict at least), >> - hypercall_preempt_check() is probably the wrong construct, >> as we're no in a hypercall. >> The latter of course could be addressed by, as you did suggest, >> some refinement to one of the pieces it's being made up from, >> but I'm not sure that would be better than perhaps making its >> invocation conditional (with some better alternative in the "else" >> case) in hap_set_allocation(). Not the least because any >> adjustment to hypercall_preempt_check() itself would affect all >> other of its users. > > I had added the following patch to my queue in order to fix this: > > --- > xen/x86: allow calling hypercall_preempt_check with the idle domain > > This allows using hypercall_preempt_check while building Dom0. > > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> > > diff --git a/xen/include/asm-x86/event.h b/xen/include/asm-x86/event.h > index a82062e..d55a8bd 100644 > --- a/xen/include/asm-x86/event.h > +++ b/xen/include/asm-x86/event.h > @@ -23,6 +23,10 @@ int hvm_local_events_need_delivery(struct vcpu *v); > static inline int local_events_need_delivery(void) > { > struct vcpu *v = current; > + > + if ( is_idle_vcpu(v) ) > + return 0; As said, I think it would be better to not add it here, unless there is a significant amount of other calls into here from idle vCPU-s with your changes. Jan
On Wed, Aug 03, 2016 at 10:15:51AM -0600, Jan Beulich wrote: > >>> On 03.08.16 at 18:00, <roger.pau@citrix.com> wrote: > > On Wed, Aug 03, 2016 at 09:37:41AM -0600, Jan Beulich wrote: > >> >>> On 03.08.16 at 17:28, <george.dunlap@citrix.com> wrote: > >> > On 03/08/16 16:25, Jan Beulich wrote: > >> >>>>> On 03.08.16 at 17:11, <George.Dunlap@eu.citrix.com> wrote: > >> >>> On Tue, Aug 2, 2016 at 5:12 PM, Jan Beulich <JBeulich@suse.com> wrote: > >> >>>>>>> On 02.08.16 at 17:49, <roger.pau@citrix.com> wrote: > >> >>>>> On Tue, Aug 02, 2016 at 11:47:22AM +0200, Roger Pau Monne wrote: > >> >>>>>> On Fri, Jul 29, 2016 at 05:47:24PM +0100, Andrew Cooper wrote: > >> >>>>>>> As this is for the construction of dom0, it would be better to take a > >> >>>>>>> preemptible pointer, loop in construct_dom0(), with a > >> >>>>>>> process_pending_softirqs() call in between. > >> >>>>>> > >> >>>>>> Now fixed. > >> >>>>> > >> >>>>> Hm, I have to stand corrected, using hypercall_preempt_check (as > >> >>>>> any of the *_set_allocation function use), is not safe at this point: > >> >>>>> > >> >>>>> (XEN) ----[ Xen-4.8-unstable x86_64 debug=y Tainted: C ]---- > >> >>>>> (XEN) CPU: 0 > >> >>>>> (XEN) RIP: e008:[<ffff82d08022fd47>] > >> >>> hap.c#local_events_need_delivery+0x27/0x40 > >> >>>>> (XEN) RFLAGS: 0000000000010246 CONTEXT: hypervisor > >> >>>>> (XEN) rax: 0000000000000000 rbx: ffff83023f5a5000 rcx: ffff82d080312900 > >> >>>>> (XEN) rdx: 0000000000000001 rsi: ffff83023f5a56c8 rdi: ffff8300b213d000 > >> >>>>> (XEN) rbp: ffff82d080307cc8 rsp: ffff82d080307cc8 r8: 0180000000000000 > >> >>>>> (XEN) r9: 0000000000000000 r10: 0000000000247000 r11: ffff82d08029a5b0 > >> >>>>> (XEN) r12: 0000000000000011 r13: 00000000000023ac r14: ffff82d080307d4c > >> >>>>> (XEN) r15: ffff83023f5a56c8 cr0: 000000008005003b cr4: 00000000001526e0 > >> >>>>> (XEN) cr3: 00000000b20fc000 cr2: 0000000000000000 > >> >>>>> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008 > >> >>>>> (XEN) Xen code around <ffff82d08022fd47> > >> >>> (hap.c#local_events_need_delivery+0x27/0x40): > >> >>>>> (XEN) 0d ad fa ff 48 8b 47 08 <80> 38 00 74 09 80 78 01 00 0f 94 c0 eb 02 31 > > > >> >>> c0 > >> >>>>> (XEN) Xen stack trace from rsp=ffff82d080307cc8: > >> >>>>> (XEN) ffff82d080307d08 ffff82d08022fc47 0000000000000000 > > ffff83023f5a5000 > >> >>>>> (XEN) ffff83023f5a5648 0000000000000000 ffff82d080307d4c > > 0000000000002400 > >> >>>>> (XEN) ffff82d080307d38 ffff82d08020008c 00000000000ffffd > > ffff8300b1efd000 > >> >>>>> (XEN) ffff83023f5a5000 ffff82d080307d4c ffff82d080307d78 > > ffff82d0802cad30 > >> >>>>> (XEN) 0000000000203000 ffff83023f5a5000 ffff82d0802bf860 > > 0000000000000000 > >> >>>>> (XEN) 0000000000000001 ffff83000008bef0 ffff82d080307de8 > > ffff82d0802c91e0 > >> >>>>> (XEN) ffff82d080307de8 ffff82d080143900 ffff82d080307de8 > > 0000000000000000 > >> >>>>> (XEN) ffff83000008bf00 ffff82d0802eb480 ffff82d080307dc4 > > ffff82d08028b1cd > >> >>>>> (XEN) ffff83000008bf00 0000000000000000 0000000000000001 > > ffff83023f5a5000 > >> >>>>> (XEN) ffff82d080307f08 ffff82d0802bf0c9 0000000000000000 > > 0000000000000000 > >> >>>>> (XEN) 0000000000000000 ffff82d080307f18 ffff83000008bee0 > > 0000000000000001 > >> >>>>> (XEN) 0000000000000001 0000000000000001 0000000000000000 > > 0000000000100000 > >> >>>>> (XEN) 0000000000000001 0000000000247000 ffff83000008bef4 > > 0000000000100000 > >> >>>>> (XEN) ffff830100000000 0000000000247001 0000000000000014 > > 0000000100000000 > >> >>>>> (XEN) ffff8300ffffffec ffff83000008bef0 ffff82d0802e0640 > > ffff83000008bfb0 > >> >>>>> (XEN) 0000000000000000 0000000000000000 0000000000000111 > > 0000000800000000 > >> >>>>> (XEN) 000000010000006e 0000000000000003 00000000000002f8 > > 0000000000000000 > >> >>>>> (XEN) 00000000ad5c0bd0 0000000000000000 0000000000000001 > > 0000000000000008 > >> >>>>> (XEN) 0000000000000000 ffff82d080100073 0000000000000000 > > 0000000000000000 > >> >>>>> (XEN) 0000000000000000 0000000000000000 0000000000000000 > > 0000000000000000 > >> >>>>> (XEN) Xen call trace: > >> >>>>> (XEN) [<ffff82d08022fd47>] hap.c#local_events_need_delivery+0x27/0x40 > >> >>>>> (XEN) [<ffff82d08022fc47>] hap_set_allocation+0x107/0x130 > >> >>>>> (XEN) [<ffff82d08020008c>] paging_set_allocation+0x4c/0x80 > >> >>>>> (XEN) [<ffff82d0802cad30>] domain_build.c#hvm_setup_p2m+0x70/0x1a0 > >> >>>>> (XEN) [<ffff82d0802c91e0>] domain_build.c#construct_dom0_hvm+0x60/0x120 > >> >>>>> (XEN) [<ffff82d0802bf0c9>] __start_xen+0x1ea9/0x23a0 > >> >>>>> (XEN) [<ffff82d080100073>] __high_start+0x53/0x60 > >> >>>>> (XEN) > >> >>>>> (XEN) Pagetable walk from 0000000000000000: > >> >>>> > >> >>>> Sadly you don't make clear what pointer it is that is NULL at that point. > >> >>> > >> >>> It sounds from what he says in the following paragraph like current is > > NULL. > >> >> > >> >> I don't recall us re-setting current to this late in the boot process. > >> >> Even during early boot we set it to a bogus non-NULL value rather > >> >> than NULL. > >> >> > >> >>>>> I've tried setting current to d->vcpu[0], but that just makes the call to > >> >>>>> hypercall_preempt_check crash in some scheduler assert. In any case, I've > >> >>>>> added the preempt parameter to the paging_set_allocation function, but I > >> >>>>> don't plan to use it in the domain builder for the time being. Does that > >> >>>>> sound right? > >> >>>> > >> >>>> Not really, new huge latency issues like this shouldn't be reintroduced; > >> >>>> we've been fighting hard to get rid of those (and we still occasionally > >> >>>> find some no-one had noticed before). > >> >>> > >> >>> You mean latency in processing softirqs? > >> >>> > >> >>> Maybe what we need to do is to make local_events_need_delivery() safe > >> >>> to call at this point by having it return 0 if current is NULL rather > >> >>> than crashing? > >> >> > >> >> That would have the same effect - no softirq processing, and hence > >> >> possible time issues on huge systems. > >> > > >> > No, local_events_delivery() only checks to see if the current vcpu has > >> > outstanding virtual interrupts. The other half of > >> > hypercall_preempt_check() checks for softirqs, which doesn't appear to > >> > rely on having current available. > >> > >> Good point, but > >> - current should nevertheless not be NULL (afaict at least), > >> - hypercall_preempt_check() is probably the wrong construct, > >> as we're no in a hypercall. > >> The latter of course could be addressed by, as you did suggest, > >> some refinement to one of the pieces it's being made up from, > >> but I'm not sure that would be better than perhaps making its > >> invocation conditional (with some better alternative in the "else" > >> case) in hap_set_allocation(). Not the least because any > >> adjustment to hypercall_preempt_check() itself would affect all > >> other of its users. > > > > I had added the following patch to my queue in order to fix this: > > > > --- > > xen/x86: allow calling hypercall_preempt_check with the idle domain > > > > This allows using hypercall_preempt_check while building Dom0. > > > > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> > > > > diff --git a/xen/include/asm-x86/event.h b/xen/include/asm-x86/event.h > > index a82062e..d55a8bd 100644 > > --- a/xen/include/asm-x86/event.h > > +++ b/xen/include/asm-x86/event.h > > @@ -23,6 +23,10 @@ int hvm_local_events_need_delivery(struct vcpu *v); > > static inline int local_events_need_delivery(void) > > { > > struct vcpu *v = current; > > + > > + if ( is_idle_vcpu(v) ) > > + return 0; > > As said, I think it would be better to not add it here, unless there > is a significant amount of other calls into here from idle vCPU-s > with your changes. No, the only functions that I use that call hypercall_preempt_check are the _set_allocation ones. I would like to add an ASSERT here to make sure local_events_need_delivery is not called with current == idle_vcpu. In any case, I would go with route 2 and modify _set_allocation to call softirq_pending instead of hypercall_preempt_check if current == idle_vcpu. This should solve the issue and doesn't involve changing hypercall_preempt_check itself. Roger.
>>> On 03.08.16 at 18:24, <roger.pau@citrix.com> wrote: > On Wed, Aug 03, 2016 at 10:15:51AM -0600, Jan Beulich wrote: >> >>> On 03.08.16 at 18:00, <roger.pau@citrix.com> wrote: >> > --- a/xen/include/asm-x86/event.h >> > +++ b/xen/include/asm-x86/event.h >> > @@ -23,6 +23,10 @@ int hvm_local_events_need_delivery(struct vcpu *v); >> > static inline int local_events_need_delivery(void) >> > { >> > struct vcpu *v = current; >> > + >> > + if ( is_idle_vcpu(v) ) >> > + return 0; >> >> As said, I think it would be better to not add it here, unless there >> is a significant amount of other calls into here from idle vCPU-s >> with your changes. > > No, the only functions that I use that call hypercall_preempt_check are the > _set_allocation ones. I would like to add an ASSERT here to make sure > local_events_need_delivery is not called with current == idle_vcpu. That's fine with me. > In any case, I would go with route 2 and modify _set_allocation to call > softirq_pending instead of hypercall_preempt_check if current == idle_vcpu. > This should solve the issue and doesn't involve changing > hypercall_preempt_check itself. Thanks. Jan
diff --git a/xen/include/asm-x86/event.h b/xen/include/asm-x86/event.h index a82062e..d55a8bd 100644 --- a/xen/include/asm-x86/event.h +++ b/xen/include/asm-x86/event.h @@ -23,6 +23,10 @@ int hvm_local_events_need_delivery(struct vcpu *v); static inline int local_events_need_delivery(void) { struct vcpu *v = current; + + if ( is_idle_vcpu(v) ) + return 0; + return (has_hvm_container_vcpu(v) ? hvm_local_events_need_delivery(v) : (vcpu_info(v, evtchn_upcall_pending) && !vcpu_info(v, evtchn_upcall_mask)));