diff mbox

[RFC,01/12] x86/paging: introduce paging_set_allocation

Message ID 20160803160037.ydi3ztjgjtbygxg6@mac (mailing list archive)
State New, archived
Headers show

Commit Message

Roger Pau Monne Aug. 3, 2016, 4 p.m. UTC
On Wed, Aug 03, 2016 at 09:37:41AM -0600, Jan Beulich wrote:
> >>> On 03.08.16 at 17:28, <george.dunlap@citrix.com> wrote:
> > On 03/08/16 16:25, Jan Beulich wrote:
> >>>>> On 03.08.16 at 17:11, <George.Dunlap@eu.citrix.com> wrote:
> >>> On Tue, Aug 2, 2016 at 5:12 PM, Jan Beulich <JBeulich@suse.com> wrote:
> >>>>>>> On 02.08.16 at 17:49, <roger.pau@citrix.com> wrote:
> >>>>> On Tue, Aug 02, 2016 at 11:47:22AM +0200, Roger Pau Monne wrote:
> >>>>>> On Fri, Jul 29, 2016 at 05:47:24PM +0100, Andrew Cooper wrote:
> >>>>>>> As this is for the construction of dom0, it would be better to take a
> >>>>>>> preemptible pointer, loop in construct_dom0(), with a
> >>>>>>> process_pending_softirqs() call in between.
> >>>>>>
> >>>>>> Now fixed.
> >>>>>
> >>>>> Hm, I have to stand corrected, using hypercall_preempt_check (as
> >>>>> any of the *_set_allocation function use), is not safe at this point:
> >>>>>
> >>>>> (XEN) ----[ Xen-4.8-unstable  x86_64  debug=y  Tainted:    C  ]----
> >>>>> (XEN) CPU:    0
> >>>>> (XEN) RIP:    e008:[<ffff82d08022fd47>] 
> >>> hap.c#local_events_need_delivery+0x27/0x40
> >>>>> (XEN) RFLAGS: 0000000000010246   CONTEXT: hypervisor
> >>>>> (XEN) rax: 0000000000000000   rbx: ffff83023f5a5000   rcx: ffff82d080312900
> >>>>> (XEN) rdx: 0000000000000001   rsi: ffff83023f5a56c8   rdi: ffff8300b213d000
> >>>>> (XEN) rbp: ffff82d080307cc8   rsp: ffff82d080307cc8   r8:  0180000000000000
> >>>>> (XEN) r9:  0000000000000000   r10: 0000000000247000   r11: ffff82d08029a5b0
> >>>>> (XEN) r12: 0000000000000011   r13: 00000000000023ac   r14: ffff82d080307d4c
> >>>>> (XEN) r15: ffff83023f5a56c8   cr0: 000000008005003b   cr4: 00000000001526e0
> >>>>> (XEN) cr3: 00000000b20fc000   cr2: 0000000000000000
> >>>>> (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
> >>>>> (XEN) Xen code around <ffff82d08022fd47> 
> >>> (hap.c#local_events_need_delivery+0x27/0x40):
> >>>>> (XEN)  0d ad fa ff 48 8b 47 08 <80> 38 00 74 09 80 78 01 00 0f 94 c0 eb 02 31 
> >>> c0
> >>>>> (XEN) Xen stack trace from rsp=ffff82d080307cc8:
> >>>>> (XEN)    ffff82d080307d08 ffff82d08022fc47 0000000000000000 ffff83023f5a5000
> >>>>> (XEN)    ffff83023f5a5648 0000000000000000 ffff82d080307d4c 0000000000002400
> >>>>> (XEN)    ffff82d080307d38 ffff82d08020008c 00000000000ffffd ffff8300b1efd000
> >>>>> (XEN)    ffff83023f5a5000 ffff82d080307d4c ffff82d080307d78 ffff82d0802cad30
> >>>>> (XEN)    0000000000203000 ffff83023f5a5000 ffff82d0802bf860 0000000000000000
> >>>>> (XEN)    0000000000000001 ffff83000008bef0 ffff82d080307de8 ffff82d0802c91e0
> >>>>> (XEN)    ffff82d080307de8 ffff82d080143900 ffff82d080307de8 0000000000000000
> >>>>> (XEN)    ffff83000008bf00 ffff82d0802eb480 ffff82d080307dc4 ffff82d08028b1cd
> >>>>> (XEN)    ffff83000008bf00 0000000000000000 0000000000000001 ffff83023f5a5000
> >>>>> (XEN)    ffff82d080307f08 ffff82d0802bf0c9 0000000000000000 0000000000000000
> >>>>> (XEN)    0000000000000000 ffff82d080307f18 ffff83000008bee0 0000000000000001
> >>>>> (XEN)    0000000000000001 0000000000000001 0000000000000000 0000000000100000
> >>>>> (XEN)    0000000000000001 0000000000247000 ffff83000008bef4 0000000000100000
> >>>>> (XEN)    ffff830100000000 0000000000247001 0000000000000014 0000000100000000
> >>>>> (XEN)    ffff8300ffffffec ffff83000008bef0 ffff82d0802e0640 ffff83000008bfb0
> >>>>> (XEN)    0000000000000000 0000000000000000 0000000000000111 0000000800000000
> >>>>> (XEN)    000000010000006e 0000000000000003 00000000000002f8 0000000000000000
> >>>>> (XEN)    00000000ad5c0bd0 0000000000000000 0000000000000001 0000000000000008
> >>>>> (XEN)    0000000000000000 ffff82d080100073 0000000000000000 0000000000000000
> >>>>> (XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
> >>>>> (XEN) Xen call trace:
> >>>>> (XEN)    [<ffff82d08022fd47>] hap.c#local_events_need_delivery+0x27/0x40
> >>>>> (XEN)    [<ffff82d08022fc47>] hap_set_allocation+0x107/0x130
> >>>>> (XEN)    [<ffff82d08020008c>] paging_set_allocation+0x4c/0x80
> >>>>> (XEN)    [<ffff82d0802cad30>] domain_build.c#hvm_setup_p2m+0x70/0x1a0
> >>>>> (XEN)    [<ffff82d0802c91e0>] domain_build.c#construct_dom0_hvm+0x60/0x120
> >>>>> (XEN)    [<ffff82d0802bf0c9>] __start_xen+0x1ea9/0x23a0
> >>>>> (XEN)    [<ffff82d080100073>] __high_start+0x53/0x60
> >>>>> (XEN)
> >>>>> (XEN) Pagetable walk from 0000000000000000:
> >>>>
> >>>> Sadly you don't make clear what pointer it is that is NULL at that point.
> >>>
> >>> It sounds from what he says in the following paragraph like current is NULL.
> >> 
> >> I don't recall us re-setting current to this late in the boot process.
> >> Even during early boot we set it to a bogus non-NULL value rather
> >> than NULL.
> >> 
> >>>>> I've tried setting current to d->vcpu[0], but that just makes the call to
> >>>>> hypercall_preempt_check crash in some scheduler assert. In any case, I've
> >>>>> added the preempt parameter to the paging_set_allocation function, but I
> >>>>> don't plan to use it in the domain builder for the time being. Does that
> >>>>> sound right?
> >>>>
> >>>> Not really, new huge latency issues like this shouldn't be reintroduced;
> >>>> we've been fighting hard to get rid of those (and we still occasionally
> >>>> find some no-one had noticed before).
> >>>
> >>> You mean latency in processing softirqs?
> >>>
> >>> Maybe what we need to do is to make local_events_need_delivery() safe
> >>> to call at this point by having it return 0 if current is NULL rather
> >>> than crashing?
> >> 
> >> That would have the same effect - no softirq processing, and hence
> >> possible time issues on huge systems.
> > 
> > No, local_events_delivery() only checks to see if the current vcpu has
> > outstanding virtual interrupts.  The other half of
> > hypercall_preempt_check() checks for softirqs, which doesn't appear to
> > rely on having current available.
> 
> Good point, but
> - current should nevertheless not be NULL (afaict at least),
> - hypercall_preempt_check() is probably the wrong construct,
>   as we're no in a hypercall.
> The latter of course could be addressed by, as you did suggest,
> some refinement to one of the pieces it's being made up from,
> but I'm not sure that would be better than perhaps making its
> invocation conditional (with some better alternative in the "else"
> case) in hap_set_allocation(). Not the least because any
> adjustment to hypercall_preempt_check() itself would affect all
> other of its users.

I had added the following patch to my queue in order to fix this:

---
xen/x86: allow calling hypercall_preempt_check with the idle domain

This allows using hypercall_preempt_check while building Dom0.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>


---

But seeing your comments I now wonder whether that's appropriate. Is there 
anyway in Xen to know whether Xen is in hypercall context or not?

Another way to fix this would be to change both {hap/sh}_set_allocation 
functions to only call hypercall_check_preempt if current != idle_domain, 
and in the idle domain case just call softirq_pending (which is the same 
that the above change achieves).

Roger.

Comments

Jan Beulich Aug. 3, 2016, 4:15 p.m. UTC | #1
>>> On 03.08.16 at 18:00, <roger.pau@citrix.com> wrote:
> On Wed, Aug 03, 2016 at 09:37:41AM -0600, Jan Beulich wrote:
>> >>> On 03.08.16 at 17:28, <george.dunlap@citrix.com> wrote:
>> > On 03/08/16 16:25, Jan Beulich wrote:
>> >>>>> On 03.08.16 at 17:11, <George.Dunlap@eu.citrix.com> wrote:
>> >>> On Tue, Aug 2, 2016 at 5:12 PM, Jan Beulich <JBeulich@suse.com> wrote:
>> >>>>>>> On 02.08.16 at 17:49, <roger.pau@citrix.com> wrote:
>> >>>>> On Tue, Aug 02, 2016 at 11:47:22AM +0200, Roger Pau Monne wrote:
>> >>>>>> On Fri, Jul 29, 2016 at 05:47:24PM +0100, Andrew Cooper wrote:
>> >>>>>>> As this is for the construction of dom0, it would be better to take a
>> >>>>>>> preemptible pointer, loop in construct_dom0(), with a
>> >>>>>>> process_pending_softirqs() call in between.
>> >>>>>>
>> >>>>>> Now fixed.
>> >>>>>
>> >>>>> Hm, I have to stand corrected, using hypercall_preempt_check (as
>> >>>>> any of the *_set_allocation function use), is not safe at this point:
>> >>>>>
>> >>>>> (XEN) ----[ Xen-4.8-unstable  x86_64  debug=y  Tainted:    C  ]----
>> >>>>> (XEN) CPU:    0
>> >>>>> (XEN) RIP:    e008:[<ffff82d08022fd47>] 
>> >>> hap.c#local_events_need_delivery+0x27/0x40
>> >>>>> (XEN) RFLAGS: 0000000000010246   CONTEXT: hypervisor
>> >>>>> (XEN) rax: 0000000000000000   rbx: ffff83023f5a5000   rcx: ffff82d080312900
>> >>>>> (XEN) rdx: 0000000000000001   rsi: ffff83023f5a56c8   rdi: ffff8300b213d000
>> >>>>> (XEN) rbp: ffff82d080307cc8   rsp: ffff82d080307cc8   r8:  0180000000000000
>> >>>>> (XEN) r9:  0000000000000000   r10: 0000000000247000   r11: ffff82d08029a5b0
>> >>>>> (XEN) r12: 0000000000000011   r13: 00000000000023ac   r14: ffff82d080307d4c
>> >>>>> (XEN) r15: ffff83023f5a56c8   cr0: 000000008005003b   cr4: 00000000001526e0
>> >>>>> (XEN) cr3: 00000000b20fc000   cr2: 0000000000000000
>> >>>>> (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
>> >>>>> (XEN) Xen code around <ffff82d08022fd47> 
>> >>> (hap.c#local_events_need_delivery+0x27/0x40):
>> >>>>> (XEN)  0d ad fa ff 48 8b 47 08 <80> 38 00 74 09 80 78 01 00 0f 94 c0 eb 02 31 
> 
>> >>> c0
>> >>>>> (XEN) Xen stack trace from rsp=ffff82d080307cc8:
>> >>>>> (XEN)    ffff82d080307d08 ffff82d08022fc47 0000000000000000 
> ffff83023f5a5000
>> >>>>> (XEN)    ffff83023f5a5648 0000000000000000 ffff82d080307d4c 
> 0000000000002400
>> >>>>> (XEN)    ffff82d080307d38 ffff82d08020008c 00000000000ffffd 
> ffff8300b1efd000
>> >>>>> (XEN)    ffff83023f5a5000 ffff82d080307d4c ffff82d080307d78 
> ffff82d0802cad30
>> >>>>> (XEN)    0000000000203000 ffff83023f5a5000 ffff82d0802bf860 
> 0000000000000000
>> >>>>> (XEN)    0000000000000001 ffff83000008bef0 ffff82d080307de8 
> ffff82d0802c91e0
>> >>>>> (XEN)    ffff82d080307de8 ffff82d080143900 ffff82d080307de8 
> 0000000000000000
>> >>>>> (XEN)    ffff83000008bf00 ffff82d0802eb480 ffff82d080307dc4 
> ffff82d08028b1cd
>> >>>>> (XEN)    ffff83000008bf00 0000000000000000 0000000000000001 
> ffff83023f5a5000
>> >>>>> (XEN)    ffff82d080307f08 ffff82d0802bf0c9 0000000000000000 
> 0000000000000000
>> >>>>> (XEN)    0000000000000000 ffff82d080307f18 ffff83000008bee0 
> 0000000000000001
>> >>>>> (XEN)    0000000000000001 0000000000000001 0000000000000000 
> 0000000000100000
>> >>>>> (XEN)    0000000000000001 0000000000247000 ffff83000008bef4 
> 0000000000100000
>> >>>>> (XEN)    ffff830100000000 0000000000247001 0000000000000014 
> 0000000100000000
>> >>>>> (XEN)    ffff8300ffffffec ffff83000008bef0 ffff82d0802e0640 
> ffff83000008bfb0
>> >>>>> (XEN)    0000000000000000 0000000000000000 0000000000000111 
> 0000000800000000
>> >>>>> (XEN)    000000010000006e 0000000000000003 00000000000002f8 
> 0000000000000000
>> >>>>> (XEN)    00000000ad5c0bd0 0000000000000000 0000000000000001 
> 0000000000000008
>> >>>>> (XEN)    0000000000000000 ffff82d080100073 0000000000000000 
> 0000000000000000
>> >>>>> (XEN)    0000000000000000 0000000000000000 0000000000000000 
> 0000000000000000
>> >>>>> (XEN) Xen call trace:
>> >>>>> (XEN)    [<ffff82d08022fd47>] hap.c#local_events_need_delivery+0x27/0x40
>> >>>>> (XEN)    [<ffff82d08022fc47>] hap_set_allocation+0x107/0x130
>> >>>>> (XEN)    [<ffff82d08020008c>] paging_set_allocation+0x4c/0x80
>> >>>>> (XEN)    [<ffff82d0802cad30>] domain_build.c#hvm_setup_p2m+0x70/0x1a0
>> >>>>> (XEN)    [<ffff82d0802c91e0>] domain_build.c#construct_dom0_hvm+0x60/0x120
>> >>>>> (XEN)    [<ffff82d0802bf0c9>] __start_xen+0x1ea9/0x23a0
>> >>>>> (XEN)    [<ffff82d080100073>] __high_start+0x53/0x60
>> >>>>> (XEN)
>> >>>>> (XEN) Pagetable walk from 0000000000000000:
>> >>>>
>> >>>> Sadly you don't make clear what pointer it is that is NULL at that point.
>> >>>
>> >>> It sounds from what he says in the following paragraph like current is 
> NULL.
>> >> 
>> >> I don't recall us re-setting current to this late in the boot process.
>> >> Even during early boot we set it to a bogus non-NULL value rather
>> >> than NULL.
>> >> 
>> >>>>> I've tried setting current to d->vcpu[0], but that just makes the call to
>> >>>>> hypercall_preempt_check crash in some scheduler assert. In any case, I've
>> >>>>> added the preempt parameter to the paging_set_allocation function, but I
>> >>>>> don't plan to use it in the domain builder for the time being. Does that
>> >>>>> sound right?
>> >>>>
>> >>>> Not really, new huge latency issues like this shouldn't be reintroduced;
>> >>>> we've been fighting hard to get rid of those (and we still occasionally
>> >>>> find some no-one had noticed before).
>> >>>
>> >>> You mean latency in processing softirqs?
>> >>>
>> >>> Maybe what we need to do is to make local_events_need_delivery() safe
>> >>> to call at this point by having it return 0 if current is NULL rather
>> >>> than crashing?
>> >> 
>> >> That would have the same effect - no softirq processing, and hence
>> >> possible time issues on huge systems.
>> > 
>> > No, local_events_delivery() only checks to see if the current vcpu has
>> > outstanding virtual interrupts.  The other half of
>> > hypercall_preempt_check() checks for softirqs, which doesn't appear to
>> > rely on having current available.
>> 
>> Good point, but
>> - current should nevertheless not be NULL (afaict at least),
>> - hypercall_preempt_check() is probably the wrong construct,
>>   as we're no in a hypercall.
>> The latter of course could be addressed by, as you did suggest,
>> some refinement to one of the pieces it's being made up from,
>> but I'm not sure that would be better than perhaps making its
>> invocation conditional (with some better alternative in the "else"
>> case) in hap_set_allocation(). Not the least because any
>> adjustment to hypercall_preempt_check() itself would affect all
>> other of its users.
> 
> I had added the following patch to my queue in order to fix this:
> 
> ---
> xen/x86: allow calling hypercall_preempt_check with the idle domain
> 
> This allows using hypercall_preempt_check while building Dom0.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> 
> diff --git a/xen/include/asm-x86/event.h b/xen/include/asm-x86/event.h
> index a82062e..d55a8bd 100644
> --- a/xen/include/asm-x86/event.h
> +++ b/xen/include/asm-x86/event.h
> @@ -23,6 +23,10 @@ int hvm_local_events_need_delivery(struct vcpu *v);
>  static inline int local_events_need_delivery(void)
>  {
>      struct vcpu *v = current;
> +
> +    if ( is_idle_vcpu(v) )
> +        return 0;

As said, I think it would be better to not add it here, unless there
is a significant amount of other calls into here from idle vCPU-s
with your changes.

Jan
Roger Pau Monne Aug. 3, 2016, 4:24 p.m. UTC | #2
On Wed, Aug 03, 2016 at 10:15:51AM -0600, Jan Beulich wrote:
> >>> On 03.08.16 at 18:00, <roger.pau@citrix.com> wrote:
> > On Wed, Aug 03, 2016 at 09:37:41AM -0600, Jan Beulich wrote:
> >> >>> On 03.08.16 at 17:28, <george.dunlap@citrix.com> wrote:
> >> > On 03/08/16 16:25, Jan Beulich wrote:
> >> >>>>> On 03.08.16 at 17:11, <George.Dunlap@eu.citrix.com> wrote:
> >> >>> On Tue, Aug 2, 2016 at 5:12 PM, Jan Beulich <JBeulich@suse.com> wrote:
> >> >>>>>>> On 02.08.16 at 17:49, <roger.pau@citrix.com> wrote:
> >> >>>>> On Tue, Aug 02, 2016 at 11:47:22AM +0200, Roger Pau Monne wrote:
> >> >>>>>> On Fri, Jul 29, 2016 at 05:47:24PM +0100, Andrew Cooper wrote:
> >> >>>>>>> As this is for the construction of dom0, it would be better to take a
> >> >>>>>>> preemptible pointer, loop in construct_dom0(), with a
> >> >>>>>>> process_pending_softirqs() call in between.
> >> >>>>>>
> >> >>>>>> Now fixed.
> >> >>>>>
> >> >>>>> Hm, I have to stand corrected, using hypercall_preempt_check (as
> >> >>>>> any of the *_set_allocation function use), is not safe at this point:
> >> >>>>>
> >> >>>>> (XEN) ----[ Xen-4.8-unstable  x86_64  debug=y  Tainted:    C  ]----
> >> >>>>> (XEN) CPU:    0
> >> >>>>> (XEN) RIP:    e008:[<ffff82d08022fd47>] 
> >> >>> hap.c#local_events_need_delivery+0x27/0x40
> >> >>>>> (XEN) RFLAGS: 0000000000010246   CONTEXT: hypervisor
> >> >>>>> (XEN) rax: 0000000000000000   rbx: ffff83023f5a5000   rcx: ffff82d080312900
> >> >>>>> (XEN) rdx: 0000000000000001   rsi: ffff83023f5a56c8   rdi: ffff8300b213d000
> >> >>>>> (XEN) rbp: ffff82d080307cc8   rsp: ffff82d080307cc8   r8:  0180000000000000
> >> >>>>> (XEN) r9:  0000000000000000   r10: 0000000000247000   r11: ffff82d08029a5b0
> >> >>>>> (XEN) r12: 0000000000000011   r13: 00000000000023ac   r14: ffff82d080307d4c
> >> >>>>> (XEN) r15: ffff83023f5a56c8   cr0: 000000008005003b   cr4: 00000000001526e0
> >> >>>>> (XEN) cr3: 00000000b20fc000   cr2: 0000000000000000
> >> >>>>> (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
> >> >>>>> (XEN) Xen code around <ffff82d08022fd47> 
> >> >>> (hap.c#local_events_need_delivery+0x27/0x40):
> >> >>>>> (XEN)  0d ad fa ff 48 8b 47 08 <80> 38 00 74 09 80 78 01 00 0f 94 c0 eb 02 31 
> > 
> >> >>> c0
> >> >>>>> (XEN) Xen stack trace from rsp=ffff82d080307cc8:
> >> >>>>> (XEN)    ffff82d080307d08 ffff82d08022fc47 0000000000000000 
> > ffff83023f5a5000
> >> >>>>> (XEN)    ffff83023f5a5648 0000000000000000 ffff82d080307d4c 
> > 0000000000002400
> >> >>>>> (XEN)    ffff82d080307d38 ffff82d08020008c 00000000000ffffd 
> > ffff8300b1efd000
> >> >>>>> (XEN)    ffff83023f5a5000 ffff82d080307d4c ffff82d080307d78 
> > ffff82d0802cad30
> >> >>>>> (XEN)    0000000000203000 ffff83023f5a5000 ffff82d0802bf860 
> > 0000000000000000
> >> >>>>> (XEN)    0000000000000001 ffff83000008bef0 ffff82d080307de8 
> > ffff82d0802c91e0
> >> >>>>> (XEN)    ffff82d080307de8 ffff82d080143900 ffff82d080307de8 
> > 0000000000000000
> >> >>>>> (XEN)    ffff83000008bf00 ffff82d0802eb480 ffff82d080307dc4 
> > ffff82d08028b1cd
> >> >>>>> (XEN)    ffff83000008bf00 0000000000000000 0000000000000001 
> > ffff83023f5a5000
> >> >>>>> (XEN)    ffff82d080307f08 ffff82d0802bf0c9 0000000000000000 
> > 0000000000000000
> >> >>>>> (XEN)    0000000000000000 ffff82d080307f18 ffff83000008bee0 
> > 0000000000000001
> >> >>>>> (XEN)    0000000000000001 0000000000000001 0000000000000000 
> > 0000000000100000
> >> >>>>> (XEN)    0000000000000001 0000000000247000 ffff83000008bef4 
> > 0000000000100000
> >> >>>>> (XEN)    ffff830100000000 0000000000247001 0000000000000014 
> > 0000000100000000
> >> >>>>> (XEN)    ffff8300ffffffec ffff83000008bef0 ffff82d0802e0640 
> > ffff83000008bfb0
> >> >>>>> (XEN)    0000000000000000 0000000000000000 0000000000000111 
> > 0000000800000000
> >> >>>>> (XEN)    000000010000006e 0000000000000003 00000000000002f8 
> > 0000000000000000
> >> >>>>> (XEN)    00000000ad5c0bd0 0000000000000000 0000000000000001 
> > 0000000000000008
> >> >>>>> (XEN)    0000000000000000 ffff82d080100073 0000000000000000 
> > 0000000000000000
> >> >>>>> (XEN)    0000000000000000 0000000000000000 0000000000000000 
> > 0000000000000000
> >> >>>>> (XEN) Xen call trace:
> >> >>>>> (XEN)    [<ffff82d08022fd47>] hap.c#local_events_need_delivery+0x27/0x40
> >> >>>>> (XEN)    [<ffff82d08022fc47>] hap_set_allocation+0x107/0x130
> >> >>>>> (XEN)    [<ffff82d08020008c>] paging_set_allocation+0x4c/0x80
> >> >>>>> (XEN)    [<ffff82d0802cad30>] domain_build.c#hvm_setup_p2m+0x70/0x1a0
> >> >>>>> (XEN)    [<ffff82d0802c91e0>] domain_build.c#construct_dom0_hvm+0x60/0x120
> >> >>>>> (XEN)    [<ffff82d0802bf0c9>] __start_xen+0x1ea9/0x23a0
> >> >>>>> (XEN)    [<ffff82d080100073>] __high_start+0x53/0x60
> >> >>>>> (XEN)
> >> >>>>> (XEN) Pagetable walk from 0000000000000000:
> >> >>>>
> >> >>>> Sadly you don't make clear what pointer it is that is NULL at that point.
> >> >>>
> >> >>> It sounds from what he says in the following paragraph like current is 
> > NULL.
> >> >> 
> >> >> I don't recall us re-setting current to this late in the boot process.
> >> >> Even during early boot we set it to a bogus non-NULL value rather
> >> >> than NULL.
> >> >> 
> >> >>>>> I've tried setting current to d->vcpu[0], but that just makes the call to
> >> >>>>> hypercall_preempt_check crash in some scheduler assert. In any case, I've
> >> >>>>> added the preempt parameter to the paging_set_allocation function, but I
> >> >>>>> don't plan to use it in the domain builder for the time being. Does that
> >> >>>>> sound right?
> >> >>>>
> >> >>>> Not really, new huge latency issues like this shouldn't be reintroduced;
> >> >>>> we've been fighting hard to get rid of those (and we still occasionally
> >> >>>> find some no-one had noticed before).
> >> >>>
> >> >>> You mean latency in processing softirqs?
> >> >>>
> >> >>> Maybe what we need to do is to make local_events_need_delivery() safe
> >> >>> to call at this point by having it return 0 if current is NULL rather
> >> >>> than crashing?
> >> >> 
> >> >> That would have the same effect - no softirq processing, and hence
> >> >> possible time issues on huge systems.
> >> > 
> >> > No, local_events_delivery() only checks to see if the current vcpu has
> >> > outstanding virtual interrupts.  The other half of
> >> > hypercall_preempt_check() checks for softirqs, which doesn't appear to
> >> > rely on having current available.
> >> 
> >> Good point, but
> >> - current should nevertheless not be NULL (afaict at least),
> >> - hypercall_preempt_check() is probably the wrong construct,
> >>   as we're no in a hypercall.
> >> The latter of course could be addressed by, as you did suggest,
> >> some refinement to one of the pieces it's being made up from,
> >> but I'm not sure that would be better than perhaps making its
> >> invocation conditional (with some better alternative in the "else"
> >> case) in hap_set_allocation(). Not the least because any
> >> adjustment to hypercall_preempt_check() itself would affect all
> >> other of its users.
> > 
> > I had added the following patch to my queue in order to fix this:
> > 
> > ---
> > xen/x86: allow calling hypercall_preempt_check with the idle domain
> > 
> > This allows using hypercall_preempt_check while building Dom0.
> > 
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> > 
> > diff --git a/xen/include/asm-x86/event.h b/xen/include/asm-x86/event.h
> > index a82062e..d55a8bd 100644
> > --- a/xen/include/asm-x86/event.h
> > +++ b/xen/include/asm-x86/event.h
> > @@ -23,6 +23,10 @@ int hvm_local_events_need_delivery(struct vcpu *v);
> >  static inline int local_events_need_delivery(void)
> >  {
> >      struct vcpu *v = current;
> > +
> > +    if ( is_idle_vcpu(v) )
> > +        return 0;
> 
> As said, I think it would be better to not add it here, unless there
> is a significant amount of other calls into here from idle vCPU-s
> with your changes.

No, the only functions that I use that call hypercall_preempt_check are the 
_set_allocation ones. I would like to add an ASSERT here to make sure 
local_events_need_delivery is not called with current == idle_vcpu.

In any case, I would go with route 2 and modify _set_allocation to call 
softirq_pending instead of hypercall_preempt_check if current == idle_vcpu. 
This should solve the issue and doesn't involve changing 
hypercall_preempt_check itself.

Roger.
Jan Beulich Aug. 4, 2016, 6:19 a.m. UTC | #3
>>> On 03.08.16 at 18:24, <roger.pau@citrix.com> wrote:
> On Wed, Aug 03, 2016 at 10:15:51AM -0600, Jan Beulich wrote:
>> >>> On 03.08.16 at 18:00, <roger.pau@citrix.com> wrote:
>> > --- a/xen/include/asm-x86/event.h
>> > +++ b/xen/include/asm-x86/event.h
>> > @@ -23,6 +23,10 @@ int hvm_local_events_need_delivery(struct vcpu *v);
>> >  static inline int local_events_need_delivery(void)
>> >  {
>> >      struct vcpu *v = current;
>> > +
>> > +    if ( is_idle_vcpu(v) )
>> > +        return 0;
>> 
>> As said, I think it would be better to not add it here, unless there
>> is a significant amount of other calls into here from idle vCPU-s
>> with your changes.
> 
> No, the only functions that I use that call hypercall_preempt_check are the 
> _set_allocation ones. I would like to add an ASSERT here to make sure 
> local_events_need_delivery is not called with current == idle_vcpu.

That's fine with me.

> In any case, I would go with route 2 and modify _set_allocation to call 
> softirq_pending instead of hypercall_preempt_check if current == idle_vcpu. 
> This should solve the issue and doesn't involve changing 
> hypercall_preempt_check itself.

Thanks.

Jan
diff mbox

Patch

diff --git a/xen/include/asm-x86/event.h b/xen/include/asm-x86/event.h
index a82062e..d55a8bd 100644
--- a/xen/include/asm-x86/event.h
+++ b/xen/include/asm-x86/event.h
@@ -23,6 +23,10 @@  int hvm_local_events_need_delivery(struct vcpu *v);
 static inline int local_events_need_delivery(void)
 {
     struct vcpu *v = current;
+
+    if ( is_idle_vcpu(v) )
+        return 0;
+
     return (has_hvm_container_vcpu(v) ? hvm_local_events_need_delivery(v) :
             (vcpu_info(v, evtchn_upcall_pending) &&
              !vcpu_info(v, evtchn_upcall_mask)));