Message ID | 20191205223008.8623-5-andrew.cooper3@citrix.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | xen: Support continuations from tasklets | expand |
Hi Andrew, On 05/12/2019 22:30, Andrew Cooper wrote: > These hypercalls each use continue_hypercall_on_cpu(), whose API is about to > switch to use -ERESTART. Update the soon-to-be affected paths to cope, > folding existing contination logic where applicable. > > In addition: > * For platform op and sysctl, insert a cpu_relax() into what is otherwise a > tight spinlock loop, and make the continuation logic common at the > epilogue. > * Contrary to the comment in the code, kexec_exec() does return in the > KEXEC_REBOOT case, needs to pass ret back to the caller. It is not entirely trivial to me that KEXEC_REBOOT refers to KEXEC_DEFAULT_TYPE. The more that if you look at the kexec_reboot() helper, it will not return (see BUG()). What may return is continue_hypercall_on_cpu(). So would it make sense to use KEXEC_DEFAULT_TYPE? [...] > @@ -816,6 +819,13 @@ ret_t do_platform_op(XEN_GUEST_HANDLE_PARAM(xen_platform_op_t) u_xenpf_op) > out: > spin_unlock(&xenpf_lock); > > + if ( ret == -ERESTART ) > + { > + create_continuation: Shall we indent create_continuation the same way as out? [...] > @@ -1263,13 +1263,25 @@ static int do_kexec_op_internal(unsigned long op, > > long do_kexec_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) uarg) > { > - return do_kexec_op_internal(op, uarg, 0); > + int ret = do_kexec_op_internal(op, uarg, 0); Shouldn't it be long (or unsigned long) here? Otherwise, the return of hypercall_create_continuation() may be truncated. > + > + if ( ret == -ERESTART ) > + ret = hypercall_create_continuation(__HYPERVISOR_kexec_op, > + "lh", op, uarg); > + > + return ret; > } [...] > @@ -516,6 +519,12 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl) > __copy_to_guest(u_sysctl, op, 1) ) > ret = -EFAULT; > > + if ( ret == -ERESTART ) > + { > + create_continuation: Similar question as the previous label create_continuation. > + ret = hypercall_create_continuation(__HYPERVISOR_sysctl, "h", u_sysctl); > + } > + > return ret; > } > >
On 05.12.2019 23:30, Andrew Cooper wrote: > These hypercalls each use continue_hypercall_on_cpu(), whose API is about to > switch to use -ERESTART. Update the soon-to-be affected paths to cope, > folding existing contination logic where applicable. > > In addition: > * For platform op and sysctl, insert a cpu_relax() into what is otherwise a > tight spinlock loop, and make the continuation logic common at the > epilogue. Is this really needed with a hypercall_preempt_check() invocation already in the bodies of these loops? Jan
On 09.12.2019 17:25, Jan Beulich wrote: > On 05.12.2019 23:30, Andrew Cooper wrote: >> These hypercalls each use continue_hypercall_on_cpu(), whose API is about to >> switch to use -ERESTART. Update the soon-to-be affected paths to cope, >> folding existing contination logic where applicable. >> >> In addition: >> * For platform op and sysctl, insert a cpu_relax() into what is otherwise a >> tight spinlock loop, and make the continuation logic common at the >> epilogue. > > Is this really needed with a hypercall_preempt_check() invocation > already in the bodies of these loops? And if it's really to be added, shouldn't it be at the bottom of the loop bodies rather than at the top? Jan
On 08/12/2019 12:57, Julien Grall wrote: > Hi Andrew, > > On 05/12/2019 22:30, Andrew Cooper wrote: >> These hypercalls each use continue_hypercall_on_cpu(), whose API is >> about to >> switch to use -ERESTART. Update the soon-to-be affected paths to cope, >> folding existing contination logic where applicable. >> >> In addition: >> * For platform op and sysctl, insert a cpu_relax() into what is >> otherwise a >> tight spinlock loop, and make the continuation logic common at the >> epilogue. >> * Contrary to the comment in the code, kexec_exec() does return in the >> KEXEC_REBOOT case, needs to pass ret back to the caller. > > It is not entirely trivial to me that KEXEC_REBOOT refers to > KEXEC_DEFAULT_TYPE. The more that if you look at the kexec_reboot() > helper, it will not return (see BUG()). What may return is > continue_hypercall_on_cpu(). > > So would it make sense to use KEXEC_DEFAULT_TYPE? I'm not sure why I capitalised it, but no - using KEXEC_DEFAULT_TYPE is worse. A casual reader is far more likely to understand kexec_reboot() in this context. > > [...] > >> @@ -816,6 +819,13 @@ ret_t >> do_platform_op(XEN_GUEST_HANDLE_PARAM(xen_platform_op_t) u_xenpf_op) >> out: >> spin_unlock(&xenpf_lock); >> + if ( ret == -ERESTART ) >> + { >> + create_continuation: > > Shall we indent create_continuation the same way as out? They have different scopes, and while it may look weird, this is in accordance with our style. > > [...] > >> @@ -1263,13 +1263,25 @@ static int do_kexec_op_internal(unsigned long >> op, >> long do_kexec_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) >> uarg) >> { >> - return do_kexec_op_internal(op, uarg, 0); >> + int ret = do_kexec_op_internal(op, uarg, 0); > Shouldn't it be long (or unsigned long) here? Otherwise, the return of > hypercall_create_continuation() may be truncated. If you're concerned about truncation via this pattern, then there are other areas of the code to be worred about. However, there is nothing to truncate. The return value of hypercall_create_continuation() is the primary hypercall number, i.e. __HYPERVISOR_kexec_op in this case. ~Andrew
On 09/12/2019 16:29, Jan Beulich wrote: > On 09.12.2019 17:25, Jan Beulich wrote: >> On 05.12.2019 23:30, Andrew Cooper wrote: >>> These hypercalls each use continue_hypercall_on_cpu(), whose API is about to >>> switch to use -ERESTART. Update the soon-to-be affected paths to cope, >>> folding existing contination logic where applicable. >>> >>> In addition: >>> * For platform op and sysctl, insert a cpu_relax() into what is otherwise a >>> tight spinlock loop, and make the continuation logic common at the >>> epilogue. >> Is this really needed with a hypercall_preempt_check() invocation >> already in the bodies of these loops? Yes. The reason you're supposed to pause is to stop having memory traffic constantly trying to pull the spinlock's cacheline into shared state. Racing round a tight loop constantly reading 4 or 5 memory locations is almost as bad. > And if it's really to be added, shouldn't it be at the bottom > of the loop bodies rather than at the top? It doesn't matter. ~Andrew
On 09.12.2019 18:43, Andrew Cooper wrote: > On 09/12/2019 16:29, Jan Beulich wrote: >> On 09.12.2019 17:25, Jan Beulich wrote: >>> On 05.12.2019 23:30, Andrew Cooper wrote: >>>> These hypercalls each use continue_hypercall_on_cpu(), whose API is about to >>>> switch to use -ERESTART. Update the soon-to-be affected paths to cope, >>>> folding existing contination logic where applicable. >>>> >>>> In addition: >>>> * For platform op and sysctl, insert a cpu_relax() into what is otherwise a >>>> tight spinlock loop, and make the continuation logic common at the >>>> epilogue. >>> Is this really needed with a hypercall_preempt_check() invocation >>> already in the bodies of these loops? > > Yes. The reason you're supposed to pause is to stop having memory > traffic constantly trying to pull the spinlock's cacheline into shared > state. > > Racing round a tight loop constantly reading 4 or 5 memory locations is > almost as bad. > >> And if it's really to be added, shouldn't it be at the bottom >> of the loop bodies rather than at the top? > > It doesn't matter. Well, modern documentation indeed gives no hint whatsoever towards its placement. Recalling the initial guidelines from Intel (from even before the whitepaper was made available) there was a suggestion towards placing it close ahead of the problematic memory access. The last example in the whitepaper looks to support this without explicitly saying so. Anyway, preferably with it moved Reviewed-by: Jan Beulich <jbeulich@suse.com> Jan
Hi, On 09/12/2019 17:37, Andrew Cooper wrote: > On 08/12/2019 12:57, Julien Grall wrote: >> Hi Andrew, >> >> On 05/12/2019 22:30, Andrew Cooper wrote: >>> These hypercalls each use continue_hypercall_on_cpu(), whose API is >>> about to >>> switch to use -ERESTART. Update the soon-to-be affected paths to cope, >>> folding existing contination logic where applicable. >>> >>> In addition: >>> * For platform op and sysctl, insert a cpu_relax() into what is >>> otherwise a >>> tight spinlock loop, and make the continuation logic common at the >>> epilogue. >>> * Contrary to the comment in the code, kexec_exec() does return in the >>> KEXEC_REBOOT case, needs to pass ret back to the caller. >> >> It is not entirely trivial to me that KEXEC_REBOOT refers to >> KEXEC_DEFAULT_TYPE. The more that if you look at the kexec_reboot() >> helper, it will not return (see BUG()). What may return is >> continue_hypercall_on_cpu(). >> >> So would it make sense to use KEXEC_DEFAULT_TYPE? > > I'm not sure why I capitalised it, but no - using KEXEC_DEFAULT_TYPE is > worse. A casual reader is far more likely to understand kexec_reboot() > in this context. But kexec_reboot() cannot return (see BUG()) so a reader may not understand why you suggest that it will return. So I think this want to be reworded. >> >> [...] >> >>> @@ -816,6 +819,13 @@ ret_t >>> do_platform_op(XEN_GUEST_HANDLE_PARAM(xen_platform_op_t) u_xenpf_op) >>> out: >>> spin_unlock(&xenpf_lock); >>> + if ( ret == -ERESTART ) >>> + { >>> + create_continuation: >> >> Shall we indent create_continuation the same way as out? > > They have different scopes, and while it may look weird, this is in > accordance with our style. > >> >> [...] >> >>> @@ -1263,13 +1263,25 @@ static int do_kexec_op_internal(unsigned long >>> op, >>> long do_kexec_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) >>> uarg) >>> { >>> - return do_kexec_op_internal(op, uarg, 0); >>> + int ret = do_kexec_op_internal(op, uarg, 0); >> Shouldn't it be long (or unsigned long) here? Otherwise, the return of >> hypercall_create_continuation() may be truncated. > > If you're concerned about truncation via this pattern, then there are > other areas of the code to be worred about. I knew you would mention the other places :). > > However, there is nothing to truncate. The return value of > hypercall_create_continuation() is the primary hypercall number, i.e. > __HYPERVISOR_kexec_op in this case. Make sense. And I guess the signed bit will be propagated so if do_kexec_op() return -EINVAL (32-bit), then it would still be -EINVAL (64-bit). Cheers,
diff --git a/xen/arch/x86/platform_hypercall.c b/xen/arch/x86/platform_hypercall.c index b19f6ec4ed..c0c209baac 100644 --- a/xen/arch/x86/platform_hypercall.c +++ b/xen/arch/x86/platform_hypercall.c @@ -201,9 +201,12 @@ ret_t do_platform_op(XEN_GUEST_HANDLE_PARAM(xen_platform_op_t) u_xenpf_op) * with this vcpu. */ while ( !spin_trylock(&xenpf_lock) ) + { + cpu_relax(); + if ( hypercall_preempt_check() ) - return hypercall_create_continuation( - __HYPERVISOR_platform_op, "h", u_xenpf_op); + goto create_continuation; + } switch ( op->cmd ) { @@ -816,6 +819,13 @@ ret_t do_platform_op(XEN_GUEST_HANDLE_PARAM(xen_platform_op_t) u_xenpf_op) out: spin_unlock(&xenpf_lock); + if ( ret == -ERESTART ) + { + create_continuation: + ret = hypercall_create_continuation(__HYPERVISOR_platform_op, + "h", u_xenpf_op); + } + return ret; } diff --git a/xen/common/compat/domain.c b/xen/common/compat/domain.c index 11c6afc463..1a14403672 100644 --- a/xen/common/compat/domain.c +++ b/xen/common/compat/domain.c @@ -79,11 +79,6 @@ int compat_vcpu_op(int cmd, unsigned int vcpuid, XEN_GUEST_HANDLE_PARAM(void) ar xfree(ctxt); } - - if ( rc == -ERESTART ) - rc = hypercall_create_continuation(__HYPERVISOR_vcpu_op, "iih", - cmd, vcpuid, arg); - break; } @@ -130,6 +125,10 @@ int compat_vcpu_op(int cmd, unsigned int vcpuid, XEN_GUEST_HANDLE_PARAM(void) ar break; } + if ( rc == -ERESTART ) + rc = hypercall_create_continuation(__HYPERVISOR_vcpu_op, "iih", + cmd, vcpuid, arg); + return rc; } diff --git a/xen/common/domain.c b/xen/common/domain.c index 865a1cb9d7..ab7e4d09c0 100644 --- a/xen/common/domain.c +++ b/xen/common/domain.c @@ -1422,10 +1422,6 @@ long do_vcpu_op(int cmd, unsigned int vcpuid, XEN_GUEST_HANDLE_PARAM(void) arg) return -EINVAL; rc = arch_initialise_vcpu(v, arg); - if ( rc == -ERESTART ) - rc = hypercall_create_continuation(__HYPERVISOR_vcpu_op, "iih", - cmd, vcpuid, arg); - break; case VCPUOP_up: @@ -1598,6 +1594,10 @@ long do_vcpu_op(int cmd, unsigned int vcpuid, XEN_GUEST_HANDLE_PARAM(void) arg) break; } + if ( rc == -ERESTART ) + rc = hypercall_create_continuation(__HYPERVISOR_vcpu_op, "iih", + cmd, vcpuid, arg); + return rc; } diff --git a/xen/common/kexec.c b/xen/common/kexec.c index a262cc5a18..2fca75cec0 100644 --- a/xen/common/kexec.c +++ b/xen/common/kexec.c @@ -842,7 +842,7 @@ static int kexec_exec(XEN_GUEST_HANDLE_PARAM(void) uarg) break; } - return -EINVAL; /* never reached */ + return ret; } static int kexec_swap_images(int type, struct kexec_image *new, @@ -1220,7 +1220,7 @@ static int do_kexec_op_internal(unsigned long op, return ret; if ( test_and_set_bit(KEXEC_FLAG_IN_HYPERCALL, &kexec_flags) ) - return hypercall_create_continuation(__HYPERVISOR_kexec_op, "lh", op, uarg); + return -ERESTART; switch ( op ) { @@ -1263,13 +1263,25 @@ static int do_kexec_op_internal(unsigned long op, long do_kexec_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) uarg) { - return do_kexec_op_internal(op, uarg, 0); + int ret = do_kexec_op_internal(op, uarg, 0); + + if ( ret == -ERESTART ) + ret = hypercall_create_continuation(__HYPERVISOR_kexec_op, + "lh", op, uarg); + + return ret; } #ifdef CONFIG_COMPAT int compat_kexec_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) uarg) { - return do_kexec_op_internal(op, uarg, 1); + int ret = do_kexec_op_internal(op, uarg, 1); + + if ( ret == -ERESTART ) + ret = hypercall_create_continuation(__HYPERVISOR_kexec_op, + "lh", op, uarg); + + return ret; } #endif diff --git a/xen/common/sysctl.c b/xen/common/sysctl.c index f88a285e7f..7b55047bb9 100644 --- a/xen/common/sysctl.c +++ b/xen/common/sysctl.c @@ -51,9 +51,12 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl) * with this vcpu. */ while ( !spin_trylock(&sysctl_lock) ) + { + cpu_relax(); + if ( hypercall_preempt_check() ) - return hypercall_create_continuation( - __HYPERVISOR_sysctl, "h", u_sysctl); + goto create_continuation; + } switch ( op->cmd ) { @@ -516,6 +519,12 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl) __copy_to_guest(u_sysctl, op, 1) ) ret = -EFAULT; + if ( ret == -ERESTART ) + { + create_continuation: + ret = hypercall_create_continuation(__HYPERVISOR_sysctl, "h", u_sysctl); + } + return ret; }
These hypercalls each use continue_hypercall_on_cpu(), whose API is about to switch to use -ERESTART. Update the soon-to-be affected paths to cope, folding existing contination logic where applicable. In addition: * For platform op and sysctl, insert a cpu_relax() into what is otherwise a tight spinlock loop, and make the continuation logic common at the epilogue. * Contrary to the comment in the code, kexec_exec() does return in the KEXEC_REBOOT case, needs to pass ret back to the caller. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> --- CC: Jan Beulich <JBeulich@suse.com> CC: Wei Liu <wl@xen.org> CC: Roger Pau Monné <roger.pau@citrix.com> CC: Stefano Stabellini <sstabellini@kernel.org> CC: Julien Grall <julien@xen.org> CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com> --- xen/arch/x86/platform_hypercall.c | 14 ++++++++++++-- xen/common/compat/domain.c | 9 ++++----- xen/common/domain.c | 8 ++++---- xen/common/kexec.c | 20 ++++++++++++++++---- xen/common/sysctl.c | 13 +++++++++++-- 5 files changed, 47 insertions(+), 17 deletions(-)