[1/5] x86/HVM: wire up multicalls

Message ID c3c5df61-32f8-1ffa-aac4-0c83d620d385@suse.com (mailing list archive)
State Superseded
Series allow xc_domain_maximum_gpfn() to observe full GFN value

Commit Message

Jan Beulich June 18, 2021, 10:23 a.m. UTC
To be able to use them from, in particular, the tool stack, they need to
be supported for all guest types. Note that xc_resource_op() already
uses multicalls, so it would not work on PVH Dom0 without this.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

Comments

Andrew Cooper June 18, 2021, 1:02 p.m. UTC | #1
On 18/06/2021 11:23, Jan Beulich wrote:
> To be able to use them from, in particular, the tool stack, they need to
> be supported for all guest types. Note that xc_resource_op() already
> uses multicalls, so it would not work on PVH Dom0 without this.

I'm not a fan of multicalls as a concept - they're mostly a layering
violation adding substantial complexity - and frankly, working around a
Linux kernel/user ABI error is a terrible reason to make this change.

But I won't object if it happens to be the least terrible option going. 
I accept that there are no good options here.

> @@ -334,6 +336,39 @@ int hvm_hypercall(struct cpu_user_regs *
>      return curr->hcall_preempted ? HVM_HCALL_preempted : HVM_HCALL_completed;
>  }
>  
> +enum mc_disposition hvm_do_multicall_call(struct mc_state *state)
> +{
> +    struct vcpu *curr = current;
> +    hypercall_fn_t *func = NULL;
> +
> +    if ( hvm_guest_x86_mode(curr) == 8 )
> +    {
> +        struct multicall_entry *call = &state->call;
> +
> +        if ( call->op < ARRAY_SIZE(hvm_hypercall_table) )
> +            func = array_access_nospec(hvm_hypercall_table, call->op).native;
> +        if ( func )
> +            call->result = func(call->args[0], call->args[1], call->args[2],
> +                                call->args[3], call->args[4], call->args[5]);
> +        else
> +            call->result = -ENOSYS;
> +    }
> +    else
> +    {
> +        struct compat_multicall_entry *call = &state->compat_call;
> +
> +        if ( call->op < ARRAY_SIZE(hvm_hypercall_table) )
> +            func = array_access_nospec(hvm_hypercall_table, call->op).compat;
> +        if ( func )
> +            call->result = func(call->args[0], call->args[1], call->args[2],
> +                                call->args[3], call->args[4], call->args[5]);
> +        else
> +            call->result = -ENOSYS;
> +    }
> +
> +    return !hvm_get_cpl(curr) ? mc_continue : mc_preempt;

This is ported across from XSA-213, but even for PV guests it was, IIRC,
just defence in depth for any cases we hadn't spotted that change
privilege.

There is no pagetable accounting in HVM guests to become confused by a
privilege change, and hvm_get_cpl() isn't totally free.  Any kernel
which puts VCPUOP_initialise in a multicall gets to keep all resulting
pieces.

I think this wants to be just "return mc_continue;"
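
I.e. (a sketch of the suggested change, not a separately posted patch):

-    return !hvm_get_cpl(curr) ? mc_continue : mc_preempt;
+    return mc_continue;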

If so, Begrudgingly acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

~Andrew
Jan Beulich June 18, 2021, 1:11 p.m. UTC | #2
On 18.06.2021 15:02, Andrew Cooper wrote:
> On 18/06/2021 11:23, Jan Beulich wrote:
>> To be able to use them from, in particular, the tool stack, they need to
>> be supported for all guest types. Note that xc_resource_op() already
>> uses multicalls, so it would not work on PVH Dom0 without this.
> 
> I'm not a fan of multicalls as a concept - they're mostly a layering
> violation adding substantial complexity - and frankly, working around a
> Linux kernel/user ABI error is a terrible reason to make this change.

While I agree with the latter, I don't think there's much complexity
here, and there are certainly savings in terms of mode switches between
guest and hypervisor when you can batch up arbitrary calls (and not
just sufficiently similar ones via hypercalls with built-in batching).
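
As a purely illustrative sketch against the public ABI in
xen/include/public/xen.h (with HYPERVISOR_multicall() standing in for
whatever hypercall wrapper the environment provides; it is not something
this patch adds), batching two unrelated calls into a single
guest/hypervisor transition could look like:

/* Illustrative only; HYPERVISOR_multicall() is assumed to be the
 * environment's wrapper around __HYPERVISOR_multicall. */
static long batched_calls(domid_t domid)
{
    struct multicall_entry batch[] = {
        {   /* entry 0: query the domain's maximum GPFN */
            .op      = __HYPERVISOR_memory_op,
            .args[0] = XENMEM_maximum_gpfn,
            .args[1] = (unsigned long)&domid,
        },
        {   /* entry 1: an unrelated call in the same batch */
            .op      = __HYPERVISOR_xen_version,
            .args[0] = XENVER_version,
        },
    };
    long rc = HYPERVISOR_multicall(batch, sizeof(batch) / sizeof(batch[0]));

    /* Each entry's own return value comes back in batch[i].result. */
    return rc ? rc : (long)batch[0].result;
}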

>> @@ -334,6 +336,39 @@ int hvm_hypercall(struct cpu_user_regs *
>>      return curr->hcall_preempted ? HVM_HCALL_preempted : HVM_HCALL_completed;
>>  }
>>  
>> +enum mc_disposition hvm_do_multicall_call(struct mc_state *state)
>> +{
>> +    struct vcpu *curr = current;
>> +    hypercall_fn_t *func = NULL;
>> +
>> +    if ( hvm_guest_x86_mode(curr) == 8 )
>> +    {
>> +        struct multicall_entry *call = &state->call;
>> +
>> +        if ( call->op < ARRAY_SIZE(hvm_hypercall_table) )
>> +            func = array_access_nospec(hvm_hypercall_table, call->op).native;
>> +        if ( func )
>> +            call->result = func(call->args[0], call->args[1], call->args[2],
>> +                                call->args[3], call->args[4], call->args[5]);
>> +        else
>> +            call->result = -ENOSYS;
>> +    }
>> +    else
>> +    {
>> +        struct compat_multicall_entry *call = &state->compat_call;
>> +
>> +        if ( call->op < ARRAY_SIZE(hvm_hypercall_table) )
>> +            func = array_access_nospec(hvm_hypercall_table, call->op).compat;
>> +        if ( func )
>> +            call->result = func(call->args[0], call->args[1], call->args[2],
>> +                                call->args[3], call->args[4], call->args[5]);
>> +        else
>> +            call->result = -ENOSYS;
>> +    }
>> +
>> +    return !hvm_get_cpl(curr) ? mc_continue : mc_preempt;
> 
> This is ported across from XSA-213, but even for PV guests it was, IIRC,
> just defence in depth for any cases we hadn't spotted that change
> privilege.
> 
> There is no pagetable accounting in HVM guests to become confused by a
> privilege change, and hvm_get_cpl() isn't totally free.  Any kernel
> which puts VCPUOP_initialise in a multicall gets to keep all resulting
> pieces.
> 
> I think this wants to be just "return mc_continue;"

I had it this way first, but I think the state-setting hypercalls
ought to be similarly protected. In fact I did this adjustment at
the last moment before sending, after having looked at the Arm code.
If we don't want it here, it ought to go away there as well, and
then also for PV (where only IRET would then need special-casing).
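
For reference, the three dispositions drive the common dispatch loop
roughly as follows (a paraphrase of xen/common/multicall.c for
illustration, not the verbatim source):

for ( i = 0; i < nr_calls; i++ )
{
    /* ... copy entry i from guest memory into mcs->call ... */

    switch ( arch_do_multicall_call(mcs) )
    {
    case mc_continue:         /* on to the next entry */
        break;

    case mc_preempt:          /* e.g. CPL left ring 0: defer the rest of
                                 the batch to a hypercall continuation */
        goto create_continuation;

    case mc_exit:             /* e.g. PV iret: stop processing the
                                 batch outright */
        goto out;
    }
}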

> If so, Begrudgingly acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

Thanks, I'll take this for the moment (ignoring the "if so"), but
I'll wait a while to see whether the above wants further discussion.

Jan

Patch

--- a/xen/arch/x86/hvm/hypercall.c
+++ b/xen/arch/x86/hvm/hypercall.c
@@ -26,6 +26,7 @@ 
 #include <asm/hvm/emulate.h>
 #include <asm/hvm/support.h>
 #include <asm/hvm/viridian.h>
+#include <asm/multicall.h>
 
 #include <public/hvm/hvm_op.h>
 #include <public/hvm/params.h>
@@ -125,6 +126,7 @@ static const struct {
     hypercall_fn_t *native, *compat;
 } hvm_hypercall_table[] = {
     HVM_CALL(memory_op),
+    COMPAT_CALL(multicall),
 #ifdef CONFIG_GRANT_TABLE
     HVM_CALL(grant_table_op),
 #endif
@@ -334,6 +336,39 @@ int hvm_hypercall(struct cpu_user_regs *
     return curr->hcall_preempted ? HVM_HCALL_preempted : HVM_HCALL_completed;
 }
 
+enum mc_disposition hvm_do_multicall_call(struct mc_state *state)
+{
+    struct vcpu *curr = current;
+    hypercall_fn_t *func = NULL;
+
+    if ( hvm_guest_x86_mode(curr) == 8 )
+    {
+        struct multicall_entry *call = &state->call;
+
+        if ( call->op < ARRAY_SIZE(hvm_hypercall_table) )
+            func = array_access_nospec(hvm_hypercall_table, call->op).native;
+        if ( func )
+            call->result = func(call->args[0], call->args[1], call->args[2],
+                                call->args[3], call->args[4], call->args[5]);
+        else
+            call->result = -ENOSYS;
+    }
+    else
+    {
+        struct compat_multicall_entry *call = &state->compat_call;
+
+        if ( call->op < ARRAY_SIZE(hvm_hypercall_table) )
+            func = array_access_nospec(hvm_hypercall_table, call->op).compat;
+        if ( func )
+            call->result = func(call->args[0], call->args[1], call->args[2],
+                                call->args[3], call->args[4], call->args[5]);
+        else
+            call->result = -ENOSYS;
+    }
+
+    return !hvm_get_cpl(curr) ? mc_continue : mc_preempt;
+}
+
 /*
  * Local variables:
  * mode: C
--- a/xen/arch/x86/hypercall.c
+++ b/xen/arch/x86/hypercall.c
@@ -20,6 +20,7 @@ 
  */
 
 #include <xen/hypercall.h>
+#include <asm/multicall.h>
 
 #ifdef CONFIG_COMPAT
 #define ARGS(x, n)                              \
@@ -273,13 +274,18 @@ int hypercall_xlat_continuation(unsigned
     return rc;
 }
 
-#ifndef CONFIG_PV
-/* Stub for arch_do_multicall_call */
-enum mc_disposition arch_do_multicall_call(struct mc_state *mc)
+enum mc_disposition arch_do_multicall_call(struct mc_state *state)
 {
+    const struct domain *currd = current->domain;
+
+    if ( is_pv_domain(currd) )
+        return pv_do_multicall_call(state);
+
+    if ( is_hvm_domain(currd) )
+        return hvm_do_multicall_call(state);
+
     return mc_exit;
 }
-#endif
 
 /*
  * Local variables:
--- a/xen/arch/x86/pv/hypercall.c
+++ b/xen/arch/x86/pv/hypercall.c
@@ -23,6 +23,7 @@ 
 #include <xen/hypercall.h>
 #include <xen/nospec.h>
 #include <xen/trace.h>
+#include <asm/multicall.h>
 #include <irq_vectors.h>
 
 #ifdef CONFIG_PV32
@@ -245,7 +246,7 @@ void pv_hypercall(struct cpu_user_regs *
     perfc_incr(hypercalls);
 }
 
-enum mc_disposition arch_do_multicall_call(struct mc_state *state)
+enum mc_disposition pv_do_multicall_call(struct mc_state *state)
 {
     struct vcpu *curr = current;
     unsigned long op;
--- /dev/null
+++ b/xen/include/asm-x86/multicall.h
@@ -0,0 +1,12 @@ 
+/******************************************************************************
+ * asm-x86/multicall.h
+ */
+
+#ifndef __ASM_X86_MULTICALL_H__
+#define __ASM_X86_MULTICALL_H__
+
+#include <xen/multicall.h>
+
+typeof(arch_do_multicall_call) pv_do_multicall_call, hvm_do_multicall_call;
+
+#endif /* __ASM_X86_MULTICALL_H__ */
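
The typeof() declaration in the new asm-x86/multicall.h gives both
per-mode helpers the exact prototype of arch_do_multicall_call; spelled
out, it is equivalent to:

enum mc_disposition pv_do_multicall_call(struct mc_state *state);
enum mc_disposition hvm_do_multicall_call(struct mc_state *state);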