diff mbox series

[4/6] x86/fpu: Add guest support to xfd_enable_feature()

Message ID 20211214024947.991506193@linutronix.de (mailing list archive)
State New, archived
Headers show
Series x86/fpu: Preparatory changes for guest AMX support | expand

Commit Message

Thomas Gleixner Dec. 14, 2021, 2:50 a.m. UTC
Guest support for dynamically enabling FPU features requires a few
modifications to the enablement function which is currently invoked from
the #NM handler:

  1) Use guest permissions and sizes for the update

  2) Update fpu_guest state accordingly

  3) Take into account that the enabling can be triggered either from a
     running guest via XSETBV and MSR_IA32_XFD write emulation and from
     a guest restore. In the latter case the guests fpstate is not the
     current tasks active fpstate.

Split the function and implement the guest mechanics throughout the
callchain.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
Split out from combo patch. Add fpstate.in_use initialization.
---
 arch/x86/kernel/fpu/xstate.c |   73 ++++++++++++++++++++++---------------------
 arch/x86/kernel/fpu/xstate.h |    2 +
 2 files changed, 41 insertions(+), 34 deletions(-)

Comments

Tian, Kevin Dec. 14, 2021, 6:05 a.m. UTC | #1
> From: Thomas Gleixner <tglx@linutronix.de>
> Sent: Tuesday, December 14, 2021 10:50 AM
> 
> Guest support for dynamically enabling FPU features requires a few

'enabling' -> 'enabled'

> modifications to the enablement function which is currently invoked from
> the #NM handler:
> 
>   1) Use guest permissions and sizes for the update
> 
>   2) Update fpu_guest state accordingly
> 
>   3) Take into account that the enabling can be triggered either from a
>      running guest via XSETBV and MSR_IA32_XFD write emulation and from

'and from' -> 'or from'

>      a guest restore. In the latter case the guests fpstate is not the
>      current tasks active fpstate.
> 
> Split the function and implement the guest mechanics throughout the
> callchain.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

[...]
> @@ -1553,6 +1531,13 @@ static int fpstate_realloc(u64 xfeatures
>  	newfps->user_size = usize;
>  	newfps->is_valloc = true;
> 
> +	if (guest_fpu) {
> +		newfps->is_guest = true;
> +		newfps->is_confidential = curfps->is_confidential;
> +		newfps->in_use = curfps->in_use;
> +		guest_fpu->xfeatures |= xfeatures;
> +	}
> +

As you explained guest fpstate is not current active in the restoring 
path, thus it's not correct to always inherit attributes from the 
active one.

Also we want to avoid touching real hardware state if guest_fpstate
!= curfps, e.g.:

	if (test_thread_flag(TIF_NEED_FPU_LOAD))
		fpregs_restore_userregs();

> +	if (guest_fpu) {
> +		curfps = xchg(&guest_fpu->fpstate, newfps);
> +		/* If curfps is active, update the FPU fpstate pointer */
> +		if (fpu->fpstate == curfps)
> +			fpu->fpstate = newfps;
> +	} else {
> +		curfps = xchg(&fpu->fpstate, newfps);
> +	}
> +
> +	xfd_update_state(fpu->fpstate);

and also here.

> @@ -1697,14 +1694,16 @@ int xfd_enable_feature(u64 xfd_err)
>  	spin_lock_irq(&current->sighand->siglock);
> 
>  	/* If not permitted let it die */
> -	if ((xstate_get_host_group_perm() & xfd_event) != xfd_event) {
> +	if ((xstate_get_group_perm(!!guest_fpu) & xfd_event) != xfd_event) {
>  		spin_unlock_irq(&current->sighand->siglock);
>  		return -EPERM;
>  	}
> 
>  	fpu = &current->group_leader->thread.fpu;
> -	ksize = fpu->perm.__state_size;
> -	usize = fpu->perm.__user_state_size;
> +	perm = guest_fpu ? &fpu->guest_perm : &fpu->perm;
> +	ksize = perm->__state_size;
> +	usize = perm->__user_state_size;
> +

Do we want to mention in the commit msg that fpstate 
reallocation size is based on permissions instead of requested 
features? The intuitive thought is that each time a new feature is 
requested this expands the buffer to match the requested feature...

Thanks
Kevin
Paolo Bonzini Dec. 14, 2021, 10:21 a.m. UTC | #2
On 12/14/21 07:05, Tian, Kevin wrote:
>> +	if (guest_fpu) {
>> +		newfps->is_guest = true;
>> +		newfps->is_confidential = curfps->is_confidential;
>> +		newfps->in_use = curfps->in_use;
>> +		guest_fpu->xfeatures |= xfeatures;
>> +	}
>> +
> As you explained guest fpstate is not current active in the restoring
> path, thus it's not correct to always inherit attributes from the
> active one.

Indeed, guest_fpu->fpstate should be used instead of curfps.

Paolo
Thomas Gleixner Dec. 14, 2021, 1:15 p.m. UTC | #3
On Tue, Dec 14 2021 at 11:21, Paolo Bonzini wrote:
> On 12/14/21 07:05, Tian, Kevin wrote:
>>> +	if (guest_fpu) {
>>> +		newfps->is_guest = true;
>>> +		newfps->is_confidential = curfps->is_confidential;
>>> +		newfps->in_use = curfps->in_use;
>>> +		guest_fpu->xfeatures |= xfeatures;
>>> +	}
>>> +
>> As you explained guest fpstate is not current active in the restoring
>> path, thus it's not correct to always inherit attributes from the
>> active one.

Good catch!

> Indeed, guest_fpu->fpstate should be used instead of curfps.

Something like the below.

Thanks,

        tglx
---
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1500,35 +1500,13 @@ void fpstate_free(struct fpu *fpu)
 }
 
 /**
- * fpu_install_fpstate - Update the active fpstate in the FPU
- *
- * @fpu:	A struct fpu * pointer
- * @newfps:	A struct fpstate * pointer
- *
- * Returns:	A null pointer if the last active fpstate is the embedded
- *		one or the new fpstate is already installed;
- *		otherwise, a pointer to the old fpstate which has to
- *		be freed by the caller.
- */
-static struct fpstate *fpu_install_fpstate(struct fpu *fpu,
-					   struct fpstate *newfps)
-{
-	struct fpstate *oldfps = fpu->fpstate;
-
-	if (fpu->fpstate == newfps)
-		return NULL;
-
-	fpu->fpstate = newfps;
-	return oldfps != &fpu->__fpstate ? oldfps : NULL;
-}
-
-/**
  * fpstate_realloc - Reallocate struct fpstate for the requested new features
  *
  * @xfeatures:	A bitmap of xstate features which extend the enabled features
  *		of that task
  * @ksize:	The required size for the kernel buffer
  * @usize:	The required size for user space buffers
+ * @guest_fpu:	Pointer to a guest FPU container. NULL for host allocations
  *
  * Note vs. vmalloc(): If the task with a vzalloc()-allocated buffer
  * terminates quickly, vfree()-induced IPIs may be a concern, but tasks
@@ -1537,13 +1515,13 @@ static struct fpstate *fpu_install_fpsta
  * Returns: 0 on success, -ENOMEM on allocation error.
  */
 static int fpstate_realloc(u64 xfeatures, unsigned int ksize,
-			   unsigned int usize)
+			   unsigned int usize, struct fpu_guest *guest_fpu)
 {
 	struct fpu *fpu = &current->thread.fpu;
 	struct fpstate *curfps, *newfps = NULL;
 	unsigned int fpsize;
+	bool in_use;
 
-	curfps = fpu->fpstate;
 	fpsize = ksize + ALIGN(offsetof(struct fpstate, regs), 64);
 
 	newfps = vzalloc(fpsize);
@@ -1553,28 +1531,55 @@ static int fpstate_realloc(u64 xfeatures
 	newfps->user_size = usize;
 	newfps->is_valloc = true;
 
+	/*
+	 * When a guest FPU is supplied, use @guest_fpu->fpstate
+	 * as reference independent whether it is in use or not.
+	 */
+	curfps = guest_fpu ? guest_fpu->fpstate : fpu->fpstate;
+
+	/* Determine whether @curfps is the active fpstate */
+	in_use = fpu->fpstate == curfps;
+
+	if (guest_fpu) {
+		newfps->is_guest = true;
+		newfps->is_confidential = curfps->is_confidential;
+		newfps->in_use = curfps->in_use;
+		guest_fpu->xfeatures |= xfeatures;
+	}
+
 	fpregs_lock();
 	/*
-	 * Ensure that the current state is in the registers before
-	 * swapping fpstate as that might invalidate it due to layout
-	 * changes.
+	 * If @curfps is in use, ensure that the current state is in the
+	 * registers before swapping fpstate as that might invalidate it
+	 * due to layout changes.
 	 */
-	if (test_thread_flag(TIF_NEED_FPU_LOAD))
+	if (in_use && test_thread_flag(TIF_NEED_FPU_LOAD))
 		fpregs_restore_userregs();
 
 	newfps->xfeatures = curfps->xfeatures | xfeatures;
 	newfps->user_xfeatures = curfps->user_xfeatures | xfeatures;
 	newfps->xfd = curfps->xfd & ~xfeatures;
 
-	curfps = fpu_install_fpstate(fpu, newfps);
-
 	/* Do the final updates within the locked region */
 	xstate_init_xcomp_bv(&newfps->regs.xsave, newfps->xfeatures);
-	xfd_update_state(newfps);
 
+	if (guest_fpu) {
+		curfps = xchg(&guest_fpu->fpstate, newfps);
+		/* If curfps is active, update the FPU fpstate pointer */
+		if (in_use)
+			fpu->fpstate = newfps;
+	} else {
+		curfps = xchg(&fpu->fpstate, newfps);
+	}
+
+	if (in_use)
+		xfd_update_state(fpu->fpstate);
 	fpregs_unlock();
 
-	vfree(curfps);
+	/* Only free valloc'ed state */
+	if (curfps && curfps->is_valloc)
+		vfree(curfps);
+
 	return 0;
 }
 
@@ -1682,14 +1687,16 @@ static int xstate_request_perm(unsigned
 	return ret;
 }
 
-int xfd_enable_feature(u64 xfd_err)
+int __xfd_enable_feature(u64 xfd_err, struct fpu_guest *guest_fpu)
 {
 	u64 xfd_event = xfd_err & XFEATURE_MASK_USER_DYNAMIC;
+	struct fpu_state_perm *perm;
 	unsigned int ksize, usize;
 	struct fpu *fpu;
 
 	if (!xfd_event) {
-		pr_err_once("XFD: Invalid xfd error: %016llx\n", xfd_err);
+		if (!guest_fpu)
+			pr_err_once("XFD: Invalid xfd error: %016llx\n", xfd_err);
 		return 0;
 	}
 
@@ -1697,14 +1704,16 @@ int xfd_enable_feature(u64 xfd_err)
 	spin_lock_irq(&current->sighand->siglock);
 
 	/* If not permitted let it die */
-	if ((xstate_get_host_group_perm() & xfd_event) != xfd_event) {
+	if ((xstate_get_group_perm(!!guest_fpu) & xfd_event) != xfd_event) {
 		spin_unlock_irq(&current->sighand->siglock);
 		return -EPERM;
 	}
 
 	fpu = &current->group_leader->thread.fpu;
-	ksize = fpu->perm.__state_size;
-	usize = fpu->perm.__user_state_size;
+	perm = guest_fpu ? &fpu->guest_perm : &fpu->perm;
+	ksize = perm->__state_size;
+	usize = perm->__user_state_size;
+
 	/*
 	 * The feature is permitted. State size is sufficient.  Dropping
 	 * the lock is safe here even if more features are added from
@@ -1717,10 +1726,16 @@ int xfd_enable_feature(u64 xfd_err)
 	 * Try to allocate a new fpstate. If that fails there is no way
 	 * out.
 	 */
-	if (fpstate_realloc(xfd_event, ksize, usize))
+	if (fpstate_realloc(xfd_event, ksize, usize, guest_fpu))
 		return -EFAULT;
 	return 0;
 }
+
+int xfd_enable_feature(u64 xfd_err)
+{
+	return __xfd_enable_feature(xfd_err, NULL);
+}
+
 #else /* CONFIG_X86_64 */
 static inline int xstate_request_perm(unsigned long idx, bool guest)
 {
--- a/arch/x86/kernel/fpu/xstate.h
+++ b/arch/x86/kernel/fpu/xstate.h
@@ -60,6 +60,8 @@ extern void fpu__init_system_xstate(unsi
 
 extern void *get_xsave_addr(struct xregs_state *xsave, int xfeature_nr);
 
+extern int __xfd_enable_feature(u64 which, struct fpu_guest *guest_fpu);
+
 static inline u64 xfeatures_mask_supervisor(void)
 {
 	return fpu_kernel_cfg.max_features & XFEATURE_MASK_SUPERVISOR_SUPPORTED;
Tian, Kevin Dec. 15, 2021, 5:46 a.m. UTC | #4
> From: Thomas Gleixner <tglx@linutronix.de>
> Sent: Tuesday, December 14, 2021 9:16 PM
> 
> On Tue, Dec 14 2021 at 11:21, Paolo Bonzini wrote:
> > On 12/14/21 07:05, Tian, Kevin wrote:
> >>> +	if (guest_fpu) {
> >>> +		newfps->is_guest = true;
> >>> +		newfps->is_confidential = curfps->is_confidential;
> >>> +		newfps->in_use = curfps->in_use;
> >>> +		guest_fpu->xfeatures |= xfeatures;
> >>> +	}
> >>> +
> >> As you explained guest fpstate is not current active in the restoring
> >> path, thus it's not correct to always inherit attributes from the
> >> active one.
> 
> Good catch!
> 
> > Indeed, guest_fpu->fpstate should be used instead of curfps.
> 
> Something like the below.

Looks good. Just two nits.

> +	/*
> +	 * When a guest FPU is supplied, use @guest_fpu->fpstate
> +	 * as reference independent whether it is in use or not.
> +	 */
> +	curfps = guest_fpu ? guest_fpu->fpstate : fpu->fpstate;
> +
> +	/* Determine whether @curfps is the active fpstate */
> +	in_use = fpu->fpstate == curfps;
> +
> +	if (guest_fpu) {
> +		newfps->is_guest = true;
> +		newfps->is_confidential = curfps->is_confidential;
> +		newfps->in_use = curfps->in_use;

What is the purpose of this 'in_use' field? Currently it's only
touched in three places:

  - set when entering guest;
  - cleared when exiting to userspace;
  - checked when freeing a guest FPU;

The last one can be easily checked by comparing to current fps.

Any other intended usage which is currently missed for guest FPU?

> 
> +	if (guest_fpu) {
> +		curfps = xchg(&guest_fpu->fpstate, newfps);

This can be a direct value update to guest_fpu->fpstate since 
curfps has already been acquired in the start.

> +		/* If curfps is active, update the FPU fpstate pointer */
> +		if (in_use)
> +			fpu->fpstate = newfps;
> +	} else {
> +		curfps = xchg(&fpu->fpstate, newfps);

ditto.

Thanks
Kevin
Thomas Gleixner Dec. 15, 2021, 9:53 a.m. UTC | #5
On Wed, Dec 15 2021 at 05:46, Kevin Tian wrote:
>> From: Thomas Gleixner <tglx@linutronix.de>
>> +	if (guest_fpu) {
>> +		newfps->is_guest = true;
>> +		newfps->is_confidential = curfps->is_confidential;
>> +		newfps->in_use = curfps->in_use;
>
> What is the purpose of this 'in_use' field? Currently it's only
> touched in three places:
>
>   - set when entering guest;
>   - cleared when exiting to userspace;
>   - checked when freeing a guest FPU;
>
> The last one can be easily checked by comparing to current fps.

I added it for paranoia sake because the destruction of the KVM FPU
state is not necessarily in the context of the vCPU thread. Yes, it
should not happen...

>> +	if (guest_fpu) {
>> +		curfps = xchg(&guest_fpu->fpstate, newfps);
>
> This can be a direct value update to guest_fpu->fpstate since 
> curfps has already been acquired in the start.

Indeed.

Thanks,

        tglx
Tian, Kevin Dec. 15, 2021, 10:02 a.m. UTC | #6
> From: Thomas Gleixner <tglx@linutronix.de>
> Sent: Wednesday, December 15, 2021 5:53 PM
> 
> On Wed, Dec 15 2021 at 05:46, Kevin Tian wrote:
> >> From: Thomas Gleixner <tglx@linutronix.de>
> >> +	if (guest_fpu) {
> >> +		newfps->is_guest = true;
> >> +		newfps->is_confidential = curfps->is_confidential;
> >> +		newfps->in_use = curfps->in_use;
> >
> > What is the purpose of this 'in_use' field? Currently it's only
> > touched in three places:
> >
> >   - set when entering guest;
> >   - cleared when exiting to userspace;
> >   - checked when freeing a guest FPU;
> >
> > The last one can be easily checked by comparing to current fps.
> 
> I added it for paranoia sake because the destruction of the KVM FPU
> state is not necessarily in the context of the vCPU thread. Yes, it
> should not happen...
> 
> >> +	if (guest_fpu) {
> >> +		curfps = xchg(&guest_fpu->fpstate, newfps);
> >
> > This can be a direct value update to guest_fpu->fpstate since
> > curfps has already been acquired in the start.
> 
> Indeed.
> 

Thanks for confirmation. We'll include those changes in next version.

Thanks
Kevin
diff mbox series

Patch

--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1500,35 +1500,13 @@  void fpstate_free(struct fpu *fpu)
 }
 
 /**
- * fpu_install_fpstate - Update the active fpstate in the FPU
- *
- * @fpu:	A struct fpu * pointer
- * @newfps:	A struct fpstate * pointer
- *
- * Returns:	A null pointer if the last active fpstate is the embedded
- *		one or the new fpstate is already installed;
- *		otherwise, a pointer to the old fpstate which has to
- *		be freed by the caller.
- */
-static struct fpstate *fpu_install_fpstate(struct fpu *fpu,
-					   struct fpstate *newfps)
-{
-	struct fpstate *oldfps = fpu->fpstate;
-
-	if (fpu->fpstate == newfps)
-		return NULL;
-
-	fpu->fpstate = newfps;
-	return oldfps != &fpu->__fpstate ? oldfps : NULL;
-}
-
-/**
  * fpstate_realloc - Reallocate struct fpstate for the requested new features
  *
  * @xfeatures:	A bitmap of xstate features which extend the enabled features
  *		of that task
  * @ksize:	The required size for the kernel buffer
  * @usize:	The required size for user space buffers
+ * @guest_fpu:	Pointer to a guest FPU container. NULL for host allocations
  *
  * Note vs. vmalloc(): If the task with a vzalloc()-allocated buffer
  * terminates quickly, vfree()-induced IPIs may be a concern, but tasks
@@ -1537,7 +1515,7 @@  static struct fpstate *fpu_install_fpsta
  * Returns: 0 on success, -ENOMEM on allocation error.
  */
 static int fpstate_realloc(u64 xfeatures, unsigned int ksize,
-			   unsigned int usize)
+			   unsigned int usize, struct fpu_guest *guest_fpu)
 {
 	struct fpu *fpu = &current->thread.fpu;
 	struct fpstate *curfps, *newfps = NULL;
@@ -1553,6 +1531,13 @@  static int fpstate_realloc(u64 xfeatures
 	newfps->user_size = usize;
 	newfps->is_valloc = true;
 
+	if (guest_fpu) {
+		newfps->is_guest = true;
+		newfps->is_confidential = curfps->is_confidential;
+		newfps->in_use = curfps->in_use;
+		guest_fpu->xfeatures |= xfeatures;
+	}
+
 	fpregs_lock();
 	/*
 	 * Ensure that the current state is in the registers before
@@ -1566,15 +1551,25 @@  static int fpstate_realloc(u64 xfeatures
 	newfps->user_xfeatures = curfps->user_xfeatures | xfeatures;
 	newfps->xfd = curfps->xfd & ~xfeatures;
 
-	curfps = fpu_install_fpstate(fpu, newfps);
-
 	/* Do the final updates within the locked region */
 	xstate_init_xcomp_bv(&newfps->regs.xsave, newfps->xfeatures);
-	xfd_update_state(newfps);
 
+	if (guest_fpu) {
+		curfps = xchg(&guest_fpu->fpstate, newfps);
+		/* If curfps is active, update the FPU fpstate pointer */
+		if (fpu->fpstate == curfps)
+			fpu->fpstate = newfps;
+	} else {
+		curfps = xchg(&fpu->fpstate, newfps);
+	}
+
+	xfd_update_state(fpu->fpstate);
 	fpregs_unlock();
 
-	vfree(curfps);
+	/* Only free valloc'ed state */
+	if (curfps && curfps->is_valloc)
+		vfree(curfps);
+
 	return 0;
 }
 
@@ -1682,14 +1677,16 @@  static int xstate_request_perm(unsigned
 	return ret;
 }
 
-int xfd_enable_feature(u64 xfd_err)
+int __xfd_enable_feature(u64 xfd_err, struct fpu_guest *guest_fpu)
 {
 	u64 xfd_event = xfd_err & XFEATURE_MASK_USER_DYNAMIC;
+	struct fpu_state_perm *perm;
 	unsigned int ksize, usize;
 	struct fpu *fpu;
 
 	if (!xfd_event) {
-		pr_err_once("XFD: Invalid xfd error: %016llx\n", xfd_err);
+		if (!guest_fpu)
+			pr_err_once("XFD: Invalid xfd error: %016llx\n", xfd_err);
 		return 0;
 	}
 
@@ -1697,14 +1694,16 @@  int xfd_enable_feature(u64 xfd_err)
 	spin_lock_irq(&current->sighand->siglock);
 
 	/* If not permitted let it die */
-	if ((xstate_get_host_group_perm() & xfd_event) != xfd_event) {
+	if ((xstate_get_group_perm(!!guest_fpu) & xfd_event) != xfd_event) {
 		spin_unlock_irq(&current->sighand->siglock);
 		return -EPERM;
 	}
 
 	fpu = &current->group_leader->thread.fpu;
-	ksize = fpu->perm.__state_size;
-	usize = fpu->perm.__user_state_size;
+	perm = guest_fpu ? &fpu->guest_perm : &fpu->perm;
+	ksize = perm->__state_size;
+	usize = perm->__user_state_size;
+
 	/*
 	 * The feature is permitted. State size is sufficient.  Dropping
 	 * the lock is safe here even if more features are added from
@@ -1717,10 +1716,16 @@  int xfd_enable_feature(u64 xfd_err)
 	 * Try to allocate a new fpstate. If that fails there is no way
 	 * out.
 	 */
-	if (fpstate_realloc(xfd_event, ksize, usize))
+	if (fpstate_realloc(xfd_event, ksize, usize, guest_fpu))
 		return -EFAULT;
 	return 0;
 }
+
+int xfd_enable_feature(u64 xfd_err)
+{
+	return __xfd_enable_feature(xfd_err, NULL);
+}
+
 #else /* CONFIG_X86_64 */
 static inline int xstate_request_perm(unsigned long idx, bool guest)
 {
--- a/arch/x86/kernel/fpu/xstate.h
+++ b/arch/x86/kernel/fpu/xstate.h
@@ -60,6 +60,8 @@  extern void fpu__init_system_xstate(unsi
 
 extern void *get_xsave_addr(struct xregs_state *xsave, int xfeature_nr);
 
+extern int __xfd_enable_feature(u64 which, struct fpu_guest *guest_fpu);
+
 static inline u64 xfeatures_mask_supervisor(void)
 {
 	return fpu_kernel_cfg.max_features & XFEATURE_MASK_SUPERVISOR_SUPPORTED;