diff mbox series

[v19,030/130] KVM: TDX: Add helper functions to print TDX SEAMCALL error

Message ID 1259755bde3a07ec4dc2c78626fa348cf7323b33.1708933498.git.isaku.yamahata@intel.com (mailing list archive)
State New, archived
Headers show
Series [v19,001/130] x86/virt/tdx: Rename _offset to _member for TD_SYSINFO_MAP() macro | expand

Commit Message

Isaku Yamahata Feb. 26, 2024, 8:25 a.m. UTC
From: Isaku Yamahata <isaku.yamahata@intel.com>

Add helper functions to print out errors from the TDX module in a uniform
manner.

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>
Reviewed-by: Yuan Yao <yuan.yao@intel.com>
---
v19:
- dropped unnecessary include <asm/tdx.h>

v18:
- Added Reviewed-by Binbin.

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
---
 arch/x86/kvm/Makefile        |  2 +-
 arch/x86/kvm/vmx/tdx_error.c | 21 +++++++++++++++++++++
 arch/x86/kvm/vmx/tdx_ops.h   |  4 ++++
 3 files changed, 26 insertions(+), 1 deletion(-)
 create mode 100644 arch/x86/kvm/vmx/tdx_error.c

Comments

Huang, Kai March 20, 2024, 12:29 a.m. UTC | #1
On 26/02/2024 9:25 pm, isaku.yamahata@intel.com wrote:
> From: Isaku Yamahata <isaku.yamahata@intel.com>
> 
> Add helper functions to print out errors from the TDX module in a uniform
> manner.

Likely we need more information here.  See below.

> 
> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
> Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>
> Reviewed-by: Yuan Yao <yuan.yao@intel.com>
> ---
> v19:
> - dropped unnecessary include <asm/tdx.h>
> 
> v18:
> - Added Reviewed-by Binbin.

The tag doesn't show in the SoB chain.

> 
> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
> ---

[...]

> +void pr_tdx_error(u64 op, u64 error_code, const struct tdx_module_args *out)
> +{
> +	if (!out) {
> +		pr_err_ratelimited("SEAMCALL (0x%016llx) failed: 0x%016llx\n",
> +				   op, error_code);
> +		return;
> +	}

I think this is the reason you still want the @out in tdx_seamcall()?

But I am not sure either -- even if you want to have @out *here* -- why 
cannot you pass a NULL explicitly when you *know* the concerned SEAMCALL 
doesn't have a valid output?

> +
> +#define MSG	\
> +	"SEAMCALL (0x%016llx) failed: 0x%016llx RCX 0x%016llx RDX 0x%016llx R8 0x%016llx R9 0x%016llx R10 0x%016llx R11 0x%016llx\n"
> +	pr_err_ratelimited(MSG, op, error_code, out->rcx, out->rdx, out->r8,
> +			   out->r9, out->r10, out->r11);
> +}

Besides the regs that you are printing, there are more regs (R12-R15, 
RDI, RSI) in the structure.

It's not clear why you only print some, but not all.

AFAICT the VP.ENTER SEAMCALL can have all regs as valid output?

Anyway, that being said, you might need to put more text in 
changelog/comment to make this patch (at least more) reviewable.
Isaku Yamahata March 20, 2024, 9:50 p.m. UTC | #2
On Wed, Mar 20, 2024 at 01:29:07PM +1300,
"Huang, Kai" <kai.huang@intel.com> wrote:

> 
> 
> On 26/02/2024 9:25 pm, isaku.yamahata@intel.com wrote:
> > From: Isaku Yamahata <isaku.yamahata@intel.com>
> > 
> > Add helper functions to print out errors from the TDX module in a uniform
> > manner.
> 
> Likely we need more information here.  See below.
> 
> > 
> > Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
> > Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>
> > Reviewed-by: Yuan Yao <yuan.yao@intel.com>
> > ---
> > v19:
> > - dropped unnecessary include <asm/tdx.h>
> > 
> > v18:
> > - Added Reviewed-by Binbin.
> 
> The tag doesn't show in the SoB chain.
> 
> > 
> > Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
> > ---
> 
> [...]
> 
> > +void pr_tdx_error(u64 op, u64 error_code, const struct tdx_module_args *out)
> > +{
> > +	if (!out) {
> > +		pr_err_ratelimited("SEAMCALL (0x%016llx) failed: 0x%016llx\n",
> > +				   op, error_code);
> > +		return;
> > +	}
> 
> I think this is the reason you still want the @out in tdx_seamcall()?
> 
> But I am not sure either -- even if you want to have @out *here* -- why
> cannot you pass a NULL explicitly when you *know* the concerned SEAMCALL
> doesn't have a valid output?
> 
> > +
> > +#define MSG	\
> > +	"SEAMCALL (0x%016llx) failed: 0x%016llx RCX 0x%016llx RDX 0x%016llx R8 0x%016llx R9 0x%016llx R10 0x%016llx R11 0x%016llx\n"
> > +	pr_err_ratelimited(MSG, op, error_code, out->rcx, out->rdx, out->r8,
> > +			   out->r9, out->r10, out->r11);
> > +}
> 
> Besides the regs that you are printing, there are more regs (R12-R15, RDI,
> RSI) in the structure.
> 
> It's not clear why you only print some, but not all.
> 
> AFAICT the VP.ENTER SEAMCALL can have all regs as valid output?

Only those are used for SEAMCALLs except TDH.VP.ENTER. TDH.VP.ENTER is an
exception.

As discussed at [1], out can be eliminated. We will have only limited output.
If we go for that route, we'll have the two following functions.
Does it make sense?

void pr_tdx_error(u64 op, u64 error_code)
{
        pr_err_ratelimited("SEAMCALL (0x%016llx) failed: 0x%016llx\n",
                           op, error_code);
}

void pr_tdx_sept_error(u64 op, u64 error_code, const union tdx_sept_entry *entry,
		       const union tdx_sept_level_state *level_state)
{
#define MSG \
        "SEAMCALL (0x%016llx) failed: 0x%016llx entry 0x%016llx level_state 0x%016llx\n"
        pr_err_ratelimited(MSG, op, error_code, entry->raw, level_state->raw);
}


[1] https://lore.kernel.org/kvm/20240320213600.GI1994522@ls.amr.corp.intel.com/

> 
> Anyway, that being said, you might need to put more text in
> changelog/comment to make this patch (at least more) reviewable.
>
Huang, Kai March 20, 2024, 11:09 p.m. UTC | #3
> Does it make sense?
> 
> void pr_tdx_error(u64 op, u64 error_code)
> {
>          pr_err_ratelimited("SEAMCALL (0x%016llx) failed: 0x%016llx\n",
>                             op, error_code);
> }

Should we also have a _ret version?

void pr_seamcall_err(u64 op, u64 err)
{
	/* A comment to explain why using the _ratelimited() version? */
	pr_err_ratelimited(...);
}

void pr_seamcall_err_ret(u64 op, u64 err, struct tdx_module_args *arg)
{
	pr_err_seamcall(op, err);
	
	pr_err_ratelimited(...);
}

(Hmm... if you look at the tdx.c in TDX host, there's similar code 
there, and again, it was a little bit annoying when I did that..)

Again, if we just use seamcall_ret() for ALL SEAMCALLs except VP.ENTER, 
we can simply have one..

> 
> void pr_tdx_sept_error(u64 op, u64 error_code, const union tdx_sept_entry *entry,
> 		       const union tdx_sept_level_state *level_state)
> {
> #define MSG \
>          "SEAMCALL (0x%016llx) failed: 0x%016llx entry 0x%016llx level_state 0x%016llx\n"
>          pr_err_ratelimited(MSG, op, error_code, entry->raw, level_state->raw);
> }

A higher-level wrapper to print SEPT error is fine to me, but do it in a 
separate patch.
Isaku Yamahata March 21, 2024, 11:52 p.m. UTC | #4
On Thu, Mar 21, 2024 at 12:09:57PM +1300,
"Huang, Kai" <kai.huang@intel.com> wrote:

> > Does it make sense?
> > 
> > void pr_tdx_error(u64 op, u64 error_code)
> > {
> >          pr_err_ratelimited("SEAMCALL (0x%016llx) failed: 0x%016llx\n",
> >                             op, error_code);
> > }
> 
> Should we also have a _ret version?
> 
> void pr_seamcall_err(u64 op, u64 err)
> {
> 	/* A comment to explain why using the _ratelimited() version? */

Because KVM can hit successive seamcall erorrs e.g. during desutructing TD,
(it's unintentional sometimes), ratelimited version is preferred as safe guard.
For example, SEAMCALL on all or some LPs (TDH_MNG_KEY_FREEID) can fail at the
same time.  And the number of LPs can be hundreds.


> 	pr_err_ratelimited(...);
> }
> 
> void pr_seamcall_err_ret(u64 op, u64 err, struct tdx_module_args *arg)
> {
> 	pr_err_seamcall(op, err);
> 	
> 	pr_err_ratelimited(...);
> }
> 
> (Hmm... if you look at the tdx.c in TDX host, there's similar code there,
> and again, it was a little bit annoying when I did that..)
> 
> Again, if we just use seamcall_ret() for ALL SEAMCALLs except VP.ENTER, we
> can simply have one..

What about this?

void pr_seamcall_err_ret(u64 op, u64 err, struct tdx_module_args *arg)
{
        pr_err_ratelimited("SEAMCALL (0x%016llx) failed: 0x%016llx\n",
                           op, error_code);
        if (arg)	
        	pr_err_ratelimited(...);
}



> > void pr_tdx_sept_error(u64 op, u64 error_code, const union tdx_sept_entry *entry,
> > 		       const union tdx_sept_level_state *level_state)
> > {
> > #define MSG \
> >          "SEAMCALL (0x%016llx) failed: 0x%016llx entry 0x%016llx level_state 0x%016llx\n"
> >          pr_err_ratelimited(MSG, op, error_code, entry->raw, level_state->raw);
> > }
> 
> A higher-level wrapper to print SEPT error is fine to me, but do it in a
> separate patch.

Ok, Let's postpone custom version.
Huang, Kai March 22, 2024, 4:37 a.m. UTC | #5
On Thu, 2024-03-21 at 16:52 -0700, Isaku Yamahata wrote:
> On Thu, Mar 21, 2024 at 12:09:57PM +1300,
> "Huang, Kai" <kai.huang@intel.com> wrote:
> 
> > > Does it make sense?
> > > 
> > > void pr_tdx_error(u64 op, u64 error_code)
> > > {
> > >          pr_err_ratelimited("SEAMCALL (0x%016llx) failed: 0x%016llx\n",
> > >                             op, error_code);
> > > }
> > 
> > Should we also have a _ret version?
> > 
> > void pr_seamcall_err(u64 op, u64 err)
> > {
> > 	/* A comment to explain why using the _ratelimited() version? */
> 
> Because KVM can hit successive seamcall erorrs e.g. during desutructing TD,
> (it's unintentional sometimes), ratelimited version is preferred as safe guard.
> For example, SEAMCALL on all or some LPs (TDH_MNG_KEY_FREEID) can fail at the
> same time.  And the number of LPs can be hundreds.

I mean you certainly have a reason to use _ratelimited() version.  My point is
you at least explain it in a comment.

> 
> 
> > 	pr_err_ratelimited(...);
> > }
> > 
> > void pr_seamcall_err_ret(u64 op, u64 err, struct tdx_module_args *arg)
> > {
> > 	pr_err_seamcall(op, err);
> > 	
> > 	pr_err_ratelimited(...);
> > }
> > 
> > (Hmm... if you look at the tdx.c in TDX host, there's similar code there,
> > and again, it was a little bit annoying when I did that..)
> > 
> > Again, if we just use seamcall_ret() for ALL SEAMCALLs except VP.ENTER, we
> > can simply have one..
> 
> What about this?
> 
> void pr_seamcall_err_ret(u64 op, u64 err, struct tdx_module_args *arg)
> {
>         pr_err_ratelimited("SEAMCALL (0x%016llx) failed: 0x%016llx\n",
>                            op, error_code);
>         if (arg)	
>         	pr_err_ratelimited(...);
> }
> 

Fine to me.

Or call pr_seamcall_err() instead.  I don't care too much.
Huang, Kai April 24, 2024, 12:11 a.m. UTC | #6
On Mon, 2024-02-26 at 00:25 -0800, isaku.yamahata@intel.com wrote:
> --- a/arch/x86/kvm/vmx/tdx_ops.h
> +++ b/arch/x86/kvm/vmx/tdx_ops.h
> @@ -40,6 +40,10 @@ static inline u64 tdx_seamcall(u64 op, struct tdx_module_args *in,
>  	return ret;
>  }
>  
> +#ifdef CONFIG_INTEL_TDX_HOST
> +void pr_tdx_error(u64 op, u64 error_code, const struct tdx_module_args *out);
> +#endif
> +

Why this needs to be inside the CONFIG_INTEL_TDX_HOST while other
tdh_xxx() don't?

I suppose all tdh_xxx() together with this pr_tdx_error() should only be
called tdx.c, which is only built when CONFIG_INTEL_TDX_HOST is true?

In fact, tdx_seamcall() directly calls seamcall() and seamcall_ret(),
which are only present when CONFIG_INTEL_TDX_HOST is on.

So things are really confused here.  I do believe we should just remove
this CONFIG_INTEL_TDX_HOST around pr_tdx_error() so all functions in
"tdx_ops.h" should only be used in tdx.c.
Huang, Kai April 24, 2024, 11:06 a.m. UTC | #7
On Mon, 2024-02-26 at 00:25 -0800, isaku.yamahata@intel.com wrote:
> --- /dev/null
> +++ b/arch/x86/kvm/vmx/tdx_error.c
> @@ -0,0 +1,21 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* functions to record TDX SEAMCALL error */
> +
> +#include <linux/kernel.h>
> +#include <linux/bug.h>
> +

I don't see why the above two are needed, especially the giant
<linux/kernel.h>.

<linux/printk.h> should be sufficient for the current patch.
Isaku Yamahata April 26, 2024, 6:38 p.m. UTC | #8
On Wed, Apr 24, 2024 at 12:11:25AM +0000,
"Huang, Kai" <kai.huang@intel.com> wrote:

> On Mon, 2024-02-26 at 00:25 -0800, isaku.yamahata@intel.com wrote:
> > --- a/arch/x86/kvm/vmx/tdx_ops.h
> > +++ b/arch/x86/kvm/vmx/tdx_ops.h
> > @@ -40,6 +40,10 @@ static inline u64 tdx_seamcall(u64 op, struct tdx_module_args *in,
> >  	return ret;
> >  }
> >  
> > +#ifdef CONFIG_INTEL_TDX_HOST
> > +void pr_tdx_error(u64 op, u64 error_code, const struct tdx_module_args *out);
> > +#endif
> > +
> 
> Why this needs to be inside the CONFIG_INTEL_TDX_HOST while other
> tdh_xxx() don't?
> 
> I suppose all tdh_xxx() together with this pr_tdx_error() should only be
> called tdx.c, which is only built when CONFIG_INTEL_TDX_HOST is true?
> 
> In fact, tdx_seamcall() directly calls seamcall() and seamcall_ret(),
> which are only present when CONFIG_INTEL_TDX_HOST is on.
> 
> So things are really confused here.  I do believe we should just remove
> this CONFIG_INTEL_TDX_HOST around pr_tdx_error() so all functions in
> "tdx_ops.h" should only be used in tdx.c.

You're right, please go clean them up.
diff mbox series

Patch

diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
index 5b85ef84b2e9..44b0594da877 100644
--- a/arch/x86/kvm/Makefile
+++ b/arch/x86/kvm/Makefile
@@ -24,7 +24,7 @@  kvm-intel-y		+= vmx/vmx.o vmx/vmenter.o vmx/pmu_intel.o vmx/vmcs12.o \
 
 kvm-intel-$(CONFIG_X86_SGX_KVM)	+= vmx/sgx.o
 kvm-intel-$(CONFIG_KVM_HYPERV)	+= vmx/hyperv.o vmx/hyperv_evmcs.o
-kvm-intel-$(CONFIG_INTEL_TDX_HOST)	+= vmx/tdx.o
+kvm-intel-$(CONFIG_INTEL_TDX_HOST)	+= vmx/tdx.o vmx/tdx_error.o
 
 kvm-amd-y		+= svm/svm.o svm/vmenter.o svm/pmu.o svm/nested.o svm/avic.o \
 			   svm/sev.o
diff --git a/arch/x86/kvm/vmx/tdx_error.c b/arch/x86/kvm/vmx/tdx_error.c
new file mode 100644
index 000000000000..42fcabe1f6c7
--- /dev/null
+++ b/arch/x86/kvm/vmx/tdx_error.c
@@ -0,0 +1,21 @@ 
+// SPDX-License-Identifier: GPL-2.0
+/* functions to record TDX SEAMCALL error */
+
+#include <linux/kernel.h>
+#include <linux/bug.h>
+
+#include "tdx_ops.h"
+
+void pr_tdx_error(u64 op, u64 error_code, const struct tdx_module_args *out)
+{
+	if (!out) {
+		pr_err_ratelimited("SEAMCALL (0x%016llx) failed: 0x%016llx\n",
+				   op, error_code);
+		return;
+	}
+
+#define MSG	\
+	"SEAMCALL (0x%016llx) failed: 0x%016llx RCX 0x%016llx RDX 0x%016llx R8 0x%016llx R9 0x%016llx R10 0x%016llx R11 0x%016llx\n"
+	pr_err_ratelimited(MSG, op, error_code, out->rcx, out->rdx, out->r8,
+			   out->r9, out->r10, out->r11);
+}
diff --git a/arch/x86/kvm/vmx/tdx_ops.h b/arch/x86/kvm/vmx/tdx_ops.h
index c5bb165b260e..d80212b1daf3 100644
--- a/arch/x86/kvm/vmx/tdx_ops.h
+++ b/arch/x86/kvm/vmx/tdx_ops.h
@@ -40,6 +40,10 @@  static inline u64 tdx_seamcall(u64 op, struct tdx_module_args *in,
 	return ret;
 }
 
+#ifdef CONFIG_INTEL_TDX_HOST
+void pr_tdx_error(u64 op, u64 error_code, const struct tdx_module_args *out);
+#endif
+
 static inline u64 tdh_mng_addcx(hpa_t tdr, hpa_t addr)
 {
 	struct tdx_module_args in = {