diff mbox series

[v5,5/8] drm/i915/pxp: Add ARB session creation and cleanup

Message ID 20230214213844.2890382-6-alan.previn.teres.alexis@intel.com (mailing list archive)
State New, archived
Headers show
Series drm/i915/pxp: Add MTL PXP Support | expand

Commit Message

Alan Previn Feb. 14, 2023, 9:38 p.m. UTC
Add MTL's function for ARB session creation using PXP firmware
version 4.3 ABI structure format.

Also add MTL's function for ARB session invalidation but this
reuses PXP firmware version 4.2 ABI structure format.

Before checking the return status, look at the GSC-CS-Mem-Header's
pending-bit which means the GSC firmware is busy and we should
resubmit.

Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
---
 drivers/gpu/drm/i915/pxp/intel_pxp.c          |  9 +-
 .../drm/i915/pxp/intel_pxp_cmd_interface_43.h | 21 +++++
 drivers/gpu/drm/i915/pxp/intel_pxp_gsccs.c    | 92 +++++++++++++++++++
 drivers/gpu/drm/i915/pxp/intel_pxp_gsccs.h    |  3 +
 drivers/gpu/drm/i915/pxp/intel_pxp_session.c  | 11 ++-
 5 files changed, 132 insertions(+), 4 deletions(-)

Comments

Alan Previn Feb. 17, 2023, 3:12 a.m. UTC | #1
On Tue, 2023-02-14 at 13:38 -0800, Teres Alexis, Alan Previn wrote:
> Add MTL's function for ARB session creation using PXP firmware
> version 4.3 ABI structure format.
> 
> Also add MTL's function for ARB session invalidation but this
> reuses PXP firmware version 4.2 ABI structure format.
> 
> Before checking the return status, look at the GSC-CS-Mem-Header's
> pending-bit which means the GSC firmware is busy and we should
> resubmit.
> 
> Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
> ---

alan:snip

Not part of this patch today but a new modification is required that would end up going into this patch --->

So from the internal testing we are doing on MTL, i have noticed that the first time the GSC firmware
is requested to init the arb session (right after a cold-boot or  driver-reload-after-flr), it takes much longer.
This has resulted in the observation of the following problematic event flow:

1. app or igt calls gem-context-create to create a protected context (after a fresh boot or driver reload).
2. intel_pxp_start will begin the global teardown and recreation where:
	2-a: the first part (i.e. session teardown) is skipped (since arb session wasnt created before this)
        2-b: the second part (i.e. arb session init commands via the gsc firmware) does happen and takes a long time (on first time)
3. step 2 is queued thru a worker while the main call into intel_pxp_start continues to wait for the arb
    session to start and finally bails out with a timeout (back up through gem-context-create).
4. app retries again and now we get a second call that repeats step 1 while 2-b is still wrapping up.
    so depending on the race of this step 4 (step-1-recall) vs the completion of step 2-b, we could end up
    getting a 2nd teardown right (i.e. step 2-a going in) after the the first arb-session-creation completed
    ... eventhough in both cases app just wants the creation.

The simplest fix (with minimal code changes) would be to add a complementary "is_arb_creation_pending" flag
alongside the is_arb_valid flag - with both remainining protected by the arb-mutex. That said, we I'll respin rev6
with this fix along with other mutex fix on Patch4.
Alan Previn Feb. 23, 2023, 10:39 p.m. UTC | #2
On Tue, 2023-02-14 at 13:38 -0800, Teres Alexis, Alan Previn wrote:
> Add MTL's function for ARB session creation using PXP firmware
> version 4.3 ABI structure format.
> 
alan:snip
> 
> diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.c b/drivers/gpu/drm/i915/pxp/intel_pxp.c
> index aecc65b5da70..c25e9ff16b57 100644
> --- a/drivers/gpu/drm/i915/pxp/intel_pxp.c
> +++ b/drivers/gpu/drm/i915/pxp/intel_pxp.c
> @@ -353,8 +353,13 @@ int intel_pxp_start(struct intel_pxp *pxp)
>  	if (!intel_pxp_is_enabled(pxp))
>  		return -ENODEV;
>  
> -	if (wait_for(pxp_component_bound(pxp), 250))
> -		return -ENXIO;
> +	if (HAS_ENGINE(pxp->ctrl_gt, GSC0)) {
> +		if (wait_for(intel_uc_fw_is_running(&pxp->ctrl_gt->uc.gsc.fw), 250))
> +			return -ENXIO;
alan: As per offline conversation with Daniele, at a system level, we should actually need to
check if the huc-firmware has been authenticated via the function intel_huc_is_authenticated
(as opposed to if gsc fw is running as per this patch). We both realize that huc-authentication
is not available yet for MTL on upstream (but that function is already used on ADL/DG2).
Eventually when huc-authentication is posted for MTL, this series (hopefully merged by then)
would be aligned with that upcoming series. Thus will also make this change in upcoming
series.

> +	} else {
> +		if (wait_for(pxp_component_bound(pxp), 250))
> +			return -ENXIO;
Alan Previn Feb. 23, 2023, 11:27 p.m. UTC | #3
On Fri, 2023-02-17 at 03:12 +0000, Teres Alexis, Alan Previn wrote:
> On Tue, 2023-02-14 at 13:38 -0800, Teres Alexis, Alan Previn wrote:
> > Add MTL's function for ARB session creation using PXP firmware
> > version 4.3 ABI structure format.
> 
> alan:snip
> 
> Not part of this patch today but a new modification is required that would end up going into this patch --->
> 
> So from the internal testing we are doing on MTL, i have noticed that the first time the GSC firmware
> is requested to init the arb session (right after a cold-boot or  driver-reload-after-flr), it takes much longer.
> This has resulted in the observation of the following problematic event flow:
> 
> 1. app or igt calls gem-context-create to create a protected context (after a fresh boot or driver reload).
> 2. intel_pxp_start will begin the global teardown and recreation where:
> 	2-a: the first part (i.e. session teardown) is skipped (since arb session wasnt created before this)
>         2-b: the second part (i.e. arb session init commands via the gsc firmware) does happen and takes a long time (on first time)
> 3. step 2 is queued thru a worker while the main call into intel_pxp_start continues to wait for the arb
>     session to start and finally bails out with a timeout (back up through gem-context-create).
> 4. app retries again and now we get a second call that repeats step 1 while 2-b is still wrapping up.
>     so depending on the race of this step 4 (step-1-recall) vs the completion of step 2-b, we could end up
>     getting a 2nd teardown right (i.e. step 2-a going in) after the the first arb-session-creation completed
>     ... eventhough in both cases app just wants the creation.
> 
> The simplest fix (with minimal code changes) would be to add a complementary "is_arb_creation_pending" flag
> alongside the is_arb_valid flag - with both remainining protected by the arb-mutex. That said, we I'll respin rev6
> with this fix along with other mutex fix on Patch4.

After additional offline discussions with Daniele, we've decided against adding more complexity.
Instead we'll get the official timeout spec from the gsc-firmware and bump up the arb-session timeout
creation across the call-stack to ensure its sufficient and if it fails, we return -ENODEV indicating
we do not have PXP support. Although this will block the gem protected-context creation, it will not
block other apps, only the ones creating protected contexts which would end up waiting somewhere for
the gsc fw (from an e2e system level) no matter what design we employ so bumping out timeout with a 
hard -ENODEV seems like the most straight forward.
diff mbox series

Patch

diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.c b/drivers/gpu/drm/i915/pxp/intel_pxp.c
index aecc65b5da70..c25e9ff16b57 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp.c
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp.c
@@ -353,8 +353,13 @@  int intel_pxp_start(struct intel_pxp *pxp)
 	if (!intel_pxp_is_enabled(pxp))
 		return -ENODEV;
 
-	if (wait_for(pxp_component_bound(pxp), 250))
-		return -ENXIO;
+	if (HAS_ENGINE(pxp->ctrl_gt, GSC0)) {
+		if (wait_for(intel_uc_fw_is_running(&pxp->ctrl_gt->uc.gsc.fw), 250))
+			return -ENXIO;
+	} else {
+		if (wait_for(pxp_component_bound(pxp), 250))
+			return -ENXIO;
+	}
 
 	mutex_lock(&pxp->arb_mutex);
 
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_cmd_interface_43.h b/drivers/gpu/drm/i915/pxp/intel_pxp_cmd_interface_43.h
index b2523d6918c7..9089e02a8c2d 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp_cmd_interface_43.h
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp_cmd_interface_43.h
@@ -11,6 +11,7 @@ 
 
 /* PXP-Cmd-Op definitions */
 #define PXP43_CMDID_START_HUC_AUTH 0x0000003A
+#define PXP43_CMDID_INIT_SESSION 0x00000036
 
 /* PXP-Packet sizes for MTL's GSCCS-HECI instruction */
 #define PXP43_MAX_HECI_IN_SIZE (SZ_32K)
@@ -27,4 +28,24 @@  struct pxp43_start_huc_auth_out {
 	struct pxp_cmd_header header;
 } __packed;
 
+/* PXP-Input-Packet: Init PXP session */
+struct pxp43_create_arb_in {
+	struct pxp_cmd_header header;
+		/* header.stream_id fields for vesion 4.3 of Init PXP session: */
+		#define PXP43_INIT_SESSION_VALID BIT(0)
+		#define PXP43_INIT_SESSION_APPTYPE BIT(1)
+		#define PXP43_INIT_SESSION_APPID GENMASK(17, 2)
+	u32 protection_mode;
+		#define PXP43_INIT_SESSION_PROTECTION_ARB 0x2
+	u32 sub_session_id;
+	u32 init_flags;
+	u32 rsvd[12];
+} __packed;
+
+/* PXP-Input-Packet: Init PXP session */
+struct pxp43_create_arb_out {
+	struct pxp_cmd_header header;
+	u32 rsvd[8];
+} __packed;
+
 #endif /* __INTEL_PXP_FW_INTERFACE_43_H__ */
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_gsccs.c b/drivers/gpu/drm/i915/pxp/intel_pxp_gsccs.c
index 3cd28db830c1..002686cc7c8a 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp_gsccs.c
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp_gsccs.c
@@ -9,6 +9,7 @@ 
 #include "gt/uc/intel_gsc_uc_heci_cmd_submit.h"
 
 #include "i915_drv.h"
+#include "intel_pxp_cmd_interface_42.h"
 #include "intel_pxp_cmd_interface_43.h"
 #include "intel_pxp_gsccs.h"
 #include "intel_pxp_types.h"
@@ -121,6 +122,97 @@  static int gsccs_send_message(struct intel_pxp *pxp,
 	return ret;
 }
 
+int intel_pxp_gsccs_create_session(struct intel_pxp *pxp,
+				   int arb_session_id)
+{
+	struct drm_i915_private *i915 = pxp->ctrl_gt->i915;
+	struct gsccs_session_resources *exec =  &pxp->gsccs_res;
+	struct pxp43_create_arb_in msg_in = {0};
+	struct pxp43_create_arb_out msg_out = {0};
+	u64 gsc_session_retry = 0;
+	int ret, tries = 0;
+
+	/* get a unique host-session-handle (used later in HW cmds) at time of session creation */
+	get_random_bytes(&exec->host_session_handle, sizeof(exec->host_session_handle));
+
+	msg_in.header.api_version = PXP_APIVER(4, 3);
+	msg_in.header.command_id = PXP43_CMDID_INIT_SESSION;
+	msg_in.header.stream_id = (FIELD_PREP(PXP43_INIT_SESSION_APPID, arb_session_id) |
+				   FIELD_PREP(PXP43_INIT_SESSION_VALID, 1) |
+				   FIELD_PREP(PXP43_INIT_SESSION_APPTYPE, 0));
+	msg_in.header.buffer_len = sizeof(msg_in) - sizeof(msg_in.header);
+	msg_in.protection_mode = PXP43_INIT_SESSION_PROTECTION_ARB;
+
+	/*
+	 * Keep sending request if GSC firmware was busy.
+	 * Based on specs, we can expects a worst case pending-bit
+	 * delay of 2000 milisecs.
+	 */
+	do {
+		ret = gsccs_send_message(pxp,
+					 &msg_in, sizeof(msg_in),
+					 &msg_out, sizeof(msg_out), NULL,
+					 &gsc_session_retry);
+		/* Only try again if gsc says so */
+		if (ret != -EAGAIN)
+			break;
+
+		msleep(20);
+	} while (++tries < 100);
+
+	if (ret)
+		drm_warn(&i915->drm, "Failed to send gsccs msg for creating-session-%d: ret=[%d]\n",
+			 arb_session_id, ret);
+	else if (msg_out.header.status != 0x0)
+		drm_warn(&i915->drm, "PXP firmware failed on creating-session-%d: status=0x%08x\n",
+			 arb_session_id, msg_out.header.status);
+
+	return ret;
+}
+
+void intel_pxp_gsccs_end_arb_fw_session(struct intel_pxp *pxp, u32 session_id)
+{
+	struct drm_i915_private *i915 = pxp->ctrl_gt->i915;
+	struct pxp42_inv_stream_key_in msg_in = {0};
+	struct pxp42_inv_stream_key_out msg_out = {0};
+	int ret = 0, tries = 0;
+	u64 gsc_session_retry = 0;
+
+	memset(&msg_in, 0, sizeof(msg_in));
+	memset(&msg_out, 0, sizeof(msg_out));
+	msg_in.header.api_version = PXP_APIVER(4, 3);
+	msg_in.header.command_id = PXP42_CMDID_INVALIDATE_STREAM_KEY;
+	msg_in.header.buffer_len = sizeof(msg_in) - sizeof(msg_in.header);
+
+	msg_in.header.stream_id = FIELD_PREP(PXP_CMDHDR_EXTDATA_SESSION_VALID, 1);
+	msg_in.header.stream_id |= FIELD_PREP(PXP_CMDHDR_EXTDATA_APP_TYPE, 0);
+	msg_in.header.stream_id |= FIELD_PREP(PXP_CMDHDR_EXTDATA_SESSION_ID, session_id);
+
+	/*
+	 * Keep sending request if GSC firmware was busy.
+	 * Based on test data, we expects a worst case delay of 250 milisecs.
+	 */
+	do {
+		ret = gsccs_send_message(pxp,
+					 &msg_in, sizeof(msg_in),
+					 &msg_out, sizeof(msg_out), NULL,
+					 &gsc_session_retry);
+
+		/* Only try again if gsc says so */
+		if (ret != -EAGAIN)
+			break;
+
+		msleep(20);
+	} while (++tries < 100);
+
+	if (ret)
+		drm_warn(&i915->drm, "Failed to send gsccs msg for inv-stream-session-%d: ret=[%d]\n",
+			 session_id, ret);
+	else if (msg_out.header.status != 0x0)
+		drm_warn(&i915->drm, "PXP firmware failed on inv-stream-session-%d: status=0x%08x\n",
+			 session_id, msg_out.header.status);
+}
+
 static int
 gsccs_create_buffer(struct intel_gt *gt,
 		    const char *bufname, size_t size,
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_gsccs.h b/drivers/gpu/drm/i915/pxp/intel_pxp_gsccs.h
index 354ea9a8f940..5eb26ccc4170 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp_gsccs.h
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp_gsccs.h
@@ -14,6 +14,9 @@  struct intel_pxp;
 void intel_pxp_gsccs_fini(struct intel_pxp *pxp);
 int intel_pxp_gsccs_init(struct intel_pxp *pxp);
 
+int intel_pxp_gsccs_create_session(struct intel_pxp *pxp, int arb_session_id);
+void intel_pxp_gsccs_end_arb_fw_session(struct intel_pxp *pxp, u32 arb_session_id);
+
 #else
 static inline void intel_pxp_gsccs_fini(struct intel_pxp *pxp)
 {
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_session.c b/drivers/gpu/drm/i915/pxp/intel_pxp_session.c
index b8fa6df7141b..4ddf2ee60222 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp_session.c
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp_session.c
@@ -7,6 +7,7 @@ 
 
 #include "intel_pxp.h"
 #include "intel_pxp_cmd.h"
+#include "intel_pxp_gsccs.h"
 #include "intel_pxp_session.h"
 #include "intel_pxp_tee.h"
 #include "intel_pxp_types.h"
@@ -62,7 +63,10 @@  static int pxp_create_arb_session(struct intel_pxp *pxp)
 		return -EEXIST;
 	}
 
-	ret = intel_pxp_tee_cmd_create_arb_session(pxp, ARB_SESSION);
+	if (HAS_ENGINE(pxp->ctrl_gt, GSC0))
+		ret = intel_pxp_gsccs_create_session(pxp, ARB_SESSION);
+	else
+		ret = intel_pxp_tee_cmd_create_arb_session(pxp, ARB_SESSION);
 	if (ret) {
 		drm_err(&gt->i915->drm, "tee cmd for arb session creation failed\n");
 		return ret;
@@ -106,7 +110,10 @@  static int pxp_terminate_arb_session_and_global(struct intel_pxp *pxp)
 
 	intel_uncore_write(gt->uncore, KCR_GLOBAL_TERMINATE(pxp->kcr_base), 1);
 
-	intel_pxp_tee_end_arb_fw_session(pxp, ARB_SESSION);
+	if (HAS_ENGINE(gt, GSC0))
+		intel_pxp_gsccs_end_arb_fw_session(pxp, ARB_SESSION);
+	else
+		intel_pxp_tee_end_arb_fw_session(pxp, ARB_SESSION);
 
 	return ret;
 }