diff mbox series

[v9,6/8] drm/i915/uapi/pxp: Add a GET_PARAM for PXP

Message ID 20230427234843.2886921-7-alan.previn.teres.alexis@intel.com (mailing list archive)
State New, archived
Headers show
Series drm/i915/pxp: Add MTL PXP Support | expand

Commit Message

Teres Alexis, Alan Previn April 27, 2023, 11:48 p.m. UTC
Because of the additional firmware, component-driver and
initialization depedencies required on MTL platform before a
PXP context can be created, UMD calling for PXP creation as a
way to get-caps can take a long time. An actual real world
customer stack has seen this happen in the 4-to-8 second range
after the kernel starts (which sees MESA's init appear in the
middle of this range as the compositor comes up). To avoid
unncessary delays experienced by the UMD for get-caps purposes,
add a GET_PARAM for I915_PARAM_PXP_SUPPORT.

However, some failures can still occur after all the depedencies
are met (such as firmware init flow failure, bios configurations
or SOC fusing not allowing PXP enablement). Those scenarios will
only be known to user space when it attempts creating a PXP context
and is documented in the GEM UAPI headers.

While making this change, create a helper that is common to both
GET_PARAM caller and intel_pxp_start since the latter does
similiar checks.

Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
---
 drivers/gpu/drm/i915/i915_getparam.c |  7 +++++++
 drivers/gpu/drm/i915/pxp/intel_pxp.c | 29 +++++++++++++++++++++-------
 drivers/gpu/drm/i915/pxp/intel_pxp.h |  1 +
 include/uapi/drm/i915_drm.h          | 19 ++++++++++++++++++
 4 files changed, 49 insertions(+), 7 deletions(-)

Comments

Teres Alexis, Alan Previn May 5, 2023, 5:30 a.m. UTC | #1
On Thu, 2023-04-27 at 16:48 -0700, Teres Alexis, Alan Previn wrote:
> Because of the additional firmware, component-driver and
> initialization depedencies required on MTL platform before a
> PXP context can be created, UMD calling for PXP creation as a
> way to get-caps can take a long time. An actual real world
> customer stack has seen this happen in the 4-to-8 second range
> after the kernel starts (which sees MESA's init appear in the
> middle of this range as the compositor comes up). To avoid
> unncessary delays experienced by the UMD for get-caps purposes,
> add a GET_PARAM for I915_PARAM_PXP_SUPPORT.
> 
alan:snip.
Progress update on the UMD side - I'm working on patch for PR here: 
https://gitlab.freedesktop.org/alan_previn_intel/mesa-alan-previn-features/-/commit/fb9d4fbfbef7dfd3f41df335cd31549fd39ddb57
but taking time to verify certain code paths
Jordan Justen May 5, 2023, 7:39 a.m. UTC | #2
On 2023-05-04 22:30:07, Teres Alexis, Alan Previn wrote:
> On Thu, 2023-04-27 at 16:48 -0700, Teres Alexis, Alan Previn wrote:
> > Because of the additional firmware, component-driver and
> > initialization depedencies required on MTL platform before a
> > PXP context can be created, UMD calling for PXP creation as a
> > way to get-caps can take a long time. An actual real world
> > customer stack has seen this happen in the 4-to-8 second range
> > after the kernel starts (which sees MESA's init appear in the
> > middle of this range as the compositor comes up). To avoid
> > unncessary delays experienced by the UMD for get-caps purposes,
> > add a GET_PARAM for I915_PARAM_PXP_SUPPORT.
> > 
> alan:snip.
> Progress update on the UMD side - I'm working on patch for PR here: 
> https://gitlab.freedesktop.org/alan_previn_intel/mesa-alan-previn-features/-/commit/fb9d4fbfbef7dfd3f41df335cd31549fd39ddb57
> but taking time to verify certain code paths

Just to confirm, if I915_PARAM_PXP_STATUS returns 2 ("will be ready
soon"), then it is basically certain that in a production environment,
then it will eventually return 1 meaning it's ready, right?

If this is correct, then I think that the change in
i915_gem_supports_protected_context() is good, and probably we can
skip the change in iris_create_hw_context().

Basically, we're timing out for multiple seconds either way. And, I'm
hoping that the kernel will eventually get the PXP init done and
create the context.

I think there's 2 cases of interest here.

First, and most likely, the application running doesn't care about
protected content. In this case we can quickly advertise the support,
but there will be no consequence because the application won't use the
feature.

Second, the application does care about protected content. In this
case we can quickly advertise the support, but if the feature is used
too quickly, then the context create might take a long time.

If I915_PARAM_PXP_STATUS returning 2 has a reasonable chance in a
production environment of eventually finding out that pxp can't work,
then perhaps more disussion is needed. I guess the worst case scenario
is no worse than today though. (Except it is still somewhat better,
because the common case would not involve protected content being
used by the application.)

Another option besides from the timeout loop in
iris_create_hw_context() might be to check I915_PARAM_PXP_STATUS after
the context create fails to tweak the debug message.

-Jordan
Jordan Justen May 10, 2023, 9:50 p.m. UTC | #3
...fixing some Cc email addresses I somehow mangled.

On 2023-05-10 12:40:07, Souza, Jose wrote:
> On Mon, 2023-05-08 at 17:49 +0000, Teres Alexis, Alan Previn wrote:
> > On Fri, 2023-05-05 at 00:39 -0700, Justen, Jordan L wrote:
> > > On 2023-05-04 22:30:07, Teres Alexis, Alan Previn wrote:
> > > > On Thu, 2023-04-27 at 16:48 -0700, Teres Alexis, Alan Previn wrote:
> > > > > Because of the additional firmware, component-driver and
> > > > > initialization depedencies required on MTL platform before a
> > > > > PXP context can be created, UMD calling for PXP creation as a
> > > > > way to get-caps can take a long time. An actual real world
> > > > > customer stack has seen this happen in the 4-to-8 second range
> > > > > after the kernel starts (which sees MESA's init appear in the
> > > > > middle of this range as the compositor comes up). To avoid
> > > > > unncessary delays experienced by the UMD for get-caps purposes,
> > > > > add a GET_PARAM for I915_PARAM_PXP_SUPPORT.
> > > > > 
> > > > alan:snip.
> > > > Progress update on the UMD side - I'm working on patch for PR here: 
> > > > https://gitlab.freedesktop.org/alan_previn_intel/mesa-alan-previn-features/-/commit/fb9d4fbfbef7dfd3f41df335cd31549fd39ddb57
> > > > but taking time to verify certain code paths
> > > 
> > > Just to confirm, if I915_PARAM_PXP_STATUS returns 2 ("will be ready
> > > soon"), then it is basically certain that in a production environment,
> > > then it will eventually return 1 meaning it's ready, right?
> > alan: yes - but not 100%. non-platform-state-machine could still
> > cause unexpected failures for example, [1] if hw was fused in a way
> > that doesnt permit PXP or [2] enabling certain BIOS debug knobs
> > doesnt allow PXP. However at the moment, there is no way for us
> > to know for sure without actually creating a protected context.
> > Of course having hw-fusing + bios-config that supports PXP have
> > always 100% succeeded for me.
> 
> In my opinion it is problematic mark that we support protected
> context but then it fails to create protected context.

This is why I asked if it was it was "basically certain that in a
production environment, then it will eventually return 1 meaning it's
ready". Alan's response was a little ambiguous on this point.

But, considering the number of applications that will need this
feature is low, combined with an extremely low chance that the kernel
will end up going from 2 (will be ready soon) to -ENODEV (will never
work), well, it seems worth considering advertising it with no delay
even if it later fails if used.

Arguably a transition from 2 to -ENODEV could be considered a kernel
bug, but it doesn't sound like Alan would agree. :) Maybe Alan would
agree to saying it's either a kernel, or system integration bug.

I think it'd also be ok if we didn't advertise support if an
application starts when the kernel is still in the 2 (will be ready
soon) state.

But, some environments might prefer to wait, so I think the kernel
uapi should stay as Alan has it now so we have the flexibility to
potentially accommodate this. (Perhaps with driconf, or a build flag,
or an env-var.)

-Jordan
Teres Alexis, Alan Previn May 10, 2023, 10 p.m. UTC | #4
alan:snip

> This is why I asked if it was it was "basically certain that in a
> production environment, then it will eventually return 1 meaning it's
> ready". Alan's response was a little ambiguous on this point.
alan: if we get a '2' and never transition to '1' - thats a kernel bug or
firmware / system issue.

> Arguably a transition from 2 to -ENODEV could be considered a kernel
> bug, but it doesn't sound like Alan would agree. :) Maybe Alan would
> agree to saying it's either a kernel, or system integration bug.
alan: agreed - that would be a kernel bug or a system integration bug.

I also agreed on the init-flow vs app-usage thoughts Jordan had.
That said MESA has many ways it can use this UAPI and we can discuss
that on the MESA patch.


hmmm.. so... ack anyone? [insert big hopeful smiley here]
apologies if I am asking too often.
Jordan Justen May 10, 2023, 10:24 p.m. UTC | #5
On 2023-05-10 15:00:03, Teres Alexis, Alan Previn wrote:
> 
> alan:snip
> 
> > This is why I asked if it was it was "basically certain that in a
> > production environment, then it will eventually return 1 meaning it's
> > ready". Alan's response was a little ambiguous on this point.
> alan: if we get a '2' and never transition to '1' - thats a kernel bug or
> firmware / system issue.
> 
> > Arguably a transition from 2 to -ENODEV could be considered a kernel
> > bug, but it doesn't sound like Alan would agree. :) Maybe Alan would
> > agree to saying it's either a kernel, or system integration bug.
> alan: agreed - that would be a kernel bug or a system integration bug.
> 
> I also agreed on the init-flow vs app-usage thoughts Jordan had.
> That said MESA has many ways it can use this UAPI and we can discuss
> that on the MESA patch.
> 
> 
> hmmm.. so... ack anyone? [insert big hopeful smiley here]
> apologies if I am asking too often.

Assuming that:

  2 = PXP feature is supported but should be ready soon (pending
      initialization of non-i915 system dependencies).

really means, "it'll be ready soon or there is a bug somewhere",

Acked-by: Jordan Justen <jordan.l.justen@intel.com>

If Mesa finds that it can't really rely on that, we may have to decide
on a different approach in how to use that return value.

Is it possible to hold on for another ~12 hours to see if Lionel wants
to Ack as well?

-Jordan
diff mbox series

Patch

diff --git a/drivers/gpu/drm/i915/i915_getparam.c b/drivers/gpu/drm/i915/i915_getparam.c
index 2238e096c957..6f11d7eaa91a 100644
--- a/drivers/gpu/drm/i915/i915_getparam.c
+++ b/drivers/gpu/drm/i915/i915_getparam.c
@@ -5,6 +5,8 @@ 
 #include "gem/i915_gem_mman.h"
 #include "gt/intel_engine_user.h"
 
+#include "pxp/intel_pxp.h"
+
 #include "i915_cmd_parser.h"
 #include "i915_drv.h"
 #include "i915_getparam.h"
@@ -102,6 +104,11 @@  int i915_getparam_ioctl(struct drm_device *dev, void *data,
 		if (value < 0)
 			return value;
 		break;
+	case I915_PARAM_PXP_STATUS:
+		value = intel_pxp_get_readiness_status(i915->pxp);
+		if (value < 0)
+			return value;
+		break;
 	case I915_PARAM_MMAP_GTT_VERSION:
 		/* Though we've started our numbering from 1, and so class all
 		 * earlier versions as 0, in effect their value is undefined as
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.c b/drivers/gpu/drm/i915/pxp/intel_pxp.c
index b600d68de2a4..f143eadbc253 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp.c
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp.c
@@ -358,23 +358,38 @@  void intel_pxp_end(struct intel_pxp *pxp)
 }
 
 /*
- * the arb session is restarted from the irq work when we receive the
- * termination completion interrupt
+ * this helper is used by both intel_pxp_start and by
+ * the GET_PARAM IOCTL that user space calls. Thus, the
+ * return values here should match the UAPI spec.
  */
-int intel_pxp_start(struct intel_pxp *pxp)
+int intel_pxp_get_readiness_status(struct intel_pxp *pxp)
 {
-	int ret = 0;
-
 	if (!intel_pxp_is_enabled(pxp))
 		return -ENODEV;
 
 	if (HAS_ENGINE(pxp->ctrl_gt, GSC0)) {
 		if (wait_for(intel_pxp_gsccs_is_ready_for_sessions(pxp), 250))
-			return -ENXIO;
+			return 2;
 	} else {
 		if (wait_for(pxp_component_bound(pxp), 250))
-			return -ENXIO;
+			return 2;
 	}
+	return 1;
+}
+
+/*
+ * the arb session is restarted from the irq work when we receive the
+ * termination completion interrupt
+ */
+int intel_pxp_start(struct intel_pxp *pxp)
+{
+	int ret = 0;
+
+	ret = intel_pxp_get_readiness_status(pxp);
+	if (ret < 0)
+		return ret;
+	else if (ret > 1)
+		return -EIO; /* per UAPI spec, user may retry later */
 
 	mutex_lock(&pxp->arb_mutex);
 
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.h b/drivers/gpu/drm/i915/pxp/intel_pxp.h
index 3ded0890cd27..17e6dbc86b38 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp.h
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp.h
@@ -26,6 +26,7 @@  void intel_pxp_fini_hw(struct intel_pxp *pxp);
 void intel_pxp_mark_termination_in_progress(struct intel_pxp *pxp);
 void intel_pxp_tee_end_arb_fw_session(struct intel_pxp *pxp, u32 arb_session_id);
 
+int intel_pxp_get_readiness_status(struct intel_pxp *pxp);
 int intel_pxp_start(struct intel_pxp *pxp);
 void intel_pxp_end(struct intel_pxp *pxp);
 
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index b15dd9cc2ffb..f4efbb2db0c5 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -771,6 +771,25 @@  typedef struct drm_i915_irq_wait {
  */
 #define I915_PARAM_OA_TIMESTAMP_FREQUENCY 57
 
+/*
+ * Query the status of PXP support in i915.
+ *
+ * The query can fail in the following scenarios with the listed error codes:
+ *     -ENODEV = PXP support is not available on the GPU device or in the
+ *               kernel due to missing component drivers or kernel configs.
+ *
+ * If the IOCTL is successful, the returned parameter will be set to one of
+ * the following values:
+ *     1 = PXP feature is supported and is ready for use.
+ *     2 = PXP feature is supported but should be ready soon (pending
+ *         initialization of non-i915 system dependencies).
+ *
+ * NOTE: When param is supported (positive return values), user space should
+ *       still refer to the GEM PXP context-creation UAPI header specs to be
+ *       aware of possible failure due to system state machine at the time.
+ */
+#define I915_PARAM_PXP_STATUS		 58
+
 /* Must be kept compact -- no holes and well documented */
 
 /**