Message ID | 20200122194825.101240-1-michal.wajdeczko@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | drm/i915/huc: Fix error reported by I915_PARAM_HUC_STATUS | expand |
On 1/22/20 11:48 AM, Michal Wajdeczko wrote: > From commit 84b1ca2f0e68 ("drm/i915/uc: prefer intel_gt over i915 > in GuC/HuC paths") we stopped using HUC_STATUS error -ENODEV only > to indicate lack of HuC hardware and we started to use this error > also for all other cases when HuC was not in use or supported. > > Fix that by relying again on HAS_GT_UC macro, since currently > used function intel_huc_is_supported() is based on HuC firmware > support which could be unsupported also due to force disabled > GuC firmware. > > Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> > Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> > Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> > Cc: Tony Ye <tony.ye@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Do we need a fixes tag? Daniele > --- > drivers/gpu/drm/i915/gt/uc/intel_huc.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_huc.c b/drivers/gpu/drm/i915/gt/uc/intel_huc.c > index 32a069841c14..ad4d9e16a24e 100644 > --- a/drivers/gpu/drm/i915/gt/uc/intel_huc.c > +++ b/drivers/gpu/drm/i915/gt/uc/intel_huc.c > @@ -209,7 +209,7 @@ int intel_huc_check_status(struct intel_huc *huc) > intel_wakeref_t wakeref; > u32 status = 0; > > - if (!intel_huc_is_supported(huc)) > + if (!HAS_GT_UC(gt->i915)) > return -ENODEV; > > with_intel_runtime_pm(gt->uncore->rpm, wakeref) >
On Thu, 23 Jan 2020 03:43:04 +0100, Patchwork <patchwork@emeril.freedesktop.org> wrote: > == Series Details == > > Series: drm/i915/huc: Fix error reported by I915_PARAM_HUC_STATUS > URL : https://patchwork.freedesktop.org/series/72419/ > State : failure > > == Summary == > > CI Bug Log - changes from CI_DRM_7799 -> Patchwork_16219 > ==================================================== > > Summary > ------- > > **FAILURE** > > Serious unknown changes coming with Patchwork_16219 absolutely need to > be > verified manually. > If you think the reported changes have nothing to do with the changes > introduced in Patchwork_16219, please notify your bug team to allow > them > to document this new failure mode, which will reduce false positives > in CI. > > External URL: > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16219/index.html > > Possible new issues > ------------------- > > Here are the unknown changes that may have been introduced in > Patchwork_16219: > > ### IGT changes ### > > #### Possible regressions #### > > * igt@gem_exec_parallel@contexts: > - fi-byt-j1900: NOTRUN -> [FAIL][1] > [1]: > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_16219/fi-byt-j1900/igt@gem_exec_parallel@contexts.html no HuC here, unrelated
Quoting Daniele Ceraolo Spurio (2020-01-22 23:52:33) > > > On 1/22/20 11:48 AM, Michal Wajdeczko wrote: > > From commit 84b1ca2f0e68 ("drm/i915/uc: prefer intel_gt over i915 > > in GuC/HuC paths") we stopped using HUC_STATUS error -ENODEV only > > to indicate lack of HuC hardware and we started to use this error > > also for all other cases when HuC was not in use or supported. > > > > Fix that by relying again on HAS_GT_UC macro, since currently > > used function intel_huc_is_supported() is based on HuC firmware > > support which could be unsupported also due to force disabled > > GuC firmware. > > > > Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> > > Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> > > Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> > > Cc: Tony Ye <tony.ye@intel.com> > > Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Once upon a time did you (Michal) not argue we should indicate the lack of firmware in the error code? Something like if (!HAS_GT_UC(gt->i915)) return -ENODEV; if (!intel_huc_is_supported(huc)) return -ENOEXEC;
On Thu, 23 Jan 2020 16:02:17 +0100, Chris Wilson <chris@chris-wilson.co.uk> wrote: > Quoting Daniele Ceraolo Spurio (2020-01-22 23:52:33) >> >> >> On 1/22/20 11:48 AM, Michal Wajdeczko wrote: >> > From commit 84b1ca2f0e68 ("drm/i915/uc: prefer intel_gt over i915 >> > in GuC/HuC paths") we stopped using HUC_STATUS error -ENODEV only >> > to indicate lack of HuC hardware and we started to use this error >> > also for all other cases when HuC was not in use or supported. >> > >> > Fix that by relying again on HAS_GT_UC macro, since currently >> > used function intel_huc_is_supported() is based on HuC firmware >> > support which could be unsupported also due to force disabled >> > GuC firmware. >> > >> > Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> >> > Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> >> > Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> >> > Cc: Tony Ye <tony.ye@intel.com> >> >> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> > > Once upon a time did you (Michal) not argue we should indicate the lack > of firmware in the error code? Something like > > if (!HAS_GT_UC(gt->i915)) > return -ENODEV; > > if (!intel_huc_is_supported(huc)) > return -ENOEXEC; Yes, we discussed this here [1] together with [2] but we didn't conclude our discussion due to different opinions on how represent some states, in particular "manually disabled" state. In this patch I just wanted to restore old notation. But we can start new discussion, here is summary: ------------------+----------+----------+---------- HuC state | today* | option A | option B ------------------+----------+----------+---------- no HuC hardware | -ENODEV | -ENODEV | -ENODEV GuC fw disabled | 0 | 0 | -EOPNOTSUPP HuC fw disabled | 0 | 0 | -EOPNOTSUPP HuC fw missing | 0 | -ENOPKG | -ENOEXEC HuC fw error | 0 | -ENOEXEC | -ENOEXEC HuC fw fail | 0 | -EACCES | 0 HuC authenticated | 1 | 1 | 1 ------------------+----------+----------+---------- Note that all above should be compatible with media driver, which explicitly looks for no error and value 1 Michal [1] https://patchwork.freedesktop.org/patch/306419/?series=61001&rev=1 [2] https://patchwork.freedesktop.org/series/60800/#rev1
Quoting Michal Wajdeczko (2020-01-23 15:38:52) > On Thu, 23 Jan 2020 16:02:17 +0100, Chris Wilson > <chris@chris-wilson.co.uk> wrote: > > > Quoting Daniele Ceraolo Spurio (2020-01-22 23:52:33) > >> > >> > >> On 1/22/20 11:48 AM, Michal Wajdeczko wrote: > >> > From commit 84b1ca2f0e68 ("drm/i915/uc: prefer intel_gt over i915 > >> > in GuC/HuC paths") we stopped using HUC_STATUS error -ENODEV only > >> > to indicate lack of HuC hardware and we started to use this error > >> > also for all other cases when HuC was not in use or supported. > >> > > >> > Fix that by relying again on HAS_GT_UC macro, since currently > >> > used function intel_huc_is_supported() is based on HuC firmware > >> > support which could be unsupported also due to force disabled > >> > GuC firmware. > >> > > >> > Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> > >> > Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> > >> > Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> > >> > Cc: Tony Ye <tony.ye@intel.com> > >> > >> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> > > > > Once upon a time did you (Michal) not argue we should indicate the lack > > of firmware in the error code? Something like > > > > if (!HAS_GT_UC(gt->i915)) > > return -ENODEV; > > > > if (!intel_huc_is_supported(huc)) > > return -ENOEXEC; > > Yes, we discussed this here [1] together with [2] but we didn't > conclude our discussion due to different opinions on how represent > some states, in particular "manually disabled" state. > > In this patch I just wanted to restore old notation. > > But we can start new discussion, here is summary: > > ------------------+----------+----------+---------- > HuC state | today* | option A | option B > ------------------+----------+----------+---------- > no HuC hardware | -ENODEV | -ENODEV | -ENODEV > GuC fw disabled | 0 | 0 | -EOPNOTSUPP > HuC fw disabled | 0 | 0 | -EOPNOTSUPP > HuC fw missing | 0 | -ENOPKG | -ENOEXEC > HuC fw error | 0 | -ENOEXEC | -ENOEXEC > HuC fw fail | 0 | -EACCES | 0 > HuC authenticated | 1 | 1 | 1 > ------------------+----------+----------+---------- By fw fail, you mean we loaded the firmware (to our knowledge) correctly, but HUC_STATUS is not reported as valid? If so, I support option B. I like the idea of saying "no HuC" (machine too old) "no firmware" (user action, or lack thereof) 0 (fw unhappy) 1 (fw reports success) In between states for failures in fw loading? Not so sure. But I can see the nicety in distinguishing between lack of firmware and some random failure in loading the firmware (the former being user action required to rectify, command line parameter whatever and the latter being the firmware file is either invalid or a stray neutrino prevented loading). Imo the error messages should be about why we cannot probe/trust the HUC_STATUS register. If everything is setup correctly then the returned value should be from reading the register. I dislike only returning 1 if supported, and converting a valid read of 0 into another error. So Option B :) > Note that all above should be compatible with media driver, > which explicitly looks for no error and value 1 Cool. -Chris
Acked-by: Tony Ye <tony.ye@intel.com> On 1/22/2020 11:48 AM, Michal Wajdeczko wrote: > From commit 84b1ca2f0e68 ("drm/i915/uc: prefer intel_gt over i915 > in GuC/HuC paths") we stopped using HUC_STATUS error -ENODEV only > to indicate lack of HuC hardware and we started to use this error > also for all other cases when HuC was not in use or supported. > > Fix that by relying again on HAS_GT_UC macro, since currently > used function intel_huc_is_supported() is based on HuC firmware > support which could be unsupported also due to force disabled > GuC firmware. > > Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> > Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> > Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> > Cc: Tony Ye <tony.ye@intel.com> > --- > drivers/gpu/drm/i915/gt/uc/intel_huc.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_huc.c b/drivers/gpu/drm/i915/gt/uc/intel_huc.c > index 32a069841c14..ad4d9e16a24e 100644 > --- a/drivers/gpu/drm/i915/gt/uc/intel_huc.c > +++ b/drivers/gpu/drm/i915/gt/uc/intel_huc.c > @@ -209,7 +209,7 @@ int intel_huc_check_status(struct intel_huc *huc) > intel_wakeref_t wakeref; > u32 status = 0; > > - if (!intel_huc_is_supported(huc)) > + if (!HAS_GT_UC(gt->i915)) > return -ENODEV; > > with_intel_runtime_pm(gt->uncore->rpm, wakeref) >
On 1/23/2020 7:38 AM, Michal Wajdeczko wrote: > On Thu, 23 Jan 2020 16:02:17 +0100, Chris Wilson > <chris@chris-wilson.co.uk> wrote: > >> Quoting Daniele Ceraolo Spurio (2020-01-22 23:52:33) >>> >>> >>> On 1/22/20 11:48 AM, Michal Wajdeczko wrote: >>> > From commit 84b1ca2f0e68 ("drm/i915/uc: prefer intel_gt over i915 >>> > in GuC/HuC paths") we stopped using HUC_STATUS error -ENODEV only >>> > to indicate lack of HuC hardware and we started to use this error >>> > also for all other cases when HuC was not in use or supported. >>> > >>> > Fix that by relying again on HAS_GT_UC macro, since currently >>> > used function intel_huc_is_supported() is based on HuC firmware >>> > support which could be unsupported also due to force disabled >>> > GuC firmware. >>> > >>> > Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> >>> > Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> >>> > Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> >>> > Cc: Tony Ye <tony.ye@intel.com> >>> >>> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> >> >> Once upon a time did you (Michal) not argue we should indicate the lack >> of firmware in the error code? Something like >> >> if (!HAS_GT_UC(gt->i915)) >> return -ENODEV; >> >> if (!intel_huc_is_supported(huc)) >> return -ENOEXEC; > > Yes, we discussed this here [1] together with [2] but we didn't > conclude our discussion due to different opinions on how represent > some states, in particular "manually disabled" state. > > In this patch I just wanted to restore old notation. > > But we can start new discussion, here is summary: > > ------------------+----------+----------+---------- > HuC state | today* | option A | option B > ------------------+----------+----------+---------- > no HuC hardware | -ENODEV | -ENODEV | -ENODEV > GuC fw disabled | 0 | 0 | -EOPNOTSUPP > HuC fw disabled | 0 | 0 | -EOPNOTSUPP > HuC fw missing | 0 | -ENOPKG | -ENOEXEC > HuC fw error | 0 | -ENOEXEC | -ENOEXEC > HuC fw fail | 0 | -EACCES | 0 What is the difference of HuC fw error and HuC fw fail here? Regards, Tony > HuC authenticated | 1 | 1 | 1 > ------------------+----------+----------+---------- > > Note that all above should be compatible with media driver, > which explicitly looks for no error and value 1 > > Michal > > [1] https://patchwork.freedesktop.org/patch/306419/?series=61001&rev=1 > [2] https://patchwork.freedesktop.org/series/60800/#rev1
On Thu, 23 Jan 2020 19:26:58 +0100, Ye, Tony <tony.ye@intel.com> wrote: > > > On 1/23/2020 7:38 AM, Michal Wajdeczko wrote: >> On Thu, 23 Jan 2020 16:02:17 +0100, Chris Wilson >> <chris@chris-wilson.co.uk> wrote: >> >>> Quoting Daniele Ceraolo Spurio (2020-01-22 23:52:33) >>>> >>>> >>>> On 1/22/20 11:48 AM, Michal Wajdeczko wrote: >>>> > From commit 84b1ca2f0e68 ("drm/i915/uc: prefer intel_gt over i915 >>>> > in GuC/HuC paths") we stopped using HUC_STATUS error -ENODEV only >>>> > to indicate lack of HuC hardware and we started to use this error >>>> > also for all other cases when HuC was not in use or supported. >>>> > >>>> > Fix that by relying again on HAS_GT_UC macro, since currently >>>> > used function intel_huc_is_supported() is based on HuC firmware >>>> > support which could be unsupported also due to force disabled >>>> > GuC firmware. >>>> > >>>> > Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> >>>> > Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> >>>> > Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> >>>> > Cc: Tony Ye <tony.ye@intel.com> >>>> >>>> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> >>> >>> Once upon a time did you (Michal) not argue we should indicate the lack >>> of firmware in the error code? Something like >>> >>> if (!HAS_GT_UC(gt->i915)) >>> return -ENODEV; >>> >>> if (!intel_huc_is_supported(huc)) >>> return -ENOEXEC; >> Yes, we discussed this here [1] together with [2] but we didn't >> conclude our discussion due to different opinions on how represent >> some states, in particular "manually disabled" state. >> In this patch I just wanted to restore old notation. >> But we can start new discussion, here is summary: >> ------------------+----------+----------+---------- >> HuC state | today* | option A | option B >> ------------------+----------+----------+---------- >> no HuC hardware | -ENODEV | -ENODEV | -ENODEV >> GuC fw disabled | 0 | 0 | -EOPNOTSUPP >> HuC fw disabled | 0 | 0 | -EOPNOTSUPP >> HuC fw missing | 0 | -ENOPKG | -ENOEXEC >> HuC fw error | 0 | -ENOEXEC | -ENOEXEC >> HuC fw fail | 0 | -EACCES | 0 > > What is the difference of HuC fw error and HuC fw fail here? see corresponding internal fw status codes: INTEL_UC_FIRMWARE_ERROR, /* invalid format or version */ INTEL_UC_FIRMWARE_FAIL, /* failed to xfer or init/auth the fw */ > > Regards, > Tony > >> HuC authenticated | 1 | 1 | 1 >> ------------------+----------+----------+---------- >> Note that all above should be compatible with media driver, >> which explicitly looks for no error and value 1 >> Michal >> [1] https://patchwork.freedesktop.org/patch/306419/?series=61001&rev=1 >> [2] https://patchwork.freedesktop.org/series/60800/#rev1
On 1/23/2020 7:38 AM, Michal Wajdeczko wrote: > On Thu, 23 Jan 2020 16:02:17 +0100, Chris Wilson > <chris@chris-wilson.co.uk> wrote: > >> Quoting Daniele Ceraolo Spurio (2020-01-22 23:52:33) >>> >>> >>> On 1/22/20 11:48 AM, Michal Wajdeczko wrote: >>> > From commit 84b1ca2f0e68 ("drm/i915/uc: prefer intel_gt over i915 >>> > in GuC/HuC paths") we stopped using HUC_STATUS error -ENODEV only >>> > to indicate lack of HuC hardware and we started to use this error >>> > also for all other cases when HuC was not in use or supported. >>> > >>> > Fix that by relying again on HAS_GT_UC macro, since currently >>> > used function intel_huc_is_supported() is based on HuC firmware >>> > support which could be unsupported also due to force disabled >>> > GuC firmware. >>> > >>> > Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> >>> > Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> >>> > Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> >>> > Cc: Tony Ye <tony.ye@intel.com> >>> >>> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> >> >> Once upon a time did you (Michal) not argue we should indicate the lack >> of firmware in the error code? Something like >> >> if (!HAS_GT_UC(gt->i915)) >> return -ENODEV; >> >> if (!intel_huc_is_supported(huc)) >> return -ENOEXEC; > > Yes, we discussed this here [1] together with [2] but we didn't > conclude our discussion due to different opinions on how represent > some states, in particular "manually disabled" state. > > In this patch I just wanted to restore old notation. > > But we can start new discussion, here is summary: > > ------------------+----------+----------+---------- > HuC state | today* | option A | option B > ------------------+----------+----------+---------- > no HuC hardware | -ENODEV | -ENODEV | -ENODEV > GuC fw disabled | 0 | 0 | -EOPNOTSUPP > HuC fw disabled | 0 | 0 | -EOPNOTSUPP > HuC fw missing | 0 | -ENOPKG | -ENOEXEC > HuC fw error | 0 | -ENOEXEC | -ENOEXEC > HuC fw fail | 0 | -EACCES | 0 > HuC authenticated | 1 | 1 | 1 > ------------------+----------+----------+---------- > > Note that all above should be compatible with media driver, > which explicitly looks for no error and value 1 From user space perspective, e.g. the new igt huc_copy, option B looks like this: pg.param = I915_PARAM_HUC_STATUS; pg.value = &val; ret = ioctl(fd, DRM_IOCTL_I915_GETPARAM, &pg); ------------------+----------+----------+------------+---------- HuC state | ret | pg.value | errno | huc_copy ------------------+----------+----------+------------+---------- no HuC hardware | -1 | 0 | -ENODEV | SKIP GuC fw disabled | -1 | 0 | -EOPNOTSUPP| SKIP HuC fw disabled | -1 | 0 | -EOPNOTSUPP| SKIP HuC fw missing | -1 | 0 | -ENOEXEC | FAIL HuC fw error | -1 | 0 | -ENOEXEC | FAIL HuC fw fail | 0 | 0 | 0 | FAIL HuC authenticated | 0 | 1 | 0 | continue ------------------+----------+----------+------------+---------- It can distinguish the SKIP and FAIL conditions. But looks not elegant enough. The pg.value is wasted as it is not pushed back to user space when intel_huc_check_status() < 0. case I915_PARAM_HUC_STATUS: value = intel_huc_check_status(&i915->gt.uc.huc); if (value < 0) return value; break; Regards, Tony > > Michal > > [1] https://patchwork.freedesktop.org/patch/306419/?series=61001&rev=1 > [2] https://patchwork.freedesktop.org/series/60800/#rev1
On Thu, 23 Jan 2020 16:51:58 +0100, Chris Wilson <chris@chris-wilson.co.uk> wrote: > Quoting Michal Wajdeczko (2020-01-23 15:38:52) >> On Thu, 23 Jan 2020 16:02:17 +0100, Chris Wilson >> <chris@chris-wilson.co.uk> wrote: >> >> > Quoting Daniele Ceraolo Spurio (2020-01-22 23:52:33) >> >> >> >> >> >> On 1/22/20 11:48 AM, Michal Wajdeczko wrote: >> >> > From commit 84b1ca2f0e68 ("drm/i915/uc: prefer intel_gt over i915 >> >> > in GuC/HuC paths") we stopped using HUC_STATUS error -ENODEV only >> >> > to indicate lack of HuC hardware and we started to use this error >> >> > also for all other cases when HuC was not in use or supported. >> >> > >> >> > Fix that by relying again on HAS_GT_UC macro, since currently >> >> > used function intel_huc_is_supported() is based on HuC firmware >> >> > support which could be unsupported also due to force disabled >> >> > GuC firmware. >> >> > >> >> > Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> >> >> > Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> >> >> > Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> >> >> > Cc: Tony Ye <tony.ye@intel.com> >> >> >> >> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> >> > >> > Once upon a time did you (Michal) not argue we should indicate the >> lack >> > of firmware in the error code? Something like >> > >> > if (!HAS_GT_UC(gt->i915)) >> > return -ENODEV; >> > >> > if (!intel_huc_is_supported(huc)) >> > return -ENOEXEC; >> >> Yes, we discussed this here [1] together with [2] but we didn't >> conclude our discussion due to different opinions on how represent >> some states, in particular "manually disabled" state. >> >> In this patch I just wanted to restore old notation. >> >> But we can start new discussion, here is summary: >> >> ------------------+----------+----------+---------- >> HuC state | today* | option A | option B >> ------------------+----------+----------+---------- >> no HuC hardware | -ENODEV | -ENODEV | -ENODEV >> GuC fw disabled | 0 | 0 | -EOPNOTSUPP >> HuC fw disabled | 0 | 0 | -EOPNOTSUPP >> HuC fw missing | 0 | -ENOPKG | -ENOEXEC >> HuC fw error | 0 | -ENOEXEC | -ENOEXEC >> HuC fw fail | 0 | -EACCES | 0 >> HuC authenticated | 1 | 1 | 1 >> ------------------+----------+----------+---------- > > By fw fail, you mean we loaded the firmware (to our knowledge) > correctly, but HUC_STATUS is not reported as valid? > > If so, I support option B. I like the idea of saying > "no HuC" (machine too old) > "no firmware" (user action, or lack thereof) > 0 (fw unhappy) > 1 (fw reports success) > > In between states for failures in fw loading? Not so sure. But I can see > the nicety in distinguishing between lack of firmware and some random > failure in loading the firmware (the former being user action required > to rectify, command line parameter whatever and the latter being the > firmware file is either invalid or a stray neutrino prevented loading). > > Imo the error messages should be about why we cannot probe/trust the > HUC_STATUS register. If everything is setup correctly then the returned > value should be from reading the register. I dislike only returning 1 if > supported, and converting a valid read of 0 into another error. > > So Option B :) But I'm not sure that option B is consistent in error reporting, as "fw unhappy" is definitely an serious error but is represented as plain non-error "0" status, while "fw disabled" (user action) is treated as error ------------------+---------- HuC state | option B ------------------+---------- no HuC hardware | -ENODEV GuC fw disabled | -EOPNOTSUPP -> user decision, why error? HuC fw disabled | -EOPNOTSUPP -> user decision, why error? HuC fw missing | -ENOEXEC HuC fw error | -ENOEXEC HuC fw fail | 0 -> unlikely, but still fw/hw error HuC authenticated | 1 ------------------+---------- On other hand, option A treats all error conditions as errors, leaving status codes only for normal operations: disabled(0)/authenticated(1): ------------------+---------- HuC state | option A ------------------+---------- no HuC hardware | -ENODEV -> you shouldn't ask GuC fw disabled | 0 -> user decision, not an error HuC fw disabled | 0 -> user decision, not an error HuC fw missing | -ENOPKG -> fw not installed correctly HuC fw error | -ENOEXEC -> bad/wrong fw HuC fw fail | -EACCES -> fw/hw error HuC authenticated | 1 ------------------+---------- But since I'm not an active HuC user, will leave final decision to others. /Michal > >> Note that all above should be compatible with media driver, >> which explicitly looks for no error and value 1 > > Cool. > -Chris
On 1/27/2020 1:41 AM, Michal Wajdeczko wrote: > On Thu, 23 Jan 2020 16:51:58 +0100, Chris Wilson > <chris@chris-wilson.co.uk> wrote: > >> Quoting Michal Wajdeczko (2020-01-23 15:38:52) >>> On Thu, 23 Jan 2020 16:02:17 +0100, Chris Wilson >>> <chris@chris-wilson.co.uk> wrote: >>> >>> > Quoting Daniele Ceraolo Spurio (2020-01-22 23:52:33) >>> >> >>> >> >>> >> On 1/22/20 11:48 AM, Michal Wajdeczko wrote: >>> >> > From commit 84b1ca2f0e68 ("drm/i915/uc: prefer intel_gt over i915 >>> >> > in GuC/HuC paths") we stopped using HUC_STATUS error -ENODEV only >>> >> > to indicate lack of HuC hardware and we started to use this error >>> >> > also for all other cases when HuC was not in use or supported. >>> >> > >>> >> > Fix that by relying again on HAS_GT_UC macro, since currently >>> >> > used function intel_huc_is_supported() is based on HuC firmware >>> >> > support which could be unsupported also due to force disabled >>> >> > GuC firmware. >>> >> > >>> >> > Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> >>> >> > Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> >>> >> > Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> >>> >> > Cc: Tony Ye <tony.ye@intel.com> >>> >> >>> >> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> >>> > >>> > Once upon a time did you (Michal) not argue we should indicate the >>> lack >>> > of firmware in the error code? Something like >>> > >>> > if (!HAS_GT_UC(gt->i915)) >>> > return -ENODEV; >>> > >>> > if (!intel_huc_is_supported(huc)) >>> > return -ENOEXEC; >>> >>> Yes, we discussed this here [1] together with [2] but we didn't >>> conclude our discussion due to different opinions on how represent >>> some states, in particular "manually disabled" state. >>> >>> In this patch I just wanted to restore old notation. >>> >>> But we can start new discussion, here is summary: >>> >>> ------------------+----------+----------+---------- >>> HuC state | today* | option A | option B >>> ------------------+----------+----------+---------- >>> no HuC hardware | -ENODEV | -ENODEV | -ENODEV >>> GuC fw disabled | 0 | 0 | -EOPNOTSUPP >>> HuC fw disabled | 0 | 0 | -EOPNOTSUPP >>> HuC fw missing | 0 | -ENOPKG | -ENOEXEC >>> HuC fw error | 0 | -ENOEXEC | -ENOEXEC >>> HuC fw fail | 0 | -EACCES | 0 >>> HuC authenticated | 1 | 1 | 1 >>> ------------------+----------+----------+---------- >> >> By fw fail, you mean we loaded the firmware (to our knowledge) >> correctly, but HUC_STATUS is not reported as valid? >> >> If so, I support option B. I like the idea of saying >> "no HuC" (machine too old) >> "no firmware" (user action, or lack thereof) >> 0 (fw unhappy) >> 1 (fw reports success) >> >> In between states for failures in fw loading? Not so sure. But I can see >> the nicety in distinguishing between lack of firmware and some random >> failure in loading the firmware (the former being user action required >> to rectify, command line parameter whatever and the latter being the >> firmware file is either invalid or a stray neutrino prevented loading). >> >> Imo the error messages should be about why we cannot probe/trust the >> HUC_STATUS register. If everything is setup correctly then the returned >> value should be from reading the register. I dislike only returning 1 if >> supported, and converting a valid read of 0 into another error. >> >> So Option B :) > > But I'm not sure that option B is consistent in error reporting, as > "fw unhappy" is definitely an serious error but is represented as plain > non-error "0" status, while "fw disabled" (user action) is treated as error > > ------------------+---------- > HuC state | option B > ------------------+---------- > no HuC hardware | -ENODEV > GuC fw disabled | -EOPNOTSUPP -> user decision, why error? > HuC fw disabled | -EOPNOTSUPP -> user decision, why error? > HuC fw missing | -ENOEXEC > HuC fw error | -ENOEXEC > HuC fw fail | 0 -> unlikely, but still fw/hw error > HuC authenticated | 1 > ------------------+---------- > > On other hand, option A treats all error conditions as errors, leaving > status codes only for normal operations: disabled(0)/authenticated(1): > > ------------------+---------- > HuC state | option A > ------------------+---------- > no HuC hardware | -ENODEV -> you shouldn't ask > GuC fw disabled | 0 -> user decision, not an error > HuC fw disabled | 0 -> user decision, not an error > HuC fw missing | -ENOPKG -> fw not installed correctly > HuC fw error | -ENOEXEC -> bad/wrong fw > HuC fw fail | -EACCES -> fw/hw error > HuC authenticated | 1 > ------------------+---------- Vote for Option A. Regards, Tony > > But since I'm not an active HuC user, will leave final decision to others. > > /Michal > > >> >>> Note that all above should be compatible with media driver, >>> which explicitly looks for no error and value 1 >> >> Cool. >> -Chris
On 2/4/20 4:43 PM, Ye, Tony wrote: > > > On 1/27/2020 1:41 AM, Michal Wajdeczko wrote: >> On Thu, 23 Jan 2020 16:51:58 +0100, Chris Wilson >> <chris@chris-wilson.co.uk> wrote: >> >>> Quoting Michal Wajdeczko (2020-01-23 15:38:52) >>>> On Thu, 23 Jan 2020 16:02:17 +0100, Chris Wilson >>>> <chris@chris-wilson.co.uk> wrote: >>>> >>>> > Quoting Daniele Ceraolo Spurio (2020-01-22 23:52:33) >>>> >> >>>> >> >>>> >> On 1/22/20 11:48 AM, Michal Wajdeczko wrote: >>>> >> > From commit 84b1ca2f0e68 ("drm/i915/uc: prefer intel_gt over >>>> i915 >>>> >> > in GuC/HuC paths") we stopped using HUC_STATUS error -ENODEV only >>>> >> > to indicate lack of HuC hardware and we started to use this error >>>> >> > also for all other cases when HuC was not in use or supported. >>>> >> > >>>> >> > Fix that by relying again on HAS_GT_UC macro, since currently >>>> >> > used function intel_huc_is_supported() is based on HuC firmware >>>> >> > support which could be unsupported also due to force disabled >>>> >> > GuC firmware. >>>> >> > >>>> >> > Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> >>>> >> > Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> >>>> >> > Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> >>>> >> > Cc: Tony Ye <tony.ye@intel.com> >>>> >> >>>> >> Reviewed-by: Daniele Ceraolo Spurio >>>> <daniele.ceraolospurio@intel.com> >>>> > >>>> > Once upon a time did you (Michal) not argue we should indicate >>>> the lack >>>> > of firmware in the error code? Something like >>>> > >>>> > if (!HAS_GT_UC(gt->i915)) >>>> > return -ENODEV; >>>> > >>>> > if (!intel_huc_is_supported(huc)) >>>> > return -ENOEXEC; >>>> >>>> Yes, we discussed this here [1] together with [2] but we didn't >>>> conclude our discussion due to different opinions on how represent >>>> some states, in particular "manually disabled" state. >>>> >>>> In this patch I just wanted to restore old notation. >>>> >>>> But we can start new discussion, here is summary: >>>> >>>> ------------------+----------+----------+---------- >>>> HuC state | today* | option A | option B >>>> ------------------+----------+----------+---------- >>>> no HuC hardware | -ENODEV | -ENODEV | -ENODEV >>>> GuC fw disabled | 0 | 0 | -EOPNOTSUPP >>>> HuC fw disabled | 0 | 0 | -EOPNOTSUPP >>>> HuC fw missing | 0 | -ENOPKG | -ENOEXEC >>>> HuC fw error | 0 | -ENOEXEC | -ENOEXEC >>>> HuC fw fail | 0 | -EACCES | 0 >>>> HuC authenticated | 1 | 1 | 1 >>>> ------------------+----------+----------+---------- >>> >>> By fw fail, you mean we loaded the firmware (to our knowledge) >>> correctly, but HUC_STATUS is not reported as valid? >>> >>> If so, I support option B. I like the idea of saying >>> "no HuC" (machine too old) >>> "no firmware" (user action, or lack thereof) >>> 0 (fw unhappy) >>> 1 (fw reports success) >>> >>> In between states for failures in fw loading? Not so sure. But I can >>> see >>> the nicety in distinguishing between lack of firmware and some random >>> failure in loading the firmware (the former being user action required >>> to rectify, command line parameter whatever and the latter being the >>> firmware file is either invalid or a stray neutrino prevented loading). >>> >>> Imo the error messages should be about why we cannot probe/trust the >>> HUC_STATUS register. If everything is setup correctly then the returned >>> value should be from reading the register. I dislike only returning >>> 1 if >>> supported, and converting a valid read of 0 into another error. >>> >>> So Option B :) >> >> But I'm not sure that option B is consistent in error reporting, as >> "fw unhappy" is definitely an serious error but is represented as plain >> non-error "0" status, while "fw disabled" (user action) is treated as >> error >> >> ------------------+---------- >> HuC state | option B >> ------------------+---------- >> no HuC hardware | -ENODEV >> GuC fw disabled | -EOPNOTSUPP -> user decision, why error? >> HuC fw disabled | -EOPNOTSUPP -> user decision, why error? >> HuC fw missing | -ENOEXEC >> HuC fw error | -ENOEXEC >> HuC fw fail | 0 -> unlikely, but still fw/hw error >> HuC authenticated | 1 >> ------------------+---------- >> >> On other hand, option A treats all error conditions as errors, leaving >> status codes only for normal operations: disabled(0)/authenticated(1): >> >> ------------------+---------- >> HuC state | option A >> ------------------+---------- >> no HuC hardware | -ENODEV -> you shouldn't ask >> GuC fw disabled | 0 -> user decision, not an error >> HuC fw disabled | 0 -> user decision, not an error >> HuC fw missing | -ENOPKG -> fw not installed correctly >> HuC fw error | -ENOEXEC -> bad/wrong fw >> HuC fw fail | -EACCES -> fw/hw error >> HuC authenticated | 1 >> ------------------+---------- > > Vote for Option A. > > Regards, > Tony > Are we ok to move forward on this? Michal, are you working on updating the patch? -Rob >> >> But since I'm not an active HuC user, will leave final decision to >> others. >> >> /Michal >> >> >>> >>>> Note that all above should be compatible with media driver, >>>> which explicitly looks for no error and value 1 >>> >>> Cool. >>> -Chris > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
On Tue, 11 Feb 2020 18:53:05 +0100, Fosha, Robert M <robert.m.fosha@intel.com> wrote: > > > On 2/4/20 4:43 PM, Ye, Tony wrote: >> >> >> On 1/27/2020 1:41 AM, Michal Wajdeczko wrote: >>> On Thu, 23 Jan 2020 16:51:58 +0100, Chris Wilson >>> <chris@chris-wilson.co.uk> wrote: >>> >>>> Quoting Michal Wajdeczko (2020-01-23 15:38:52) >>>>> On Thu, 23 Jan 2020 16:02:17 +0100, Chris Wilson >>>>> <chris@chris-wilson.co.uk> wrote: >>>>> >>>>> > Quoting Daniele Ceraolo Spurio (2020-01-22 23:52:33) >>>>> >> >>>>> >> >>>>> >> On 1/22/20 11:48 AM, Michal Wajdeczko wrote: >>>>> >> > From commit 84b1ca2f0e68 ("drm/i915/uc: prefer intel_gt over >>>>> i915 >>>>> >> > in GuC/HuC paths") we stopped using HUC_STATUS error -ENODEV >>>>> only >>>>> >> > to indicate lack of HuC hardware and we started to use this >>>>> error >>>>> >> > also for all other cases when HuC was not in use or supported. >>>>> >> > >>>>> >> > Fix that by relying again on HAS_GT_UC macro, since currently >>>>> >> > used function intel_huc_is_supported() is based on HuC firmware >>>>> >> > support which could be unsupported also due to force disabled >>>>> >> > GuC firmware. >>>>> >> > >>>>> >> > Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> >>>>> >> > Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> >>>>> >> > Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> >>>>> >> > Cc: Tony Ye <tony.ye@intel.com> >>>>> >> >>>>> >> Reviewed-by: Daniele Ceraolo Spurio >>>>> <daniele.ceraolospurio@intel.com> >>>>> > >>>>> > Once upon a time did you (Michal) not argue we should indicate the >>>>> lack >>>>> > of firmware in the error code? Something like >>>>> > >>>>> > if (!HAS_GT_UC(gt->i915)) >>>>> > return -ENODEV; >>>>> > >>>>> > if (!intel_huc_is_supported(huc)) >>>>> > return -ENOEXEC; >>>>> >>>>> Yes, we discussed this here [1] together with [2] but we didn't >>>>> conclude our discussion due to different opinions on how represent >>>>> some states, in particular "manually disabled" state. >>>>> >>>>> In this patch I just wanted to restore old notation. >>>>> >>>>> But we can start new discussion, here is summary: >>>>> >>>>> ------------------+----------+----------+---------- >>>>> HuC state | today* | option A | option B >>>>> ------------------+----------+----------+---------- >>>>> no HuC hardware | -ENODEV | -ENODEV | -ENODEV >>>>> GuC fw disabled | 0 | 0 | -EOPNOTSUPP >>>>> HuC fw disabled | 0 | 0 | -EOPNOTSUPP >>>>> HuC fw missing | 0 | -ENOPKG | -ENOEXEC >>>>> HuC fw error | 0 | -ENOEXEC | -ENOEXEC >>>>> HuC fw fail | 0 | -EACCES | 0 >>>>> HuC authenticated | 1 | 1 | 1 >>>>> ------------------+----------+----------+---------- >>>> >>>> By fw fail, you mean we loaded the firmware (to our knowledge) >>>> correctly, but HUC_STATUS is not reported as valid? >>>> >>>> If so, I support option B. I like the idea of saying >>>> "no HuC" (machine too old) >>>> "no firmware" (user action, or lack thereof) >>>> 0 (fw unhappy) >>>> 1 (fw reports success) >>>> >>>> In between states for failures in fw loading? Not so sure. But I can >>>> see >>>> the nicety in distinguishing between lack of firmware and some random >>>> failure in loading the firmware (the former being user action required >>>> to rectify, command line parameter whatever and the latter being the >>>> firmware file is either invalid or a stray neutrino prevented >>>> loading). >>>> >>>> Imo the error messages should be about why we cannot probe/trust the >>>> HUC_STATUS register. If everything is setup correctly then the >>>> returned >>>> value should be from reading the register. I dislike only returning 1 >>>> if >>>> supported, and converting a valid read of 0 into another error. >>>> >>>> So Option B :) >>> >>> But I'm not sure that option B is consistent in error reporting, as >>> "fw unhappy" is definitely an serious error but is represented as plain >>> non-error "0" status, while "fw disabled" (user action) is treated as >>> error >>> >>> ------------------+---------- >>> HuC state | option B >>> ------------------+---------- >>> no HuC hardware | -ENODEV >>> GuC fw disabled | -EOPNOTSUPP -> user decision, why error? >>> HuC fw disabled | -EOPNOTSUPP -> user decision, why error? >>> HuC fw missing | -ENOEXEC >>> HuC fw error | -ENOEXEC >>> HuC fw fail | 0 -> unlikely, but still fw/hw error >>> HuC authenticated | 1 >>> ------------------+---------- >>> >>> On other hand, option A treats all error conditions as errors, leaving >>> status codes only for normal operations: disabled(0)/authenticated(1): >>> >>> ------------------+---------- >>> HuC state | option A >>> ------------------+---------- >>> no HuC hardware | -ENODEV -> you shouldn't ask >>> GuC fw disabled | 0 -> user decision, not an error >>> HuC fw disabled | 0 -> user decision, not an error >>> HuC fw missing | -ENOPKG -> fw not installed correctly >>> HuC fw error | -ENOEXEC -> bad/wrong fw >>> HuC fw fail | -EACCES -> fw/hw error >>> HuC authenticated | 1 >>> ------------------+---------- >> >> Vote for Option A. >> >> Regards, >> Tony >> > > Are we ok to move forward on this? Michal, are you working on updating > the patch? I can update the patch, but I don't know which option to implement: Tony votes for Option A, Chris for Option B, what's your choice? Michal > > -Rob > >>> >>> But since I'm not an active HuC user, will leave final decision to >>> others. >>> >>> /Michal >>> >>> >>>> >>>>> Note that all above should be compatible with media driver, >>>>> which explicitly looks for no error and value 1 >>>> >>>> Cool. >>>> -Chris >> _______________________________________________ >> Intel-gfx mailing list >> Intel-gfx@lists.freedesktop.org >> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
On 2/11/20 11:57 AM, Michal Wajdeczko wrote: > On Tue, 11 Feb 2020 18:53:05 +0100, Fosha, Robert M > <robert.m.fosha@intel.com> wrote: > >> >> >> On 2/4/20 4:43 PM, Ye, Tony wrote: >>> >>> >>> On 1/27/2020 1:41 AM, Michal Wajdeczko wrote: >>>> On Thu, 23 Jan 2020 16:51:58 +0100, Chris Wilson >>>> <chris@chris-wilson.co.uk> wrote: >>>> >>>>> Quoting Michal Wajdeczko (2020-01-23 15:38:52) >>>>>> On Thu, 23 Jan 2020 16:02:17 +0100, Chris Wilson >>>>>> <chris@chris-wilson.co.uk> wrote: >>>>>> >>>>>> > Quoting Daniele Ceraolo Spurio (2020-01-22 23:52:33) >>>>>> >> >>>>>> >> >>>>>> >> On 1/22/20 11:48 AM, Michal Wajdeczko wrote: >>>>>> >> > From commit 84b1ca2f0e68 ("drm/i915/uc: prefer intel_gt >>>>>> over i915 >>>>>> >> > in GuC/HuC paths") we stopped using HUC_STATUS error -ENODEV >>>>>> only >>>>>> >> > to indicate lack of HuC hardware and we started to use this >>>>>> error >>>>>> >> > also for all other cases when HuC was not in use or supported. >>>>>> >> > >>>>>> >> > Fix that by relying again on HAS_GT_UC macro, since currently >>>>>> >> > used function intel_huc_is_supported() is based on HuC firmware >>>>>> >> > support which could be unsupported also due to force disabled >>>>>> >> > GuC firmware. >>>>>> >> > >>>>>> >> > Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> >>>>>> >> > Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> >>>>>> >> > Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> >>>>>> >> > Cc: Tony Ye <tony.ye@intel.com> >>>>>> >> >>>>>> >> Reviewed-by: Daniele Ceraolo Spurio >>>>>> <daniele.ceraolospurio@intel.com> >>>>>> > >>>>>> > Once upon a time did you (Michal) not argue we should indicate >>>>>> the lack >>>>>> > of firmware in the error code? Something like >>>>>> > >>>>>> > if (!HAS_GT_UC(gt->i915)) >>>>>> > return -ENODEV; >>>>>> > >>>>>> > if (!intel_huc_is_supported(huc)) >>>>>> > return -ENOEXEC; >>>>>> >>>>>> Yes, we discussed this here [1] together with [2] but we didn't >>>>>> conclude our discussion due to different opinions on how represent >>>>>> some states, in particular "manually disabled" state. >>>>>> >>>>>> In this patch I just wanted to restore old notation. >>>>>> >>>>>> But we can start new discussion, here is summary: >>>>>> >>>>>> ------------------+----------+----------+---------- >>>>>> HuC state | today* | option A | option B >>>>>> ------------------+----------+----------+---------- >>>>>> no HuC hardware | -ENODEV | -ENODEV | -ENODEV >>>>>> GuC fw disabled | 0 | 0 | -EOPNOTSUPP >>>>>> HuC fw disabled | 0 | 0 | -EOPNOTSUPP >>>>>> HuC fw missing | 0 | -ENOPKG | -ENOEXEC >>>>>> HuC fw error | 0 | -ENOEXEC | -ENOEXEC >>>>>> HuC fw fail | 0 | -EACCES | 0 >>>>>> HuC authenticated | 1 | 1 | 1 >>>>>> ------------------+----------+----------+---------- >>>>> >>>>> By fw fail, you mean we loaded the firmware (to our knowledge) >>>>> correctly, but HUC_STATUS is not reported as valid? >>>>> >>>>> If so, I support option B. I like the idea of saying >>>>> "no HuC" (machine too old) >>>>> "no firmware" (user action, or lack thereof) >>>>> 0 (fw unhappy) >>>>> 1 (fw reports success) >>>>> >>>>> In between states for failures in fw loading? Not so sure. But I >>>>> can see >>>>> the nicety in distinguishing between lack of firmware and some random >>>>> failure in loading the firmware (the former being user action >>>>> required >>>>> to rectify, command line parameter whatever and the latter being the >>>>> firmware file is either invalid or a stray neutrino prevented >>>>> loading). >>>>> >>>>> Imo the error messages should be about why we cannot probe/trust the >>>>> HUC_STATUS register. If everything is setup correctly then the >>>>> returned >>>>> value should be from reading the register. I dislike only >>>>> returning 1 if >>>>> supported, and converting a valid read of 0 into another error. >>>>> >>>>> So Option B :) >>>> >>>> But I'm not sure that option B is consistent in error reporting, as >>>> "fw unhappy" is definitely an serious error but is represented as >>>> plain >>>> non-error "0" status, while "fw disabled" (user action) is treated >>>> as error >>>> >>>> ------------------+---------- >>>> HuC state | option B >>>> ------------------+---------- >>>> no HuC hardware | -ENODEV >>>> GuC fw disabled | -EOPNOTSUPP -> user decision, why error? >>>> HuC fw disabled | -EOPNOTSUPP -> user decision, why error? >>>> HuC fw missing | -ENOEXEC >>>> HuC fw error | -ENOEXEC >>>> HuC fw fail | 0 -> unlikely, but still fw/hw error >>>> HuC authenticated | 1 >>>> ------------------+---------- >>>> >>>> On other hand, option A treats all error conditions as errors, leaving >>>> status codes only for normal operations: disabled(0)/authenticated(1): >>>> >>>> ------------------+---------- >>>> HuC state | option A >>>> ------------------+---------- >>>> no HuC hardware | -ENODEV -> you shouldn't ask >>>> GuC fw disabled | 0 -> user decision, not an error >>>> HuC fw disabled | 0 -> user decision, not an error >>>> HuC fw missing | -ENOPKG -> fw not installed correctly >>>> HuC fw error | -ENOEXEC -> bad/wrong fw >>>> HuC fw fail | -EACCES -> fw/hw error >>>> HuC authenticated | 1 >>>> ------------------+---------- >>> >>> Vote for Option A. >>> >>> Regards, >>> Tony >>> >> >> Are we ok to move forward on this? Michal, are you working on >> updating the patch? > > I can update the patch, but I don't know which option to implement: > Tony votes for Option A, Chris for Option B, what's your choice? > > Michal > I don't have a preference and would trust both Tony and Chris over myself. However, if Tony is using HuC more I would vote for his prefered option. -Rob >> >> -Rob >> >>>> >>>> But since I'm not an active HuC user, will leave final decision to >>>> others. >>>> >>>> /Michal >>>> >>>> >>>>> >>>>>> Note that all above should be compatible with media driver, >>>>>> which explicitly looks for no error and value 1 >>>>> >>>>> Cool. >>>>> -Chris >>> _______________________________________________ >>> Intel-gfx mailing list >>> Intel-gfx@lists.freedesktop.org >>> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
On 2/12/2020 5:54 AM, Fosha, Robert M wrote: > > > On 2/11/20 11:57 AM, Michal Wajdeczko wrote: >> On Tue, 11 Feb 2020 18:53:05 +0100, Fosha, Robert M >> <robert.m.fosha@intel.com> wrote: >> >>> >>> >>> On 2/4/20 4:43 PM, Ye, Tony wrote: >>>> >>>> >>>> On 1/27/2020 1:41 AM, Michal Wajdeczko wrote: >>>>> On Thu, 23 Jan 2020 16:51:58 +0100, Chris Wilson >>>>> <chris@chris-wilson.co.uk> wrote: >>>>> >>>>>> Quoting Michal Wajdeczko (2020-01-23 15:38:52) >>>>>>> On Thu, 23 Jan 2020 16:02:17 +0100, Chris Wilson >>>>>>> <chris@chris-wilson.co.uk> wrote: >>>>>>> >>>>>>> > Quoting Daniele Ceraolo Spurio (2020-01-22 23:52:33) >>>>>>> >> >>>>>>> >> >>>>>>> >> On 1/22/20 11:48 AM, Michal Wajdeczko wrote: >>>>>>> >> > From commit 84b1ca2f0e68 ("drm/i915/uc: prefer intel_gt >>>>>>> over i915 >>>>>>> >> > in GuC/HuC paths") we stopped using HUC_STATUS error -ENODEV >>>>>>> only >>>>>>> >> > to indicate lack of HuC hardware and we started to use this >>>>>>> error >>>>>>> >> > also for all other cases when HuC was not in use or supported. >>>>>>> >> > >>>>>>> >> > Fix that by relying again on HAS_GT_UC macro, since currently >>>>>>> >> > used function intel_huc_is_supported() is based on HuC firmware >>>>>>> >> > support which could be unsupported also due to force disabled >>>>>>> >> > GuC firmware. >>>>>>> >> > >>>>>>> >> > Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> >>>>>>> >> > Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> >>>>>>> >> > Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> >>>>>>> >> > Cc: Tony Ye <tony.ye@intel.com> >>>>>>> >> >>>>>>> >> Reviewed-by: Daniele Ceraolo Spurio >>>>>>> <daniele.ceraolospurio@intel.com> >>>>>>> > >>>>>>> > Once upon a time did you (Michal) not argue we should indicate >>>>>>> the lack >>>>>>> > of firmware in the error code? Something like >>>>>>> > >>>>>>> > if (!HAS_GT_UC(gt->i915)) >>>>>>> > return -ENODEV; >>>>>>> > >>>>>>> > if (!intel_huc_is_supported(huc)) >>>>>>> > return -ENOEXEC; >>>>>>> >>>>>>> Yes, we discussed this here [1] together with [2] but we didn't >>>>>>> conclude our discussion due to different opinions on how represent >>>>>>> some states, in particular "manually disabled" state. >>>>>>> >>>>>>> In this patch I just wanted to restore old notation. >>>>>>> >>>>>>> But we can start new discussion, here is summary: >>>>>>> >>>>>>> ------------------+----------+----------+---------- >>>>>>> HuC state | today* | option A | option B >>>>>>> ------------------+----------+----------+---------- >>>>>>> no HuC hardware | -ENODEV | -ENODEV | -ENODEV >>>>>>> GuC fw disabled | 0 | 0 | -EOPNOTSUPP >>>>>>> HuC fw disabled | 0 | 0 | -EOPNOTSUPP >>>>>>> HuC fw missing | 0 | -ENOPKG | -ENOEXEC >>>>>>> HuC fw error | 0 | -ENOEXEC | -ENOEXEC >>>>>>> HuC fw fail | 0 | -EACCES | 0 >>>>>>> HuC authenticated | 1 | 1 | 1 >>>>>>> ------------------+----------+----------+---------- >>>>>> >>>>>> By fw fail, you mean we loaded the firmware (to our knowledge) >>>>>> correctly, but HUC_STATUS is not reported as valid? >>>>>> >>>>>> If so, I support option B. I like the idea of saying >>>>>> "no HuC" (machine too old) >>>>>> "no firmware" (user action, or lack thereof) >>>>>> 0 (fw unhappy) >>>>>> 1 (fw reports success) >>>>>> >>>>>> In between states for failures in fw loading? Not so sure. But I >>>>>> can see >>>>>> the nicety in distinguishing between lack of firmware and some random >>>>>> failure in loading the firmware (the former being user action >>>>>> required >>>>>> to rectify, command line parameter whatever and the latter being the >>>>>> firmware file is either invalid or a stray neutrino prevented >>>>>> loading). >>>>>> >>>>>> Imo the error messages should be about why we cannot probe/trust the >>>>>> HUC_STATUS register. If everything is setup correctly then the >>>>>> returned >>>>>> value should be from reading the register. I dislike only >>>>>> returning 1 if >>>>>> supported, and converting a valid read of 0 into another error. >>>>>> >>>>>> So Option B :) >>>>> >>>>> But I'm not sure that option B is consistent in error reporting, as >>>>> "fw unhappy" is definitely an serious error but is represented as >>>>> plain >>>>> non-error "0" status, while "fw disabled" (user action) is treated >>>>> as error >>>>> >>>>> ------------------+---------- >>>>> HuC state | option B >>>>> ------------------+---------- >>>>> no HuC hardware | -ENODEV >>>>> GuC fw disabled | -EOPNOTSUPP -> user decision, why error? >>>>> HuC fw disabled | -EOPNOTSUPP -> user decision, why error? >>>>> HuC fw missing | -ENOEXEC >>>>> HuC fw error | -ENOEXEC >>>>> HuC fw fail | 0 -> unlikely, but still fw/hw error >>>>> HuC authenticated | 1 >>>>> ------------------+---------- >>>>> >>>>> On other hand, option A treats all error conditions as errors, leaving >>>>> status codes only for normal operations: disabled(0)/authenticated(1): >>>>> >>>>> ------------------+---------- >>>>> HuC state | option A >>>>> ------------------+---------- >>>>> no HuC hardware | -ENODEV -> you shouldn't ask >>>>> GuC fw disabled | 0 -> user decision, not an error >>>>> HuC fw disabled | 0 -> user decision, not an error >>>>> HuC fw missing | -ENOPKG -> fw not installed correctly >>>>> HuC fw error | -ENOEXEC -> bad/wrong fw >>>>> HuC fw fail | -EACCES -> fw/hw error >>>>> HuC authenticated | 1 >>>>> ------------------+---------- >>>> >>>> Vote for Option A. >>>> >>>> Regards, >>>> Tony >>>> >>> >>> Are we ok to move forward on this? Michal, are you working on >>> updating the patch? >> >> I can update the patch, but I don't know which option to implement: >> Tony votes for Option A, Chris for Option B, what's your choice? >> >> Michal >> > > I don't have a preference and would trust both Tony and Chris over > myself. However, if Tony is using HuC more I would vote for his prefered > option. > > -Rob Option B takes enable_guc=0|1 (user doesn't want to load HuC FW) as error, and takes FW fail (failed to xfer or init/auth the fw) as 0. It would confuse the user. Regards, Tony > >>> >>> -Rob >>> >>>>> >>>>> But since I'm not an active HuC user, will leave final decision to >>>>> others. >>>>> >>>>> /Michal >>>>> >>>>> >>>>>> >>>>>>> Note that all above should be compatible with media driver, >>>>>>> which explicitly looks for no error and value 1 >>>>>> >>>>>> Cool. >>>>>> -Chris >>>> _______________________________________________ >>>> Intel-gfx mailing list >>>> Intel-gfx@lists.freedesktop.org >>>> https://lists.freedesktop.org/mailman/listinfo/intel-gfx >
> >>>>> ------------------+---------- > >>>>> HuC state | option B > >>>>> ------------------+---------- > >>>>> no HuC hardware | -ENODEV > >>>>> GuC fw disabled | -EOPNOTSUPP -> user decision, why error? > >>>>> HuC fw disabled | -EOPNOTSUPP -> user decision, why error? > >>>>> HuC fw missing | -ENOEXEC > >>>>> HuC fw error | -ENOEXEC > >>>>> HuC fw fail | 0 -> unlikely, but still fw/hw error > >>>>> HuC authenticated | 1 > >>>>> ------------------+---------- > >>>>> > >>>>> On other hand, option A treats all error conditions as errors, leaving > >>>>> status codes only for normal operations: disabled(0)/authenticated(1): > >>>>> > >>>>> ------------------+---------- > >>>>> HuC state | option A > >>>>> ------------------+---------- > >>>>> no HuC hardware | -ENODEV -> you shouldn't ask > >>>>> GuC fw disabled | 0 -> user decision, not an error > >>>>> HuC fw disabled | 0 -> user decision, not an error > >>>>> HuC fw missing | -ENOPKG -> fw not installed correctly > >>>>> HuC fw error | -ENOEXEC -> bad/wrong fw > >>>>> HuC fw fail | -EACCES -> fw/hw error > >>>>> HuC authenticated | 1 > >>>>> ------------------+---------- Let's go with Option B here. It correctly reports anything as error if it precedes the actual action of authentication and prevents it from happening. So the result one gets is 0/1 is for the authentication status which is the original intent here. Or alternatively an error if the authentication was not attempted. Regards, Joonas
On 28.02.2020 12:33, Joonas Lahtinen wrote: >>>>>>> ------------------+---------- >>>>>>> HuC state | option B >>>>>>> ------------------+---------- >>>>>>> no HuC hardware | -ENODEV >>>>>>> GuC fw disabled | -EOPNOTSUPP -> user decision, why error? >>>>>>> HuC fw disabled | -EOPNOTSUPP -> user decision, why error? >>>>>>> HuC fw missing | -ENOEXEC >>>>>>> HuC fw error | -ENOEXEC >>>>>>> HuC fw fail | 0 -> unlikely, but still fw/hw error >>>>>>> HuC authenticated | 1 >>>>>>> ------------------+---------- >>>>>>> >>>>>>> On other hand, option A treats all error conditions as errors, leaving >>>>>>> status codes only for normal operations: disabled(0)/authenticated(1): >>>>>>> >>>>>>> ------------------+---------- >>>>>>> HuC state | option A >>>>>>> ------------------+---------- >>>>>>> no HuC hardware | -ENODEV -> you shouldn't ask >>>>>>> GuC fw disabled | 0 -> user decision, not an error >>>>>>> HuC fw disabled | 0 -> user decision, not an error >>>>>>> HuC fw missing | -ENOPKG -> fw not installed correctly >>>>>>> HuC fw error | -ENOEXEC -> bad/wrong fw >>>>>>> HuC fw fail | -EACCES -> fw/hw error >>>>>>> HuC authenticated | 1 >>>>>>> ------------------+---------- > > Let's go with Option B here. Tony was ok with option A... > > It correctly reports anything as error if it precedes > the actual action of authentication and prevents it from > happening. Option A do the same, including correctly reporting corrupted/wrong firmware as error (as this is the most likely reason that prevents successful HuC authentication on given platform). The main difference here is user decision to disable HuC, that is now treated as the only one non-error reason that HuC is not available for use. > > So the result one gets is 0/1 is for the authentication > status which is the original intent here. I'm not sure that from user POV original intent was to explicitly check HuC authentication status but rather if HuC is available for use. Our redundant [1] check of fw authentication status during this ioctl is just internal detail, not necessary something user was interested from this action. Michal [1] as we are already checking this register during HuC fw load
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_huc.c b/drivers/gpu/drm/i915/gt/uc/intel_huc.c index 32a069841c14..ad4d9e16a24e 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_huc.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_huc.c @@ -209,7 +209,7 @@ int intel_huc_check_status(struct intel_huc *huc) intel_wakeref_t wakeref; u32 status = 0; - if (!intel_huc_is_supported(huc)) + if (!HAS_GT_UC(gt->i915)) return -ENODEV; with_intel_runtime_pm(gt->uncore->rpm, wakeref)
From commit 84b1ca2f0e68 ("drm/i915/uc: prefer intel_gt over i915 in GuC/HuC paths") we stopped using HUC_STATUS error -ENODEV only to indicate lack of HuC hardware and we started to use this error also for all other cases when HuC was not in use or supported. Fix that by relying again on HAS_GT_UC macro, since currently used function intel_huc_is_supported() is based on HuC firmware support which could be unsupported also due to force disabled GuC firmware. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Tony Ye <tony.ye@intel.com> --- drivers/gpu/drm/i915/gt/uc/intel_huc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)