diff mbox

[v3] drm/i915: Remove unsafe i915.enable_rc6

Message ID 20171026103200.20949-1-chris@chris-wilson.co.uk (mailing list archive)
State New, archived
Headers show

Commit Message

Chris Wilson Oct. 26, 2017, 10:32 a.m. UTC
It has been many years since the last confirmed sighting (and fix) of an
RC6 related bug (usually a system hang). Remove the parameter to stop
users from setting dangerous values, as they often set it during triage
and end up disabling the entire runtime pm instead (the option is not a
fine scalpel!).

Furthermore, it allows users to set known dangerous values which were
intended for testing and not for production use. For testing, we can
always patch in the required setting without having to expose ourselves
to random abuse.

v2: Fixup NEEDS_WaRsDisableCoarsePowerGating fumble, and document the
lack of ilk support better.
v3: Clear intel_info->rc6p if we don't support rc6 itself.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_drv.c     |   2 +-
 drivers/gpu/drm/i915/i915_drv.h     |   1 +
 drivers/gpu/drm/i915/i915_params.c  |   7 --
 drivers/gpu/drm/i915/i915_params.h  |   1 -
 drivers/gpu/drm/i915/i915_pci.c     |   3 +
 drivers/gpu/drm/i915/i915_sysfs.c   |  13 +++-
 drivers/gpu/drm/i915/intel_drv.h    |   5 --
 drivers/gpu/drm/i915/intel_guc.c    |   3 +-
 drivers/gpu/drm/i915/intel_pm.c     | 134 +++++++++++-------------------------
 drivers/gpu/drm/i915/intel_uncore.c |   3 -
 10 files changed, 57 insertions(+), 115 deletions(-)

Comments

Joonas Lahtinen Oct. 26, 2017, 2:33 p.m. UTC | #1
On Thu, 2017-10-26 at 11:32 +0100, Chris Wilson wrote:
> It has been many years since the last confirmed sighting (and fix) of an
> RC6 related bug (usually a system hang). Remove the parameter to stop
> users from setting dangerous values, as they often set it during triage
> and end up disabling the entire runtime pm instead (the option is not a
> fine scalpel!).
> 
> Furthermore, it allows users to set known dangerous values which were
> intended for testing and not for production use. For testing, we can
> always patch in the required setting without having to expose ourselves
> to random abuse.
> 
> v2: Fixup NEEDS_WaRsDisableCoarsePowerGating fumble, and document the
> lack of ilk support better.
> v3: Clear intel_info->rc6p if we don't support rc6 itself.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Jani Nikula <jani.nikula@intel.com>
> Cc: Imre Deak <imre.deak@intel.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>

<SNIP>

> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -3193,6 +3193,7 @@ intel_info(const struct drm_i915_private *dev_priv)
>  #define HAS_PSR(dev_priv)		 ((dev_priv)->info.has_psr)
>  #define HAS_RC6(dev_priv)		 ((dev_priv)->info.has_rc6)
>  #define HAS_RC6p(dev_priv)		 ((dev_priv)->info.has_rc6p)
> +#define HAS_RC6pp(dev_priv)		 (false)

Slap a comment at the end.

> +++ b/drivers/gpu/drm/i915/i915_pci.c
> @@ -216,6 +216,9 @@ static const struct intel_device_info intel_gm45_info __initconst = {
>  static const struct intel_device_info intel_ironlake_d_info __initconst = {
>  	GEN5_FEATURES,
>  	.platform = INTEL_IRONLAKE,
> +	/* ilk does support rc6, but we do not implement [power] contexts */
> +	.has_rc6 = 0,
> +

Extra newline.

<SNIP>

> -	if (IS_IVYBRIDGE(dev_priv))
> -		return (INTEL_RC6_ENABLE | INTEL_RC6p_ENABLE);
> +	/*
> +	 * We assume that we do not have any deep rc6 levels if we don't have
> +	 * have the previous rc6 level supporteditself, i.e. we use HAS_RC6()

"supported itself"

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Regards, Joonas
Daniele Ceraolo Spurio Oct. 27, 2017, 8:57 p.m. UTC | #2
On 26/10/17 03:32, Chris Wilson wrote:
> It has been many years since the last confirmed sighting (and fix) of an
> RC6 related bug (usually a system hang). Remove the parameter to stop
> users from setting dangerous values, as they often set it during triage
> and end up disabling the entire runtime pm instead (the option is not a
> fine scalpel!).
> 
> Furthermore, it allows users to set known dangerous values which were
> intended for testing and not for production use. For testing, we can
> always patch in the required setting without having to expose ourselves
> to random abuse.
> 
> v2: Fixup NEEDS_WaRsDisableCoarsePowerGating fumble, and document the
> lack of ilk support better.
> v3: Clear intel_info->rc6p if we don't support rc6 itself.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Jani Nikula <jani.nikula@intel.com>
> Cc: Imre Deak <imre.deak@intel.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> ---

I think that for execution/debug on early silicon we might still want 
the ability to turn features like RC6 off. Maybe we can add a debug 
kconfig to force info->has_rc6 = 0? Not a blocker to this patch but 
worth considering IMO.

Daniele
David Weinehall Oct. 30, 2017, 1 p.m. UTC | #3
On Fri, Oct 27, 2017 at 01:57:09PM -0700, Daniele Ceraolo Spurio wrote:
> 
> 
> On 26/10/17 03:32, Chris Wilson wrote:
> > It has been many years since the last confirmed sighting (and fix) of an
> > RC6 related bug (usually a system hang). Remove the parameter to stop
> > users from setting dangerous values, as they often set it during triage
> > and end up disabling the entire runtime pm instead (the option is not a
> > fine scalpel!).
> > 
> > Furthermore, it allows users to set known dangerous values which were
> > intended for testing and not for production use. For testing, we can
> > always patch in the required setting without having to expose ourselves
> > to random abuse.
> > 
> > v2: Fixup NEEDS_WaRsDisableCoarsePowerGating fumble, and document the
> > lack of ilk support better.
> > v3: Clear intel_info->rc6p if we don't support rc6 itself.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > Cc: Jani Nikula <jani.nikula@intel.com>
> > Cc: Imre Deak <imre.deak@intel.com>
> > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> > ---
> 
> I think that for execution/debug on early silicon we might still want the
> ability to turn features like RC6 off. Maybe we can add a debug kconfig to
> force info->has_rc6 = 0? Not a blocker to this patch but worth considering
> IMO.

Most of the BIOSes I've seen on our RVPs have had an option to disable
RC6.


Kind regards, David
Rodrigo Vivi Oct. 30, 2017, 5:48 p.m. UTC | #4
On Mon, Oct 30, 2017 at 01:00:51PM +0000, David Weinehall wrote:
> On Fri, Oct 27, 2017 at 01:57:09PM -0700, Daniele Ceraolo Spurio wrote:
> >
> >
> > On 26/10/17 03:32, Chris Wilson wrote:
> > > It has been many years since the last confirmed sighting (and fix) of an
> > > RC6 related bug (usually a system hang). Remove the parameter to stop
> > > users from setting dangerous values, as they often set it during triage
> > > and end up disabling the entire runtime pm instead (the option is not a
> > > fine scalpel!).
> > >
> > > Furthermore, it allows users to set known dangerous values which were
> > > intended for testing and not for production use. For testing, we can
> > > always patch in the required setting without having to expose ourselves
> > > to random abuse.
> > >
> > > v2: Fixup NEEDS_WaRsDisableCoarsePowerGating fumble, and document the
> > > lack of ilk support better.
> > > v3: Clear intel_info->rc6p if we don't support rc6 itself.
> > >
> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > > Cc: Jani Nikula <jani.nikula@intel.com>
> > > Cc: Imre Deak <imre.deak@intel.com>
> > > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > > Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> > > ---
> >
> > I think that for execution/debug on early silicon we might still want the
> > ability to turn features like RC6 off. Maybe we can add a debug kconfig to
> > force info->has_rc6 = 0? Not a blocker to this patch but worth considering
> > IMO.
>
> Most of the BIOSes I've seen on our RVPs have had an option to disable
> RC6.

BIOS option don't block our code to run and set some MMIOs.
Not sure how the GPU will behave on such cases.

I like the idea of removing some and keeping the parameters clean.
But there are few ones like RC6 and disable_power_wells that are very
useful on platform enabling and also when assisting others to debug issues.

For instance right now that we fixed RC6 on CNL someone told that
he believe seeing more hangs, so I immediately asked to boot with
i915.enable_rc6=0 to double check. It is easier and straighforward
to direct them to the unsafe param than to ask them to compile the code
with different options or to use some BIOS options that we are not sure.

Also on bug triage some options like this are helpful.

Also BIOS and compile are saved flags. So if you need to do a quick test
you have to save it, and then unsave later. Parameters are very convinient
for 1 boot only check.

Maybe we could clean and remove the different options and levels and
let it be just a boolean.

Thanks,
Rodrigo.

>
>
> Kind regards, David
Joonas Lahtinen Nov. 1, 2017, 12:07 p.m. UTC | #5
On Mon, 2017-10-30 at 10:48 -0700, Rodrigo Vivi wrote:
> On Mon, Oct 30, 2017 at 01:00:51PM +0000, David Weinehall wrote:
> > On Fri, Oct 27, 2017 at 01:57:09PM -0700, Daniele Ceraolo Spurio wrote:
> > > 
> > > 
> > > On 26/10/17 03:32, Chris Wilson wrote:
> > > > It has been many years since the last confirmed sighting (and fix) of an
> > > > RC6 related bug (usually a system hang). Remove the parameter to stop
> > > > users from setting dangerous values, as they often set it during triage
> > > > and end up disabling the entire runtime pm instead (the option is not a
> > > > fine scalpel!).
> > > > 
> > > > Furthermore, it allows users to set known dangerous values which were
> > > > intended for testing and not for production use. For testing, we can
> > > > always patch in the required setting without having to expose ourselves
> > > > to random abuse.
> > > > 
> > > > v2: Fixup NEEDS_WaRsDisableCoarsePowerGating fumble, and document the
> > > > lack of ilk support better.
> > > > v3: Clear intel_info->rc6p if we don't support rc6 itself.
> > > > 
> > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > > Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > > > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > > > Cc: Jani Nikula <jani.nikula@intel.com>
> > > > Cc: Imre Deak <imre.deak@intel.com>
> > > > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > > > Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> > > > ---
> > > 
> > > I think that for execution/debug on early silicon we might still want the
> > > ability to turn features like RC6 off. Maybe we can add a debug kconfig to
> > > force info->has_rc6 = 0? Not a blocker to this patch but worth considering
> > > IMO.
> > 
> > Most of the BIOSes I've seen on our RVPs have had an option to disable
> > RC6.
> 
> BIOS option don't block our code to run and set some MMIOs.
> Not sure how the GPU will behave on such cases.
> 
> I like the idea of removing some and keeping the parameters clean.
> But there are few ones like RC6 and disable_power_wells that are very
> useful on platform enabling and also when assisting others to debug issues.
> 
> For instance right now that we fixed RC6 on CNL someone told that
> he believe seeing more hangs, so I immediately asked to boot with
> i915.enable_rc6=0 to double check. It is easier and straighforward
> to direct them to the unsafe param than to ask them to compile the code
> with different options or to use some BIOS options that we are not sure.
> 
> Also on bug triage some options like this are helpful.
> 
> Also BIOS and compile are saved flags. So if you need to do a quick test
> you have to save it, and then unsave later. Parameters are very convinient
> for 1 boot only check.

It's convenient for sure, but the unsafe module parameters seems to be
finding their way into way too many HOWTOs, and from there to some
"productized" use-cases. Chris states that setting .enable_rc6=0 to
solving an issue on publicly shipping products has been some years ago,
so I don't see a need for carrying this.

We shouldn't allow the convenience of not having to change one line and
recompile kernel during development to affect the end-users who are
Googling how to get the best performance out of their hardware (I could
mention some distro here).

This seems the like the best option as I don't think introducing kernel
parameters that only exists on debug builds would be too convenient
either. It'd maybe just add more confusion.

Regards, Joonas
Ben Widawsky Nov. 1, 2017, 2:43 p.m. UTC | #6
On 17-11-01 14:07:28, Joonas Lahtinen wrote:
>On Mon, 2017-10-30 at 10:48 -0700, Rodrigo Vivi wrote:
>> On Mon, Oct 30, 2017 at 01:00:51PM +0000, David Weinehall wrote:
>> > On Fri, Oct 27, 2017 at 01:57:09PM -0700, Daniele Ceraolo Spurio wrote:
>> > >
>> > >
>> > > On 26/10/17 03:32, Chris Wilson wrote:
>> > > > It has been many years since the last confirmed sighting (and fix) of an
>> > > > RC6 related bug (usually a system hang). Remove the parameter to stop
>> > > > users from setting dangerous values, as they often set it during triage
>> > > > and end up disabling the entire runtime pm instead (the option is not a
>> > > > fine scalpel!).
>> > > >
>> > > > Furthermore, it allows users to set known dangerous values which were
>> > > > intended for testing and not for production use. For testing, we can
>> > > > always patch in the required setting without having to expose ourselves
>> > > > to random abuse.
>> > > >
>> > > > v2: Fixup NEEDS_WaRsDisableCoarsePowerGating fumble, and document the
>> > > > lack of ilk support better.
>> > > > v3: Clear intel_info->rc6p if we don't support rc6 itself.
>> > > >
>> > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>> > > > Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
>> > > > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
>> > > > Cc: Jani Nikula <jani.nikula@intel.com>
>> > > > Cc: Imre Deak <imre.deak@intel.com>
>> > > > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>> > > > Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>> > > > ---
>> > >
>> > > I think that for execution/debug on early silicon we might still want the
>> > > ability to turn features like RC6 off. Maybe we can add a debug kconfig to
>> > > force info->has_rc6 = 0? Not a blocker to this patch but worth considering
>> > > IMO.
>> >
>> > Most of the BIOSes I've seen on our RVPs have had an option to disable
>> > RC6.
>>
>> BIOS option don't block our code to run and set some MMIOs.
>> Not sure how the GPU will behave on such cases.
>>
>> I like the idea of removing some and keeping the parameters clean.
>> But there are few ones like RC6 and disable_power_wells that are very
>> useful on platform enabling and also when assisting others to debug issues.
>>
>> For instance right now that we fixed RC6 on CNL someone told that
>> he believe seeing more hangs, so I immediately asked to boot with
>> i915.enable_rc6=0 to double check. It is easier and straighforward
>> to direct them to the unsafe param than to ask them to compile the code
>> with different options or to use some BIOS options that we are not sure.
>>
>> Also on bug triage some options like this are helpful.
>>
>> Also BIOS and compile are saved flags. So if you need to do a quick test
>> you have to save it, and then unsave later. Parameters are very convinient
>> for 1 boot only check.
>
>It's convenient for sure, but the unsafe module parameters seems to be
>finding their way into way too many HOWTOs, and from there to some
>"productized" use-cases. Chris states that setting .enable_rc6=0 to
>solving an issue on publicly shipping products has been some years ago,
>so I don't see a need for carrying this.
>
>We shouldn't allow the convenience of not having to change one line and
>recompile kernel during development to affect the end-users who are
>Googling how to get the best performance out of their hardware (I could
>mention some distro here).
>
>This seems the like the best option as I don't think introducing kernel
>parameters that only exists on debug builds would be too convenient
>either. It'd maybe just add more confusion.
>
>Regards, Joonas

I believe the ability to disable RC6 is valuable not just for debugging
purposes. Folks with very latency sensitive workloads are often willing to
forego power savings. The real problem I see is that we don't test without rc6
in our setup, which indeed makes it unsafe. As such, I see the other option here
going back to the ability to toggle rc6 after load (either module parameter, or
make it sysfs), and actually run some subset of our workloads with RC6. I
suspect people will poop on that suggestion, but I figured I'd mention.
Joonas Lahtinen Nov. 1, 2017, 4:09 p.m. UTC | #7
+ Kimmo and Paul

On Wed, 2017-11-01 at 07:43 -0700, Ben Widawsky wrote:
> On 17-11-01 14:07:28, Joonas Lahtinen wrote:
> > On Mon, 2017-10-30 at 10:48 -0700, Rodrigo Vivi wrote:
> > > On Mon, Oct 30, 2017 at 01:00:51PM +0000, David Weinehall wrote:
> > > > On Fri, Oct 27, 2017 at 01:57:09PM -0700, Daniele Ceraolo Spurio wrote:
> > > > > 
> > > > > 
> > > > > On 26/10/17 03:32, Chris Wilson wrote:
> > > > > > It has been many years since the last confirmed sighting (and fix) of an
> > > > > > RC6 related bug (usually a system hang). Remove the parameter to stop
> > > > > > users from setting dangerous values, as they often set it during triage
> > > > > > and end up disabling the entire runtime pm instead (the option is not a
> > > > > > fine scalpel!).
> > > > > > 
> > > > > > Furthermore, it allows users to set known dangerous values which were
> > > > > > intended for testing and not for production use. For testing, we can
> > > > > > always patch in the required setting without having to expose ourselves
> > > > > > to random abuse.
> > > > > > 
> > > > > > v2: Fixup NEEDS_WaRsDisableCoarsePowerGating fumble, and document the
> > > > > > lack of ilk support better.
> > > > > > v3: Clear intel_info->rc6p if we don't support rc6 itself.
> > > > > > 
> > > > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > > > > Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > > > > > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > > > > > Cc: Jani Nikula <jani.nikula@intel.com>
> > > > > > Cc: Imre Deak <imre.deak@intel.com>
> > > > > > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > > > > > Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> > > > > > ---
> > > > > 
> > > > > I think that for execution/debug on early silicon we might still want the
> > > > > ability to turn features like RC6 off. Maybe we can add a debug kconfig to
> > > > > force info->has_rc6 = 0? Not a blocker to this patch but worth considering
> > > > > IMO.
> > > > 
> > > > Most of the BIOSes I've seen on our RVPs have had an option to disable
> > > > RC6.
> > > 
> > > BIOS option don't block our code to run and set some MMIOs.
> > > Not sure how the GPU will behave on such cases.
> > > 
> > > I like the idea of removing some and keeping the parameters clean.
> > > But there are few ones like RC6 and disable_power_wells that are very
> > > useful on platform enabling and also when assisting others to debug issues.
> > > 
> > > For instance right now that we fixed RC6 on CNL someone told that
> > > he believe seeing more hangs, so I immediately asked to boot with
> > > i915.enable_rc6=0 to double check. It is easier and straighforward
> > > to direct them to the unsafe param than to ask them to compile the code
> > > with different options or to use some BIOS options that we are not sure.
> > > 
> > > Also on bug triage some options like this are helpful.
> > > 
> > > Also BIOS and compile are saved flags. So if you need to do a quick test
> > > you have to save it, and then unsave later. Parameters are very convinient
> > > for 1 boot only check.
> > 
> > It's convenient for sure, but the unsafe module parameters seems to be
> > finding their way into way too many HOWTOs, and from there to some
> > "productized" use-cases. Chris states that setting .enable_rc6=0 to
> > solving an issue on publicly shipping products has been some years ago,
> > so I don't see a need for carrying this.
> > 
> > We shouldn't allow the convenience of not having to change one line and
> > recompile kernel during development to affect the end-users who are
> > Googling how to get the best performance out of their hardware (I could
> > mention some distro here).
> > 
> > This seems the like the best option as I don't think introducing kernel
> > parameters that only exists on debug builds would be too convenient
> > either. It'd maybe just add more confusion.
> > 
> > Regards, Joonas
> 
> I believe the ability to disable RC6 is valuable not just for debugging
> purposes. Folks with very latency sensitive workloads are often willing to
> forego power savings. The real problem I see is that we don't test without rc6
> in our setup, which indeed makes it unsafe. As such, I see the other option here
> going back to the ability to toggle rc6 after load (either module parameter, or
> make it sysfs), and actually run some subset of our workloads with RC6. I
> suspect people will poop on that suggestion, but I figured I'd mention.

I agree there, but by my understanding there's really no ask to support
the feature in upstream. And the original motive from Chris to drop the
feature is that it's unsafe as it currently is.

So unless we've got the resources to bring it back from the unsafe
zone, I think we should drop it like this patch proposes.

Regards, Joonas
Ben Widawsky Nov. 1, 2017, 4:21 p.m. UTC | #8
On 17-11-01 18:09:47, Joonas Lahtinen wrote:
>+ Kimmo and Paul
>
>On Wed, 2017-11-01 at 07:43 -0700, Ben Widawsky wrote:
>> On 17-11-01 14:07:28, Joonas Lahtinen wrote:
>> > On Mon, 2017-10-30 at 10:48 -0700, Rodrigo Vivi wrote:
>> > > On Mon, Oct 30, 2017 at 01:00:51PM +0000, David Weinehall wrote:
>> > > > On Fri, Oct 27, 2017 at 01:57:09PM -0700, Daniele Ceraolo Spurio wrote:
>> > > > >
>> > > > >
>> > > > > On 26/10/17 03:32, Chris Wilson wrote:
>> > > > > > It has been many years since the last confirmed sighting (and fix) of an
>> > > > > > RC6 related bug (usually a system hang). Remove the parameter to stop
>> > > > > > users from setting dangerous values, as they often set it during triage
>> > > > > > and end up disabling the entire runtime pm instead (the option is not a
>> > > > > > fine scalpel!).
>> > > > > >
>> > > > > > Furthermore, it allows users to set known dangerous values which were
>> > > > > > intended for testing and not for production use. For testing, we can
>> > > > > > always patch in the required setting without having to expose ourselves
>> > > > > > to random abuse.
>> > > > > >
>> > > > > > v2: Fixup NEEDS_WaRsDisableCoarsePowerGating fumble, and document the
>> > > > > > lack of ilk support better.
>> > > > > > v3: Clear intel_info->rc6p if we don't support rc6 itself.
>> > > > > >
>> > > > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>> > > > > > Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
>> > > > > > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
>> > > > > > Cc: Jani Nikula <jani.nikula@intel.com>
>> > > > > > Cc: Imre Deak <imre.deak@intel.com>
>> > > > > > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>> > > > > > Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>> > > > > > ---
>> > > > >
>> > > > > I think that for execution/debug on early silicon we might still want the
>> > > > > ability to turn features like RC6 off. Maybe we can add a debug kconfig to
>> > > > > force info->has_rc6 = 0? Not a blocker to this patch but worth considering
>> > > > > IMO.
>> > > >
>> > > > Most of the BIOSes I've seen on our RVPs have had an option to disable
>> > > > RC6.
>> > >
>> > > BIOS option don't block our code to run and set some MMIOs.
>> > > Not sure how the GPU will behave on such cases.
>> > >
>> > > I like the idea of removing some and keeping the parameters clean.
>> > > But there are few ones like RC6 and disable_power_wells that are very
>> > > useful on platform enabling and also when assisting others to debug issues.
>> > >
>> > > For instance right now that we fixed RC6 on CNL someone told that
>> > > he believe seeing more hangs, so I immediately asked to boot with
>> > > i915.enable_rc6=0 to double check. It is easier and straighforward
>> > > to direct them to the unsafe param than to ask them to compile the code
>> > > with different options or to use some BIOS options that we are not sure.
>> > >
>> > > Also on bug triage some options like this are helpful.
>> > >
>> > > Also BIOS and compile are saved flags. So if you need to do a quick test
>> > > you have to save it, and then unsave later. Parameters are very convinient
>> > > for 1 boot only check.
>> >
>> > It's convenient for sure, but the unsafe module parameters seems to be
>> > finding their way into way too many HOWTOs, and from there to some
>> > "productized" use-cases. Chris states that setting .enable_rc6=0 to
>> > solving an issue on publicly shipping products has been some years ago,
>> > so I don't see a need for carrying this.
>> >
>> > We shouldn't allow the convenience of not having to change one line and
>> > recompile kernel during development to affect the end-users who are
>> > Googling how to get the best performance out of their hardware (I could
>> > mention some distro here).
>> >
>> > This seems the like the best option as I don't think introducing kernel
>> > parameters that only exists on debug builds would be too convenient
>> > either. It'd maybe just add more confusion.
>> >
>> > Regards, Joonas
>>
>> I believe the ability to disable RC6 is valuable not just for debugging
>> purposes. Folks with very latency sensitive workloads are often willing to
>> forego power savings. The real problem I see is that we don't test without rc6
>> in our setup, which indeed makes it unsafe. As such, I see the other option here
>> going back to the ability to toggle rc6 after load (either module parameter, or
>> make it sysfs), and actually run some subset of our workloads with RC6. I
>> suspect people will poop on that suggestion, but I figured I'd mention.
>
>I agree there, but by my understanding there's really no ask to support
>the feature in upstream. And the original motive from Chris to drop the
>feature is that it's unsafe as it currently is.
>
>So unless we've got the resources to bring it back from the unsafe
>zone, I think we should drop it like this patch proposes.
>
>Regards, Joonas

Yep, I agree. One other option would be to move i915_forcewake_user to sysfs and
let them use that.
Rodrigo Vivi Nov. 1, 2017, 5:12 p.m. UTC | #9
On Wed, Nov 01, 2017 at 04:21:08PM +0000, Ben Widawsky wrote:
> On 17-11-01 18:09:47, Joonas Lahtinen wrote:
> > + Kimmo and Paul
> > 
> > On Wed, 2017-11-01 at 07:43 -0700, Ben Widawsky wrote:
> > > On 17-11-01 14:07:28, Joonas Lahtinen wrote:
> > > > On Mon, 2017-10-30 at 10:48 -0700, Rodrigo Vivi wrote:
> > > > > On Mon, Oct 30, 2017 at 01:00:51PM +0000, David Weinehall wrote:
> > > > > > On Fri, Oct 27, 2017 at 01:57:09PM -0700, Daniele Ceraolo Spurio wrote:
> > > > > > >
> > > > > > >
> > > > > > > On 26/10/17 03:32, Chris Wilson wrote:
> > > > > > > > It has been many years since the last confirmed sighting (and fix) of an
> > > > > > > > RC6 related bug (usually a system hang). Remove the parameter to stop
> > > > > > > > users from setting dangerous values, as they often set it during triage
> > > > > > > > and end up disabling the entire runtime pm instead (the option is not a
> > > > > > > > fine scalpel!).
> > > > > > > >
> > > > > > > > Furthermore, it allows users to set known dangerous values which were
> > > > > > > > intended for testing and not for production use. For testing, we can
> > > > > > > > always patch in the required setting without having to expose ourselves
> > > > > > > > to random abuse.
> > > > > > > >
> > > > > > > > v2: Fixup NEEDS_WaRsDisableCoarsePowerGating fumble, and document the
> > > > > > > > lack of ilk support better.
> > > > > > > > v3: Clear intel_info->rc6p if we don't support rc6 itself.
> > > > > > > >
> > > > > > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > > > > > > Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > > > > > > > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > > > > > > > Cc: Jani Nikula <jani.nikula@intel.com>
> > > > > > > > Cc: Imre Deak <imre.deak@intel.com>
> > > > > > > > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > > > > > > > Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> > > > > > > > ---
> > > > > > >
> > > > > > > I think that for execution/debug on early silicon we might still want the
> > > > > > > ability to turn features like RC6 off. Maybe we can add a debug kconfig to
> > > > > > > force info->has_rc6 = 0? Not a blocker to this patch but worth considering
> > > > > > > IMO.
> > > > > >
> > > > > > Most of the BIOSes I've seen on our RVPs have had an option to disable
> > > > > > RC6.
> > > > >
> > > > > BIOS option don't block our code to run and set some MMIOs.
> > > > > Not sure how the GPU will behave on such cases.
> > > > >
> > > > > I like the idea of removing some and keeping the parameters clean.
> > > > > But there are few ones like RC6 and disable_power_wells that are very
> > > > > useful on platform enabling and also when assisting others to debug issues.
> > > > >
> > > > > For instance right now that we fixed RC6 on CNL someone told that
> > > > > he believe seeing more hangs, so I immediately asked to boot with
> > > > > i915.enable_rc6=0 to double check. It is easier and straighforward
> > > > > to direct them to the unsafe param than to ask them to compile the code
> > > > > with different options or to use some BIOS options that we are not sure.
> > > > >
> > > > > Also on bug triage some options like this are helpful.
> > > > >
> > > > > Also BIOS and compile are saved flags. So if you need to do a quick test
> > > > > you have to save it, and then unsave later. Parameters are very convinient
> > > > > for 1 boot only check.
> > > >
> > > > It's convenient for sure, but the unsafe module parameters seems to be
> > > > finding their way into way too many HOWTOs, and from there to some
> > > > "productized" use-cases. Chris states that setting .enable_rc6=0 to
> > > > solving an issue on publicly shipping products has been some years ago,
> > > > so I don't see a need for carrying this.
> > > >
> > > > We shouldn't allow the convenience of not having to change one line and
> > > > recompile kernel during development to affect the end-users who are
> > > > Googling how to get the best performance out of their hardware (I could
> > > > mention some distro here).
> > > >
> > > > This seems the like the best option as I don't think introducing kernel
> > > > parameters that only exists on debug builds would be too convenient
> > > > either. It'd maybe just add more confusion.
> > > >
> > > > Regards, Joonas
> > > 
> > > I believe the ability to disable RC6 is valuable not just for debugging
> > > purposes. Folks with very latency sensitive workloads are often willing to
> > > forego power savings. The real problem I see is that we don't test without rc6
> > > in our setup, which indeed makes it unsafe. As such, I see the other option here
> > > going back to the ability to toggle rc6 after load (either module parameter, or
> > > make it sysfs), and actually run some subset of our workloads with RC6. I
> > > suspect people will poop on that suggestion, but I figured I'd mention.
> > 
> > I agree there, but by my understanding there's really no ask to support
> > the feature in upstream. And the original motive from Chris to drop the
> > feature is that it's unsafe as it currently is.
> > 
> > So unless we've got the resources to bring it back from the unsafe
> > zone, I think we should drop it like this patch proposes.
> > 
> > Regards, Joonas
> 
> Yep, I agree. One other option would be to move i915_forcewake_user to sysfs and
> let them use that.

Well, I won't try to block that. I just put my 2 cents that I believe it is a very
useful parameter.

It wasn't that long ago the last time that we needed the flag to allow
end users to have a functional machine: https://plus.google.com/+JonMasters/posts/BqWLEjenLKv.

or to debug a related issue:
https://bugzilla.redhat.com/show_bug.cgi?id=1440988
https://bugzilla.kernel.org/show_bug.cgi?id=116431

Although date on few seems over than 1 year. We need to consider that
that was our latest new gpu... gen9.

If products are recommending the use of enable_rc6=0 I can see they
adding the patch to disable that. Effect is the same and our convenience is gone.

But again, just my view here. Not a nack ;)

Thanks,
Rodrigo.

> 
> -- 
> Ben Widawsky, Intel Open Source Technology Center
Jani Nikula Nov. 2, 2017, 8:06 a.m. UTC | #10
On Wed, 01 Nov 2017, Rodrigo Vivi <rodrigo.vivi@intel.com> wrote:
> On Wed, Nov 01, 2017 at 04:21:08PM +0000, Ben Widawsky wrote:
>> On 17-11-01 18:09:47, Joonas Lahtinen wrote:
>> > + Kimmo and Paul
>> > 
>> > On Wed, 2017-11-01 at 07:43 -0700, Ben Widawsky wrote:
>> > > On 17-11-01 14:07:28, Joonas Lahtinen wrote:
>> > > > On Mon, 2017-10-30 at 10:48 -0700, Rodrigo Vivi wrote:
>> > > > > On Mon, Oct 30, 2017 at 01:00:51PM +0000, David Weinehall wrote:
>> > > > > > On Fri, Oct 27, 2017 at 01:57:09PM -0700, Daniele Ceraolo Spurio wrote:
>> > > > > > >
>> > > > > > >
>> > > > > > > On 26/10/17 03:32, Chris Wilson wrote:
>> > > > > > > > It has been many years since the last confirmed sighting (and fix) of an
>> > > > > > > > RC6 related bug (usually a system hang). Remove the parameter to stop
>> > > > > > > > users from setting dangerous values, as they often set it during triage
>> > > > > > > > and end up disabling the entire runtime pm instead (the option is not a
>> > > > > > > > fine scalpel!).
>> > > > > > > >
>> > > > > > > > Furthermore, it allows users to set known dangerous values which were
>> > > > > > > > intended for testing and not for production use. For testing, we can
>> > > > > > > > always patch in the required setting without having to expose ourselves
>> > > > > > > > to random abuse.
>> > > > > > > >
>> > > > > > > > v2: Fixup NEEDS_WaRsDisableCoarsePowerGating fumble, and document the
>> > > > > > > > lack of ilk support better.
>> > > > > > > > v3: Clear intel_info->rc6p if we don't support rc6 itself.
>> > > > > > > >
>> > > > > > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>> > > > > > > > Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
>> > > > > > > > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
>> > > > > > > > Cc: Jani Nikula <jani.nikula@intel.com>
>> > > > > > > > Cc: Imre Deak <imre.deak@intel.com>
>> > > > > > > > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>> > > > > > > > Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>> > > > > > > > ---
>> > > > > > >
>> > > > > > > I think that for execution/debug on early silicon we might still want the
>> > > > > > > ability to turn features like RC6 off. Maybe we can add a debug kconfig to
>> > > > > > > force info->has_rc6 = 0? Not a blocker to this patch but worth considering
>> > > > > > > IMO.
>> > > > > >
>> > > > > > Most of the BIOSes I've seen on our RVPs have had an option to disable
>> > > > > > RC6.
>> > > > >
>> > > > > BIOS option don't block our code to run and set some MMIOs.
>> > > > > Not sure how the GPU will behave on such cases.
>> > > > >
>> > > > > I like the idea of removing some and keeping the parameters clean.
>> > > > > But there are few ones like RC6 and disable_power_wells that are very
>> > > > > useful on platform enabling and also when assisting others to debug issues.
>> > > > >
>> > > > > For instance right now that we fixed RC6 on CNL someone told that
>> > > > > he believe seeing more hangs, so I immediately asked to boot with
>> > > > > i915.enable_rc6=0 to double check. It is easier and straighforward
>> > > > > to direct them to the unsafe param than to ask them to compile the code
>> > > > > with different options or to use some BIOS options that we are not sure.
>> > > > >
>> > > > > Also on bug triage some options like this are helpful.
>> > > > >
>> > > > > Also BIOS and compile are saved flags. So if you need to do a quick test
>> > > > > you have to save it, and then unsave later. Parameters are very convinient
>> > > > > for 1 boot only check.
>> > > >
>> > > > It's convenient for sure, but the unsafe module parameters seems to be
>> > > > finding their way into way too many HOWTOs, and from there to some
>> > > > "productized" use-cases. Chris states that setting .enable_rc6=0 to
>> > > > solving an issue on publicly shipping products has been some years ago,
>> > > > so I don't see a need for carrying this.
>> > > >
>> > > > We shouldn't allow the convenience of not having to change one line and
>> > > > recompile kernel during development to affect the end-users who are
>> > > > Googling how to get the best performance out of their hardware (I could
>> > > > mention some distro here).
>> > > >
>> > > > This seems the like the best option as I don't think introducing kernel
>> > > > parameters that only exists on debug builds would be too convenient
>> > > > either. It'd maybe just add more confusion.
>> > > >
>> > > > Regards, Joonas
>> > > 
>> > > I believe the ability to disable RC6 is valuable not just for debugging
>> > > purposes. Folks with very latency sensitive workloads are often willing to
>> > > forego power savings. The real problem I see is that we don't test without rc6
>> > > in our setup, which indeed makes it unsafe. As such, I see the other option here
>> > > going back to the ability to toggle rc6 after load (either module parameter, or
>> > > make it sysfs), and actually run some subset of our workloads with RC6. I
>> > > suspect people will poop on that suggestion, but I figured I'd mention.
>> > 
>> > I agree there, but by my understanding there's really no ask to support
>> > the feature in upstream. And the original motive from Chris to drop the
>> > feature is that it's unsafe as it currently is.
>> > 
>> > So unless we've got the resources to bring it back from the unsafe
>> > zone, I think we should drop it like this patch proposes.
>> > 
>> > Regards, Joonas
>> 
>> Yep, I agree. One other option would be to move i915_forcewake_user to sysfs and
>> let them use that.
>
> Well, I won't try to block that. I just put my 2 cents that I believe it is a very
> useful parameter.
>
> It wasn't that long ago the last time that we needed the flag to allow
> end users to have a functional machine: https://plus.google.com/+JonMasters/posts/BqWLEjenLKv.
>
> or to debug a related issue:
> https://bugzilla.redhat.com/show_bug.cgi?id=1440988
> https://bugzilla.kernel.org/show_bug.cgi?id=116431
>
> Although date on few seems over than 1 year. We need to consider that
> that was our latest new gpu... gen9.
>
> If products are recommending the use of enable_rc6=0 I can see they
> adding the patch to disable that. Effect is the same and our convenience is gone.
>
> But again, just my view here. Not a nack ;)

I suppose the compromise would be to make it a boolean module parameter
to only allow disabling rc6 on platforms where it's enabled by default,
but not letting you enable rc6 where it's disabled by default. I.e. only
support i915.enable_rc6=0 to be passed by the user.

BR,
Jani.


>
> Thanks,
> Rodrigo.
>
>> 
>> -- 
>> Ben Widawsky, Intel Open Source Technology Center
Rodrigo Vivi Nov. 2, 2017, 2:47 p.m. UTC | #11
On Thu, Nov 02, 2017 at 08:06:29AM +0000, Jani Nikula wrote:
> On Wed, 01 Nov 2017, Rodrigo Vivi <rodrigo.vivi@intel.com> wrote:
> > On Wed, Nov 01, 2017 at 04:21:08PM +0000, Ben Widawsky wrote:
> >> On 17-11-01 18:09:47, Joonas Lahtinen wrote:
> >> > + Kimmo and Paul
> >> > 
> >> > On Wed, 2017-11-01 at 07:43 -0700, Ben Widawsky wrote:
> >> > > On 17-11-01 14:07:28, Joonas Lahtinen wrote:
> >> > > > On Mon, 2017-10-30 at 10:48 -0700, Rodrigo Vivi wrote:
> >> > > > > On Mon, Oct 30, 2017 at 01:00:51PM +0000, David Weinehall wrote:
> >> > > > > > On Fri, Oct 27, 2017 at 01:57:09PM -0700, Daniele Ceraolo Spurio wrote:
> >> > > > > > >
> >> > > > > > >
> >> > > > > > > On 26/10/17 03:32, Chris Wilson wrote:
> >> > > > > > > > It has been many years since the last confirmed sighting (and fix) of an
> >> > > > > > > > RC6 related bug (usually a system hang). Remove the parameter to stop
> >> > > > > > > > users from setting dangerous values, as they often set it during triage
> >> > > > > > > > and end up disabling the entire runtime pm instead (the option is not a
> >> > > > > > > > fine scalpel!).
> >> > > > > > > >
> >> > > > > > > > Furthermore, it allows users to set known dangerous values which were
> >> > > > > > > > intended for testing and not for production use. For testing, we can
> >> > > > > > > > always patch in the required setting without having to expose ourselves
> >> > > > > > > > to random abuse.
> >> > > > > > > >
> >> > > > > > > > v2: Fixup NEEDS_WaRsDisableCoarsePowerGating fumble, and document the
> >> > > > > > > > lack of ilk support better.
> >> > > > > > > > v3: Clear intel_info->rc6p if we don't support rc6 itself.
> >> > > > > > > >
> >> > > > > > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >> > > > > > > > Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> >> > > > > > > > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> >> > > > > > > > Cc: Jani Nikula <jani.nikula@intel.com>
> >> > > > > > > > Cc: Imre Deak <imre.deak@intel.com>
> >> > > > > > > > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> >> > > > > > > > Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> >> > > > > > > > ---
> >> > > > > > >
> >> > > > > > > I think that for execution/debug on early silicon we might still want the
> >> > > > > > > ability to turn features like RC6 off. Maybe we can add a debug kconfig to
> >> > > > > > > force info->has_rc6 = 0? Not a blocker to this patch but worth considering
> >> > > > > > > IMO.
> >> > > > > >
> >> > > > > > Most of the BIOSes I've seen on our RVPs have had an option to disable
> >> > > > > > RC6.
> >> > > > >
> >> > > > > BIOS option don't block our code to run and set some MMIOs.
> >> > > > > Not sure how the GPU will behave on such cases.
> >> > > > >
> >> > > > > I like the idea of removing some and keeping the parameters clean.
> >> > > > > But there are few ones like RC6 and disable_power_wells that are very
> >> > > > > useful on platform enabling and also when assisting others to debug issues.
> >> > > > >
> >> > > > > For instance right now that we fixed RC6 on CNL someone told that
> >> > > > > he believe seeing more hangs, so I immediately asked to boot with
> >> > > > > i915.enable_rc6=0 to double check. It is easier and straighforward
> >> > > > > to direct them to the unsafe param than to ask them to compile the code
> >> > > > > with different options or to use some BIOS options that we are not sure.
> >> > > > >
> >> > > > > Also on bug triage some options like this are helpful.
> >> > > > >
> >> > > > > Also BIOS and compile are saved flags. So if you need to do a quick test
> >> > > > > you have to save it, and then unsave later. Parameters are very convinient
> >> > > > > for 1 boot only check.
> >> > > >
> >> > > > It's convenient for sure, but the unsafe module parameters seems to be
> >> > > > finding their way into way too many HOWTOs, and from there to some
> >> > > > "productized" use-cases. Chris states that setting .enable_rc6=0 to
> >> > > > solving an issue on publicly shipping products has been some years ago,
> >> > > > so I don't see a need for carrying this.
> >> > > >
> >> > > > We shouldn't allow the convenience of not having to change one line and
> >> > > > recompile kernel during development to affect the end-users who are
> >> > > > Googling how to get the best performance out of their hardware (I could
> >> > > > mention some distro here).
> >> > > >
> >> > > > This seems the like the best option as I don't think introducing kernel
> >> > > > parameters that only exists on debug builds would be too convenient
> >> > > > either. It'd maybe just add more confusion.
> >> > > >
> >> > > > Regards, Joonas
> >> > > 
> >> > > I believe the ability to disable RC6 is valuable not just for debugging
> >> > > purposes. Folks with very latency sensitive workloads are often willing to
> >> > > forego power savings. The real problem I see is that we don't test without rc6
> >> > > in our setup, which indeed makes it unsafe. As such, I see the other option here
> >> > > going back to the ability to toggle rc6 after load (either module parameter, or
> >> > > make it sysfs), and actually run some subset of our workloads with RC6. I
> >> > > suspect people will poop on that suggestion, but I figured I'd mention.
> >> > 
> >> > I agree there, but by my understanding there's really no ask to support
> >> > the feature in upstream. And the original motive from Chris to drop the
> >> > feature is that it's unsafe as it currently is.
> >> > 
> >> > So unless we've got the resources to bring it back from the unsafe
> >> > zone, I think we should drop it like this patch proposes.
> >> > 
> >> > Regards, Joonas
> >> 
> >> Yep, I agree. One other option would be to move i915_forcewake_user to sysfs and
> >> let them use that.
> >
> > Well, I won't try to block that. I just put my 2 cents that I believe it is a very
> > useful parameter.
> >
> > It wasn't that long ago the last time that we needed the flag to allow
> > end users to have a functional machine: https://plus.google.com/+JonMasters/posts/BqWLEjenLKv.
> >
> > or to debug a related issue:
> > https://bugzilla.redhat.com/show_bug.cgi?id=1440988
> > https://bugzilla.kernel.org/show_bug.cgi?id=116431
> >
> > Although date on few seems over than 1 year. We need to consider that
> > that was our latest new gpu... gen9.
> >
> > If products are recommending the use of enable_rc6=0 I can see they
> > adding the patch to disable that. Effect is the same and our convenience is gone.
> >
> > But again, just my view here. Not a nack ;)
> 
> I suppose the compromise would be to make it a boolean module parameter
> to only allow disabling rc6 on platforms where it's enabled by default,
> but not letting you enable rc6 where it's disabled by default. I.e. only
> support i915.enable_rc6=0 to be passed by the user.

+1. I like this approach.

> 
> BR,
> Jani.
> 
> 
> >
> > Thanks,
> > Rodrigo.
> >
> >> 
> >> -- 
> >> Ben Widawsky, Intel Open Source Technology Center
> 
> -- 
> Jani Nikula, Intel Open Source Technology Center
Joonas Lahtinen Nov. 2, 2017, 2:59 p.m. UTC | #12
On Thu, 2017-11-02 at 07:47 -0700, Rodrigo Vivi wrote:
> On Thu, Nov 02, 2017 at 08:06:29AM +0000, Jani Nikula wrote:
> > On Wed, 01 Nov 2017, Rodrigo Vivi <rodrigo.vivi@intel.com> wrote:
> > > On Wed, Nov 01, 2017 at 04:21:08PM +0000, Ben Widawsky wrote:
> > > > On 17-11-01 18:09:47, Joonas Lahtinen wrote:
> > > > > + Kimmo and Paul
> > > > > 
> > > > > On Wed, 2017-11-01 at 07:43 -0700, Ben Widawsky wrote:
> > > > > > On 17-11-01 14:07:28, Joonas Lahtinen wrote:
> > > > > > > On Mon, 2017-10-30 at 10:48 -0700, Rodrigo Vivi wrote:
> > > > > > > > On Mon, Oct 30, 2017 at 01:00:51PM +0000, David Weinehall wrote:
> > > > > > > > > On Fri, Oct 27, 2017 at 01:57:09PM -0700, Daniele Ceraolo Spurio wrote:
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > On 26/10/17 03:32, Chris Wilson wrote:
> > > > > > > > > > > It has been many years since the last confirmed sighting (and fix) of an
> > > > > > > > > > > RC6 related bug (usually a system hang). Remove the parameter to stop
> > > > > > > > > > > users from setting dangerous values, as they often set it during triage
> > > > > > > > > > > and end up disabling the entire runtime pm instead (the option is not a
> > > > > > > > > > > fine scalpel!).
> > > > > > > > > > > 
> > > > > > > > > > > Furthermore, it allows users to set known dangerous values which were
> > > > > > > > > > > intended for testing and not for production use. For testing, we can
> > > > > > > > > > > always patch in the required setting without having to expose ourselves
> > > > > > > > > > > to random abuse.
> > > > > > > > > > > 
> > > > > > > > > > > v2: Fixup NEEDS_WaRsDisableCoarsePowerGating fumble, and document the
> > > > > > > > > > > lack of ilk support better.
> > > > > > > > > > > v3: Clear intel_info->rc6p if we don't support rc6 itself.
> > > > > > > > > > > 
> > > > > > > > > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > > > > > > > > > Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > > > > > > > > > > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > > > > > > > > > > Cc: Jani Nikula <jani.nikula@intel.com>
> > > > > > > > > > > Cc: Imre Deak <imre.deak@intel.com>
> > > > > > > > > > > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > > > > > > > > > > Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> > > > > > > > > > > ---
> > > > > > > > > > 
> > > > > > > > > > I think that for execution/debug on early silicon we might still want the
> > > > > > > > > > ability to turn features like RC6 off. Maybe we can add a debug kconfig to
> > > > > > > > > > force info->has_rc6 = 0? Not a blocker to this patch but worth considering
> > > > > > > > > > IMO.
> > > > > > > > > 
> > > > > > > > > Most of the BIOSes I've seen on our RVPs have had an option to disable
> > > > > > > > > RC6.
> > > > > > > > 
> > > > > > > > BIOS option don't block our code to run and set some MMIOs.
> > > > > > > > Not sure how the GPU will behave on such cases.
> > > > > > > > 
> > > > > > > > I like the idea of removing some and keeping the parameters clean.
> > > > > > > > But there are few ones like RC6 and disable_power_wells that are very
> > > > > > > > useful on platform enabling and also when assisting others to debug issues.
> > > > > > > > 
> > > > > > > > For instance right now that we fixed RC6 on CNL someone told that
> > > > > > > > he believe seeing more hangs, so I immediately asked to boot with
> > > > > > > > i915.enable_rc6=0 to double check. It is easier and straighforward
> > > > > > > > to direct them to the unsafe param than to ask them to compile the code
> > > > > > > > with different options or to use some BIOS options that we are not sure.
> > > > > > > > 
> > > > > > > > Also on bug triage some options like this are helpful.
> > > > > > > > 
> > > > > > > > Also BIOS and compile are saved flags. So if you need to do a quick test
> > > > > > > > you have to save it, and then unsave later. Parameters are very convinient
> > > > > > > > for 1 boot only check.
> > > > > > > 
> > > > > > > It's convenient for sure, but the unsafe module parameters seems to be
> > > > > > > finding their way into way too many HOWTOs, and from there to some
> > > > > > > "productized" use-cases. Chris states that setting .enable_rc6=0 to
> > > > > > > solving an issue on publicly shipping products has been some years ago,
> > > > > > > so I don't see a need for carrying this.
> > > > > > > 
> > > > > > > We shouldn't allow the convenience of not having to change one line and
> > > > > > > recompile kernel during development to affect the end-users who are
> > > > > > > Googling how to get the best performance out of their hardware (I could
> > > > > > > mention some distro here).
> > > > > > > 
> > > > > > > This seems the like the best option as I don't think introducing kernel
> > > > > > > parameters that only exists on debug builds would be too convenient
> > > > > > > either. It'd maybe just add more confusion.
> > > > > > > 
> > > > > > > Regards, Joonas
> > > > > > 
> > > > > > I believe the ability to disable RC6 is valuable not just for debugging
> > > > > > purposes. Folks with very latency sensitive workloads are often willing to
> > > > > > forego power savings. The real problem I see is that we don't test without rc6
> > > > > > in our setup, which indeed makes it unsafe. As such, I see the other option here
> > > > > > going back to the ability to toggle rc6 after load (either module parameter, or
> > > > > > make it sysfs), and actually run some subset of our workloads with RC6. I
> > > > > > suspect people will poop on that suggestion, but I figured I'd mention.
> > > > > 
> > > > > I agree there, but by my understanding there's really no ask to support
> > > > > the feature in upstream. And the original motive from Chris to drop the
> > > > > feature is that it's unsafe as it currently is.
> > > > > 
> > > > > So unless we've got the resources to bring it back from the unsafe
> > > > > zone, I think we should drop it like this patch proposes.
> > > > > 
> > > > > Regards, Joonas
> > > > 
> > > > Yep, I agree. One other option would be to move i915_forcewake_user to sysfs and
> > > > let them use that.
> > > 
> > > Well, I won't try to block that. I just put my 2 cents that I believe it is a very
> > > useful parameter.
> > > 
> > > It wasn't that long ago the last time that we needed the flag to allow
> > > end users to have a functional machine: https://plus.google.com/+JonMasters/posts/BqWLEjenLKv.
> > > 
> > > or to debug a related issue:
> > > https://bugzilla.redhat.com/show_bug.cgi?id=1440988
> > > https://bugzilla.kernel.org/show_bug.cgi?id=116431
> > > 
> > > Although date on few seems over than 1 year. We need to consider that
> > > that was our latest new gpu... gen9.
> > > 
> > > If products are recommending the use of enable_rc6=0 I can see they
> > > adding the patch to disable that. Effect is the same and our convenience is gone.
> > > 
> > > But again, just my view here. Not a nack ;)
> > 
> > I suppose the compromise would be to make it a boolean module parameter
> > to only allow disabling rc6 on platforms where it's enabled by default,
> > but not letting you enable rc6 where it's disabled by default. I.e. only
> > support i915.enable_rc6=0 to be passed by the user.
> 
> +1. I like this approach.

Umm, it still doesn't resolve the issue that it's not being tested.

I try to be super clear; until we have resources to support that
specific code path, I'd much prefer not to have an easy kernel
parameter to set it.

Regards, Joonas
Jani Nikula Nov. 2, 2017, 3:17 p.m. UTC | #13
On Thu, 02 Nov 2017, Joonas Lahtinen <joonas.lahtinen@linux.intel.com> wrote:
> On Thu, 2017-11-02 at 07:47 -0700, Rodrigo Vivi wrote:
>> On Thu, Nov 02, 2017 at 08:06:29AM +0000, Jani Nikula wrote:
>> > On Wed, 01 Nov 2017, Rodrigo Vivi <rodrigo.vivi@intel.com> wrote:
>> > > On Wed, Nov 01, 2017 at 04:21:08PM +0000, Ben Widawsky wrote:
>> > > > On 17-11-01 18:09:47, Joonas Lahtinen wrote:
>> > > > > + Kimmo and Paul
>> > > > > 
>> > > > > On Wed, 2017-11-01 at 07:43 -0700, Ben Widawsky wrote:
>> > > > > > On 17-11-01 14:07:28, Joonas Lahtinen wrote:
>> > > > > > > On Mon, 2017-10-30 at 10:48 -0700, Rodrigo Vivi wrote:
>> > > > > > > > On Mon, Oct 30, 2017 at 01:00:51PM +0000, David Weinehall wrote:
>> > > > > > > > > On Fri, Oct 27, 2017 at 01:57:09PM -0700, Daniele Ceraolo Spurio wrote:
>> > > > > > > > > > 
>> > > > > > > > > > 
>> > > > > > > > > > On 26/10/17 03:32, Chris Wilson wrote:
>> > > > > > > > > > > It has been many years since the last confirmed sighting (and fix) of an
>> > > > > > > > > > > RC6 related bug (usually a system hang). Remove the parameter to stop
>> > > > > > > > > > > users from setting dangerous values, as they often set it during triage
>> > > > > > > > > > > and end up disabling the entire runtime pm instead (the option is not a
>> > > > > > > > > > > fine scalpel!).
>> > > > > > > > > > > 
>> > > > > > > > > > > Furthermore, it allows users to set known dangerous values which were
>> > > > > > > > > > > intended for testing and not for production use. For testing, we can
>> > > > > > > > > > > always patch in the required setting without having to expose ourselves
>> > > > > > > > > > > to random abuse.
>> > > > > > > > > > > 
>> > > > > > > > > > > v2: Fixup NEEDS_WaRsDisableCoarsePowerGating fumble, and document the
>> > > > > > > > > > > lack of ilk support better.
>> > > > > > > > > > > v3: Clear intel_info->rc6p if we don't support rc6 itself.
>> > > > > > > > > > > 
>> > > > > > > > > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>> > > > > > > > > > > Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
>> > > > > > > > > > > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
>> > > > > > > > > > > Cc: Jani Nikula <jani.nikula@intel.com>
>> > > > > > > > > > > Cc: Imre Deak <imre.deak@intel.com>
>> > > > > > > > > > > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>> > > > > > > > > > > Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>> > > > > > > > > > > ---
>> > > > > > > > > > 
>> > > > > > > > > > I think that for execution/debug on early silicon we might still want the
>> > > > > > > > > > ability to turn features like RC6 off. Maybe we can add a debug kconfig to
>> > > > > > > > > > force info->has_rc6 = 0? Not a blocker to this patch but worth considering
>> > > > > > > > > > IMO.
>> > > > > > > > > 
>> > > > > > > > > Most of the BIOSes I've seen on our RVPs have had an option to disable
>> > > > > > > > > RC6.
>> > > > > > > > 
>> > > > > > > > BIOS option don't block our code to run and set some MMIOs.
>> > > > > > > > Not sure how the GPU will behave on such cases.
>> > > > > > > > 
>> > > > > > > > I like the idea of removing some and keeping the parameters clean.
>> > > > > > > > But there are few ones like RC6 and disable_power_wells that are very
>> > > > > > > > useful on platform enabling and also when assisting others to debug issues.
>> > > > > > > > 
>> > > > > > > > For instance right now that we fixed RC6 on CNL someone told that
>> > > > > > > > he believe seeing more hangs, so I immediately asked to boot with
>> > > > > > > > i915.enable_rc6=0 to double check. It is easier and straighforward
>> > > > > > > > to direct them to the unsafe param than to ask them to compile the code
>> > > > > > > > with different options or to use some BIOS options that we are not sure.
>> > > > > > > > 
>> > > > > > > > Also on bug triage some options like this are helpful.
>> > > > > > > > 
>> > > > > > > > Also BIOS and compile are saved flags. So if you need to do a quick test
>> > > > > > > > you have to save it, and then unsave later. Parameters are very convinient
>> > > > > > > > for 1 boot only check.
>> > > > > > > 
>> > > > > > > It's convenient for sure, but the unsafe module parameters seems to be
>> > > > > > > finding their way into way too many HOWTOs, and from there to some
>> > > > > > > "productized" use-cases. Chris states that setting .enable_rc6=0 to
>> > > > > > > solving an issue on publicly shipping products has been some years ago,
>> > > > > > > so I don't see a need for carrying this.
>> > > > > > > 
>> > > > > > > We shouldn't allow the convenience of not having to change one line and
>> > > > > > > recompile kernel during development to affect the end-users who are
>> > > > > > > Googling how to get the best performance out of their hardware (I could
>> > > > > > > mention some distro here).
>> > > > > > > 
>> > > > > > > This seems the like the best option as I don't think introducing kernel
>> > > > > > > parameters that only exists on debug builds would be too convenient
>> > > > > > > either. It'd maybe just add more confusion.
>> > > > > > > 
>> > > > > > > Regards, Joonas
>> > > > > > 
>> > > > > > I believe the ability to disable RC6 is valuable not just for debugging
>> > > > > > purposes. Folks with very latency sensitive workloads are often willing to
>> > > > > > forego power savings. The real problem I see is that we don't test without rc6
>> > > > > > in our setup, which indeed makes it unsafe. As such, I see the other option here
>> > > > > > going back to the ability to toggle rc6 after load (either module parameter, or
>> > > > > > make it sysfs), and actually run some subset of our workloads with RC6. I
>> > > > > > suspect people will poop on that suggestion, but I figured I'd mention.
>> > > > > 
>> > > > > I agree there, but by my understanding there's really no ask to support
>> > > > > the feature in upstream. And the original motive from Chris to drop the
>> > > > > feature is that it's unsafe as it currently is.
>> > > > > 
>> > > > > So unless we've got the resources to bring it back from the unsafe
>> > > > > zone, I think we should drop it like this patch proposes.
>> > > > > 
>> > > > > Regards, Joonas
>> > > > 
>> > > > Yep, I agree. One other option would be to move i915_forcewake_user to sysfs and
>> > > > let them use that.
>> > > 
>> > > Well, I won't try to block that. I just put my 2 cents that I believe it is a very
>> > > useful parameter.
>> > > 
>> > > It wasn't that long ago the last time that we needed the flag to allow
>> > > end users to have a functional machine: https://plus.google.com/+JonMasters/posts/BqWLEjenLKv.
>> > > 
>> > > or to debug a related issue:
>> > > https://bugzilla.redhat.com/show_bug.cgi?id=1440988
>> > > https://bugzilla.kernel.org/show_bug.cgi?id=116431
>> > > 
>> > > Although date on few seems over than 1 year. We need to consider that
>> > > that was our latest new gpu... gen9.
>> > > 
>> > > If products are recommending the use of enable_rc6=0 I can see they
>> > > adding the patch to disable that. Effect is the same and our convenience is gone.
>> > > 
>> > > But again, just my view here. Not a nack ;)
>> > 
>> > I suppose the compromise would be to make it a boolean module parameter
>> > to only allow disabling rc6 on platforms where it's enabled by default,
>> > but not letting you enable rc6 where it's disabled by default. I.e. only
>> > support i915.enable_rc6=0 to be passed by the user.
>> 
>> +1. I like this approach.
>
> Umm, it still doesn't resolve the issue that it's not being tested.
>
> I try to be super clear; until we have resources to support that
> specific code path, I'd much prefer not to have an easy kernel
> parameter to set it.

It resolves the worst part of the issue: people enabling rc6 where it's
known not to work.

BR,
Jani.
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 3db5851756f0..ca3d5f39338e 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -2503,7 +2503,7 @@  static int intel_runtime_suspend(struct device *kdev)
 	struct drm_i915_private *dev_priv = to_i915(dev);
 	int ret;
 
-	if (WARN_ON_ONCE(!(dev_priv->gt_pm.rc6.enabled && intel_rc6_enabled())))
+	if (WARN_ON_ONCE(!(dev_priv->gt_pm.rc6.enabled && HAS_RC6(dev_priv))))
 		return -ENODEV;
 
 	if (WARN_ON_ONCE(!HAS_RUNTIME_PM(dev_priv)))
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 366ba74b0ad2..75100c6e35c8 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3193,6 +3193,7 @@  intel_info(const struct drm_i915_private *dev_priv)
 #define HAS_PSR(dev_priv)		 ((dev_priv)->info.has_psr)
 #define HAS_RC6(dev_priv)		 ((dev_priv)->info.has_rc6)
 #define HAS_RC6p(dev_priv)		 ((dev_priv)->info.has_rc6p)
+#define HAS_RC6pp(dev_priv)		 (false)
 
 #define HAS_CSR(dev_priv)	((dev_priv)->info.has_csr)
 
diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
index b4faeb6aa2bd..6da48a77d70c 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -50,13 +50,6 @@  i915_param_named_unsafe(semaphores, int, 0400,
 	"Use semaphores for inter-ring sync "
 	"(default: -1 (use per-chip defaults))");
 
-i915_param_named_unsafe(enable_rc6, int, 0400,
-	"Enable power-saving render C-state 6. "
-	"Different stages can be selected via bitmask values "
-	"(0 = disable; 1 = enable rc6; 2 = enable deep rc6; 4 = enable deepest rc6). "
-	"For example, 3 would enable rc6 and deep rc6, and 7 would enable everything. "
-	"default: -1 (use per-chip default)");
-
 i915_param_named_unsafe(enable_dc, int, 0400,
 	"Enable power-saving display C-states. "
 	"(-1=auto [default]; 0=disable; 1=up to DC5; 2=up to DC6)");
diff --git a/drivers/gpu/drm/i915/i915_params.h b/drivers/gpu/drm/i915/i915_params.h
index c7292268ed43..374d3a7cb687 100644
--- a/drivers/gpu/drm/i915/i915_params.h
+++ b/drivers/gpu/drm/i915/i915_params.h
@@ -35,7 +35,6 @@ 
 	param(int, lvds_channel_mode, 0) \
 	param(int, panel_use_ssc, -1) \
 	param(int, vbt_sdvo_panel_type, -1) \
-	param(int, enable_rc6, -1) \
 	param(int, enable_dc, -1) \
 	param(int, enable_fbc, -1) \
 	param(int, enable_ppgtt, -1) \
diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 6458c309c039..f1c6756440a9 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -216,6 +216,9 @@  static const struct intel_device_info intel_gm45_info __initconst = {
 static const struct intel_device_info intel_ironlake_d_info __initconst = {
 	GEN5_FEATURES,
 	.platform = INTEL_IRONLAKE,
+	/* ilk does support rc6, but we do not implement [power] contexts */
+	.has_rc6 = 0,
+
 };
 
 static const struct intel_device_info intel_ironlake_m_info __initconst = {
diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
index 791759f632e1..1c95c2167d10 100644
--- a/drivers/gpu/drm/i915/i915_sysfs.c
+++ b/drivers/gpu/drm/i915/i915_sysfs.c
@@ -49,7 +49,18 @@  static u32 calc_residency(struct drm_i915_private *dev_priv,
 static ssize_t
 show_rc6_mask(struct device *kdev, struct device_attribute *attr, char *buf)
 {
-	return snprintf(buf, PAGE_SIZE, "%x\n", intel_rc6_enabled());
+	struct drm_i915_private *dev_priv = kdev_minor_to_i915(kdev);
+	unsigned int mask;
+
+	mask = 0;
+	if (HAS_RC6(dev_priv))
+		mask |= BIT(0);
+	if (HAS_RC6p(dev_priv))
+		mask |= BIT(1);
+	if (HAS_RC6pp(dev_priv))
+		mask |= BIT(2);
+
+	return snprintf(buf, PAGE_SIZE, "%x\n", mask);
 }
 
 static ssize_t
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 3b4eafd39f55..5dc759eecd3c 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -1908,15 +1908,10 @@  bool skl_ddb_allocation_overlaps(struct drm_i915_private *dev_priv,
 				 const struct skl_ddb_entry *ddb,
 				 int ignore);
 bool ilk_disable_lp_wm(struct drm_device *dev);
-int sanitize_rc6_option(struct drm_i915_private *dev_priv, int enable_rc6);
 int skl_check_pipe_max_pixel_rate(struct intel_crtc *intel_crtc,
 				  struct intel_crtc_state *cstate);
 void intel_init_ipc(struct drm_i915_private *dev_priv);
 void intel_enable_ipc(struct drm_i915_private *dev_priv);
-static inline int intel_rc6_enabled(void)
-{
-	return i915_modparams.enable_rc6;
-}
 
 /* intel_sdvo.c */
 bool intel_sdvo_init(struct drm_i915_private *dev_priv,
diff --git a/drivers/gpu/drm/i915/intel_guc.c b/drivers/gpu/drm/i915/intel_guc.c
index 10037c0fdf95..00a86a44e8fb 100644
--- a/drivers/gpu/drm/i915/intel_guc.c
+++ b/drivers/gpu/drm/i915/intel_guc.c
@@ -230,8 +230,7 @@  int intel_guc_sample_forcewake(struct intel_guc *guc)
 
 	action[0] = INTEL_GUC_ACTION_SAMPLE_FORCEWAKE;
 	/* WaRsDisableCoarsePowerGating:skl,bxt */
-	if (!intel_rc6_enabled() ||
-	    NEEDS_WaRsDisableCoarsePowerGating(dev_priv))
+	if (!HAS_RC6(dev_priv) || NEEDS_WaRsDisableCoarsePowerGating(dev_priv))
 		action[1] = 0;
 	else
 		/* bit 0 and 1 are for Render and Media domain separately */
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 742d5455b201..3e45df64eec4 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -6383,26 +6383,6 @@  static void valleyview_disable_rps(struct drm_i915_private *dev_priv)
 	I915_WRITE(GEN6_RP_CONTROL, 0);
 }
 
-static void intel_print_rc6_info(struct drm_i915_private *dev_priv, u32 mode)
-{
-	if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv)) {
-		if (mode & (GEN7_RC_CTL_TO_MODE | GEN6_RC_CTL_EI_MODE(1)))
-			mode = GEN6_RC_CTL_RC6_ENABLE;
-		else
-			mode = 0;
-	}
-	if (HAS_RC6p(dev_priv))
-		DRM_DEBUG_DRIVER("Enabling RC6 states: "
-				 "RC6 %s RC6p %s RC6pp %s\n",
-				 onoff(mode & GEN6_RC_CTL_RC6_ENABLE),
-				 onoff(mode & GEN6_RC_CTL_RC6p_ENABLE),
-				 onoff(mode & GEN6_RC_CTL_RC6pp_ENABLE));
-
-	else
-		DRM_DEBUG_DRIVER("Enabling RC6 states: RC6 %s\n",
-				 onoff(mode & GEN6_RC_CTL_RC6_ENABLE));
-}
-
 static bool bxt_check_bios_rc6_setup(struct drm_i915_private *dev_priv)
 {
 	struct i915_ggtt *ggtt = &dev_priv->ggtt;
@@ -6465,42 +6445,26 @@  static bool bxt_check_bios_rc6_setup(struct drm_i915_private *dev_priv)
 	return enable_rc6;
 }
 
-int sanitize_rc6_option(struct drm_i915_private *dev_priv, int enable_rc6)
+static bool sanitize_rc6(struct drm_i915_private *i915)
 {
-	/* No RC6 before Ironlake and code is gone for ilk. */
-	if (INTEL_INFO(dev_priv)->gen < 6)
-		return 0;
-
-	if (!enable_rc6)
-		return 0;
+	struct intel_device_info *info = mkwrite_device_info(i915);
 
-	if (IS_GEN9_LP(dev_priv) && !bxt_check_bios_rc6_setup(dev_priv)) {
+	if (IS_GEN9_LP(i915) && !bxt_check_bios_rc6_setup(i915)) {
 		DRM_INFO("RC6 disabled by BIOS\n");
-		return 0;
+		info->has_rc6 = 0;
 	}
 
-	/* Respect the kernel parameter if it is set */
-	if (enable_rc6 >= 0) {
-		int mask;
-
-		if (HAS_RC6p(dev_priv))
-			mask = INTEL_RC6_ENABLE | INTEL_RC6p_ENABLE |
-			       INTEL_RC6pp_ENABLE;
-		else
-			mask = INTEL_RC6_ENABLE;
-
-		if ((enable_rc6 & mask) != enable_rc6)
-			DRM_DEBUG_DRIVER("Adjusting RC6 mask to %d "
-					 "(requested %d, valid %d)\n",
-					 enable_rc6 & mask, enable_rc6, mask);
-
-		return enable_rc6 & mask;
-	}
-
-	if (IS_IVYBRIDGE(dev_priv))
-		return (INTEL_RC6_ENABLE | INTEL_RC6p_ENABLE);
+	/*
+	 * We assume that we do not have any deep rc6 levels if we don't have
+	 * have the previous rc6 level supporteditself, i.e. we use HAS_RC6()
+	 * as the initial coarse check for rc6 in general, moving on to
+	 * progressively finer/deeper levels.
+	 */
+	if (WARN(!info->has_rc6 && info->has_rc6p,
+		 "deep rc6p enabled without rc6; disabling!\n"))
+		info->has_rc6p = 0;
 
-	return INTEL_RC6_ENABLE;
+	return info->has_rc6;
 }
 
 static void gen6_init_rps_frequencies(struct drm_i915_private *dev_priv)
@@ -6592,7 +6556,7 @@  static void gen9_enable_rc6(struct drm_i915_private *dev_priv)
 {
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;
-	u32 rc6_mode, rc6_mask = 0;
+	u32 rc6_mode;
 
 	/* 1a: Software RC state - RC0 */
 	I915_WRITE(GEN6_RC_STATE, 0);
@@ -6633,9 +6597,6 @@  static void gen9_enable_rc6(struct drm_i915_private *dev_priv)
 	I915_WRITE(GEN9_RENDER_PG_IDLE_HYSTERESIS, 25);
 
 	/* 3a: Enable RC6 */
-	if (intel_rc6_enabled() & INTEL_RC6_ENABLE)
-		rc6_mask = GEN6_RC_CTL_RC6_ENABLE;
-	DRM_INFO("RC6 %s\n", onoff(rc6_mask & GEN6_RC_CTL_RC6_ENABLE));
 	I915_WRITE(GEN6_RC6_THRESHOLD, 37500); /* 37.5/125ms per EI */
 
 	/* WaRsUseTimeoutMode:cnl (pre-prod) */
@@ -6645,7 +6606,9 @@  static void gen9_enable_rc6(struct drm_i915_private *dev_priv)
 		rc6_mode = GEN6_RC_CTL_EI_MODE(1);
 
 	I915_WRITE(GEN6_RC_CONTROL,
-		   GEN6_RC_CTL_HW_ENABLE | rc6_mode | rc6_mask);
+		   GEN6_RC_CTL_HW_ENABLE |
+		   GEN6_RC_CTL_RC6_ENABLE |
+		   rc6_mode);
 
 	/*
 	 * 3b: Enable Coarse Power Gating only when RC6 is enabled.
@@ -6654,8 +6617,8 @@  static void gen9_enable_rc6(struct drm_i915_private *dev_priv)
 	if (NEEDS_WaRsDisableCoarsePowerGating(dev_priv))
 		I915_WRITE(GEN9_PG_ENABLE, 0);
 	else
-		I915_WRITE(GEN9_PG_ENABLE, (rc6_mask & GEN6_RC_CTL_RC6_ENABLE) ?
-				(GEN9_RENDER_PG_ENABLE | GEN9_MEDIA_PG_ENABLE) : 0);
+		I915_WRITE(GEN9_PG_ENABLE,
+			   GEN9_RENDER_PG_ENABLE | GEN9_MEDIA_PG_ENABLE);
 
 	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
 }
@@ -6664,7 +6627,6 @@  static void gen8_enable_rc6(struct drm_i915_private *dev_priv)
 {
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;
-	uint32_t rc6_mask = 0;
 
 	/* 1a: Software RC state - RC0 */
 	I915_WRITE(GEN6_RC_STATE, 0);
@@ -6686,13 +6648,11 @@  static void gen8_enable_rc6(struct drm_i915_private *dev_priv)
 	I915_WRITE(GEN6_RC6_THRESHOLD, 625); /* 800us/1.28 for TO */
 
 	/* 3: Enable RC6 */
-	if (intel_rc6_enabled() & INTEL_RC6_ENABLE)
-		rc6_mask = GEN6_RC_CTL_RC6_ENABLE;
-	intel_print_rc6_info(dev_priv, rc6_mask);
 
-	I915_WRITE(GEN6_RC_CONTROL, GEN6_RC_CTL_HW_ENABLE |
-			GEN7_RC_CTL_TO_MODE |
-			rc6_mask);
+	I915_WRITE(GEN6_RC_CONTROL,
+		   GEN6_RC_CTL_HW_ENABLE |
+		   GEN7_RC_CTL_TO_MODE |
+		   GEN6_RC_CTL_RC6_ENABLE);
 
 	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
 }
@@ -6741,9 +6701,8 @@  static void gen6_enable_rc6(struct drm_i915_private *dev_priv)
 {
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;
-	u32 rc6vids, rc6_mask = 0;
+	u32 rc6vids, rc6_mask;
 	u32 gtfifodbg;
-	int rc6_mode;
 	int ret;
 
 	I915_WRITE(GEN6_RC_STATE, 0);
@@ -6778,22 +6737,12 @@  static void gen6_enable_rc6(struct drm_i915_private *dev_priv)
 	I915_WRITE(GEN6_RC6p_THRESHOLD, 150000);
 	I915_WRITE(GEN6_RC6pp_THRESHOLD, 64000); /* unused */
 
-	/* Check if we are enabling RC6 */
-	rc6_mode = intel_rc6_enabled();
-	if (rc6_mode & INTEL_RC6_ENABLE)
-		rc6_mask |= GEN6_RC_CTL_RC6_ENABLE;
-
 	/* We don't use those on Haswell */
-	if (!IS_HASWELL(dev_priv)) {
-		if (rc6_mode & INTEL_RC6p_ENABLE)
-			rc6_mask |= GEN6_RC_CTL_RC6p_ENABLE;
-
-		if (rc6_mode & INTEL_RC6pp_ENABLE)
-			rc6_mask |= GEN6_RC_CTL_RC6pp_ENABLE;
-	}
-
-	intel_print_rc6_info(dev_priv, rc6_mask);
-
+	rc6_mask = GEN6_RC_CTL_RC6_ENABLE;
+	if (HAS_RC6p(dev_priv))
+		rc6_mask |= GEN6_RC_CTL_RC6p_ENABLE;
+	if (HAS_RC6pp(dev_priv))
+		rc6_mask |= GEN6_RC_CTL_RC6pp_ENABLE;
 	I915_WRITE(GEN6_RC_CONTROL,
 		   rc6_mask |
 		   GEN6_RC_CTL_EI_MODE(1) |
@@ -7236,7 +7185,7 @@  static void cherryview_enable_rc6(struct drm_i915_private *dev_priv)
 {
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;
-	u32 gtfifodbg, rc6_mode = 0, pcbr;
+	u32 gtfifodbg, rc6_mode, pcbr;
 
 	gtfifodbg = I915_READ(GTFIFODBG) & ~(GT_FIFO_SBDEDICATE_FREE_ENTRY_CHV |
 					     GT_FIFO_FREE_ENTRIES_CHV);
@@ -7277,10 +7226,9 @@  static void cherryview_enable_rc6(struct drm_i915_private *dev_priv)
 	pcbr = I915_READ(VLV_PCBR);
 
 	/* 3: Enable RC6 */
-	if ((intel_rc6_enabled() & INTEL_RC6_ENABLE) &&
-	    (pcbr >> VLV_PCBR_ADDR_SHIFT))
+	rc6_mode = 0;
+	if (pcbr >> VLV_PCBR_ADDR_SHIFT)
 		rc6_mode = GEN7_RC_CTL_TO_MODE;
-
 	I915_WRITE(GEN6_RC_CONTROL, rc6_mode);
 
 	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
@@ -7332,7 +7280,7 @@  static void valleyview_enable_rc6(struct drm_i915_private *dev_priv)
 {
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;
-	u32 gtfifodbg, rc6_mode = 0;
+	u32 gtfifodbg;
 
 	valleyview_check_pctx(dev_priv);
 
@@ -7365,12 +7313,8 @@  static void valleyview_enable_rc6(struct drm_i915_private *dev_priv)
 				      VLV_MEDIA_RC6_COUNT_EN |
 				      VLV_RENDER_RC6_COUNT_EN));
 
-	if (intel_rc6_enabled() & INTEL_RC6_ENABLE)
-		rc6_mode = GEN7_RC_CTL_TO_MODE | VLV_RC_CTL_CTX_RST_PARALLEL;
-
-	intel_print_rc6_info(dev_priv, rc6_mode);
-
-	I915_WRITE(GEN6_RC_CONTROL, rc6_mode);
+	I915_WRITE(GEN6_RC_CONTROL,
+		   GEN7_RC_CTL_TO_MODE | VLV_RC_CTL_CTX_RST_PARALLEL);
 
 	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
 }
@@ -7897,7 +7841,7 @@  void intel_init_gt_powersave(struct drm_i915_private *dev_priv)
 	 * RPM depends on RC6 to save restore the GT HW context, so make RC6 a
 	 * requirement.
 	 */
-	if (!i915_modparams.enable_rc6) {
+	if (!sanitize_rc6(dev_priv)) {
 		DRM_INFO("RC6 disabled, disabling runtime PM support\n");
 		intel_runtime_pm_get(dev_priv);
 	}
@@ -7954,7 +7898,7 @@  void intel_cleanup_gt_powersave(struct drm_i915_private *dev_priv)
 	if (IS_VALLEYVIEW(dev_priv))
 		valleyview_cleanup_gt_powersave(dev_priv);
 
-	if (!i915_modparams.enable_rc6)
+	if (!HAS_RC6(dev_priv))
 		intel_runtime_pm_put(dev_priv);
 }
 
@@ -9505,7 +9449,7 @@  u64 intel_rc6_residency_us(struct drm_i915_private *dev_priv,
 {
 	u64 time_hw, units, div;
 
-	if (!intel_rc6_enabled())
+	if (!HAS_RC6(dev_priv))
 		return 0;
 
 	intel_runtime_pm_get(dev_priv);
diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index 20e3c65c0999..51797d40b4fd 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -436,9 +436,6 @@  void intel_uncore_resume_early(struct drm_i915_private *dev_priv)
 
 void intel_uncore_sanitize(struct drm_i915_private *dev_priv)
 {
-	i915_modparams.enable_rc6 =
-		sanitize_rc6_option(dev_priv, i915_modparams.enable_rc6);
-
 	/* BIOS often leaves RC6 enabled, but disable it for hw init */
 	intel_sanitize_gt_powersave(dev_priv);
 }