
[v4,1/8] drm/i915/gen9: Add framework to whitelist specific GPU registers

Message ID 1452785255-4079-1-git-send-email-arun.siluvery@linux.intel.com (mailing list archive)
State New, archived

Commit Message

arun.siluvery@linux.intel.com Jan. 14, 2016, 3:27 p.m. UTC
Some of the HW registers are privileged and cannot be written to from
non-privileged batch buffers coming from userspace unless they are added to
the HW whitelist. This whitelist is maintained by HW and is different from
the SW whitelist. Userspace needs write access to them to implement
preemption-related WA.

The reason for using this approach is that the register bits that control
preemption granularity at the HW level are not context save/restored, so even
if we always set these bits in the kernel they are going to change once the
context is switched out. We could consider making them non-privileged by
default, but these registers also contain other chicken bits which should not
be allowed to be modified.

In later revisions the controlling bits are save/restored at context level, but
in the existing revisions these are exported via other debug registers and
should be on the whitelist. This patch adds changes to provide HW with a list
of registers to be whitelisted. HW checks this list during execution and
provides access accordingly.

HW imposes a limit on the number of registers on the whitelist and it is
per-engine. At this point we are only enabling the whitelist for RCS and we
don't foresee any requirement for other engines.
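
As an illustration of the per-engine limit, here is a minimal sketch of how the
slot registers are laid out. The 0x4D0 offset and the 12-slot limit come from
the i915_reg.h hunk of this patch; the helper name and the example RCS base of
0x2000 are assumptions made for the sketch, not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the i915_reg.h hunk of this patch. */
#define FORCE_TO_NONPRIV_OFFSET 0x4D0u
#define RING_MAX_NONPRIV_SLOTS  12u

/* Assumption for illustration only: MMIO base of the render ring (RCS). */
#define EXAMPLE_RCS_MMIO_BASE   0x2000u

/*
 * Plain-integer equivalent of RING_FORCE_TO_NONPRIV(base, i): each engine
 * has RING_MAX_NONPRIV_SLOTS consecutive 32-bit slot registers.
 * The function name is hypothetical.
 */
static uint32_t force_to_nonpriv_offset(uint32_t base, uint32_t slot)
{
	assert(slot < RING_MAX_NONPRIV_SLOTS);
	return base + FORCE_TO_NONPRIV_OFFSET + slot * 4u;
}
```

With the assumed base, slot 0 lands at 0x24D0 and the last slot (11) at 0x24FC,
so a 13th register cannot be whitelisted on the same engine.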

The registers to be whitelisted are added using the generic workaround list
mechanism, even though these are only enablers for userspace workarounds. But by
sharing this mechanism we get some test assets without additional cost (Mika).
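
To make the "test assets" point concrete, here is a rough sketch of the check
that i915_wa_registers() applies to every entry on the workaround list. The
struct and function names are hypothetical stand-ins; only the comparison
itself mirrors the debugfs code in this patch. Because whitelist entries are
added with WA_WRITE(), which uses a mask of 0xffffffff, the read-back value has
to match exactly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct i915_wa_reg, with a plain offset as addr. */
struct wa_entry {
	uint32_t addr;
	uint32_t value;
	uint32_t mask;
};

/* Same comparison as the debugfs loop: only the masked bits must match. */
static bool wa_entry_ok(const struct wa_entry *wa, uint32_t read)
{
	assert(wa != NULL);
	return (wa->value & wa->mask) == (read & wa->mask);
}
```

An entry with a full mask fails as soon as any bit differs, while a narrower
mask ignores the bits outside it.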

v2: rebase

v3: parameterize RING_FORCE_TO_NONPRIV() as _MMIO() should be limited to
i915_reg.h (Ville), drop inline for wa_ring_whitelist_reg (Mika).

v4: improvements suggested by Chris Wilson.
Clarify that this is the HW whitelist and different from the one maintained in
the driver. This list is engine-specific but it gets initialized along with other
WA, which is an RCS-specific thing, so make it clear that we are not doing any
cross-engine setup during initialization.
Make the HW whitelist count of each engine available in debugfs.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c     | 15 ++++++++++-----
 drivers/gpu/drm/i915/i915_drv.h         |  9 ++++++++-
 drivers/gpu/drm/i915/i915_reg.h         |  3 +++
 drivers/gpu/drm/i915/intel_ringbuffer.c | 17 +++++++++++++++++
 4 files changed, 38 insertions(+), 6 deletions(-)

Comments

Daniel Vetter Jan. 19, 2016, 9 a.m. UTC | #1
On Thu, Jan 14, 2016 at 03:27:35PM +0000, Arun Siluvery wrote:
> Some of the HW registers are privileged and cannot be written to from
> non-privileged batch buffers coming from userspace unless they are added to
> the HW whitelist. This whitelist is maintained by HW and it is different from
> SW whitelist. Userspace need write access to them to implement preemption
> related WA.
> 
> The reason for using this approach is, the register bits that control
> preemption granularity at the HW level are not context save/restored; so even
> if we set these bits always in kernel they are going to change once the
> context is switched out.  We can consider making them non-privileged by
> default but these registers also contain other chicken bits which should not
> be allowed to be modified.
> 
> In the later revisions controlling bits are save/restored at context level but
> in the existing revisions these are exported via other debug registers and
> should be on the whitelist. This patch adds changes to provide HW with a list
> of registers to be whitelisted. HW checks this list during execution and
> provides access accordingly.
> 
> HW imposes a limit on the number of registers on whitelist and it is
> per-engine.  At this point we are only enabling whitelist for RCS and we don't
> foresee any requirement for other engines.
> 
> The registers to be whitelisted are added using generic workaround list
> mechanism, even these are only enablers for userspace workarounds. But by
> sharing this mechanism we get some test assets without additional cost (Mika).
> 
> v2: rebase
> 
> v3: parameterize RING_FORCE_TO_NONPRIV() as _MMIO() should be limited to
> i915_reg.h (Ville), drop inline for wa_ring_whitelist_reg (Mika).
> 
> v4: improvements suggested by Chris Wilson.
> Clarify that this is HW whitelist and different from the one maintained in
> driver. This list is engine specific but it gets initialized along with other
> WA which is RCS specific thing, so make it clear that we are not doing any
> cross engine setup during initialization.
> Make HW whitelist count of each engine available in debugfs.
> 
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>

If you resend just single patches to a series you must --in-reply-to the
individual patch, not the cover letter. Otherwise patchwork won't pick it
up, which means we don't have CI results for this.

Since it's been a while probably best to just resend the entire pile.

Also we seem to be missing r-b tags for the actual w/a changes.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_debugfs.c     | 15 ++++++++++-----
>  drivers/gpu/drm/i915/i915_drv.h         |  9 ++++++++-
>  drivers/gpu/drm/i915/i915_reg.h         |  3 +++
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 17 +++++++++++++++++
>  4 files changed, 38 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index e3377ab..7eb002c 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -3229,9 +3229,11 @@ static int i915_wa_registers(struct seq_file *m, void *unused)
>  {
>  	int i;
>  	int ret;
> +	struct intel_engine_cs *ring;
>  	struct drm_info_node *node = (struct drm_info_node *) m->private;
>  	struct drm_device *dev = node->minor->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct i915_workarounds *workarounds = &dev_priv->workarounds;
>  
>  	ret = mutex_lock_interruptible(&dev->struct_mutex);
>  	if (ret)
> @@ -3239,15 +3241,18 @@ static int i915_wa_registers(struct seq_file *m, void *unused)
>  
>  	intel_runtime_pm_get(dev_priv);
>  
> -	seq_printf(m, "Workarounds applied: %d\n", dev_priv->workarounds.count);
> -	for (i = 0; i < dev_priv->workarounds.count; ++i) {
> +	seq_printf(m, "Workarounds applied: %d\n", workarounds->count);
> +	for_each_ring(ring, dev_priv, i)
> +		seq_printf(m, "HW whitelist count for %s: %d\n",
> +			   ring->name, workarounds->hw_whitelist_count[i]);
> +	for (i = 0; i < workarounds->count; ++i) {
>  		i915_reg_t addr;
>  		u32 mask, value, read;
>  		bool ok;
>  
> -		addr = dev_priv->workarounds.reg[i].addr;
> -		mask = dev_priv->workarounds.reg[i].mask;
> -		value = dev_priv->workarounds.reg[i].value;
> +		addr = workarounds->reg[i].addr;
> +		mask = workarounds->reg[i].mask;
> +		value = workarounds->reg[i].value;
>  		read = I915_READ(addr);
>  		ok = (value & mask) == (read & mask);
>  		seq_printf(m, "0x%X: 0x%08X, mask: 0x%08X, read: 0x%08x, status: %s\n",
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 104bd18..83fccc0 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1653,11 +1653,18 @@ struct i915_wa_reg {
>  	u32 mask;
>  };
>  
> -#define I915_MAX_WA_REGS 16
> +/*
> + * RING_MAX_NONPRIV_SLOTS is per-engine but at this point we are only
> + * allowing it for RCS as we don't foresee any requirement of having
> + * a whitelist for other engines. When it is really required for
> + * other engines then the limit need to be increased.
> + */
> +#define I915_MAX_WA_REGS (16 + RING_MAX_NONPRIV_SLOTS)
>  
>  struct i915_workarounds {
>  	struct i915_wa_reg reg[I915_MAX_WA_REGS];
>  	u32 count;
> +	u32 hw_whitelist_count[I915_NUM_RINGS];
>  };
>  
>  struct i915_virtual_gpu {
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index 0a98889..7938814 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -1635,6 +1635,9 @@ enum skl_disp_power_wells {
>  #define   RING_WAIT		(1<<11) /* gen3+, PRBx_CTL */
>  #define   RING_WAIT_SEMAPHORE	(1<<10) /* gen6+ */
>  
> +#define RING_FORCE_TO_NONPRIV(base, i) _MMIO(((base)+0x4D0) + (i)*4)
> +#define   RING_MAX_NONPRIV_SLOTS  12
> +
>  #define GEN7_TLB_RD_ADDR	_MMIO(0x4700)
>  
>  #if 0
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 4060acf..56af736 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -787,6 +787,22 @@ static int wa_add(struct drm_i915_private *dev_priv,
>  
>  #define WA_WRITE(addr, val) WA_REG(addr, 0xffffffff, val)
>  
> +static int wa_ring_whitelist_reg(struct intel_engine_cs *ring, i915_reg_t reg)
> +{
> +	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> +	struct i915_workarounds *wa = &dev_priv->workarounds;
> +	const uint32_t index = wa->hw_whitelist_count[ring->id];
> +
> +	if (WARN_ON(index >= RING_MAX_NONPRIV_SLOTS))
> +		return -EINVAL;
> +
> +	WA_WRITE(RING_FORCE_TO_NONPRIV(ring->mmio_base, index),
> +		 i915_mmio_reg_offset(reg));
> +	wa->hw_whitelist_count[ring->id]++;
> +
> +	return 0;
> +}
> +
>  static int gen8_init_workarounds(struct intel_engine_cs *ring)
>  {
>  	struct drm_device *dev = ring->dev;
> @@ -1115,6 +1131,7 @@ int init_workarounds_ring(struct intel_engine_cs *ring)
>  	WARN_ON(ring->id != RCS);
>  
>  	dev_priv->workarounds.count = 0;
> +	dev_priv->workarounds.hw_whitelist_count[RCS] = 0;
>  
>  	if (IS_BROADWELL(dev))
>  		return bdw_init_workarounds(ring);
> -- 
> 1.9.1
> 
arun.siluvery@linux.intel.com Jan. 19, 2016, 10:16 a.m. UTC | #2
On 19/01/2016 09:00, Daniel Vetter wrote:
> On Thu, Jan 14, 2016 at 03:27:35PM +0000, Arun Siluvery wrote:
>> Some of the HW registers are privileged and cannot be written to from
>> non-privileged batch buffers coming from userspace unless they are added to
>> the HW whitelist. This whitelist is maintained by HW and it is different from
>> SW whitelist. Userspace need write access to them to implement preemption
>> related WA.
>>
>> The reason for using this approach is, the register bits that control
>> preemption granularity at the HW level are not context save/restored; so even
>> if we set these bits always in kernel they are going to change once the
>> context is switched out.  We can consider making them non-privileged by
>> default but these registers also contain other chicken bits which should not
>> be allowed to be modified.
>>
>> In the later revisions controlling bits are save/restored at context level but
>> in the existing revisions these are exported via other debug registers and
>> should be on the whitelist. This patch adds changes to provide HW with a list
>> of registers to be whitelisted. HW checks this list during execution and
>> provides access accordingly.
>>
>> HW imposes a limit on the number of registers on whitelist and it is
>> per-engine.  At this point we are only enabling whitelist for RCS and we don't
>> foresee any requirement for other engines.
>>
>> The registers to be whitelisted are added using generic workaround list
>> mechanism, even these are only enablers for userspace workarounds. But by
>> sharing this mechanism we get some test assets without additional cost (Mika).
>>
>> v2: rebase
>>
>> v3: parameterize RING_FORCE_TO_NONPRIV() as _MMIO() should be limited to
>> i915_reg.h (Ville), drop inline for wa_ring_whitelist_reg (Mika).
>>
>> v4: improvements suggested by Chris Wilson.
>> Clarify that this is HW whitelist and different from the one maintained in
>> driver. This list is engine specific but it gets initialized along with other
>> WA which is RCS specific thing, so make it clear that we are not doing any
>> cross engine setup during initialization.
>> Make HW whitelist count of each engine available in debugfs.
>>
>> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
>> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
>> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
>
> If you resend just single patches to a series you must --in-reply-to the
> individual patch, not the cover letter. Otherwise patchwork won't pick it
> up, which means we don't have CI results for this.

Hi Daniel,

Yes, I did use --in-reply-to, but probably not with the correct message-id;
I will keep this in mind.

>
> Since it's been a while probably best to just resend the entire pile.
>
> Also we seem to be missing r-b tags for the actual w/a changes.
Yes, the actual w/a are yet to be reviewed. I can resend all of them once
they are reviewed, or do you want me to send them now?

regards
Arun

> -Daniel
>
>> ---
>>   drivers/gpu/drm/i915/i915_debugfs.c     | 15 ++++++++++-----
>>   drivers/gpu/drm/i915/i915_drv.h         |  9 ++++++++-
>>   drivers/gpu/drm/i915/i915_reg.h         |  3 +++
>>   drivers/gpu/drm/i915/intel_ringbuffer.c | 17 +++++++++++++++++
>>   4 files changed, 38 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
>> index e3377ab..7eb002c 100644
>> --- a/drivers/gpu/drm/i915/i915_debugfs.c
>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
>> @@ -3229,9 +3229,11 @@ static int i915_wa_registers(struct seq_file *m, void *unused)
>>   {
>>   	int i;
>>   	int ret;
>> +	struct intel_engine_cs *ring;
>>   	struct drm_info_node *node = (struct drm_info_node *) m->private;
>>   	struct drm_device *dev = node->minor->dev;
>>   	struct drm_i915_private *dev_priv = dev->dev_private;
>> +	struct i915_workarounds *workarounds = &dev_priv->workarounds;
>>
>>   	ret = mutex_lock_interruptible(&dev->struct_mutex);
>>   	if (ret)
>> @@ -3239,15 +3241,18 @@ static int i915_wa_registers(struct seq_file *m, void *unused)
>>
>>   	intel_runtime_pm_get(dev_priv);
>>
>> -	seq_printf(m, "Workarounds applied: %d\n", dev_priv->workarounds.count);
>> -	for (i = 0; i < dev_priv->workarounds.count; ++i) {
>> +	seq_printf(m, "Workarounds applied: %d\n", workarounds->count);
>> +	for_each_ring(ring, dev_priv, i)
>> +		seq_printf(m, "HW whitelist count for %s: %d\n",
>> +			   ring->name, workarounds->hw_whitelist_count[i]);
>> +	for (i = 0; i < workarounds->count; ++i) {
>>   		i915_reg_t addr;
>>   		u32 mask, value, read;
>>   		bool ok;
>>
>> -		addr = dev_priv->workarounds.reg[i].addr;
>> -		mask = dev_priv->workarounds.reg[i].mask;
>> -		value = dev_priv->workarounds.reg[i].value;
>> +		addr = workarounds->reg[i].addr;
>> +		mask = workarounds->reg[i].mask;
>> +		value = workarounds->reg[i].value;
>>   		read = I915_READ(addr);
>>   		ok = (value & mask) == (read & mask);
>>   		seq_printf(m, "0x%X: 0x%08X, mask: 0x%08X, read: 0x%08x, status: %s\n",
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index 104bd18..83fccc0 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -1653,11 +1653,18 @@ struct i915_wa_reg {
>>   	u32 mask;
>>   };
>>
>> -#define I915_MAX_WA_REGS 16
>> +/*
>> + * RING_MAX_NONPRIV_SLOTS is per-engine but at this point we are only
>> + * allowing it for RCS as we don't foresee any requirement of having
>> + * a whitelist for other engines. When it is really required for
>> + * other engines then the limit need to be increased.
>> + */
>> +#define I915_MAX_WA_REGS (16 + RING_MAX_NONPRIV_SLOTS)
>>
>>   struct i915_workarounds {
>>   	struct i915_wa_reg reg[I915_MAX_WA_REGS];
>>   	u32 count;
>> +	u32 hw_whitelist_count[I915_NUM_RINGS];
>>   };
>>
>>   struct i915_virtual_gpu {
>> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
>> index 0a98889..7938814 100644
>> --- a/drivers/gpu/drm/i915/i915_reg.h
>> +++ b/drivers/gpu/drm/i915/i915_reg.h
>> @@ -1635,6 +1635,9 @@ enum skl_disp_power_wells {
>>   #define   RING_WAIT		(1<<11) /* gen3+, PRBx_CTL */
>>   #define   RING_WAIT_SEMAPHORE	(1<<10) /* gen6+ */
>>
>> +#define RING_FORCE_TO_NONPRIV(base, i) _MMIO(((base)+0x4D0) + (i)*4)
>> +#define   RING_MAX_NONPRIV_SLOTS  12
>> +
>>   #define GEN7_TLB_RD_ADDR	_MMIO(0x4700)
>>
>>   #if 0
>> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
>> index 4060acf..56af736 100644
>> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
>> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
>> @@ -787,6 +787,22 @@ static int wa_add(struct drm_i915_private *dev_priv,
>>
>>   #define WA_WRITE(addr, val) WA_REG(addr, 0xffffffff, val)
>>
>> +static int wa_ring_whitelist_reg(struct intel_engine_cs *ring, i915_reg_t reg)
>> +{
>> +	struct drm_i915_private *dev_priv = ring->dev->dev_private;
>> +	struct i915_workarounds *wa = &dev_priv->workarounds;
>> +	const uint32_t index = wa->hw_whitelist_count[ring->id];
>> +
>> +	if (WARN_ON(index >= RING_MAX_NONPRIV_SLOTS))
>> +		return -EINVAL;
>> +
>> +	WA_WRITE(RING_FORCE_TO_NONPRIV(ring->mmio_base, index),
>> +		 i915_mmio_reg_offset(reg));
>> +	wa->hw_whitelist_count[ring->id]++;
>> +
>> +	return 0;
>> +}
>> +
>>   static int gen8_init_workarounds(struct intel_engine_cs *ring)
>>   {
>>   	struct drm_device *dev = ring->dev;
>> @@ -1115,6 +1131,7 @@ int init_workarounds_ring(struct intel_engine_cs *ring)
>>   	WARN_ON(ring->id != RCS);
>>
>>   	dev_priv->workarounds.count = 0;
>> +	dev_priv->workarounds.hw_whitelist_count[RCS] = 0;
>>
>>   	if (IS_BROADWELL(dev))
>>   		return bdw_init_workarounds(ring);
>> --
>> 1.9.1
>>
>
Daniel Vetter Jan. 19, 2016, 12:03 p.m. UTC | #3
On Tue, Jan 19, 2016 at 10:16:52AM +0000, Arun Siluvery wrote:
> On 19/01/2016 09:00, Daniel Vetter wrote:
> >On Thu, Jan 14, 2016 at 03:27:35PM +0000, Arun Siluvery wrote:
> >>Some of the HW registers are privileged and cannot be written to from
> >>non-privileged batch buffers coming from userspace unless they are added to
> >>the HW whitelist. This whitelist is maintained by HW and it is different from
> >>SW whitelist. Userspace need write access to them to implement preemption
> >>related WA.
> >>
> >>The reason for using this approach is, the register bits that control
> >>preemption granularity at the HW level are not context save/restored; so even
> >>if we set these bits always in kernel they are going to change once the
> >>context is switched out.  We can consider making them non-privileged by
> >>default but these registers also contain other chicken bits which should not
> >>be allowed to be modified.
> >>
> >>In the later revisions controlling bits are save/restored at context level but
> >>in the existing revisions these are exported via other debug registers and
> >>should be on the whitelist. This patch adds changes to provide HW with a list
> >>of registers to be whitelisted. HW checks this list during execution and
> >>provides access accordingly.
> >>
> >>HW imposes a limit on the number of registers on whitelist and it is
> >>per-engine.  At this point we are only enabling whitelist for RCS and we don't
> >>foresee any requirement for other engines.
> >>
> >>The registers to be whitelisted are added using generic workaround list
> >>mechanism, even these are only enablers for userspace workarounds. But by
> >>sharing this mechanism we get some test assets without additional cost (Mika).
> >>
> >>v2: rebase
> >>
> >>v3: parameterize RING_FORCE_TO_NONPRIV() as _MMIO() should be limited to
> >>i915_reg.h (Ville), drop inline for wa_ring_whitelist_reg (Mika).
> >>
> >>v4: improvements suggested by Chris Wilson.
> >>Clarify that this is HW whitelist and different from the one maintained in
> >>driver. This list is engine specific but it gets initialized along with other
> >>WA which is RCS specific thing, so make it clear that we are not doing any
> >>cross engine setup during initialization.
> >>Make HW whitelist count of each engine available in debugfs.
> >>
> >>Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
> >>Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
> >>Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> >>Cc: Chris Wilson <chris@chris-wilson.co.uk>
> >>Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
> >
> >If you resend just single patches to a series you must --in-reply-to the
> >individual patch, not the cover letter. Otherwise patchwork won't pick it
> >up, which means we don't have CI results for this.
> 
> Hi Daniel,
> 
> Yes I did use --in-reply-to but probably not the correct message-id, will
> keep this in mind.
> 
> >
> >Since it's been a while probably best to just resend the entire pile.
> >
> >Also we seem to be missing r-b tags for the actual w/a changes.
> yes, actual w/a are yet to be reviewed, I can resend all of them once they
> are reviewed or you want me to send it now?

Either way is fine I think.
-Daniel

Patch

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index e3377ab..7eb002c 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -3229,9 +3229,11 @@  static int i915_wa_registers(struct seq_file *m, void *unused)
 {
 	int i;
 	int ret;
+	struct intel_engine_cs *ring;
 	struct drm_info_node *node = (struct drm_info_node *) m->private;
 	struct drm_device *dev = node->minor->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct i915_workarounds *workarounds = &dev_priv->workarounds;
 
 	ret = mutex_lock_interruptible(&dev->struct_mutex);
 	if (ret)
@@ -3239,15 +3241,18 @@  static int i915_wa_registers(struct seq_file *m, void *unused)
 
 	intel_runtime_pm_get(dev_priv);
 
-	seq_printf(m, "Workarounds applied: %d\n", dev_priv->workarounds.count);
-	for (i = 0; i < dev_priv->workarounds.count; ++i) {
+	seq_printf(m, "Workarounds applied: %d\n", workarounds->count);
+	for_each_ring(ring, dev_priv, i)
+		seq_printf(m, "HW whitelist count for %s: %d\n",
+			   ring->name, workarounds->hw_whitelist_count[i]);
+	for (i = 0; i < workarounds->count; ++i) {
 		i915_reg_t addr;
 		u32 mask, value, read;
 		bool ok;
 
-		addr = dev_priv->workarounds.reg[i].addr;
-		mask = dev_priv->workarounds.reg[i].mask;
-		value = dev_priv->workarounds.reg[i].value;
+		addr = workarounds->reg[i].addr;
+		mask = workarounds->reg[i].mask;
+		value = workarounds->reg[i].value;
 		read = I915_READ(addr);
 		ok = (value & mask) == (read & mask);
 		seq_printf(m, "0x%X: 0x%08X, mask: 0x%08X, read: 0x%08x, status: %s\n",
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 104bd18..83fccc0 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1653,11 +1653,18 @@  struct i915_wa_reg {
 	u32 mask;
 };
 
-#define I915_MAX_WA_REGS 16
+/*
+ * RING_MAX_NONPRIV_SLOTS is per-engine but at this point we are only
+ * allowing it for RCS as we don't foresee any requirement of having
+ * a whitelist for other engines. When it is really required for
+ * other engines then the limit needs to be increased.
+ */
+#define I915_MAX_WA_REGS (16 + RING_MAX_NONPRIV_SLOTS)
 
 struct i915_workarounds {
 	struct i915_wa_reg reg[I915_MAX_WA_REGS];
 	u32 count;
+	u32 hw_whitelist_count[I915_NUM_RINGS];
 };
 
 struct i915_virtual_gpu {
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 0a98889..7938814 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -1635,6 +1635,9 @@  enum skl_disp_power_wells {
 #define   RING_WAIT		(1<<11) /* gen3+, PRBx_CTL */
 #define   RING_WAIT_SEMAPHORE	(1<<10) /* gen6+ */
 
+#define RING_FORCE_TO_NONPRIV(base, i) _MMIO(((base)+0x4D0) + (i)*4)
+#define   RING_MAX_NONPRIV_SLOTS  12
+
 #define GEN7_TLB_RD_ADDR	_MMIO(0x4700)
 
 #if 0
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 4060acf..56af736 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -787,6 +787,22 @@  static int wa_add(struct drm_i915_private *dev_priv,
 
 #define WA_WRITE(addr, val) WA_REG(addr, 0xffffffff, val)
 
+static int wa_ring_whitelist_reg(struct intel_engine_cs *ring, i915_reg_t reg)
+{
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+	struct i915_workarounds *wa = &dev_priv->workarounds;
+	const uint32_t index = wa->hw_whitelist_count[ring->id];
+
+	if (WARN_ON(index >= RING_MAX_NONPRIV_SLOTS))
+		return -EINVAL;
+
+	WA_WRITE(RING_FORCE_TO_NONPRIV(ring->mmio_base, index),
+		 i915_mmio_reg_offset(reg));
+	wa->hw_whitelist_count[ring->id]++;
+
+	return 0;
+}
+
 static int gen8_init_workarounds(struct intel_engine_cs *ring)
 {
 	struct drm_device *dev = ring->dev;
@@ -1115,6 +1131,7 @@  int init_workarounds_ring(struct intel_engine_cs *ring)
 	WARN_ON(ring->id != RCS);
 
 	dev_priv->workarounds.count = 0;
+	dev_priv->workarounds.hw_whitelist_count[RCS] = 0;
 
 	if (IS_BROADWELL(dev))
 		return bdw_init_workarounds(ring);
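
For readers who want to see the slot bookkeeping in isolation, the following is
a userspace model of what wa_ring_whitelist_reg() does with the per-engine
count. It is a sketch under stated assumptions: the struct, the function name
and NUM_RINGS are hypothetical, and the real function records the write through
WA_WRITE() on the shared workaround list instead of storing it in an array.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define RING_MAX_NONPRIV_SLOTS 12u /* from the i915_reg.h hunk */
#define NUM_RINGS              5u  /* assumption: stand-in for I915_NUM_RINGS */

/* Hypothetical model of the per-engine HW whitelist state. */
struct whitelist_model {
	uint32_t slot[NUM_RINGS][RING_MAX_NONPRIV_SLOTS]; /* offset held by each slot */
	uint32_t count[NUM_RINGS];                        /* slots used per engine */
};

/* Take the next free slot of one engine, or fail once all 12 are in use. */
static int whitelist_reg(struct whitelist_model *wl, uint32_t ring_id,
			 uint32_t reg_offset)
{
	uint32_t index;

	assert(wl != NULL && ring_id < NUM_RINGS);

	index = wl->count[ring_id];
	if (index >= RING_MAX_NONPRIV_SLOTS)
		return -EINVAL;

	wl->slot[ring_id][index] = reg_offset;
	wl->count[ring_id]++;

	return 0;
}
```

Each engine keeps its own count, so filling the slots of one engine leaves the
others untouched, and a request beyond the limit is rejected with -EINVAL.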