Message ID | 20170905131050.11655-1-david.weinehall@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Quoting David Weinehall (2017-09-05 14:10:50)
> Currently we're doing:
>
> 1. acquire lock
> 2. write word to hardware
> 3. release lock
> 4. repeat from 1
>
> to load the DMC firmware. Due to the cost of acquiring/releasing a lock,
> and the size of the DMC firmware, this slows down DMC loading a lot.
>
> This patch simply acquires the lock, writes the entire firmware,
> then releases the lock. Testing shows resume speedups
> in the order of 10ms on platforms with DMC firmware (GEN9+).
>
> v2: Per feedback from Chris & Ville there's no need to do the whole
>     forcewake dance, so lose that bit (Chris, Ville)
>
> v3: Actually send the new version of the patch...
>
> v4: Don't acquire the uncore lock. Disable preempt. (Chris)
>
> Signed-off-by: David Weinehall <david.weinehall@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/intel_csr.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_csr.c b/drivers/gpu/drm/i915/intel_csr.c
> index 965988f79a55..cdfb624eb82d 100644
> --- a/drivers/gpu/drm/i915/intel_csr.c
> +++ b/drivers/gpu/drm/i915/intel_csr.c
> @@ -252,8 +252,14 @@ void intel_csr_load_program(struct drm_i915_private *dev_priv)
>  	}
>
>  	fw_size = dev_priv->csr.dmc_fw_size;
> +	assert_rpm_wakelock_held(dev_priv);
> +
> +	preempt_disable();
> +
>  	for (i = 0; i < fw_size; i++)
> -		I915_WRITE(CSR_PROGRAM(i), payload[i]);
> +		I915_WRITE_FW(CSR_PROGRAM(i), payload[i]);
> +
> +	preempt_enable();
>
>  	for (i = 0; i < dev_priv->csr.mmio_count; i++) {
>  		I915_WRITE(dev_priv->csr.mmioaddr[i],

Looked into extending the coverage to the second loop?
-Chris

On Tue, Sep 05, 2017 at 02:25:36PM +0100, Chris Wilson wrote:
> Quoting David Weinehall (2017-09-05 14:10:50)
> > Currently we're doing:
> >
> > 1. acquire lock
> > 2. write word to hardware
> > 3. release lock
> > 4. repeat from 1
> >
> > to load the DMC firmware. Due to the cost of acquiring/releasing a lock,
> > and the size of the DMC firmware, this slows down DMC loading a lot.
> >
> > This patch simply acquires the lock, writes the entire firmware,
> > then releases the lock. Testing shows resume speedups
> > in the order of 10ms on platforms with DMC firmware (GEN9+).
> >
> > v2: Per feedback from Chris & Ville there's no need to do the whole
> >     forcewake dance, so lose that bit (Chris, Ville)
> >
> > v3: Actually send the new version of the patch...
> >
> > v4: Don't acquire the uncore lock. Disable preempt. (Chris)
> >
> > Signed-off-by: David Weinehall <david.weinehall@linux.intel.com>
> > ---
> >  drivers/gpu/drm/i915/intel_csr.c | 8 +++++++-
> >  1 file changed, 7 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/i915/intel_csr.c b/drivers/gpu/drm/i915/intel_csr.c
> > index 965988f79a55..cdfb624eb82d 100644
> > --- a/drivers/gpu/drm/i915/intel_csr.c
> > +++ b/drivers/gpu/drm/i915/intel_csr.c
> > @@ -252,8 +252,14 @@ void intel_csr_load_program(struct drm_i915_private *dev_priv)
> >  	}
> >
> >  	fw_size = dev_priv->csr.dmc_fw_size;
> > +	assert_rpm_wakelock_held(dev_priv);
> > +
> > +	preempt_disable();
> > +
> >  	for (i = 0; i < fw_size; i++)
> > -		I915_WRITE(CSR_PROGRAM(i), payload[i]);
> > +		I915_WRITE_FW(CSR_PROGRAM(i), payload[i]);
> > +
> > +	preempt_enable();
> >
> >  	for (i = 0; i < dev_priv->csr.mmio_count; i++) {
> >  		I915_WRITE(dev_priv->csr.mmioaddr[i],
>
> Looked into extending the coverage to the second loop?

The second loop didn't really show up in my benchmarks,
so I decided to minimise the changes.

The only other I915_WRITE() loops that show up when measuring
are the LUT loading; I'll fix those in a future patch.

Kind regards, David

Quoting David Weinehall (2017-09-05 14:33:25)
> On Tue, Sep 05, 2017 at 02:25:36PM +0100, Chris Wilson wrote:
> > Quoting David Weinehall (2017-09-05 14:10:50)
> > > Currently we're doing:
> > >
> > > 1. acquire lock
> > > 2. write word to hardware
> > > 3. release lock
> > > 4. repeat from 1
> > >
> > > to load the DMC firmware. Due to the cost of acquiring/releasing a lock,
> > > and the size of the DMC firmware, this slows down DMC loading a lot.
> > >
> > > This patch simply acquires the lock, writes the entire firmware,
> > > then releases the lock. Testing shows resume speedups
> > > in the order of 10ms on platforms with DMC firmware (GEN9+).
> > >
> > > v2: Per feedback from Chris & Ville there's no need to do the whole
> > >     forcewake dance, so lose that bit (Chris, Ville)
> > >
> > > v3: Actually send the new version of the patch...
> > >
> > > v4: Don't acquire the uncore lock. Disable preempt. (Chris)
> > >
> > > Signed-off-by: David Weinehall <david.weinehall@linux.intel.com>
> > > ---
> > >  drivers/gpu/drm/i915/intel_csr.c | 8 +++++++-
> > >  1 file changed, 7 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/intel_csr.c b/drivers/gpu/drm/i915/intel_csr.c
> > > index 965988f79a55..cdfb624eb82d 100644
> > > --- a/drivers/gpu/drm/i915/intel_csr.c
> > > +++ b/drivers/gpu/drm/i915/intel_csr.c
> > > @@ -252,8 +252,14 @@ void intel_csr_load_program(struct drm_i915_private *dev_priv)
> > >  	}
> > >
> > >  	fw_size = dev_priv->csr.dmc_fw_size;
> > > +	assert_rpm_wakelock_held(dev_priv);
> > > +
> > > +	preempt_disable();
> > > +
> > >  	for (i = 0; i < fw_size; i++)
> > > -		I915_WRITE(CSR_PROGRAM(i), payload[i]);
> > > +		I915_WRITE_FW(CSR_PROGRAM(i), payload[i]);
> > > +
> > > +	preempt_enable();
> > >
> > >  	for (i = 0; i < dev_priv->csr.mmio_count; i++) {
> > >  		I915_WRITE(dev_priv->csr.mmioaddr[i],
> >
> > Looked into extending the coverage to the second loop?
>
> The second loop didn't really show up in my benchmarks,
> so I decided to minimise the changes.

Fair enough, looks like it is limited to 8 writes.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

diff --git a/drivers/gpu/drm/i915/intel_csr.c b/drivers/gpu/drm/i915/intel_csr.c
index 965988f79a55..cdfb624eb82d 100644
--- a/drivers/gpu/drm/i915/intel_csr.c
+++ b/drivers/gpu/drm/i915/intel_csr.c
@@ -252,8 +252,14 @@ void intel_csr_load_program(struct drm_i915_private *dev_priv)
 	}
 
 	fw_size = dev_priv->csr.dmc_fw_size;
+	assert_rpm_wakelock_held(dev_priv);
+
+	preempt_disable();
+
 	for (i = 0; i < fw_size; i++)
-		I915_WRITE(CSR_PROGRAM(i), payload[i]);
+		I915_WRITE_FW(CSR_PROGRAM(i), payload[i]);
+
+	preempt_enable();
 
 	for (i = 0; i < dev_priv->csr.mmio_count; i++) {
 		I915_WRITE(dev_priv->csr.mmioaddr[i],

Currently we're doing:

1. acquire lock
2. write word to hardware
3. release lock
4. repeat from 1

to load the DMC firmware. Due to the cost of acquiring/releasing a lock,
and the size of the DMC firmware, this slows down DMC loading a lot.

This patch simply acquires the lock, writes the entire firmware,
then releases the lock. Testing shows resume speedups
in the order of 10ms on platforms with DMC firmware (GEN9+).

v2: Per feedback from Chris & Ville there's no need to do the whole
    forcewake dance, so lose that bit (Chris, Ville)

v3: Actually send the new version of the patch...

v4: Don't acquire the uncore lock. Disable preempt. (Chris)

Signed-off-by: David Weinehall <david.weinehall@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_csr.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)
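
For readers who want to see the cost pattern described in the commit message in isolation, here is a minimal user-space sketch in C. It is not i915 code: `struct fake_dev`, `fake_write_locked()`, `fake_write_raw()`, `load_per_word()` and `load_batched()` are hypothetical names, a C11 `atomic_flag` spinlock stands in for the driver's register lock, and a plain array stands in for the `CSR_PROGRAM(i)` registers. It only contrasts "lock around every word" with "lock once around the whole loop". Note that, per the v4 changelog, the patch itself goes one step further and takes no lock at all, using `I915_WRITE_FW()` inside a `preempt_disable()`/`preempt_enable()` pair.

```c
/* User-space illustration only; none of these names exist in i915. */
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FAKE_PROGRAM_WORDS 1024

struct fake_dev {
	atomic_flag lock;                     /* stand-in for the register lock */
	uint32_t program[FAKE_PROGRAM_WORDS]; /* stand-in for CSR_PROGRAM(i) */
	unsigned long lock_cycles;            /* number of acquire/release pairs */
};

void fake_dev_init(struct fake_dev *dev)
{
	atomic_flag_clear(&dev->lock);
	memset(dev->program, 0, sizeof(dev->program));
	dev->lock_cycles = 0;
}

static void fake_lock(struct fake_dev *dev)
{
	/* Spin until the flag was previously clear, then count the acquisition. */
	while (atomic_flag_test_and_set_explicit(&dev->lock, memory_order_acquire))
		;
	dev->lock_cycles++;
}

static void fake_unlock(struct fake_dev *dev)
{
	atomic_flag_clear_explicit(&dev->lock, memory_order_release);
}

/* Raw write: the caller is responsible for serialisation. */
void fake_write_raw(struct fake_dev *dev, size_t i, uint32_t val)
{
	dev->program[i] = val;
}

/* Locked write: one acquire/release pair per word. */
void fake_write_locked(struct fake_dev *dev, size_t i, uint32_t val)
{
	fake_lock(dev);
	fake_write_raw(dev, i, val);
	fake_unlock(dev);
}

/* Steps 1-4 from the commit message: lock, write one word, unlock, repeat. */
void load_per_word(struct fake_dev *dev, const uint32_t *payload, size_t n)
{
	for (size_t i = 0; i < n; i++)
		fake_write_locked(dev, i, payload[i]);
}

/* Lock once, write the whole payload with raw writes, unlock once. */
void load_batched(struct fake_dev *dev, const uint32_t *payload, size_t n)
{
	fake_lock(dev);
	for (size_t i = 0; i < n; i++)
		fake_write_raw(dev, i, payload[i]);
	fake_unlock(dev);
}
```

Both loaders leave the same contents in `program`; the difference is that `load_per_word()` pays for `n` acquire/release pairs while `load_batched()` pays for one. The sketch says nothing about how large the saving is on real hardware; the 10ms figure above comes from the author's own measurements.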