Message ID | 1460037940-14094-3-git-send-email-tvrtko.ursulin@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Thu, Apr 07, 2016 at 03:05:40PM +0100, Tvrtko Ursulin wrote: > @@ -2099,6 +2101,12 @@ logical_ring_init(struct drm_device *dev, struct intel_engine_cs *engine) > > logical_ring_init_platform_invariants(engine); > > + engine->fw_domains_elsp = > + intel_reg_write_fw_domains(dev_priv, RING_ELSP(engine)); > + engine->fw_domains_csb = > + intel_reg_write_fw_domains(dev_priv, > + RING_CONTEXT_STATUS_PTR(engine)); So is write a superset of fw? Tends to be reads that require fw more than writes (gen6/7 fifo, gen8 write shadowing). I think we need a READ | WRITE direction field. > +/** > + * intel_reg_write_fw_domains - which forcewake domains are needed to write a register > + * @dev_priv: pointer to struct drm_i915_private > + * @reg: register in question > + * > + * Returns a set of forcewake domains required to be taken with for example > + * intel_uncore_forcewake_get for the specified register to be writable with the > + * raw mmio accessors. > + */ > +enum forcewake_domains > +intel_reg_write_fw_domains(struct drm_i915_private *dev_priv, i915_reg_t reg) > +{ > + enum forcewake_domains fw_domains; > + > + if (intel_vgpu_active(dev_priv->dev)) > + return 0; > + > + switch (INTEL_INFO(dev_priv)->gen) { > + case 9: > + fw_domains = __gen9_reg_write_fw_domains(i915_mmio_reg_offset(reg)); > + break; > + case 8: > + if (IS_CHERRYVIEW(dev_priv)) > + fw_domains = __chv_reg_write_fw_domains(i915_mmio_reg_offset(reg)); > + else > + fw_domains = __gen8_reg_write_fw_domains(i915_mmio_reg_offset(reg)); > + break; > + default: > + MISSING_CASE(INTEL_INFO(dev_priv)->gen); > + case 7: > + case 6: This is actually a tricky one. gen6/7 maintain a FIFO to store mmio writes whilst it is powered down. If we fill that fifo we drop writes (and that fifo is shared with functions on the device, i.e. it is not ours to fill exclusively). So should we be saving that if you want to make lots of writes you should take this forcewake domain. Yes. We should report what domains they would require, it is still up to the caller as to whether they risk the FIFO overflowing, but they should have the right information to hand. -Chris
On Thu, Apr 07, 2016 at 03:05:40PM +0100, Tvrtko Ursulin wrote: > diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c > index a1db6a02cf23..cac387f38cf6 100644 > --- a/drivers/gpu/drm/i915/intel_lrc.c > +++ b/drivers/gpu/drm/i915/intel_lrc.c > @@ -418,6 +418,7 @@ static void execlists_submit_requests(struct drm_i915_gem_request *rq0, > struct drm_i915_gem_request *rq1) > { > struct drm_i915_private *dev_priv = rq0->i915; > + unsigned int fw_domains = rq0->engine->fw_domains_elsp; So with a slightly different layout of fw that nest the elsp fw within the tasklet handler's fw I would have a preamble like: fw_domains = 0; for_each_reg({ELSP, WRITE}, {CONTEXT_STATUS_BUF, READ}, {CONTEXT_STATUS_PTR, READ | WRITE}) fw_domains |= intel_reg_fw_domains(dev_priv, reg, direction); engine->execlist_fw_domains = fw_domains; Hmm, we have a name clash with i915_reg_t i915_mmio_reg and intel_uncore_forcewake_get() intel_uncore_forcewake_for_mmio() i915_mmio_reg_fw_domains() -Chris
On 07/04/16 15:24, Chris Wilson wrote: > On Thu, Apr 07, 2016 at 03:05:40PM +0100, Tvrtko Ursulin wrote: >> @@ -2099,6 +2101,12 @@ logical_ring_init(struct drm_device *dev, struct intel_engine_cs *engine) >> >> logical_ring_init_platform_invariants(engine); >> >> + engine->fw_domains_elsp = >> + intel_reg_write_fw_domains(dev_priv, RING_ELSP(engine)); >> + engine->fw_domains_csb = >> + intel_reg_write_fw_domains(dev_priv, >> + RING_CONTEXT_STATUS_PTR(engine)); > > So is write a superset of fw? Tends to be reads that require fw more > than writes (gen6/7 fifo, gen8 write shadowing). > > I think we need a READ | WRITE direction field. Hm, yes embedding too much knowledge in the caller, how about just: engine->fw_domains_csb = intel_reg_read_fw_domains(dev_priv, RING_CONTEXT_STATUS_PTR(engine)); engine->fw_domains_csb |= intel_reg_write_fw_domains(dev_priv, RING_CONTEXT_STATUS_PTR(engine)); ? >> +/** >> + * intel_reg_write_fw_domains - which forcewake domains are needed to write a register >> + * @dev_priv: pointer to struct drm_i915_private >> + * @reg: register in question >> + * >> + * Returns a set of forcewake domains required to be taken with for example >> + * intel_uncore_forcewake_get for the specified register to be writable with the >> + * raw mmio accessors. >> + */ >> +enum forcewake_domains >> +intel_reg_write_fw_domains(struct drm_i915_private *dev_priv, i915_reg_t reg) >> +{ >> + enum forcewake_domains fw_domains; >> + >> + if (intel_vgpu_active(dev_priv->dev)) >> + return 0; >> + >> + switch (INTEL_INFO(dev_priv)->gen) { >> + case 9: >> + fw_domains = __gen9_reg_write_fw_domains(i915_mmio_reg_offset(reg)); >> + break; >> + case 8: >> + if (IS_CHERRYVIEW(dev_priv)) >> + fw_domains = __chv_reg_write_fw_domains(i915_mmio_reg_offset(reg)); >> + else >> + fw_domains = __gen8_reg_write_fw_domains(i915_mmio_reg_offset(reg)); >> + break; >> + default: >> + MISSING_CASE(INTEL_INFO(dev_priv)->gen); >> + case 7: >> + case 6: > > This is actually a tricky one. gen6/7 maintain a FIFO to store mmio > writes whilst it is powered down. If we fill that fifo we drop writes > (and that fifo is shared with functions on the device, i.e. it is not > ours to fill exclusively). So should we be saving that if you want to > make lots of writes you should take this forcewake domain. Yes. We should > report what domains they would require, it is still up to the caller as > to whether they risk the FIFO overflowing, but they should have the right > information to hand. Missed that. But it isn't part of forcewake domains. So what would you return? Fake out a new domain just complicates things and adds cost for everyone. Maybe better just to limit the whole thing to gen8+ and leave olders platforms untouched? Regards, Tvrtko
On 07/04/16 15:35, Chris Wilson wrote: > On Thu, Apr 07, 2016 at 03:05:40PM +0100, Tvrtko Ursulin wrote: >> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c >> index a1db6a02cf23..cac387f38cf6 100644 >> --- a/drivers/gpu/drm/i915/intel_lrc.c >> +++ b/drivers/gpu/drm/i915/intel_lrc.c >> @@ -418,6 +418,7 @@ static void execlists_submit_requests(struct drm_i915_gem_request *rq0, >> struct drm_i915_gem_request *rq1) >> { >> struct drm_i915_private *dev_priv = rq0->i915; >> + unsigned int fw_domains = rq0->engine->fw_domains_elsp; > > So with a slightly different layout of fw that nest the elsp fw within > the tasklet handler's fw I would have a preamble like: > > fw_domains = 0; > for_each_reg({ELSP, WRITE}, > {CONTEXT_STATUS_BUF, READ}, > {CONTEXT_STATUS_PTR, READ | WRITE}) > fw_domains |= intel_reg_fw_domains(dev_priv, reg, direction); > engine->execlist_fw_domains = fw_domains; I actually considered this (or-ing together for all registers).. might as well do it now. > Hmm, we have a name clash with i915_reg_t i915_mmio_reg and > intel_uncore_forcewake_get() > > intel_uncore_forcewake_for_mmio() > i915_mmio_reg_fw_domains() intel_uncore_forcewake_for_reg ? Remaining open on what to do with gen7 and below. Regards, Tvrtko
On Thu, Apr 07, 2016 at 03:36:04PM +0100, Tvrtko Ursulin wrote: > > On 07/04/16 15:24, Chris Wilson wrote: > >On Thu, Apr 07, 2016 at 03:05:40PM +0100, Tvrtko Ursulin wrote: > >>@@ -2099,6 +2101,12 @@ logical_ring_init(struct drm_device *dev, struct intel_engine_cs *engine) > >> > >> logical_ring_init_platform_invariants(engine); > >> > >>+ engine->fw_domains_elsp = > >>+ intel_reg_write_fw_domains(dev_priv, RING_ELSP(engine)); > >>+ engine->fw_domains_csb = > >>+ intel_reg_write_fw_domains(dev_priv, > >>+ RING_CONTEXT_STATUS_PTR(engine)); > > > >So is write a superset of fw? Tends to be reads that require fw more > >than writes (gen6/7 fifo, gen8 write shadowing). > > > >I think we need a READ | WRITE direction field. > > Hm, yes embedding too much knowledge in the caller, how about just: > > engine->fw_domains_csb = intel_reg_read_fw_domains(dev_priv, > RING_CONTEXT_STATUS_PTR(engine)); > > engine->fw_domains_csb |= intel_reg_write_fw_domains(dev_priv, > RING_CONTEXT_STATUS_PTR(engine)); See my other mail for a mockup of some code. Something along these lines. > >>+/** > >>+ * intel_reg_write_fw_domains - which forcewake domains are needed to write a register > >>+ * @dev_priv: pointer to struct drm_i915_private > >>+ * @reg: register in question > >>+ * > >>+ * Returns a set of forcewake domains required to be taken with for example > >>+ * intel_uncore_forcewake_get for the specified register to be writable with the > >>+ * raw mmio accessors. > >>+ */ > >>+enum forcewake_domains > >>+intel_reg_write_fw_domains(struct drm_i915_private *dev_priv, i915_reg_t reg) > >>+{ > >>+ enum forcewake_domains fw_domains; > >>+ > >>+ if (intel_vgpu_active(dev_priv->dev)) > >>+ return 0; > >>+ > >>+ switch (INTEL_INFO(dev_priv)->gen) { > >>+ case 9: > >>+ fw_domains = __gen9_reg_write_fw_domains(i915_mmio_reg_offset(reg)); > >>+ break; > >>+ case 8: > >>+ if (IS_CHERRYVIEW(dev_priv)) > >>+ fw_domains = __chv_reg_write_fw_domains(i915_mmio_reg_offset(reg)); > >>+ else > >>+ fw_domains = __gen8_reg_write_fw_domains(i915_mmio_reg_offset(reg)); > >>+ break; > >>+ default: > >>+ MISSING_CASE(INTEL_INFO(dev_priv)->gen); > >>+ case 7: > >>+ case 6: > > > >This is actually a tricky one. gen6/7 maintain a FIFO to store mmio > >writes whilst it is powered down. If we fill that fifo we drop writes > >(and that fifo is shared with functions on the device, i.e. it is not > >ours to fill exclusively). So should we be saving that if you want to > >make lots of writes you should take this forcewake domain. Yes. We should > >report what domains they would require, it is still up to the caller as > >to whether they risk the FIFO overflowing, but they should have the right > >information to hand. > > Missed that. But it isn't part of forcewake domains. So what would > you return? Fake out a new domain just complicates things and adds > cost for everyone. Maybe better just to limit the whole thing to > gen8+ and leave olders platforms untouched? It is the FORCEWAKE_RENDER to flush the write FIFO, it is just not clear in the implementation (i.e. we have never done that explicitly - I do remember at odd times counting register writes though...). Only in a few places (over ring init) have we explicitly taken the fw domain across writes. I'm leaning towards reporting that they do require the domain, with a note saying about the fifo. It is a specialized interface after all that is going to be using in fairly gen-specific paths (and erring on the side of caution when used outside of that is wise). -Chris
On Thu, Apr 07, 2016 at 03:52:46PM +0100, Tvrtko Ursulin wrote: > > > On 07/04/16 15:35, Chris Wilson wrote: > >On Thu, Apr 07, 2016 at 03:05:40PM +0100, Tvrtko Ursulin wrote: > >>diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c > >>index a1db6a02cf23..cac387f38cf6 100644 > >>--- a/drivers/gpu/drm/i915/intel_lrc.c > >>+++ b/drivers/gpu/drm/i915/intel_lrc.c > >>@@ -418,6 +418,7 @@ static void execlists_submit_requests(struct drm_i915_gem_request *rq0, > >> struct drm_i915_gem_request *rq1) > >> { > >> struct drm_i915_private *dev_priv = rq0->i915; > >>+ unsigned int fw_domains = rq0->engine->fw_domains_elsp; > > > >So with a slightly different layout of fw that nest the elsp fw within > >the tasklet handler's fw I would have a preamble like: > > > >fw_domains = 0; > >for_each_reg({ELSP, WRITE}, > > {CONTEXT_STATUS_BUF, READ}, > > {CONTEXT_STATUS_PTR, READ | WRITE}) > > fw_domains |= intel_reg_fw_domains(dev_priv, reg, direction); > >engine->execlist_fw_domains = fw_domains; > > I actually considered this (or-ing together for all registers).. > might as well do it now. > > >Hmm, we have a name clash with i915_reg_t i915_mmio_reg and > >intel_uncore_forcewake_get() > > > >intel_uncore_forcewake_for_mmio() > >i915_mmio_reg_fw_domains() > > intel_uncore_forcewake_for_reg ? for_read / for_write if you wish to stick with two functions, for_reg if go with a combined. We shall leave the i915_mmio_reg_blah() for another day when there is a clear direction for i915_reg_t. -Chris
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 4ebd3ff02803..55fec504b19f 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -633,6 +633,12 @@ enum forcewake_domains { FORCEWAKE_MEDIA) }; +enum forcewake_domains +intel_reg_read_fw_domains(struct drm_i915_private *dev_priv, i915_reg_t reg); + +enum forcewake_domains +intel_reg_write_fw_domains(struct drm_i915_private *dev_priv, i915_reg_t reg); + struct intel_uncore_funcs { void (*force_wake_get)(struct drm_i915_private *dev_priv, enum forcewake_domains domains); diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index a1db6a02cf23..cac387f38cf6 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -418,6 +418,7 @@ static void execlists_submit_requests(struct drm_i915_gem_request *rq0, struct drm_i915_gem_request *rq1) { struct drm_i915_private *dev_priv = rq0->i915; + unsigned int fw_domains = rq0->engine->fw_domains_elsp; execlists_update_context(rq0); @@ -425,11 +426,11 @@ static void execlists_submit_requests(struct drm_i915_gem_request *rq0, execlists_update_context(rq1); spin_lock_irq(&dev_priv->uncore.lock); - intel_uncore_forcewake_get__locked(dev_priv, FORCEWAKE_ALL); + intel_uncore_forcewake_get__locked(dev_priv, fw_domains); execlists_elsp_write(rq0, rq1); - intel_uncore_forcewake_put__locked(dev_priv, FORCEWAKE_ALL); + intel_uncore_forcewake_put__locked(dev_priv, fw_domains); spin_unlock_irq(&dev_priv->uncore.lock); } @@ -552,7 +553,7 @@ static void intel_lrc_irq_handler(unsigned long data) unsigned int csb_read = 0, i; unsigned int submit_contexts = 0; - intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL); + intel_uncore_forcewake_get(dev_priv, engine->fw_domains_csb); status_pointer = I915_READ_FW(RING_CONTEXT_STATUS_PTR(engine)); @@ -577,7 +578,7 @@ static void intel_lrc_irq_handler(unsigned long data) _MASKED_FIELD(GEN8_CSB_READ_PTR_MASK, engine->next_context_status_buffer << 8)); - intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL); + intel_uncore_forcewake_put(dev_priv, engine->fw_domains_csb); spin_lock(&engine->execlist_lock); @@ -2077,7 +2078,8 @@ logical_ring_default_irqs(struct intel_engine_cs *engine, unsigned shift) static int logical_ring_init(struct drm_device *dev, struct intel_engine_cs *engine) { - struct intel_context *dctx = to_i915(dev)->kernel_context; + struct drm_i915_private *dev_priv = to_i915(dev); + struct intel_context *dctx = dev_priv->kernel_context; int ret; /* Intentionally left blank. */ @@ -2099,6 +2101,12 @@ logical_ring_init(struct drm_device *dev, struct intel_engine_cs *engine) logical_ring_init_platform_invariants(engine); + engine->fw_domains_elsp = + intel_reg_write_fw_domains(dev_priv, RING_ELSP(engine)); + engine->fw_domains_csb = + intel_reg_write_fw_domains(dev_priv, + RING_CONTEXT_STATUS_PTR(engine)); + ret = i915_cmd_parser_init_ring(engine); if (ret) goto error; diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h index 18074ab55f61..fd248cde7164 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.h +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h @@ -270,6 +270,7 @@ struct intel_engine_cs { spinlock_t execlist_lock; /* used inside tasklet, use spin_lock_bh */ struct list_head execlist_queue; struct list_head execlist_retired_req_list; + unsigned int fw_domains_elsp, fw_domains_csb; unsigned int next_context_status_buffer; unsigned int idle_lite_restore_wa; bool disable_lite_restore_wa; diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c index b77bdf4a47f6..550e480fedd4 100644 --- a/drivers/gpu/drm/i915/intel_uncore.c +++ b/drivers/gpu/drm/i915/intel_uncore.c @@ -1766,3 +1766,94 @@ intel_uncore_arm_unclaimed_mmio_detection(struct drm_i915_private *dev_priv) return false; } + +/** + * intel_reg_read_fw_domains - which forcewake domains are needed to read a register + * @dev_priv: pointer to struct drm_i915_private + * @reg: register in question + * + * Returns a set of forcewake domains required to be taken with for example + * intel_uncore_forcewake_get for the specified register to be readable with the + * raw mmio accessors. + */ +enum forcewake_domains +intel_reg_read_fw_domains(struct drm_i915_private *dev_priv, i915_reg_t reg) +{ + enum forcewake_domains fw_domains; + + if (intel_vgpu_active(dev_priv->dev)) + return 0; + + switch (INTEL_INFO(dev_priv)->gen) { + case 9: + fw_domains = __gen9_reg_read_fw_domains(i915_mmio_reg_offset(reg)); + break; + case 8: + if (IS_CHERRYVIEW(dev_priv)) + fw_domains = __chv_reg_read_fw_domains(i915_mmio_reg_offset(reg)); + else + fw_domains = __gen6_reg_read_fw_domains(i915_mmio_reg_offset(reg)); + break; + case 7: + case 6: + if (IS_VALLEYVIEW(dev_priv)) + fw_domains = __vlv_reg_read_fw_domains(i915_mmio_reg_offset(reg)); + else + fw_domains = __gen6_reg_read_fw_domains(i915_mmio_reg_offset(reg)); + break; + default: + MISSING_CASE(INTEL_INFO(dev_priv)->gen); + case 5: /* forcewake was introduced with gen6 */ + case 4: + case 3: + case 2: + return 0; + } + + WARN_ON(fw_domains & ~dev_priv->uncore.fw_domains); + + return fw_domains; +} + +/** + * intel_reg_write_fw_domains - which forcewake domains are needed to write a register + * @dev_priv: pointer to struct drm_i915_private + * @reg: register in question + * + * Returns a set of forcewake domains required to be taken with for example + * intel_uncore_forcewake_get for the specified register to be writable with the + * raw mmio accessors. + */ +enum forcewake_domains +intel_reg_write_fw_domains(struct drm_i915_private *dev_priv, i915_reg_t reg) +{ + enum forcewake_domains fw_domains; + + if (intel_vgpu_active(dev_priv->dev)) + return 0; + + switch (INTEL_INFO(dev_priv)->gen) { + case 9: + fw_domains = __gen9_reg_write_fw_domains(i915_mmio_reg_offset(reg)); + break; + case 8: + if (IS_CHERRYVIEW(dev_priv)) + fw_domains = __chv_reg_write_fw_domains(i915_mmio_reg_offset(reg)); + else + fw_domains = __gen8_reg_write_fw_domains(i915_mmio_reg_offset(reg)); + break; + default: + MISSING_CASE(INTEL_INFO(dev_priv)->gen); + case 7: + case 6: + case 5: + case 4: + case 3: + case 2: + return 0; + } + + WARN_ON(fw_domains & ~dev_priv->uncore.fw_domains); + + return fw_domains; +}