[v6] drm/i915/icl: Enhanced execution list support

Message ID 1516815007-17902-1-git-send-email-daniele.ceraolospurio@intel.com
State New

Commit Message

Daniele Ceraolo Spurio Jan. 24, 2018, 5:30 p.m. UTC
From: Thomas Daniel <thomas.daniel@intel.com>

Enhanced Execlists is an upgraded version of execlists which supports
up to 8 ports. The lrcs to be submitted are written to a submit queue
(the ExecLists Submission Queue - ELSQ), which is then loaded on the
HW. When writing to the ELSP register, the lrcs are written cyclically
in the queue from position 0 to position 7. Alternatively, it is
possible to write directly in the individual positions of the queue
using the ELSQC registers. To be able to re-use all the existing code
we're using the latter method and we're currently limiting ourselves to
only using 2 elements.

The preemption flow is slightly different with enhanced execlists, so
this patch turns preemption off temporarily for platforms with ELSQ
while we wait for the new mechanism to land.
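
As a rough illustration of the direct-write path described above, the sketch below models the submit queue as a plain array instead of real MMIO. `struct fake_engine`, `fake_elsqc_write` and `fake_submit` are hypothetical names, not driver code; only the two-dwords-per-entry layout, the 8-entry depth and the load bit come from this patch.

```c
#include <assert.h>
#include <stdint.h>

#define SQ_PORTS 8             /* queue depth stated in the commit message */
#define EL_CTRL_LOAD (1u << 0) /* load bit, as defined in the patch */

/* Hypothetical stand-in for one engine's registers; not a driver type. */
struct fake_engine {
	uint32_t sq_contents[SQ_PORTS * 2]; /* two dwords per 64-bit descriptor */
	uint32_t control;                   /* models RING_EXECLIST_CONTROL */
};

/* Write one descriptor into a chosen queue slot, lower dword first. */
static void fake_elsqc_write(struct fake_engine *e, uint64_t desc,
			     unsigned int port)
{
	e->sq_contents[port * 2] = (uint32_t)desc;
	e->sq_contents[port * 2 + 1] = (uint32_t)(desc >> 32);
}

/* Fill the two slots in use, highest slot first, then request the load. */
static void fake_submit(struct fake_engine *e, const uint64_t desc[2])
{
	unsigned int n;

	for (n = 2; n--; )
		fake_elsqc_write(e, desc[n], n);

	/* Nothing reaches the HW until the queue is explicitly loaded. */
	e->control = EL_CTRL_LOAD;
}
```

Because both slots are rewritten on every submission, a stale entry from an earlier submission can never be loaded again, which is the property the comment in execlists_submit_ports() relies on.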

v2: Rebase.
v3: Switch from !IS_GEN11 to GEN < 11 (Daniele Ceraolo Spurio).
v4: Use the elsq registers instead of elsp. (Daniele Ceraolo Spurio)
v5: Reword commit, rename regs to be closer to specs, turn off
    preemption (Daniele), reuse engine->execlists.elsp (Chris)
v6: use has_logical_ring_elsq to differentiate the new paths

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h          |  7 ++++++-
 drivers/gpu/drm/i915/i915_pci.c          |  3 ++-
 drivers/gpu/drm/i915/intel_device_info.h |  1 +
 drivers/gpu/drm/i915/intel_lrc.c         | 35 +++++++++++++++++++++++++++-----
 drivers/gpu/drm/i915/intel_lrc.h         |  3 +++
 drivers/gpu/drm/i915/intel_ringbuffer.h  |  6 ++++--
 6 files changed, 46 insertions(+), 9 deletions(-)

Comments

Chris Wilson Jan. 24, 2018, 5:46 p.m. UTC | #1
Quoting Daniele Ceraolo Spurio (2018-01-24 17:30:07)
> From: Thomas Daniel <thomas.daniel@intel.com>
> 
> Enhanced Execlists is an upgraded version of execlists which supports
> up to 8 ports. The lrcs to be submitted are written to a submit queue
> (the ExecLists Submission Queue - ELSQ), which is then loaded on the
> HW. When writing to the ELSP register, the lrcs are written cyclically
> in the queue from position 0 to position 7. Alternatively, it is
> possible to write directly in the individual positions of the queue
> using the ELSQC registers. To be able to re-use all the existing code
> we're using the latter method and we're currently limiting ourselves to
> only using 2 elements.
> 
> The preemption flow is slightly different with enhanced execlists, so
> this patch turns preemption off temporarily for platforms with ELSQ
> while we wait for the new mechanism to land.
> 
> v2: Rebase.
> v3: Switch from !IS_GEN11 to GEN < 11 (Daniele Ceraolo Spurio).
> v4: Use the elsq registers instead of elsp. (Daniele Ceraolo Spurio)
> v5: Reword commit, rename regs to be closer to specs, turn off
>     preemption (Daniele), reuse engine->execlists.elsp (Chris)
> v6: use has_logical_ring_elsq to differentiate the new paths
> 
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h          |  7 ++++++-
>  drivers/gpu/drm/i915/i915_pci.c          |  3 ++-
>  drivers/gpu/drm/i915/intel_device_info.h |  1 +
>  drivers/gpu/drm/i915/intel_lrc.c         | 35 +++++++++++++++++++++++++++-----
>  drivers/gpu/drm/i915/intel_lrc.h         |  3 +++
>  drivers/gpu/drm/i915/intel_ringbuffer.h  |  6 ++++--
>  6 files changed, 46 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 8333692..346209a 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2741,8 +2741,13 @@ static inline unsigned int i915_sg_segment_size(void)
>  
>  #define HAS_LOGICAL_RING_CONTEXTS(dev_priv) \
>                 ((dev_priv)->info.has_logical_ring_contexts)
> +#define HAS_LOGICAL_RING_ELSQ(dev_priv) \
> +               ((dev_priv)->info.has_logical_ring_elsq)
> +
> +/* XXX: Preemption disabled for ELSQ until support for new flow lands */
>  #define HAS_LOGICAL_RING_PREEMPTION(dev_priv) \
> -               ((dev_priv)->info.has_logical_ring_preemption)
> +               ((dev_priv)->info.has_logical_ring_preemption && \
> +                !HAS_LOGICAL_RING_ELSQ(dev_priv))

It's in the intel_device_info for a reason. I knew I should not have let
Michal turn this into a macro.

I still do not see any reason why you don't just make the current
preemption work (it will) and then you can refine it if you prove it
worthwhile.
-Chris
Daniele Ceraolo Spurio Jan. 26, 2018, 12:10 a.m. UTC | #2
On 24/01/18 09:46, Chris Wilson wrote:
> Quoting Daniele Ceraolo Spurio (2018-01-24 17:30:07)
>> From: Thomas Daniel <thomas.daniel@intel.com>
>>
>> Enhanced Execlists is an upgraded version of execlists which supports
>> up to 8 ports. The lrcs to be submitted are written to a submit queue
>> (the ExecLists Submission Queue - ELSQ), which is then loaded on the
>> HW. When writing to the ELSP register, the lrcs are written cyclically
>> in the queue from position 0 to position 7. Alternatively, it is
>> possible to write directly in the individual positions of the queue
>> using the ELSQC registers. To be able to re-use all the existing code
>> we're using the latter method and we're currently limiting ourselves to
>> only using 2 elements.
>>
>> The preemption flow is slightly different with enhanced execlists, so
>> this patch turns preemption off temporarily for platforms with ELSQ
>> while we wait for the new mechanism to land.
>>
>> v2: Rebase.
>> v3: Switch from !IS_GEN11 to GEN < 11 (Daniele Ceraolo Spurio).
>> v4: Use the elsq registers instead of elsp. (Daniele Ceraolo Spurio)
>> v5: Reword commit, rename regs to be closer to specs, turn off
>>      preemption (Daniele), reuse engine->execlists.elsp (Chris)
>> v6: use has_logical_ring_elsq to differentiate the new paths
>>
>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>> Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
>> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
>> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_drv.h          |  7 ++++++-
>>   drivers/gpu/drm/i915/i915_pci.c          |  3 ++-
>>   drivers/gpu/drm/i915/intel_device_info.h |  1 +
>>   drivers/gpu/drm/i915/intel_lrc.c         | 35 +++++++++++++++++++++++++++-----
>>   drivers/gpu/drm/i915/intel_lrc.h         |  3 +++
>>   drivers/gpu/drm/i915/intel_ringbuffer.h  |  6 ++++--
>>   6 files changed, 46 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index 8333692..346209a 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -2741,8 +2741,13 @@ static inline unsigned int i915_sg_segment_size(void)
>>   
>>   #define HAS_LOGICAL_RING_CONTEXTS(dev_priv) \
>>                  ((dev_priv)->info.has_logical_ring_contexts)
>> +#define HAS_LOGICAL_RING_ELSQ(dev_priv) \
>> +               ((dev_priv)->info.has_logical_ring_elsq)
>> +
>> +/* XXX: Preemption disabled for ELSQ until support for new flow lands */
>>   #define HAS_LOGICAL_RING_PREEMPTION(dev_priv) \
>> -               ((dev_priv)->info.has_logical_ring_preemption)
>> +               ((dev_priv)->info.has_logical_ring_preemption && \
>> +                !HAS_LOGICAL_RING_ELSQ(dev_priv))
> 
> It's in the intel_device_info for a reason. I knew I should not have let
> Michal turn this into a macro.
> 

You mean setting has_logical_ring_preemption to zero directly? I thought 
the policy was to avoid setting things in device_info to values that 
don't reflect real HW capabilities and to do the hacks elsewhere.

> I still do not see any reason why you don't just make the current
> preemption work (it will) and then you can refine it if you prove it
> worthwhile.
> -Chris
> 

Just didn't see the worth of it ;). It's not a lot of code but it's in 
a hot path and we're most likely going to get rid of it soon as the new 
stuff is simpler. I'll put the change together and send it out so we can 
evaluate that and see what works better with code at hand.

Thanks,
Daniele
Chris Wilson Jan. 26, 2018, 8:47 a.m. UTC | #3
Quoting Daniele Ceraolo Spurio (2018-01-26 00:10:09)
> 
> 
> On 24/01/18 09:46, Chris Wilson wrote:
> > Quoting Daniele Ceraolo Spurio (2018-01-24 17:30:07)
> >> From: Thomas Daniel <thomas.daniel@intel.com>
> >>
> >> Enhanced Execlists is an upgraded version of execlists which supports
> >> up to 8 ports. The lrcs to be submitted are written to a submit queue
> >> (the ExecLists Submission Queue - ELSQ), which is then loaded on the
> >> HW. When writing to the ELSP register, the lrcs are written cyclically
> >> in the queue from position 0 to position 7. Alternatively, it is
> >> possible to write directly in the individual positions of the queue
> >> using the ELSQC registers. To be able to re-use all the existing code
> >> we're using the latter method and we're currently limiting ourselves to
> >> only using 2 elements.
> >>
> >> The preemption flow is slightly different with enhanced execlists, so
> >> this patch turns preemption off temporarily for platforms with ELSQ
> >> while we wait for the new mechanism to land.
> >>
> >> v2: Rebase.
> >> v3: Switch from !IS_GEN11 to GEN < 11 (Daniele Ceraolo Spurio).
> >> v4: Use the elsq registers instead of elsp. (Daniele Ceraolo Spurio)
> >> v5: Reword commit, rename regs to be closer to specs, turn off
> >>      preemption (Daniele), reuse engine->execlists.elsp (Chris)
> >> v6: use has_logical_ring_elsq to differentiate the new paths
> >>
> >> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> >> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> >> Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
> >> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> >> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> >> ---
> >>   drivers/gpu/drm/i915/i915_drv.h          |  7 ++++++-
> >>   drivers/gpu/drm/i915/i915_pci.c          |  3 ++-
> >>   drivers/gpu/drm/i915/intel_device_info.h |  1 +
> >>   drivers/gpu/drm/i915/intel_lrc.c         | 35 +++++++++++++++++++++++++++-----
> >>   drivers/gpu/drm/i915/intel_lrc.h         |  3 +++
> >>   drivers/gpu/drm/i915/intel_ringbuffer.h  |  6 ++++--
> >>   6 files changed, 46 insertions(+), 9 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> >> index 8333692..346209a 100644
> >> --- a/drivers/gpu/drm/i915/i915_drv.h
> >> +++ b/drivers/gpu/drm/i915/i915_drv.h
> >> @@ -2741,8 +2741,13 @@ static inline unsigned int i915_sg_segment_size(void)
> >>   
> >>   #define HAS_LOGICAL_RING_CONTEXTS(dev_priv) \
> >>                  ((dev_priv)->info.has_logical_ring_contexts)
> >> +#define HAS_LOGICAL_RING_ELSQ(dev_priv) \
> >> +               ((dev_priv)->info.has_logical_ring_elsq)
> >> +
> >> +/* XXX: Preemption disabled for ELSQ until support for new flow lands */
> >>   #define HAS_LOGICAL_RING_PREEMPTION(dev_priv) \
> >> -               ((dev_priv)->info.has_logical_ring_preemption)
> >> +               ((dev_priv)->info.has_logical_ring_preemption && \
> >> +                !HAS_LOGICAL_RING_ELSQ(dev_priv))
> > 
> > It's in the intel_device_info for a reason. I knew I should not have let
> > Michal turn this into a macro.
> > 
> 
> You mean setting has_logical_ring_preemption to zero directly? I thought 
> the policy was to avoid setting things in device_info to values that 
> don't reflect real HW capabilities and to do the hacks elsewhere.

No, data driven code. intel_device_info was introduced to remove having
heavy predicates so that we could see what will be enabled and what not
in one place.
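
One possible reading of the data-driven approach is sketched below: the per-platform table carries the final value and the predicate stays a plain field read. `struct fake_device_info`, the two table entries and `fake_has_preemption` are hypothetical and heavily trimmed; the flag values are illustrative only.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, trimmed-down feature table; not the real intel_device_info. */
struct fake_device_info {
	int gen;
	bool has_logical_ring_elsq;
	bool has_logical_ring_preemption;
};

/* Everything a platform enables is visible in its table entry. */
static const struct fake_device_info fake_gen10 = {
	.gen = 10,
	.has_logical_ring_elsq = false,
	.has_logical_ring_preemption = true,
};

static const struct fake_device_info fake_gen11 = {
	.gen = 11,
	.has_logical_ring_elsq = true,
	/* Turned off in the data, not hidden inside a macro. */
	.has_logical_ring_preemption = false,
};

/* The predicate does no extra logic; it only reads the table. */
static bool fake_has_preemption(const struct fake_device_info *info)
{
	return info->has_logical_ring_preemption;
}
```

With this shape, a reader can tell what is enabled on a platform from the table alone, without chasing compound conditions through the macros.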
 
> > I still do not see any reason why you don't just make the current
> > preemption work (it will) and then you can refine it if you prove it
> > worthwhile.
> > 
> 
> Just didn't see the worth of it ;). It's not a lot of code but it's in 
> a hot path and we're most likely going to get rid of it soon as the new 
> stuff is simpler. I'll put the change together and send it out so we can 
> evaluate that and see what works better with code at hand.

Is the new stuff going to be any simpler? You still need a preemption
point, so a special submission followed by detecting that in the CS
handler to do the unwind.

And whilst I am here, els is awful. Either stick with elsp and note that
they changed the name (+layout) on icl, or replace it with a generic
name. Spelling it out completely as execlists->execlists_submission is
still better than els, but submit[_reg] (or submit_hw) would be clearer.
-Chris
Daniele Ceraolo Spurio Jan. 26, 2018, 4:43 p.m. UTC | #4
On 26/01/18 00:47, Chris Wilson wrote:
> Quoting Daniele Ceraolo Spurio (2018-01-26 00:10:09)
>>
>>
>> On 24/01/18 09:46, Chris Wilson wrote:
>>> Quoting Daniele Ceraolo Spurio (2018-01-24 17:30:07)
>>>> From: Thomas Daniel <thomas.daniel@intel.com>
>>>>
>>>> Enhanced Execlists is an upgraded version of execlists which supports
>>>> up to 8 ports. The lrcs to be submitted are written to a submit queue
>>>> (the ExecLists Submission Queue - ELSQ), which is then loaded on the
>>>> HW. When writing to the ELSP register, the lrcs are written cyclically
>>>> in the queue from position 0 to position 7. Alternatively, it is
>>>> possible to write directly in the individual positions of the queue
>>>> using the ELSQC registers. To be able to re-use all the existing code
>>>> we're using the latter method and we're currently limiting ourselves to
>>>> only using 2 elements.
>>>>
>>>> The preemption flow is slightly different with enhanced execlists, so
>>>> this patch turns preemption off temporarily for platforms with ELSQ
>>>> while we wait for the new mechanism to land.
>>>>
>>>> v2: Rebase.
>>>> v3: Switch from !IS_GEN11 to GEN < 11 (Daniele Ceraolo Spurio).
>>>> v4: Use the elsq registers instead of elsp. (Daniele Ceraolo Spurio)
>>>> v5: Reword commit, rename regs to be closer to specs, turn off
>>>>       preemption (Daniele), reuse engine->execlists.elsp (Chris)
>>>> v6: use has_logical_ring_elsq to differentiate the new paths
>>>>
>>>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>>>> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>>>> Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
>>>> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
>>>> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>>>> ---
>>>>    drivers/gpu/drm/i915/i915_drv.h          |  7 ++++++-
>>>>    drivers/gpu/drm/i915/i915_pci.c          |  3 ++-
>>>>    drivers/gpu/drm/i915/intel_device_info.h |  1 +
>>>>    drivers/gpu/drm/i915/intel_lrc.c         | 35 +++++++++++++++++++++++++++-----
>>>>    drivers/gpu/drm/i915/intel_lrc.h         |  3 +++
>>>>    drivers/gpu/drm/i915/intel_ringbuffer.h  |  6 ++++--
>>>>    6 files changed, 46 insertions(+), 9 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>>>> index 8333692..346209a 100644
>>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>>> @@ -2741,8 +2741,13 @@ static inline unsigned int i915_sg_segment_size(void)
>>>>    
>>>>    #define HAS_LOGICAL_RING_CONTEXTS(dev_priv) \
>>>>                   ((dev_priv)->info.has_logical_ring_contexts)
>>>> +#define HAS_LOGICAL_RING_ELSQ(dev_priv) \
>>>> +               ((dev_priv)->info.has_logical_ring_elsq)
>>>> +
>>>> +/* XXX: Preemption disabled for ELSQ until support for new flow lands */
>>>>    #define HAS_LOGICAL_RING_PREEMPTION(dev_priv) \
>>>> -               ((dev_priv)->info.has_logical_ring_preemption)
>>>> +               ((dev_priv)->info.has_logical_ring_preemption && \
>>>> +                !HAS_LOGICAL_RING_ELSQ(dev_priv))
>>>
>>> It's in the intel_device_info for a reason. I knew I should not have let
>>> Michal turn this into a macro.
>>>
>>
>> You mean setting has_logical_ring_preemption to zero directly? I thought
>> the policy was to avoid setting things in device_info to values that
>> don't reflect real HW capabilities and to do the hacks elsewhere.
> 
> No, data driven code. intel_device_info was introduced to remove having
> heavy predicates so that we could see what will be enabled and what not
> in one place.
>   
>>> I still do not see any reason why you don't just make the current
>>> preemption work (it will) and then you can refine it if you prove it
>>> worthwhile.
>>>
>>
>> Just didn't see the worth of it ;). It's not a lot of code but it's in
>> a hot path and we're most likely going to get rid of it soon as the new
>> stuff is simpler. I'll put the change together and send it out so we can
>> evaluate that and see what works better with code at hand.
> 
> Is the new stuff going to be any simpler? You still need a preemption
> point, so a special submission followed by detecting that in the CS
> handler to do the unwind.
> 
> And whilst I am here, els is awful. Either stick with elsp and note that
> they changed the name (+layout) on icl, or replace it with a generic
> name. Spelling it out completely as execlists->execlists_submission is
> still better than els, but submit[_reg] (or submit_hw) would be clearer.
> -Chris
> 

The elsp still exists on gen11, with a slightly different behavior as 
noted in the commit message, that's why I wanted to change the name. 
execlists->submit_reg sounds good.

Daniele
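
A minimal sketch of what the agreed rename could look like, assuming a single generic `submit_reg` pointer chosen once at init time. `struct fake_execlists` and `fake_init_submit_reg` are hypothetical; the 0x510 offset is taken from this patch, while 0x230 for the ELSP is an assumption that is not stated anywhere in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FAKE_ELSP_OFFSET        0x230 /* assumed ELSP offset, not from this thread */
#define FAKE_SQ_CONTENTS_OFFSET 0x510 /* RING_EXECLIST_SQ_CONTENTS in the patch */

/* Hypothetical, trimmed-down version of the execlists state. */
struct fake_execlists {
	uint32_t *submit_reg; /* ELSP pre-Gen11, submit queue contents on Gen11+ */
};

/* Pick the submission register once; callers never see the difference. */
static void fake_init_submit_reg(struct fake_execlists *el,
				 uint32_t *mmio_base, bool has_elsq)
{
	unsigned int offset = has_elsq ? FAKE_SQ_CONTENTS_OFFSET
				       : FAKE_ELSP_OFFSET;

	/* Offsets are in bytes, mmio_base is indexed in dwords. */
	el->submit_reg = mmio_base + offset / sizeof(uint32_t);
}
```

The generic name avoids implying that the pointer always targets the ELSP, which still exists on gen11 but behaves differently.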

Patch

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 8333692..346209a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2741,8 +2741,13 @@  static inline unsigned int i915_sg_segment_size(void)
 
 #define HAS_LOGICAL_RING_CONTEXTS(dev_priv) \
 		((dev_priv)->info.has_logical_ring_contexts)
+#define HAS_LOGICAL_RING_ELSQ(dev_priv) \
+		((dev_priv)->info.has_logical_ring_elsq)
+
+/* XXX: Preemption disabled for ELSQ until support for new flow lands */
 #define HAS_LOGICAL_RING_PREEMPTION(dev_priv) \
-		((dev_priv)->info.has_logical_ring_preemption)
+		((dev_priv)->info.has_logical_ring_preemption && \
+		 !HAS_LOGICAL_RING_ELSQ(dev_priv))
 
 #define HAS_EXECLISTS(dev_priv) HAS_LOGICAL_RING_CONTEXTS(dev_priv)
 
diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index f28c165..6c86cba 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -583,7 +583,8 @@ 
 	GEN10_FEATURES, \
 	.gen = 11, \
 	.ddb_size = 2048, \
-	.has_csr = 0
+	.has_csr = 0, \
+	.has_logical_ring_elsq = 1
 
 static const struct intel_device_info intel_icelake_11_info __initconst = {
 	GEN11_FEATURES,
diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h
index 9542018..dbf0f2d 100644
--- a/drivers/gpu/drm/i915/intel_device_info.h
+++ b/drivers/gpu/drm/i915/intel_device_info.h
@@ -96,6 +96,7 @@  enum intel_platform {
 	func(has_l3_dpf); \
 	func(has_llc); \
 	func(has_logical_ring_contexts); \
+	func(has_logical_ring_elsq); \
 	func(has_logical_ring_preemption); \
 	func(has_overlay); \
 	func(has_pooled_eu); \
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 22d471a..6c7cf93 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -428,11 +428,24 @@  static inline void elsp_write(u64 desc, u32 __iomem *elsp)
 	writel(lower_32_bits(desc), elsp);
 }
 
+static inline void elsqc_write(u64 desc, u32 __iomem *elsqc, u32 port)
+{
+	writel(lower_32_bits(desc), elsqc + port * 2);
+	writel(upper_32_bits(desc), elsqc + port * 2 + 1);
+}
+
 static void execlists_submit_ports(struct intel_engine_cs *engine)
 {
+	struct drm_i915_private *dev_priv = engine->i915;
 	struct execlist_port *port = engine->execlists.port;
 	unsigned int n;
 
+	/*
+	 * ELSQ note: the submit queue is not cleared after being submitted
+	 * to the HW so we need to make sure we always clean it up. This is
+	 * currently ensured by the fact that we always write the same number
+	 * of elsq entries, keep this in mind before changing the loop below.
+	 */
 	for (n = execlists_num_ports(&engine->execlists); n--; ) {
 		struct drm_i915_gem_request *rq;
 		unsigned int count;
@@ -456,8 +469,16 @@  static void execlists_submit_ports(struct intel_engine_cs *engine)
 			desc = 0;
 		}
 
-		elsp_write(desc, engine->execlists.elsp);
+		if (HAS_LOGICAL_RING_ELSQ(dev_priv))
+			elsqc_write(desc, engine->execlists.els, n);
+		else
+			elsp_write(desc, engine->execlists.els);
 	}
+
+	/* we need to manually load the submit queue */
+	if (HAS_LOGICAL_RING_ELSQ(dev_priv))
+		I915_WRITE_FW(RING_EXECLIST_CONTROL(engine), EL_CTRL_LOAD);
+
 	execlists_clear_active(&engine->execlists, EXECLISTS_ACTIVE_HWACK);
 }
 
@@ -506,9 +527,9 @@  static void inject_preempt_context(struct intel_engine_cs *engine)
 
 	GEM_TRACE("%s\n", engine->name);
 	for (n = execlists_num_ports(&engine->execlists); --n; )
-		elsp_write(0, engine->execlists.elsp);
+		elsp_write(0, engine->execlists.els);
 
-	elsp_write(ce->lrc_desc, engine->execlists.elsp);
+	elsp_write(ce->lrc_desc, engine->execlists.els);
 	execlists_clear_active(&engine->execlists, EXECLISTS_ACTIVE_HWACK);
 }
 
@@ -2022,8 +2043,12 @@  static int logical_ring_init(struct intel_engine_cs *engine)
 	if (ret)
 		goto error;
 
-	engine->execlists.elsp =
-		engine->i915->regs + i915_mmio_reg_offset(RING_ELSP(engine));
+	if (HAS_LOGICAL_RING_ELSQ(engine->i915))
+		engine->execlists.els = engine->i915->regs +
+			i915_mmio_reg_offset(RING_EXECLIST_SQ_CONTENTS(engine));
+	else
+		engine->execlists.els = engine->i915->regs +
+			i915_mmio_reg_offset(RING_ELSP(engine));
 
 	return 0;
 
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index 6d4f9b9..3ab4266 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -38,6 +38,9 @@ 
 #define	  CTX_CTRL_ENGINE_CTX_RESTORE_INHIBIT	(1 << 0)
 #define   CTX_CTRL_RS_CTX_ENABLE                (1 << 1)
 #define RING_CONTEXT_STATUS_BUF_BASE(engine)	_MMIO((engine)->mmio_base + 0x370)
+#define RING_EXECLIST_SQ_CONTENTS(engine)	_MMIO((engine)->mmio_base + 0x510)
+#define RING_EXECLIST_CONTROL(engine)		_MMIO((engine)->mmio_base + 0x550)
+#define	  EL_CTRL_LOAD				(1 << 0)
 #define RING_CONTEXT_STATUS_BUF_LO(engine, i)	_MMIO((engine)->mmio_base + 0x370 + (i) * 8)
 #define RING_CONTEXT_STATUS_BUF_HI(engine, i)	_MMIO((engine)->mmio_base + 0x370 + (i) * 8 + 4)
 #define RING_CONTEXT_STATUS_PTR(engine)		_MMIO((engine)->mmio_base + 0x3a0)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index c5ff203..d36bb73 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -200,9 +200,11 @@  struct intel_engine_execlists {
 	bool no_priolist;
 
 	/**
-	 * @elsp: the ExecList Submission Port register
+	 * @els: gen-specific execlist submission register
+	 * set to the ExecList Submission Port (elsp) register pre-Gen11 and to
+	 * the ExecList Submission Queue Contents register array for Gen11+
 	 */
-	u32 __iomem *elsp;
+	u32 __iomem *els;
 
 	/**
 	 * @port: execlist port states