[mesa] i965/gen8+: bo in state base address must be in 32-bit address range
diff mbox

Message ID 1435062089-19877-2-git-send-email-michel.thierry@intel.com
State New
Headers show

Commit Message

Michel Thierry June 23, 2015, 12:21 p.m. UTC
Gen8+ supports 48-bit virtual addresses, but some objects must always be
allocated inside the 32-bit address range.

In specific, any resource used with flat/heapless (0x00000000-0xfffff000)
General State Heap (GSH) or Intruction State Heap (ISH) must be in a
32-bit range, because the General State Offset and Instruction State Offset
are limited to 32-bits.

Set provided bo flag when the 4GB limit is not necessary, to be able to use
the full address space.

Cc: mesa-dev@lists.freedesktop.org
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
---
 src/mesa/drivers/dri/i965/gen8_misc_state.c   | 6 +++---
 src/mesa/drivers/dri/i965/intel_batchbuffer.h | 7 +++++++
 2 files changed, 10 insertions(+), 3 deletions(-)

Comments

Ben Widawsky June 24, 2015, 3:51 a.m. UTC | #1
Hi. Feel free to Cc me on patches of this nature. I am far behind on mesa-dev,
and no longer read intel-gfx. I'm probably one of the sensible people to look at
this...

On Tue, Jun 23, 2015 at 01:21:27PM +0100, Michel Thierry wrote:
> Gen8+ supports 48-bit virtual addresses, but some objects must always be
> allocated inside the 32-bit address range.
> 
> In specific, any resource used with flat/heapless (0x00000000-0xfffff000)
> General State Heap (GSH) or Intruction State Heap (ISH) must be in a
> 32-bit range, because the General State Offset and Instruction State Offset
> are limited to 32-bits.

I don't think GSH, or ISH are well known terms that have every appeared
anywhere. I'd just keep the bit after the final comma (...because ...)

> 
> Set provided bo flag when the 4GB limit is not necessary, to be able to use
> the full address space.

I'm glad you got around to this. We'd been putting it off for a long time.

> 
> Cc: mesa-dev@lists.freedesktop.org
> Signed-off-by: Michel Thierry <michel.thierry@intel.com>
> ---
>  src/mesa/drivers/dri/i965/gen8_misc_state.c   | 6 +++---
>  src/mesa/drivers/dri/i965/intel_batchbuffer.h | 7 +++++++
>  2 files changed, 10 insertions(+), 3 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/gen8_misc_state.c b/src/mesa/drivers/dri/i965/gen8_misc_state.c
> index b20038e..26531d0 100644
> --- a/src/mesa/drivers/dri/i965/gen8_misc_state.c
> +++ b/src/mesa/drivers/dri/i965/gen8_misc_state.c
> @@ -41,17 +41,17 @@ void gen8_upload_state_base_address(struct brw_context *brw)
>     OUT_BATCH(0);
>     OUT_BATCH(mocs_wb << 16);
>     /* Surface state base address: */
> -   OUT_RELOC64(brw->batch.bo, I915_GEM_DOMAIN_SAMPLER, 0,
> +   OUT_RELOC64_32BWA(brw->batch.bo, I915_GEM_DOMAIN_SAMPLER, 0,
>                 mocs_wb << 4 | 1);
>     /* Dynamic state base address: */
> -   OUT_RELOC64(brw->batch.bo,
> +   OUT_RELOC64_32BWA(brw->batch.bo,
>                 I915_GEM_DOMAIN_RENDER | I915_GEM_DOMAIN_INSTRUCTION, 0,
>                 mocs_wb << 4 | 1);
>     /* Indirect object base address: MEDIA_OBJECT data */
>     OUT_BATCH(mocs_wb << 4 | 1);
>     OUT_BATCH(0);
>     /* Instruction base address: shader kernels (incl. SIP) */
> -   OUT_RELOC64(brw->cache.bo, I915_GEM_DOMAIN_INSTRUCTION, 0,
> +   OUT_RELOC64_32BWA(brw->cache.bo, I915_GEM_DOMAIN_INSTRUCTION, 0,
>                 mocs_wb << 4 | 1);
>  
>     /* General state buffer size */
> diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.h b/src/mesa/drivers/dri/i965/intel_batchbuffer.h
> index 7bdd836..5aa741e 100644
> --- a/src/mesa/drivers/dri/i965/intel_batchbuffer.h
> +++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.h
> @@ -177,6 +177,13 @@ intel_batchbuffer_advance(struct brw_context *brw)
>  
>  /* Handle 48-bit address relocations for Gen8+ */
>  #define OUT_RELOC64(buf, read_domains, write_domain, delta) do { \
> +   drm_intel_bo_set_supports_48baddress(buf); \
> +   intel_batchbuffer_emit_reloc64(brw, buf, read_domains, write_domain, delta);	\
> +} while (0)
> +
> +/* Handle 48-bit address relocations for Gen8+, ask for 32-bit address */
> +#define OUT_RELOC64_32BWA(buf, read_domains, write_domain, delta) do { \
> +   drm_intel_bo_clear_supports_48baddress(buf); \
>     intel_batchbuffer_emit_reloc64(brw, buf, read_domains, write_domain, delta);	\
>  } while (0)
>  

First and least bikesheddy, you need to bump the required libdrm in the
configure.ac to support this new libdrm function (maybe you did, but I don't see
it on mesa-dev).

More bikesheddy, and forgive me here because I haven't looked at any of the
kernel interfaces or libdrm patches (you can Cc those to mesa-dev if they're
relevant fwiw).

Presumably at the end of the day it's drm_intel_bo_emit_reloc which needs to
know about these limitations. Unfortunately we don't have a flags field there.
The implementation here seems like a somewhat cumbersome workaround for that (it
looks like the context execbuf which is pretty crappy - yes, I know who the
author was). Have you already discussed adding a new emit_reloc? I suppose if
people are opposed to a new emit reloc, the only I'd like to see different is
have the functions which need the workaround just call OUT_RELOC, instead of
OUT_RELOC64 (put a comment in the call sites), and make OUT_RELOC call the
drm_intel_bo_clear_supports_48baddress() (which is obviously a nop on pre-gen8
platforms). The OUT_RELOC64 case should be left alone - we shouldn't need to
tell libdrm that I want a 64bit relocation, and it can actually be 64b...

I suspect not many other mesa devs will have an opinion here, but I'm flexible
if they disagree.
Michel Thierry June 25, 2015, 1:10 p.m. UTC | #2
On 6/24/2015 4:51 AM, Ben Widawsky wrote:
> Hi. Feel free to Cc me on patches of this nature. I am far behind on mesa-dev,
> and no longer read intel-gfx. I'm probably one of the sensible people to look at
> this...
>
> On Tue, Jun 23, 2015 at 01:21:27PM +0100, Michel Thierry wrote:
>> Gen8+ supports 48-bit virtual addresses, but some objects must always be
>> allocated inside the 32-bit address range.
>>
>> In specific, any resource used with flat/heapless (0x00000000-0xfffff000)
>> General State Heap (GSH) or Intruction State Heap (ISH) must be in a
>> 32-bit range, because the General State Offset and Instruction State Offset
>> are limited to 32-bits.
>
> I don't think GSH, or ISH are well known terms that have every appeared
> anywhere. I'd just keep the bit after the final comma (...because ...)
>
>>
>> Set provided bo flag when the 4GB limit is not necessary, to be able to use
>> the full address space.
>
> I'm glad you got around to this. We'd been putting it off for a long time.
>
>>
>> Cc: mesa-dev@lists.freedesktop.org
>> Signed-off-by: Michel Thierry <michel.thierry@intel.com>
>> ---
>>   src/mesa/drivers/dri/i965/gen8_misc_state.c   | 6 +++---
>>   src/mesa/drivers/dri/i965/intel_batchbuffer.h | 7 +++++++
>>   2 files changed, 10 insertions(+), 3 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/gen8_misc_state.c b/src/mesa/drivers/dri/i965/gen8_misc_state.c
>> index b20038e..26531d0 100644
>> --- a/src/mesa/drivers/dri/i965/gen8_misc_state.c
>> +++ b/src/mesa/drivers/dri/i965/gen8_misc_state.c
>> @@ -41,17 +41,17 @@ void gen8_upload_state_base_address(struct brw_context *brw)
>>      OUT_BATCH(0);
>>      OUT_BATCH(mocs_wb << 16);
>>      /* Surface state base address: */
>> -   OUT_RELOC64(brw->batch.bo, I915_GEM_DOMAIN_SAMPLER, 0,
>> +   OUT_RELOC64_32BWA(brw->batch.bo, I915_GEM_DOMAIN_SAMPLER, 0,
>>                  mocs_wb << 4 | 1);
>>      /* Dynamic state base address: */
>> -   OUT_RELOC64(brw->batch.bo,
>> +   OUT_RELOC64_32BWA(brw->batch.bo,
>>                  I915_GEM_DOMAIN_RENDER | I915_GEM_DOMAIN_INSTRUCTION, 0,
>>                  mocs_wb << 4 | 1);
>>      /* Indirect object base address: MEDIA_OBJECT data */
>>      OUT_BATCH(mocs_wb << 4 | 1);
>>      OUT_BATCH(0);
>>      /* Instruction base address: shader kernels (incl. SIP) */
>> -   OUT_RELOC64(brw->cache.bo, I915_GEM_DOMAIN_INSTRUCTION, 0,
>> +   OUT_RELOC64_32BWA(brw->cache.bo, I915_GEM_DOMAIN_INSTRUCTION, 0,
>>                  mocs_wb << 4 | 1);
>>
>>      /* General state buffer size */
>> diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.h b/src/mesa/drivers/dri/i965/intel_batchbuffer.h
>> index 7bdd836..5aa741e 100644
>> --- a/src/mesa/drivers/dri/i965/intel_batchbuffer.h
>> +++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.h
>> @@ -177,6 +177,13 @@ intel_batchbuffer_advance(struct brw_context *brw)
>>
>>   /* Handle 48-bit address relocations for Gen8+ */
>>   #define OUT_RELOC64(buf, read_domains, write_domain, delta) do { \
>> +   drm_intel_bo_set_supports_48baddress(buf); \
>> +   intel_batchbuffer_emit_reloc64(brw, buf, read_domains, write_domain, delta);	\
>> +} while (0)
>> +
>> +/* Handle 48-bit address relocations for Gen8+, ask for 32-bit address */
>> +#define OUT_RELOC64_32BWA(buf, read_domains, write_domain, delta) do { \
>> +   drm_intel_bo_clear_supports_48baddress(buf); \
>>      intel_batchbuffer_emit_reloc64(brw, buf, read_domains, write_domain, delta);	\
>>   } while (0)
>>
>
> First and least bikesheddy, you need to bump the required libdrm in the
> configure.ac to support this new libdrm function (maybe you did, but I don't see
> it on mesa-dev).

I didn't bump the libdrm version either. I'll make sure to include this 
in the next version.

>
> More bikesheddy, and forgive me here because I haven't looked at any of the
> kernel interfaces or libdrm patches (you can Cc those to mesa-dev if they're
> relevant fwiw).
>
> Presumably at the end of the day it's drm_intel_bo_emit_reloc which needs to
> know about these limitations. Unfortunately we don't have a flags field there.
> The implementation here seems like a somewhat cumbersome workaround for that (it
> looks like the context execbuf which is pretty crappy - yes, I know who the
> author was). Have you already discussed adding a new emit_reloc? I suppose if
> people are opposed to a new emit reloc, the only I'd like to see different is
> have the functions which need the workaround just call OUT_RELOC, instead of
> OUT_RELOC64 (put a comment in the call sites), and make OUT_RELOC call the
> drm_intel_bo_clear_supports_48baddress() (which is obviously a nop on pre-gen8
> platforms). The OUT_RELOC64 case should be left alone - we shouldn't need to
> tell libdrm that I want a 64bit relocation, and it can actually be 64b...

Right, at the end I was only looking to flag the exec_objects that can 
be safely outside the first 4GBs (it was decided to be an opt-in to not 
break any other user). I can change it like you suggest, OUT_RELOC will 
mean that we need the wa, while OUT_RELOC64 means we don't.

If you want to take a look, these are the libdrm and kernel changes:

libdrm: 
http://news.gmane.org/find-root.php?message_id=1435062089-19877-1-git-send-email-michel.thierry@intel.com

i915: 
http://news.gmane.org/find-root.php?message_id=1435062065-19717-1-git-send-email-michel.thierry@intel.com


>
> I suspect not many other mesa devs will have an opinion here, but I'm flexible
> if they disagree.
>

I'll cc you when I send the updated patches.

Thanks,

-Michel

Patch
diff mbox

diff --git a/src/mesa/drivers/dri/i965/gen8_misc_state.c b/src/mesa/drivers/dri/i965/gen8_misc_state.c
index b20038e..26531d0 100644
--- a/src/mesa/drivers/dri/i965/gen8_misc_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_misc_state.c
@@ -41,17 +41,17 @@  void gen8_upload_state_base_address(struct brw_context *brw)
    OUT_BATCH(0);
    OUT_BATCH(mocs_wb << 16);
    /* Surface state base address: */
-   OUT_RELOC64(brw->batch.bo, I915_GEM_DOMAIN_SAMPLER, 0,
+   OUT_RELOC64_32BWA(brw->batch.bo, I915_GEM_DOMAIN_SAMPLER, 0,
                mocs_wb << 4 | 1);
    /* Dynamic state base address: */
-   OUT_RELOC64(brw->batch.bo,
+   OUT_RELOC64_32BWA(brw->batch.bo,
                I915_GEM_DOMAIN_RENDER | I915_GEM_DOMAIN_INSTRUCTION, 0,
                mocs_wb << 4 | 1);
    /* Indirect object base address: MEDIA_OBJECT data */
    OUT_BATCH(mocs_wb << 4 | 1);
    OUT_BATCH(0);
    /* Instruction base address: shader kernels (incl. SIP) */
-   OUT_RELOC64(brw->cache.bo, I915_GEM_DOMAIN_INSTRUCTION, 0,
+   OUT_RELOC64_32BWA(brw->cache.bo, I915_GEM_DOMAIN_INSTRUCTION, 0,
                mocs_wb << 4 | 1);
 
    /* General state buffer size */
diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.h b/src/mesa/drivers/dri/i965/intel_batchbuffer.h
index 7bdd836..5aa741e 100644
--- a/src/mesa/drivers/dri/i965/intel_batchbuffer.h
+++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.h
@@ -177,6 +177,13 @@  intel_batchbuffer_advance(struct brw_context *brw)
 
 /* Handle 48-bit address relocations for Gen8+ */
 #define OUT_RELOC64(buf, read_domains, write_domain, delta) do { \
+   drm_intel_bo_set_supports_48baddress(buf); \
+   intel_batchbuffer_emit_reloc64(brw, buf, read_domains, write_domain, delta);	\
+} while (0)
+
+/* Handle 48-bit address relocations for Gen8+, ask for 32-bit address */
+#define OUT_RELOC64_32BWA(buf, read_domains, write_domain, delta) do { \
+   drm_intel_bo_clear_supports_48baddress(buf); \
    intel_batchbuffer_emit_reloc64(brw, buf, read_domains, write_domain, delta);	\
 } while (0)