Message ID | 1493390184-27761-1-git-send-email-oscar.mateo@intel.com (mailing list archive) |
---|---|
State | New, archived |
Hi Oscar,

I had missed this patch here, but noticed now that I was refreshing and testing more cnl tests before re-submitting them.

First of all I believe we need to remove the A0 w/a. I don't believe we will ever see one. So I'm removing all A0 exclusive W/a from the patches as well.

I also gave a try here on your null state. However if I use the golden state generated by this version I get a blank screen because driver load fails with some strange faults:

any idea?

[ 4.115243] Memory manager not clean during takedown.
[ 4.120389] ------------[ cut here ]------------
[ 4.125068] WARNING: CPU: 0 PID: 1 at drivers/gpu/drm/drm_mm.c:892 drm_mm_takedown+0x25/0x30
[ 4.133574] Modules linked in:
[ 4.136707] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.12.0-eywa-46011-g9a19faf #360
[ 4.144650] Hardware name: Intel Corporation Cannonlake Client platform CNL - U0/CannonLake U DDR4 SODIMM RVP, BIOS CNLSFWR1.R00.X075.D01.1703021113 03/02
[ 4.158500] task: ffff880264ab8000 task.stack: ffffc90000038000
[ 4.164506] RIP: 0010:drm_mm_takedown+0x25/0x30
[ 4.169104] RSP: 0000:ffffc9000003bc28 EFLAGS: 00010292
[ 4.174409] RAX: 0000000000000029 RBX: ffff880260a54170 RCX: ffffffff82468740
[ 4.181654] RDX: 0000000000000001 RSI: 0000000000000082 RDI: 00000000ffffffff
[ 4.188839] RBP: ffffc9000003bc28 R08: 00000000fffffffe R09: 000000000000035a
[ 4.196028] R10: 0000000000000005 R11: 0000000000000000 R12: ffff880260a50000
[ 4.203215] R13: ffff880260a54348 R14: ffff880260a50070 R15: ffff880262844a00
[ 4.210402] FS: 0000000000000000(0000) GS:ffff88026dc00000(0000) knlGS:0000000000000000
[ 4.218541] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 4.224344] CR2: ffffc90000d58000 CR3: 000000000240b000 CR4: 00000000007406f0
[ 4.231529] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 4.238716] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 4.245900] PKRU: 00000000
[ 4.248673] Call Trace:
[ 4.251193] i915_gem_cleanup_stolen+0x1f/0x30
[ 4.255703] i915_ggtt_cleanup_hw+0xa4/0x170
[ 4.260035] i915_driver_cleanup_hw+0x36/0x40
[ 4.264455] i915_driver_load+0x6a0/0xe70
[ 4.268535] ? _raw_spin_unlock_irqrestore+0x26/0x50
[ 4.273560] i915_pci_probe+0x2c/0x50
[ 4.277293] local_pci_probe+0x45/0xa0
[ 4.281106] ? pci_match_device+0xe0/0x110
[ 4.285265] pci_device_probe+0x135/0x150
[ 4.289343] driver_probe_device+0x288/0x490
[ 4.293676] __driver_attach+0xc9/0xf0
[ 4.297490] ? driver_probe_device+0x490/0x490
[ 4.301999] bus_for_each_dev+0x5d/0x90
[ 4.305902] driver_attach+0x1e/0x20
[ 4.309543] bus_add_driver+0x1d0/0x290
[ 4.313442] driver_register+0x60/0xe0
[ 4.317257] __pci_register_driver+0x5d/0x60
[ 4.321652] i915_init+0x59/0x5c
[ 4.324944] ? mipi_dsi_bus_init+0x17/0x17
[ 4.329103] do_one_initcall+0x42/0x180
[ 4.333007] kernel_init_freeable+0x17c/0x202
[ 4.337426] ? set_debug_rodata+0x17/0x17
[ 4.341500] ? rest_init+0x90/0x90
[ 4.344969] kernel_init+0xe/0x110
[ 4.348438] ret_from_fork+0x25/0x30
[ 4.352079] Code: 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 47 38 48 83 c7 38 48 39 c7 75 01 c3 55 48 c7 c7 70 ac 20 82 31 c0 48 89 e5 e8 6b 62 b7 ff <0f> ff 5d c3 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41
[ 4.371029] ---[ end trace 7d36c2dd72851315 ]---
[ 4.381680] WARN_ON(dev_priv->mm.object_count)
[ 4.381698] ------------[ cut here ]------------
[ 4.390921] WARNING: CPU: 0 PID: 1 at drivers/gpu/drm/i915/i915_gem.c:4964 i915_gem_load_cleanup+0x10b/0x120
[ 4.400797] Modules linked in:
[ 4.403927] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 4.12.0-eywa-46011-g9a19faf #360
[ 4.413021] Hardware name: Intel Corporation Cannonlake Client platform CNL - U0/CannonLake U DDR4 SODIMM RVP, BIOS CNLSFWR1.R00.X075.D01.1703021113 03/02
[ 4.426884] task: ffff880264ab8000 task.stack: ffffc90000038000
[ 4.432865] RIP: 0010:i915_gem_load_cleanup+0x10b/0x120
[ 4.438157] RSP: 0000:ffffc9000003bc58 EFLAGS: 00010292
[ 4.443450] RAX: 0000000000000022 RBX: ffff880260a50000 RCX: ffffffff82468740
[ 4.450642] RDX: 0000000000000001 RSI: 0000000000000082 RDI: 0000000000000202
[ 4.457839] RBP: ffffc9000003bc68 R08: 0000000000000022 R09: 0000000000000389
[ 4.465029] R10: 0000000000000000 R11: 0000000000000001 R12: ffff880260a54678
[ 4.472227] R13: ffff88026446c000 R14: ffff88026446c000 R15: ffff880262844a00
[ 4.479420] FS: 0000000000000000(0000) GS:ffff88026dc00000(0000) knlGS:0000000000000000
[ 4.487564] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 4.493370] CR2: ffffc90000d58000 CR3: 000000000240b000 CR4: 00000000007406f0
[ 4.500569] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 4.507763] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 4.514959] PKRU: 00000000
[ 4.517737] Call Trace:
[ 4.520265] i915_driver_cleanup_early+0x1a/0x50
[ 4.524955] i915_driver_load+0x6b8/0xe70
[ 4.529038] ? _raw_spin_unlock_irqrestore+0x26/0x50
[ 4.534100] clocksource: Switched to clocksource tsc
[ 4.534105] i915_pci_probe+0x2c/0x50
[ 4.534113] local_pci_probe+0x45/0xa0
[ 4.534118] ? pci_match_device+0xe0/0x110
[ 4.534124] pci_device_probe+0x135/0x150
[ 4.534131] driver_probe_device+0x288/0x490
[ 4.534137] __driver_attach+0xc9/0xf0
[ 4.534142] ? driver_probe_device+0x490/0x490
[ 4.534146] bus_for_each_dev+0x5d/0x90
[ 4.534152] driver_attach+0x1e/0x20
[ 4.534156] bus_add_driver+0x1d0/0x290
[ 4.534162] driver_register+0x60/0xe0
[ 4.534167] __pci_register_driver+0x5d/0x60
[ 4.534173] i915_init+0x59/0x5c
[ 4.534177] ? mipi_dsi_bus_init+0x17/0x17
[ 4.534181] do_one_initcall+0x42/0x180
[ 4.534187] kernel_init_freeable+0x17c/0x202
[ 4.534191] ? set_debug_rodata+0x17/0x17
[ 4.534196] ? rest_init+0x90/0x90
[ 4.534200] kernel_init+0xe/0x110
[ 4.534204] ret_from_fork+0x25/0x30
[ 4.534208] Code: 82 48 c7 c7 7a 98 1a 82 31 c0 e8 21 4f b1 ff 0f ff e9 7a ff ff ff 48 c7 c6 88 33 21 82 48 c7 c7 7a 98 1a 82 31 c0 e8 05 4f b1 ff <0f> ff e9 33 ff ff ff 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00
[ 4.534272] ---[ end trace 7d36c2dd72851316 ]---
[ 4.534277] WARN_ON(!list_empty(&dev_priv->gt.timelines))
[ 4.534293] ------------[ cut here ]------------
[ 4.534298] WARNING: CPU: 0 PID: 1 at drivers/gpu/drm/i915/i915_gem.c:4968 i915_gem_load_cleanup+0xef/0x120
[ 4.534299] Modules linked in:
[ 4.534304] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 4.12.0-eywa-46011-g9a19faf #360
[ 4.534306] Hardware name: Intel Corporation Cannonlake Client platform CNL - U0/CannonLake U DDR4 SODIMM RVP, BIOS CNLSFWR1.R00.X075.D01.1703021113 03/02
[ 4.534308] task: ffff880264ab8000 task.stack: ffffc90000038000
[ 4.534312] RIP: 0010:i915_gem_load_cleanup+0xef/0x120
[ 4.534314] RSP: 0000:ffffc9000003bc58 EFLAGS: 00010292
[ 4.534317] RAX: 000000000000002d RBX: ffff880260a50000 RCX: 0000000000000000
[ 4.534319] RDX: 0000000000000001 RSI: 0000000000000002 RDI: 0000000000000296
[ 4.534321] RBP: ffffc9000003bc68 R08: 000000000000002d R09: 000000000000002d
[ 4.534322] R10: 0000000000000000 R11: ffff880260a4e000 R12: ffff880260a50070
[ 4.534324] R13: ffff88026446c000 R14: ffff88026446c000 R15: ffff880262844a00
[ 4.534327] FS: 0000000000000000(0000) GS:ffff88026dc00000(0000) knlGS:0000000000000000
[ 4.534329] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 4.534331] CR2: ffffc90000d58000 CR3: 000000000240b000 CR4: 00000000007406f0
[ 4.534334] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 4.534335] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 4.534337] PKRU: 00000000
[ 4.534338] Call Trace:
[ 4.534344] i915_driver_cleanup_early+0x1a/0x50
[ 4.534350] i915_driver_load+0x6b8/0xe70
[ 4.534356] ? _raw_spin_unlock_irqrestore+0x26/0x50
[ 4.534361] i915_pci_probe+0x2c/0x50
[ 4.534366] local_pci_probe+0x45/0xa0
[ 4.534371] ? pci_match_device+0xe0/0x110
[ 4.534376] pci_device_probe+0x135/0x150
[ 4.534382] driver_probe_device+0x288/0x490
[ 4.534388] __driver_attach+0xc9/0xf0
[ 4.534393] ? driver_probe_device+0x490/0x490
[ 4.534398] bus_for_each_dev+0x5d/0x90
[ 4.534403] driver_attach+0x1e/0x20
[ 4.534408] bus_add_driver+0x1d0/0x290
[ 4.534414] driver_register+0x60/0xe0
[ 4.534419] __pci_register_driver+0x5d/0x60
[ 4.534424] i915_init+0x59/0x5c
[ 4.534428] ? mipi_dsi_bus_init+0x17/0x17
[ 4.534431] do_one_initcall+0x42/0x180
[ 4.534437] kernel_init_freeable+0x17c/0x202
[ 4.534440] ? set_debug_rodata+0x17/0x17
[ 4.534444] ? rest_init+0x90/0x90
[ 4.534448] kernel_init+0xe/0x110
[ 4.534451] ret_from_fork+0x25/0x30
[ 4.534455] Code: 82 48 c7 c7 7a 98 1a 82 31 c0 e8 3d 4f b1 ff 0f ff e9 5d ff ff ff 48 c7 c6 b0 33 21 82 48 c7 c7 7a 98 1a 82 31 c0 e8 21 4f b1 ff <0f> ff e9 7a ff ff ff 48 c7 c6 88 33 21 82 48 c7 c7 7a 98 1a 82
[ 4.534519] ---[ end trace 7d36c2dd72851317 ]---
[ 4.534605] =============================================================================
[ 4.534608] BUG drm_i915_gem_object (Tainted: G W ): Objects remaining in drm_i915_gem_object on __kmem_cache_shutdown()
[ 4.534609] -----------------------------------------------------------------------------
[ 4.534611] Disabling lock debugging due to kernel taint
[ 4.534614] INFO: Slab 0xffffea0009820600 objects=19 used=2 fp=0xffff88026081ba80 flags=0x200000000008100
[ 4.534618] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G B W 4.12.0-eywa-46011-g9a19faf #360
[ 4.534620] Hardware name: Intel Corporation Cannonlake Client platform CNL - U0/CannonLake U DDR4 SODIMM RVP, BIOS CNLSFWR1.R00.X075.D01.1703021113 03/02
[ 4.534621] Call Trace:
[ 4.534626] dump_stack+0x65/0x89
[ 4.534633] slab_err+0xa1/0xb0
[ 4.534640] ? __kmalloc+0x185/0x270
[ 4.534645] ? kmem_cache_alloc_bulk+0x1f0/0x1f0
[ 4.534650] ? __kmem_cache_shutdown+0x160/0x400
[ 4.534655] __kmem_cache_shutdown+0x180/0x400
[ 4.534663] shutdown_cache+0x18/0x1a0
[ 4.534667] kmem_cache_destroy+0x1c1/0x1f0
[ 4.534672] i915_gem_load_cleanup+0xb4/0x120
[ 4.534677] i915_driver_cleanup_early+0x1a/0x50
[ 4.534682] i915_driver_load+0x6b8/0xe70
[ 4.534689] ? _raw_spin_unlock_irqrestore+0x26/0x50
[ 4.534693] i915_pci_probe+0x2c/0x50
[ 4.534698] local_pci_probe+0x45/0xa0
[ 4.534703] ? pci_match_device+0xe0/0x110
[ 4.534708] pci_device_probe+0x135/0x150
[ 4.534714] driver_probe_device+0x288/0x490
[ 4.534721] __driver_attach+0xc9/0xf0
[ 4.534726] ? driver_probe_device+0x490/0x490
[ 4.534730] bus_for_each_dev+0x5d/0x90
[ 4.534736] driver_attach+0x1e/0x20
[ 4.534741] bus_add_driver+0x1d0/0x290
[ 4.534746] driver_register+0x60/0xe0
[ 4.534751] __pci_register_driver+0x5d/0x60
[ 4.534756] i915_init+0x59/0x5c
[ 4.534760] ? mipi_dsi_bus_init+0x17/0x17
[ 4.534763] do_one_initcall+0x42/0x180
[ 4.534769] kernel_init_freeable+0x17c/0x202
[ 4.534773] ? set_debug_rodata+0x17/0x17
[ 4.534777] ? rest_init+0x90/0x90
[ 4.534781] kernel_init+0xe/0x110
[ 4.534784] ret_from_fork+0x25/0x30
[ 4.534791] INFO: Object 0xffff880260818340 @offset=832
[ 4.534792] INFO: Object 0xffff880260818680 @offset=1664
[ 4.534795] kmem_cache_destroy drm_i915_gem_object: Slab cache still has objects
[ 4.534798] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G B W 4.12.0-eywa-46011-g9a19faf #360
[ 4.534800] Hardware name: Intel Corporation Cannonlake Client platform CNL - U0/CannonLake U DDR4 SODIMM RVP, BIOS CNLSFWR1.R00.X075.D01.1703021113 03/02
[ 4.534801] Call Trace:
[ 4.534805] dump_stack+0x65/0x89
[ 4.534809] kmem_cache_destroy+0x1e1/0x1f0
[ 4.534814] i915_gem_load_cleanup+0xb4/0x120
[ 4.534819] i915_driver_cleanup_early+0x1a/0x50
[ 4.534824] i915_driver_load+0x6b8/0xe70
[ 4.534830] ? _raw_spin_unlock_irqrestore+0x26/0x50
[ 4.534835] i915_pci_probe+0x2c/0x50
[ 4.534840] local_pci_probe+0x45/0xa0
[ 4.534844] ? pci_match_device+0xe0/0x110
[ 4.534850] pci_device_probe+0x135/0x150
[ 4.534856] driver_probe_device+0x288/0x490
[ 4.534862] __driver_attach+0xc9/0xf0
[ 4.534867] ? driver_probe_device+0x490/0x490
[ 4.534871] bus_for_each_dev+0x5d/0x90
[ 4.534877] driver_attach+0x1e/0x20
[ 4.534882] bus_add_driver+0x1d0/0x290
[ 4.534888] driver_register+0x60/0xe0
[ 4.534893] __pci_register_driver+0x5d/0x60
[ 4.534897] i915_init+0x59/0x5c
[ 4.534901] ? mipi_dsi_bus_init+0x17/0x17
[ 4.534904] do_one_initcall+0x42/0x180
[ 4.534910] kernel_init_freeable+0x17c/0x202
[ 4.534914] ? set_debug_rodata+0x17/0x17
[ 4.534917] ? rest_init+0x90/0x90
[ 4.534922] kernel_init+0xe/0x110
[ 4.534925] ret_from_fork+0x25/0x30
[ 4.535386] i915 0000:00:02.0: [drm:i915_driver_load] Device initialization failed (-22)
[ 4.535390] i915 0000:00:02.0: Please file a bug at https://bugs.freedesktop.org/enter_bug.cgi?product=DRI against DRM/Intel providing the dmesg log by booting with drm.debug=0xf
[ 4.535450] i915: probe of 0000:00:02.0 failed with error -22

On Fri, Apr 28, 2017 at 7:36 AM, Oscar Mateo <oscar.mateo@intel.com> wrote:
> This batchbuffer is over 4096 bytes, so we need to increase the size of the
> array (and the KMD has to be modified to deal with more than one page).
>
> Notice that there are two workarounds embedded here, both applicable to all CNL
> steppings.
>
> v2: WaPSRandomCSNotDone is not A0 only (as per the latest BSpec), so update
> the comment in the code and in the commit message.
> > Cc: Mika Kuoppala <mika.kuoppala@intel.com> > Cc: Ben Widawsky <ben@bwidawsk.net> > Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> > --- > lib/gen10_render.h | 63 +++ > tools/null_state_gen/Makefile.am | 3 +- > tools/null_state_gen/intel_batchbuffer.h | 2 +- > tools/null_state_gen/intel_null_state_gen.c | 5 +- > tools/null_state_gen/intel_renderstate.h | 1 + > tools/null_state_gen/intel_renderstate_gen10.c | 538 +++++++++++++++++++++++++ > 6 files changed, 609 insertions(+), 3 deletions(-) > create mode 100644 lib/gen10_render.h > create mode 100644 tools/null_state_gen/intel_renderstate_gen10.c > > diff --git a/lib/gen10_render.h b/lib/gen10_render.h > new file mode 100644 > index 0000000..f4a7dff > --- /dev/null > +++ b/lib/gen10_render.h > @@ -0,0 +1,63 @@ > +#ifndef GEN10_RENDER_H > +#define GEN10_RENDER_H > + > +#include "gen9_render.h" > + > +#define GEN7_MI_RS_CONTROL (0x6 << 23) > +# define GEN7_MI_RS_CONTROL_ENABLE (1 << 0) > + > +#define GEN10_3DSTATE_GATHER_POOL_ALLOC GEN6_3D(3, 1, 0x1a) > +# define GEN10_3DSTATE_GATHER_POOL_ENABLE (1 << 11) > + > +#define GEN10_3DSTATE_GATHER_CONSTANT_VS GEN6_3D(3, 0, 0x34) > +#define GEN10_3DSTATE_GATHER_CONSTANT_HS GEN6_3D(3, 0, 0x36) > +#define GEN10_3DSTATE_GATHER_CONSTANT_DS GEN6_3D(3, 0, 0x37) > +#define GEN10_3DSTATE_GATHER_CONSTANT_GS GEN6_3D(3, 0, 0x35) > +#define GEN10_3DSTATE_GATHER_CONSTANT_PS GEN6_3D(3, 0, 0x38) > + > +#define GEN10_3DSTATE_WM_DEPTH_STENCIL GEN6_3D(3, 0, 0x4e) > +#define GEN10_3DSTATE_WM_CHROMAKEY GEN6_3D(3, 0, 0x4c) > + > +#define GEN8_REG_L3_CACHE_CONFIG 0x7034 > + > +/* > + * Programming for L3 cache allocations can be made per bank. Based on the > + * programmed value HW will apply same allocations on other available banks. > + * Total L3 Cache size per bank = 256 KB. > + * {SLM, URB, DC, RO(I/S, C, T), L3 Client Pool} > + * { 0, 96, 32, 128, 0 } > + */ > +#define GEN10_L3_CACHE_CONFIG_VALUE 0x00420060 > + > +#define URB_ALIGN(val, align) ((val % align) ? 
(val - (val % align)) : val) > + > +#define GEN10_VS_MIN_NUM_OF_URB_ENTRIES 64 > +#define GEN10_VS_MAX_NUM_OF_URB_ENTRIES 2752 > + > +#define GEN10_KB_PER_URB_INDEX 8 > +#define GEN10_L3_URB_SIZE_PER_BANK_IN_KB 96 > + > +#define GEN10_URB_RESERVED_SIZE_KB 32 > +#define GEN10_URB_RESERVED_END_SIZE_KB 8 > + > +#define GEN10_VS_NUM_BITS_PER_URB_UNIT 512 > +#define GEN10_VS_NUM_OF_URB_UNITS 1 // zero based > +#define GEN10_VS_URB_ENTRY_SIZE_IN_BITS (GEN10_VS_NUM_BITS_PER_URB_UNIT * \ > + (GEN10_VS_NUM_OF_URB_UNITS + 1)) > + > +#define GEN10_VS_URB_START_INDEX (GEN10_URB_RESERVED_SIZE_KB / GEN10_KB_PER_URB_INDEX) > + > +#define GEN10_URB_SIZE_PER_SLICE_KB(l3_bank_count, slice_count) \ > + URB_ALIGN((uint32_t)(GEN10_L3_URB_SIZE_PER_BANK_IN_KB * l3_bank_count / slice_count), GEN10_KB_PER_URB_INDEX) > + > +#define GEN10_VS_URB_SIZE_PER_SLICE_KB(total_urb_size_per_slice) \ > + (total_urb_size_per_slice - GEN10_URB_RESERVED_SIZE_KB - GEN10_URB_RESERVED_END_SIZE_KB) > + > +#define GEN10_VS_NUM_URB_ENTRIES_PER_SLICE(total_urb_size_per_slice) \ > + ((GEN10_VS_URB_SIZE_PER_SLICE_KB(total_urb_size_per_slice) * \ > + 1024 * 8) / GEN10_VS_URB_ENTRY_SIZE_IN_BITS) > + > +#define GEN10_VS_END_URB_INDEX(urb_size_per_slice) \ > + ((urb_size_per_slice - GEN10_URB_RESERVED_END_SIZE_KB) / GEN10_KB_PER_URB_INDEX) > + > +#endif > diff --git a/tools/null_state_gen/Makefile.am b/tools/null_state_gen/Makefile.am > index 24884a7..2f90990 100644 > --- a/tools/null_state_gen/Makefile.am > +++ b/tools/null_state_gen/Makefile.am > @@ -12,9 +12,10 @@ intel_null_state_gen_SOURCES = \ > intel_renderstate_gen7.c \ > intel_renderstate_gen8.c \ > intel_renderstate_gen9.c \ > + intel_renderstate_gen10.c \ > intel_null_state_gen.c > > -gens := 6 7 8 9 > +gens := 6 7 8 9 10 > > h = /tmp/intel_renderstate_gen$$gen.c > states: intel_null_state_gen > diff --git a/tools/null_state_gen/intel_batchbuffer.h b/tools/null_state_gen/intel_batchbuffer.h > index 771d1c8..e40e01b 100644 > --- 
a/tools/null_state_gen/intel_batchbuffer.h > +++ b/tools/null_state_gen/intel_batchbuffer.h > @@ -34,7 +34,7 @@ > #include <stdint.h> > > #define MAX_RELOCS 64 > -#define MAX_ITEMS 1024 > +#define MAX_ITEMS 2048 > #define MAX_STRLEN 256 > > #define ALIGN(x, y) (((x) + (y)-1) & ~((y)-1)) > diff --git a/tools/null_state_gen/intel_null_state_gen.c b/tools/null_state_gen/intel_null_state_gen.c > index 06eb954..4f12f5f 100644 > --- a/tools/null_state_gen/intel_null_state_gen.c > +++ b/tools/null_state_gen/intel_null_state_gen.c > @@ -41,7 +41,7 @@ static int debug = 0; > static void print_usage(char *s) > { > fprintf(stderr, "%s: <gen>\n" > - " gen: gen to generate for (6,7,8,9)\n", > + " gen: gen to generate for (6,7,8,9,10)\n", > s); > } > > @@ -173,6 +173,9 @@ static int do_generate(int gen) > case 9: > null_state_gen = gen9_setup_null_render_state; > break; > + case 10: > + null_state_gen = gen10_setup_null_render_state; > + break; > } > > if (null_state_gen == NULL) { > diff --git a/tools/null_state_gen/intel_renderstate.h b/tools/null_state_gen/intel_renderstate.h > index b27b434..b3c8c2b 100644 > --- a/tools/null_state_gen/intel_renderstate.h > +++ b/tools/null_state_gen/intel_renderstate.h > @@ -30,5 +30,6 @@ void gen6_setup_null_render_state(struct intel_batchbuffer *batch); > void gen7_setup_null_render_state(struct intel_batchbuffer *batch); > void gen8_setup_null_render_state(struct intel_batchbuffer *batch); > void gen9_setup_null_render_state(struct intel_batchbuffer *batch); > +void gen10_setup_null_render_state(struct intel_batchbuffer *batch); > > #endif /* __INTEL_RENDERSTATE_H__ */ > diff --git a/tools/null_state_gen/intel_renderstate_gen10.c b/tools/null_state_gen/intel_renderstate_gen10.c > new file mode 100644 > index 0000000..f5678c3 > --- /dev/null > +++ b/tools/null_state_gen/intel_renderstate_gen10.c > @@ -0,0 +1,538 @@ > +/* > + * Copyright © 2014 Intel Corporation > + * > + * Permission is hereby granted, free of charge, to any person 
obtaining a > + * copy of this software and associated documentation files (the "Software"), > + * to deal in the Software without restriction, including without limitation > + * the rights to use, copy, modify, merge, publish, distribute, sublicense, > + * and/or sell copies of the Software, and to permit persons to whom the > + * Software is furnished to do so, subject to the following conditions: > + * > + * The above copyright notice and this permission notice (including the next > + * paragraph) shall be included in all copies or substantial portions of the > + * Software. > + * > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER > + * DEALINGS IN THE SOFTWARE. 
> + * > + * Authors: > + * Oscar Mateo <oscar.mateo@intel.com> > + */ > + > +#include "intel_renderstate.h" > +#include <lib/gen10_render.h> > +#include <lib/intel_reg.h> > + > +static void gen8_emit_wm(struct intel_batchbuffer *batch) > +{ > + OUT_BATCH(GEN6_3DSTATE_WM | (2 - 2)); > + OUT_BATCH(GEN7_WM_LEGACY_DIAMOND_LINE_RASTERIZATION); > +} > + > +static void gen8_emit_ps(struct intel_batchbuffer *batch) > +{ > + OUT_BATCH(GEN7_3DSTATE_PS | (12 - 2)); > + OUT_BATCH(0); > + OUT_BATCH(0); /* kernel hi */ > + OUT_BATCH(GEN7_PS_SPF_MODE); > + OUT_BATCH(0); /* scratch space stuff */ > + OUT_BATCH(0); /* scratch hi */ > + OUT_BATCH(0); > + OUT_BATCH(0); > + OUT_BATCH(0); // kernel 1 > + OUT_BATCH(0); /* kernel 1 hi */ > + OUT_BATCH(0); // kernel 2 > + OUT_BATCH(0); /* kernel 2 hi */ > +} > + > +static void gen8_emit_sf(struct intel_batchbuffer *batch) > +{ > + OUT_BATCH(GEN6_3DSTATE_SF | (4 - 2)); > + OUT_BATCH(0); > + OUT_BATCH(0); > + OUT_BATCH(1 << GEN6_3DSTATE_SF_TRIFAN_PROVOKE_SHIFT | > + 1 << GEN6_3DSTATE_SF_VERTEX_SUB_PIXEL_PRECISION_SHIFT | > + GEN7_SF_POINT_WIDTH_FROM_SOURCE | > + 8); > +} > + > +static void gen8_emit_vs(struct intel_batchbuffer *batch) > +{ > + OUT_BATCH(GEN6_3DSTATE_VS | (9 - 2)); > + OUT_BATCH(0); > + OUT_BATCH(0); > + OUT_BATCH(GEN7_VS_FLOATING_POINT_MODE_ALTERNATE); > + OUT_BATCH(0); > + OUT_BATCH(0); > + OUT_BATCH(0); > + OUT_BATCH(0); > + OUT_BATCH(0); > +} > + > +static void gen8_emit_hs(struct intel_batchbuffer *batch) > +{ > + OUT_BATCH(GEN7_3DSTATE_HS | (9 - 2)); > + OUT_BATCH(0); > + OUT_BATCH(0); > + OUT_BATCH(0); > + OUT_BATCH(0); > + OUT_BATCH(0); > + OUT_BATCH(0); > + OUT_BATCH(1 << GEN7_SBE_URB_ENTRY_READ_LENGTH_SHIFT); > + OUT_BATCH(0); > +} > + > +static void gen8_emit_raster(struct intel_batchbuffer *batch) > +{ > + OUT_BATCH(GEN8_3DSTATE_RASTER | (5 - 2)); > + OUT_BATCH(GEN8_RASTER_CULL_NONE | GEN8_RASTER_FRONT_WINDING_CCW); > + OUT_BATCH(0.0); > + OUT_BATCH(0.0); > + OUT_BATCH(0.0); > +} > + > +static void 
gen10_emit_urb(struct intel_batchbuffer *batch) > +{ > + /* Smallest SKU: 3x8*/ > + int l3_bank_count = 3; > + int slice_count = 1; > + int urb_size_per_slice = GEN10_URB_SIZE_PER_SLICE_KB(l3_bank_count, slice_count); > + int other_urb_start_addr = GEN10_VS_END_URB_INDEX(urb_size_per_slice); > + const int vs_urb_start_addr = GEN10_VS_URB_START_INDEX; > + const int vs_urb_alloc_size = GEN10_VS_NUM_OF_URB_UNITS; > + int vs_urb_entries = GEN10_VS_NUM_URB_ENTRIES_PER_SLICE(urb_size_per_slice); > + > + if (vs_urb_entries < GEN10_VS_MIN_NUM_OF_URB_ENTRIES) > + vs_urb_entries = GEN10_VS_MIN_NUM_OF_URB_ENTRIES; > + if (vs_urb_entries > GEN10_VS_MAX_NUM_OF_URB_ENTRIES) > + vs_urb_entries = GEN10_VS_MAX_NUM_OF_URB_ENTRIES; > + > + OUT_BATCH(GEN7_3DSTATE_URB_VS); > + OUT_BATCH(vs_urb_entries | > + (vs_urb_alloc_size << 16) | > + (vs_urb_start_addr << 25)); > + > + OUT_BATCH(GEN7_3DSTATE_URB_HS); > + OUT_BATCH(other_urb_start_addr << 25); > + > + OUT_BATCH(GEN7_3DSTATE_URB_DS); > + OUT_BATCH(other_urb_start_addr << 25); > + > + OUT_BATCH(GEN7_3DSTATE_URB_GS); > + OUT_BATCH(other_urb_start_addr << 25); > +} > + > +static void gen8_emit_vf_topology(struct intel_batchbuffer *batch) > +{ > + OUT_BATCH(GEN8_3DSTATE_VF_TOPOLOGY); > + OUT_BATCH(_3DPRIM_TRILIST); > +} > + > +static void gen8_emit_so_decl_list(struct intel_batchbuffer *batch) > +{ > + const int num_decls = 128; > + int i; > + > + OUT_BATCH(GEN8_3DSTATE_SO_DECL_LIST | > + (((2 * num_decls) + 3) - 2) /* DWORD count - 2 */); > + OUT_BATCH(0); > + OUT_BATCH(num_decls); > + > + for (i = 0; i < num_decls; i++) { > + OUT_BATCH(0); > + OUT_BATCH(0); > + } > +} > + > +static void gen8_emit_so_buffer(struct intel_batchbuffer *batch, const int index) > +{ > + OUT_BATCH(GEN8_3DSTATE_SO_BUFFER | (8 - 2)); > + OUT_BATCH(index << 29); > + OUT_BATCH(0); > + OUT_BATCH(0); > + OUT_BATCH(0); > + OUT_BATCH(0); > + OUT_BATCH(0); > + OUT_BATCH(0); > +} > + > +static void gen8_emit_chroma_key(struct intel_batchbuffer *batch, const int index) 
> +{ > + OUT_BATCH(GEN6_3DSTATE_CHROMA_KEY | (4 - 2)); > + OUT_BATCH(index << 30); > + OUT_BATCH(0); > + OUT_BATCH(0); > +} > + > +static void gen8_emit_vertex_buffers(struct intel_batchbuffer *batch) > +{ > + const int buffers = 33; > + int i; > + > + OUT_BATCH(GEN6_3DSTATE_VERTEX_BUFFERS | > + (((4 * buffers) + 1)- 2) /* DWORD count - 2 */); > + > + for (i = 0; i < buffers; i++) { > + OUT_BATCH(i << VB0_BUFFER_INDEX_SHIFT | > + GEN7_VB0_BUFFER_ADDR_MOD_EN); > + OUT_BATCH(0); /* Address */ > + OUT_BATCH(0); > + OUT_BATCH(0); > + } > +} > + > +static void gen8_emit_vertex_elements(struct intel_batchbuffer *batch) > +{ > + const int elements = 34; > + int i; > + > + OUT_BATCH(GEN6_3DSTATE_VERTEX_ELEMENTS | > + (((2 * elements) + 1) - 2) /* DWORD count - 2 */); > + > + /* Element 0 */ > + OUT_BATCH(VE0_VALID); > + OUT_BATCH( > + GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_0_SHIFT | > + GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_1_SHIFT | > + GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_2_SHIFT | > + GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_3_SHIFT); > + /* Elements 1 -> 33 */ > + for (i = 1; i < elements; i++) { > + OUT_BATCH(0); > + OUT_BATCH(0); > + } > +} > + > +static void gen8_emit_cc_state_pointers(struct intel_batchbuffer *batch) > +{ > + union { > + float fval; > + uint32_t uval; > + } u; > + > + unsigned offset; > + > + u.fval = 1.0f; > + > + offset = intel_batch_state_offset(batch, 64); > + OUT_STATE(0); > + OUT_STATE(0); /* Alpha reference value */ > + OUT_STATE(u.uval); /* Blend constant color RED */ > + OUT_STATE(u.uval); /* Blend constant color BLUE */ > + OUT_STATE(u.uval); /* Blend constant color GREEN */ > + OUT_STATE(u.uval); /* Blend constant color ALPHA */ > + > + OUT_BATCH(GEN6_3DSTATE_CC_STATE_POINTERS); > + OUT_BATCH_STATE_OFFSET(offset | 1); > +} > + > +static void gen8_emit_blend_state_pointers(struct intel_batchbuffer *batch) > +{ > + unsigned offset; > + int i; > + > + offset = intel_batch_state_offset(batch, 64); > + > + for (i = 0; i < 
17; i++) > + OUT_STATE(0); > + > + OUT_BATCH(GEN7_3DSTATE_BLEND_STATE_POINTERS | (2 - 2)); > + OUT_BATCH_STATE_OFFSET(offset | 1); > +} > + > +static void gen8_emit_ps_extra(struct intel_batchbuffer *batch) > +{ > + OUT_BATCH(GEN8_3DSTATE_PS_EXTRA | (2 - 2)); > + OUT_BATCH(GEN8_PSX_PIXEL_SHADER_VALID | > + GEN8_PSX_ATTRIBUTE_ENABLE); > + > +} > + > +static void gen8_emit_ps_blend(struct intel_batchbuffer *batch) > +{ > + OUT_BATCH(GEN8_3DSTATE_PS_BLEND | (2 - 2)); > + OUT_BATCH(GEN8_PS_BLEND_HAS_WRITEABLE_RT); > +} > + > +static void gen8_emit_viewport_state_pointers_cc(struct intel_batchbuffer *batch) > +{ > + unsigned offset; > + > + offset = intel_batch_state_offset(batch, 32); > + > + OUT_STATE((uint32_t)0.0f); /* Minimum depth */ > + OUT_STATE((uint32_t)0.0f); /* Maximum depth */ > + > + OUT_BATCH(GEN7_3DSTATE_VIEWPORT_STATE_POINTERS_CC | (2 - 2)); > + OUT_BATCH_STATE_OFFSET(offset); > +} > + > +static void gen8_emit_viewport_state_pointers_sf_clip(struct intel_batchbuffer *batch) > +{ > + unsigned offset; > + int i; > + > + offset = intel_batch_state_offset(batch, 64); > + > + for (i = 0; i < 16; i++) > + OUT_STATE(0); > + > + OUT_BATCH(GEN7_3DSTATE_VIEWPORT_STATE_POINTERS_SF_CLIP | (2 - 2)); > + OUT_BATCH_STATE_OFFSET(offset); > +} > + > +static void gen8_emit_primitive(struct intel_batchbuffer *batch) > +{ > + OUT_BATCH(GEN6_3DPRIMITIVE | (10-2)); > + OUT_BATCH(4); /* gen8+ ignore the topology type field */ > + OUT_BATCH(1); /* vertex count */ > + OUT_BATCH(0); > + OUT_BATCH(1); /* single instance */ > + OUT_BATCH(0); /* start instance location */ > + OUT_BATCH(0); /* index buffer offset, ignored */ > + OUT_BATCH(0); /* extended parameter 0 */ > + OUT_BATCH(0); /* extended parameter 1 */ > + OUT_BATCH(0); /* extended parameter 2 */ > +} > + > +static void gen9_emit_state_base_address(struct intel_batchbuffer *batch) { > + const unsigned offset = 0; > + OUT_BATCH(GEN6_STATE_BASE_ADDRESS | > + (22 - 2) /* DWORD count - 2 */); > + > + /* general state base 
address - requires BB address > + * added to state offset to be stored in this location > + */ > + OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY); > + OUT_BATCH(0); > + > + /* stateless data port */ > + OUT_BATCH(0); > + > + /* surface state base address - requires BB address > + * added to state offset to be stored in this location > + */ > + OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY); > + OUT_BATCH(0); > + > + /* dynamic state base address - requires BB address > + * added to state offset to be stored in this location > + */ > + OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY); > + OUT_BATCH(0); > + > + /* indirect state base address */ > + OUT_BATCH(BASE_ADDRESS_MODIFY); > + OUT_BATCH(0); > + > + /* instruction state base address - requires BB address > + * added to state offset to be stored in this location > + */ > + OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY); > + OUT_BATCH(0); > + > + /* general state buffer size */ > + OUT_BATCH(GEN8_STATE_SIZE_PAGES(1) | BUFFER_SIZE_MODIFY); > + /* dynamic state buffer size */ > + OUT_BATCH(GEN8_STATE_SIZE_PAGES(1) | BUFFER_SIZE_MODIFY); > + /* indirect object buffer size */ > + OUT_BATCH(0x0 | BUFFER_SIZE_MODIFY); > + /* intruction buffer size */ > + OUT_BATCH(GEN8_STATE_SIZE_PAGES(1) | BUFFER_SIZE_MODIFY); > + > + /* bindless surface state base address */ > + OUT_BATCH(BASE_ADDRESS_MODIFY); > + OUT_BATCH(0); > + /* bindless surface state size */ > + OUT_BATCH(0); > + > + /* bindless sampler state base address */ > + OUT_BATCH(BASE_ADDRESS_MODIFY); > + OUT_BATCH(0); > + /* bindless sampler state size */ > + OUT_BATCH(0); > +} > + > +/* > + * Generate the batch buffer commands needed to initialize the 3D engine > + * to its "golden state". 
> + */ > +void gen10_setup_null_render_state(struct intel_batchbuffer *batch) > +{ > + int i; > + > + /* WaRsGatherPoolEnable: cnl */ > + OUT_BATCH(GEN7_MI_RS_CONTROL); > + > +#define GEN8_PIPE_CONTROL_GLOBAL_GTT (1 << 24) > + /* PIPE_CONTROL */ > + OUT_BATCH(GEN6_PIPE_CONTROL | > + (6 - 2)); /* DWORD count - 2 */ > + OUT_BATCH(GEN8_PIPE_CONTROL_GLOBAL_GTT); > + OUT_BATCH(0); > + OUT_BATCH(0); > + OUT_BATCH(0); > + OUT_BATCH(0); > + > + /* PIPELINE_SELECT */ > + OUT_BATCH(GEN9_PIPELINE_SELECT | PIPELINE_SELECT_3D); > + > + OUT_BATCH(MI_LOAD_REGISTER_IMM); > + OUT_BATCH(GEN8_REG_L3_CACHE_CONFIG); > + OUT_BATCH(GEN10_L3_CACHE_CONFIG_VALUE); > + > + gen8_emit_wm(batch); > + gen8_emit_ps(batch); > + gen8_emit_sf(batch); > + > + OUT_CMD(GEN7_3DSTATE_SBE, 6); /* Check w/ Gen8 code */ > + OUT_CMD(GEN8_3DSTATE_SBE_SWIZ, 11); > + > + gen8_emit_vs(batch); > + gen8_emit_hs(batch); > + > + OUT_CMD(GEN7_3DSTATE_GS, 10); > + OUT_CMD(GEN7_3DSTATE_STREAMOUT, 5); > + OUT_CMD(GEN7_3DSTATE_DS, 11); /* Check w/ Gen8 code */ > + OUT_CMD(GEN6_3DSTATE_CLIP, 4); > + OUT_CMD(GEN7_3DSTATE_TE, 4); > + OUT_CMD(GEN8_3DSTATE_VF, 2); > + OUT_CMD(GEN8_3DSTATE_WM_HZ_OP, 5); > + > + /* URB States */ > + gen10_emit_urb(batch); > + > + OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_VS, 130); > + OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_HS, 130); > + OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_DS, 130); > + OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_GS, 130); > + OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_PS, 130); > + > + OUT_CMD(GEN8_3DSTATE_BIND_TABLE_POOL_ALLOC, 4); > + OUT_CMD(GEN8_3DSTATE_GATHER_POOL_ALLOC, 4); > + OUT_CMD(GEN8_3DSTATE_DX9_CONSTANT_BUFFER_POOL_ALLOC, 4); > + > + /* Push Constants */ > + OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_VS, 2); > + OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_HS, 2); > + OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_DS, 2); > + OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_GS, 2); > + OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_PS, 2); > + > + /* Constants */ > + OUT_CMD(GEN6_3DSTATE_CONSTANT_VS, 
11); > + OUT_CMD(GEN7_3DSTATE_CONSTANT_HS, 11); > + OUT_CMD(GEN7_3DSTATE_CONSTANT_DS, 11); > + OUT_CMD(GEN7_3DSTATE_CONSTANT_GS, 11); > + OUT_CMD(GEN7_3DSTATE_CONSTANT_PS, 11); > + > + OUT_CMD(GEN8_3DSTATE_VF_INSTANCING, 3); > + OUT_CMD(GEN8_3DSTATE_VF_SGVS, 2); > + gen8_emit_vf_topology(batch); > + > + /* Streamer out declaration list */ > + gen8_emit_so_decl_list(batch); > + > + /* Streamer out buffers */ > + for (i = 0; i < 4; i++) { > + gen8_emit_so_buffer(batch, i); > + } > + > + /* State base addresses */ > + gen9_emit_state_base_address(batch); > + > + OUT_CMD(GEN6_STATE_SIP, 3); > + OUT_CMD(GEN6_3DSTATE_DRAWING_RECTANGLE, 4); > + OUT_CMD(GEN7_3DSTATE_DEPTH_BUFFER, 8); > + > + /* Chroma key */ > + for (i = 0; i < 4; i++) { > + gen8_emit_chroma_key(batch, i); > + } > + > + OUT_CMD(GEN6_3DSTATE_LINE_STIPPLE, 3); > + OUT_CMD(GEN6_3DSTATE_AA_LINE_PARAMS, 3); > + OUT_CMD(GEN7_3DSTATE_STENCIL_BUFFER, 5); > + OUT_CMD(GEN7_3DSTATE_HIER_DEPTH_BUFFER, 5); > + OUT_CMD(GEN7_3DSTATE_CLEAR_PARAMS, 3); > + OUT_CMD(GEN6_3DSTATE_MONOFILTER_SIZE, 2); > + > + /* WaPSRandomCSNotDone:cnl */ > +#define GEN8_PIPE_CONTROL_STALL_ENABLE (1 << 20) > + OUT_BATCH(GEN6_PIPE_CONTROL | (6 - 2)); > + OUT_BATCH(GEN8_PIPE_CONTROL_STALL_ENABLE); > + OUT_BATCH(0); > + OUT_BATCH(0); > + OUT_BATCH(0); > + OUT_BATCH(0); > + > + OUT_CMD(GEN8_3DSTATE_MULTISAMPLE, 2); > + OUT_CMD(GEN8_3DSTATE_POLY_STIPPLE_OFFSET, 2); > + OUT_CMD(GEN8_3DSTATE_POLY_STIPPLE_PATTERN, 1 + 32); > + OUT_CMD(GEN8_3DSTATE_SAMPLER_PALETTE_LOAD0, 1 + 16); > + OUT_CMD(GEN8_3DSTATE_SAMPLER_PALETTE_LOAD1, 1 + 16); > + OUT_CMD(GEN6_3DSTATE_INDEX_BUFFER, 5); > + > + /* Vertex buffers */ > + gen8_emit_vertex_buffers(batch); > + gen8_emit_vertex_elements(batch); > + > + OUT_BATCH(GEN6_3DSTATE_VF_STATISTICS | 1 /* Enable */); > + > + /* 3D state binding table pointers */ > + OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_VS, 2); > + OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_HS, 2); > + OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_DS, 2); 
> + OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_GS, 2); > + OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_PS, 2); > + > + gen8_emit_cc_state_pointers(batch); > + gen8_emit_blend_state_pointers(batch); > + gen8_emit_ps_extra(batch); > + gen8_emit_ps_blend(batch); > + > + /* 3D state sampler state pointers */ > + OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_VS, 2); > + OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_HS, 2); > + OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_DS, 2); > + OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_GS, 2); > + OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_PS, 2); > + > + OUT_CMD(GEN6_3DSTATE_SCISSOR_STATE_POINTERS, 2); > + > + gen8_emit_viewport_state_pointers_cc(batch); > + gen8_emit_viewport_state_pointers_sf_clip(batch); > + > + /* WaPSRandomCSNotDone:cnl */ > +#define GEN8_PIPE_CONTROL_STALL_ENABLE (1 << 20) > + OUT_BATCH(GEN6_PIPE_CONTROL | (6 - 2)); > + OUT_BATCH(GEN8_PIPE_CONTROL_STALL_ENABLE); > + OUT_BATCH(0); > + OUT_BATCH(0); > + OUT_BATCH(0); > + OUT_BATCH(0); > + > + gen8_emit_raster(batch); > + > + OUT_CMD(GEN10_3DSTATE_WM_DEPTH_STENCIL, 4); > + OUT_CMD(GEN10_3DSTATE_WM_CHROMAKEY, 2); > + > + /* Launch 3D operation */ > + gen8_emit_primitive(batch); > + > + /* WaRsGatherPoolEnable: cnl */ > + OUT_BATCH(GEN7_MI_RS_CONTROL | GEN7_MI_RS_CONTROL_ENABLE); > + OUT_BATCH(GEN10_3DSTATE_GATHER_POOL_ALLOC | (4 - 2)); > + OUT_BATCH(GEN10_3DSTATE_GATHER_POOL_ENABLE); > + OUT_BATCH(0); > + OUT_BATCH(0xfffff << 12); > + OUT_BATCH(GEN7_MI_RS_CONTROL); > + OUT_CMD(GEN10_3DSTATE_GATHER_POOL_ALLOC, 4); > + > + OUT_BATCH(MI_BATCH_BUFFER_END); > +} > -- > 1.9.1 > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
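As a reading aid for the URB programming in the quoted patch: the macros in lib/gen10_render.h and the clamping in gen10_emit_urb() reduce to a few integer steps. The sketch below replays that arithmetic for the "smallest SKU" values the patch hardcodes (3 L3 banks, 1 slice). The constants are copied from the quoted header; the names compute_urb_layout() and struct urb_layout are hypothetical and exist only for this illustration, not in the patch or in IGT.

```c
#include <assert.h>
#include <stdint.h>

/* Constants copied from the quoted lib/gen10_render.h. */
#define KB_PER_URB_INDEX          8
#define L3_URB_SIZE_PER_BANK_KB   96
#define URB_RESERVED_SIZE_KB      32
#define URB_RESERVED_END_SIZE_KB  8
#define VS_BITS_PER_URB_UNIT      512
#define VS_NUM_OF_URB_UNITS       1    /* zero based: two units per entry */
#define VS_MIN_URB_ENTRIES        64
#define VS_MAX_URB_ENTRIES        2752

/* Hypothetical container for the intermediate values (illustration only). */
struct urb_layout {
	uint32_t size_per_slice_kb;  /* URB size per slice, rounded down to 8 KB */
	uint32_t vs_entries;         /* clamped number of VS URB entries */
	uint32_t vs_start_index;     /* VS start, in 8 KB units */
	uint32_t other_start_index;  /* start used for HS/DS/GS, in 8 KB units */
	uint32_t vs_dword;           /* second DWORD emitted after 3DSTATE_URB_VS */
};

/* Same effect as the patch's URB_ALIGN(): round val down to a multiple of align. */
static uint32_t align_down(uint32_t val, uint32_t align)
{
	return val - (val % align);
}

static struct urb_layout compute_urb_layout(uint32_t l3_banks, uint32_t slices)
{
	struct urb_layout u;
	uint32_t vs_kb, entry_bits;

	assert(slices > 0);

	u.size_per_slice_kb = align_down(L3_URB_SIZE_PER_BANK_KB * l3_banks / slices,
					 KB_PER_URB_INDEX);

	/* VS gets what is left after the reserved head and tail regions. */
	vs_kb = u.size_per_slice_kb - URB_RESERVED_SIZE_KB - URB_RESERVED_END_SIZE_KB;
	entry_bits = VS_BITS_PER_URB_UNIT * (VS_NUM_OF_URB_UNITS + 1);
	u.vs_entries = vs_kb * 1024 * 8 / entry_bits;

	/* Clamp exactly as gen10_emit_urb() does. */
	if (u.vs_entries < VS_MIN_URB_ENTRIES)
		u.vs_entries = VS_MIN_URB_ENTRIES;
	if (u.vs_entries > VS_MAX_URB_ENTRIES)
		u.vs_entries = VS_MAX_URB_ENTRIES;

	u.vs_start_index = URB_RESERVED_SIZE_KB / KB_PER_URB_INDEX;
	u.other_start_index = (u.size_per_slice_kb - URB_RESERVED_END_SIZE_KB) /
			      KB_PER_URB_INDEX;

	/* Bit positions (16 and 25) are the shifts used in the quoted patch. */
	u.vs_dword = u.vs_entries |
		     ((uint32_t)VS_NUM_OF_URB_UNITS << 16) |
		     (u.vs_start_index << 25);
	return u;
}
```

Under these assumptions, compute_urb_layout(3, 1) gives 288 KB of URB per slice, 1984 VS entries (within the 64..2752 clamp), a VS start index of 4, and an HS/DS/GS start index of 35. Whether those values are right for real CNL hardware is a separate question that this arithmetic does not answer.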
On 07/05/2017 05:50 PM, Rodrigo Vivi wrote:
> Hi Oscar,

Hey!

> I had missed this patch here, but noticed now that I was refreshing
> and testing more cnl tests before re-submitting them.
>
> First of all I believe we need to remove the A0 w/a. I don't believe
> we will ever see one. So I'm removing all A0 exclusive W/a from the
> patches as well.

Be careful: I think both WAs in the patch are for all steppings (one was incorrectly marked as A0 only in v1 of this patch).

> I also gave a try here on your null state. However if I use the golden
> state generated by this version I get a blank screen because driver
> load fails with some strange faults:

Good. I don't have a CNL so it was only compile-tested.

> any idea?

Did you also include the i915 patch to allow golden BBs over one page in size? I sent it separately as "drm/i915: Allow null render state batchbuffers bigger than one page".

BTW: this patch was given a cold shoulder in the mailing list, since I could not re-justify why null state was needed in the first place (since UMD needs to configure the 3D pipeline first thing anyway). I am still trying to get a better explanation from HW people.

-- Oscar

> [ 4.115243] Memory manager not clean during takedown.
> [ 4.120389] ------------[ cut here ]------------ > [ 4.125068] WARNING: CPU: 0 PID: 1 at drivers/gpu/drm/drm_mm.c:892 > drm_mm_takedown+0x25/0x30 > [ 4.133574] Modules linked in: > [ 4.136707] CPU: 0 PID: 1 Comm: swapper/0 Not tainted > 4.12.0-eywa-46011-g9a19faf #360 > [ 4.144650] Hardware name: Intel Corporation Cannonlake Client > platform CNL - U0/CannonLake U DDR4 SODIMM RVP, BIOS > CNLSFWR1.R00.X075.D01.1703021113 03/02 > [ 4.158500] task: ffff880264ab8000 task.stack: ffffc90000038000 > [ 4.164506] RIP: 0010:drm_mm_takedown+0x25/0x30 > [ 4.169104] RSP: 0000:ffffc9000003bc28 EFLAGS: 00010292 > [ 4.174409] RAX: 0000000000000029 RBX: ffff880260a54170 RCX: > ffffffff82468740 > [ 4.181654] RDX: 0000000000000001 RSI: 0000000000000082 RDI: > 00000000ffffffff > [ 4.188839] RBP: ffffc9000003bc28 R08: 00000000fffffffe R09: > 000000000000035a > [ 4.196028] R10: 0000000000000005 R11: 0000000000000000 R12: > ffff880260a50000 > [ 4.203215] R13: ffff880260a54348 R14: ffff880260a50070 R15: > ffff880262844a00 > [ 4.210402] FS: 0000000000000000(0000) GS:ffff88026dc00000(0000) > knlGS:0000000000000000 > [ 4.218541] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 4.224344] CR2: ffffc90000d58000 CR3: 000000000240b000 CR4: > 00000000007406f0 > [ 4.231529] DR0: 0000000000000000 DR1: 0000000000000000 DR2: > 0000000000000000 > [ 4.238716] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: > 0000000000000400 > [ 4.245900] PKRU: 00000000 > [ 4.248673] Call Trace: > [ 4.251193] i915_gem_cleanup_stolen+0x1f/0x30 > [ 4.255703] i915_ggtt_cleanup_hw+0xa4/0x170 > [ 4.260035] i915_driver_cleanup_hw+0x36/0x40 > [ 4.264455] i915_driver_load+0x6a0/0xe70 > [ 4.268535] ? _raw_spin_unlock_irqrestore+0x26/0x50 > [ 4.273560] i915_pci_probe+0x2c/0x50 > [ 4.277293] local_pci_probe+0x45/0xa0 > [ 4.281106] ? pci_match_device+0xe0/0x110 > [ 4.285265] pci_device_probe+0x135/0x150 > [ 4.289343] driver_probe_device+0x288/0x490 > [ 4.293676] __driver_attach+0xc9/0xf0 > [ 4.297490] ? 
driver_probe_device+0x490/0x490 > [ 4.301999] bus_for_each_dev+0x5d/0x90 > [ 4.305902] driver_attach+0x1e/0x20 > [ 4.309543] bus_add_driver+0x1d0/0x290 > [ 4.313442] driver_register+0x60/0xe0 > [ 4.317257] __pci_register_driver+0x5d/0x60 > [ 4.321652] i915_init+0x59/0x5c > [ 4.324944] ? mipi_dsi_bus_init+0x17/0x17 > [ 4.329103] do_one_initcall+0x42/0x180 > [ 4.333007] kernel_init_freeable+0x17c/0x202 > [ 4.337426] ? set_debug_rodata+0x17/0x17 > [ 4.341500] ? rest_init+0x90/0x90 > [ 4.344969] kernel_init+0xe/0x110 > [ 4.348438] ret_from_fork+0x25/0x30 > [ 4.352079] Code: 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 47 38 48 > 83 c7 38 48 39 c7 75 01 c3 55 48 c7 c7 70 ac 20 82 31 c0 48 89 e5 e8 > 6b 62 b7 ff <0f> ff 5d c3 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 > e5 41 > [ 4.371029] ---[ end trace 7d36c2dd72851315 ]--- > [ 4.381680] WARN_ON(dev_priv->mm.object_count) > [ 4.381698] ------------[ cut here ]------------ > [ 4.390921] WARNING: CPU: 0 PID: 1 at > drivers/gpu/drm/i915/i915_gem.c:4964 i915_gem_load_cleanup+0x10b/0x120 > [ 4.400797] Modules linked in: > [ 4.403927] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W > 4.12.0-eywa-46011-g9a19faf #360 > [ 4.413021] Hardware name: Intel Corporation Cannonlake Client > platform CNL - U0/CannonLake U DDR4 SODIMM RVP, BIOS > CNLSFWR1.R00.X075.D01.1703021113 03/02 > [ 4.426884] task: ffff880264ab8000 task.stack: ffffc90000038000 > [ 4.432865] RIP: 0010:i915_gem_load_cleanup+0x10b/0x120 > [ 4.438157] RSP: 0000:ffffc9000003bc58 EFLAGS: 00010292 > [ 4.443450] RAX: 0000000000000022 RBX: ffff880260a50000 RCX: > ffffffff82468740 > [ 4.450642] RDX: 0000000000000001 RSI: 0000000000000082 RDI: > 0000000000000202 > [ 4.457839] RBP: ffffc9000003bc68 R08: 0000000000000022 R09: > 0000000000000389 > [ 4.465029] R10: 0000000000000000 R11: 0000000000000001 R12: > ffff880260a54678 > [ 4.472227] R13: ffff88026446c000 R14: ffff88026446c000 R15: > ffff880262844a00 > [ 4.479420] FS: 0000000000000000(0000) GS:ffff88026dc00000(0000) > 
knlGS:0000000000000000 > [ 4.487564] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 4.493370] CR2: ffffc90000d58000 CR3: 000000000240b000 CR4: > 00000000007406f0 > [ 4.500569] DR0: 0000000000000000 DR1: 0000000000000000 DR2: > 0000000000000000 > [ 4.507763] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: > 0000000000000400 > [ 4.514959] PKRU: 00000000 > [ 4.517737] Call Trace: > [ 4.520265] i915_driver_cleanup_early+0x1a/0x50 > [ 4.524955] i915_driver_load+0x6b8/0xe70 > [ 4.529038] ? _raw_spin_unlock_irqrestore+0x26/0x50 > [ 4.534100] clocksource: Switched to clocksource tsc > [ 4.534105] i915_pci_probe+0x2c/0x50 > [ 4.534113] local_pci_probe+0x45/0xa0 > [ 4.534118] ? pci_match_device+0xe0/0x110 > [ 4.534124] pci_device_probe+0x135/0x150 > [ 4.534131] driver_probe_device+0x288/0x490 > [ 4.534137] __driver_attach+0xc9/0xf0 > [ 4.534142] ? driver_probe_device+0x490/0x490 > [ 4.534146] bus_for_each_dev+0x5d/0x90 > [ 4.534152] driver_attach+0x1e/0x20 > [ 4.534156] bus_add_driver+0x1d0/0x290 > [ 4.534162] driver_register+0x60/0xe0 > [ 4.534167] __pci_register_driver+0x5d/0x60 > [ 4.534173] i915_init+0x59/0x5c > [ 4.534177] ? mipi_dsi_bus_init+0x17/0x17 > [ 4.534181] do_one_initcall+0x42/0x180 > [ 4.534187] kernel_init_freeable+0x17c/0x202 > [ 4.534191] ? set_debug_rodata+0x17/0x17 > [ 4.534196] ? 
rest_init+0x90/0x90 > [ 4.534200] kernel_init+0xe/0x110 > [ 4.534204] ret_from_fork+0x25/0x30 > [ 4.534208] Code: 82 48 c7 c7 7a 98 1a 82 31 c0 e8 21 4f b1 ff 0f > ff e9 7a ff ff ff 48 c7 c6 88 33 21 82 48 c7 c7 7a 98 1a 82 31 c0 e8 > 05 4f b1 ff <0f> ff e9 33 ff ff ff 66 66 66 66 66 2e 0f 1f 84 00 00 00 > 00 00 > [ 4.534272] ---[ end trace 7d36c2dd72851316 ]--- > [ 4.534277] WARN_ON(!list_empty(&dev_priv->gt.timelines)) > [ 4.534293] ------------[ cut here ]------------ > [ 4.534298] WARNING: CPU: 0 PID: 1 at > drivers/gpu/drm/i915/i915_gem.c:4968 i915_gem_load_cleanup+0xef/0x120 > [ 4.534299] Modules linked in: > [ 4.534304] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W > 4.12.0-eywa-46011-g9a19faf #360 > [ 4.534306] Hardware name: Intel Corporation Cannonlake Client > platform CNL - U0/CannonLake U DDR4 SODIMM RVP, BIOS > CNLSFWR1.R00.X075.D01.1703021113 03/02 > [ 4.534308] task: ffff880264ab8000 task.stack: ffffc90000038000 > [ 4.534312] RIP: 0010:i915_gem_load_cleanup+0xef/0x120 > [ 4.534314] RSP: 0000:ffffc9000003bc58 EFLAGS: 00010292 > [ 4.534317] RAX: 000000000000002d RBX: ffff880260a50000 RCX: > 0000000000000000 > [ 4.534319] RDX: 0000000000000001 RSI: 0000000000000002 RDI: > 0000000000000296 > [ 4.534321] RBP: ffffc9000003bc68 R08: 000000000000002d R09: > 000000000000002d > [ 4.534322] R10: 0000000000000000 R11: ffff880260a4e000 R12: > ffff880260a50070 > [ 4.534324] R13: ffff88026446c000 R14: ffff88026446c000 R15: > ffff880262844a00 > [ 4.534327] FS: 0000000000000000(0000) GS:ffff88026dc00000(0000) > knlGS:0000000000000000 > [ 4.534329] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 4.534331] CR2: ffffc90000d58000 CR3: 000000000240b000 CR4: > 00000000007406f0 > [ 4.534334] DR0: 0000000000000000 DR1: 0000000000000000 DR2: > 0000000000000000 > [ 4.534335] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: > 0000000000000400 > [ 4.534337] PKRU: 00000000 > [ 4.534338] Call Trace: > [ 4.534344] i915_driver_cleanup_early+0x1a/0x50 > [ 4.534350] 
i915_driver_load+0x6b8/0xe70 > [ 4.534356] ? _raw_spin_unlock_irqrestore+0x26/0x50 > [ 4.534361] i915_pci_probe+0x2c/0x50 > [ 4.534366] local_pci_probe+0x45/0xa0 > [ 4.534371] ? pci_match_device+0xe0/0x110 > [ 4.534376] pci_device_probe+0x135/0x150 > [ 4.534382] driver_probe_device+0x288/0x490 > [ 4.534388] __driver_attach+0xc9/0xf0 > [ 4.534393] ? driver_probe_device+0x490/0x490 > [ 4.534398] bus_for_each_dev+0x5d/0x90 > [ 4.534403] driver_attach+0x1e/0x20 > [ 4.534408] bus_add_driver+0x1d0/0x290 > [ 4.534414] driver_register+0x60/0xe0 > [ 4.534419] __pci_register_driver+0x5d/0x60 > [ 4.534424] i915_init+0x59/0x5c > [ 4.534428] ? mipi_dsi_bus_init+0x17/0x17 > [ 4.534431] do_one_initcall+0x42/0x180 > [ 4.534437] kernel_init_freeable+0x17c/0x202 > [ 4.534440] ? set_debug_rodata+0x17/0x17 > [ 4.534444] ? rest_init+0x90/0x90 > [ 4.534448] kernel_init+0xe/0x110 > [ 4.534451] ret_from_fork+0x25/0x30 > [ 4.534455] Code: 82 48 c7 c7 7a 98 1a 82 31 c0 e8 3d 4f b1 ff 0f > ff e9 5d ff ff ff 48 c7 c6 b0 33 21 82 48 c7 c7 7a 98 1a 82 31 c0 e8 > 21 4f b1 ff <0f> ff e9 7a ff ff ff 48 c7 c6 88 33 21 82 48 c7 c7 7a 98 > 1a 82 > [ 4.534519] ---[ end trace 7d36c2dd72851317 ]--- > [ 4.534605] ============================================================================= > [ 4.534608] BUG drm_i915_gem_object (Tainted: G W ): > Objects remaining in drm_i915_gem_object on __kmem_cache_shutdown() > [ 4.534609] ----------------------------------------------------------------------------- > > [ 4.534611] Disabling lock debugging due to kernel taint > [ 4.534614] INFO: Slab 0xffffea0009820600 objects=19 used=2 > fp=0xffff88026081ba80 flags=0x200000000008100 > [ 4.534618] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G B W > 4.12.0-eywa-46011-g9a19faf #360 > [ 4.534620] Hardware name: Intel Corporation Cannonlake Client > platform CNL - U0/CannonLake U DDR4 SODIMM RVP, BIOS > CNLSFWR1.R00.X075.D01.1703021113 03/02 > [ 4.534621] Call Trace: > [ 4.534626] dump_stack+0x65/0x89 > [ 4.534633] 
slab_err+0xa1/0xb0
> [ 4.534640] ? __kmalloc+0x185/0x270
> [ 4.534645] ? kmem_cache_alloc_bulk+0x1f0/0x1f0
> [ 4.534650] ? __kmem_cache_shutdown+0x160/0x400
> [ 4.534655] __kmem_cache_shutdown+0x180/0x400
> [ 4.534663] shutdown_cache+0x18/0x1a0
> [ 4.534667] kmem_cache_destroy+0x1c1/0x1f0
> [ 4.534672] i915_gem_load_cleanup+0xb4/0x120
> [ 4.534677] i915_driver_cleanup_early+0x1a/0x50
> [ 4.534682] i915_driver_load+0x6b8/0xe70
> [ 4.534689] ? _raw_spin_unlock_irqrestore+0x26/0x50
> [ 4.534693] i915_pci_probe+0x2c/0x50
> [ 4.534698] local_pci_probe+0x45/0xa0
> [ 4.534703] ? pci_match_device+0xe0/0x110
> [ 4.534708] pci_device_probe+0x135/0x150
> [ 4.534714] driver_probe_device+0x288/0x490
> [ 4.534721] __driver_attach+0xc9/0xf0
> [ 4.534726] ? driver_probe_device+0x490/0x490
> [ 4.534730] bus_for_each_dev+0x5d/0x90
> [ 4.534736] driver_attach+0x1e/0x20
> [ 4.534741] bus_add_driver+0x1d0/0x290
> [ 4.534746] driver_register+0x60/0xe0
> [ 4.534751] __pci_register_driver+0x5d/0x60
> [ 4.534756] i915_init+0x59/0x5c
> [ 4.534760] ? mipi_dsi_bus_init+0x17/0x17
> [ 4.534763] do_one_initcall+0x42/0x180
> [ 4.534769] kernel_init_freeable+0x17c/0x202
> [ 4.534773] ? set_debug_rodata+0x17/0x17
> [ 4.534777] ?
rest_init+0x90/0x90 > [ 4.534781] kernel_init+0xe/0x110 > [ 4.534784] ret_from_fork+0x25/0x30 > [ 4.534791] INFO: Object 0xffff880260818340 @offset=832 > [ 4.534792] INFO: Object 0xffff880260818680 @offset=1664 > [ 4.534795] kmem_cache_destroy drm_i915_gem_object: Slab cache > still has objects > [ 4.534798] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G B W > 4.12.0-eywa-46011-g9a19faf #360 > [ 4.534800] Hardware name: Intel Corporation Cannonlake Client > platform CNL - U0/CannonLake U DDR4 SODIMM RVP, BIOS > CNLSFWR1.R00.X075.D01.1703021113 03/02 > [ 4.534801] Call Trace: > [ 4.534805] dump_stack+0x65/0x89 > [ 4.534809] kmem_cache_destroy+0x1e1/0x1f0 > [ 4.534814] i915_gem_load_cleanup+0xb4/0x120 > [ 4.534819] i915_driver_cleanup_early+0x1a/0x50 > [ 4.534824] i915_driver_load+0x6b8/0xe70 > [ 4.534830] ? _raw_spin_unlock_irqrestore+0x26/0x50 > [ 4.534835] i915_pci_probe+0x2c/0x50 > [ 4.534840] local_pci_probe+0x45/0xa0 > [ 4.534844] ? pci_match_device+0xe0/0x110 > [ 4.534850] pci_device_probe+0x135/0x150 > [ 4.534856] driver_probe_device+0x288/0x490 > [ 4.534862] __driver_attach+0xc9/0xf0 > [ 4.534867] ? driver_probe_device+0x490/0x490 > [ 4.534871] bus_for_each_dev+0x5d/0x90 > [ 4.534877] driver_attach+0x1e/0x20 > [ 4.534882] bus_add_driver+0x1d0/0x290 > [ 4.534888] driver_register+0x60/0xe0 > [ 4.534893] __pci_register_driver+0x5d/0x60 > [ 4.534897] i915_init+0x59/0x5c > [ 4.534901] ? mipi_dsi_bus_init+0x17/0x17 > [ 4.534904] do_one_initcall+0x42/0x180 > [ 4.534910] kernel_init_freeable+0x17c/0x202 > [ 4.534914] ? set_debug_rodata+0x17/0x17 > [ 4.534917] ? 
rest_init+0x90/0x90
> [ 4.534922] kernel_init+0xe/0x110
> [ 4.534925] ret_from_fork+0x25/0x30
> [ 4.535386] i915 0000:00:02.0: [drm:i915_driver_load] Device
> initialization failed (-22)
> [ 4.535390] i915 0000:00:02.0: Please file a bug at
> https://bugs.freedesktop.org/enter_bug.cgi?product=DRI against
> DRM/Intel providing the dmesg log by booting with drm.debug=0xf
> [ 4.535450] i915: probe of 0000:00:02.0 failed with error -22
>
>
> On Fri, Apr 28, 2017 at 7:36 AM, Oscar Mateo <oscar.mateo@intel.com> wrote:
>> This batchbuffer is over 4096 bytes, so we need to increase the size of the
>> array (and the KMD has to be modified to deal with more than one page).
>>
>> Notice that there are two workarounds embedded here, both applicable to all CNL
>> steppings.
>>
>> v2: WaPSRandomCSNotDone is not A0 only (as per the latest BSpec), so update
>> the comment in the code and in the commit message.
>>
>> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
>> Cc: Ben Widawsky <ben@bwidawsk.net>
>> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
>> ---
>>  lib/gen10_render.h                             |  63 +++
>>  tools/null_state_gen/Makefile.am               |   3 +-
>>  tools/null_state_gen/intel_batchbuffer.h       |   2 +-
>>  tools/null_state_gen/intel_null_state_gen.c    |   5 +-
>>  tools/null_state_gen/intel_renderstate.h       |   1 +
>>  tools/null_state_gen/intel_renderstate_gen10.c | 538 +++++++++++++++++++++++++
>>  6 files changed, 609 insertions(+), 3 deletions(-)
>>  create mode 100644 lib/gen10_render.h
>>  create mode 100644 tools/null_state_gen/intel_renderstate_gen10.c
>>
>> diff --git a/lib/gen10_render.h b/lib/gen10_render.h
>> new file mode 100644
>> index 0000000..f4a7dff
>> --- /dev/null
>> +++ b/lib/gen10_render.h
>> @@ -0,0 +1,63 @@
>> +#ifndef GEN10_RENDER_H
>> +#define GEN10_RENDER_H
>> +
>> +#include "gen9_render.h"
>> +
>> +#define GEN7_MI_RS_CONTROL (0x6 << 23)
>> +# define GEN7_MI_RS_CONTROL_ENABLE (1 << 0)
>> +
>> +#define GEN10_3DSTATE_GATHER_POOL_ALLOC GEN6_3D(3, 1, 0x1a)
>> +# define
GEN10_3DSTATE_GATHER_POOL_ENABLE (1 << 11) >> + >> +#define GEN10_3DSTATE_GATHER_CONSTANT_VS GEN6_3D(3, 0, 0x34) >> +#define GEN10_3DSTATE_GATHER_CONSTANT_HS GEN6_3D(3, 0, 0x36) >> +#define GEN10_3DSTATE_GATHER_CONSTANT_DS GEN6_3D(3, 0, 0x37) >> +#define GEN10_3DSTATE_GATHER_CONSTANT_GS GEN6_3D(3, 0, 0x35) >> +#define GEN10_3DSTATE_GATHER_CONSTANT_PS GEN6_3D(3, 0, 0x38) >> + >> +#define GEN10_3DSTATE_WM_DEPTH_STENCIL GEN6_3D(3, 0, 0x4e) >> +#define GEN10_3DSTATE_WM_CHROMAKEY GEN6_3D(3, 0, 0x4c) >> + >> +#define GEN8_REG_L3_CACHE_CONFIG 0x7034 >> + >> +/* >> + * Programming for L3 cache allocations can be made per bank. Based on the >> + * programmed value HW will apply same allocations on other available banks. >> + * Total L3 Cache size per bank = 256 KB. >> + * {SLM, URB, DC, RO(I/S, C, T), L3 Client Pool} >> + * { 0, 96, 32, 128, 0 } >> + */ >> +#define GEN10_L3_CACHE_CONFIG_VALUE 0x00420060 >> + >> +#define URB_ALIGN(val, align) ((val % align) ? (val - (val % align)) : val) >> + >> +#define GEN10_VS_MIN_NUM_OF_URB_ENTRIES 64 >> +#define GEN10_VS_MAX_NUM_OF_URB_ENTRIES 2752 >> + >> +#define GEN10_KB_PER_URB_INDEX 8 >> +#define GEN10_L3_URB_SIZE_PER_BANK_IN_KB 96 >> + >> +#define GEN10_URB_RESERVED_SIZE_KB 32 >> +#define GEN10_URB_RESERVED_END_SIZE_KB 8 >> + >> +#define GEN10_VS_NUM_BITS_PER_URB_UNIT 512 >> +#define GEN10_VS_NUM_OF_URB_UNITS 1 // zero based >> +#define GEN10_VS_URB_ENTRY_SIZE_IN_BITS (GEN10_VS_NUM_BITS_PER_URB_UNIT * \ >> + (GEN10_VS_NUM_OF_URB_UNITS + 1)) >> + >> +#define GEN10_VS_URB_START_INDEX (GEN10_URB_RESERVED_SIZE_KB / GEN10_KB_PER_URB_INDEX) >> + >> +#define GEN10_URB_SIZE_PER_SLICE_KB(l3_bank_count, slice_count) \ >> + URB_ALIGN((uint32_t)(GEN10_L3_URB_SIZE_PER_BANK_IN_KB * l3_bank_count / slice_count), GEN10_KB_PER_URB_INDEX) >> + >> +#define GEN10_VS_URB_SIZE_PER_SLICE_KB(total_urb_size_per_slice) \ >> + (total_urb_size_per_slice - GEN10_URB_RESERVED_SIZE_KB - GEN10_URB_RESERVED_END_SIZE_KB) >> + >> +#define 
GEN10_VS_NUM_URB_ENTRIES_PER_SLICE(total_urb_size_per_slice) \ >> + ((GEN10_VS_URB_SIZE_PER_SLICE_KB(total_urb_size_per_slice) * \ >> + 1024 * 8) / GEN10_VS_URB_ENTRY_SIZE_IN_BITS) >> + >> +#define GEN10_VS_END_URB_INDEX(urb_size_per_slice) \ >> + ((urb_size_per_slice - GEN10_URB_RESERVED_END_SIZE_KB) / GEN10_KB_PER_URB_INDEX) >> + >> +#endif >> diff --git a/tools/null_state_gen/Makefile.am b/tools/null_state_gen/Makefile.am >> index 24884a7..2f90990 100644 >> --- a/tools/null_state_gen/Makefile.am >> +++ b/tools/null_state_gen/Makefile.am >> @@ -12,9 +12,10 @@ intel_null_state_gen_SOURCES = \ >> intel_renderstate_gen7.c \ >> intel_renderstate_gen8.c \ >> intel_renderstate_gen9.c \ >> + intel_renderstate_gen10.c \ >> intel_null_state_gen.c >> >> -gens := 6 7 8 9 >> +gens := 6 7 8 9 10 >> >> h = /tmp/intel_renderstate_gen$$gen.c >> states: intel_null_state_gen >> diff --git a/tools/null_state_gen/intel_batchbuffer.h b/tools/null_state_gen/intel_batchbuffer.h >> index 771d1c8..e40e01b 100644 >> --- a/tools/null_state_gen/intel_batchbuffer.h >> +++ b/tools/null_state_gen/intel_batchbuffer.h >> @@ -34,7 +34,7 @@ >> #include <stdint.h> >> >> #define MAX_RELOCS 64 >> -#define MAX_ITEMS 1024 >> +#define MAX_ITEMS 2048 >> #define MAX_STRLEN 256 >> >> #define ALIGN(x, y) (((x) + (y)-1) & ~((y)-1)) >> diff --git a/tools/null_state_gen/intel_null_state_gen.c b/tools/null_state_gen/intel_null_state_gen.c >> index 06eb954..4f12f5f 100644 >> --- a/tools/null_state_gen/intel_null_state_gen.c >> +++ b/tools/null_state_gen/intel_null_state_gen.c >> @@ -41,7 +41,7 @@ static int debug = 0; >> static void print_usage(char *s) >> { >> fprintf(stderr, "%s: <gen>\n" >> - " gen: gen to generate for (6,7,8,9)\n", >> + " gen: gen to generate for (6,7,8,9,10)\n", >> s); >> } >> >> @@ -173,6 +173,9 @@ static int do_generate(int gen) >> case 9: >> null_state_gen = gen9_setup_null_render_state; >> break; >> + case 10: >> + null_state_gen = gen10_setup_null_render_state; >> + break; >> } >> >> 
if (null_state_gen == NULL) { >> diff --git a/tools/null_state_gen/intel_renderstate.h b/tools/null_state_gen/intel_renderstate.h >> index b27b434..b3c8c2b 100644 >> --- a/tools/null_state_gen/intel_renderstate.h >> +++ b/tools/null_state_gen/intel_renderstate.h >> @@ -30,5 +30,6 @@ void gen6_setup_null_render_state(struct intel_batchbuffer *batch); >> void gen7_setup_null_render_state(struct intel_batchbuffer *batch); >> void gen8_setup_null_render_state(struct intel_batchbuffer *batch); >> void gen9_setup_null_render_state(struct intel_batchbuffer *batch); >> +void gen10_setup_null_render_state(struct intel_batchbuffer *batch); >> >> #endif /* __INTEL_RENDERSTATE_H__ */ >> diff --git a/tools/null_state_gen/intel_renderstate_gen10.c b/tools/null_state_gen/intel_renderstate_gen10.c >> new file mode 100644 >> index 0000000..f5678c3 >> --- /dev/null >> +++ b/tools/null_state_gen/intel_renderstate_gen10.c >> @@ -0,0 +1,538 @@ >> +/* >> + * Copyright © 2014 Intel Corporation >> + * >> + * Permission is hereby granted, free of charge, to any person obtaining a >> + * copy of this software and associated documentation files (the "Software"), >> + * to deal in the Software without restriction, including without limitation >> + * the rights to use, copy, modify, merge, publish, distribute, sublicense, >> + * and/or sell copies of the Software, and to permit persons to whom the >> + * Software is furnished to do so, subject to the following conditions: >> + * >> + * The above copyright notice and this permission notice (including the next >> + * paragraph) shall be included in all copies or substantial portions of the >> + * Software. >> + * >> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR >> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, >> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL >> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER >> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING >> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER >> + * DEALINGS IN THE SOFTWARE. >> + * >> + * Authors: >> + * Oscar Mateo <oscar.mateo@intel.com> >> + */ >> + >> +#include "intel_renderstate.h" >> +#include <lib/gen10_render.h> >> +#include <lib/intel_reg.h> >> + >> +static void gen8_emit_wm(struct intel_batchbuffer *batch) >> +{ >> + OUT_BATCH(GEN6_3DSTATE_WM | (2 - 2)); >> + OUT_BATCH(GEN7_WM_LEGACY_DIAMOND_LINE_RASTERIZATION); >> +} >> + >> +static void gen8_emit_ps(struct intel_batchbuffer *batch) >> +{ >> + OUT_BATCH(GEN7_3DSTATE_PS | (12 - 2)); >> + OUT_BATCH(0); >> + OUT_BATCH(0); /* kernel hi */ >> + OUT_BATCH(GEN7_PS_SPF_MODE); >> + OUT_BATCH(0); /* scratch space stuff */ >> + OUT_BATCH(0); /* scratch hi */ >> + OUT_BATCH(0); >> + OUT_BATCH(0); >> + OUT_BATCH(0); // kernel 1 >> + OUT_BATCH(0); /* kernel 1 hi */ >> + OUT_BATCH(0); // kernel 2 >> + OUT_BATCH(0); /* kernel 2 hi */ >> +} >> + >> +static void gen8_emit_sf(struct intel_batchbuffer *batch) >> +{ >> + OUT_BATCH(GEN6_3DSTATE_SF | (4 - 2)); >> + OUT_BATCH(0); >> + OUT_BATCH(0); >> + OUT_BATCH(1 << GEN6_3DSTATE_SF_TRIFAN_PROVOKE_SHIFT | >> + 1 << GEN6_3DSTATE_SF_VERTEX_SUB_PIXEL_PRECISION_SHIFT | >> + GEN7_SF_POINT_WIDTH_FROM_SOURCE | >> + 8); >> +} >> + >> +static void gen8_emit_vs(struct intel_batchbuffer *batch) >> +{ >> + OUT_BATCH(GEN6_3DSTATE_VS | (9 - 2)); >> + OUT_BATCH(0); >> + OUT_BATCH(0); >> + OUT_BATCH(GEN7_VS_FLOATING_POINT_MODE_ALTERNATE); >> + OUT_BATCH(0); >> + OUT_BATCH(0); >> + OUT_BATCH(0); >> + OUT_BATCH(0); >> + OUT_BATCH(0); >> +} >> + >> +static void gen8_emit_hs(struct intel_batchbuffer *batch) >> +{ >> + OUT_BATCH(GEN7_3DSTATE_HS | (9 - 2)); >> + OUT_BATCH(0); >> + OUT_BATCH(0); >> + OUT_BATCH(0); >> + OUT_BATCH(0); >> + OUT_BATCH(0); >> + OUT_BATCH(0); >> + OUT_BATCH(1 
<< GEN7_SBE_URB_ENTRY_READ_LENGTH_SHIFT); >> + OUT_BATCH(0); >> +} >> + >> +static void gen8_emit_raster(struct intel_batchbuffer *batch) >> +{ >> + OUT_BATCH(GEN8_3DSTATE_RASTER | (5 - 2)); >> + OUT_BATCH(GEN8_RASTER_CULL_NONE | GEN8_RASTER_FRONT_WINDING_CCW); >> + OUT_BATCH(0.0); >> + OUT_BATCH(0.0); >> + OUT_BATCH(0.0); >> +} >> + >> +static void gen10_emit_urb(struct intel_batchbuffer *batch) >> +{ >> + /* Smallest SKU: 3x8*/ >> + int l3_bank_count = 3; >> + int slice_count = 1; >> + int urb_size_per_slice = GEN10_URB_SIZE_PER_SLICE_KB(l3_bank_count, slice_count); >> + int other_urb_start_addr = GEN10_VS_END_URB_INDEX(urb_size_per_slice); >> + const int vs_urb_start_addr = GEN10_VS_URB_START_INDEX; >> + const int vs_urb_alloc_size = GEN10_VS_NUM_OF_URB_UNITS; >> + int vs_urb_entries = GEN10_VS_NUM_URB_ENTRIES_PER_SLICE(urb_size_per_slice); >> + >> + if (vs_urb_entries < GEN10_VS_MIN_NUM_OF_URB_ENTRIES) >> + vs_urb_entries = GEN10_VS_MIN_NUM_OF_URB_ENTRIES; >> + if (vs_urb_entries > GEN10_VS_MAX_NUM_OF_URB_ENTRIES) >> + vs_urb_entries = GEN10_VS_MAX_NUM_OF_URB_ENTRIES; >> + >> + OUT_BATCH(GEN7_3DSTATE_URB_VS); >> + OUT_BATCH(vs_urb_entries | >> + (vs_urb_alloc_size << 16) | >> + (vs_urb_start_addr << 25)); >> + >> + OUT_BATCH(GEN7_3DSTATE_URB_HS); >> + OUT_BATCH(other_urb_start_addr << 25); >> + >> + OUT_BATCH(GEN7_3DSTATE_URB_DS); >> + OUT_BATCH(other_urb_start_addr << 25); >> + >> + OUT_BATCH(GEN7_3DSTATE_URB_GS); >> + OUT_BATCH(other_urb_start_addr << 25); >> +} >> + >> +static void gen8_emit_vf_topology(struct intel_batchbuffer *batch) >> +{ >> + OUT_BATCH(GEN8_3DSTATE_VF_TOPOLOGY); >> + OUT_BATCH(_3DPRIM_TRILIST); >> +} >> + >> +static void gen8_emit_so_decl_list(struct intel_batchbuffer *batch) >> +{ >> + const int num_decls = 128; >> + int i; >> + >> + OUT_BATCH(GEN8_3DSTATE_SO_DECL_LIST | >> + (((2 * num_decls) + 3) - 2) /* DWORD count - 2 */); >> + OUT_BATCH(0); >> + OUT_BATCH(num_decls); >> + >> + for (i = 0; i < num_decls; i++) { >> + OUT_BATCH(0); >> 
+ OUT_BATCH(0); >> + } >> +} >> + >> +static void gen8_emit_so_buffer(struct intel_batchbuffer *batch, const int index) >> +{ >> + OUT_BATCH(GEN8_3DSTATE_SO_BUFFER | (8 - 2)); >> + OUT_BATCH(index << 29); >> + OUT_BATCH(0); >> + OUT_BATCH(0); >> + OUT_BATCH(0); >> + OUT_BATCH(0); >> + OUT_BATCH(0); >> + OUT_BATCH(0); >> +} >> + >> +static void gen8_emit_chroma_key(struct intel_batchbuffer *batch, const int index) >> +{ >> + OUT_BATCH(GEN6_3DSTATE_CHROMA_KEY | (4 - 2)); >> + OUT_BATCH(index << 30); >> + OUT_BATCH(0); >> + OUT_BATCH(0); >> +} >> + >> +static void gen8_emit_vertex_buffers(struct intel_batchbuffer *batch) >> +{ >> + const int buffers = 33; >> + int i; >> + >> + OUT_BATCH(GEN6_3DSTATE_VERTEX_BUFFERS | >> + (((4 * buffers) + 1)- 2) /* DWORD count - 2 */); >> + >> + for (i = 0; i < buffers; i++) { >> + OUT_BATCH(i << VB0_BUFFER_INDEX_SHIFT | >> + GEN7_VB0_BUFFER_ADDR_MOD_EN); >> + OUT_BATCH(0); /* Address */ >> + OUT_BATCH(0); >> + OUT_BATCH(0); >> + } >> +} >> + >> +static void gen8_emit_vertex_elements(struct intel_batchbuffer *batch) >> +{ >> + const int elements = 34; >> + int i; >> + >> + OUT_BATCH(GEN6_3DSTATE_VERTEX_ELEMENTS | >> + (((2 * elements) + 1) - 2) /* DWORD count - 2 */); >> + >> + /* Element 0 */ >> + OUT_BATCH(VE0_VALID); >> + OUT_BATCH( >> + GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_0_SHIFT | >> + GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_1_SHIFT | >> + GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_2_SHIFT | >> + GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_3_SHIFT); >> + /* Elements 1 -> 33 */ >> + for (i = 1; i < elements; i++) { >> + OUT_BATCH(0); >> + OUT_BATCH(0); >> + } >> +} >> + >> +static void gen8_emit_cc_state_pointers(struct intel_batchbuffer *batch) >> +{ >> + union { >> + float fval; >> + uint32_t uval; >> + } u; >> + >> + unsigned offset; >> + >> + u.fval = 1.0f; >> + >> + offset = intel_batch_state_offset(batch, 64); >> + OUT_STATE(0); >> + OUT_STATE(0); /* Alpha reference value */ >> + OUT_STATE(u.uval); /* Blend 
constant color RED */ >> + OUT_STATE(u.uval); /* Blend constant color BLUE */ >> + OUT_STATE(u.uval); /* Blend constant color GREEN */ >> + OUT_STATE(u.uval); /* Blend constant color ALPHA */ >> + >> + OUT_BATCH(GEN6_3DSTATE_CC_STATE_POINTERS); >> + OUT_BATCH_STATE_OFFSET(offset | 1); >> +} >> + >> +static void gen8_emit_blend_state_pointers(struct intel_batchbuffer *batch) >> +{ >> + unsigned offset; >> + int i; >> + >> + offset = intel_batch_state_offset(batch, 64); >> + >> + for (i = 0; i < 17; i++) >> + OUT_STATE(0); >> + >> + OUT_BATCH(GEN7_3DSTATE_BLEND_STATE_POINTERS | (2 - 2)); >> + OUT_BATCH_STATE_OFFSET(offset | 1); >> +} >> + >> +static void gen8_emit_ps_extra(struct intel_batchbuffer *batch) >> +{ >> + OUT_BATCH(GEN8_3DSTATE_PS_EXTRA | (2 - 2)); >> + OUT_BATCH(GEN8_PSX_PIXEL_SHADER_VALID | >> + GEN8_PSX_ATTRIBUTE_ENABLE); >> + >> +} >> + >> +static void gen8_emit_ps_blend(struct intel_batchbuffer *batch) >> +{ >> + OUT_BATCH(GEN8_3DSTATE_PS_BLEND | (2 - 2)); >> + OUT_BATCH(GEN8_PS_BLEND_HAS_WRITEABLE_RT); >> +} >> + >> +static void gen8_emit_viewport_state_pointers_cc(struct intel_batchbuffer *batch) >> +{ >> + unsigned offset; >> + >> + offset = intel_batch_state_offset(batch, 32); >> + >> + OUT_STATE((uint32_t)0.0f); /* Minimum depth */ >> + OUT_STATE((uint32_t)0.0f); /* Maximum depth */ >> + >> + OUT_BATCH(GEN7_3DSTATE_VIEWPORT_STATE_POINTERS_CC | (2 - 2)); >> + OUT_BATCH_STATE_OFFSET(offset); >> +} >> + >> +static void gen8_emit_viewport_state_pointers_sf_clip(struct intel_batchbuffer *batch) >> +{ >> + unsigned offset; >> + int i; >> + >> + offset = intel_batch_state_offset(batch, 64); >> + >> + for (i = 0; i < 16; i++) >> + OUT_STATE(0); >> + >> + OUT_BATCH(GEN7_3DSTATE_VIEWPORT_STATE_POINTERS_SF_CLIP | (2 - 2)); >> + OUT_BATCH_STATE_OFFSET(offset); >> +} >> + >> +static void gen8_emit_primitive(struct intel_batchbuffer *batch) >> +{ >> + OUT_BATCH(GEN6_3DPRIMITIVE | (10-2)); >> + OUT_BATCH(4); /* gen8+ ignore the topology type field */ >> + 
OUT_BATCH(1); /* vertex count */ >> + OUT_BATCH(0); >> + OUT_BATCH(1); /* single instance */ >> + OUT_BATCH(0); /* start instance location */ >> + OUT_BATCH(0); /* index buffer offset, ignored */ >> + OUT_BATCH(0); /* extended parameter 0 */ >> + OUT_BATCH(0); /* extended parameter 1 */ >> + OUT_BATCH(0); /* extended parameter 2 */ >> +} >> + >> +static void gen9_emit_state_base_address(struct intel_batchbuffer *batch) { >> + const unsigned offset = 0; >> + OUT_BATCH(GEN6_STATE_BASE_ADDRESS | >> + (22 - 2) /* DWORD count - 2 */); >> + >> + /* general state base address - requires BB address >> + * added to state offset to be stored in this location >> + */ >> + OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY); >> + OUT_BATCH(0); >> + >> + /* stateless data port */ >> + OUT_BATCH(0); >> + >> + /* surface state base address - requires BB address >> + * added to state offset to be stored in this location >> + */ >> + OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY); >> + OUT_BATCH(0); >> + >> + /* dynamic state base address - requires BB address >> + * added to state offset to be stored in this location >> + */ >> + OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY); >> + OUT_BATCH(0); >> + >> + /* indirect state base address */ >> + OUT_BATCH(BASE_ADDRESS_MODIFY); >> + OUT_BATCH(0); >> + >> + /* instruction state base address - requires BB address >> + * added to state offset to be stored in this location >> + */ >> + OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY); >> + OUT_BATCH(0); >> + >> + /* general state buffer size */ >> + OUT_BATCH(GEN8_STATE_SIZE_PAGES(1) | BUFFER_SIZE_MODIFY); >> + /* dynamic state buffer size */ >> + OUT_BATCH(GEN8_STATE_SIZE_PAGES(1) | BUFFER_SIZE_MODIFY); >> + /* indirect object buffer size */ >> + OUT_BATCH(0x0 | BUFFER_SIZE_MODIFY); >> + /* intruction buffer size */ >> + OUT_BATCH(GEN8_STATE_SIZE_PAGES(1) | BUFFER_SIZE_MODIFY); >> + >> + /* bindless surface state base address */ >> + OUT_BATCH(BASE_ADDRESS_MODIFY); >> + 
OUT_BATCH(0); >> + /* bindless surface state size */ >> + OUT_BATCH(0); >> + >> + /* bindless sampler state base address */ >> + OUT_BATCH(BASE_ADDRESS_MODIFY); >> + OUT_BATCH(0); >> + /* bindless sampler state size */ >> + OUT_BATCH(0); >> +} >> + >> +/* >> + * Generate the batch buffer commands needed to initialize the 3D engine >> + * to its "golden state". >> + */ >> +void gen10_setup_null_render_state(struct intel_batchbuffer *batch) >> +{ >> + int i; >> + >> + /* WaRsGatherPoolEnable: cnl */ >> + OUT_BATCH(GEN7_MI_RS_CONTROL); >> + >> +#define GEN8_PIPE_CONTROL_GLOBAL_GTT (1 << 24) >> + /* PIPE_CONTROL */ >> + OUT_BATCH(GEN6_PIPE_CONTROL | >> + (6 - 2)); /* DWORD count - 2 */ >> + OUT_BATCH(GEN8_PIPE_CONTROL_GLOBAL_GTT); >> + OUT_BATCH(0); >> + OUT_BATCH(0); >> + OUT_BATCH(0); >> + OUT_BATCH(0); >> + >> + /* PIPELINE_SELECT */ >> + OUT_BATCH(GEN9_PIPELINE_SELECT | PIPELINE_SELECT_3D); >> + >> + OUT_BATCH(MI_LOAD_REGISTER_IMM); >> + OUT_BATCH(GEN8_REG_L3_CACHE_CONFIG); >> + OUT_BATCH(GEN10_L3_CACHE_CONFIG_VALUE); >> + >> + gen8_emit_wm(batch); >> + gen8_emit_ps(batch); >> + gen8_emit_sf(batch); >> + >> + OUT_CMD(GEN7_3DSTATE_SBE, 6); /* Check w/ Gen8 code */ >> + OUT_CMD(GEN8_3DSTATE_SBE_SWIZ, 11); >> + >> + gen8_emit_vs(batch); >> + gen8_emit_hs(batch); >> + >> + OUT_CMD(GEN7_3DSTATE_GS, 10); >> + OUT_CMD(GEN7_3DSTATE_STREAMOUT, 5); >> + OUT_CMD(GEN7_3DSTATE_DS, 11); /* Check w/ Gen8 code */ >> + OUT_CMD(GEN6_3DSTATE_CLIP, 4); >> + OUT_CMD(GEN7_3DSTATE_TE, 4); >> + OUT_CMD(GEN8_3DSTATE_VF, 2); >> + OUT_CMD(GEN8_3DSTATE_WM_HZ_OP, 5); >> + >> + /* URB States */ >> + gen10_emit_urb(batch); >> + >> + OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_VS, 130); >> + OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_HS, 130); >> + OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_DS, 130); >> + OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_GS, 130); >> + OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_PS, 130); >> + >> + OUT_CMD(GEN8_3DSTATE_BIND_TABLE_POOL_ALLOC, 4); >> + OUT_CMD(GEN8_3DSTATE_GATHER_POOL_ALLOC, 4); >> 
+ OUT_CMD(GEN8_3DSTATE_DX9_CONSTANT_BUFFER_POOL_ALLOC, 4); >> + >> + /* Push Constants */ >> + OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_VS, 2); >> + OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_HS, 2); >> + OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_DS, 2); >> + OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_GS, 2); >> + OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_PS, 2); >> + >> + /* Constants */ >> + OUT_CMD(GEN6_3DSTATE_CONSTANT_VS, 11); >> + OUT_CMD(GEN7_3DSTATE_CONSTANT_HS, 11); >> + OUT_CMD(GEN7_3DSTATE_CONSTANT_DS, 11); >> + OUT_CMD(GEN7_3DSTATE_CONSTANT_GS, 11); >> + OUT_CMD(GEN7_3DSTATE_CONSTANT_PS, 11); >> + >> + OUT_CMD(GEN8_3DSTATE_VF_INSTANCING, 3); >> + OUT_CMD(GEN8_3DSTATE_VF_SGVS, 2); >> + gen8_emit_vf_topology(batch); >> + >> + /* Streamer out declaration list */ >> + gen8_emit_so_decl_list(batch); >> + >> + /* Streamer out buffers */ >> + for (i = 0; i < 4; i++) { >> + gen8_emit_so_buffer(batch, i); >> + } >> + >> + /* State base addresses */ >> + gen9_emit_state_base_address(batch); >> + >> + OUT_CMD(GEN6_STATE_SIP, 3); >> + OUT_CMD(GEN6_3DSTATE_DRAWING_RECTANGLE, 4); >> + OUT_CMD(GEN7_3DSTATE_DEPTH_BUFFER, 8); >> + >> + /* Chroma key */ >> + for (i = 0; i < 4; i++) { >> + gen8_emit_chroma_key(batch, i); >> + } >> + >> + OUT_CMD(GEN6_3DSTATE_LINE_STIPPLE, 3); >> + OUT_CMD(GEN6_3DSTATE_AA_LINE_PARAMS, 3); >> + OUT_CMD(GEN7_3DSTATE_STENCIL_BUFFER, 5); >> + OUT_CMD(GEN7_3DSTATE_HIER_DEPTH_BUFFER, 5); >> + OUT_CMD(GEN7_3DSTATE_CLEAR_PARAMS, 3); >> + OUT_CMD(GEN6_3DSTATE_MONOFILTER_SIZE, 2); >> + >> + /* WaPSRandomCSNotDone:cnl */ >> +#define GEN8_PIPE_CONTROL_STALL_ENABLE (1 << 20) >> + OUT_BATCH(GEN6_PIPE_CONTROL | (6 - 2)); >> + OUT_BATCH(GEN8_PIPE_CONTROL_STALL_ENABLE); >> + OUT_BATCH(0); >> + OUT_BATCH(0); >> + OUT_BATCH(0); >> + OUT_BATCH(0); >> + >> + OUT_CMD(GEN8_3DSTATE_MULTISAMPLE, 2); >> + OUT_CMD(GEN8_3DSTATE_POLY_STIPPLE_OFFSET, 2); >> + OUT_CMD(GEN8_3DSTATE_POLY_STIPPLE_PATTERN, 1 + 32); >> + OUT_CMD(GEN8_3DSTATE_SAMPLER_PALETTE_LOAD0, 1 + 16); >> + 
OUT_CMD(GEN8_3DSTATE_SAMPLER_PALETTE_LOAD1, 1 + 16); >> + OUT_CMD(GEN6_3DSTATE_INDEX_BUFFER, 5); >> + >> + /* Vertex buffers */ >> + gen8_emit_vertex_buffers(batch); >> + gen8_emit_vertex_elements(batch); >> + >> + OUT_BATCH(GEN6_3DSTATE_VF_STATISTICS | 1 /* Enable */); >> + >> + /* 3D state binding table pointers */ >> + OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_VS, 2); >> + OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_HS, 2); >> + OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_DS, 2); >> + OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_GS, 2); >> + OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_PS, 2); >> + >> + gen8_emit_cc_state_pointers(batch); >> + gen8_emit_blend_state_pointers(batch); >> + gen8_emit_ps_extra(batch); >> + gen8_emit_ps_blend(batch); >> + >> + /* 3D state sampler state pointers */ >> + OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_VS, 2); >> + OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_HS, 2); >> + OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_DS, 2); >> + OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_GS, 2); >> + OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_PS, 2); >> + >> + OUT_CMD(GEN6_3DSTATE_SCISSOR_STATE_POINTERS, 2); >> + >> + gen8_emit_viewport_state_pointers_cc(batch); >> + gen8_emit_viewport_state_pointers_sf_clip(batch); >> + >> + /* WaPSRandomCSNotDone:cnl */ >> +#define GEN8_PIPE_CONTROL_STALL_ENABLE (1 << 20) >> + OUT_BATCH(GEN6_PIPE_CONTROL | (6 - 2)); >> + OUT_BATCH(GEN8_PIPE_CONTROL_STALL_ENABLE); >> + OUT_BATCH(0); >> + OUT_BATCH(0); >> + OUT_BATCH(0); >> + OUT_BATCH(0); >> + >> + gen8_emit_raster(batch); >> + >> + OUT_CMD(GEN10_3DSTATE_WM_DEPTH_STENCIL, 4); >> + OUT_CMD(GEN10_3DSTATE_WM_CHROMAKEY, 2); >> + >> + /* Launch 3D operation */ >> + gen8_emit_primitive(batch); >> + >> + /* WaRsGatherPoolEnable: cnl */ >> + OUT_BATCH(GEN7_MI_RS_CONTROL | GEN7_MI_RS_CONTROL_ENABLE); >> + OUT_BATCH(GEN10_3DSTATE_GATHER_POOL_ALLOC | (4 - 2)); >> + OUT_BATCH(GEN10_3DSTATE_GATHER_POOL_ENABLE); >> + OUT_BATCH(0); >> + OUT_BATCH(0xfffff << 12); >> + 
OUT_BATCH(GEN7_MI_RS_CONTROL); >> + OUT_CMD(GEN10_3DSTATE_GATHER_POOL_ALLOC, 4); >> + >> + OUT_BATCH(MI_BATCH_BUFFER_END); >> +} >> -- >> 1.9.1 >> >> _______________________________________________ >> Intel-gfx mailing list >> Intel-gfx@lists.freedesktop.org >> https://lists.freedesktop.org/mailman/listinfo/intel-gfx > >
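A minimal sketch of why this batch no longer fits in one page, as the commit message of the quoted patch states ("over 4096 bytes", with MAX_ITEMS raised from 1024 to 2048). It adds up only the four largest fixed-size commands visible in the patch, so the result is a lower bound rather than the real batch length. Two assumptions are not confirmed by the quoted text: that one batch item is a 4-byte DWORD, and that OUT_CMD(cmd, n) emits n DWORDs in total. All function names below are hypothetical and are not part of IGT or i915.

```c
#include <assert.h>

#define DWORD_BYTES 4u     /* assumption: one batch item is one 32-bit DWORD */
#define PAGE_BYTES  4096u  /* "one page" as used in the thread */

/* Five GATHER_CONSTANT commands (VS/HS/DS/GS/PS), 130 DWORDs each
 * (assumes OUT_CMD(cmd, 130) emits 130 DWORDs in total). */
unsigned gather_constant_dwords(void)
{
	return 5u * 130u;
}

/* 3DSTATE_SO_DECL_LIST: (2 * num_decls) + 3 DWORDs, as in the patch. */
unsigned so_decl_list_dwords(unsigned num_decls)
{
	return 2u * num_decls + 3u;
}

/* 3DSTATE_VERTEX_BUFFERS: (4 * buffers) + 1 DWORDs, as in the patch. */
unsigned vertex_buffers_dwords(unsigned buffers)
{
	return 4u * buffers + 1u;
}

/* 3DSTATE_VERTEX_ELEMENTS: (2 * elements) + 1 DWORDs, as in the patch. */
unsigned vertex_elements_dwords(unsigned elements)
{
	return 2u * elements + 1u;
}

/* Lower bound on the batch length: only the four biggest contributors,
 * using the constants from the patch (128 decls, 33 buffers, 34 elements). */
unsigned partial_batch_dwords(void)
{
	return gather_constant_dwords() +
	       so_decl_list_dwords(128) +
	       vertex_buffers_dwords(33) +
	       vertex_elements_dwords(34);
}

/* Number of whole pages needed to hold `dwords` batch items. */
unsigned pages_needed(unsigned dwords)
{
	assert(dwords > 0);
	return (dwords * DWORD_BYTES + PAGE_BYTES - 1u) / PAGE_BYTES;
}
```

Under these assumptions, partial_batch_dwords() comes to 650 + 259 + 133 + 69 = 1111 DWORDs, i.e. 4444 bytes, so pages_needed() reports two pages. That is consistent with the MAX_ITEMS bump in the generator and with the commit message's note that the KMD has to handle more than one page.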
On Wed, Jul 12, 2017 at 1:42 PM, Oscar Mateo <oscar.mateo@intel.com> wrote:
>
>
> On 07/05/2017 05:50 PM, Rodrigo Vivi wrote:
>>
>> Hi Oscar,
>
>
> Hey!
>
>> I had missed this patch here, but noticed now that I was refreshing
>> and testing more cnl tests before re-submitting them.
>>
>> First of all I believe we need to remove the A0 w/a. I don't believe
>> we will ever see one. So I'm removing all A0 exclusive W/a from the
>> patches as well.
>
>
> Be careful: I think both WAs in the patch are for all steppings (one was
> incorrectly marked as A0 only in v1 of this patch).

ah cool, so v2 is right...

>
>> I also gave a try here on your null state. However if I use the golden
>> state generated by this version I get a blank screen because driver
>> load fails with some strange faults:
>
>
> Good. I don't have a CNL so it was only compile-tested.
>
>> any idea?
>
>
> Did you also include the i915 patch to allow golden BBs over one page in
> size? I sent it separately as "drm/i915: Allow null render state
> batchbuffers bigger than one page". BTW: this patch was given a cold
> shoulder in the mailing list, since I could not re-justify why null state
> was needed in the first place (since UMD needs to configure the 3D pipeline
> first thing anyway). I am still trying to get a better explanation from HW
> people.

hmmmm no... I missed that patch... sorry...

I'm currently without access to CNL, but as soon as I have access I will
test it and if that works I will just merge the igt one, review your
kernel one, etc...

>
> -- Oscar
>
>> [ 4.115243] Memory manager not clean during takedown.
>> >> [ 4.120389] ------------[ cut here ]------------ >> [ 4.125068] WARNING: CPU: 0 PID: 1 at drivers/gpu/drm/drm_mm.c:892 >> drm_mm_takedown+0x25/0x30 >> [ 4.133574] Modules linked in: >> [ 4.136707] CPU: 0 PID: 1 Comm: swapper/0 Not tainted >> 4.12.0-eywa-46011-g9a19faf #360 >> [ 4.144650] Hardware name: Intel Corporation Cannonlake Client >> platform CNL - U0/CannonLake U DDR4 SODIMM RVP, BIOS >> CNLSFWR1.R00.X075.D01.1703021113 03/02 >> [ 4.158500] task: ffff880264ab8000 task.stack: ffffc90000038000 >> [ 4.164506] RIP: 0010:drm_mm_takedown+0x25/0x30 >> [ 4.169104] RSP: 0000:ffffc9000003bc28 EFLAGS: 00010292 >> [ 4.174409] RAX: 0000000000000029 RBX: ffff880260a54170 RCX: >> ffffffff82468740 >> [ 4.181654] RDX: 0000000000000001 RSI: 0000000000000082 RDI: >> 00000000ffffffff >> [ 4.188839] RBP: ffffc9000003bc28 R08: 00000000fffffffe R09: >> 000000000000035a >> [ 4.196028] R10: 0000000000000005 R11: 0000000000000000 R12: >> ffff880260a50000 >> [ 4.203215] R13: ffff880260a54348 R14: ffff880260a50070 R15: >> ffff880262844a00 >> [ 4.210402] FS: 0000000000000000(0000) GS:ffff88026dc00000(0000) >> knlGS:0000000000000000 >> [ 4.218541] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >> [ 4.224344] CR2: ffffc90000d58000 CR3: 000000000240b000 CR4: >> 00000000007406f0 >> [ 4.231529] DR0: 0000000000000000 DR1: 0000000000000000 DR2: >> 0000000000000000 >> [ 4.238716] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: >> 0000000000000400 >> [ 4.245900] PKRU: 00000000 >> [ 4.248673] Call Trace: >> [ 4.251193] i915_gem_cleanup_stolen+0x1f/0x30 >> [ 4.255703] i915_ggtt_cleanup_hw+0xa4/0x170 >> [ 4.260035] i915_driver_cleanup_hw+0x36/0x40 >> [ 4.264455] i915_driver_load+0x6a0/0xe70 >> [ 4.268535] ? _raw_spin_unlock_irqrestore+0x26/0x50 >> [ 4.273560] i915_pci_probe+0x2c/0x50 >> [ 4.277293] local_pci_probe+0x45/0xa0 >> [ 4.281106] ? 
pci_match_device+0xe0/0x110 >> [ 4.285265] pci_device_probe+0x135/0x150 >> [ 4.289343] driver_probe_device+0x288/0x490 >> [ 4.293676] __driver_attach+0xc9/0xf0 >> [ 4.297490] ? driver_probe_device+0x490/0x490 >> [ 4.301999] bus_for_each_dev+0x5d/0x90 >> [ 4.305902] driver_attach+0x1e/0x20 >> [ 4.309543] bus_add_driver+0x1d0/0x290 >> [ 4.313442] driver_register+0x60/0xe0 >> [ 4.317257] __pci_register_driver+0x5d/0x60 >> [ 4.321652] i915_init+0x59/0x5c >> [ 4.324944] ? mipi_dsi_bus_init+0x17/0x17 >> [ 4.329103] do_one_initcall+0x42/0x180 >> [ 4.333007] kernel_init_freeable+0x17c/0x202 >> [ 4.337426] ? set_debug_rodata+0x17/0x17 >> [ 4.341500] ? rest_init+0x90/0x90 >> [ 4.344969] kernel_init+0xe/0x110 >> [ 4.348438] ret_from_fork+0x25/0x30 >> [ 4.352079] Code: 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 47 38 48 >> 83 c7 38 48 39 c7 75 01 c3 55 48 c7 c7 70 ac 20 82 31 c0 48 89 e5 e8 >> 6b 62 b7 ff <0f> ff 5d c3 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 >> e5 41 >> [ 4.371029] ---[ end trace 7d36c2dd72851315 ]--- >> [ 4.381680] WARN_ON(dev_priv->mm.object_count) >> [ 4.381698] ------------[ cut here ]------------ >> [ 4.390921] WARNING: CPU: 0 PID: 1 at >> drivers/gpu/drm/i915/i915_gem.c:4964 i915_gem_load_cleanup+0x10b/0x120 >> [ 4.400797] Modules linked in: >> [ 4.403927] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W >> 4.12.0-eywa-46011-g9a19faf #360 >> [ 4.413021] Hardware name: Intel Corporation Cannonlake Client >> platform CNL - U0/CannonLake U DDR4 SODIMM RVP, BIOS >> CNLSFWR1.R00.X075.D01.1703021113 03/02 >> [ 4.426884] task: ffff880264ab8000 task.stack: ffffc90000038000 >> [ 4.432865] RIP: 0010:i915_gem_load_cleanup+0x10b/0x120 >> [ 4.438157] RSP: 0000:ffffc9000003bc58 EFLAGS: 00010292 >> [ 4.443450] RAX: 0000000000000022 RBX: ffff880260a50000 RCX: >> ffffffff82468740 >> [ 4.450642] RDX: 0000000000000001 RSI: 0000000000000082 RDI: >> 0000000000000202 >> [ 4.457839] RBP: ffffc9000003bc68 R08: 0000000000000022 R09: >> 0000000000000389 >> [ 4.465029] R10: 
0000000000000000 R11: 0000000000000001 R12: >> ffff880260a54678 >> [ 4.472227] R13: ffff88026446c000 R14: ffff88026446c000 R15: >> ffff880262844a00 >> [ 4.479420] FS: 0000000000000000(0000) GS:ffff88026dc00000(0000) >> knlGS:0000000000000000 >> [ 4.487564] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >> [ 4.493370] CR2: ffffc90000d58000 CR3: 000000000240b000 CR4: >> 00000000007406f0 >> [ 4.500569] DR0: 0000000000000000 DR1: 0000000000000000 DR2: >> 0000000000000000 >> [ 4.507763] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: >> 0000000000000400 >> [ 4.514959] PKRU: 00000000 >> [ 4.517737] Call Trace: >> [ 4.520265] i915_driver_cleanup_early+0x1a/0x50 >> [ 4.524955] i915_driver_load+0x6b8/0xe70 >> [ 4.529038] ? _raw_spin_unlock_irqrestore+0x26/0x50 >> [ 4.534100] clocksource: Switched to clocksource tsc >> [ 4.534105] i915_pci_probe+0x2c/0x50 >> [ 4.534113] local_pci_probe+0x45/0xa0 >> [ 4.534118] ? pci_match_device+0xe0/0x110 >> [ 4.534124] pci_device_probe+0x135/0x150 >> [ 4.534131] driver_probe_device+0x288/0x490 >> [ 4.534137] __driver_attach+0xc9/0xf0 >> [ 4.534142] ? driver_probe_device+0x490/0x490 >> [ 4.534146] bus_for_each_dev+0x5d/0x90 >> [ 4.534152] driver_attach+0x1e/0x20 >> [ 4.534156] bus_add_driver+0x1d0/0x290 >> [ 4.534162] driver_register+0x60/0xe0 >> [ 4.534167] __pci_register_driver+0x5d/0x60 >> [ 4.534173] i915_init+0x59/0x5c >> [ 4.534177] ? mipi_dsi_bus_init+0x17/0x17 >> [ 4.534181] do_one_initcall+0x42/0x180 >> [ 4.534187] kernel_init_freeable+0x17c/0x202 >> [ 4.534191] ? set_debug_rodata+0x17/0x17 >> [ 4.534196] ? 
rest_init+0x90/0x90 >> [ 4.534200] kernel_init+0xe/0x110 >> [ 4.534204] ret_from_fork+0x25/0x30 >> [ 4.534208] Code: 82 48 c7 c7 7a 98 1a 82 31 c0 e8 21 4f b1 ff 0f >> ff e9 7a ff ff ff 48 c7 c6 88 33 21 82 48 c7 c7 7a 98 1a 82 31 c0 e8 >> 05 4f b1 ff <0f> ff e9 33 ff ff ff 66 66 66 66 66 2e 0f 1f 84 00 00 00 >> 00 00 >> [ 4.534272] ---[ end trace 7d36c2dd72851316 ]--- >> [ 4.534277] WARN_ON(!list_empty(&dev_priv->gt.timelines)) >> [ 4.534293] ------------[ cut here ]------------ >> [ 4.534298] WARNING: CPU: 0 PID: 1 at >> drivers/gpu/drm/i915/i915_gem.c:4968 i915_gem_load_cleanup+0xef/0x120 >> [ 4.534299] Modules linked in: >> [ 4.534304] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W >> 4.12.0-eywa-46011-g9a19faf #360 >> [ 4.534306] Hardware name: Intel Corporation Cannonlake Client >> platform CNL - U0/CannonLake U DDR4 SODIMM RVP, BIOS >> CNLSFWR1.R00.X075.D01.1703021113 03/02 >> [ 4.534308] task: ffff880264ab8000 task.stack: ffffc90000038000 >> [ 4.534312] RIP: 0010:i915_gem_load_cleanup+0xef/0x120 >> [ 4.534314] RSP: 0000:ffffc9000003bc58 EFLAGS: 00010292 >> [ 4.534317] RAX: 000000000000002d RBX: ffff880260a50000 RCX: >> 0000000000000000 >> [ 4.534319] RDX: 0000000000000001 RSI: 0000000000000002 RDI: >> 0000000000000296 >> [ 4.534321] RBP: ffffc9000003bc68 R08: 000000000000002d R09: >> 000000000000002d >> [ 4.534322] R10: 0000000000000000 R11: ffff880260a4e000 R12: >> ffff880260a50070 >> [ 4.534324] R13: ffff88026446c000 R14: ffff88026446c000 R15: >> ffff880262844a00 >> [ 4.534327] FS: 0000000000000000(0000) GS:ffff88026dc00000(0000) >> knlGS:0000000000000000 >> [ 4.534329] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >> [ 4.534331] CR2: ffffc90000d58000 CR3: 000000000240b000 CR4: >> 00000000007406f0 >> [ 4.534334] DR0: 0000000000000000 DR1: 0000000000000000 DR2: >> 0000000000000000 >> [ 4.534335] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: >> 0000000000000400 >> [ 4.534337] PKRU: 00000000 >> [ 4.534338] Call Trace: >> [ 4.534344] 
i915_driver_cleanup_early+0x1a/0x50 >> [ 4.534350] i915_driver_load+0x6b8/0xe70 >> [ 4.534356] ? _raw_spin_unlock_irqrestore+0x26/0x50 >> [ 4.534361] i915_pci_probe+0x2c/0x50 >> [ 4.534366] local_pci_probe+0x45/0xa0 >> [ 4.534371] ? pci_match_device+0xe0/0x110 >> [ 4.534376] pci_device_probe+0x135/0x150 >> [ 4.534382] driver_probe_device+0x288/0x490 >> [ 4.534388] __driver_attach+0xc9/0xf0 >> [ 4.534393] ? driver_probe_device+0x490/0x490 >> [ 4.534398] bus_for_each_dev+0x5d/0x90 >> [ 4.534403] driver_attach+0x1e/0x20 >> [ 4.534408] bus_add_driver+0x1d0/0x290 >> [ 4.534414] driver_register+0x60/0xe0 >> [ 4.534419] __pci_register_driver+0x5d/0x60 >> [ 4.534424] i915_init+0x59/0x5c >> [ 4.534428] ? mipi_dsi_bus_init+0x17/0x17 >> [ 4.534431] do_one_initcall+0x42/0x180 >> [ 4.534437] kernel_init_freeable+0x17c/0x202 >> [ 4.534440] ? set_debug_rodata+0x17/0x17 >> [ 4.534444] ? rest_init+0x90/0x90 >> [ 4.534448] kernel_init+0xe/0x110 >> [ 4.534451] ret_from_fork+0x25/0x30 >> [ 4.534455] Code: 82 48 c7 c7 7a 98 1a 82 31 c0 e8 3d 4f b1 ff 0f >> ff e9 5d ff ff ff 48 c7 c6 b0 33 21 82 48 c7 c7 7a 98 1a 82 31 c0 e8 >> 21 4f b1 ff <0f> ff e9 7a ff ff ff 48 c7 c6 88 33 21 82 48 c7 c7 7a 98 >> 1a 82 >> [ 4.534519] ---[ end trace 7d36c2dd72851317 ]--- >> [ 4.534605] >> ============================================================================= >> [ 4.534608] BUG drm_i915_gem_object (Tainted: G W ): >> Objects remaining in drm_i915_gem_object on __kmem_cache_shutdown() >> [ 4.534609] >> ----------------------------------------------------------------------------- >> >> [ 4.534611] Disabling lock debugging due to kernel taint >> [ 4.534614] INFO: Slab 0xffffea0009820600 objects=19 used=2 >> fp=0xffff88026081ba80 flags=0x200000000008100 >> [ 4.534618] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G B W >> 4.12.0-eywa-46011-g9a19faf #360 >> [ 4.534620] Hardware name: Intel Corporation Cannonlake Client >> platform CNL - U0/CannonLake U DDR4 SODIMM RVP, BIOS >> 
CNLSFWR1.R00.X075.D01.1703021113 03/02
>> [ 4.534621] Call Trace:
>> [ 4.534626] dump_stack+0x65/0x89
>> [ 4.534633] slab_err+0xa1/0xb0
>> [ 4.534640] ? __kmalloc+0x185/0x270
>> [ 4.534645] ? kmem_cache_alloc_bulk+0x1f0/0x1f0
>> [ 4.534650] ? __kmem_cache_shutdown+0x160/0x400
>> [ 4.534655] __kmem_cache_shutdown+0x180/0x400
>> [ 4.534663] shutdown_cache+0x18/0x1a0
>> [ 4.534667] kmem_cache_destroy+0x1c1/0x1f0
>> [ 4.534672] i915_gem_load_cleanup+0xb4/0x120
>> [ 4.534677] i915_driver_cleanup_early+0x1a/0x50
>> [ 4.534682] i915_driver_load+0x6b8/0xe70
>> [ 4.534689] ? _raw_spin_unlock_irqrestore+0x26/0x50
>> [ 4.534693] i915_pci_probe+0x2c/0x50
>> [ 4.534698] local_pci_probe+0x45/0xa0
>> [ 4.534703] ? pci_match_device+0xe0/0x110
>> [ 4.534708] pci_device_probe+0x135/0x150
>> [ 4.534714] driver_probe_device+0x288/0x490
>> [ 4.534721] __driver_attach+0xc9/0xf0
>> [ 4.534726] ? driver_probe_device+0x490/0x490
>> [ 4.534730] bus_for_each_dev+0x5d/0x90
>> [ 4.534736] driver_attach+0x1e/0x20
>> [ 4.534741] bus_add_driver+0x1d0/0x290
>> [ 4.534746] driver_register+0x60/0xe0
>> [ 4.534751] __pci_register_driver+0x5d/0x60
>> [ 4.534756] i915_init+0x59/0x5c
>> [ 4.534760] ? mipi_dsi_bus_init+0x17/0x17
>> [ 4.534763] do_one_initcall+0x42/0x180
>> [ 4.534769] kernel_init_freeable+0x17c/0x202
>> [ 4.534773] ? set_debug_rodata+0x17/0x17
>> [ 4.534777] ?
rest_init+0x90/0x90 >> [ 4.534781] kernel_init+0xe/0x110 >> [ 4.534784] ret_from_fork+0x25/0x30 >> [ 4.534791] INFO: Object 0xffff880260818340 @offset=832 >> [ 4.534792] INFO: Object 0xffff880260818680 @offset=1664 >> [ 4.534795] kmem_cache_destroy drm_i915_gem_object: Slab cache >> still has objects >> [ 4.534798] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G B W >> 4.12.0-eywa-46011-g9a19faf #360 >> [ 4.534800] Hardware name: Intel Corporation Cannonlake Client >> platform CNL - U0/CannonLake U DDR4 SODIMM RVP, BIOS >> CNLSFWR1.R00.X075.D01.1703021113 03/02 >> [ 4.534801] Call Trace: >> [ 4.534805] dump_stack+0x65/0x89 >> [ 4.534809] kmem_cache_destroy+0x1e1/0x1f0 >> [ 4.534814] i915_gem_load_cleanup+0xb4/0x120 >> [ 4.534819] i915_driver_cleanup_early+0x1a/0x50 >> [ 4.534824] i915_driver_load+0x6b8/0xe70 >> [ 4.534830] ? _raw_spin_unlock_irqrestore+0x26/0x50 >> [ 4.534835] i915_pci_probe+0x2c/0x50 >> [ 4.534840] local_pci_probe+0x45/0xa0 >> [ 4.534844] ? pci_match_device+0xe0/0x110 >> [ 4.534850] pci_device_probe+0x135/0x150 >> [ 4.534856] driver_probe_device+0x288/0x490 >> [ 4.534862] __driver_attach+0xc9/0xf0 >> [ 4.534867] ? driver_probe_device+0x490/0x490 >> [ 4.534871] bus_for_each_dev+0x5d/0x90 >> [ 4.534877] driver_attach+0x1e/0x20 >> [ 4.534882] bus_add_driver+0x1d0/0x290 >> [ 4.534888] driver_register+0x60/0xe0 >> [ 4.534893] __pci_register_driver+0x5d/0x60 >> [ 4.534897] i915_init+0x59/0x5c >> [ 4.534901] ? mipi_dsi_bus_init+0x17/0x17 >> [ 4.534904] do_one_initcall+0x42/0x180 >> [ 4.534910] kernel_init_freeable+0x17c/0x202 >> [ 4.534914] ? set_debug_rodata+0x17/0x17 >> [ 4.534917] ? 
rest_init+0x90/0x90
>> [ 4.534922] kernel_init+0xe/0x110
>> [ 4.534925] ret_from_fork+0x25/0x30
>> [ 4.535386] i915 0000:00:02.0: [drm:i915_driver_load] Device
>> initialization failed (-22)
>> [ 4.535390] i915 0000:00:02.0: Please file a bug at
>> https://bugs.freedesktop.org/enter_bug.cgi?product=DRI against
>> DRM/Intel providing the dmesg log by booting with drm.debug=0xf
>> [ 4.535450] i915: probe of 0000:00:02.0 failed with error -22
>>
>>
>> On Fri, Apr 28, 2017 at 7:36 AM, Oscar Mateo <oscar.mateo@intel.com>
>> wrote:
>>>
>>> This batchbuffer is over 4096 bytes, so we need to increase the size of
>>> the array (and the KMD has to be modified to deal with more than one
>>> page).
>>>
>>> Notice that there are two workarounds embedded here, both applicable to
>>> all CNL steppings.
>>>
>>> v2: WaPSRandomCSNotDone is not A0 only (as per the latest BSpec), so
>>> update the comment in the code and in the commit message.
>>>
>>> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
>>> Cc: Ben Widawsky <ben@bwidawsk.net>
>>> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
>>> ---
>>> lib/gen10_render.h | 63 +++
>>> tools/null_state_gen/Makefile.am | 3 +-
>>> tools/null_state_gen/intel_batchbuffer.h | 2 +-
>>> tools/null_state_gen/intel_null_state_gen.c | 5 +-
>>> tools/null_state_gen/intel_renderstate.h | 1 +
>>> tools/null_state_gen/intel_renderstate_gen10.c | 538
>>> +++++++++++++++++++++++++
>>> 6 files changed, 609 insertions(+), 3 deletions(-)
>>> create mode 100644 lib/gen10_render.h
>>> create mode 100644 tools/null_state_gen/intel_renderstate_gen10.c
>>>
>>> diff --git a/lib/gen10_render.h b/lib/gen10_render.h
>>> new file mode 100644
>>> index 0000000..f4a7dff
>>> --- /dev/null
>>> +++ b/lib/gen10_render.h
>>> @@ -0,0 +1,63 @@
>>> +#ifndef GEN10_RENDER_H
>>> +#define GEN10_RENDER_H
>>> +
>>> +#include "gen9_render.h"
>>> +
>>> +#define GEN7_MI_RS_CONTROL (0x6 << 23)
>>> +# define GEN7_MI_RS_CONTROL_ENABLE (1 << 0)
>>> +
>>> +#define
GEN10_3DSTATE_GATHER_POOL_ALLOC GEN6_3D(3, 1, >>> 0x1a) >>> +# define GEN10_3DSTATE_GATHER_POOL_ENABLE (1 << 11) >>> + >>> +#define GEN10_3DSTATE_GATHER_CONSTANT_VS GEN6_3D(3, 0, 0x34) >>> +#define GEN10_3DSTATE_GATHER_CONSTANT_HS GEN6_3D(3, 0, 0x36) >>> +#define GEN10_3DSTATE_GATHER_CONSTANT_DS GEN6_3D(3, 0, 0x37) >>> +#define GEN10_3DSTATE_GATHER_CONSTANT_GS GEN6_3D(3, 0, 0x35) >>> +#define GEN10_3DSTATE_GATHER_CONSTANT_PS GEN6_3D(3, 0, 0x38) >>> + >>> +#define GEN10_3DSTATE_WM_DEPTH_STENCIL GEN6_3D(3, 0, 0x4e) >>> +#define GEN10_3DSTATE_WM_CHROMAKEY GEN6_3D(3, 0, 0x4c) >>> + >>> +#define GEN8_REG_L3_CACHE_CONFIG 0x7034 >>> + >>> +/* >>> + * Programming for L3 cache allocations can be made per bank. Based on >>> the >>> + * programmed value HW will apply same allocations on other available >>> banks. >>> + * Total L3 Cache size per bank = 256 KB. >>> + * {SLM, URB, DC, RO(I/S, C, T), L3 Client Pool} >>> + * { 0, 96, 32, 128, 0 } >>> + */ >>> +#define GEN10_L3_CACHE_CONFIG_VALUE 0x00420060 >>> + >>> +#define URB_ALIGN(val, align) ((val % align) ? 
(val - (val % align)) : >>> val) >>> + >>> +#define GEN10_VS_MIN_NUM_OF_URB_ENTRIES 64 >>> +#define GEN10_VS_MAX_NUM_OF_URB_ENTRIES 2752 >>> + >>> +#define GEN10_KB_PER_URB_INDEX 8 >>> +#define GEN10_L3_URB_SIZE_PER_BANK_IN_KB 96 >>> + >>> +#define GEN10_URB_RESERVED_SIZE_KB 32 >>> +#define GEN10_URB_RESERVED_END_SIZE_KB 8 >>> + >>> +#define GEN10_VS_NUM_BITS_PER_URB_UNIT 512 >>> +#define GEN10_VS_NUM_OF_URB_UNITS 1 // zero based >>> +#define GEN10_VS_URB_ENTRY_SIZE_IN_BITS >>> (GEN10_VS_NUM_BITS_PER_URB_UNIT * \ >>> + >>> (GEN10_VS_NUM_OF_URB_UNITS + 1)) >>> + >>> +#define GEN10_VS_URB_START_INDEX (GEN10_URB_RESERVED_SIZE_KB / >>> GEN10_KB_PER_URB_INDEX) >>> + >>> +#define GEN10_URB_SIZE_PER_SLICE_KB(l3_bank_count, slice_count) >>> \ >>> + URB_ALIGN((uint32_t)(GEN10_L3_URB_SIZE_PER_BANK_IN_KB * >>> l3_bank_count / slice_count), GEN10_KB_PER_URB_INDEX) >>> + >>> +#define GEN10_VS_URB_SIZE_PER_SLICE_KB(total_urb_size_per_slice) \ >>> + (total_urb_size_per_slice - GEN10_URB_RESERVED_SIZE_KB - >>> GEN10_URB_RESERVED_END_SIZE_KB) >>> + >>> +#define GEN10_VS_NUM_URB_ENTRIES_PER_SLICE(total_urb_size_per_slice) \ >>> + ((GEN10_VS_URB_SIZE_PER_SLICE_KB(total_urb_size_per_slice) * \ >>> + 1024 * 8) / GEN10_VS_URB_ENTRY_SIZE_IN_BITS) >>> + >>> +#define GEN10_VS_END_URB_INDEX(urb_size_per_slice) \ >>> + ((urb_size_per_slice - GEN10_URB_RESERVED_END_SIZE_KB) / >>> GEN10_KB_PER_URB_INDEX) >>> + >>> +#endif >>> diff --git a/tools/null_state_gen/Makefile.am >>> b/tools/null_state_gen/Makefile.am >>> index 24884a7..2f90990 100644 >>> --- a/tools/null_state_gen/Makefile.am >>> +++ b/tools/null_state_gen/Makefile.am >>> @@ -12,9 +12,10 @@ intel_null_state_gen_SOURCES = \ >>> intel_renderstate_gen7.c \ >>> intel_renderstate_gen8.c \ >>> intel_renderstate_gen9.c \ >>> + intel_renderstate_gen10.c \ >>> intel_null_state_gen.c >>> >>> -gens := 6 7 8 9 >>> +gens := 6 7 8 9 10 >>> >>> h = /tmp/intel_renderstate_gen$$gen.c >>> states: intel_null_state_gen >>> diff --git 
a/tools/null_state_gen/intel_batchbuffer.h >>> b/tools/null_state_gen/intel_batchbuffer.h >>> index 771d1c8..e40e01b 100644 >>> --- a/tools/null_state_gen/intel_batchbuffer.h >>> +++ b/tools/null_state_gen/intel_batchbuffer.h >>> @@ -34,7 +34,7 @@ >>> #include <stdint.h> >>> >>> #define MAX_RELOCS 64 >>> -#define MAX_ITEMS 1024 >>> +#define MAX_ITEMS 2048 >>> #define MAX_STRLEN 256 >>> >>> #define ALIGN(x, y) (((x) + (y)-1) & ~((y)-1)) >>> diff --git a/tools/null_state_gen/intel_null_state_gen.c >>> b/tools/null_state_gen/intel_null_state_gen.c >>> index 06eb954..4f12f5f 100644 >>> --- a/tools/null_state_gen/intel_null_state_gen.c >>> +++ b/tools/null_state_gen/intel_null_state_gen.c >>> @@ -41,7 +41,7 @@ static int debug = 0; >>> static void print_usage(char *s) >>> { >>> fprintf(stderr, "%s: <gen>\n" >>> - " gen: gen to generate for (6,7,8,9)\n", >>> + " gen: gen to generate for (6,7,8,9,10)\n", >>> s); >>> } >>> >>> @@ -173,6 +173,9 @@ static int do_generate(int gen) >>> case 9: >>> null_state_gen = gen9_setup_null_render_state; >>> break; >>> + case 10: >>> + null_state_gen = gen10_setup_null_render_state; >>> + break; >>> } >>> >>> if (null_state_gen == NULL) { >>> diff --git a/tools/null_state_gen/intel_renderstate.h >>> b/tools/null_state_gen/intel_renderstate.h >>> index b27b434..b3c8c2b 100644 >>> --- a/tools/null_state_gen/intel_renderstate.h >>> +++ b/tools/null_state_gen/intel_renderstate.h >>> @@ -30,5 +30,6 @@ void gen6_setup_null_render_state(struct >>> intel_batchbuffer *batch); >>> void gen7_setup_null_render_state(struct intel_batchbuffer *batch); >>> void gen8_setup_null_render_state(struct intel_batchbuffer *batch); >>> void gen9_setup_null_render_state(struct intel_batchbuffer *batch); >>> +void gen10_setup_null_render_state(struct intel_batchbuffer *batch); >>> >>> #endif /* __INTEL_RENDERSTATE_H__ */ >>> diff --git a/tools/null_state_gen/intel_renderstate_gen10.c >>> b/tools/null_state_gen/intel_renderstate_gen10.c >>> new file mode 100644 
>>> index 0000000..f5678c3 >>> --- /dev/null >>> +++ b/tools/null_state_gen/intel_renderstate_gen10.c >>> @@ -0,0 +1,538 @@ >>> +/* >>> + * Copyright © 2014 Intel Corporation >>> + * >>> + * Permission is hereby granted, free of charge, to any person obtaining >>> a >>> + * copy of this software and associated documentation files (the >>> "Software"), >>> + * to deal in the Software without restriction, including without >>> limitation >>> + * the rights to use, copy, modify, merge, publish, distribute, >>> sublicense, >>> + * and/or sell copies of the Software, and to permit persons to whom the >>> + * Software is furnished to do so, subject to the following conditions: >>> + * >>> + * The above copyright notice and this permission notice (including the >>> next >>> + * paragraph) shall be included in all copies or substantial portions of >>> the >>> + * Software. >>> + * >>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, >>> EXPRESS OR >>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF >>> MERCHANTABILITY, >>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT >>> SHALL >>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR >>> OTHER >>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, >>> ARISING >>> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER >>> + * DEALINGS IN THE SOFTWARE. 
>>> + * >>> + * Authors: >>> + * Oscar Mateo <oscar.mateo@intel.com> >>> + */ >>> + >>> +#include "intel_renderstate.h" >>> +#include <lib/gen10_render.h> >>> +#include <lib/intel_reg.h> >>> + >>> +static void gen8_emit_wm(struct intel_batchbuffer *batch) >>> +{ >>> + OUT_BATCH(GEN6_3DSTATE_WM | (2 - 2)); >>> + OUT_BATCH(GEN7_WM_LEGACY_DIAMOND_LINE_RASTERIZATION); >>> +} >>> + >>> +static void gen8_emit_ps(struct intel_batchbuffer *batch) >>> +{ >>> + OUT_BATCH(GEN7_3DSTATE_PS | (12 - 2)); >>> + OUT_BATCH(0); >>> + OUT_BATCH(0); /* kernel hi */ >>> + OUT_BATCH(GEN7_PS_SPF_MODE); >>> + OUT_BATCH(0); /* scratch space stuff */ >>> + OUT_BATCH(0); /* scratch hi */ >>> + OUT_BATCH(0); >>> + OUT_BATCH(0); >>> + OUT_BATCH(0); // kernel 1 >>> + OUT_BATCH(0); /* kernel 1 hi */ >>> + OUT_BATCH(0); // kernel 2 >>> + OUT_BATCH(0); /* kernel 2 hi */ >>> +} >>> + >>> +static void gen8_emit_sf(struct intel_batchbuffer *batch) >>> +{ >>> + OUT_BATCH(GEN6_3DSTATE_SF | (4 - 2)); >>> + OUT_BATCH(0); >>> + OUT_BATCH(0); >>> + OUT_BATCH(1 << GEN6_3DSTATE_SF_TRIFAN_PROVOKE_SHIFT | >>> + 1 << GEN6_3DSTATE_SF_VERTEX_SUB_PIXEL_PRECISION_SHIFT | >>> + GEN7_SF_POINT_WIDTH_FROM_SOURCE | >>> + 8); >>> +} >>> + >>> +static void gen8_emit_vs(struct intel_batchbuffer *batch) >>> +{ >>> + OUT_BATCH(GEN6_3DSTATE_VS | (9 - 2)); >>> + OUT_BATCH(0); >>> + OUT_BATCH(0); >>> + OUT_BATCH(GEN7_VS_FLOATING_POINT_MODE_ALTERNATE); >>> + OUT_BATCH(0); >>> + OUT_BATCH(0); >>> + OUT_BATCH(0); >>> + OUT_BATCH(0); >>> + OUT_BATCH(0); >>> +} >>> + >>> +static void gen8_emit_hs(struct intel_batchbuffer *batch) >>> +{ >>> + OUT_BATCH(GEN7_3DSTATE_HS | (9 - 2)); >>> + OUT_BATCH(0); >>> + OUT_BATCH(0); >>> + OUT_BATCH(0); >>> + OUT_BATCH(0); >>> + OUT_BATCH(0); >>> + OUT_BATCH(0); >>> + OUT_BATCH(1 << GEN7_SBE_URB_ENTRY_READ_LENGTH_SHIFT); >>> + OUT_BATCH(0); >>> +} >>> + >>> +static void gen8_emit_raster(struct intel_batchbuffer *batch) >>> +{ >>> + OUT_BATCH(GEN8_3DSTATE_RASTER | (5 - 2)); >>> + 
OUT_BATCH(GEN8_RASTER_CULL_NONE | GEN8_RASTER_FRONT_WINDING_CCW); >>> + OUT_BATCH(0.0); >>> + OUT_BATCH(0.0); >>> + OUT_BATCH(0.0); >>> +} >>> + >>> +static void gen10_emit_urb(struct intel_batchbuffer *batch) >>> +{ >>> + /* Smallest SKU: 3x8*/ >>> + int l3_bank_count = 3; >>> + int slice_count = 1; >>> + int urb_size_per_slice = >>> GEN10_URB_SIZE_PER_SLICE_KB(l3_bank_count, slice_count); >>> + int other_urb_start_addr = >>> GEN10_VS_END_URB_INDEX(urb_size_per_slice); >>> + const int vs_urb_start_addr = GEN10_VS_URB_START_INDEX; >>> + const int vs_urb_alloc_size = GEN10_VS_NUM_OF_URB_UNITS; >>> + int vs_urb_entries = >>> GEN10_VS_NUM_URB_ENTRIES_PER_SLICE(urb_size_per_slice); >>> + >>> + if (vs_urb_entries < GEN10_VS_MIN_NUM_OF_URB_ENTRIES) >>> + vs_urb_entries = GEN10_VS_MIN_NUM_OF_URB_ENTRIES; >>> + if (vs_urb_entries > GEN10_VS_MAX_NUM_OF_URB_ENTRIES) >>> + vs_urb_entries = GEN10_VS_MAX_NUM_OF_URB_ENTRIES; >>> + >>> + OUT_BATCH(GEN7_3DSTATE_URB_VS); >>> + OUT_BATCH(vs_urb_entries | >>> + (vs_urb_alloc_size << 16) | >>> + (vs_urb_start_addr << 25)); >>> + >>> + OUT_BATCH(GEN7_3DSTATE_URB_HS); >>> + OUT_BATCH(other_urb_start_addr << 25); >>> + >>> + OUT_BATCH(GEN7_3DSTATE_URB_DS); >>> + OUT_BATCH(other_urb_start_addr << 25); >>> + >>> + OUT_BATCH(GEN7_3DSTATE_URB_GS); >>> + OUT_BATCH(other_urb_start_addr << 25); >>> +} >>> + >>> +static void gen8_emit_vf_topology(struct intel_batchbuffer *batch) >>> +{ >>> + OUT_BATCH(GEN8_3DSTATE_VF_TOPOLOGY); >>> + OUT_BATCH(_3DPRIM_TRILIST); >>> +} >>> + >>> +static void gen8_emit_so_decl_list(struct intel_batchbuffer *batch) >>> +{ >>> + const int num_decls = 128; >>> + int i; >>> + >>> + OUT_BATCH(GEN8_3DSTATE_SO_DECL_LIST | >>> + (((2 * num_decls) + 3) - 2) /* DWORD count - 2 */); >>> + OUT_BATCH(0); >>> + OUT_BATCH(num_decls); >>> + >>> + for (i = 0; i < num_decls; i++) { >>> + OUT_BATCH(0); >>> + OUT_BATCH(0); >>> + } >>> +} >>> + >>> +static void gen8_emit_so_buffer(struct intel_batchbuffer *batch, const >>> int index) 
>>> +{ >>> + OUT_BATCH(GEN8_3DSTATE_SO_BUFFER | (8 - 2)); >>> + OUT_BATCH(index << 29); >>> + OUT_BATCH(0); >>> + OUT_BATCH(0); >>> + OUT_BATCH(0); >>> + OUT_BATCH(0); >>> + OUT_BATCH(0); >>> + OUT_BATCH(0); >>> +} >>> + >>> +static void gen8_emit_chroma_key(struct intel_batchbuffer *batch, const >>> int index) >>> +{ >>> + OUT_BATCH(GEN6_3DSTATE_CHROMA_KEY | (4 - 2)); >>> + OUT_BATCH(index << 30); >>> + OUT_BATCH(0); >>> + OUT_BATCH(0); >>> +} >>> + >>> +static void gen8_emit_vertex_buffers(struct intel_batchbuffer *batch) >>> +{ >>> + const int buffers = 33; >>> + int i; >>> + >>> + OUT_BATCH(GEN6_3DSTATE_VERTEX_BUFFERS | >>> + (((4 * buffers) + 1)- 2) /* DWORD count - 2 */); >>> + >>> + for (i = 0; i < buffers; i++) { >>> + OUT_BATCH(i << VB0_BUFFER_INDEX_SHIFT | >>> + GEN7_VB0_BUFFER_ADDR_MOD_EN); >>> + OUT_BATCH(0); /* Address */ >>> + OUT_BATCH(0); >>> + OUT_BATCH(0); >>> + } >>> +} >>> + >>> +static void gen8_emit_vertex_elements(struct intel_batchbuffer *batch) >>> +{ >>> + const int elements = 34; >>> + int i; >>> + >>> + OUT_BATCH(GEN6_3DSTATE_VERTEX_ELEMENTS | >>> + (((2 * elements) + 1) - 2) /* DWORD count - 2 */); >>> + >>> + /* Element 0 */ >>> + OUT_BATCH(VE0_VALID); >>> + OUT_BATCH( >>> + GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_0_SHIFT | >>> + GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_1_SHIFT | >>> + GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_2_SHIFT | >>> + GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_3_SHIFT); >>> + /* Elements 1 -> 33 */ >>> + for (i = 1; i < elements; i++) { >>> + OUT_BATCH(0); >>> + OUT_BATCH(0); >>> + } >>> +} >>> + >>> +static void gen8_emit_cc_state_pointers(struct intel_batchbuffer *batch) >>> +{ >>> + union { >>> + float fval; >>> + uint32_t uval; >>> + } u; >>> + >>> + unsigned offset; >>> + >>> + u.fval = 1.0f; >>> + >>> + offset = intel_batch_state_offset(batch, 64); >>> + OUT_STATE(0); >>> + OUT_STATE(0); /* Alpha reference value */ >>> + OUT_STATE(u.uval); /* Blend constant color RED */ >>> + OUT_STATE(u.uval); 
/* Blend constant color BLUE */ >>> + OUT_STATE(u.uval); /* Blend constant color GREEN */ >>> + OUT_STATE(u.uval); /* Blend constant color ALPHA */ >>> + >>> + OUT_BATCH(GEN6_3DSTATE_CC_STATE_POINTERS); >>> + OUT_BATCH_STATE_OFFSET(offset | 1); >>> +} >>> + >>> +static void gen8_emit_blend_state_pointers(struct intel_batchbuffer >>> *batch) >>> +{ >>> + unsigned offset; >>> + int i; >>> + >>> + offset = intel_batch_state_offset(batch, 64); >>> + >>> + for (i = 0; i < 17; i++) >>> + OUT_STATE(0); >>> + >>> + OUT_BATCH(GEN7_3DSTATE_BLEND_STATE_POINTERS | (2 - 2)); >>> + OUT_BATCH_STATE_OFFSET(offset | 1); >>> +} >>> + >>> +static void gen8_emit_ps_extra(struct intel_batchbuffer *batch) >>> +{ >>> + OUT_BATCH(GEN8_3DSTATE_PS_EXTRA | (2 - 2)); >>> + OUT_BATCH(GEN8_PSX_PIXEL_SHADER_VALID | >>> + GEN8_PSX_ATTRIBUTE_ENABLE); >>> + >>> +} >>> + >>> +static void gen8_emit_ps_blend(struct intel_batchbuffer *batch) >>> +{ >>> + OUT_BATCH(GEN8_3DSTATE_PS_BLEND | (2 - 2)); >>> + OUT_BATCH(GEN8_PS_BLEND_HAS_WRITEABLE_RT); >>> +} >>> + >>> +static void gen8_emit_viewport_state_pointers_cc(struct >>> intel_batchbuffer *batch) >>> +{ >>> + unsigned offset; >>> + >>> + offset = intel_batch_state_offset(batch, 32); >>> + >>> + OUT_STATE((uint32_t)0.0f); /* Minimum depth */ >>> + OUT_STATE((uint32_t)0.0f); /* Maximum depth */ >>> + >>> + OUT_BATCH(GEN7_3DSTATE_VIEWPORT_STATE_POINTERS_CC | (2 - 2)); >>> + OUT_BATCH_STATE_OFFSET(offset); >>> +} >>> + >>> +static void gen8_emit_viewport_state_pointers_sf_clip(struct >>> intel_batchbuffer *batch) >>> +{ >>> + unsigned offset; >>> + int i; >>> + >>> + offset = intel_batch_state_offset(batch, 64); >>> + >>> + for (i = 0; i < 16; i++) >>> + OUT_STATE(0); >>> + >>> + OUT_BATCH(GEN7_3DSTATE_VIEWPORT_STATE_POINTERS_SF_CLIP | (2 - >>> 2)); >>> + OUT_BATCH_STATE_OFFSET(offset); >>> +} >>> + >>> +static void gen8_emit_primitive(struct intel_batchbuffer *batch) >>> +{ >>> + OUT_BATCH(GEN6_3DPRIMITIVE | (10-2)); >>> + OUT_BATCH(4); /* gen8+ ignore 
the topology type field */ >>> + OUT_BATCH(1); /* vertex count */ >>> + OUT_BATCH(0); >>> + OUT_BATCH(1); /* single instance */ >>> + OUT_BATCH(0); /* start instance location */ >>> + OUT_BATCH(0); /* index buffer offset, ignored */ >>> + OUT_BATCH(0); /* extended parameter 0 */ >>> + OUT_BATCH(0); /* extended parameter 1 */ >>> + OUT_BATCH(0); /* extended parameter 2 */ >>> +} >>> + >>> +static void gen9_emit_state_base_address(struct intel_batchbuffer >>> *batch) { >>> + const unsigned offset = 0; >>> + OUT_BATCH(GEN6_STATE_BASE_ADDRESS | >>> + (22 - 2) /* DWORD count - 2 */); >>> + >>> + /* general state base address - requires BB address >>> + * added to state offset to be stored in this location >>> + */ >>> + OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY); >>> + OUT_BATCH(0); >>> + >>> + /* stateless data port */ >>> + OUT_BATCH(0); >>> + >>> + /* surface state base address - requires BB address >>> + * added to state offset to be stored in this location >>> + */ >>> + OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY); >>> + OUT_BATCH(0); >>> + >>> + /* dynamic state base address - requires BB address >>> + * added to state offset to be stored in this location >>> + */ >>> + OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY); >>> + OUT_BATCH(0); >>> + >>> + /* indirect state base address */ >>> + OUT_BATCH(BASE_ADDRESS_MODIFY); >>> + OUT_BATCH(0); >>> + >>> + /* instruction state base address - requires BB address >>> + * added to state offset to be stored in this location >>> + */ >>> + OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY); >>> + OUT_BATCH(0); >>> + >>> + /* general state buffer size */ >>> + OUT_BATCH(GEN8_STATE_SIZE_PAGES(1) | BUFFER_SIZE_MODIFY); >>> + /* dynamic state buffer size */ >>> + OUT_BATCH(GEN8_STATE_SIZE_PAGES(1) | BUFFER_SIZE_MODIFY); >>> + /* indirect object buffer size */ >>> + OUT_BATCH(0x0 | BUFFER_SIZE_MODIFY); >>> + /* intruction buffer size */ >>> + OUT_BATCH(GEN8_STATE_SIZE_PAGES(1) | BUFFER_SIZE_MODIFY); >>> 
+ >>> + /* bindless surface state base address */ >>> + OUT_BATCH(BASE_ADDRESS_MODIFY); >>> + OUT_BATCH(0); >>> + /* bindless surface state size */ >>> + OUT_BATCH(0); >>> + >>> + /* bindless sampler state base address */ >>> + OUT_BATCH(BASE_ADDRESS_MODIFY); >>> + OUT_BATCH(0); >>> + /* bindless sampler state size */ >>> + OUT_BATCH(0); >>> +} >>> + >>> +/* >>> + * Generate the batch buffer commands needed to initialize the 3D engine >>> + * to its "golden state". >>> + */ >>> +void gen10_setup_null_render_state(struct intel_batchbuffer *batch) >>> +{ >>> + int i; >>> + >>> + /* WaRsGatherPoolEnable: cnl */ >>> + OUT_BATCH(GEN7_MI_RS_CONTROL); >>> + >>> +#define GEN8_PIPE_CONTROL_GLOBAL_GTT (1 << 24) >>> + /* PIPE_CONTROL */ >>> + OUT_BATCH(GEN6_PIPE_CONTROL | >>> + (6 - 2)); /* DWORD count - 2 */ >>> + OUT_BATCH(GEN8_PIPE_CONTROL_GLOBAL_GTT); >>> + OUT_BATCH(0); >>> + OUT_BATCH(0); >>> + OUT_BATCH(0); >>> + OUT_BATCH(0); >>> + >>> + /* PIPELINE_SELECT */ >>> + OUT_BATCH(GEN9_PIPELINE_SELECT | PIPELINE_SELECT_3D); >>> + >>> + OUT_BATCH(MI_LOAD_REGISTER_IMM); >>> + OUT_BATCH(GEN8_REG_L3_CACHE_CONFIG); >>> + OUT_BATCH(GEN10_L3_CACHE_CONFIG_VALUE); >>> + >>> + gen8_emit_wm(batch); >>> + gen8_emit_ps(batch); >>> + gen8_emit_sf(batch); >>> + >>> + OUT_CMD(GEN7_3DSTATE_SBE, 6); /* Check w/ Gen8 code */ >>> + OUT_CMD(GEN8_3DSTATE_SBE_SWIZ, 11); >>> + >>> + gen8_emit_vs(batch); >>> + gen8_emit_hs(batch); >>> + >>> + OUT_CMD(GEN7_3DSTATE_GS, 10); >>> + OUT_CMD(GEN7_3DSTATE_STREAMOUT, 5); >>> + OUT_CMD(GEN7_3DSTATE_DS, 11); /* Check w/ Gen8 code */ >>> + OUT_CMD(GEN6_3DSTATE_CLIP, 4); >>> + OUT_CMD(GEN7_3DSTATE_TE, 4); >>> + OUT_CMD(GEN8_3DSTATE_VF, 2); >>> + OUT_CMD(GEN8_3DSTATE_WM_HZ_OP, 5); >>> + >>> + /* URB States */ >>> + gen10_emit_urb(batch); >>> + >>> + OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_VS, 130); >>> + OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_HS, 130); >>> + OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_DS, 130); >>> + OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_GS, 130); >>> + 
OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_PS, 130); >>> + >>> + OUT_CMD(GEN8_3DSTATE_BIND_TABLE_POOL_ALLOC, 4); >>> + OUT_CMD(GEN8_3DSTATE_GATHER_POOL_ALLOC, 4); >>> + OUT_CMD(GEN8_3DSTATE_DX9_CONSTANT_BUFFER_POOL_ALLOC, 4); >>> + >>> + /* Push Constants */ >>> + OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_VS, 2); >>> + OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_HS, 2); >>> + OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_DS, 2); >>> + OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_GS, 2); >>> + OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_PS, 2); >>> + >>> + /* Constants */ >>> + OUT_CMD(GEN6_3DSTATE_CONSTANT_VS, 11); >>> + OUT_CMD(GEN7_3DSTATE_CONSTANT_HS, 11); >>> + OUT_CMD(GEN7_3DSTATE_CONSTANT_DS, 11); >>> + OUT_CMD(GEN7_3DSTATE_CONSTANT_GS, 11); >>> + OUT_CMD(GEN7_3DSTATE_CONSTANT_PS, 11); >>> + >>> + OUT_CMD(GEN8_3DSTATE_VF_INSTANCING, 3); >>> + OUT_CMD(GEN8_3DSTATE_VF_SGVS, 2); >>> + gen8_emit_vf_topology(batch); >>> + >>> + /* Streamer out declaration list */ >>> + gen8_emit_so_decl_list(batch); >>> + >>> + /* Streamer out buffers */ >>> + for (i = 0; i < 4; i++) { >>> + gen8_emit_so_buffer(batch, i); >>> + } >>> + >>> + /* State base addresses */ >>> + gen9_emit_state_base_address(batch); >>> + >>> + OUT_CMD(GEN6_STATE_SIP, 3); >>> + OUT_CMD(GEN6_3DSTATE_DRAWING_RECTANGLE, 4); >>> + OUT_CMD(GEN7_3DSTATE_DEPTH_BUFFER, 8); >>> + >>> + /* Chroma key */ >>> + for (i = 0; i < 4; i++) { >>> + gen8_emit_chroma_key(batch, i); >>> + } >>> + >>> + OUT_CMD(GEN6_3DSTATE_LINE_STIPPLE, 3); >>> + OUT_CMD(GEN6_3DSTATE_AA_LINE_PARAMS, 3); >>> + OUT_CMD(GEN7_3DSTATE_STENCIL_BUFFER, 5); >>> + OUT_CMD(GEN7_3DSTATE_HIER_DEPTH_BUFFER, 5); >>> + OUT_CMD(GEN7_3DSTATE_CLEAR_PARAMS, 3); >>> + OUT_CMD(GEN6_3DSTATE_MONOFILTER_SIZE, 2); >>> + >>> + /* WaPSRandomCSNotDone:cnl */ >>> +#define GEN8_PIPE_CONTROL_STALL_ENABLE (1 << 20) >>> + OUT_BATCH(GEN6_PIPE_CONTROL | (6 - 2)); >>> + OUT_BATCH(GEN8_PIPE_CONTROL_STALL_ENABLE); >>> + OUT_BATCH(0); >>> + OUT_BATCH(0); >>> + OUT_BATCH(0); >>> + OUT_BATCH(0); >>> + 
>>> + OUT_CMD(GEN8_3DSTATE_MULTISAMPLE, 2); >>> + OUT_CMD(GEN8_3DSTATE_POLY_STIPPLE_OFFSET, 2); >>> + OUT_CMD(GEN8_3DSTATE_POLY_STIPPLE_PATTERN, 1 + 32); >>> + OUT_CMD(GEN8_3DSTATE_SAMPLER_PALETTE_LOAD0, 1 + 16); >>> + OUT_CMD(GEN8_3DSTATE_SAMPLER_PALETTE_LOAD1, 1 + 16); >>> + OUT_CMD(GEN6_3DSTATE_INDEX_BUFFER, 5); >>> + >>> + /* Vertex buffers */ >>> + gen8_emit_vertex_buffers(batch); >>> + gen8_emit_vertex_elements(batch); >>> + >>> + OUT_BATCH(GEN6_3DSTATE_VF_STATISTICS | 1 /* Enable */); >>> + >>> + /* 3D state binding table pointers */ >>> + OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_VS, 2); >>> + OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_HS, 2); >>> + OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_DS, 2); >>> + OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_GS, 2); >>> + OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_PS, 2); >>> + >>> + gen8_emit_cc_state_pointers(batch); >>> + gen8_emit_blend_state_pointers(batch); >>> + gen8_emit_ps_extra(batch); >>> + gen8_emit_ps_blend(batch); >>> + >>> + /* 3D state sampler state pointers */ >>> + OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_VS, 2); >>> + OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_HS, 2); >>> + OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_DS, 2); >>> + OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_GS, 2); >>> + OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_PS, 2); >>> + >>> + OUT_CMD(GEN6_3DSTATE_SCISSOR_STATE_POINTERS, 2); >>> + >>> + gen8_emit_viewport_state_pointers_cc(batch); >>> + gen8_emit_viewport_state_pointers_sf_clip(batch); >>> + >>> + /* WaPSRandomCSNotDone:cnl */ >>> +#define GEN8_PIPE_CONTROL_STALL_ENABLE (1 << 20) >>> + OUT_BATCH(GEN6_PIPE_CONTROL | (6 - 2)); >>> + OUT_BATCH(GEN8_PIPE_CONTROL_STALL_ENABLE); >>> + OUT_BATCH(0); >>> + OUT_BATCH(0); >>> + OUT_BATCH(0); >>> + OUT_BATCH(0); >>> + >>> + gen8_emit_raster(batch); >>> + >>> + OUT_CMD(GEN10_3DSTATE_WM_DEPTH_STENCIL, 4); >>> + OUT_CMD(GEN10_3DSTATE_WM_CHROMAKEY, 2); >>> + >>> + /* Launch 3D operation */ >>> + gen8_emit_primitive(batch); >>> + 
>>> + /* WaRsGatherPoolEnable: cnl */ >>> + OUT_BATCH(GEN7_MI_RS_CONTROL | GEN7_MI_RS_CONTROL_ENABLE); >>> + OUT_BATCH(GEN10_3DSTATE_GATHER_POOL_ALLOC | (4 - 2)); >>> + OUT_BATCH(GEN10_3DSTATE_GATHER_POOL_ENABLE); >>> + OUT_BATCH(0); >>> + OUT_BATCH(0xfffff << 12); >>> + OUT_BATCH(GEN7_MI_RS_CONTROL); >>> + OUT_CMD(GEN10_3DSTATE_GATHER_POOL_ALLOC, 4); >>> + >>> + OUT_BATCH(MI_BATCH_BUFFER_END); >>> +} >>> -- >>> 1.9.1 >>> >>> _______________________________________________ >>> Intel-gfx mailing list >>> Intel-gfx@lists.freedesktop.org >>> https://lists.freedesktop.org/mailman/listinfo/intel-gfx >> >> >> >
So, you were right: with your patches (v2 of that igt) and "drm/i915: Allow null render state batchbuffers bigger than one page" in place, everything works...

However, it seems that patch is kind of NACKed for now... We first need to get a solution there before continuing with these patches here...

On Wed, Jul 12, 2017 at 2:03 PM, Rodrigo Vivi <rodrigo.vivi@gmail.com> wrote:
> On Wed, Jul 12, 2017 at 1:42 PM, Oscar Mateo <oscar.mateo@intel.com> wrote:
>>
>>
>> On 07/05/2017 05:50 PM, Rodrigo Vivi wrote:
>>>
>>> Hi Oscar,
>>
>>
>> Hey!
>>
>>> I had missed this patch here, but noticed now that I was refreshing
>>> and testing more cnl tests before re-submitting them.
>>>
>>> First of all I believe we need to remove the A0 w/a. I don't believe
>>> we will ever see one. So I'm removing all A0 exclusive W/a from the
>>> patches as well.
>>
>>
>> Be careful: I think both WAs in the patch are for all steppings (one was
>> incorrectly marked as A0 only in v1 of this patch).
>
> ah cool, so v2 is right...
>
>>
>>> I also gave a try here on your null state. However if I use the golden
>>> state generated by this version I get a blank screen because driver
>>> load fails with some strange faults:
>>
>>
>> Good. I don't have a CNL so it was only compile-tested.
>>
>>> any idea?
>>
>>
>> Did you also include the i915 patch to allow golden BBs over one page in
>> size? I sent it separately as "drm/i915: Allow null render state
>> batchbuffers bigger than one page". BTW: this patch was given a cold
>> shoulder in the mailing list, since I could not re-justify why null state
>> was needed in the first place (since UMD needs to configure the 3D pipeline
>> first thing anyway). I am still trying to get a better explanation from HW
>> people.
>
> hmmmm no... I missed that patch... sorry...
>
> I'm currently without access to CNL, but as soon as I have I will test
> it and if that works I will just merge igt one, review your kernel
> one, etc...
> >> >> -- Oscar >> >>> [ 4.115243] Memory manager not clean during takedown. >>> >>> [ 4.120389] ------------[ cut here ]------------ >>> [ 4.125068] WARNING: CPU: 0 PID: 1 at drivers/gpu/drm/drm_mm.c:892 >>> drm_mm_takedown+0x25/0x30 >>> [ 4.133574] Modules linked in: >>> [ 4.136707] CPU: 0 PID: 1 Comm: swapper/0 Not tainted >>> 4.12.0-eywa-46011-g9a19faf #360 >>> [ 4.144650] Hardware name: Intel Corporation Cannonlake Client >>> platform CNL - U0/CannonLake U DDR4 SODIMM RVP, BIOS >>> CNLSFWR1.R00.X075.D01.1703021113 03/02 >>> [ 4.158500] task: ffff880264ab8000 task.stack: ffffc90000038000 >>> [ 4.164506] RIP: 0010:drm_mm_takedown+0x25/0x30 >>> [ 4.169104] RSP: 0000:ffffc9000003bc28 EFLAGS: 00010292 >>> [ 4.174409] RAX: 0000000000000029 RBX: ffff880260a54170 RCX: >>> ffffffff82468740 >>> [ 4.181654] RDX: 0000000000000001 RSI: 0000000000000082 RDI: >>> 00000000ffffffff >>> [ 4.188839] RBP: ffffc9000003bc28 R08: 00000000fffffffe R09: >>> 000000000000035a >>> [ 4.196028] R10: 0000000000000005 R11: 0000000000000000 R12: >>> ffff880260a50000 >>> [ 4.203215] R13: ffff880260a54348 R14: ffff880260a50070 R15: >>> ffff880262844a00 >>> [ 4.210402] FS: 0000000000000000(0000) GS:ffff88026dc00000(0000) >>> knlGS:0000000000000000 >>> [ 4.218541] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>> [ 4.224344] CR2: ffffc90000d58000 CR3: 000000000240b000 CR4: >>> 00000000007406f0 >>> [ 4.231529] DR0: 0000000000000000 DR1: 0000000000000000 DR2: >>> 0000000000000000 >>> [ 4.238716] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: >>> 0000000000000400 >>> [ 4.245900] PKRU: 00000000 >>> [ 4.248673] Call Trace: >>> [ 4.251193] i915_gem_cleanup_stolen+0x1f/0x30 >>> [ 4.255703] i915_ggtt_cleanup_hw+0xa4/0x170 >>> [ 4.260035] i915_driver_cleanup_hw+0x36/0x40 >>> [ 4.264455] i915_driver_load+0x6a0/0xe70 >>> [ 4.268535] ? _raw_spin_unlock_irqrestore+0x26/0x50 >>> [ 4.273560] i915_pci_probe+0x2c/0x50 >>> [ 4.277293] local_pci_probe+0x45/0xa0 >>> [ 4.281106] ? 
pci_match_device+0xe0/0x110 >>> [ 4.285265] pci_device_probe+0x135/0x150 >>> [ 4.289343] driver_probe_device+0x288/0x490 >>> [ 4.293676] __driver_attach+0xc9/0xf0 >>> [ 4.297490] ? driver_probe_device+0x490/0x490 >>> [ 4.301999] bus_for_each_dev+0x5d/0x90 >>> [ 4.305902] driver_attach+0x1e/0x20 >>> [ 4.309543] bus_add_driver+0x1d0/0x290 >>> [ 4.313442] driver_register+0x60/0xe0 >>> [ 4.317257] __pci_register_driver+0x5d/0x60 >>> [ 4.321652] i915_init+0x59/0x5c >>> [ 4.324944] ? mipi_dsi_bus_init+0x17/0x17 >>> [ 4.329103] do_one_initcall+0x42/0x180 >>> [ 4.333007] kernel_init_freeable+0x17c/0x202 >>> [ 4.337426] ? set_debug_rodata+0x17/0x17 >>> [ 4.341500] ? rest_init+0x90/0x90 >>> [ 4.344969] kernel_init+0xe/0x110 >>> [ 4.348438] ret_from_fork+0x25/0x30 >>> [ 4.352079] Code: 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 47 38 48 >>> 83 c7 38 48 39 c7 75 01 c3 55 48 c7 c7 70 ac 20 82 31 c0 48 89 e5 e8 >>> 6b 62 b7 ff <0f> ff 5d c3 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 >>> e5 41 >>> [ 4.371029] ---[ end trace 7d36c2dd72851315 ]--- >>> [ 4.381680] WARN_ON(dev_priv->mm.object_count) >>> [ 4.381698] ------------[ cut here ]------------ >>> [ 4.390921] WARNING: CPU: 0 PID: 1 at >>> drivers/gpu/drm/i915/i915_gem.c:4964 i915_gem_load_cleanup+0x10b/0x120 >>> [ 4.400797] Modules linked in: >>> [ 4.403927] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W >>> 4.12.0-eywa-46011-g9a19faf #360 >>> [ 4.413021] Hardware name: Intel Corporation Cannonlake Client >>> platform CNL - U0/CannonLake U DDR4 SODIMM RVP, BIOS >>> CNLSFWR1.R00.X075.D01.1703021113 03/02 >>> [ 4.426884] task: ffff880264ab8000 task.stack: ffffc90000038000 >>> [ 4.432865] RIP: 0010:i915_gem_load_cleanup+0x10b/0x120 >>> [ 4.438157] RSP: 0000:ffffc9000003bc58 EFLAGS: 00010292 >>> [ 4.443450] RAX: 0000000000000022 RBX: ffff880260a50000 RCX: >>> ffffffff82468740 >>> [ 4.450642] RDX: 0000000000000001 RSI: 0000000000000082 RDI: >>> 0000000000000202 >>> [ 4.457839] RBP: ffffc9000003bc68 R08: 0000000000000022 R09: >>> 
0000000000000389 >>> [ 4.465029] R10: 0000000000000000 R11: 0000000000000001 R12: >>> ffff880260a54678 >>> [ 4.472227] R13: ffff88026446c000 R14: ffff88026446c000 R15: >>> ffff880262844a00 >>> [ 4.479420] FS: 0000000000000000(0000) GS:ffff88026dc00000(0000) >>> knlGS:0000000000000000 >>> [ 4.487564] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>> [ 4.493370] CR2: ffffc90000d58000 CR3: 000000000240b000 CR4: >>> 00000000007406f0 >>> [ 4.500569] DR0: 0000000000000000 DR1: 0000000000000000 DR2: >>> 0000000000000000 >>> [ 4.507763] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: >>> 0000000000000400 >>> [ 4.514959] PKRU: 00000000 >>> [ 4.517737] Call Trace: >>> [ 4.520265] i915_driver_cleanup_early+0x1a/0x50 >>> [ 4.524955] i915_driver_load+0x6b8/0xe70 >>> [ 4.529038] ? _raw_spin_unlock_irqrestore+0x26/0x50 >>> [ 4.534100] clocksource: Switched to clocksource tsc >>> [ 4.534105] i915_pci_probe+0x2c/0x50 >>> [ 4.534113] local_pci_probe+0x45/0xa0 >>> [ 4.534118] ? pci_match_device+0xe0/0x110 >>> [ 4.534124] pci_device_probe+0x135/0x150 >>> [ 4.534131] driver_probe_device+0x288/0x490 >>> [ 4.534137] __driver_attach+0xc9/0xf0 >>> [ 4.534142] ? driver_probe_device+0x490/0x490 >>> [ 4.534146] bus_for_each_dev+0x5d/0x90 >>> [ 4.534152] driver_attach+0x1e/0x20 >>> [ 4.534156] bus_add_driver+0x1d0/0x290 >>> [ 4.534162] driver_register+0x60/0xe0 >>> [ 4.534167] __pci_register_driver+0x5d/0x60 >>> [ 4.534173] i915_init+0x59/0x5c >>> [ 4.534177] ? mipi_dsi_bus_init+0x17/0x17 >>> [ 4.534181] do_one_initcall+0x42/0x180 >>> [ 4.534187] kernel_init_freeable+0x17c/0x202 >>> [ 4.534191] ? set_debug_rodata+0x17/0x17 >>> [ 4.534196] ? 
rest_init+0x90/0x90 >>> [ 4.534200] kernel_init+0xe/0x110 >>> [ 4.534204] ret_from_fork+0x25/0x30 >>> [ 4.534208] Code: 82 48 c7 c7 7a 98 1a 82 31 c0 e8 21 4f b1 ff 0f >>> ff e9 7a ff ff ff 48 c7 c6 88 33 21 82 48 c7 c7 7a 98 1a 82 31 c0 e8 >>> 05 4f b1 ff <0f> ff e9 33 ff ff ff 66 66 66 66 66 2e 0f 1f 84 00 00 00 >>> 00 00 >>> [ 4.534272] ---[ end trace 7d36c2dd72851316 ]--- >>> [ 4.534277] WARN_ON(!list_empty(&dev_priv->gt.timelines)) >>> [ 4.534293] ------------[ cut here ]------------ >>> [ 4.534298] WARNING: CPU: 0 PID: 1 at >>> drivers/gpu/drm/i915/i915_gem.c:4968 i915_gem_load_cleanup+0xef/0x120 >>> [ 4.534299] Modules linked in: >>> [ 4.534304] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W >>> 4.12.0-eywa-46011-g9a19faf #360 >>> [ 4.534306] Hardware name: Intel Corporation Cannonlake Client >>> platform CNL - U0/CannonLake U DDR4 SODIMM RVP, BIOS >>> CNLSFWR1.R00.X075.D01.1703021113 03/02 >>> [ 4.534308] task: ffff880264ab8000 task.stack: ffffc90000038000 >>> [ 4.534312] RIP: 0010:i915_gem_load_cleanup+0xef/0x120 >>> [ 4.534314] RSP: 0000:ffffc9000003bc58 EFLAGS: 00010292 >>> [ 4.534317] RAX: 000000000000002d RBX: ffff880260a50000 RCX: >>> 0000000000000000 >>> [ 4.534319] RDX: 0000000000000001 RSI: 0000000000000002 RDI: >>> 0000000000000296 >>> [ 4.534321] RBP: ffffc9000003bc68 R08: 000000000000002d R09: >>> 000000000000002d >>> [ 4.534322] R10: 0000000000000000 R11: ffff880260a4e000 R12: >>> ffff880260a50070 >>> [ 4.534324] R13: ffff88026446c000 R14: ffff88026446c000 R15: >>> ffff880262844a00 >>> [ 4.534327] FS: 0000000000000000(0000) GS:ffff88026dc00000(0000) >>> knlGS:0000000000000000 >>> [ 4.534329] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>> [ 4.534331] CR2: ffffc90000d58000 CR3: 000000000240b000 CR4: >>> 00000000007406f0 >>> [ 4.534334] DR0: 0000000000000000 DR1: 0000000000000000 DR2: >>> 0000000000000000 >>> [ 4.534335] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: >>> 0000000000000400 >>> [ 4.534337] PKRU: 00000000 >>> [ 4.534338] Call 
Trace: >>> [ 4.534344] i915_driver_cleanup_early+0x1a/0x50 >>> [ 4.534350] i915_driver_load+0x6b8/0xe70 >>> [ 4.534356] ? _raw_spin_unlock_irqrestore+0x26/0x50 >>> [ 4.534361] i915_pci_probe+0x2c/0x50 >>> [ 4.534366] local_pci_probe+0x45/0xa0 >>> [ 4.534371] ? pci_match_device+0xe0/0x110 >>> [ 4.534376] pci_device_probe+0x135/0x150 >>> [ 4.534382] driver_probe_device+0x288/0x490 >>> [ 4.534388] __driver_attach+0xc9/0xf0 >>> [ 4.534393] ? driver_probe_device+0x490/0x490 >>> [ 4.534398] bus_for_each_dev+0x5d/0x90 >>> [ 4.534403] driver_attach+0x1e/0x20 >>> [ 4.534408] bus_add_driver+0x1d0/0x290 >>> [ 4.534414] driver_register+0x60/0xe0 >>> [ 4.534419] __pci_register_driver+0x5d/0x60 >>> [ 4.534424] i915_init+0x59/0x5c >>> [ 4.534428] ? mipi_dsi_bus_init+0x17/0x17 >>> [ 4.534431] do_one_initcall+0x42/0x180 >>> [ 4.534437] kernel_init_freeable+0x17c/0x202 >>> [ 4.534440] ? set_debug_rodata+0x17/0x17 >>> [ 4.534444] ? rest_init+0x90/0x90 >>> [ 4.534448] kernel_init+0xe/0x110 >>> [ 4.534451] ret_from_fork+0x25/0x30 >>> [ 4.534455] Code: 82 48 c7 c7 7a 98 1a 82 31 c0 e8 3d 4f b1 ff 0f >>> ff e9 5d ff ff ff 48 c7 c6 b0 33 21 82 48 c7 c7 7a 98 1a 82 31 c0 e8 >>> 21 4f b1 ff <0f> ff e9 7a ff ff ff 48 c7 c6 88 33 21 82 48 c7 c7 7a 98 >>> 1a 82 >>> [ 4.534519] ---[ end trace 7d36c2dd72851317 ]--- >>> [ 4.534605] >>> ============================================================================= >>> [ 4.534608] BUG drm_i915_gem_object (Tainted: G W ): >>> Objects remaining in drm_i915_gem_object on __kmem_cache_shutdown() >>> [ 4.534609] >>> ----------------------------------------------------------------------------- >>> >>> [ 4.534611] Disabling lock debugging due to kernel taint >>> [ 4.534614] INFO: Slab 0xffffea0009820600 objects=19 used=2 >>> fp=0xffff88026081ba80 flags=0x200000000008100 >>> [ 4.534618] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G B W >>> 4.12.0-eywa-46011-g9a19faf #360 >>> [ 4.534620] Hardware name: Intel Corporation Cannonlake Client >>> platform CNL - 
U0/CannonLake U DDR4 SODIMM RVP, BIOS >>> CNLSFWR1.R00.X075.D01.1703021113 03/02 >>> [ 4.534621] Call Trace: >>> [ 4.534626] dump_stack+0x65/0x89 >>> [ 4.534633] slab_err+0xa1/0xb0 >>> [ 4.534640] ? __kmalloc+0x185/0x270 >>> [ 4.534645] ? kmem_cache_alloc_bulk+0x1f0/0x1f0 >>> [ 4.534650] ? __kmem_cache_shutdown+0x160/0x400 >>> [ 4.534655] __kmem_cache_shutdown+0x180/0x400 >>> [ 4.534663] shutdown_cache+0x18/0x1a0 >>> [ 4.534667] kmem_cache_destroy+0x1c1/0x1f0 >>> [ 4.534672] i915_gem_load_cleanup+0xb4/0x120 >>> [ 4.534677] i915_driver_cleanup_early+0x1a/0x50 >>> [ 4.534682] i915_driver_load+0x6b8/0xe70 >>> [ 4.534689] ? _raw_spin_unlock_irqrestore+0x26/0x50 >>> [ 4.534693] i915_pci_probe+0x2c/0x50 >>> [ 4.534698] local_pci_probe+0x45/0xa0 >>> [ 4.534703] ? pci_match_device+0xe0/0x110 >>> [ 4.534708] pci_device_probe+0x135/0x150 >>> [ 4.534714] driver_probe_device+0x288/0x490 >>> [ 4.534721] __driver_attach+0xc9/0xf0 >>> [ 4.534726] ? driver_probe_device+0x490/0x490 >>> [ 4.534730] bus_for_each_dev+0x5d/0x90 >>> [ 4.534736] driver_attach+0x1e/0x20 >>> [ 4.534741] bus_add_driver+0x1d0/0x290 >>> [ 4.534746] driver_register+0x60/0xe0 >>> [ 4.534751] __pci_register_driver+0x5d/0x60 >>> [ 4.534756] i915_init+0x59/0x5c >>> [ 4.534760] ? mipi_dsi_bus_init+0x17/0x17 >>> [ 4.534763] do_one_initcall+0x42/0x180 >>> [ 4.534769] kernel_init_freeable+0x17c/0x202 >>> [ 4.534773] ? set_debug_rodata+0x17/0x17 >>> [ 4.534777] ? 
rest_init+0x90/0x90 >>> [ 4.534781] kernel_init+0xe/0x110 >>> [ 4.534784] ret_from_fork+0x25/0x30 >>> [ 4.534791] INFO: Object 0xffff880260818340 @offset=832 >>> [ 4.534792] INFO: Object 0xffff880260818680 @offset=1664 >>> [ 4.534795] kmem_cache_destroy drm_i915_gem_object: Slab cache >>> still has objects >>> [ 4.534798] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G B W >>> 4.12.0-eywa-46011-g9a19faf #360 >>> [ 4.534800] Hardware name: Intel Corporation Cannonlake Client >>> platform CNL - U0/CannonLake U DDR4 SODIMM RVP, BIOS >>> CNLSFWR1.R00.X075.D01.1703021113 03/02 >>> [ 4.534801] Call Trace: >>> [ 4.534805] dump_stack+0x65/0x89 >>> [ 4.534809] kmem_cache_destroy+0x1e1/0x1f0 >>> [ 4.534814] i915_gem_load_cleanup+0xb4/0x120 >>> [ 4.534819] i915_driver_cleanup_early+0x1a/0x50 >>> [ 4.534824] i915_driver_load+0x6b8/0xe70 >>> [ 4.534830] ? _raw_spin_unlock_irqrestore+0x26/0x50 >>> [ 4.534835] i915_pci_probe+0x2c/0x50 >>> [ 4.534840] local_pci_probe+0x45/0xa0 >>> [ 4.534844] ? pci_match_device+0xe0/0x110 >>> [ 4.534850] pci_device_probe+0x135/0x150 >>> [ 4.534856] driver_probe_device+0x288/0x490 >>> [ 4.534862] __driver_attach+0xc9/0xf0 >>> [ 4.534867] ? driver_probe_device+0x490/0x490 >>> [ 4.534871] bus_for_each_dev+0x5d/0x90 >>> [ 4.534877] driver_attach+0x1e/0x20 >>> [ 4.534882] bus_add_driver+0x1d0/0x290 >>> [ 4.534888] driver_register+0x60/0xe0 >>> [ 4.534893] __pci_register_driver+0x5d/0x60 >>> [ 4.534897] i915_init+0x59/0x5c >>> [ 4.534901] ? mipi_dsi_bus_init+0x17/0x17 >>> [ 4.534904] do_one_initcall+0x42/0x180 >>> [ 4.534910] kernel_init_freeable+0x17c/0x202 >>> [ 4.534914] ? set_debug_rodata+0x17/0x17 >>> [ 4.534917] ? 
rest_init+0x90/0x90 >>> [ 4.534922] kernel_init+0xe/0x110 >>> [ 4.534925] ret_from_fork+0x25/0x30 >>> [ 4.535386] i915 0000:00:02.0: [drm:i915_driver_load] Device >>> initialization failed (-22) >>> [ 4.535390] i915 0000:00:02.0: Please file a bug at >>> https://bugs.freedesktop.org/enter_bug.cgi?product=DRI against >>> DRM/Intel providing the dmesg log by booting with drm.debug=0xf >>> [ 4.535450] i915: probe of 0000:00:02.0 failed with error -22 >>> >>> >>> On Fri, Apr 28, 2017 at 7:36 AM, Oscar Mateo <oscar.mateo@intel.com> >>> wrote: >>>> >>>> This batchbuffer is over 4096 bytes, so we need to increase the size of >>>> the >>>> array (and the KMD has to be modified to deal with more than one page). >>>> >>>> Notice that there are two workarounds embedded here, both applicable to all >>>> CNL >>>> steppings. >>>> >>>> v2: WaPSRandomCSNotDone is not A0 only (as per the latest BSpec), so >>>> update >>>> the comment in the code and in the commit message. >>>> >>>> Cc: Mika Kuoppala <mika.kuoppala@intel.com> >>>> Cc: Ben Widawsky <ben@bwidawsk.net> >>>> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> >>>> --- >>>> lib/gen10_render.h | 63 +++ >>>> tools/null_state_gen/Makefile.am | 3 +- >>>> tools/null_state_gen/intel_batchbuffer.h | 2 +- >>>> tools/null_state_gen/intel_null_state_gen.c | 5 +- >>>> tools/null_state_gen/intel_renderstate.h | 1 + >>>> tools/null_state_gen/intel_renderstate_gen10.c | 538 >>>> +++++++++++++++++++++++++ >>>> 6 files changed, 609 insertions(+), 3 deletions(-) >>>> create mode 100644 lib/gen10_render.h >>>> create mode 100644 tools/null_state_gen/intel_renderstate_gen10.c >>>> >>>> diff --git a/lib/gen10_render.h b/lib/gen10_render.h >>>> new file mode 100644 >>>> index 0000000..f4a7dff >>>> --- /dev/null >>>> +++ b/lib/gen10_render.h >>>> @@ -0,0 +1,63 @@ >>>> +#ifndef GEN10_RENDER_H >>>> +#define GEN10_RENDER_H >>>> + >>>> +#include "gen9_render.h" >>>> + >>>> +#define GEN7_MI_RS_CONTROL (0x6 << 23) >>>> +# define GEN7_MI_RS_CONTROL_ENABLE 
(1 << 0) >>>> + >>>> +#define GEN10_3DSTATE_GATHER_POOL_ALLOC GEN6_3D(3, 1, >>>> 0x1a) >>>> +# define GEN10_3DSTATE_GATHER_POOL_ENABLE (1 << 11) >>>> + >>>> +#define GEN10_3DSTATE_GATHER_CONSTANT_VS GEN6_3D(3, 0, 0x34) >>>> +#define GEN10_3DSTATE_GATHER_CONSTANT_HS GEN6_3D(3, 0, 0x36) >>>> +#define GEN10_3DSTATE_GATHER_CONSTANT_DS GEN6_3D(3, 0, 0x37) >>>> +#define GEN10_3DSTATE_GATHER_CONSTANT_GS GEN6_3D(3, 0, 0x35) >>>> +#define GEN10_3DSTATE_GATHER_CONSTANT_PS GEN6_3D(3, 0, 0x38) >>>> + >>>> +#define GEN10_3DSTATE_WM_DEPTH_STENCIL GEN6_3D(3, 0, 0x4e) >>>> +#define GEN10_3DSTATE_WM_CHROMAKEY GEN6_3D(3, 0, 0x4c) >>>> + >>>> +#define GEN8_REG_L3_CACHE_CONFIG 0x7034 >>>> + >>>> +/* >>>> + * Programming for L3 cache allocations can be made per bank. Based on >>>> the >>>> + * programmed value HW will apply same allocations on other available >>>> banks. >>>> + * Total L3 Cache size per bank = 256 KB. >>>> + * {SLM, URB, DC, RO(I/S, C, T), L3 Client Pool} >>>> + * { 0, 96, 32, 128, 0 } >>>> + */ >>>> +#define GEN10_L3_CACHE_CONFIG_VALUE 0x00420060 >>>> + >>>> +#define URB_ALIGN(val, align) ((val % align) ? 
(val - (val % align)) : >>>> val) >>>> + >>>> +#define GEN10_VS_MIN_NUM_OF_URB_ENTRIES 64 >>>> +#define GEN10_VS_MAX_NUM_OF_URB_ENTRIES 2752 >>>> + >>>> +#define GEN10_KB_PER_URB_INDEX 8 >>>> +#define GEN10_L3_URB_SIZE_PER_BANK_IN_KB 96 >>>> + >>>> +#define GEN10_URB_RESERVED_SIZE_KB 32 >>>> +#define GEN10_URB_RESERVED_END_SIZE_KB 8 >>>> + >>>> +#define GEN10_VS_NUM_BITS_PER_URB_UNIT 512 >>>> +#define GEN10_VS_NUM_OF_URB_UNITS 1 // zero based >>>> +#define GEN10_VS_URB_ENTRY_SIZE_IN_BITS >>>> (GEN10_VS_NUM_BITS_PER_URB_UNIT * \ >>>> + >>>> (GEN10_VS_NUM_OF_URB_UNITS + 1)) >>>> + >>>> +#define GEN10_VS_URB_START_INDEX (GEN10_URB_RESERVED_SIZE_KB / >>>> GEN10_KB_PER_URB_INDEX) >>>> + >>>> +#define GEN10_URB_SIZE_PER_SLICE_KB(l3_bank_count, slice_count) >>>> \ >>>> + URB_ALIGN((uint32_t)(GEN10_L3_URB_SIZE_PER_BANK_IN_KB * >>>> l3_bank_count / slice_count), GEN10_KB_PER_URB_INDEX) >>>> + >>>> +#define GEN10_VS_URB_SIZE_PER_SLICE_KB(total_urb_size_per_slice) \ >>>> + (total_urb_size_per_slice - GEN10_URB_RESERVED_SIZE_KB - >>>> GEN10_URB_RESERVED_END_SIZE_KB) >>>> + >>>> +#define GEN10_VS_NUM_URB_ENTRIES_PER_SLICE(total_urb_size_per_slice) \ >>>> + ((GEN10_VS_URB_SIZE_PER_SLICE_KB(total_urb_size_per_slice) * \ >>>> + 1024 * 8) / GEN10_VS_URB_ENTRY_SIZE_IN_BITS) >>>> + >>>> +#define GEN10_VS_END_URB_INDEX(urb_size_per_slice) \ >>>> + ((urb_size_per_slice - GEN10_URB_RESERVED_END_SIZE_KB) / >>>> GEN10_KB_PER_URB_INDEX) >>>> + >>>> +#endif >>>> diff --git a/tools/null_state_gen/Makefile.am >>>> b/tools/null_state_gen/Makefile.am >>>> index 24884a7..2f90990 100644 >>>> --- a/tools/null_state_gen/Makefile.am >>>> +++ b/tools/null_state_gen/Makefile.am >>>> @@ -12,9 +12,10 @@ intel_null_state_gen_SOURCES = \ >>>> intel_renderstate_gen7.c \ >>>> intel_renderstate_gen8.c \ >>>> intel_renderstate_gen9.c \ >>>> + intel_renderstate_gen10.c \ >>>> intel_null_state_gen.c >>>> >>>> -gens := 6 7 8 9 >>>> +gens := 6 7 8 9 10 >>>> >>>> h = /tmp/intel_renderstate_gen$$gen.c >>>> states: 
intel_null_state_gen >>>> diff --git a/tools/null_state_gen/intel_batchbuffer.h >>>> b/tools/null_state_gen/intel_batchbuffer.h >>>> index 771d1c8..e40e01b 100644 >>>> --- a/tools/null_state_gen/intel_batchbuffer.h >>>> +++ b/tools/null_state_gen/intel_batchbuffer.h >>>> @@ -34,7 +34,7 @@ >>>> #include <stdint.h> >>>> >>>> #define MAX_RELOCS 64 >>>> -#define MAX_ITEMS 1024 >>>> +#define MAX_ITEMS 2048 >>>> #define MAX_STRLEN 256 >>>> >>>> #define ALIGN(x, y) (((x) + (y)-1) & ~((y)-1)) >>>> diff --git a/tools/null_state_gen/intel_null_state_gen.c >>>> b/tools/null_state_gen/intel_null_state_gen.c >>>> index 06eb954..4f12f5f 100644 >>>> --- a/tools/null_state_gen/intel_null_state_gen.c >>>> +++ b/tools/null_state_gen/intel_null_state_gen.c >>>> @@ -41,7 +41,7 @@ static int debug = 0; >>>> static void print_usage(char *s) >>>> { >>>> fprintf(stderr, "%s: <gen>\n" >>>> - " gen: gen to generate for (6,7,8,9)\n", >>>> + " gen: gen to generate for (6,7,8,9,10)\n", >>>> s); >>>> } >>>> >>>> @@ -173,6 +173,9 @@ static int do_generate(int gen) >>>> case 9: >>>> null_state_gen = gen9_setup_null_render_state; >>>> break; >>>> + case 10: >>>> + null_state_gen = gen10_setup_null_render_state; >>>> + break; >>>> } >>>> >>>> if (null_state_gen == NULL) { >>>> diff --git a/tools/null_state_gen/intel_renderstate.h >>>> b/tools/null_state_gen/intel_renderstate.h >>>> index b27b434..b3c8c2b 100644 >>>> --- a/tools/null_state_gen/intel_renderstate.h >>>> +++ b/tools/null_state_gen/intel_renderstate.h >>>> @@ -30,5 +30,6 @@ void gen6_setup_null_render_state(struct >>>> intel_batchbuffer *batch); >>>> void gen7_setup_null_render_state(struct intel_batchbuffer *batch); >>>> void gen8_setup_null_render_state(struct intel_batchbuffer *batch); >>>> void gen9_setup_null_render_state(struct intel_batchbuffer *batch); >>>> +void gen10_setup_null_render_state(struct intel_batchbuffer *batch); >>>> >>>> #endif /* __INTEL_RENDERSTATE_H__ */ >>>> diff --git 
a/tools/null_state_gen/intel_renderstate_gen10.c >>>> b/tools/null_state_gen/intel_renderstate_gen10.c >>>> new file mode 100644 >>>> index 0000000..f5678c3 >>>> --- /dev/null >>>> +++ b/tools/null_state_gen/intel_renderstate_gen10.c >>>> @@ -0,0 +1,538 @@ >>>> +/* >>>> + * Copyright © 2014 Intel Corporation >>>> + * >>>> + * Permission is hereby granted, free of charge, to any person obtaining >>>> a >>>> + * copy of this software and associated documentation files (the >>>> "Software"), >>>> + * to deal in the Software without restriction, including without >>>> limitation >>>> + * the rights to use, copy, modify, merge, publish, distribute, >>>> sublicense, >>>> + * and/or sell copies of the Software, and to permit persons to whom the >>>> + * Software is furnished to do so, subject to the following conditions: >>>> + * >>>> + * The above copyright notice and this permission notice (including the >>>> next >>>> + * paragraph) shall be included in all copies or substantial portions of >>>> the >>>> + * Software. >>>> + * >>>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, >>>> EXPRESS OR >>>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF >>>> MERCHANTABILITY, >>>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT >>>> SHALL >>>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR >>>> OTHER >>>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, >>>> ARISING >>>> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER >>>> + * DEALINGS IN THE SOFTWARE. 
>>>> + * >>>> + * Authors: >>>> + * Oscar Mateo <oscar.mateo@intel.com> >>>> + */ >>>> + >>>> +#include "intel_renderstate.h" >>>> +#include <lib/gen10_render.h> >>>> +#include <lib/intel_reg.h> >>>> + >>>> +static void gen8_emit_wm(struct intel_batchbuffer *batch) >>>> +{ >>>> + OUT_BATCH(GEN6_3DSTATE_WM | (2 - 2)); >>>> + OUT_BATCH(GEN7_WM_LEGACY_DIAMOND_LINE_RASTERIZATION); >>>> +} >>>> + >>>> +static void gen8_emit_ps(struct intel_batchbuffer *batch) >>>> +{ >>>> + OUT_BATCH(GEN7_3DSTATE_PS | (12 - 2)); >>>> + OUT_BATCH(0); >>>> + OUT_BATCH(0); /* kernel hi */ >>>> + OUT_BATCH(GEN7_PS_SPF_MODE); >>>> + OUT_BATCH(0); /* scratch space stuff */ >>>> + OUT_BATCH(0); /* scratch hi */ >>>> + OUT_BATCH(0); >>>> + OUT_BATCH(0); >>>> + OUT_BATCH(0); // kernel 1 >>>> + OUT_BATCH(0); /* kernel 1 hi */ >>>> + OUT_BATCH(0); // kernel 2 >>>> + OUT_BATCH(0); /* kernel 2 hi */ >>>> +} >>>> + >>>> +static void gen8_emit_sf(struct intel_batchbuffer *batch) >>>> +{ >>>> + OUT_BATCH(GEN6_3DSTATE_SF | (4 - 2)); >>>> + OUT_BATCH(0); >>>> + OUT_BATCH(0); >>>> + OUT_BATCH(1 << GEN6_3DSTATE_SF_TRIFAN_PROVOKE_SHIFT | >>>> + 1 << GEN6_3DSTATE_SF_VERTEX_SUB_PIXEL_PRECISION_SHIFT | >>>> + GEN7_SF_POINT_WIDTH_FROM_SOURCE | >>>> + 8); >>>> +} >>>> + >>>> +static void gen8_emit_vs(struct intel_batchbuffer *batch) >>>> +{ >>>> + OUT_BATCH(GEN6_3DSTATE_VS | (9 - 2)); >>>> + OUT_BATCH(0); >>>> + OUT_BATCH(0); >>>> + OUT_BATCH(GEN7_VS_FLOATING_POINT_MODE_ALTERNATE); >>>> + OUT_BATCH(0); >>>> + OUT_BATCH(0); >>>> + OUT_BATCH(0); >>>> + OUT_BATCH(0); >>>> + OUT_BATCH(0); >>>> +} >>>> + >>>> +static void gen8_emit_hs(struct intel_batchbuffer *batch) >>>> +{ >>>> + OUT_BATCH(GEN7_3DSTATE_HS | (9 - 2)); >>>> + OUT_BATCH(0); >>>> + OUT_BATCH(0); >>>> + OUT_BATCH(0); >>>> + OUT_BATCH(0); >>>> + OUT_BATCH(0); >>>> + OUT_BATCH(0); >>>> + OUT_BATCH(1 << GEN7_SBE_URB_ENTRY_READ_LENGTH_SHIFT); >>>> + OUT_BATCH(0); >>>> +} >>>> + >>>> +static void gen8_emit_raster(struct intel_batchbuffer *batch) >>>> +{ >>>> 
+ OUT_BATCH(GEN8_3DSTATE_RASTER | (5 - 2)); >>>> + OUT_BATCH(GEN8_RASTER_CULL_NONE | GEN8_RASTER_FRONT_WINDING_CCW); >>>> + OUT_BATCH(0.0); >>>> + OUT_BATCH(0.0); >>>> + OUT_BATCH(0.0); >>>> +} >>>> + >>>> +static void gen10_emit_urb(struct intel_batchbuffer *batch) >>>> +{ >>>> + /* Smallest SKU: 3x8*/ >>>> + int l3_bank_count = 3; >>>> + int slice_count = 1; >>>> + int urb_size_per_slice = >>>> GEN10_URB_SIZE_PER_SLICE_KB(l3_bank_count, slice_count); >>>> + int other_urb_start_addr = >>>> GEN10_VS_END_URB_INDEX(urb_size_per_slice); >>>> + const int vs_urb_start_addr = GEN10_VS_URB_START_INDEX; >>>> + const int vs_urb_alloc_size = GEN10_VS_NUM_OF_URB_UNITS; >>>> + int vs_urb_entries = >>>> GEN10_VS_NUM_URB_ENTRIES_PER_SLICE(urb_size_per_slice); >>>> + >>>> + if (vs_urb_entries < GEN10_VS_MIN_NUM_OF_URB_ENTRIES) >>>> + vs_urb_entries = GEN10_VS_MIN_NUM_OF_URB_ENTRIES; >>>> + if (vs_urb_entries > GEN10_VS_MAX_NUM_OF_URB_ENTRIES) >>>> + vs_urb_entries = GEN10_VS_MAX_NUM_OF_URB_ENTRIES; >>>> + >>>> + OUT_BATCH(GEN7_3DSTATE_URB_VS); >>>> + OUT_BATCH(vs_urb_entries | >>>> + (vs_urb_alloc_size << 16) | >>>> + (vs_urb_start_addr << 25)); >>>> + >>>> + OUT_BATCH(GEN7_3DSTATE_URB_HS); >>>> + OUT_BATCH(other_urb_start_addr << 25); >>>> + >>>> + OUT_BATCH(GEN7_3DSTATE_URB_DS); >>>> + OUT_BATCH(other_urb_start_addr << 25); >>>> + >>>> + OUT_BATCH(GEN7_3DSTATE_URB_GS); >>>> + OUT_BATCH(other_urb_start_addr << 25); >>>> +} >>>> + >>>> +static void gen8_emit_vf_topology(struct intel_batchbuffer *batch) >>>> +{ >>>> + OUT_BATCH(GEN8_3DSTATE_VF_TOPOLOGY); >>>> + OUT_BATCH(_3DPRIM_TRILIST); >>>> +} >>>> + >>>> +static void gen8_emit_so_decl_list(struct intel_batchbuffer *batch) >>>> +{ >>>> + const int num_decls = 128; >>>> + int i; >>>> + >>>> + OUT_BATCH(GEN8_3DSTATE_SO_DECL_LIST | >>>> + (((2 * num_decls) + 3) - 2) /* DWORD count - 2 */); >>>> + OUT_BATCH(0); >>>> + OUT_BATCH(num_decls); >>>> + >>>> + for (i = 0; i < num_decls; i++) { >>>> + OUT_BATCH(0); >>>> + OUT_BATCH(0); 
>>>> + } >>>> +} >>>> + >>>> +static void gen8_emit_so_buffer(struct intel_batchbuffer *batch, const >>>> int index) >>>> +{ >>>> + OUT_BATCH(GEN8_3DSTATE_SO_BUFFER | (8 - 2)); >>>> + OUT_BATCH(index << 29); >>>> + OUT_BATCH(0); >>>> + OUT_BATCH(0); >>>> + OUT_BATCH(0); >>>> + OUT_BATCH(0); >>>> + OUT_BATCH(0); >>>> + OUT_BATCH(0); >>>> +} >>>> + >>>> +static void gen8_emit_chroma_key(struct intel_batchbuffer *batch, const >>>> int index) >>>> +{ >>>> + OUT_BATCH(GEN6_3DSTATE_CHROMA_KEY | (4 - 2)); >>>> + OUT_BATCH(index << 30); >>>> + OUT_BATCH(0); >>>> + OUT_BATCH(0); >>>> +} >>>> + >>>> +static void gen8_emit_vertex_buffers(struct intel_batchbuffer *batch) >>>> +{ >>>> + const int buffers = 33; >>>> + int i; >>>> + >>>> + OUT_BATCH(GEN6_3DSTATE_VERTEX_BUFFERS | >>>> + (((4 * buffers) + 1)- 2) /* DWORD count - 2 */); >>>> + >>>> + for (i = 0; i < buffers; i++) { >>>> + OUT_BATCH(i << VB0_BUFFER_INDEX_SHIFT | >>>> + GEN7_VB0_BUFFER_ADDR_MOD_EN); >>>> + OUT_BATCH(0); /* Address */ >>>> + OUT_BATCH(0); >>>> + OUT_BATCH(0); >>>> + } >>>> +} >>>> + >>>> +static void gen8_emit_vertex_elements(struct intel_batchbuffer *batch) >>>> +{ >>>> + const int elements = 34; >>>> + int i; >>>> + >>>> + OUT_BATCH(GEN6_3DSTATE_VERTEX_ELEMENTS | >>>> + (((2 * elements) + 1) - 2) /* DWORD count - 2 */); >>>> + >>>> + /* Element 0 */ >>>> + OUT_BATCH(VE0_VALID); >>>> + OUT_BATCH( >>>> + GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_0_SHIFT | >>>> + GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_1_SHIFT | >>>> + GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_2_SHIFT | >>>> + GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_3_SHIFT); >>>> + /* Elements 1 -> 33 */ >>>> + for (i = 1; i < elements; i++) { >>>> + OUT_BATCH(0); >>>> + OUT_BATCH(0); >>>> + } >>>> +} >>>> + >>>> +static void gen8_emit_cc_state_pointers(struct intel_batchbuffer *batch) >>>> +{ >>>> + union { >>>> + float fval; >>>> + uint32_t uval; >>>> + } u; >>>> + >>>> + unsigned offset; >>>> + >>>> + u.fval = 1.0f; >>>> + >>>> + offset 
= intel_batch_state_offset(batch, 64); >>>> + OUT_STATE(0); >>>> + OUT_STATE(0); /* Alpha reference value */ >>>> + OUT_STATE(u.uval); /* Blend constant color RED */ >>>> + OUT_STATE(u.uval); /* Blend constant color BLUE */ >>>> + OUT_STATE(u.uval); /* Blend constant color GREEN */ >>>> + OUT_STATE(u.uval); /* Blend constant color ALPHA */ >>>> + >>>> + OUT_BATCH(GEN6_3DSTATE_CC_STATE_POINTERS); >>>> + OUT_BATCH_STATE_OFFSET(offset | 1); >>>> +} >>>> + >>>> +static void gen8_emit_blend_state_pointers(struct intel_batchbuffer >>>> *batch) >>>> +{ >>>> + unsigned offset; >>>> + int i; >>>> + >>>> + offset = intel_batch_state_offset(batch, 64); >>>> + >>>> + for (i = 0; i < 17; i++) >>>> + OUT_STATE(0); >>>> + >>>> + OUT_BATCH(GEN7_3DSTATE_BLEND_STATE_POINTERS | (2 - 2)); >>>> + OUT_BATCH_STATE_OFFSET(offset | 1); >>>> +} >>>> + >>>> +static void gen8_emit_ps_extra(struct intel_batchbuffer *batch) >>>> +{ >>>> + OUT_BATCH(GEN8_3DSTATE_PS_EXTRA | (2 - 2)); >>>> + OUT_BATCH(GEN8_PSX_PIXEL_SHADER_VALID | >>>> + GEN8_PSX_ATTRIBUTE_ENABLE); >>>> + >>>> +} >>>> + >>>> +static void gen8_emit_ps_blend(struct intel_batchbuffer *batch) >>>> +{ >>>> + OUT_BATCH(GEN8_3DSTATE_PS_BLEND | (2 - 2)); >>>> + OUT_BATCH(GEN8_PS_BLEND_HAS_WRITEABLE_RT); >>>> +} >>>> + >>>> +static void gen8_emit_viewport_state_pointers_cc(struct >>>> intel_batchbuffer *batch) >>>> +{ >>>> + unsigned offset; >>>> + >>>> + offset = intel_batch_state_offset(batch, 32); >>>> + >>>> + OUT_STATE((uint32_t)0.0f); /* Minimum depth */ >>>> + OUT_STATE((uint32_t)0.0f); /* Maximum depth */ >>>> + >>>> + OUT_BATCH(GEN7_3DSTATE_VIEWPORT_STATE_POINTERS_CC | (2 - 2)); >>>> + OUT_BATCH_STATE_OFFSET(offset); >>>> +} >>>> + >>>> +static void gen8_emit_viewport_state_pointers_sf_clip(struct >>>> intel_batchbuffer *batch) >>>> +{ >>>> + unsigned offset; >>>> + int i; >>>> + >>>> + offset = intel_batch_state_offset(batch, 64); >>>> + >>>> + for (i = 0; i < 16; i++) >>>> + OUT_STATE(0); >>>> + >>>> + 
OUT_BATCH(GEN7_3DSTATE_VIEWPORT_STATE_POINTERS_SF_CLIP | (2 - >>>> 2)); >>>> + OUT_BATCH_STATE_OFFSET(offset); >>>> +} >>>> + >>>> +static void gen8_emit_primitive(struct intel_batchbuffer *batch) >>>> +{ >>>> + OUT_BATCH(GEN6_3DPRIMITIVE | (10-2)); >>>> + OUT_BATCH(4); /* gen8+ ignore the topology type field */ >>>> + OUT_BATCH(1); /* vertex count */ >>>> + OUT_BATCH(0); >>>> + OUT_BATCH(1); /* single instance */ >>>> + OUT_BATCH(0); /* start instance location */ >>>> + OUT_BATCH(0); /* index buffer offset, ignored */ >>>> + OUT_BATCH(0); /* extended parameter 0 */ >>>> + OUT_BATCH(0); /* extended parameter 1 */ >>>> + OUT_BATCH(0); /* extended parameter 2 */ >>>> +} >>>> + >>>> +static void gen9_emit_state_base_address(struct intel_batchbuffer >>>> *batch) { >>>> + const unsigned offset = 0; >>>> + OUT_BATCH(GEN6_STATE_BASE_ADDRESS | >>>> + (22 - 2) /* DWORD count - 2 */); >>>> + >>>> + /* general state base address - requires BB address >>>> + * added to state offset to be stored in this location >>>> + */ >>>> + OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY); >>>> + OUT_BATCH(0); >>>> + >>>> + /* stateless data port */ >>>> + OUT_BATCH(0); >>>> + >>>> + /* surface state base address - requires BB address >>>> + * added to state offset to be stored in this location >>>> + */ >>>> + OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY); >>>> + OUT_BATCH(0); >>>> + >>>> + /* dynamic state base address - requires BB address >>>> + * added to state offset to be stored in this location >>>> + */ >>>> + OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY); >>>> + OUT_BATCH(0); >>>> + >>>> + /* indirect state base address */ >>>> + OUT_BATCH(BASE_ADDRESS_MODIFY); >>>> + OUT_BATCH(0); >>>> + >>>> + /* instruction state base address - requires BB address >>>> + * added to state offset to be stored in this location >>>> + */ >>>> + OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY); >>>> + OUT_BATCH(0); >>>> + >>>> + /* general state buffer size */ >>>> + 
OUT_BATCH(GEN8_STATE_SIZE_PAGES(1) | BUFFER_SIZE_MODIFY); >>>> + /* dynamic state buffer size */ >>>> + OUT_BATCH(GEN8_STATE_SIZE_PAGES(1) | BUFFER_SIZE_MODIFY); >>>> + /* indirect object buffer size */ >>>> + OUT_BATCH(0x0 | BUFFER_SIZE_MODIFY); >>>> + /* instruction buffer size */ >>>> + OUT_BATCH(GEN8_STATE_SIZE_PAGES(1) | BUFFER_SIZE_MODIFY); >>>> + >>>> + /* bindless surface state base address */ >>>> + OUT_BATCH(BASE_ADDRESS_MODIFY); >>>> + OUT_BATCH(0); >>>> + /* bindless surface state size */ >>>> + OUT_BATCH(0); >>>> + >>>> + /* bindless sampler state base address */ >>>> + OUT_BATCH(BASE_ADDRESS_MODIFY); >>>> + OUT_BATCH(0); >>>> + /* bindless sampler state size */ >>>> + OUT_BATCH(0); >>>> +} >>>> + >>>> +/* >>>> + * Generate the batch buffer commands needed to initialize the 3D engine >>>> + * to its "golden state". >>>> + */ >>>> +void gen10_setup_null_render_state(struct intel_batchbuffer *batch) >>>> +{ >>>> + int i; >>>> + >>>> + /* WaRsGatherPoolEnable: cnl */ >>>> + OUT_BATCH(GEN7_MI_RS_CONTROL); >>>> + >>>> +#define GEN8_PIPE_CONTROL_GLOBAL_GTT (1 << 24) >>>> + /* PIPE_CONTROL */ >>>> + OUT_BATCH(GEN6_PIPE_CONTROL | >>>> + (6 - 2)); /* DWORD count - 2 */ >>>> + OUT_BATCH(GEN8_PIPE_CONTROL_GLOBAL_GTT); >>>> + OUT_BATCH(0); >>>> + OUT_BATCH(0); >>>> + OUT_BATCH(0); >>>> + OUT_BATCH(0); >>>> + >>>> + /* PIPELINE_SELECT */ >>>> + OUT_BATCH(GEN9_PIPELINE_SELECT | PIPELINE_SELECT_3D); >>>> + >>>> + OUT_BATCH(MI_LOAD_REGISTER_IMM); >>>> + OUT_BATCH(GEN8_REG_L3_CACHE_CONFIG); >>>> + OUT_BATCH(GEN10_L3_CACHE_CONFIG_VALUE); >>>> + >>>> + gen8_emit_wm(batch); >>>> + gen8_emit_ps(batch); >>>> + gen8_emit_sf(batch); >>>> + >>>> + OUT_CMD(GEN7_3DSTATE_SBE, 6); /* Check w/ Gen8 code */ >>>> + OUT_CMD(GEN8_3DSTATE_SBE_SWIZ, 11); >>>> + >>>> + gen8_emit_vs(batch); >>>> + gen8_emit_hs(batch); >>>> + >>>> + OUT_CMD(GEN7_3DSTATE_GS, 10); >>>> + OUT_CMD(GEN7_3DSTATE_STREAMOUT, 5); >>>> + OUT_CMD(GEN7_3DSTATE_DS, 11); /* Check w/ Gen8 code */ >>>> + 
OUT_CMD(GEN6_3DSTATE_CLIP, 4); >>>> + OUT_CMD(GEN7_3DSTATE_TE, 4); >>>> + OUT_CMD(GEN8_3DSTATE_VF, 2); >>>> + OUT_CMD(GEN8_3DSTATE_WM_HZ_OP, 5); >>>> + >>>> + /* URB States */ >>>> + gen10_emit_urb(batch); >>>> + >>>> + OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_VS, 130); >>>> + OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_HS, 130); >>>> + OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_DS, 130); >>>> + OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_GS, 130); >>>> + OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_PS, 130); >>>> + >>>> + OUT_CMD(GEN8_3DSTATE_BIND_TABLE_POOL_ALLOC, 4); >>>> + OUT_CMD(GEN8_3DSTATE_GATHER_POOL_ALLOC, 4); >>>> + OUT_CMD(GEN8_3DSTATE_DX9_CONSTANT_BUFFER_POOL_ALLOC, 4); >>>> + >>>> + /* Push Constants */ >>>> + OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_VS, 2); >>>> + OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_HS, 2); >>>> + OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_DS, 2); >>>> + OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_GS, 2); >>>> + OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_PS, 2); >>>> + >>>> + /* Constants */ >>>> + OUT_CMD(GEN6_3DSTATE_CONSTANT_VS, 11); >>>> + OUT_CMD(GEN7_3DSTATE_CONSTANT_HS, 11); >>>> + OUT_CMD(GEN7_3DSTATE_CONSTANT_DS, 11); >>>> + OUT_CMD(GEN7_3DSTATE_CONSTANT_GS, 11); >>>> + OUT_CMD(GEN7_3DSTATE_CONSTANT_PS, 11); >>>> + >>>> + OUT_CMD(GEN8_3DSTATE_VF_INSTANCING, 3); >>>> + OUT_CMD(GEN8_3DSTATE_VF_SGVS, 2); >>>> + gen8_emit_vf_topology(batch); >>>> + >>>> + /* Streamer out declaration list */ >>>> + gen8_emit_so_decl_list(batch); >>>> + >>>> + /* Streamer out buffers */ >>>> + for (i = 0; i < 4; i++) { >>>> + gen8_emit_so_buffer(batch, i); >>>> + } >>>> + >>>> + /* State base addresses */ >>>> + gen9_emit_state_base_address(batch); >>>> + >>>> + OUT_CMD(GEN6_STATE_SIP, 3); >>>> + OUT_CMD(GEN6_3DSTATE_DRAWING_RECTANGLE, 4); >>>> + OUT_CMD(GEN7_3DSTATE_DEPTH_BUFFER, 8); >>>> + >>>> + /* Chroma key */ >>>> + for (i = 0; i < 4; i++) { >>>> + gen8_emit_chroma_key(batch, i); >>>> + } >>>> + >>>> + OUT_CMD(GEN6_3DSTATE_LINE_STIPPLE, 3); >>>> + 
OUT_CMD(GEN6_3DSTATE_AA_LINE_PARAMS, 3); >>>> + OUT_CMD(GEN7_3DSTATE_STENCIL_BUFFER, 5); >>>> + OUT_CMD(GEN7_3DSTATE_HIER_DEPTH_BUFFER, 5); >>>> + OUT_CMD(GEN7_3DSTATE_CLEAR_PARAMS, 3); >>>> + OUT_CMD(GEN6_3DSTATE_MONOFILTER_SIZE, 2); >>>> + >>>> + /* WaPSRandomCSNotDone:cnl */ >>>> +#define GEN8_PIPE_CONTROL_STALL_ENABLE (1 << 20) >>>> + OUT_BATCH(GEN6_PIPE_CONTROL | (6 - 2)); >>>> + OUT_BATCH(GEN8_PIPE_CONTROL_STALL_ENABLE); >>>> + OUT_BATCH(0); >>>> + OUT_BATCH(0); >>>> + OUT_BATCH(0); >>>> + OUT_BATCH(0); >>>> + >>>> + OUT_CMD(GEN8_3DSTATE_MULTISAMPLE, 2); >>>> + OUT_CMD(GEN8_3DSTATE_POLY_STIPPLE_OFFSET, 2); >>>> + OUT_CMD(GEN8_3DSTATE_POLY_STIPPLE_PATTERN, 1 + 32); >>>> + OUT_CMD(GEN8_3DSTATE_SAMPLER_PALETTE_LOAD0, 1 + 16); >>>> + OUT_CMD(GEN8_3DSTATE_SAMPLER_PALETTE_LOAD1, 1 + 16); >>>> + OUT_CMD(GEN6_3DSTATE_INDEX_BUFFER, 5); >>>> + >>>> + /* Vertex buffers */ >>>> + gen8_emit_vertex_buffers(batch); >>>> + gen8_emit_vertex_elements(batch); >>>> + >>>> + OUT_BATCH(GEN6_3DSTATE_VF_STATISTICS | 1 /* Enable */); >>>> + >>>> + /* 3D state binding table pointers */ >>>> + OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_VS, 2); >>>> + OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_HS, 2); >>>> + OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_DS, 2); >>>> + OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_GS, 2); >>>> + OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_PS, 2); >>>> + >>>> + gen8_emit_cc_state_pointers(batch); >>>> + gen8_emit_blend_state_pointers(batch); >>>> + gen8_emit_ps_extra(batch); >>>> + gen8_emit_ps_blend(batch); >>>> + >>>> + /* 3D state sampler state pointers */ >>>> + OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_VS, 2); >>>> + OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_HS, 2); >>>> + OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_DS, 2); >>>> + OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_GS, 2); >>>> + OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_PS, 2); >>>> + >>>> + OUT_CMD(GEN6_3DSTATE_SCISSOR_STATE_POINTERS, 2); >>>> + >>>> + 
gen8_emit_viewport_state_pointers_cc(batch); >>>> + gen8_emit_viewport_state_pointers_sf_clip(batch); >>>> + >>>> + /* WaPSRandomCSNotDone:cnl */ >>>> +#define GEN8_PIPE_CONTROL_STALL_ENABLE (1 << 20) >>>> + OUT_BATCH(GEN6_PIPE_CONTROL | (6 - 2)); >>>> + OUT_BATCH(GEN8_PIPE_CONTROL_STALL_ENABLE); >>>> + OUT_BATCH(0); >>>> + OUT_BATCH(0); >>>> + OUT_BATCH(0); >>>> + OUT_BATCH(0); >>>> + >>>> + gen8_emit_raster(batch); >>>> + >>>> + OUT_CMD(GEN10_3DSTATE_WM_DEPTH_STENCIL, 4); >>>> + OUT_CMD(GEN10_3DSTATE_WM_CHROMAKEY, 2); >>>> + >>>> + /* Launch 3D operation */ >>>> + gen8_emit_primitive(batch); >>>> + >>>> + /* WaRsGatherPoolEnable: cnl */ >>>> + OUT_BATCH(GEN7_MI_RS_CONTROL | GEN7_MI_RS_CONTROL_ENABLE); >>>> + OUT_BATCH(GEN10_3DSTATE_GATHER_POOL_ALLOC | (4 - 2)); >>>> + OUT_BATCH(GEN10_3DSTATE_GATHER_POOL_ENABLE); >>>> + OUT_BATCH(0); >>>> + OUT_BATCH(0xfffff << 12); >>>> + OUT_BATCH(GEN7_MI_RS_CONTROL); >>>> + OUT_CMD(GEN10_3DSTATE_GATHER_POOL_ALLOC, 4); >>>> + >>>> + OUT_BATCH(MI_BATCH_BUFFER_END); >>>> +} >>>> -- >>>> 1.9.1 >>>> >>>> _______________________________________________ >>>> Intel-gfx mailing list >>>> Intel-gfx@lists.freedesktop.org >>>> https://lists.freedesktop.org/mailman/listinfo/intel-gfx >>> >>> >>> >> > > > > -- > Rodrigo Vivi > Blog: http://blog.vivi.eng.br
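As a sanity check on the URB programming in gen10_emit_urb(), the macro arithmetic from lib/gen10_render.h can be reproduced as a small standalone C function. This is only a sketch: the constants are copied from the patch below, but struct urb_layout and gen10_urb_layout() are hypothetical names introduced here for illustration and are not part of the patch or of IGT. It may help when comparing the generated golden state against what the hardware is expected to receive.

```c
#include <assert.h>
#include <stdint.h>

/* Constants copied from lib/gen10_render.h in the patch. */
#define GEN10_VS_MIN_NUM_OF_URB_ENTRIES   64
#define GEN10_VS_MAX_NUM_OF_URB_ENTRIES   2752
#define GEN10_KB_PER_URB_INDEX            8
#define GEN10_L3_URB_SIZE_PER_BANK_IN_KB  96
#define GEN10_URB_RESERVED_SIZE_KB        32
#define GEN10_URB_RESERVED_END_SIZE_KB    8
#define GEN10_VS_NUM_BITS_PER_URB_UNIT    512
#define GEN10_VS_NUM_OF_URB_UNITS         1 /* zero based */

/* Hypothetical helper type, not part of the patch. */
struct urb_layout {
	uint32_t size_per_slice_kb;  /* URB size per slice, rounded down to 8 KB */
	uint32_t vs_start_index;     /* first URB index usable by the VS */
	uint32_t vs_entries;         /* clamped number of VS URB entries */
	uint32_t other_start_index;  /* start index shared by HS/DS/GS */
	uint32_t urb_vs_dw1;         /* payload DWORD of 3DSTATE_URB_VS */
	uint32_t urb_other_dw1;      /* payload DWORD of 3DSTATE_URB_{HS,DS,GS} */
};

/* Hypothetical function mirroring the arithmetic of gen10_emit_urb(). */
static struct urb_layout gen10_urb_layout(uint32_t l3_bank_count,
					  uint32_t slice_count)
{
	struct urb_layout u;
	uint32_t size_kb, vs_kb, entry_bits;

	assert(l3_bank_count > 0 && slice_count > 0);

	/* GEN10_URB_SIZE_PER_SLICE_KB: round down to a multiple of 8 KB */
	size_kb = GEN10_L3_URB_SIZE_PER_BANK_IN_KB * l3_bank_count / slice_count;
	size_kb -= size_kb % GEN10_KB_PER_URB_INDEX;
	assert(size_kb > GEN10_URB_RESERVED_SIZE_KB +
			 GEN10_URB_RESERVED_END_SIZE_KB);

	/* GEN10_VS_URB_SIZE_PER_SLICE_KB: strip the reserved head and tail */
	vs_kb = size_kb - GEN10_URB_RESERVED_SIZE_KB -
		GEN10_URB_RESERVED_END_SIZE_KB;

	/* GEN10_VS_URB_ENTRY_SIZE_IN_BITS: the unit count is zero based */
	entry_bits = GEN10_VS_NUM_BITS_PER_URB_UNIT *
		     (GEN10_VS_NUM_OF_URB_UNITS + 1);

	u.size_per_slice_kb = size_kb;
	u.vs_start_index = GEN10_URB_RESERVED_SIZE_KB / GEN10_KB_PER_URB_INDEX;
	u.other_start_index = (size_kb - GEN10_URB_RESERVED_END_SIZE_KB) /
			      GEN10_KB_PER_URB_INDEX;

	/* GEN10_VS_NUM_URB_ENTRIES_PER_SLICE, then clamp as the patch does */
	u.vs_entries = (vs_kb * 1024 * 8) / entry_bits;
	if (u.vs_entries < GEN10_VS_MIN_NUM_OF_URB_ENTRIES)
		u.vs_entries = GEN10_VS_MIN_NUM_OF_URB_ENTRIES;
	if (u.vs_entries > GEN10_VS_MAX_NUM_OF_URB_ENTRIES)
		u.vs_entries = GEN10_VS_MAX_NUM_OF_URB_ENTRIES;

	/* Same bit packing as the OUT_BATCH() calls in gen10_emit_urb() */
	u.urb_vs_dw1 = u.vs_entries |
		       ((uint32_t)GEN10_VS_NUM_OF_URB_UNITS << 16) |
		       (u.vs_start_index << 25);
	u.urb_other_dw1 = u.other_start_index << 25;

	return u;
}
```

For the "smallest SKU" values hard-coded in the patch (3 L3 banks, 1 slice), gen10_urb_layout(3, 1) gives 288 KB per slice, a VS start index of 4, 1984 VS entries (inside the 64..2752 clamp) and a start index of 35 for HS/DS/GS, i.e. payload DWORDs 0x080107c0 and 0x46000000.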
diff --git a/lib/gen10_render.h b/lib/gen10_render.h new file mode 100644 index 0000000..f4a7dff --- /dev/null +++ b/lib/gen10_render.h @@ -0,0 +1,63 @@ +#ifndef GEN10_RENDER_H +#define GEN10_RENDER_H + +#include "gen9_render.h" + +#define GEN7_MI_RS_CONTROL (0x6 << 23) +# define GEN7_MI_RS_CONTROL_ENABLE (1 << 0) + +#define GEN10_3DSTATE_GATHER_POOL_ALLOC GEN6_3D(3, 1, 0x1a) +# define GEN10_3DSTATE_GATHER_POOL_ENABLE (1 << 11) + +#define GEN10_3DSTATE_GATHER_CONSTANT_VS GEN6_3D(3, 0, 0x34) +#define GEN10_3DSTATE_GATHER_CONSTANT_HS GEN6_3D(3, 0, 0x36) +#define GEN10_3DSTATE_GATHER_CONSTANT_DS GEN6_3D(3, 0, 0x37) +#define GEN10_3DSTATE_GATHER_CONSTANT_GS GEN6_3D(3, 0, 0x35) +#define GEN10_3DSTATE_GATHER_CONSTANT_PS GEN6_3D(3, 0, 0x38) + +#define GEN10_3DSTATE_WM_DEPTH_STENCIL GEN6_3D(3, 0, 0x4e) +#define GEN10_3DSTATE_WM_CHROMAKEY GEN6_3D(3, 0, 0x4c) + +#define GEN8_REG_L3_CACHE_CONFIG 0x7034 + +/* + * Programming for L3 cache allocations can be made per bank. Based on the + * programmed value HW will apply same allocations on other available banks. + * Total L3 Cache size per bank = 256 KB. + * {SLM, URB, DC, RO(I/S, C, T), L3 Client Pool} + * { 0, 96, 32, 128, 0 } + */ +#define GEN10_L3_CACHE_CONFIG_VALUE 0x00420060 + +#define URB_ALIGN(val, align) ((val % align) ? 
(val - (val % align)) : val) + +#define GEN10_VS_MIN_NUM_OF_URB_ENTRIES 64 +#define GEN10_VS_MAX_NUM_OF_URB_ENTRIES 2752 + +#define GEN10_KB_PER_URB_INDEX 8 +#define GEN10_L3_URB_SIZE_PER_BANK_IN_KB 96 + +#define GEN10_URB_RESERVED_SIZE_KB 32 +#define GEN10_URB_RESERVED_END_SIZE_KB 8 + +#define GEN10_VS_NUM_BITS_PER_URB_UNIT 512 +#define GEN10_VS_NUM_OF_URB_UNITS 1 // zero based +#define GEN10_VS_URB_ENTRY_SIZE_IN_BITS (GEN10_VS_NUM_BITS_PER_URB_UNIT * \ + (GEN10_VS_NUM_OF_URB_UNITS + 1)) + +#define GEN10_VS_URB_START_INDEX (GEN10_URB_RESERVED_SIZE_KB / GEN10_KB_PER_URB_INDEX) + +#define GEN10_URB_SIZE_PER_SLICE_KB(l3_bank_count, slice_count) \ + URB_ALIGN((uint32_t)(GEN10_L3_URB_SIZE_PER_BANK_IN_KB * l3_bank_count / slice_count), GEN10_KB_PER_URB_INDEX) + +#define GEN10_VS_URB_SIZE_PER_SLICE_KB(total_urb_size_per_slice) \ + (total_urb_size_per_slice - GEN10_URB_RESERVED_SIZE_KB - GEN10_URB_RESERVED_END_SIZE_KB) + +#define GEN10_VS_NUM_URB_ENTRIES_PER_SLICE(total_urb_size_per_slice) \ + ((GEN10_VS_URB_SIZE_PER_SLICE_KB(total_urb_size_per_slice) * \ + 1024 * 8) / GEN10_VS_URB_ENTRY_SIZE_IN_BITS) + +#define GEN10_VS_END_URB_INDEX(urb_size_per_slice) \ + ((urb_size_per_slice - GEN10_URB_RESERVED_END_SIZE_KB) / GEN10_KB_PER_URB_INDEX) + +#endif diff --git a/tools/null_state_gen/Makefile.am b/tools/null_state_gen/Makefile.am index 24884a7..2f90990 100644 --- a/tools/null_state_gen/Makefile.am +++ b/tools/null_state_gen/Makefile.am @@ -12,9 +12,10 @@ intel_null_state_gen_SOURCES = \ intel_renderstate_gen7.c \ intel_renderstate_gen8.c \ intel_renderstate_gen9.c \ + intel_renderstate_gen10.c \ intel_null_state_gen.c -gens := 6 7 8 9 +gens := 6 7 8 9 10 h = /tmp/intel_renderstate_gen$$gen.c states: intel_null_state_gen diff --git a/tools/null_state_gen/intel_batchbuffer.h b/tools/null_state_gen/intel_batchbuffer.h index 771d1c8..e40e01b 100644 --- a/tools/null_state_gen/intel_batchbuffer.h +++ b/tools/null_state_gen/intel_batchbuffer.h @@ -34,7 +34,7 @@ #include <stdint.h> 
#define MAX_RELOCS 64 -#define MAX_ITEMS 1024 +#define MAX_ITEMS 2048 #define MAX_STRLEN 256 #define ALIGN(x, y) (((x) + (y)-1) & ~((y)-1)) diff --git a/tools/null_state_gen/intel_null_state_gen.c b/tools/null_state_gen/intel_null_state_gen.c index 06eb954..4f12f5f 100644 --- a/tools/null_state_gen/intel_null_state_gen.c +++ b/tools/null_state_gen/intel_null_state_gen.c @@ -41,7 +41,7 @@ static int debug = 0; static void print_usage(char *s) { fprintf(stderr, "%s: <gen>\n" - " gen: gen to generate for (6,7,8,9)\n", + " gen: gen to generate for (6,7,8,9,10)\n", s); } @@ -173,6 +173,9 @@ static int do_generate(int gen) case 9: null_state_gen = gen9_setup_null_render_state; break; + case 10: + null_state_gen = gen10_setup_null_render_state; + break; } if (null_state_gen == NULL) { diff --git a/tools/null_state_gen/intel_renderstate.h b/tools/null_state_gen/intel_renderstate.h index b27b434..b3c8c2b 100644 --- a/tools/null_state_gen/intel_renderstate.h +++ b/tools/null_state_gen/intel_renderstate.h @@ -30,5 +30,6 @@ void gen6_setup_null_render_state(struct intel_batchbuffer *batch); void gen7_setup_null_render_state(struct intel_batchbuffer *batch); void gen8_setup_null_render_state(struct intel_batchbuffer *batch); void gen9_setup_null_render_state(struct intel_batchbuffer *batch); +void gen10_setup_null_render_state(struct intel_batchbuffer *batch); #endif /* __INTEL_RENDERSTATE_H__ */ diff --git a/tools/null_state_gen/intel_renderstate_gen10.c b/tools/null_state_gen/intel_renderstate_gen10.c new file mode 100644 index 0000000..f5678c3 --- /dev/null +++ b/tools/null_state_gen/intel_renderstate_gen10.c @@ -0,0 +1,538 @@ +/* + * Copyright © 2014 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, 
sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER + * DEALINGS IN THE SOFTWARE. + * + * Authors: + * Oscar Mateo <oscar.mateo@intel.com> + */ + +#include "intel_renderstate.h" +#include <lib/gen10_render.h> +#include <lib/intel_reg.h> + +static void gen8_emit_wm(struct intel_batchbuffer *batch) +{ + OUT_BATCH(GEN6_3DSTATE_WM | (2 - 2)); + OUT_BATCH(GEN7_WM_LEGACY_DIAMOND_LINE_RASTERIZATION); +} + +static void gen8_emit_ps(struct intel_batchbuffer *batch) +{ + OUT_BATCH(GEN7_3DSTATE_PS | (12 - 2)); + OUT_BATCH(0); + OUT_BATCH(0); /* kernel hi */ + OUT_BATCH(GEN7_PS_SPF_MODE); + OUT_BATCH(0); /* scratch space stuff */ + OUT_BATCH(0); /* scratch hi */ + OUT_BATCH(0); + OUT_BATCH(0); + OUT_BATCH(0); // kernel 1 + OUT_BATCH(0); /* kernel 1 hi */ + OUT_BATCH(0); // kernel 2 + OUT_BATCH(0); /* kernel 2 hi */ +} + +static void gen8_emit_sf(struct intel_batchbuffer *batch) +{ + OUT_BATCH(GEN6_3DSTATE_SF | (4 - 2)); + OUT_BATCH(0); + OUT_BATCH(0); + OUT_BATCH(1 << GEN6_3DSTATE_SF_TRIFAN_PROVOKE_SHIFT | + 1 << GEN6_3DSTATE_SF_VERTEX_SUB_PIXEL_PRECISION_SHIFT | + GEN7_SF_POINT_WIDTH_FROM_SOURCE | + 8); +} + +static void gen8_emit_vs(struct intel_batchbuffer *batch) +{ + OUT_BATCH(GEN6_3DSTATE_VS | (9 - 2)); + OUT_BATCH(0); + OUT_BATCH(0); + 
OUT_BATCH(GEN7_VS_FLOATING_POINT_MODE_ALTERNATE); + OUT_BATCH(0); + OUT_BATCH(0); + OUT_BATCH(0); + OUT_BATCH(0); + OUT_BATCH(0); +} + +static void gen8_emit_hs(struct intel_batchbuffer *batch) +{ + OUT_BATCH(GEN7_3DSTATE_HS | (9 - 2)); + OUT_BATCH(0); + OUT_BATCH(0); + OUT_BATCH(0); + OUT_BATCH(0); + OUT_BATCH(0); + OUT_BATCH(0); + OUT_BATCH(1 << GEN7_SBE_URB_ENTRY_READ_LENGTH_SHIFT); + OUT_BATCH(0); +} + +static void gen8_emit_raster(struct intel_batchbuffer *batch) +{ + OUT_BATCH(GEN8_3DSTATE_RASTER | (5 - 2)); + OUT_BATCH(GEN8_RASTER_CULL_NONE | GEN8_RASTER_FRONT_WINDING_CCW); + OUT_BATCH(0.0); + OUT_BATCH(0.0); + OUT_BATCH(0.0); +} + +static void gen10_emit_urb(struct intel_batchbuffer *batch) +{ + /* Smallest SKU: 3x8*/ + int l3_bank_count = 3; + int slice_count = 1; + int urb_size_per_slice = GEN10_URB_SIZE_PER_SLICE_KB(l3_bank_count, slice_count); + int other_urb_start_addr = GEN10_VS_END_URB_INDEX(urb_size_per_slice); + const int vs_urb_start_addr = GEN10_VS_URB_START_INDEX; + const int vs_urb_alloc_size = GEN10_VS_NUM_OF_URB_UNITS; + int vs_urb_entries = GEN10_VS_NUM_URB_ENTRIES_PER_SLICE(urb_size_per_slice); + + if (vs_urb_entries < GEN10_VS_MIN_NUM_OF_URB_ENTRIES) + vs_urb_entries = GEN10_VS_MIN_NUM_OF_URB_ENTRIES; + if (vs_urb_entries > GEN10_VS_MAX_NUM_OF_URB_ENTRIES) + vs_urb_entries = GEN10_VS_MAX_NUM_OF_URB_ENTRIES; + + OUT_BATCH(GEN7_3DSTATE_URB_VS); + OUT_BATCH(vs_urb_entries | + (vs_urb_alloc_size << 16) | + (vs_urb_start_addr << 25)); + + OUT_BATCH(GEN7_3DSTATE_URB_HS); + OUT_BATCH(other_urb_start_addr << 25); + + OUT_BATCH(GEN7_3DSTATE_URB_DS); + OUT_BATCH(other_urb_start_addr << 25); + + OUT_BATCH(GEN7_3DSTATE_URB_GS); + OUT_BATCH(other_urb_start_addr << 25); +} + +static void gen8_emit_vf_topology(struct intel_batchbuffer *batch) +{ + OUT_BATCH(GEN8_3DSTATE_VF_TOPOLOGY); + OUT_BATCH(_3DPRIM_TRILIST); +} + +static void gen8_emit_so_decl_list(struct intel_batchbuffer *batch) +{ + const int num_decls = 128; + int i; + + 
OUT_BATCH(GEN8_3DSTATE_SO_DECL_LIST | + (((2 * num_decls) + 3) - 2) /* DWORD count - 2 */); + OUT_BATCH(0); + OUT_BATCH(num_decls); + + for (i = 0; i < num_decls; i++) { + OUT_BATCH(0); + OUT_BATCH(0); + } +} + +static void gen8_emit_so_buffer(struct intel_batchbuffer *batch, const int index) +{ + OUT_BATCH(GEN8_3DSTATE_SO_BUFFER | (8 - 2)); + OUT_BATCH(index << 29); + OUT_BATCH(0); + OUT_BATCH(0); + OUT_BATCH(0); + OUT_BATCH(0); + OUT_BATCH(0); + OUT_BATCH(0); +} + +static void gen8_emit_chroma_key(struct intel_batchbuffer *batch, const int index) +{ + OUT_BATCH(GEN6_3DSTATE_CHROMA_KEY | (4 - 2)); + OUT_BATCH(index << 30); + OUT_BATCH(0); + OUT_BATCH(0); +} + +static void gen8_emit_vertex_buffers(struct intel_batchbuffer *batch) +{ + const int buffers = 33; + int i; + + OUT_BATCH(GEN6_3DSTATE_VERTEX_BUFFERS | + (((4 * buffers) + 1)- 2) /* DWORD count - 2 */); + + for (i = 0; i < buffers; i++) { + OUT_BATCH(i << VB0_BUFFER_INDEX_SHIFT | + GEN7_VB0_BUFFER_ADDR_MOD_EN); + OUT_BATCH(0); /* Address */ + OUT_BATCH(0); + OUT_BATCH(0); + } +} + +static void gen8_emit_vertex_elements(struct intel_batchbuffer *batch) +{ + const int elements = 34; + int i; + + OUT_BATCH(GEN6_3DSTATE_VERTEX_ELEMENTS | + (((2 * elements) + 1) - 2) /* DWORD count - 2 */); + + /* Element 0 */ + OUT_BATCH(VE0_VALID); + OUT_BATCH( + GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_0_SHIFT | + GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_1_SHIFT | + GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_2_SHIFT | + GEN6_VFCOMPONENT_STORE_0 << VE1_VFCOMPONENT_3_SHIFT); + /* Elements 1 -> 33 */ + for (i = 1; i < elements; i++) { + OUT_BATCH(0); + OUT_BATCH(0); + } +} + +static void gen8_emit_cc_state_pointers(struct intel_batchbuffer *batch) +{ + union { + float fval; + uint32_t uval; + } u; + + unsigned offset; + + u.fval = 1.0f; + + offset = intel_batch_state_offset(batch, 64); + OUT_STATE(0); + OUT_STATE(0); /* Alpha reference value */ + OUT_STATE(u.uval); /* Blend constant color RED */ + OUT_STATE(u.uval); /* 
Blend constant color BLUE */
+	OUT_STATE(u.uval); /* Blend constant color GREEN */
+	OUT_STATE(u.uval); /* Blend constant color ALPHA */
+
+	OUT_BATCH(GEN6_3DSTATE_CC_STATE_POINTERS);
+	OUT_BATCH_STATE_OFFSET(offset | 1);
+}
+
+static void gen8_emit_blend_state_pointers(struct intel_batchbuffer *batch)
+{
+	unsigned offset;
+	int i;
+
+	offset = intel_batch_state_offset(batch, 64);
+
+	for (i = 0; i < 17; i++)
+		OUT_STATE(0);
+
+	OUT_BATCH(GEN7_3DSTATE_BLEND_STATE_POINTERS | (2 - 2));
+	OUT_BATCH_STATE_OFFSET(offset | 1);
+}
+
+static void gen8_emit_ps_extra(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN8_3DSTATE_PS_EXTRA | (2 - 2));
+	OUT_BATCH(GEN8_PSX_PIXEL_SHADER_VALID |
+		  GEN8_PSX_ATTRIBUTE_ENABLE);
+
+}
+
+static void gen8_emit_ps_blend(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN8_3DSTATE_PS_BLEND | (2 - 2));
+	OUT_BATCH(GEN8_PS_BLEND_HAS_WRITEABLE_RT);
+}
+
+static void gen8_emit_viewport_state_pointers_cc(struct intel_batchbuffer *batch)
+{
+	unsigned offset;
+
+	offset = intel_batch_state_offset(batch, 32);
+
+	OUT_STATE((uint32_t)0.0f); /* Minimum depth */
+	OUT_STATE((uint32_t)0.0f); /* Maximum depth */
+
+	OUT_BATCH(GEN7_3DSTATE_VIEWPORT_STATE_POINTERS_CC | (2 - 2));
+	OUT_BATCH_STATE_OFFSET(offset);
+}
+
+static void gen8_emit_viewport_state_pointers_sf_clip(struct intel_batchbuffer *batch)
+{
+	unsigned offset;
+	int i;
+
+	offset = intel_batch_state_offset(batch, 64);
+
+	for (i = 0; i < 16; i++)
+		OUT_STATE(0);
+
+	OUT_BATCH(GEN7_3DSTATE_VIEWPORT_STATE_POINTERS_SF_CLIP | (2 - 2));
+	OUT_BATCH_STATE_OFFSET(offset);
+}
+
+static void gen8_emit_primitive(struct intel_batchbuffer *batch)
+{
+	OUT_BATCH(GEN6_3DPRIMITIVE | (10-2));
+	OUT_BATCH(4);	/* gen8+ ignore the topology type field */
+	OUT_BATCH(1);	/* vertex count */
+	OUT_BATCH(0);
+	OUT_BATCH(1);	/* single instance */
+	OUT_BATCH(0);	/* start instance location */
+	OUT_BATCH(0);	/* index buffer offset, ignored */
+	OUT_BATCH(0);	/* extended parameter 0 */
+	OUT_BATCH(0);	/* extended parameter 1 */
+	OUT_BATCH(0);	/* extended parameter 2 */
+}
+
+static void gen9_emit_state_base_address(struct intel_batchbuffer *batch) {
+	const unsigned offset = 0;
+	OUT_BATCH(GEN6_STATE_BASE_ADDRESS |
+		  (22 - 2) /* DWORD count - 2 */);
+
+	/* general state base address - requires BB address
+	 * added to state offset to be stored in this location
+	 */
+	OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY);
+	OUT_BATCH(0);
+
+	/* stateless data port */
+	OUT_BATCH(0);
+
+	/* surface state base address - requires BB address
+	 * added to state offset to be stored in this location
+	 */
+	OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY);
+	OUT_BATCH(0);
+
+	/* dynamic state base address - requires BB address
+	 * added to state offset to be stored in this location
+	 */
+	OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY);
+	OUT_BATCH(0);
+
+	/* indirect state base address */
+	OUT_BATCH(BASE_ADDRESS_MODIFY);
+	OUT_BATCH(0);
+
+	/* instruction state base address - requires BB address
+	 * added to state offset to be stored in this location
+	 */
+	OUT_RELOC(batch, 0, 0, offset | BASE_ADDRESS_MODIFY);
+	OUT_BATCH(0);
+
+	/* general state buffer size */
+	OUT_BATCH(GEN8_STATE_SIZE_PAGES(1) | BUFFER_SIZE_MODIFY);
+	/* dynamic state buffer size */
+	OUT_BATCH(GEN8_STATE_SIZE_PAGES(1) | BUFFER_SIZE_MODIFY);
+	/* indirect object buffer size */
+	OUT_BATCH(0x0 | BUFFER_SIZE_MODIFY);
+	/* instruction buffer size */
+	OUT_BATCH(GEN8_STATE_SIZE_PAGES(1) | BUFFER_SIZE_MODIFY);
+
+	/* bindless surface state base address */
+	OUT_BATCH(BASE_ADDRESS_MODIFY);
+	OUT_BATCH(0);
+	/* bindless surface state size */
+	OUT_BATCH(0);
+
+	/* bindless sampler state base address */
+	OUT_BATCH(BASE_ADDRESS_MODIFY);
+	OUT_BATCH(0);
+	/* bindless sampler state size */
+	OUT_BATCH(0);
+}
+
+/*
+ * Generate the batch buffer commands needed to initialize the 3D engine
+ * to its "golden state".
+ */
+void gen10_setup_null_render_state(struct intel_batchbuffer *batch)
+{
+	int i;
+
+	/* WaRsGatherPoolEnable: cnl */
+	OUT_BATCH(GEN7_MI_RS_CONTROL);
+
+#define GEN8_PIPE_CONTROL_GLOBAL_GTT (1 << 24)
+	/* PIPE_CONTROL */
+	OUT_BATCH(GEN6_PIPE_CONTROL |
+		  (6 - 2));	/* DWORD count - 2 */
+	OUT_BATCH(GEN8_PIPE_CONTROL_GLOBAL_GTT);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+
+	/* PIPELINE_SELECT */
+	OUT_BATCH(GEN9_PIPELINE_SELECT | PIPELINE_SELECT_3D);
+
+	OUT_BATCH(MI_LOAD_REGISTER_IMM);
+	OUT_BATCH(GEN8_REG_L3_CACHE_CONFIG);
+	OUT_BATCH(GEN10_L3_CACHE_CONFIG_VALUE);
+
+	gen8_emit_wm(batch);
+	gen8_emit_ps(batch);
+	gen8_emit_sf(batch);
+
+	OUT_CMD(GEN7_3DSTATE_SBE, 6); /* Check w/ Gen8 code */
+	OUT_CMD(GEN8_3DSTATE_SBE_SWIZ, 11);
+
+	gen8_emit_vs(batch);
+	gen8_emit_hs(batch);
+
+	OUT_CMD(GEN7_3DSTATE_GS, 10);
+	OUT_CMD(GEN7_3DSTATE_STREAMOUT, 5);
+	OUT_CMD(GEN7_3DSTATE_DS, 11); /* Check w/ Gen8 code */
+	OUT_CMD(GEN6_3DSTATE_CLIP, 4);
+	OUT_CMD(GEN7_3DSTATE_TE, 4);
+	OUT_CMD(GEN8_3DSTATE_VF, 2);
+	OUT_CMD(GEN8_3DSTATE_WM_HZ_OP, 5);
+
+	/* URB States */
+	gen10_emit_urb(batch);
+
+	OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_VS, 130);
+	OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_HS, 130);
+	OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_DS, 130);
+	OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_GS, 130);
+	OUT_CMD(GEN10_3DSTATE_GATHER_CONSTANT_PS, 130);
+
+	OUT_CMD(GEN8_3DSTATE_BIND_TABLE_POOL_ALLOC, 4);
+	OUT_CMD(GEN8_3DSTATE_GATHER_POOL_ALLOC, 4);
+	OUT_CMD(GEN8_3DSTATE_DX9_CONSTANT_BUFFER_POOL_ALLOC, 4);
+
+	/* Push Constants */
+	OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_VS, 2);
+	OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_HS, 2);
+	OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_DS, 2);
+	OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_GS, 2);
+	OUT_CMD(GEN7_3DSTATE_PUSH_CONSTANT_ALLOC_PS, 2);
+
+	/* Constants */
+	OUT_CMD(GEN6_3DSTATE_CONSTANT_VS, 11);
+	OUT_CMD(GEN7_3DSTATE_CONSTANT_HS, 11);
+	OUT_CMD(GEN7_3DSTATE_CONSTANT_DS, 11);
+	OUT_CMD(GEN7_3DSTATE_CONSTANT_GS, 11);
+	OUT_CMD(GEN7_3DSTATE_CONSTANT_PS, 11);
+
+	OUT_CMD(GEN8_3DSTATE_VF_INSTANCING, 3);
+	OUT_CMD(GEN8_3DSTATE_VF_SGVS, 2);
+	gen8_emit_vf_topology(batch);
+
+	/* Streamer out declaration list */
+	gen8_emit_so_decl_list(batch);
+
+	/* Streamer out buffers */
+	for (i = 0; i < 4; i++) {
+		gen8_emit_so_buffer(batch, i);
+	}
+
+	/* State base addresses */
+	gen9_emit_state_base_address(batch);
+
+	OUT_CMD(GEN6_STATE_SIP, 3);
+	OUT_CMD(GEN6_3DSTATE_DRAWING_RECTANGLE, 4);
+	OUT_CMD(GEN7_3DSTATE_DEPTH_BUFFER, 8);
+
+	/* Chroma key */
+	for (i = 0; i < 4; i++) {
+		gen8_emit_chroma_key(batch, i);
+	}
+
+	OUT_CMD(GEN6_3DSTATE_LINE_STIPPLE, 3);
+	OUT_CMD(GEN6_3DSTATE_AA_LINE_PARAMS, 3);
+	OUT_CMD(GEN7_3DSTATE_STENCIL_BUFFER, 5);
+	OUT_CMD(GEN7_3DSTATE_HIER_DEPTH_BUFFER, 5);
+	OUT_CMD(GEN7_3DSTATE_CLEAR_PARAMS, 3);
+	OUT_CMD(GEN6_3DSTATE_MONOFILTER_SIZE, 2);
+
+	/* WaPSRandomCSNotDone:cnl */
+#define GEN8_PIPE_CONTROL_STALL_ENABLE (1 << 20)
+	OUT_BATCH(GEN6_PIPE_CONTROL | (6 - 2));
+	OUT_BATCH(GEN8_PIPE_CONTROL_STALL_ENABLE);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+
+	OUT_CMD(GEN8_3DSTATE_MULTISAMPLE, 2);
+	OUT_CMD(GEN8_3DSTATE_POLY_STIPPLE_OFFSET, 2);
+	OUT_CMD(GEN8_3DSTATE_POLY_STIPPLE_PATTERN, 1 + 32);
+	OUT_CMD(GEN8_3DSTATE_SAMPLER_PALETTE_LOAD0, 1 + 16);
+	OUT_CMD(GEN8_3DSTATE_SAMPLER_PALETTE_LOAD1, 1 + 16);
+	OUT_CMD(GEN6_3DSTATE_INDEX_BUFFER, 5);
+
+	/* Vertex buffers */
+	gen8_emit_vertex_buffers(batch);
+	gen8_emit_vertex_elements(batch);
+
+	OUT_BATCH(GEN6_3DSTATE_VF_STATISTICS | 1 /* Enable */);
+
+	/* 3D state binding table pointers */
+	OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_VS, 2);
+	OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_HS, 2);
+	OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_DS, 2);
+	OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_GS, 2);
+	OUT_CMD(GEN7_3DSTATE_BINDING_TABLE_POINTERS_PS, 2);
+
+	gen8_emit_cc_state_pointers(batch);
+	gen8_emit_blend_state_pointers(batch);
+	gen8_emit_ps_extra(batch);
+	gen8_emit_ps_blend(batch);
+
+	/* 3D state sampler state pointers */
+	OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_VS, 2);
+	OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_HS, 2);
+	OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_DS, 2);
+	OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_GS, 2);
+	OUT_CMD(GEN7_3DSTATE_SAMPLER_STATE_POINTERS_PS, 2);
+
+	OUT_CMD(GEN6_3DSTATE_SCISSOR_STATE_POINTERS, 2);
+
+	gen8_emit_viewport_state_pointers_cc(batch);
+	gen8_emit_viewport_state_pointers_sf_clip(batch);
+
+	/* WaPSRandomCSNotDone:cnl */
+#define GEN8_PIPE_CONTROL_STALL_ENABLE (1 << 20)
+	OUT_BATCH(GEN6_PIPE_CONTROL | (6 - 2));
+	OUT_BATCH(GEN8_PIPE_CONTROL_STALL_ENABLE);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+	OUT_BATCH(0);
+
+	gen8_emit_raster(batch);
+
+	OUT_CMD(GEN10_3DSTATE_WM_DEPTH_STENCIL, 4);
+	OUT_CMD(GEN10_3DSTATE_WM_CHROMAKEY, 2);
+
+	/* Launch 3D operation */
+	gen8_emit_primitive(batch);
+
+	/* WaRsGatherPoolEnable: cnl */
+	OUT_BATCH(GEN7_MI_RS_CONTROL | GEN7_MI_RS_CONTROL_ENABLE);
+	OUT_BATCH(GEN10_3DSTATE_GATHER_POOL_ALLOC | (4 - 2));
+	OUT_BATCH(GEN10_3DSTATE_GATHER_POOL_ENABLE);
+	OUT_BATCH(0);
+	OUT_BATCH(0xfffff << 12);
+	OUT_BATCH(GEN7_MI_RS_CONTROL);
+	OUT_CMD(GEN10_3DSTATE_GATHER_POOL_ALLOC, 4);
+
+	OUT_BATCH(MI_BATCH_BUFFER_END);
+}
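A note on the recurring `| (N - 2)` pattern in the code above: the comments in the patch ("DWORD count - 2") indicate that a command header stores the total command length in dwords minus two. The sketch below illustrates that convention in isolation. It assumes OUT_CMD (defined outside this excerpt) emits a header followed by zeroed payload dwords; `demo_batch`, `demo_out` and `demo_out_cmd` are hypothetical names, and the opcode used in the usage note is a made-up placeholder, not a real command.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the generator's batch: a fixed dword array. */
struct demo_batch {
	uint32_t dw[256];
	size_t used;	/* dwords written so far */
};

/* Append one dword to the batch. */
static void demo_out(struct demo_batch *b, uint32_t dw)
{
	assert(b->used < sizeof(b->dw) / sizeof(b->dw[0]));
	b->dw[b->used++] = dw;
}

/*
 * Emit a zero-filled command that is total_dwords long, header included.
 * The length field in the header holds total_dwords - 2 (assumed
 * convention, taken from the "DWORD count - 2" comments in the patch).
 */
static void demo_out_cmd(struct demo_batch *b, uint32_t opcode,
			 uint32_t total_dwords)
{
	uint32_t i;

	assert(total_dwords >= 2);
	demo_out(b, opcode | (total_dwords - 2));
	for (i = 1; i < total_dwords; i++)
		demo_out(b, 0);
}
```

With a placeholder opcode of 0x12340000 and a length of 4, `demo_out_cmd` writes four dwords: a header with 2 in its length field, then three zeros.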
This batchbuffer is over 4096 bytes, so we need to increase the size of the
array (and the KMD has to be modified to deal with more than one page).

Notice that there are two workarounds embedded here, both applicable to all
CNL steppings.

v2: WaPSRandomCSNotDone is not A0 only (as per the latest BSpec), so update
    the comment in the code and in the commit message.

Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 lib/gen10_render.h                             |  63 +++
 tools/null_state_gen/Makefile.am               |   3 +-
 tools/null_state_gen/intel_batchbuffer.h       |   2 +-
 tools/null_state_gen/intel_null_state_gen.c    |   5 +-
 tools/null_state_gen/intel_renderstate.h       |   1 +
 tools/null_state_gen/intel_renderstate_gen10.c | 538 +++++++++++++++++++++++++
 6 files changed, 609 insertions(+), 3 deletions(-)
 create mode 100644 lib/gen10_render.h
 create mode 100644 tools/null_state_gen/intel_renderstate_gen10.c
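On the size remark in the commit message: a batch made of 32-bit dwords stops fitting in a single 4096-byte page once it exceeds 1024 dwords. A minimal sketch of that arithmetic follows; `DEMO_PAGE_SIZE` and `demo_batch_pages` are hypothetical names, not part of the patch or the KMD, and the 4096-byte page size is taken from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096u	/* page size in bytes, per the commit message */

/* Number of whole pages needed to hold a batch of n_dwords 32-bit words. */
static size_t demo_batch_pages(size_t n_dwords)
{
	size_t bytes;

	/* Guard the multiplication below against overflow. */
	assert(n_dwords <= SIZE_MAX / sizeof(uint32_t));
	bytes = n_dwords * sizeof(uint32_t);

	/* Round up to the next page boundary. */
	return (bytes + DEMO_PAGE_SIZE - 1) / DEMO_PAGE_SIZE;
}
```

For instance, `demo_batch_pages(1024)` is 1 and `demo_batch_pages(1025)` is 2, which is the point at which a single-page assumption on the consumer side would no longer hold.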