Message ID | 1425473717-20059-1-git-send-email-mika.kuoppala@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Tested-By: PRC QA PRTS (Patch Regression Test System Contact: shuang.he@intel.com)
Task id: 5884
-------------------------------------Summary-------------------------------------
Platform Delta drm-intel-nightly Series Applied
PNV -6 278/278 272/278
ILK -1 308/308 307/308
SNB -1 284/284 283/284
IVB 380/380 380/380
BYT 294/294 294/294
HSW 387/387 387/387
BDW -1 316/316 315/316
-------------------------------------Detailed-------------------------------------
Platform Test drm-intel-nightly Series Applied
*PNV igt_gem_fence_thrash_bo-write-verify-none NRUN(1)PASS(7) FAIL(1)PASS(1)
*PNV igt_gem_fence_thrash_bo-write-verify-x PASS(8) FAIL(1)PASS(1)
*PNV igt_gem_fence_thrash_bo-write-verify-y NO_RESULT(1)PASS(7) FAIL(1)PASS(1)
*PNV igt_gem_userptr_blits_minor-normal-sync PASS(5) DMESG_WARN(1)PASS(1)
PNV igt_gen3_render_mixed_blits FAIL(10)PASS(9) FAIL(2)
PNV igt_gem_fence_thrash_bo-write-verify-threaded-none FAIL(2)CRASH(5)PASS(7) CRASH(2)
ILK igt_gem_unfence_active_buffers DMESG_WARN(1)PASS(3) DMESG_WARN(1)PASS(1)
*SNB igt_gem_exec_params_sol-reset-not-gen7 PASS(2) DMESG_WARN(1)PASS(1)
*BDW igt_gem_gtt_hog PASS(20) DMESG_WARN(1)PASS(1)
Note: You need to pay more attention to line start with '*'
On Wed, Mar 04, 2015 at 02:55:17PM +0200, Mika Kuoppala wrote: > If the requested size is less than what the full range > of pdps can address, we end up setting pdps for only the > requested area. > > The logical context however needs all pdp entries to be valid. > Prior to commit 06fda602dbca ("drm/i915: Create page table allocators") > we have been writing pdp entries with dma address of zero instead > of valid pdps. This is supposedly bad even if those pdps are not > addressed. > > As commit 06fda602dbca ("drm/i915: Create page table allocators") > introduced more dynamic structure for pdps, we ended up oopsing > when we populated the lrc context. Analyzing this oops revealed > the fact that we have not been writing valid pdps with bsw, as > it is doing the ppgtt init with 2GB limit in some cases. > > We should do the right thing and setup the non addressable part > pdps/pde/pte to scratch page through the minimal structure by > having just pdp with pde entries pointing to same page with > pte entries pointing to scratch page. > > But instead of going through that trouble, setup all the pdps > through individual pd pages and pt entries, even for non > addressable parts. And let the clear range point them to scratch > page. This way we populate the lrc with valid pdps and wait > for dynamic page allocation work to land, and do the heavy lifting > for truncating page table tree according to usage. > > The regression of oopsing in init was introduced by > commit 06fda602dbca ("drm/i915: Create page table allocators") > > v2: Clear the range for the unused part also (Ville) > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89350 > Cc: Michel Thierry <michel.thierry@intel.com> > Cc: Ben Widawsky <benjamin.widawsky@intel.com> > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> > Tested-by: Valtteri Rantala <valtteri.rantala@intel.com> > Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Seems sane enough assuming we're willing to pay the cost. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> And then we could also throw another patch on top to actually increase the PPGTT size to 4GiB since there's no extra cost anymore :) Well, assuming everything else is 4GiB safe. But I guess it ought to be if BDW machines have already been running with 4GiB GGTT/PPGTT. > --- > drivers/gpu/drm/i915/i915_gem_gtt.c | 21 +++++++++++++++------ > 1 file changed, 15 insertions(+), 6 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c > index bd95776..1aa2a2c 100644 > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c > @@ -716,15 +716,19 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt, uint64_t size) > if (size % (1<<30)) > DRM_INFO("Pages will be wasted unless GTT size (%llu) is divisible by 1GB\n", size); > > - /* 1. Do all our allocations for page directories and page tables. */ > - ret = gen8_ppgtt_alloc(ppgtt, max_pdp); > + /* 1. Do all our allocations for page directories and page tables. > + * We allocate more than was asked so that we can point the unused parts > + * to valid entries that point to scratch page. Dynamic page tables > + * will fix this eventually. > + */ > + ret = gen8_ppgtt_alloc(ppgtt, GEN8_LEGACY_PDPES); > if (ret) > return ret; > > /* > * 2. Create DMA mappings for the page directories and page tables. > */ > - for (i = 0; i < max_pdp; i++) { > + for (i = 0; i < GEN8_LEGACY_PDPES; i++) { > ret = gen8_ppgtt_setup_page_directories(ppgtt, i); > if (ret) > goto bail; > @@ -744,7 +748,7 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt, uint64_t size) > * plugged in correctly. So we do that now/here. For aliasing PPGTT, we > * will never need to touch the PDEs again. > */ > - for (i = 0; i < max_pdp; i++) { > + for (i = 0; i < GEN8_LEGACY_PDPES; i++) { > struct i915_page_directory_entry *pd = ppgtt->pdp.page_directory[i]; > gen8_ppgtt_pde_t *pd_vaddr; > pd_vaddr = kmap_atomic(ppgtt->pdp.page_directory[i]->page); > @@ -764,9 +768,14 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt, uint64_t size) > ppgtt->base.insert_entries = gen8_ppgtt_insert_entries; > ppgtt->base.cleanup = gen8_ppgtt_cleanup; > ppgtt->base.start = 0; > - ppgtt->base.total = ppgtt->num_pd_entries * GEN8_PTES_PER_PAGE * PAGE_SIZE; > > - ppgtt->base.clear_range(&ppgtt->base, 0, ppgtt->base.total, true); > + /* This is the area that we advertise as usable for the caller */ > + ppgtt->base.total = max_pdp * GEN8_PDES_PER_PAGE * GEN8_PTES_PER_PAGE * PAGE_SIZE; > + > + /* Set all ptes to a valid scratch page. Also above requested space */ > + ppgtt->base.clear_range(&ppgtt->base, 0, > + ppgtt->num_pd_pages * GEN8_PTES_PER_PAGE * PAGE_SIZE, > + true); > > DRM_DEBUG_DRIVER("Allocated %d pages for page directories (%d wasted)\n", > ppgtt->num_pd_pages, ppgtt->num_pd_pages - max_pdp); > -- > 1.9.1
On Wed, Mar 04, 2015 at 09:14:39PM +0200, Ville Syrjälä wrote: > On Wed, Mar 04, 2015 at 02:55:17PM +0200, Mika Kuoppala wrote: > > If the requested size is less than what the full range > > of pdps can address, we end up setting pdps for only the > > requested area. > > > > The logical context however needs all pdp entries to be valid. > > Prior to commit 06fda602dbca ("drm/i915: Create page table allocators") > > we have been writing pdp entries with dma address of zero instead > > of valid pdps. This is supposedly bad even if those pdps are not > > addressed. > > > > As commit 06fda602dbca ("drm/i915: Create page table allocators") > > introduced more dynamic structure for pdps, we ended up oopsing > > when we populated the lrc context. Analyzing this oops revealed > > the fact that we have not been writing valid pdps with bsw, as > > it is doing the ppgtt init with 2GB limit in some cases. > > > > We should do the right thing and setup the non addressable part > > pdps/pde/pte to scratch page through the minimal structure by > > having just pdp with pde entries pointing to same page with > > pte entries pointing to scratch page. > > > > But instead of going through that trouble, setup all the pdps > > through individual pd pages and pt entries, even for non > > addressable parts. And let the clear range point them to scratch > > page. This way we populate the lrc with valid pdps and wait > > for dynamic page allocation work to land, and do the heavy lifting > > for truncating page table tree according to usage. > > > > The regression of oopsing in init was introduced by > > commit 06fda602dbca ("drm/i915: Create page table allocators") > > > > v2: Clear the range for the unused part also (Ville) > > > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89350 > > Cc: Michel Thierry <michel.thierry@intel.com> > > Cc: Ben Widawsky <benjamin.widawsky@intel.com> > > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> > > Tested-by: Valtteri Rantala <valtteri.rantala@intel.com> > > Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> > > Seems sane enough assuming we're willing to pay the cost. > Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> > > And then we could also throw another patch on top to actually increase the > PPGTT size to 4GiB since there's no extra cost anymore :) Well, assuming > everything else is 4GiB safe. But I guess it ought to be if BDW machines > have already been running with 4GiB GGTT/PPGTT. Queued for -next, thanks for the patch. -Daniel
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index bd95776..1aa2a2c 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -716,15 +716,19 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt, uint64_t size) if (size % (1<<30)) DRM_INFO("Pages will be wasted unless GTT size (%llu) is divisible by 1GB\n", size); - /* 1. Do all our allocations for page directories and page tables. */ - ret = gen8_ppgtt_alloc(ppgtt, max_pdp); + /* 1. Do all our allocations for page directories and page tables. + * We allocate more than was asked so that we can point the unused parts + * to valid entries that point to scratch page. Dynamic page tables + * will fix this eventually. + */ + ret = gen8_ppgtt_alloc(ppgtt, GEN8_LEGACY_PDPES); if (ret) return ret; /* * 2. Create DMA mappings for the page directories and page tables. */ - for (i = 0; i < max_pdp; i++) { + for (i = 0; i < GEN8_LEGACY_PDPES; i++) { ret = gen8_ppgtt_setup_page_directories(ppgtt, i); if (ret) goto bail; @@ -744,7 +748,7 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt, uint64_t size) * plugged in correctly. So we do that now/here. For aliasing PPGTT, we * will never need to touch the PDEs again. */ - for (i = 0; i < max_pdp; i++) { + for (i = 0; i < GEN8_LEGACY_PDPES; i++) { struct i915_page_directory_entry *pd = ppgtt->pdp.page_directory[i]; gen8_ppgtt_pde_t *pd_vaddr; pd_vaddr = kmap_atomic(ppgtt->pdp.page_directory[i]->page); @@ -764,9 +768,14 @@ static int gen8_ppgtt_init(struct i915_hw_ppgtt *ppgtt, uint64_t size) ppgtt->base.insert_entries = gen8_ppgtt_insert_entries; ppgtt->base.cleanup = gen8_ppgtt_cleanup; ppgtt->base.start = 0; - ppgtt->base.total = ppgtt->num_pd_entries * GEN8_PTES_PER_PAGE * PAGE_SIZE; - ppgtt->base.clear_range(&ppgtt->base, 0, ppgtt->base.total, true); + /* This is the area that we advertise as usable for the caller */ + ppgtt->base.total = max_pdp * GEN8_PDES_PER_PAGE * GEN8_PTES_PER_PAGE * PAGE_SIZE; + + /* Set all ptes to a valid scratch page. Also above requested space */ + ppgtt->base.clear_range(&ppgtt->base, 0, + ppgtt->num_pd_pages * GEN8_PTES_PER_PAGE * PAGE_SIZE, + true); DRM_DEBUG_DRIVER("Allocated %d pages for page directories (%d wasted)\n", ppgtt->num_pd_pages, ppgtt->num_pd_pages - max_pdp);