Message ID | 20230217191813.1104342-1-jonathan.cavitt@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | gen8_ppgtt: Use correct huge page manager for MTL | expand |
On 17/02/2023 19:18, Jonathan Cavitt wrote: > MTL currently uses gen8_ppgtt_insert_huge when managing huge pages. This is because > MTL reports as not supporting 64K pages, or more accurately, the system that reports > whether a platform has 64K pages reports false for MTL. This is only half correct, > as the 64K page support reporting system only cares about 64K page support for LMEM, > which MTL doesn't have. > > MTL should be using xehpsdv_ppgtt_insert_huge. However, simply changing over to > using that manager doesn't resolve the issue because MTL is expecting the virtual > address space for the page table to be flushed after initialization, so we must also > add a flush statement there. > > Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Although it looks like the hugepage mock tests are failing with this. I assume the mock device just uses some "max" gen version or so, which now triggers this path. Any ideas for that? > --- > drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c > index 4daaa6f55668..9c571185395f 100644 > --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c > +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c > @@ -570,6 +570,7 @@ xehpsdv_ppgtt_insert_huge(struct i915_address_space *vm, > } > } while (rem >= page_size && index < max); > > + drm_clflush_virt_range(vaddr, PAGE_SIZE); > vma_res->page_sizes_gtt |= page_size; > } while (iter->sg && sg_dma_len(iter->sg)); > } > @@ -707,7 +708,7 @@ static void gen8_ppgtt_insert(struct i915_address_space *vm, > struct sgt_dma iter = sgt_dma(vma_res); > > if (vma_res->bi.page_sizes.sg > I915_GTT_PAGE_SIZE) { > - if (HAS_64K_PAGES(vm->i915)) > + if (GRAPHICS_VER_FULL(vm->i915) >= IP_VER(12, 50)) > xehpsdv_ppgtt_insert_huge(vm, vma_res, &iter, cache_level, flags); > else > gen8_ppgtt_insert_huge(vm, vma_res, &iter, cache_level, flags);
On 21/02/2023 16:06, Matthew Auld wrote: > On 17/02/2023 19:18, Jonathan Cavitt wrote: >> MTL currently uses gen8_ppgtt_insert_huge when managing huge pages. >> This is because >> MTL reports as not supporting 64K pages, or more accurately, the >> system that reports >> whether a platform has 64K pages reports false for MTL. This is only >> half correct, >> as the 64K page support reporting system only cares about 64K page >> support for LMEM, >> which MTL doesn't have. >> >> MTL should be using xehpsdv_ppgtt_insert_huge. However, simply >> changing over to >> using that manager doesn't resolve the issue because MTL is expecting >> the virtual >> address space for the page table to be flushed after initialization, >> so we must also >> add a flush statement there. >> >> Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com> > Reviewed-by: Matthew Auld <matthew.auld@intel.com> Also make sure we use "drm/i915:" prefix for commit title. > > Although it looks like the hugepage mock tests are failing with this. I > assume the mock device just uses some "max" gen version or so, which now > triggers this path. Any ideas for that? > >> --- >> drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 3 ++- >> 1 file changed, 2 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c >> b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c >> index 4daaa6f55668..9c571185395f 100644 >> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c >> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c >> @@ -570,6 +570,7 @@ xehpsdv_ppgtt_insert_huge(struct >> i915_address_space *vm, >> } >> } while (rem >= page_size && index < max); >> + drm_clflush_virt_range(vaddr, PAGE_SIZE); >> vma_res->page_sizes_gtt |= page_size; >> } while (iter->sg && sg_dma_len(iter->sg)); >> } >> @@ -707,7 +708,7 @@ static void gen8_ppgtt_insert(struct >> i915_address_space *vm, >> struct sgt_dma iter = sgt_dma(vma_res); >> if (vma_res->bi.page_sizes.sg > I915_GTT_PAGE_SIZE) { >> - if (HAS_64K_PAGES(vm->i915)) >> + if (GRAPHICS_VER_FULL(vm->i915) >= IP_VER(12, 50)) >> xehpsdv_ppgtt_insert_huge(vm, vma_res, &iter, >> cache_level, flags); >> else >> gen8_ppgtt_insert_huge(vm, vma_res, &iter, cache_level, >> flags);
-----Original Message----- From: Auld, Matthew <matthew.auld@intel.com> Sent: Tuesday, February 21, 2023 8:06 AM To: Cavitt, Jonathan <jonathan.cavitt@intel.com>; intel-gfx@lists.freedesktop.org Cc: Dutt, Sudeep <sudeep.dutt@intel.com>; Siddiqui, Ayaz A <ayaz.siddiqui@intel.com> Subject: Re: [PATCH] gen8_ppgtt: Use correct huge page manager for MTL > > On 17/02/2023 19:18, Jonathan Cavitt wrote: > > MTL currently uses gen8_ppgtt_insert_huge when managing huge pages. This is because > > MTL reports as not supporting 64K pages, or more accurately, the system that reports > > whether a platform has 64K pages reports false for MTL. This is only half correct, > > as the 64K page support reporting system only cares about 64K page support for LMEM, > > which MTL doesn't have. > > > > MTL should be using xehpsdv_ppgtt_insert_huge. However, simply changing over to > > using that manager doesn't resolve the issue because MTL is expecting the virtual > > address space for the page table to be flushed after initialization, so we must also > > add a flush statement there. > > > > Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com> > Reviewed-by: Matthew Auld <matthew.auld@intel.com> > > Although it looks like the hugepage mock tests are failing with this. I > assume the mock device just uses some "max" gen version or so, which now > triggers this path. Any ideas for that? With this patch applied, multiple calls to the hugepages live selftest result in a kernel panic. If the mock tests are run immediately after the live ones, that would explain this behavior. I was informed when this was initially debugged that the error was a known IOMMU issue rather than some novel regression, though it's hard to tell if that was just hopeful optimism or not at this point. -Jonathan Cavitt > > > --- > > drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 3 ++- > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c > > index 4daaa6f55668..9c571185395f 100644 > > --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c > > +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c > > @@ -570,6 +570,7 @@ xehpsdv_ppgtt_insert_huge(struct i915_address_space *vm, > > } > > } while (rem >= page_size && index < max); > > > > + drm_clflush_virt_range(vaddr, PAGE_SIZE); > > vma_res->page_sizes_gtt |= page_size; > > } while (iter->sg && sg_dma_len(iter->sg)); > > } > > @@ -707,7 +708,7 @@ static void gen8_ppgtt_insert(struct i915_address_space *vm, > > struct sgt_dma iter = sgt_dma(vma_res); > > > > if (vma_res->bi.page_sizes.sg > I915_GTT_PAGE_SIZE) { > > - if (HAS_64K_PAGES(vm->i915)) > > + if (GRAPHICS_VER_FULL(vm->i915) >= IP_VER(12, 50)) > > xehpsdv_ppgtt_insert_huge(vm, vma_res, &iter, cache_level, flags); > > else > > gen8_ppgtt_insert_huge(vm, vma_res, &iter, cache_level, flags); >
On 21/02/2023 16:28, Cavitt, Jonathan wrote: > -----Original Message----- > From: Auld, Matthew <matthew.auld@intel.com> > Sent: Tuesday, February 21, 2023 8:06 AM > To: Cavitt, Jonathan <jonathan.cavitt@intel.com>; intel-gfx@lists.freedesktop.org > Cc: Dutt, Sudeep <sudeep.dutt@intel.com>; Siddiqui, Ayaz A <ayaz.siddiqui@intel.com> > Subject: Re: [PATCH] gen8_ppgtt: Use correct huge page manager for MTL >> >> On 17/02/2023 19:18, Jonathan Cavitt wrote: >>> MTL currently uses gen8_ppgtt_insert_huge when managing huge pages. This is because >>> MTL reports as not supporting 64K pages, or more accurately, the system that reports >>> whether a platform has 64K pages reports false for MTL. This is only half correct, >>> as the 64K page support reporting system only cares about 64K page support for LMEM, >>> which MTL doesn't have. >>> >>> MTL should be using xehpsdv_ppgtt_insert_huge. However, simply changing over to >>> using that manager doesn't resolve the issue because MTL is expecting the virtual >>> address space for the page table to be flushed after initialization, so we must also >>> add a flush statement there. >>> >>> Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com> >> Reviewed-by: Matthew Auld <matthew.auld@intel.com> >> >> Although it looks like the hugepage mock tests are failing with this. I >> assume the mock device just uses some "max" gen version or so, which now >> triggers this path. Any ideas for that? > > With this patch applied, multiple calls to the hugepages live selftest result in a kernel panic. > If the mock tests are run immediately after the live ones, that would explain this behavior. > I was informed when this was initially debugged that the error was a known IOMMU issue > rather than some novel regression, though it's hard to tell if that was just hopeful optimism > or not at this point. In the test results we now get: 6> [183.420316] i915: Running i915_gem_huge_page_mock_selftests/igt_mock_exhaust_device_supported_pages <6> [183.436978] i915: Running i915_gem_huge_page_mock_selftests/igt_mock_memory_region_huge_pages <6> [183.445777] i915: Running i915_gem_huge_page_mock_selftests/igt_mock_ppgtt_misaligned_dma <6> [183.904531] i915: Running i915_gem_huge_page_mock_selftests/igt_mock_ppgtt_huge_fill <3> [183.912658] gtt=69632, expected=4096, size=69632, single=yes <3> [183.912784] i915/i915_gem_huge_page_mock_selftests: igt_mock_ppgtt_huge_fill failed with error -22 I didn't look any deeper than that though. Note that this a just a mock/fake device. I don't think its IOMMU related. > -Jonathan Cavitt > >> >>> --- >>> drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 3 ++- >>> 1 file changed, 2 insertions(+), 1 deletion(-) >>> >>> diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c >>> index 4daaa6f55668..9c571185395f 100644 >>> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c >>> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c >>> @@ -570,6 +570,7 @@ xehpsdv_ppgtt_insert_huge(struct i915_address_space *vm, >>> } >>> } while (rem >= page_size && index < max); >>> >>> + drm_clflush_virt_range(vaddr, PAGE_SIZE); >>> vma_res->page_sizes_gtt |= page_size; >>> } while (iter->sg && sg_dma_len(iter->sg)); >>> } >>> @@ -707,7 +708,7 @@ static void gen8_ppgtt_insert(struct i915_address_space *vm, >>> struct sgt_dma iter = sgt_dma(vma_res); >>> >>> if (vma_res->bi.page_sizes.sg > I915_GTT_PAGE_SIZE) { >>> - if (HAS_64K_PAGES(vm->i915)) >>> + if (GRAPHICS_VER_FULL(vm->i915) >= IP_VER(12, 50)) >>> xehpsdv_ppgtt_insert_huge(vm, vma_res, &iter, cache_level, flags); >>> else >>> gen8_ppgtt_insert_huge(vm, vma_res, &iter, cache_level, flags); >>
-----Original Message----- From: Auld, Matthew <matthew.auld@intel.com> Sent: Tuesday, February 21, 2023 8:33 AM To: Cavitt, Jonathan <jonathan.cavitt@intel.com> Cc: Dutt, Sudeep <sudeep.dutt@intel.com>; Siddiqui, Ayaz A <ayaz.siddiqui@intel.com>; intel-gfx@lists.freedesktop.org Subject: Re: [PATCH] gen8_ppgtt: Use correct huge page manager for MTL > > On 21/02/2023 16:28, Cavitt, Jonathan wrote: > > -----Original Message----- > > From: Auld, Matthew <matthew.auld@intel.com> > > Sent: Tuesday, February 21, 2023 8:06 AM > > To: Cavitt, Jonathan <jonathan.cavitt@intel.com>; intel-gfx@lists.freedesktop.org > > Cc: Dutt, Sudeep <sudeep.dutt@intel.com>; Siddiqui, Ayaz A <ayaz.siddiqui@intel.com> > > Subject: Re: [PATCH] gen8_ppgtt: Use correct huge page manager for MTL > >> > >> On 17/02/2023 19:18, Jonathan Cavitt wrote: > >>> MTL currently uses gen8_ppgtt_insert_huge when managing huge pages. This is because > >>> MTL reports as not supporting 64K pages, or more accurately, the system that reports > >>> whether a platform has 64K pages reports false for MTL. This is only half correct, > >>> as the 64K page support reporting system only cares about 64K page support for LMEM, > >>> which MTL doesn't have. > >>> > >>> MTL should be using xehpsdv_ppgtt_insert_huge. However, simply changing over to > >>> using that manager doesn't resolve the issue because MTL is expecting the virtual > >>> address space for the page table to be flushed after initialization, so we must also > >>> add a flush statement there. > >>> > >>> Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com> > >> Reviewed-by: Matthew Auld <matthew.auld@intel.com> > >> > >> Although it looks like the hugepage mock tests are failing with this. I > >> assume the mock device just uses some "max" gen version or so, which now > >> triggers this path. Any ideas for that? > > > > With this patch applied, multiple calls to the hugepages live selftest result in a kernel panic. > > If the mock tests are run immediately after the live ones, that would explain this behavior. > > I was informed when this was initially debugged that the error was a known IOMMU issue > > rather than some novel regression, though it's hard to tell if that was just hopeful optimism > > or not at this point. > > In the test results we now get: > > 6> [183.420316] i915: Running > i915_gem_huge_page_mock_selftests/igt_mock_exhaust_device_supported_pages > <6> [183.436978] i915: Running > i915_gem_huge_page_mock_selftests/igt_mock_memory_region_huge_pages > <6> [183.445777] i915: Running > i915_gem_huge_page_mock_selftests/igt_mock_ppgtt_misaligned_dma > <6> [183.904531] i915: Running > i915_gem_huge_page_mock_selftests/igt_mock_ppgtt_huge_fill > <3> [183.912658] gtt=69632, expected=4096, size=69632, single=yes > <3> [183.912784] i915/i915_gem_huge_page_mock_selftests: > igt_mock_ppgtt_huge_fill failed with error -22 if (expected_gtt & I915_GTT_PAGE_SIZE_4K) expected_gtt &= ~I915_GTT_PAGE_SIZE_64K; I don't know why we're doing that to expected_gtt, but that seems to be the cause of the problem in this case. -Jonathan Cavitt > > I didn't look any deeper than that though. Note that this a just a > mock/fake device. I don't think its IOMMU related. > > > -Jonathan Cavitt > > > >> > >>> --- > >>> drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 3 ++- > >>> 1 file changed, 2 insertions(+), 1 deletion(-) > >>> > >>> diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c > >>> index 4daaa6f55668..9c571185395f 100644 > >>> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c > >>> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c > >>> @@ -570,6 +570,7 @@ xehpsdv_ppgtt_insert_huge(struct i915_address_space *vm, > >>> } > >>> } while (rem >= page_size && index < max); > >>> > >>> + drm_clflush_virt_range(vaddr, PAGE_SIZE); > >>> vma_res->page_sizes_gtt |= page_size; > >>> } while (iter->sg && sg_dma_len(iter->sg)); > >>> } > >>> @@ -707,7 +708,7 @@ static void gen8_ppgtt_insert(struct i915_address_space *vm, > >>> struct sgt_dma iter = sgt_dma(vma_res); > >>> > >>> if (vma_res->bi.page_sizes.sg > I915_GTT_PAGE_SIZE) { > >>> - if (HAS_64K_PAGES(vm->i915)) > >>> + if (GRAPHICS_VER_FULL(vm->i915) >= IP_VER(12, 50)) > >>> xehpsdv_ppgtt_insert_huge(vm, vma_res, &iter, cache_level, flags); > >>> else > >>> gen8_ppgtt_insert_huge(vm, vma_res, &iter, cache_level, flags); > >> >
On 21/02/2023 17:14, Cavitt, Jonathan wrote: > -----Original Message----- > From: Auld, Matthew <matthew.auld@intel.com> > Sent: Tuesday, February 21, 2023 8:33 AM > To: Cavitt, Jonathan <jonathan.cavitt@intel.com> > Cc: Dutt, Sudeep <sudeep.dutt@intel.com>; Siddiqui, Ayaz A <ayaz.siddiqui@intel.com>; intel-gfx@lists.freedesktop.org > Subject: Re: [PATCH] gen8_ppgtt: Use correct huge page manager for MTL >> >> On 21/02/2023 16:28, Cavitt, Jonathan wrote: >>> -----Original Message----- >>> From: Auld, Matthew <matthew.auld@intel.com> >>> Sent: Tuesday, February 21, 2023 8:06 AM >>> To: Cavitt, Jonathan <jonathan.cavitt@intel.com>; intel-gfx@lists.freedesktop.org >>> Cc: Dutt, Sudeep <sudeep.dutt@intel.com>; Siddiqui, Ayaz A <ayaz.siddiqui@intel.com> >>> Subject: Re: [PATCH] gen8_ppgtt: Use correct huge page manager for MTL >>>> >>>> On 17/02/2023 19:18, Jonathan Cavitt wrote: >>>>> MTL currently uses gen8_ppgtt_insert_huge when managing huge pages. This is because >>>>> MTL reports as not supporting 64K pages, or more accurately, the system that reports >>>>> whether a platform has 64K pages reports false for MTL. This is only half correct, >>>>> as the 64K page support reporting system only cares about 64K page support for LMEM, >>>>> which MTL doesn't have. >>>>> >>>>> MTL should be using xehpsdv_ppgtt_insert_huge. However, simply changing over to >>>>> using that manager doesn't resolve the issue because MTL is expecting the virtual >>>>> address space for the page table to be flushed after initialization, so we must also >>>>> add a flush statement there. >>>>> >>>>> Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com> >>>> Reviewed-by: Matthew Auld <matthew.auld@intel.com> >>>> >>>> Although it looks like the hugepage mock tests are failing with this. I >>>> assume the mock device just uses some "max" gen version or so, which now >>>> triggers this path. Any ideas for that? >>> >>> With this patch applied, multiple calls to the hugepages live selftest result in a kernel panic. >>> If the mock tests are run immediately after the live ones, that would explain this behavior. >>> I was informed when this was initially debugged that the error was a known IOMMU issue >>> rather than some novel regression, though it's hard to tell if that was just hopeful optimism >>> or not at this point. >> >> In the test results we now get: >> >> 6> [183.420316] i915: Running >> i915_gem_huge_page_mock_selftests/igt_mock_exhaust_device_supported_pages >> <6> [183.436978] i915: Running >> i915_gem_huge_page_mock_selftests/igt_mock_memory_region_huge_pages >> <6> [183.445777] i915: Running >> i915_gem_huge_page_mock_selftests/igt_mock_ppgtt_misaligned_dma >> <6> [183.904531] i915: Running >> i915_gem_huge_page_mock_selftests/igt_mock_ppgtt_huge_fill >> <3> [183.912658] gtt=69632, expected=4096, size=69632, single=yes >> <3> [183.912784] i915/i915_gem_huge_page_mock_selftests: >> igt_mock_ppgtt_huge_fill failed with error -22 > > if (expected_gtt & I915_GTT_PAGE_SIZE_4K) > expected_gtt &= ~I915_GTT_PAGE_SIZE_64K; > > I don't know why we're doing that to expected_gtt, but that seems to be the cause of the > problem in this case. I think it's due to the older huge page model, where 64K requires the entire page-table to all use 64K pages underneath (pde level hint), so if we see 4K in there somewhere then we don't expect to get back 64K GTT. But on newer HW we now have have pte level hint, so I think the above can just be removed with this patch, since that's what the mock device now uses. > -Jonathan Cavitt > >> >> I didn't look any deeper than that though. Note that this a just a >> mock/fake device. I don't think its IOMMU related. >> >>> -Jonathan Cavitt >>> >>>> >>>>> --- >>>>> drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 3 ++- >>>>> 1 file changed, 2 insertions(+), 1 deletion(-) >>>>> >>>>> diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c >>>>> index 4daaa6f55668..9c571185395f 100644 >>>>> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c >>>>> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c >>>>> @@ -570,6 +570,7 @@ xehpsdv_ppgtt_insert_huge(struct i915_address_space *vm, >>>>> } >>>>> } while (rem >= page_size && index < max); >>>>> >>>>> + drm_clflush_virt_range(vaddr, PAGE_SIZE); >>>>> vma_res->page_sizes_gtt |= page_size; >>>>> } while (iter->sg && sg_dma_len(iter->sg)); >>>>> } >>>>> @@ -707,7 +708,7 @@ static void gen8_ppgtt_insert(struct i915_address_space *vm, >>>>> struct sgt_dma iter = sgt_dma(vma_res); >>>>> >>>>> if (vma_res->bi.page_sizes.sg > I915_GTT_PAGE_SIZE) { >>>>> - if (HAS_64K_PAGES(vm->i915)) >>>>> + if (GRAPHICS_VER_FULL(vm->i915) >= IP_VER(12, 50)) >>>>> xehpsdv_ppgtt_insert_huge(vm, vma_res, &iter, cache_level, flags); >>>>> else >>>>> gen8_ppgtt_insert_huge(vm, vma_res, &iter, cache_level, flags); >>>> >>
-----Original Message----- From: Auld, Matthew <matthew.auld@intel.com> Sent: Tuesday, February 21, 2023 9:46 AM To: Cavitt, Jonathan <jonathan.cavitt@intel.com> Cc: Dutt, Sudeep <sudeep.dutt@intel.com>; Siddiqui, Ayaz A <ayaz.siddiqui@intel.com>; intel-gfx@lists.freedesktop.org Subject: Re: [PATCH] gen8_ppgtt: Use correct huge page manager for MTL > > On 21/02/2023 17:14, Cavitt, Jonathan wrote: > > -----Original Message----- > > From: Auld, Matthew <matthew.auld@intel.com> > > Sent: Tuesday, February 21, 2023 8:33 AM > > To: Cavitt, Jonathan <jonathan.cavitt@intel.com> > > Cc: Dutt, Sudeep <sudeep.dutt@intel.com>; Siddiqui, Ayaz A <ayaz.siddiqui@intel.com>; intel-gfx@lists.freedesktop.org > > Subject: Re: [PATCH] gen8_ppgtt: Use correct huge page manager for MTL > >> > >> On 21/02/2023 16:28, Cavitt, Jonathan wrote: > >>> -----Original Message----- > >>> From: Auld, Matthew <matthew.auld@intel.com> > >>> Sent: Tuesday, February 21, 2023 8:06 AM > >>> To: Cavitt, Jonathan <jonathan.cavitt@intel.com>; intel-gfx@lists.freedesktop.org > >>> Cc: Dutt, Sudeep <sudeep.dutt@intel.com>; Siddiqui, Ayaz A <ayaz.siddiqui@intel.com> > >>> Subject: Re: [PATCH] gen8_ppgtt: Use correct huge page manager for MTL > >>>> > >>>> On 17/02/2023 19:18, Jonathan Cavitt wrote: > >>>>> MTL currently uses gen8_ppgtt_insert_huge when managing huge pages. This is because > >>>>> MTL reports as not supporting 64K pages, or more accurately, the system that reports > >>>>> whether a platform has 64K pages reports false for MTL. This is only half correct, > >>>>> as the 64K page support reporting system only cares about 64K page support for LMEM, > >>>>> which MTL doesn't have. > >>>>> > >>>>> MTL should be using xehpsdv_ppgtt_insert_huge. However, simply changing over to > >>>>> using that manager doesn't resolve the issue because MTL is expecting the virtual > >>>>> address space for the page table to be flushed after initialization, so we must also > >>>>> add a flush statement there. > >>>>> > >>>>> Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com> > >>>> Reviewed-by: Matthew Auld <matthew.auld@intel.com> > >>>> > >>>> Although it looks like the hugepage mock tests are failing with this. I > >>>> assume the mock device just uses some "max" gen version or so, which now > >>>> triggers this path. Any ideas for that? > >>> > >>> With this patch applied, multiple calls to the hugepages live selftest result in a kernel panic. > >>> If the mock tests are run immediately after the live ones, that would explain this behavior. > >>> I was informed when this was initially debugged that the error was a known IOMMU issue > >>> rather than some novel regression, though it's hard to tell if that was just hopeful optimism > >>> or not at this point. > >> > >> In the test results we now get: > >> > >> 6> [183.420316] i915: Running > >> i915_gem_huge_page_mock_selftests/igt_mock_exhaust_device_supported_pages > >> <6> [183.436978] i915: Running > >> i915_gem_huge_page_mock_selftests/igt_mock_memory_region_huge_pages > >> <6> [183.445777] i915: Running > >> i915_gem_huge_page_mock_selftests/igt_mock_ppgtt_misaligned_dma > >> <6> [183.904531] i915: Running > >> i915_gem_huge_page_mock_selftests/igt_mock_ppgtt_huge_fill > >> <3> [183.912658] gtt=69632, expected=4096, size=69632, single=yes > >> <3> [183.912784] i915/i915_gem_huge_page_mock_selftests: > >> igt_mock_ppgtt_huge_fill failed with error -22 > > > > if (expected_gtt & I915_GTT_PAGE_SIZE_4K) > > expected_gtt &= ~I915_GTT_PAGE_SIZE_64K; > > > > I don't know why we're doing that to expected_gtt, but that seems to be the cause of the > > problem in this case. > > I think it's due to the older huge page model, where 64K requires the > entire page-table to all use 64K pages underneath (pde level hint), so > if we see 4K in there somewhere then we don't expect to get back 64K > GTT. But on newer HW we now have have pte level hint, so I think the > above can just be removed with this patch, since that's what the mock > device now uses. Seems right. I guess that would be... what? Is it: A. Platform specific? I.E. we need s generation check in the selftest to proceed, such as the following: if (expected_gtt & I915_GTT_PAGE_SIZE_4K && GRAPHICS_VER(i915) >= 12) B. Systems specific? I.E. we have a special check for this functionality such as: if (expected_gtt & I915_GTT_PAGE_SIZE_4K && has_pte_level_hint(i915)) C. The new norm. I.E. we can just remove this line from the test and everything will work out fine. -Jonathan Cavitt > > > -Jonathan Cavitt > > > >> > >> I didn't look any deeper than that though. Note that this a just a > >> mock/fake device. I don't think its IOMMU related. > >> > >>> -Jonathan Cavitt > >>> > >>>> > >>>>> --- > >>>>> drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 3 ++- > >>>>> 1 file changed, 2 insertions(+), 1 deletion(-) > >>>>> > >>>>> diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c > >>>>> index 4daaa6f55668..9c571185395f 100644 > >>>>> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c > >>>>> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c > >>>>> @@ -570,6 +570,7 @@ xehpsdv_ppgtt_insert_huge(struct i915_address_space *vm, > >>>>> } > >>>>> } while (rem >= page_size && index < max); > >>>>> > >>>>> + drm_clflush_virt_range(vaddr, PAGE_SIZE); > >>>>> vma_res->page_sizes_gtt |= page_size; > >>>>> } while (iter->sg && sg_dma_len(iter->sg)); > >>>>> } > >>>>> @@ -707,7 +708,7 @@ static void gen8_ppgtt_insert(struct i915_address_space *vm, > >>>>> struct sgt_dma iter = sgt_dma(vma_res); > >>>>> > >>>>> if (vma_res->bi.page_sizes.sg > I915_GTT_PAGE_SIZE) { > >>>>> - if (HAS_64K_PAGES(vm->i915)) > >>>>> + if (GRAPHICS_VER_FULL(vm->i915) >= IP_VER(12, 50)) > >>>>> xehpsdv_ppgtt_insert_huge(vm, vma_res, &iter, cache_level, flags); > >>>>> else > >>>>> gen8_ppgtt_insert_huge(vm, vma_res, &iter, cache_level, flags); > >>>> > >> >
On 21/02/2023 18:34, Cavitt, Jonathan wrote: > -----Original Message----- > From: Auld, Matthew <matthew.auld@intel.com> > Sent: Tuesday, February 21, 2023 9:46 AM > To: Cavitt, Jonathan <jonathan.cavitt@intel.com> > Cc: Dutt, Sudeep <sudeep.dutt@intel.com>; Siddiqui, Ayaz A <ayaz.siddiqui@intel.com>; intel-gfx@lists.freedesktop.org > Subject: Re: [PATCH] gen8_ppgtt: Use correct huge page manager for MTL >> >> On 21/02/2023 17:14, Cavitt, Jonathan wrote: >>> -----Original Message----- >>> From: Auld, Matthew <matthew.auld@intel.com> >>> Sent: Tuesday, February 21, 2023 8:33 AM >>> To: Cavitt, Jonathan <jonathan.cavitt@intel.com> >>> Cc: Dutt, Sudeep <sudeep.dutt@intel.com>; Siddiqui, Ayaz A <ayaz.siddiqui@intel.com>; intel-gfx@lists.freedesktop.org >>> Subject: Re: [PATCH] gen8_ppgtt: Use correct huge page manager for MTL >>>> >>>> On 21/02/2023 16:28, Cavitt, Jonathan wrote: >>>>> -----Original Message----- >>>>> From: Auld, Matthew <matthew.auld@intel.com> >>>>> Sent: Tuesday, February 21, 2023 8:06 AM >>>>> To: Cavitt, Jonathan <jonathan.cavitt@intel.com>; intel-gfx@lists.freedesktop.org >>>>> Cc: Dutt, Sudeep <sudeep.dutt@intel.com>; Siddiqui, Ayaz A <ayaz.siddiqui@intel.com> >>>>> Subject: Re: [PATCH] gen8_ppgtt: Use correct huge page manager for MTL >>>>>> >>>>>> On 17/02/2023 19:18, Jonathan Cavitt wrote: >>>>>>> MTL currently uses gen8_ppgtt_insert_huge when managing huge pages. This is because >>>>>>> MTL reports as not supporting 64K pages, or more accurately, the system that reports >>>>>>> whether a platform has 64K pages reports false for MTL. This is only half correct, >>>>>>> as the 64K page support reporting system only cares about 64K page support for LMEM, >>>>>>> which MTL doesn't have. >>>>>>> >>>>>>> MTL should be using xehpsdv_ppgtt_insert_huge. However, simply changing over to >>>>>>> using that manager doesn't resolve the issue because MTL is expecting the virtual >>>>>>> address space for the page table to be flushed after initialization, so we must also >>>>>>> add a flush statement there. >>>>>>> >>>>>>> Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com> >>>>>> Reviewed-by: Matthew Auld <matthew.auld@intel.com> >>>>>> >>>>>> Although it looks like the hugepage mock tests are failing with this. I >>>>>> assume the mock device just uses some "max" gen version or so, which now >>>>>> triggers this path. Any ideas for that? >>>>> >>>>> With this patch applied, multiple calls to the hugepages live selftest result in a kernel panic. >>>>> If the mock tests are run immediately after the live ones, that would explain this behavior. >>>>> I was informed when this was initially debugged that the error was a known IOMMU issue >>>>> rather than some novel regression, though it's hard to tell if that was just hopeful optimism >>>>> or not at this point. >>>> >>>> In the test results we now get: >>>> >>>> 6> [183.420316] i915: Running >>>> i915_gem_huge_page_mock_selftests/igt_mock_exhaust_device_supported_pages >>>> <6> [183.436978] i915: Running >>>> i915_gem_huge_page_mock_selftests/igt_mock_memory_region_huge_pages >>>> <6> [183.445777] i915: Running >>>> i915_gem_huge_page_mock_selftests/igt_mock_ppgtt_misaligned_dma >>>> <6> [183.904531] i915: Running >>>> i915_gem_huge_page_mock_selftests/igt_mock_ppgtt_huge_fill >>>> <3> [183.912658] gtt=69632, expected=4096, size=69632, single=yes >>>> <3> [183.912784] i915/i915_gem_huge_page_mock_selftests: >>>> igt_mock_ppgtt_huge_fill failed with error -22 >>> >>> if (expected_gtt & I915_GTT_PAGE_SIZE_4K) >>> expected_gtt &= ~I915_GTT_PAGE_SIZE_64K; >>> >>> I don't know why we're doing that to expected_gtt, but that seems to be the cause of the >>> problem in this case. >> >> I think it's due to the older huge page model, where 64K requires the >> entire page-table to all use 64K pages underneath (pde level hint), so >> if we see 4K in there somewhere then we don't expect to get back 64K >> GTT. But on newer HW we now have have pte level hint, so I think the >> above can just be removed with this patch, since that's what the mock >> device now uses. > > Seems right. I guess that would be... what? Is it: > A. Platform specific? I.E. we need s generation check in the selftest to proceed, such as the following: > > if (expected_gtt & I915_GTT_PAGE_SIZE_4K && GRAPHICS_VER(i915) >= 12) > > B. Systems specific? I.E. we have a special check for this functionality such as: > > if (expected_gtt & I915_GTT_PAGE_SIZE_4K && has_pte_level_hint(i915)) > > C. The new norm. I.E. we can just remove this line from the test and everything will work out fine. The mock device will always use the max graphics version: RUNTIME_INFO(i915)->graphics.ip.ver = -1; So I think option C. > > -Jonathan Cavitt > >> >>> -Jonathan Cavitt >>> >>>> >>>> I didn't look any deeper than that though. Note that this a just a >>>> mock/fake device. I don't think its IOMMU related. >>>> >>>>> -Jonathan Cavitt >>>>> >>>>>> >>>>>>> --- >>>>>>> drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 3 ++- >>>>>>> 1 file changed, 2 insertions(+), 1 deletion(-) >>>>>>> >>>>>>> diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c >>>>>>> index 4daaa6f55668..9c571185395f 100644 >>>>>>> --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c >>>>>>> +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c >>>>>>> @@ -570,6 +570,7 @@ xehpsdv_ppgtt_insert_huge(struct i915_address_space *vm, >>>>>>> } >>>>>>> } while (rem >= page_size && index < max); >>>>>>> >>>>>>> + drm_clflush_virt_range(vaddr, PAGE_SIZE); >>>>>>> vma_res->page_sizes_gtt |= page_size; >>>>>>> } while (iter->sg && sg_dma_len(iter->sg)); >>>>>>> } >>>>>>> @@ -707,7 +708,7 @@ static void gen8_ppgtt_insert(struct i915_address_space *vm, >>>>>>> struct sgt_dma iter = sgt_dma(vma_res); >>>>>>> >>>>>>> if (vma_res->bi.page_sizes.sg > I915_GTT_PAGE_SIZE) { >>>>>>> - if (HAS_64K_PAGES(vm->i915)) >>>>>>> + if (GRAPHICS_VER_FULL(vm->i915) >= IP_VER(12, 50)) >>>>>>> xehpsdv_ppgtt_insert_huge(vm, vma_res, &iter, cache_level, flags); >>>>>>> else >>>>>>> gen8_ppgtt_insert_huge(vm, vma_res, &iter, cache_level, flags); >>>>>> >>>> >>
diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c index 4daaa6f55668..9c571185395f 100644 --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c @@ -570,6 +570,7 @@ xehpsdv_ppgtt_insert_huge(struct i915_address_space *vm, } } while (rem >= page_size && index < max); + drm_clflush_virt_range(vaddr, PAGE_SIZE); vma_res->page_sizes_gtt |= page_size; } while (iter->sg && sg_dma_len(iter->sg)); } @@ -707,7 +708,7 @@ static void gen8_ppgtt_insert(struct i915_address_space *vm, struct sgt_dma iter = sgt_dma(vma_res); if (vma_res->bi.page_sizes.sg > I915_GTT_PAGE_SIZE) { - if (HAS_64K_PAGES(vm->i915)) + if (GRAPHICS_VER_FULL(vm->i915) >= IP_VER(12, 50)) xehpsdv_ppgtt_insert_huge(vm, vma_res, &iter, cache_level, flags); else gen8_ppgtt_insert_huge(vm, vma_res, &iter, cache_level, flags);
MTL currently uses gen8_ppgtt_insert_huge when managing huge pages. This is because MTL reports as not supporting 64K pages, or more accurately, the system that reports whether a platform has 64K pages reports false for MTL. This is only half correct, as the 64K page support reporting system only cares about 64K page support for LMEM, which MTL doesn't have. MTL should be using xehpsdv_ppgtt_insert_huge. However, simply changing over to using that manager doesn't resolve the issue because MTL is expecting the virtual address space for the page table to be flushed after initialization, so we must also add a flush statement there. Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com> --- drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)