Message ID | 20200511160803.15407-1-mika.kuoppala@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Series | drm/i915: Force pte cacheline to main memory |
Quoting Mika Kuoppala (2020-05-11 17:08:03)
> We have problems of tgl not seeing a valid pte entry
> when iommu is enabled. Add heavy handed flushing
> of entry modification by flushing the cpu, cacheline
> and then wcb. This forces the pte out to main memory
> past this point regardless of promises of coherency.
>
> This is an evolution of an experimental patch from
> Chris Wilson of adding wmb for coherent partners,
> by adding a clflush to force the cache->memory step.
>
> Testcase: igt/gem_exec_fence/parallel
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

Not only does it help tgl, but it is also helping with a coherency
problem on Braswell. We see similar problems on gen9 and icl, and I have
a trybot run to see if it helps with those.

As it is helping with multiple platforms and diverse symptoms, even if
we can't explain why it helps, it is. That makes it prudent to apply to
improve the baseline and work from there.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris
Quoting Chris Wilson (2020-05-11 17:13:52)
> Quoting Mika Kuoppala (2020-05-11 17:08:03)
> > We have problems of tgl not seeing a valid pte entry
> > when iommu is enabled. Add heavy handed flushing
> > of entry modification by flushing the cpu, cacheline
> > and then wcb. This forces the pte out to main memory
> > past this point regardless of promises of coherency.
> >
> > This is an evolution of an experimental patch from
> > Chris Wilson of adding wmb for coherent partners,
> > by adding a clflush to force the cache->memory step.
> >
> > Testcase: igt/gem_exec_fence/parallel
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>
> Not only does it help tgl, but it is also helping with a coherency
> problem on Braswell. We see similar problems on gen9 and icl, and I have
> a trybot run to see if it helps with those.

It should be noted that Braswell is using WC kmaps of the PTE, so this
should not even be necessary... But if we drop the WC and keep the
clflush, it fails. Just to add to the confusion.
-Chris
diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index 94e746af8926..6b13408b0e38 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -389,6 +389,15 @@ static int gen8_ppgtt_alloc(struct i915_address_space *vm,
 	return err;
 }
 
+static __always_inline inline void
+write_pte(gen8_pte_t * const pte, const gen8_pte_t val)
+{
+	*pte = val;
+	wmb(); /* cpu to cache */
+	clflush((void *)pte); /* cache to memory */
+	wmb(); /* visible to all */
+}
+
 static __always_inline u64
 gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
 		      struct i915_page_directory *pdp,
@@ -405,7 +414,8 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
 	vaddr = kmap_atomic_px(i915_pt_entry(pd, gen8_pd_index(idx, 1)));
 	do {
 		GEM_BUG_ON(iter->sg->length < I915_GTT_PAGE_SIZE);
-		vaddr[gen8_pd_index(idx, 0)] = pte_encode | iter->dma;
+		write_pte(&vaddr[gen8_pd_index(idx, 0)],
+			  pte_encode | iter->dma);
 
 		iter->dma += I915_GTT_PAGE_SIZE;
 		if (iter->dma >= iter->max) {
@@ -487,7 +497,7 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
 
 		do {
 			GEM_BUG_ON(iter->sg->length < page_size);
-			vaddr[index++] = encode | iter->dma;
+			write_pte(&vaddr[index++], encode | iter->dma);
 
 			start += page_size;
 			iter->dma += page_size;
We have problems of tgl not seeing a valid pte entry
when iommu is enabled. Add heavy handed flushing
of entry modification by flushing the cpu, cacheline
and then wcb. This forces the pte out to main memory
past this point regardless of promises of coherency.

This is an evolution of an experimental patch from
Chris Wilson of adding wmb for coherent partners,
by adding a clflush to force the cache->memory step.

Testcase: igt/gem_exec_fence/parallel
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)
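
For readers without a kernel tree at hand, the store/flush/fence ordering described in the commit message can be approximated in userspace. This is only a rough sketch, not the driver code: pte_t, write_pte_flushed() and fill_ptes() are hypothetical names invented for illustration, _mm_sfence() is assumed to be a reasonable stand-in for the kernel's wmb() on x86-64, and the non-x86 fallback is a plain fence with no cacheline flush at all. It says nothing about why the flush helps on the affected platforms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__x86_64__)
#include <emmintrin.h> /* _mm_clflush(), _mm_sfence() */
#endif

typedef uint64_t pte_t; /* hypothetical stand-in for gen8_pte_t */

/* Store one entry, then try to force its cacheline out to memory. */
static inline void write_pte_flushed(pte_t *pte, pte_t val)
{
	*pte = val;
#if defined(__x86_64__)
	_mm_sfence();     /* assumed analogue of the first wmb() */
	_mm_clflush(pte); /* flush the cacheline holding the entry */
	_mm_sfence();     /* assumed analogue of the second wmb() */
#else
	/* Assumption: no clflush equivalent here, fence only. */
	__atomic_thread_fence(__ATOMIC_SEQ_CST);
#endif
}

/*
 * Fill n consecutive entries with (dma | flags), advancing dma by
 * page_size each time. Returns the number of entries written.
 */
static inline size_t fill_ptes(pte_t *table, size_t n, uint64_t dma,
			       pte_t flags, uint64_t page_size)
{
	size_t i;

	assert(table != NULL);
	for (i = 0; i < n; i++) {
		write_pte_flushed(&table[i], flags | dma);
		dma += page_size;
	}
	return i;
}
```

A caller would pass an ordinary array as the table. Flushing once per entry, as above, mirrors the heavy handed approach of the patch; entries that share a cacheline get flushed repeatedly, which is the cost being accepted.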