Message ID | 20220222145206.76118-1-balasubramani.vivekanandan@intel.com (mailing list archive) |
---|---|
Series | drm/i915: Use the memcpy_from_wc function from drm | |
On 22/02/2022 15:51, Balasubramani Vivekanandan wrote:
> drm_memcpy_from_wc() performs a fast copy from WC memory using
> non-temporal instructions. There are now two similar implementations of
> this function: drm_memcpy_from_wc() in drm_cache.c and
> i915_memcpy_from_wc() in i915/i915_memcpy.c.
> drm_memcpy_from_wc() is the more recent addition, through the series
> https://patchwork.freedesktop.org/patch/436276/?series=90681&rev=6
>
> The goal of this patch series is to change all users of
> i915_memcpy_from_wc() to drm_memcpy_from_wc(), have a common
> implementation in drm, and eventually remove the copy from i915.
>
> Another benefit of using the memcpy functions from drm is that
> drm_memcpy_from_wc() is available for non-x86 architectures.
> i915_memcpy_from_wc() is implemented only for x86 and prevents building
> i915 for ARM64.
> drm_memcpy_from_wc() does a fast copy using non-temporal instructions on
> x86 and falls back to the memcpy() family of functions on other
> architectures.
>
> Another major difference is that, unlike i915_memcpy_from_wc(),
> drm_memcpy_from_wc() will not fail if the passed address argument is not
> aligned as required by the non-temporal load instructions or if the
> platform lacks support for those instructions (non-temporal load
> instructions are provided by the SSE4.1 instruction set extension).
> Instead, drm_memcpy_from_wc() continues with fallback functions to
> complete the copy.
> This relieves the caller from checking the return value of
> i915_memcpy_from_wc() and explicitly using a fallback.
>
> A follow-up series will be created to remove the memcpy_from_wc functions
> from i915 once the dependency is completely removed.

Overall the series looks good to me, but I think you can add another patch to remove i915_memcpy_from_wc(), as I don't see any other usages left after this series. Maybe I am missing something?
Regards,
Nirmoy

> Cc: Jani Nikula <jani.nikula@intel.com>
> Cc: Lucas De Marchi <lucas.demarchi@intel.com>
> Cc: David Airlie <airlied@linux.ie>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: Chris Wilson <chris.p.wilson@intel.com>
> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
>
> Balasubramani Vivekanandan (7):
>   drm: Relax alignment constraint for destination address
>   drm: Add drm_memcpy_from_wc() variant which accepts destination
>     address
>   drm/i915: use the memcpy_from_wc call from the drm
>   drm/i915/guc: use the memcpy_from_wc call from the drm
>   drm/i915/selftests: use the memcpy_from_wc call from the drm
>   drm/i915/gt: Avoid direct dereferencing of io memory
>   drm/i915: Avoid dereferencing io mapped memory
>
>  drivers/gpu/drm/drm_cache.c                   | 98 +++++++++++++++++--
>  drivers/gpu/drm/i915/gem/i915_gem_object.c    |  8 +-
>  drivers/gpu/drm/i915/gt/selftest_reset.c      | 21 ++--
>  drivers/gpu/drm/i915/gt/uc/intel_guc_log.c    | 11 ++-
>  drivers/gpu/drm/i915/i915_gpu_error.c         | 45 +++++----
>  .../drm/i915/selftests/intel_memory_region.c  |  8 +-
>  include/drm/drm_cache.h                       |  3 +
>  7 files changed, 148 insertions(+), 46 deletions(-)
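To make the caller-side difference described in the cover letter concrete, here is a minimal userspace sketch in C. It is not the kernel code: `strict_copy_from_wc()` and `relaxed_copy_from_wc()` are hypothetical names that only model the behaviour described for i915_memcpy_from_wc() (may refuse and return false) and drm_memcpy_from_wc() (always completes the copy), and a plain memcpy() stands in for the non-temporal loads. The 16-byte alignment value is an assumption for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed alignment needed by the (simulated) fast path. */
#define FAST_ALIGN 16u

static bool is_aligned(const void *p)
{
	return ((uintptr_t)p % FAST_ALIGN) == 0;
}

/*
 * Hypothetical model of the "strict" style: the copy is refused and
 * false is returned when the fast path cannot be used, so every caller
 * has to check the result and provide its own fallback.
 */
bool strict_copy_from_wc(void *dst, const void *src, size_t len)
{
	if (!is_aligned(dst) || !is_aligned(src) || (len % FAST_ALIGN) != 0)
		return false;

	memcpy(dst, src, len); /* placeholder for non-temporal loads */
	return true;
}

/*
 * Hypothetical model of the "relaxed" style: the copy always completes,
 * falling back to memcpy() internally when the fast path is unavailable.
 */
void relaxed_copy_from_wc(void *dst, const void *src, size_t len)
{
	if (!strict_copy_from_wc(dst, src, len))
		memcpy(dst, src, len);
}
```

With the strict variant every call site needs an `if (!strict_copy_from_wc(...)) memcpy(...)` pattern; with the relaxed variant that check lives in one place.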
On 23.02.2022 10:02, Das, Nirmoy wrote:
> Overall the series looks good to me but I think you can add another patch to
> remove i915_memcpy_from_wc() as I don't see any other usages left after this
> series, maybe I am missing something?

I have changed all users of i915_memcpy_from_wc() to the drm function. But there is another function, i915_unaligned_memcpy_from_wc(), in i915_memcpy.c that blocks completely eliminating the i915_memcpy.c file from i915.
This function accepts an unaligned source address, does the fast copy only for the aligned region of memory, and copies the remaining part using the memcpy function.

Either I can also move i915_unaligned_memcpy_from_wc() to drm, but since it is more of a platform-specific handling, I am not sure it makes sense to keep it in drm. Or I have to retain i915_unaligned_memcpy_from_wc() inside i915 and refactor the function to use drm_memcpy_from_wc() instead of __memcpy_ntdqu().

But before making more changes, I wanted feedback on the current change, so I decided to go ahead and create the series for review.
Regards,
Bala
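The split that i915_unaligned_memcpy_from_wc() is described as doing (ordinary copy for the unaligned part, fast copy for the aligned region) can be sketched in userspace C roughly as follows. This is an illustration under stated assumptions, not the i915 implementation: `unaligned_copy_from_wc()` and `fast_copy_aligned()` are hypothetical names, memcpy() stands in for the non-temporal path, the 16-byte granularity is assumed, and the handling of the tail bytes is a guess based on the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed granularity of the (simulated) fast path. */
#define FAST_ALIGN 16u

/* Placeholder for the accelerated copy: source aligned, length a multiple of 16. */
static void fast_copy_aligned(void *dst, const void *src, size_t len)
{
	assert(((uintptr_t)src % FAST_ALIGN) == 0);
	assert((len % FAST_ALIGN) == 0);
	memcpy(dst, src, len);
}

/*
 * Hypothetical sketch: copy from a possibly unaligned source by using
 * memcpy() for the unaligned head and the short tail, and the fast path
 * only for the aligned middle region.
 */
void unaligned_copy_from_wc(void *dst, const void *src, size_t len)
{
	unsigned char *d = dst;
	const unsigned char *s = src;
	size_t misalign = (uintptr_t)s % FAST_ALIGN;
	size_t head = misalign ? FAST_ALIGN - misalign : 0;
	size_t body;

	/* Head: bytes up to the next aligned source address. */
	if (head > len)
		head = len;
	memcpy(d, s, head);
	d += head;
	s += head;
	len -= head;

	/* Body: largest multiple of FAST_ALIGN that fits. */
	body = len - (len % FAST_ALIGN);
	if (body)
		fast_copy_aligned(d, s, body);

	/* Tail: whatever is left after the aligned region. */
	memcpy(d + body, s + body, len - body);
}
```

Refactoring along the lines suggested in the mail would amount to replacing the body step with a call into the common drm helper, leaving only the head and tail handling in the driver.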
On 23/02/2022 12:08, Balasubramani Vivekanandan wrote:
> On 23.02.2022 10:02, Das, Nirmoy wrote:
>> Overall the series looks good to me but I think you can add another patch to
>> remove i915_memcpy_from_wc() as I don't see any other usages left after this
>> series, maybe I am missing something?
> I have changed all users of i915_memcpy_from_wc() to the drm function. But
> there is another function, i915_unaligned_memcpy_from_wc(), in
> i915_memcpy.c that blocks completely eliminating the i915_memcpy.c
> file from i915.
> This function accepts an unaligned source address, does the fast copy only
> for the aligned region of memory, and copies the remaining part using the
> memcpy function.
> Either I can also move i915_unaligned_memcpy_from_wc() to drm, but since
> it is more of a platform-specific handling, I am not sure it makes
> sense to keep it in drm.
> Or I have to retain i915_unaligned_memcpy_from_wc() inside i915 and
> refactor the function to use drm_memcpy_from_wc() instead of
> __memcpy_ntdqu().

I think for completeness it makes sense to remove i915_memcpy_from_wc() and its helper functions in this series. I don't think we can keep i915_unaligned_memcpy_from_wc() if we want i915 on ARM [0], so I think you can remove the usages of i915_unaligned_memcpy_from_wc() as well.

[0] IIUC, the CI_BUG_ON() check in i915_unaligned_memcpy_from_wc() will raise a build error on ARM.

Regards,
Nirmoy

> But before making more changes, I wanted feedback on the current
> change, so I decided to go ahead and create the series for review.