Message ID | 20220222145206.76118-3-balasubramani.vivekanandan@intel.com (mailing list archive) |
---|---|
State | New, archived |
Series | drm/i915: Use the memcpy_from_wc function from drm |
On Tue, Feb 22, 2022 at 08:22:01PM +0530, Balasubramani Vivekanandan wrote:
>Fast copy using non-temporal instructions for x86 currently exists at two
>locations. One is implemented in i915 driver at i915/i915_memcpy.c and
>another copy at drm_cache.c. The plan is to remove the duplicate
>implementation in i915 driver and use the functions from drm_cache.c.
>
>A variant of drm_memcpy_from_wc() is added in drm_cache.c which accepts
>address as argument instead of iosys_map for destination. It is a very
>common scenario in i915 to copy from a WC memory type, which may be an
>io memory or a system memory to a destination address pointing to system
>memory. To avoid the overhead of creating iosys_map type for the
>destination, new variant is created to accept the address directly.
>
>Also a new function is exported in drm_cache.c to find if the fast copy
>is supported by the platform or not. It is required for i915.
>
>Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>Cc: Maxime Ripard <mripard@kernel.org>
>Cc: Thomas Zimmermann <tzimmermann@suse.de>
>Cc: David Airlie <airlied@linux.ie>
>Cc: Daniel Vetter <daniel@ffwll.ch>
>Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
>
>Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
>---
> drivers/gpu/drm/drm_cache.c | 54 +++++++++++++++++++++++++++++++++++++
> include/drm/drm_cache.h     |  3 +++
> 2 files changed, 57 insertions(+)
>
>diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
>index a21c1350eb09..eb0bcd33665e 100644
>--- a/drivers/gpu/drm/drm_cache.c
>+++ b/drivers/gpu/drm/drm_cache.c
>@@ -358,6 +358,54 @@ void drm_memcpy_from_wc(struct iosys_map *dst,
> }
> EXPORT_SYMBOL(drm_memcpy_from_wc);
>
>+/**
>+ * drm_memcpy_from_wc_vaddr - Perform the fastest available memcpy from a source
>+ * that may be WC.

.... to a destination in system memory.

>+ * @dst: The destination pointer
>+ * @src: The source pointer
>+ * @len: The size of the area to transfer in bytes
>+ *
>+ * Same as drm_memcpy_from_wc except destination is accepted as system memory
>+ * address. Useful in situations where passing destination address as iosys_map
>+ * is simply an overhead and can be avoided.

although one could do drm_memcpy_from_wc(IOSYS_MAP_INIT_VADDR(addr), ...
(if IOSYS_MAP_INIT_VADDR provided a cast to the struct).

>+ */
>+void drm_memcpy_from_wc_vaddr(void *dst, const struct iosys_map *src,

The name here is confusing, as we are copying *to* system memory. Maybe
drm_memcpy_vaddr_from_wc()? Not sure it's better. Maybe someone in Cc has
a better suggestion.

( To be honest, this whole _from_wc() suffix sounds weird when we are
checking I/O vs system memory.... it may have been the motivation, but
maybe it shouldn't be the name of the memcpy() variant )

The implementation looks ok and follows drm_memcpy_from_wc()

Lucas De Marchi

>+			      unsigned long len)
>+{
>+	if (WARN_ON(in_interrupt())) {
>+		iosys_map_memcpy_from(dst, src, 0, len);
>+		return;
>+	}
>+
>+	if (static_branch_likely(&has_movntdqa)) {
>+		__drm_memcpy_from_wc(dst,
>+				     src->is_iomem ?
>+				     (void const __force *)src->vaddr_iomem :
>+				     src->vaddr,
>+				     len);
>+		return;
>+	}
>+
>+	iosys_map_memcpy_from(dst, src, 0, len);
>+}
>+EXPORT_SYMBOL(drm_memcpy_from_wc_vaddr);
>+
>+/*
>+ * drm_memcpy_fastcopy_supported - Returns if fast copy using non-temporal
>+ * instructions is supported
>+ *
>+ * Returns true if platform has support for fast copying from wc memory type
>+ * using non-temporal instructions. Else false.
>+ */
>+bool drm_memcpy_fastcopy_supported(void)
>+{
>+	if (static_branch_likely(&has_movntdqa))
>+		return true;
>+
>+	return false;
>+}
>+EXPORT_SYMBOL(drm_memcpy_fastcopy_supported);
>+
> /*
>  * drm_memcpy_init_early - One time initialization of the WC memcpy code
>  */
>@@ -382,6 +430,12 @@ void drm_memcpy_from_wc(struct iosys_map *dst,
> }
> EXPORT_SYMBOL(drm_memcpy_from_wc);
>
>+bool drm_memcpy_fastcopy_supported(void)
>+{
>+	return false;
>+}
>+EXPORT_SYMBOL(drm_memcpy_fastcopy_supported);
>+
> void drm_memcpy_init_early(void)
> {
> }
>diff --git a/include/drm/drm_cache.h b/include/drm/drm_cache.h
>index 22deb216b59c..8f48e4dcd7dc 100644
>--- a/include/drm/drm_cache.h
>+++ b/include/drm/drm_cache.h
>@@ -77,4 +77,7 @@ void drm_memcpy_init_early(void);
> void drm_memcpy_from_wc(struct iosys_map *dst,
> 			const struct iosys_map *src,
> 			unsigned long len);
>+bool drm_memcpy_fastcopy_supported(void);
>+void drm_memcpy_from_wc_vaddr(void *dst, const struct iosys_map *src,
>+			      unsigned long len);
> #endif
>--
>2.25.1
>
On Mon, Feb 28, 2022 at 11:48:58PM -0800, Lucas De Marchi wrote:
>On Tue, Feb 22, 2022 at 08:22:01PM +0530, Balasubramani Vivekanandan wrote:
>>Fast copy using non-temporal instructions for x86 currently exists at two
>>locations. One is implemented in i915 driver at i915/i915_memcpy.c and
>>another copy at drm_cache.c. The plan is to remove the duplicate
>>implementation in i915 driver and use the functions from drm_cache.c.
>>
>>A variant of drm_memcpy_from_wc() is added in drm_cache.c which accepts
>>address as argument instead of iosys_map for destination. It is a very
>>common scenario in i915 to copy from a WC memory type, which may be an
>>io memory or a system memory to a destination address pointing to system
>>memory. To avoid the overhead of creating iosys_map type for the
>>destination, new variant is created to accept the address directly.
>>
>>Also a new function is exported in drm_cache.c to find if the fast copy
>>is supported by the platform or not. It is required for i915.
>>
>>Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>>Cc: Maxime Ripard <mripard@kernel.org>
>>Cc: Thomas Zimmermann <tzimmermann@suse.de>
>>Cc: David Airlie <airlied@linux.ie>
>>Cc: Daniel Vetter <daniel@ffwll.ch>
>>Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
>>
>>Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
>>---
>> drivers/gpu/drm/drm_cache.c | 54 +++++++++++++++++++++++++++++++++++++
>> include/drm/drm_cache.h     |  3 +++
>> 2 files changed, 57 insertions(+)
>>
>>diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
>>index a21c1350eb09..eb0bcd33665e 100644
>>--- a/drivers/gpu/drm/drm_cache.c
>>+++ b/drivers/gpu/drm/drm_cache.c
>>@@ -358,6 +358,54 @@ void drm_memcpy_from_wc(struct iosys_map *dst,
>> }
>> EXPORT_SYMBOL(drm_memcpy_from_wc);
>>
>>+/**
>>+ * drm_memcpy_from_wc_vaddr - Perform the fastest available memcpy from a source
>>+ * that may be WC.
>
> .... to a destination in system memory.
>
>>+ * @dst: The destination pointer
>>+ * @src: The source pointer
>>+ * @len: The size of the area to transfer in bytes
>>+ *
>>+ * Same as drm_memcpy_from_wc except destination is accepted as system memory
>>+ * address. Useful in situations where passing destination address as iosys_map
>>+ * is simply an overhead and can be avoided.
>
>although one could do drm_memcpy_from_wc(IOSYS_MAP_INIT_VADDR(addr), ...

... Just making sure you don't take that as a suggestion, I was just thinking
out loud. And as is, it doesn't work, as the function expects an iosys_map *

Lucas De Marchi
diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
index a21c1350eb09..eb0bcd33665e 100644
--- a/drivers/gpu/drm/drm_cache.c
+++ b/drivers/gpu/drm/drm_cache.c
@@ -358,6 +358,54 @@ void drm_memcpy_from_wc(struct iosys_map *dst,
 }
 EXPORT_SYMBOL(drm_memcpy_from_wc);
 
+/**
+ * drm_memcpy_from_wc_vaddr - Perform the fastest available memcpy from a source
+ * that may be WC.
+ * @dst: The destination pointer
+ * @src: The source pointer
+ * @len: The size of the area to transfer in bytes
+ *
+ * Same as drm_memcpy_from_wc except destination is accepted as system memory
+ * address. Useful in situations where passing destination address as iosys_map
+ * is simply an overhead and can be avoided.
+ */
+void drm_memcpy_from_wc_vaddr(void *dst, const struct iosys_map *src,
+			      unsigned long len)
+{
+	if (WARN_ON(in_interrupt())) {
+		iosys_map_memcpy_from(dst, src, 0, len);
+		return;
+	}
+
+	if (static_branch_likely(&has_movntdqa)) {
+		__drm_memcpy_from_wc(dst,
+				     src->is_iomem ?
+				     (void const __force *)src->vaddr_iomem :
+				     src->vaddr,
+				     len);
+		return;
+	}
+
+	iosys_map_memcpy_from(dst, src, 0, len);
+}
+EXPORT_SYMBOL(drm_memcpy_from_wc_vaddr);
+
+/*
+ * drm_memcpy_fastcopy_supported - Returns if fast copy using non-temporal
+ * instructions is supported
+ *
+ * Returns true if platform has support for fast copying from wc memory type
+ * using non-temporal instructions. Else false.
+ */
+bool drm_memcpy_fastcopy_supported(void)
+{
+	if (static_branch_likely(&has_movntdqa))
+		return true;
+
+	return false;
+}
+EXPORT_SYMBOL(drm_memcpy_fastcopy_supported);
+
 /*
  * drm_memcpy_init_early - One time initialization of the WC memcpy code
  */
@@ -382,6 +430,12 @@ void drm_memcpy_from_wc(struct iosys_map *dst,
 }
 EXPORT_SYMBOL(drm_memcpy_from_wc);
 
+bool drm_memcpy_fastcopy_supported(void)
+{
+	return false;
+}
+EXPORT_SYMBOL(drm_memcpy_fastcopy_supported);
+
 void drm_memcpy_init_early(void)
 {
 }
diff --git a/include/drm/drm_cache.h b/include/drm/drm_cache.h
index 22deb216b59c..8f48e4dcd7dc 100644
--- a/include/drm/drm_cache.h
+++ b/include/drm/drm_cache.h
@@ -77,4 +77,7 @@ void drm_memcpy_init_early(void);
 void drm_memcpy_from_wc(struct iosys_map *dst,
 			const struct iosys_map *src,
 			unsigned long len);
+bool drm_memcpy_fastcopy_supported(void);
+void drm_memcpy_from_wc_vaddr(void *dst, const struct iosys_map *src,
+			      unsigned long len);
 #endif
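The control flow of drm_memcpy_from_wc_vaddr() in the diff above can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the kernel code: `struct wcmap_sketch`, `fast_path_available`, `fast_copy_sketch()` and `generic_copy_sketch()` are hypothetical stand-ins for struct iosys_map, the has_movntdqa static key, __drm_memcpy_from_wc() and iosys_map_memcpy_from() respectively. Plain memcpy() replaces the non-temporal loads, and the in_interrupt() check is left out because it has no userspace equivalent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for struct iosys_map: exactly one of the two
 * address fields is meaningful, selected by is_iomem. The anonymous
 * union needs C11.
 */
struct wcmap_sketch {
	union {
		void *vaddr_iomem;	/* would carry __iomem in the kernel */
		void *vaddr;
	};
	bool is_iomem;
};

/* Stand-in for the has_movntdqa static key; a plain flag in this sketch. */
static bool fast_path_available;

/* Counts fast-path calls so the dispatch decision can be observed. */
static unsigned int fast_path_hits;

/* Mirrors the role of drm_memcpy_fastcopy_supported(). */
static bool fastcopy_supported_sketch(void)
{
	return fast_path_available;
}

/* Placeholder for the non-temporal copy; an ordinary memcpy() here. */
static void fast_copy_sketch(void *dst, const void *src, size_t len)
{
	fast_path_hits++;
	memcpy(dst, src, len);
}

/* Placeholder for the generic copy that honours the map type. */
static void generic_copy_sketch(void *dst, const struct wcmap_sketch *src,
				size_t len)
{
	memcpy(dst, src->is_iomem ? src->vaddr_iomem : src->vaddr, len);
}

/* Destination is a raw pointer; only the source is described by a map. */
static void copy_from_wc_vaddr_sketch(void *dst,
				      const struct wcmap_sketch *src,
				      size_t len)
{
	assert(dst && src);

	if (fastcopy_supported_sketch()) {
		fast_copy_sketch(dst,
				 src->is_iomem ? src->vaddr_iomem : src->vaddr,
				 len);
		return;
	}

	generic_copy_sketch(dst, src, len);
}
```

In this sketch the caller never builds a map for the destination, which is the saving the patch is after; whether the fast or the generic branch runs depends only on the flag.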
Fast copy using non-temporal instructions for x86 currently exists at two
locations. One is implemented in i915 driver at i915/i915_memcpy.c and
another copy at drm_cache.c. The plan is to remove the duplicate
implementation in i915 driver and use the functions from drm_cache.c.

A variant of drm_memcpy_from_wc() is added in drm_cache.c which accepts
address as argument instead of iosys_map for destination. It is a very
common scenario in i915 to copy from a WC memory type, which may be an
io memory or a system memory to a destination address pointing to system
memory. To avoid the overhead of creating iosys_map type for the
destination, new variant is created to accept the address directly.

Also a new function is exported in drm_cache.c to find if the fast copy
is supported by the platform or not. It is required for i915.

Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>

Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
---
 drivers/gpu/drm/drm_cache.c | 54 +++++++++++++++++++++++++++++++++++++
 include/drm/drm_cache.h     |  3 +++
 2 files changed, 57 insertions(+)
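The "overhead of creating iosys_map type for the destination" mentioned in the commit message, and the review remark that passing IOSYS_MAP_INIT_VADDR(addr) directly does not work because the function expects an iosys_map pointer, can be illustrated with a small userspace sketch. It assumes the macro expands to a bare brace initializer, which may not match every kernel version; `struct sysmap_sketch`, `SYSMAP_SKETCH_INIT_VADDR` and `copy_to_map_sketch()` are hypothetical names, not kernel API. A bare brace initializer can only initialize a declared variable, so a caller needs either a named temporary or an explicit compound-literal cast to obtain a pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct iosys_map, system-memory case only. */
struct sysmap_sketch {
	void *vaddr;
	bool is_iomem;
};

/* Assumed shape of the initializer macro: a bare brace initializer. */
#define SYSMAP_SKETCH_INIT_VADDR(vaddr_) \
	{ .vaddr = (vaddr_), .is_iomem = false }

/* Stand-in for a copy routine that takes the destination as a map pointer. */
static void copy_to_map_sketch(const struct sysmap_sketch *dst,
			       const void *src, size_t len)
{
	assert(!dst->is_iomem);	/* this sketch handles system memory only */
	memcpy(dst->vaddr, src, len);
}

/* Option 1: a named temporary, initialized with the macro. */
static void copy_via_named_map(void *dst, const void *src, size_t len)
{
	struct sysmap_sketch map = SYSMAP_SKETCH_INIT_VADDR(dst);

	copy_to_map_sketch(&map, src, len);
}

/*
 * Option 2: the cast turns the brace initializer into a compound literal,
 * which is an lvalue and can therefore have its address taken.
 */
static void copy_via_compound_literal(void *dst, const void *src, size_t len)
{
	copy_to_map_sketch(&(struct sysmap_sketch)SYSMAP_SKETCH_INIT_VADDR(dst),
			   src, len);
}
```

Both options wrap a plain pointer in a map only so that the callee can unwrap it again, which is the step a variant taking `void *dst` avoids.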