Message ID | 20241021133154.516847-1-jvetter@kalrayinc.com (mailing list archive) |
---|---|
Headers | show |
Series | Replace fallback for IO memcpy and IO memset | expand |
From: Julian Vetter > Sent: 21 October 2024 14:32 > > Thank you again for your remarks Arnd and Christoph! I have updated the > patchset, and placed the functions directly in asm-generic/io.h. I have > dropped the libs/iomem_copy.c and have updated/clarified the commit > message in the first patch. Apart from build 'issues' what is the justification for inlining these functions? They are quite large for inlining and some drivers could easily call them many times. The I/O cycles themselves are likely to be slow enough that the cost of a function call is pretty much likely to be noise. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
On Mon, Oct 21, 2024, at 14:16, David Laight wrote: > From: Julian Vetter >> Sent: 21 October 2024 14:32 >> >> Thank you again for your remarks Arnd and Christoph! I have updated the >> patchset, and placed the functions directly in asm-generic/io.h. I have >> dropped the libs/iomem_copy.c and have updated/clarified the commit >> message in the first patch. > > Apart from build 'issues' what is the justification for inlining > these functions? I think I wasn't clear enough with my previous comment, and Julian just misunderstood what I was asking him to do. Sorry about causing extra work here. > They are quite large for inlining and some drivers could easily > call them many times. > > The I/O cycles themselves are likely to be slow enough that > the cost of a function call is pretty much likely to be noise. I'm not overly worried about the this, as the functions are not that big and there are not that many callers. If a file contains multiple calls to this function, we can expect the compiler to be smart enough to keep it out of line, though it still gets duplicated in each driver calling it. The bit that I am worried about however is the extra #include for linux/unaligned.h that pulls in fairly large headers and may lead to circular header dependencies. To be clear: what I had expected here was to not have any changes to the v9 version of lib/iomem_copy.c and to simplify the asm-generic/io.h change to the version below. Arnd --- --- a/include/asm-generic/io.h +++ b/include/asm-generic/io.h @@ -1211,7 +1211,6 @@ static inline void unxlate_dev_mem_ptr(phys_addr_t phys, void *addr) #endif #ifndef memset_io -#define memset_io memset_io /** * memset_io Set a range of I/O memory to a constant value * @addr: The beginning of the I/O-memory range to set @@ -1220,15 +1219,10 @@ static inline void unxlate_dev_mem_ptr(phys_addr_t phys, void *addr) * * Set a range of I/O memory to a given value. */ -static inline void memset_io(volatile void __iomem *addr, int value, - size_t size) -{ - memset(__io_virt(addr), value, size); -} +void memset_io(volatile void __iomem *addr, int value, size_t size); #endif #ifndef memcpy_fromio -#define memcpy_fromio memcpy_fromio /** * memcpy_fromio Copy a block of data from I/O memory * @dst: The (RAM) destination for the copy @@ -1237,16 +1231,11 @@ static inline void memset_io(volatile void __iomem *addr, int value, * * Copy a block of data from I/O memory. */ -static inline void memcpy_fromio(void *buffer, - const volatile void __iomem *addr, - size_t size) -{ - memcpy(buffer, __io_virt(addr), size); -} +void memcpy_fromio(void *buffer, const volatile void __iomem *addr, + size_t size); #endif #ifndef memcpy_toio -#define memcpy_toio memcpy_toio /** * memcpy_toio Copy a block of data into I/O memory * @dst: The (I/O memory) destination for the copy @@ -1255,11 +1244,8 @@ static inline void memcpy_fromio(void *buffer, * * Copy a block of data to I/O memory. */ -static inline void memcpy_toio(volatile void __iomem *addr, const void *buffer, - size_t size) -{ - memcpy(__io_virt(addr), buffer, size); -} +void memcpy_toio(volatile void __iomem *addr, const void *buffer, + size_t size); #endif extern int devmem_is_allowed(unsigned long pfn);
Thank you again for your remarks Arnd and Christoph! I have updated the patchset, and placed the functions directly in asm-generic/io.h. I have dropped the libs/iomem_copy.c and have updated/clarified the commit message in the first patch. Signed-off-by: Julian Vetter <jvetter@kalrayinc.com> --- Changes for v10: - Removed libs/iomem_copy.c again - Replaced the three functions in asm-generic/io.h directly - Updated description for the first patch to clarify the matter - Slightly updated description in patches 2 to 4 Changes for v9: - Moved functions into a new file iomem_copy.c which is built unconditionally - Guard prototypes with '#ifndef memcpy_fromio', etc. - Dropped patches 5 to 14 for now. I will send some of the changes in separate patches or patchsets to the appropriate mailinglists - Added proper reviewed-by and acked-by to arm64 and csky patches Changes for v8: - Dropped the arch/um patch that adds dummy implementations for IO memcpy functions - Added 3 new patches that fix the dependency problem for UM (added dependencies on HAS_IOMEM || INDIRECT_IOMEM) - Added new patch for s390 to internally call the zpci_memcpy functions and not the generic ones from libs/iomap_copy.c - Addressed reviewer comments and replaced 2 or 3 shifts by 'qc *= ~0UL / 0xff;' - Addressed reviewer comments on pasrisc (masking the int value) - Addressed reviewer comments on alpha (masking the int value) Changes for v7: - Added dummy implementations for memcpy_{to,from}io and memset_io on um architecture so drivers that use these functions build for um - Replaced all accesses and checks by long type - Added function prototypes as extern to asm-generic/io.h - Removed '__' from the 3 new function names - Some archs implement their own version of these IO functions with slightly different prototypes. So, I added 3 new patches to align prototypes with new ones in iomap_copy.c + io.h Changes for v6: - Added include of linux/align.h to fix build on arm arch - Replaced compile-time check by ifdef for the CONFIG_64BIT otherwise we get a warning for the 'qc << 32' for archs with 32bit int types - Suffixed arch commits by arch name Changes for v5: - Added functions to iomap_copy.c as proposed by Arndt - Removed again the new io_copy.c and related objects - Removed GENERIC_IO_COPY symbol and instead rely on the existing HAS_IOMEM symbol - Added prototypes of __memcpy_{to,from}io and __memset_io functions to asm-generic/io.h Changes for v4: - Replaced memcpy/memset in asm-generic/io.h by the new __memcpy_{to,from}io and __memset_io, so individual architectures can use it instead of using their own implementation. Changes for v3: - Replaced again 'if(IS_ENABLED(CONFIG_64BIT))' by '#ifdef CONFIG_64BIT' because on 32bit architectures (e.g., csky), __raw_{read,write}q are not defined. So, it leads to compilation errors Changes for v2: - Renamed io.c -> io_copy.c - Updated flag to 'GENERIC_IO_COPY' - Replaced pointer dereferences by 'put_unaligned()'/'get_unaligned()' - Replaced '#ifdef CONFIG_64BIT' by 'if(IS_ENABLED(CONFIG_64BIT))' - Removed '__raw_{read,write}_native' and replaced by 'if(IS_ENABLED(CONFIG_64BIT))' -> '__raw_write{l,q}' --- Julian Vetter (4): Replace fallback for IO memcpy and IO memset arm64: Use new fallback IO memcpy/memset csky: Use new fallback IO memcpy/memset loongarch: Use new fallback IO memcpy/memset arch/arm64/include/asm/io.h | 11 --- arch/arm64/kernel/io.c | 87 ------------------------ arch/csky/include/asm/io.h | 11 --- arch/csky/kernel/Makefile | 2 +- arch/csky/kernel/io.c | 91 ------------------------- arch/loongarch/include/asm/io.h | 10 --- arch/loongarch/kernel/Makefile | 2 +- arch/loongarch/kernel/io.c | 94 -------------------------- include/asm-generic/io.h | 116 ++++++++++++++++++++++++++------ 9 files changed, 98 insertions(+), 326 deletions(-) delete mode 100644 arch/csky/kernel/io.c delete mode 100644 arch/loongarch/kernel/io.c