Message ID | E1THMAC-0004nL-8B@rmk-PC.arm.linux.org.uk (mailing list archive)
---|---
State | New, archived
On Thu, 27 Sep 2012, Russell King wrote:

> If we are building for a LE platform, and we haven't overridden the
> MMIO ops, then we can optimize the mem*io operations using the
> standard string functions.
>
> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

We presume that the IO space is able to cope with a mixture of access
widths other than byte accesses, which should be perfectly reasonable by
default. If so then...

Acked-by: Nicolas Pitre <nico@linaro.org>

> [... full patch quoted; see the diff at the end of this thread ...]
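To make the effect concrete: with this patch applied on a little-endian build, a call such as the one below compiles down to the plain string memset() over the ioremapped region. This is a minimal illustrative sketch - the physical address, size and function name are invented for the example:

```c
#include <linux/errno.h>
#include <linux/init.h>
#include <linux/io.h>

/* Illustrative only: the address, size and function name are invented. */
#define EXAMPLE_FB_PHYS	0x40000000
#define EXAMPLE_FB_SIZE	(8 * 1024 * 1024)

static int __init example_fb_clear(void)
{
	void __iomem *fb = ioremap_wc(EXAMPLE_FB_PHYS, EXAMPLE_FB_SIZE);

	if (!fb)
		return -ENOMEM;

	/* With the patch applied (LE, default MMIO ops), this becomes a
	 * plain string memset() instead of a byte-at-a-time loop. */
	memset_io(fb, 0, EXAMPLE_FB_SIZE);

	iounmap(fb);
	return 0;
}
```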
On Fri, Sep 28, 2012 at 05:17:53AM +0100, Nicolas Pitre wrote:
> On Thu, 27 Sep 2012, Russell King wrote:
>
> > If we are building for a LE platform, and we haven't overridden the
> > MMIO ops, then we can optimize the mem*io operations using the
> > standard string functions.
> >
> > Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
>
> We presume that the IO space is able to cope with a mixture of access
> widths other than byte accesses, which should be perfectly reasonable by
> default. If so then...
>
> Acked-by: Nicolas Pitre <nico@linaro.org>

This looks pretty scary to me, but maybe I'm worrying too much. The first
thing to ensure is that the accesses are always aligned, which I believe is
true for the string operations. However, a quick glance at memset shows that
we do things like store multiple with writeback addressing modes. This is
bad for a few reasons:

  1. If an access other than the first one generated by the instruction
     causes an abort, the CPU will ultimately re-execute the earlier
     accesses, which could be problematic to a device.

  2. Writeback addressing modes when accessing MMIO peripherals cause
     serious performance problems with virtualisation, as I have
     described before.

  3. We have to guarantee that no single instruction causes accesses
     that span a page boundary, as this leads to UNPREDICTABLE
     behaviour.

So, unless we can guarantee that our accesses are all aligned, will never
fault, do not cross a page boundary, and that we are not running as a
guest, I'd be inclined to stick with byte-by-byte implementations for
these functions.

Will
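For reference, a minimal sketch of the byte-by-byte approach Will is advocating, in the style of the existing _memset_io() helper (illustrative, not the kernel's actual implementation): single-byte stores can never be misaligned, never straddle a page, and never partially replay after an abort.

```c
#include <linux/io.h>
#include <linux/types.h>

/* Sketch only, in the spirit of the existing _memset_io(): one byte per
 * store, so no access can be misaligned, span a page boundary, or be
 * partially replayed by the CPU after an abort. */
static void example_memset_io_bytewise(volatile void __iomem *dst, int c,
				       size_t count)
{
	while (count--)
		writeb(c, dst++);
}
```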
On Fri, Sep 28, 2012 at 10:58:08AM +0100, Will Deacon wrote:
> On Fri, Sep 28, 2012 at 05:17:53AM +0100, Nicolas Pitre wrote:
> > On Thu, 27 Sep 2012, Russell King wrote:
> >
> > > If we are building for a LE platform, and we haven't overridden the
> > > MMIO ops, then we can optimize the mem*io operations using the
> > > standard string functions.
> > >
> > > Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
> >
> > We presume that the IO space is able to cope with a mixture of access
> > widths other than byte accesses, which should be perfectly reasonable by
> > default. If so then...
> >
> > Acked-by: Nicolas Pitre <nico@linaro.org>
>
> This looks pretty scary to me, but maybe I'm worrying too much. The first
> thing to ensure is that the accesses are always aligned, which I believe is
> true for the string operations. However, a quick glance at memset shows that
> we do things like store multiple with writeback addressing modes. This is
> bad for a few reasons:
>
>   1. If an access other than the first one generated by the instruction
>      causes an abort, the CPU will ultimately re-execute the earlier
>      accesses, which could be problematic to a device.

I don't think that's a problem for these. They're used on RAM-like
regions.

>   2. Writeback addressing modes when accessing MMIO peripherals cause
>      serious performance problems with virtualisation, as I have
>      described before.

Well, virtualisation is in its infancy on ARM, and I don't think it
should be worried about _too_ much when these operations are grossly
unoptimized for non-virtualised hardware. The tradeoff is between being
grossly unoptimized on non-virtualised hardware vs performance problems
with virtualised hardware.

>   3. We have to guarantee that no single instruction causes accesses
>      that span a page boundary, as this leads to UNPREDICTABLE
>      behaviour.

We do accesses in memset() 16 bytes at a time, so to guarantee that we
need to ensure that the pointer passed in was 16-byte aligned. I'm not
sure that we can guarantee that in every case.

> So, unless we can guarantee that our accesses are all aligned, will never
> fault, do not cross a page boundary, and that we are not running as a
> guest, I'd be inclined to stick with byte-by-byte implementations for
> these functions.

Well, that rather sucks if you're memset_io'ing various sizes (in
megabytes - up to 8MB) of video memory. We desperately need these
functions optimized. Either that or we allow DRM to be a security hole
by omitting any kind of buffer clearing, because using the existing
memset_io() is just far too expensive to clear 8MB a byte at a time.
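One way to provide the 16-byte alignment guarantee Russell mentions is the usual head/bulk/tail split: byte stores until the pointer is aligned, wide stores for the middle, byte stores for the remainder. A sketch under that assumption - do_bulk_set() is a hypothetical stand-in for a store-multiple fast path:

```c
#include <linux/io.h>
#include <linux/types.h>

/* Hypothetical fast path using wide (e.g. 16-byte store-multiple)
 * accesses; assumed to require a 16-byte-aligned pointer and a
 * multiple-of-16 count. */
static void do_bulk_set(volatile void __iomem *dst, int c, size_t count);

/* Sketch: establish the alignment guarantee locally instead of relying
 * on every caller to provide it. */
static void example_memset_io_aligned(volatile void __iomem *dst, int c,
				      size_t count)
{
	/* Head: byte stores until dst is 16-byte aligned. */
	while (count && ((unsigned long)dst & 15)) {
		writeb(c, dst++);
		count--;
	}

	/* Bulk: now safe for 16-byte store-multiple sequences. */
	do_bulk_set(dst, c, count & ~15UL);
	dst += count & ~15UL;

	/* Tail: the remaining 0-15 bytes, again one at a time. */
	count &= 15;
	while (count--)
		writeb(c, dst++);
}
```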
On Fri, Sep 28, 2012 at 11:31:10AM +0100, Russell King - ARM Linux wrote:
> On Fri, Sep 28, 2012 at 10:58:08AM +0100, Will Deacon wrote:
> >   1. If an access other than the first one generated by the instruction
> >      causes an abort, the CPU will ultimately re-execute the earlier
> >      accesses, which could be problematic to a device.
>
> I don't think that's a problem for these. They're used on RAM-like
> regions.

[...]

> > So, unless we can guarantee that our accesses are all aligned, will never
> > fault, do not cross a page boundary, and that we are not running as a
> > guest, I'd be inclined to stick with byte-by-byte implementations for
> > these functions.
>
> Well, that rather sucks if you're memset_io'ing various sizes (in
> megabytes - up to 8MB) of video memory. We desperately need these
> functions optimized.

How are these regions mapped? If it's something like ioremap_wc, then this
gives us normal non-cacheable memory on ARMv7, which wouldn't be subject to
the restrictions that you have with device memory.

> Either that or we allow DRM to be a security hole by omitting any kind
> of buffer clearing, because using the existing memset_io() is just far
> too expensive to clear 8MB a byte at a time.

If DRM is using normal memory, it sounds like the *_io string functions need
to be made to work with different memory types.

Will
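The distinction Will is drawing is between mapping attributes. A short sketch contrasting the two ioremap variants (the physical window below is invented for illustration):

```c
#include <linux/io.h>

/* Hypothetical physical window, purely for illustration. */
#define EXAMPLE_VRAM_PHYS	0x50000000
#define EXAMPLE_VRAM_SIZE	(4 * 1024 * 1024)

static void example_mapping_flavours(void)
{
	/* Device-type mapping: subject to the access-width, alignment
	 * and restart restrictions discussed above. */
	void __iomem *dev_map = ioremap(EXAMPLE_VRAM_PHYS, EXAMPLE_VRAM_SIZE);

	/* Normal non-cacheable mapping on ARMv7: wide and unaligned
	 * accesses are architecturally fine, so string ops are safe. */
	void __iomem *wc_map = ioremap_wc(EXAMPLE_VRAM_PHYS, EXAMPLE_VRAM_SIZE);

	if (dev_map)
		iounmap(dev_map);
	if (wc_map)
		iounmap(wc_map);
}
```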
On Fri, Sep 28, 2012 at 11:42:16AM +0100, Will Deacon wrote:
> How are these regions mapped? If it's something like ioremap_wc, then this
> gives us normal non-cacheable memory on ARMv7, which wouldn't be subject to
> the restrictions that you have with device memory.

Yes, ioremap_wc(), but we can't guarantee that for all callers of
memset_io().

> > Either that or we allow DRM to be a security hole by omitting any kind
> > of buffer clearing, because using the existing memset_io() is just far
> > too expensive to clear 8MB a byte at a time.
>
> If DRM is using normal memory, it sounds like the *_io string functions need
> to be made to work with different memory types.

Take a look at the *_io string function declarations, and observe the use
of __iomem. That means they're only used with memory returned from the
ioremap*() suite of functions (or they're being used non-compliantly and
sparse should be spitting warnings).
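Russell's sparse point can be made concrete. A sketch of how the __iomem annotation and the __force cast interact with type checking - the second call is deliberately wrong to show what sparse would flag:

```c
#include <linux/io.h>
#include <linux/string.h>

extern void _memset_io(volatile void __iomem *, int, size_t);

static void example_sparse_checks(void __iomem *mmio, void *ram, size_t n)
{
	_memset_io(mmio, 0, n);	/* fine: the pointer is __iomem */

	/* Passing a plain kernel pointer violates the annotation;
	 * sparse reports an address-space mismatch here. */
	_memset_io(ram, 0, n);

	/* The patch's inline helpers cast the other way, with __force
	 * telling sparse the address-space conversion is deliberate. */
	memset((void __force *)mmio, 0, n);
}
```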
On Thursday 27 September 2012, Russell King wrote:
> +#ifndef __ARMBE__
> +static inline void memset_io(volatile void __iomem *dst, unsigned c,
> +	size_t count)
> +{
> +	memset((void __force *)dst, c, count);
> +}
> +#define memset_io(dst,c,count) memset_io(dst,c,count)
> +
> +static inline void memcpy_fromio(void *to, const volatile void __iomem *from,
> +	size_t count)
> +{
> +	memcpy(to, (const void __force *)from, count);
> +}
> +#define memcpy_fromio(to,from,count) memcpy_fromio(to,from,count)
> +

I wonder whether we need any barriers here. PowerPC has the strongest
barriers before and after the access, at least since f007cacffc8870
("[POWERPC] Fix MMIO ops to provide expected barrier behaviour"). This
seems to be done more out of caution and trying to mimic the x86 behavior
(where all MMIO accesses are synchronous to some degree) than fixing a
particular bug that was observed.

Most other architectures are just using the raw string operations like
you do here. Some architectures (powerpc and parisc at least) split the
access into naturally aligned accesses of up to 4 bytes. I think those
are the ones that allow unaligned accesses on RAM but not on MMIO
pointers (powerpc will checkstop or throw an exception).

arm64 actually copies the arm variant, and I suspect we can use the
simpler version there as well.

	Arnd
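For illustration, a sketch of the "naturally aligned accesses of up to 4 bytes" splitting that Arnd attributes to powerpc and parisc - not lifted from either architecture, just the general shape: each step uses the widest access that both pointers' alignment and the remaining length allow.

```c
#include <linux/io.h>
#include <linux/types.h>

/* Sketch: copy to MMIO using only naturally aligned 1/2/4-byte stores,
 * so no individual access is ever unaligned. */
static void example_memcpy_toio_split(volatile void __iomem *to,
				      const void *from, size_t count)
{
	const u8 *src = from;

	while (count) {
		if (count >= 4 && !((unsigned long)to & 3) &&
		    !((unsigned long)src & 3)) {
			writel(*(const u32 *)src, to);
			to += 4; src += 4; count -= 4;
		} else if (count >= 2 && !((unsigned long)to & 1) &&
			   !((unsigned long)src & 1)) {
			writew(*(const u16 *)src, to);
			to += 2; src += 2; count -= 2;
		} else {
			writeb(*src, to);
			to++; src++; count--;
		}
	}
}
```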
On 28 September 2012 14:36, Arnd Bergmann <arnd@arndb.de> wrote:
> On Thursday 27 September 2012, Russell King wrote:
> > [... patch excerpt quoted above ...]
>
> I wonder whether we need any barriers here. PowerPC has the strongest
> barriers before and after the access, at least since f007cacffc8870
> ("[POWERPC] Fix MMIO ops to provide expected barrier behaviour").

In theory, we need barriers, but only at the beginning and the end of
the operation, not for every iteration (and in practice we might even be
OK without any). The device accesses are ordered with each other by
hardware; it's only when you need them to be ordered with other types of
memory accesses (or even other devices) that barriers matter.

> arm64 actually copies the arm variant, and I suspect we can use the
> simpler version there as well.

For arm64 I have a set of optimised library functions which I left out
of the upstream branch just to make the review easier. I'll post the
patches and push them once the core code is upstream. But the memcpy()
implementation does not perform pointer alignment (it may at some point,
based on benchmarks), so it is not usable with I/O.

The simplest optimisation for arm64 (already suggested by Olof and on my
to-do list) is something like the powerpc implementation but using
write*_relaxed() to avoid barriers in a loop. The performance of the
compiled code shouldn't be far from hand-written assembly, especially on
ARMv8 with more out-of-order execution.
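A minimal sketch of the approach Catalin outlines - relaxed word stores in the loop with a single barrier bracketing the operation. It assumes, for brevity, a word-aligned pointer and a multiple-of-4 count, and uses ARM's __iowmb() as the trailing barrier:

```c
#include <linux/io.h>
#include <linux/types.h>

/* Sketch: relaxed word stores in the loop, one barrier at the end.
 * Assumes dst is 4-byte aligned and count is a multiple of 4. */
static void example_memset_io_relaxed(volatile void __iomem *dst, int c,
				      size_t count)
{
	u32 pattern = 0x01010101U * (u8)c;	/* replicate byte to word */

	while (count) {
		writel_relaxed(pattern, dst);
		dst += 4;
		count -= 4;
	}
	__iowmb();	/* one barrier for the whole operation */
}
```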
On Fri, Sep 28, 2012 at 03:13:42PM +0100, Catalin Marinas wrote:
> On 28 September 2012 14:36, Arnd Bergmann <arnd@arndb.de> wrote:
> > On Thursday 27 September 2012, Russell King wrote:
> > > [... patch excerpt quoted above ...]
> >
> > I wonder whether we need any barriers here. PowerPC has the strongest
> > barriers before and after the access, at least since f007cacffc8870
> > ("[POWERPC] Fix MMIO ops to provide expected barrier behaviour").
>
> In theory, we need barriers, but only at the beginning and the end of
> the operation, not for every iteration (and in practice we might even be
> OK without any).

Yes, we at least need a barrier before each. The reason for a barrier
before is that the memory being accessed may depend on a previous
writel() hitting the device, so we need to ensure proper ordering wrt.
other writel()s. I don't think we need a barrier after, because any
future device writes by the write[bwl]() accessors will have a barrier
before them, which will ensure proper ordering.
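Applied to a copy helper, Russell's "barrier before, none after" convention would look roughly like this sketch (the barrier choice is illustrative):

```c
#include <linux/io.h>
#include <linux/string.h>

/* Sketch: the leading barrier orders this bulk copy after any earlier
 * writel() that the transferred data may depend on; no trailing
 * barrier, because a later write[bwl]() issues its own before it. */
static void example_memcpy_toio(volatile void __iomem *to,
				const void *from, size_t count)
{
	__iowmb();	/* barrier before, per the reasoning above */
	memcpy((void __force *)to, from, count);
}
```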
On 28.09.2012 00:02, Russell King wrote:
> If we are building for a LE platform, and we haven't overridden the
> MMIO ops, then we can optimize the mem*io operations using the
> standard string functions.
>
> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
>
> [... full patch quoted; see the diff at the end of this thread ...]

I'd like to ask if there is any chance to get anything like this, or
the even older proposal

http://lists.infradead.org/pipermail/linux-arm-kernel/2009-November/004077.html

merged? Maybe an updated version incorporating the discussion
conclusions?

In the discussion of that older proposal there were some doubts whether
it's a good idea to enable this for all SoCs. That patch therefore
introduced a config option, OPTIMIZED_IOMEM_MEMCPY, which would allow us
to enable it only for SoCs known to be compatible. Just in case this
helps to get it applied ;)

Many thanks and best regards

Dirk
On 05.06.2013 08:02, Dirk Behme wrote:
> On 28.09.2012 00:02, Russell King wrote:
> > If we are building for a LE platform, and we haven't overridden the
> > MMIO ops, then we can optimize the mem*io operations using the
> > standard string functions.
> >
> > Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
> >
> > [... full patch quoted; see the diff at the end of this thread ...]
>
> I'd like to ask if there is any chance to get anything like this, or
> the even older proposal
>
> http://lists.infradead.org/pipermail/linux-arm-kernel/2009-November/004077.html
>
> merged? Maybe an updated version incorporating the discussion
> conclusions?
>
> In the discussion of that older proposal there were some doubts whether
> it's a good idea to enable this for all SoCs. That patch therefore
> introduced a config option, OPTIMIZED_IOMEM_MEMCPY, which would allow us
> to enable it only for SoCs known to be compatible. Just in case this
> helps to get it applied ;)

Any opinion on this?

Best regards

Dirk
On Tue, Jun 11, 2013 at 08:16:18PM +0200, Dirk Behme wrote:
> On 05.06.2013 08:02, Dirk Behme wrote:
> > On 28.09.2012 00:02, Russell King wrote:
> > > If we are building for a LE platform, and we haven't overridden the
> > > MMIO ops, then we can optimize the mem*io operations using the
> > > standard string functions.
> > >
> > > [... full patch quoted; see the diff at the end of this thread ...]
> >
> > I'd like to ask if there is any chance to get anything like this, or
> > the even older proposal
> >
> > http://lists.infradead.org/pipermail/linux-arm-kernel/2009-November/004077.html
> >
> > merged? Maybe an updated version incorporating the discussion
> > conclusions?
> >
> > In the discussion of that older proposal there were some doubts whether
> > it's a good idea to enable this for all SoCs. That patch therefore
> > introduced a config option, OPTIMIZED_IOMEM_MEMCPY, which would allow us
> > to enable it only for SoCs known to be compatible. Just in case this
> > helps to get it applied ;)
>
> Any opinion on this?

Yes, unfortunately it should be selectable by platform, though it needs
to be done carefully. Some ARM platforms can't cope with full 32-bit
accesses to MMIO.

I don't think any of those intersect with the single zImage project, so
those should be fine to have it always enabled - it's only the older
platforms that might have issues.
On Thursday 13 June 2013 12:05:02 Russell King - ARM Linux wrote:
> Yes, unfortunately it should be selectable by platform, though it needs
> to be done carefully. Some ARM platforms can't cope with full 32-bit
> accesses to MMIO.
>
> I don't think any of those intersect with the single zImage project, so
> those should be fine to have it always enabled - it's only the older
> platforms that might have issues.

Are those all StrongARM and XScale based platforms? If so, I definitely
don't expect them to go multiplatform.

Can we combine this with the existing CONFIG_NEED_MACH_IO_H?

	Arnd
On Thu, Jun 13, 2013 at 02:47:35PM +0200, Arnd Bergmann wrote:
> On Thursday 13 June 2013 12:05:02 Russell King - ARM Linux wrote:
> > Yes, unfortunately it should be selectable by platform, though it needs
> > to be done carefully. Some ARM platforms can't cope with full 32-bit
> > accesses to MMIO.
> >
> > I don't think any of those intersect with the single zImage project, so
> > those should be fine to have it always enabled - it's only the older
> > platforms that might have issues.
>
> Are those all StrongARM and XScale based platforms? If so, I definitely
> don't expect them to go multiplatform.

I'm not 100% positive on that, but...

> Can we combine this with the existing CONFIG_NEED_MACH_IO_H?

I think we probably can - because that implies that the generic set of
IO accessors are already being used, which means 8, 16 and 32 bit
accesses must be supportable without any kind of trickery.
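A sketch of what gating on the existing symbol might look like - the polarity here (optimised path only when CONFIG_NEED_MACH_IO_H is not set, i.e. when the generic accessors are in use) follows Russell's reasoning above, but no such patch was posted in this thread:

```c
/* Sketch: string-function versions only for little-endian builds on
 * platforms using the generic I/O accessors; anything selecting
 * CONFIG_NEED_MACH_IO_H keeps the conservative out-of-line helpers. */
#if !defined(__ARMBE__) && !defined(CONFIG_NEED_MACH_IO_H)
static inline void memset_io(volatile void __iomem *dst, unsigned c,
			     size_t count)
{
	memset((void __force *)dst, c, count);
}
#define memset_io(dst,c,count) memset_io(dst,c,count)
#else
#define memset_io(c,v,l)	_memset_io(c,(v),(l))
#endif
```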
Hi Arnd,

On 13.06.2013 19:13, Russell King - ARM Linux wrote:
> On Thu, Jun 13, 2013 at 02:47:35PM +0200, Arnd Bergmann wrote:
> > On Thursday 13 June 2013 12:05:02 Russell King - ARM Linux wrote:
> > > Yes, unfortunately it should be selectable by platform, though it needs
> > > to be done carefully. Some ARM platforms can't cope with full 32-bit
> > > accesses to MMIO.
> > >
> > > I don't think any of those intersect with the single zImage project, so
> > > those should be fine to have it always enabled - it's only the older
> > > platforms that might have issues.
> >
> > Are those all StrongARM and XScale based platforms? If so, I definitely
> > don't expect them to go multiplatform.
>
> I'm not 100% positive on that, but...
>
> > Can we combine this with the existing CONFIG_NEED_MACH_IO_H?
>
> I think we probably can - because that implies that the generic set of
> IO accessors are already being used, which means 8, 16 and 32 bit
> accesses must be supportable without any kind of trickery.

Would you like to send a patch with this CONFIG_NEED_MACH_IO_H proposal?

Many thanks and best regards

Dirk

P.S.: Just for reference, the initial patch from Russell we are talking
about:

http://permalink.gmane.org/gmane.linux.ports.arm.kernel/190902
```diff
diff --git a/arch/arm/include/asm/io.h b/arch/arm/include/asm/io.h
index 815c669..ed0577e 100644
--- a/arch/arm/include/asm/io.h
+++ b/arch/arm/include/asm/io.h
@@ -23,6 +23,7 @@
 
 #ifdef __KERNEL__
 
+#include <linux/string.h>
 #include <linux/types.h>
 #include <asm/byteorder.h>
 #include <asm/memory.h>
@@ -249,9 +250,33 @@ extern void _memset_io(volatile void __iomem *, int, size_t);
 #define writesw(p,d,l)		__raw_writesw(p,d,l)
 #define writesl(p,d,l)		__raw_writesl(p,d,l)
 
+#ifndef __ARMBE__
+static inline void memset_io(volatile void __iomem *dst, unsigned c,
+	size_t count)
+{
+	memset((void __force *)dst, c, count);
+}
+#define memset_io(dst,c,count) memset_io(dst,c,count)
+
+static inline void memcpy_fromio(void *to, const volatile void __iomem *from,
+	size_t count)
+{
+	memcpy(to, (const void __force *)from, count);
+}
+#define memcpy_fromio(to,from,count) memcpy_fromio(to,from,count)
+
+static inline void memcpy_toio(volatile void __iomem *to, const void *from,
+	size_t count)
+{
+	memcpy((void __force *)to, from, count);
+}
+#define memcpy_toio(to,from,count) memcpy_toio(to,from,count)
+
+#else
 #define memset_io(c,v,l)	_memset_io(c,(v),(l))
 #define memcpy_fromio(a,c,l)	_memcpy_fromio((a),c,(l))
 #define memcpy_toio(c,a,l)	_memcpy_toio(c,(a),(l))
+#endif
 
 #endif /* readl */
```
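A side note on the patch's construction: each static inline is followed by a #define of the same name to itself. This idiom makes the name visible to the preprocessor, so that headers guarding their fallbacks with #ifndef skip them. A hedged sketch with invented names:

```c
/* Sketch with invented names: defining the macro to its own name lets
 * the preprocessor "see" the symbol as defined, while uses of it still
 * expand to the inline function. */
static inline void my_op(int x) { (void)x; /* fast arch version */ }
#define my_op my_op

/* A generic header can then guard its fallback, which is now skipped: */
#ifndef my_op
static inline void my_op(int x) { (void)x; /* portable fallback */ }
#endif
```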
If we are building for a LE platform, and we haven't overridden the
MMIO ops, then we can optimize the mem*io operations using the
standard string functions.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
---
 arch/arm/include/asm/io.h | 25 +++++++++++++++++++++++++
 1 files changed, 25 insertions(+), 0 deletions(-)