Message ID | 20130205210436.670c62e26d2121330e87af35@freescale.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Tue, 2013-02-05 at 21:04 -0600, Kim Phillips wrote: > gcc -Os emits calls to __bswapsi2 on those platforms to save space > because they don't have the single rev byte swap instruction. Is that the right thing for GCC to do in that situation? If so, perhaps we should be *providing* __bswap[sd]i2 functions for it to use? If not, perhaps there should be a PR filed? Or is our use case justifiably different to the general case of '-Os'? If so, why?
On Wed, 6 Feb 2013 09:02:04 +0000 "Woodhouse, David" <david.woodhouse@intel.com> wrote: > On Tue, 2013-02-05 at 21:04 -0600, Kim Phillips wrote: > > gcc -Os emits calls to __bswapsi2 on those platforms to save space > > because they don't have the single rev byte swap instruction. > > Is that the right thing for GCC to do in that situation? if it saves space, why wouldn't it be? "Many of these functions are only optimized in certain cases; if they are not optimized in a particular case, a call to the library function is emitted." [1] I see "(arm_arch6 || !optimize_size)" in gcc's define_expand "bswapsi2" source, so GCC considers size optimization as a legitimate one of those cases. > If so, perhaps we should be *providing* __bswap[sd]i2 functions for it > to use? either that, or link with libgcc - why does arch/arm64 do this and arch/arm not? It's not obvious from git log. > If not, perhaps there should be a PR filed? > > Or is our use case justifiably different to the general case of '-Os'? > If so, why? shouldn't be - a patch, such as this, that claims to reduce code size, and that only turns on the new built-in when CC_OPTIMIZE_FOR_SIZE is off, is generally not good :) OTOH, the target here is armv6+ performance - not armv4,5 code density - the OPTIMIZE_FOR_SIZE protection prevents armv4,5 build breakage. Kim [1] http://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html#Other-Builtins
On Thu, Feb 7, 2013 at 1:19 AM, Kim Phillips <kim.phillips@freescale.com> wrote: > On Wed, 6 Feb 2013 09:02:04 +0000 > "Woodhouse, David" <david.woodhouse@intel.com> wrote: > >> On Tue, 2013-02-05 at 21:04 -0600, Kim Phillips wrote: >> > gcc -Os emits calls to __bswapsi2 on those platforms to save space >> > because they don't have the single rev byte swap instruction. >> >> Is that the right thing for GCC to do in that situation? > > if it saves space, why wouldn't it be? > > "Many of these functions are only optimized in certain cases; if they > are not optimized in a particular case, a call to the library > function is emitted." [1] > > I see "(arm_arch6 || !optimize_size)" in gcc's define_expand > "bswapsi2" source, so GCC considers size optimization as a > legitimate one of those cases. > >> If so, perhaps we should be *providing* __bswap[sd]i2 functions for it >> to use? > > either that, or link with libgcc - why does arch/arm64 do this and > arch/arm not? It's not obvious from git log. One reason I have found, I don't know if it is the canonical one, is that linking with libgcc allows people to use all intrinsics e.g. soft float routines in the kernel without noticing it. If you limit the intrinsics to the ones linked into the kernel explicitly then this cannot happen. I have also seen cases where the libgcc intrinsics are improved over time, having the code in the kernel allows these improvements to be rolled into the kernel even if the user has an older toolchain. A number of ports link against libgcc and a roughly equal number do not, so it isn't clear that there's any consensus on the issue.
On 7 February 2013 10:19, Will Newton <will.newton@gmail.com> wrote: > On Thu, Feb 7, 2013 at 1:19 AM, Kim Phillips <kim.phillips@freescale.com> wrote: >> On Wed, 6 Feb 2013 09:02:04 +0000 >> "Woodhouse, David" <david.woodhouse@intel.com> wrote: >> >>> On Tue, 2013-02-05 at 21:04 -0600, Kim Phillips wrote: >>> > gcc -Os emits calls to __bswapsi2 on those platforms to save space >>> > because they don't have the single rev byte swap instruction. >>> >>> Is that the right thing for GCC to do in that situation? >> >> if it saves space, why wouldn't it be? >> >> "Many of these functions are only optimized in certain cases; if they >> are not optimized in a particular case, a call to the library >> function is emitted." [1] >> >> I see "(arm_arch6 || !optimize_size)" in gcc's define_expand >> "bswapsi2" source, so GCC considers size optimization as a >> legitimate one of those cases. >> >>> If so, perhaps we should be *providing* __bswap[sd]i2 functions for it >>> to use? >> >> either that, or link with libgcc - why does arch/arm64 do this and >> arch/arm not? It's not obvious from git log. > > One reason I have found, I don't know if it is the canonical one, is > that linking with libgcc allows people to use all intrinsics e.g. soft > float routines in the kernel without noticing it. If you limit the > intrinsics to the ones linked into the kernel explicitly then this > cannot happen. For arm64 we explicitly pass -mgeneral-regs-only to avoid any floating point generation. Soft-float is excluded by the ABI automatically. But we use other compiler intrinsics like __ffs and while they are currently generated inline, you can't guarantee, hence the linking with libgcc. > I have also seen cases where the libgcc intrinsics are improved over > time, having the code in the kernel allows these improvements to be > rolled into the kernel even if the user has an older toolchain. Indeed, the gcc guys do a lot benchmarking/optimisations on a wide range of processors, so we can take advantage of that in the kernel. But it's much easier on arm64 since the architecture is stable. On 32-bit arm we have to cope with a range of architecture versions with variations to the instruction set.
On Wed, Feb 06, 2013 at 07:19:05PM -0600, Kim Phillips wrote: > either that, or link with libgcc - why does arch/arm64 do this and > arch/arm not? It's not obvious from git log. We want to be in control of what code the kernel runs, and that means not using libgcc. libgcc can do all sorts of stuff because it makes assumptions about the environment which it is executing in (glibc). For example, it assumes there may be a GOT, and that it can issue SWI calls... However, the biggest reason not to use libgcc is that we want to control what gets used in the kernel - for example, no floating point, and no use of 64 x 64bit division. So all round, using libgcc in AArch32 would bring us a number of issues that we don't get with the current approach. AArch64 uses it because it's maintained entirely separately and the decision has been made by other people - and some of these concerns do not exist on a 64-bit architecture.
On Thu, 2013-02-07 at 18:13 +0000, Russell King - ARM Linux wrote: > > However, the biggest reason not to use libgcc is that we want to control > what gets used in the kernel - for example, no floating point, and no > use of 64 x 64bit division. Which is all very sensible. But there's no particular reason we couldn't add a __bswap[sd]i2 to the kernel's version of libgcc if we wanted to. If we're compiling with -Os on platforms without 'rev' and that's the thing that makes most sense, then we should make it work.
On Fri, 8 Feb 2013, Woodhouse, David wrote: > On Thu, 2013-02-07 at 18:13 +0000, Russell King - ARM Linux wrote: > > > > However, the biggest reason not to use libgcc is that we want to control > > what gets used in the kernel - for example, no floating point, and no > > use of 64 x 64bit division. > > Which is all very sensible. But there's no particular reason we couldn't > add a __bswap[sd]i2 to the kernel's version of libgcc if we wanted to. Absolutely. Nicolas
On Fri, 2013-02-08 at 15:04 -0500, Nicolas Pitre wrote: > On Fri, 8 Feb 2013, Woodhouse, David wrote: > > > On Thu, 2013-02-07 at 18:13 +0000, Russell King - ARM Linux wrote: > > > > > > However, the biggest reason not to use libgcc is that we want to control > > > what gets used in the kernel - for example, no floating point, and no > > > use of 64 x 64bit division. > > > > Which is all very sensible. But there's no particular reason we couldn't > > add a __bswap[sd]i2 to the kernel's version of libgcc if we wanted to. > > Absolutely. And then ARM can just set ARCH_USE_BUILTIN_BSWAP like other architectures do, right?
On Fri, 8 Feb 2013, Woodhouse, David wrote: > On Fri, 2013-02-08 at 15:04 -0500, Nicolas Pitre wrote: > > On Fri, 8 Feb 2013, Woodhouse, David wrote: > > > > > On Thu, 2013-02-07 at 18:13 +0000, Russell King - ARM Linux wrote: > > > > > > > > However, the biggest reason not to use libgcc is that we want to control > > > > what gets used in the kernel - for example, no floating point, and no > > > > use of 64 x 64bit division. > > > > > > Which is all very sensible. But there's no particular reason we couldn't > > > add a __bswap[sd]i2 to the kernel's version of libgcc if we wanted to. > > > > Absolutely. > > And then ARM can just set ARCH_USE_BUILTIN_BSWAP like other > architectures do, right? If that turns out to be beneficial over what we have now, then yes. I didn't read back the whole thread to form an opinion though. Nicolas
diff --git a/arch/arm/include/asm/swab.h b/arch/arm/include/asm/swab.h index 537fc9b..159ab16 100644 --- a/arch/arm/include/asm/swab.h +++ b/arch/arm/include/asm/swab.h @@ -35,4 +35,13 @@ static inline __attribute_const__ __u32 __arch_swab32(__u32 x) #define __arch_swab32 __arch_swab32 #endif + +#if !defined(CONFIG_CC_OPTIMIZE_FOR_SIZE) && GCC_VERSION >= 40500 +#define __HAVE_BUILTIN_BSWAP32__ +#define __HAVE_BUILTIN_BSWAP64__ +#if GCC_VERSION >= 40800 +#define __HAVE_BUILTIN_BSWAP16__ +#endif +#endif + #endif