diff mbox

lib: fix 842 build on 32-bit architectures

Message ID 1431561144-29931-1-git-send-email-ddstreet@ieee.org (mailing list archive)
State RFC
Headers show

Commit Message

Dan Streetman May 13, 2015, 11:52 p.m. UTC
> Building the 842 code on 32-bit ARM currently results in this link
> error:
> 
> ERROR: "__aeabi_uldivmod" [lib/842/842_decompress.ko] undefined!

Oops!  Guess I should build/test on 32 bit more.

> 
> The reason is that the __do_index function performs a 64-bit
> division by a power-of-two number, but it has no insight into
> the function arguments.
> 
> By marking that function inline, the fsize argument is always
> known at the time that do_index is called, and the compiler is
> able to replace the extremely expensive 64-bit division with
> a cheap constant shift operation.

alternately, we know that fsize will always be less than 64 bits,
at most it's 4<<9 or 8<<8 (both == 1<<11).  So we could just change
its type to u16.


Or, we could inline it and change the type to u16.  In any case,

Acked-by: Dan Streetman <ddstreet@ieee.org>

> 
> Aside from fixing that link error, this approach should also improve
> both code size and performance on 32-bit architectures significantly.
> 
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
> Found while building arm32 allmodconfig with gcc-5.0
> 
> diff --git a/lib/842/842_decompress.c b/lib/842/842_decompress.c
> index 6b2b45aecde3..285bf6b6959c 100644
> --- a/lib/842/842_decompress.c
> +++ b/lib/842/842_decompress.c
> @@ -169,7 +169,7 @@ static int do_data(struct sw842_param *p, u8 n)
>  	return 0;
>  }
>  
> -static int __do_index(struct sw842_param *p, u8 size, u8 bits, u64 fsize)
> +static inline int __do_index(struct sw842_param *p, u8 size, u8 bits, u64 fsize)
>  {
>  	u64 index, offset, total = round_down(p->out - p->ostart, 8);
>  	int ret;
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Dan Streetman May 14, 2015, 8:54 a.m. UTC | #1
On Wed, May 13, 2015 at 7:52 PM, Dan Streetman <ddstreet@ieee.org> wrote:
>> Building the 842 code on 32-bit ARM currently results in this link
>> error:
>>
>> ERROR: "__aeabi_uldivmod" [lib/842/842_decompress.ko] undefined!
>
> Oops!  Guess I should build/test on 32 bit more.
>
>>
>> The reason is that the __do_index function performs a 64-bit
>> division by a power-of-two number, but it has no insight into
>> the function arguments.

wait, do you mean the 64 bit mod, total % fsize?  That should already
be fixed in Herbert's tree, I changed it to subtraction instead.

In any case, I looked at the code again and I think the fsize
parameter can be removed, and just simply calculated in the function,
it's just a shift.  I'll send a patch.


>>
>> By marking that function inline, the fsize argument is always
>> known at the time that do_index is called, and the compiler is
>> able to replace the extremely expensive 64-bit division with
>> a cheap constant shift operation.
>
> alternately, we know that fsize will always be less than 64 bits,
> at most it's 4<<9 or 8<<8 (both == 1<<11).  So we could just change
> its type to u16.
>
> diff --git a/lib/842/842_decompress.c b/lib/842/842_decompress.c
> index 6b2b45aecde3..285bf6b6959c 100644
> --- a/lib/842/842_decompress.c
> +++ b/lib/842/842_decompress.c
> @@ -169,7 +169,7 @@ static int do_data(struct sw842_param *p, u8 n)
>         return 0;
>  }
>
> -static int __do_index(struct sw842_param *p, u8 size, u8 bits, u64 fsize)
> +static int __do_index(struct sw842_param *p, u8 size, u8 bits, u16 fsize)
>  {
>         u64 index, offset, total = round_down(p->out - p->ostart, 8);
>         int ret;
>
> Or, we could inline it and change the type to u16.  In any case,
>
> Acked-by: Dan Streetman <ddstreet@ieee.org>
>
>>
>> Aside from fixing that link error, this approach should also improve
>> both code size and performance on 32-bit architectures significantly.
>>
>> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
>> ---
>> Found while building arm32 allmodconfig with gcc-5.0
>>
>> diff --git a/lib/842/842_decompress.c b/lib/842/842_decompress.c
>> index 6b2b45aecde3..285bf6b6959c 100644
>> --- a/lib/842/842_decompress.c
>> +++ b/lib/842/842_decompress.c
>> @@ -169,7 +169,7 @@ static int do_data(struct sw842_param *p, u8 n)
>>       return 0;
>>  }
>>
>> -static int __do_index(struct sw842_param *p, u8 size, u8 bits, u64 fsize)
>> +static inline int __do_index(struct sw842_param *p, u8 size, u8 bits, u64 fsize)
>>  {
>>       u64 index, offset, total = round_down(p->out - p->ostart, 8);
>>       int ret;
>>
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Arnd Bergmann May 14, 2015, 10:49 a.m. UTC | #2
On Thursday 14 May 2015 04:54:07 Dan Streetman wrote:
> On Wed, May 13, 2015 at 7:52 PM, Dan Streetman <ddstreet@ieee.org> wrote:
> >> Building the 842 code on 32-bit ARM currently results in this link
> >> error:
> >>
> >> ERROR: "__aeabi_uldivmod" [lib/842/842_decompress.ko] undefined!
> >
> > Oops!  Guess I should build/test on 32 bit more.
> >
> >>
> >> The reason is that the __do_index function performs a 64-bit
> >> division by a power-of-two number, but it has no insight into
> >> the function arguments.
> 
> wait, do you mean the 64 bit mod, total % fsize?  That should already
> be fixed in Herbert's tree, I changed it to subtraction instead.

Yes, that's the one. I was looking at yesterday's linux-next which still
had the bug, but your fix has made it into today's release, so my patch
is no longer needed.

> In any case, I looked at the code again and I think the fsize
> parameter can be removed, and just simply calculated in the function,
> it's just a shift.  I'll send a patch.

Not necessary for this problem any more, but it could still make sense
if you think that improves the code. You can also try to see if marking
the function inline has any effect on code size or performance if either
of them matters.

	Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dan Streetman May 14, 2015, 11:03 a.m. UTC | #3
On Thu, May 14, 2015 at 6:49 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> On Thursday 14 May 2015 04:54:07 Dan Streetman wrote:
>> On Wed, May 13, 2015 at 7:52 PM, Dan Streetman <ddstreet@ieee.org> wrote:
>> >> Building the 842 code on 32-bit ARM currently results in this link
>> >> error:
>> >>
>> >> ERROR: "__aeabi_uldivmod" [lib/842/842_decompress.ko] undefined!
>> >
>> > Oops!  Guess I should build/test on 32 bit more.
>> >
>> >>
>> >> The reason is that the __do_index function performs a 64-bit
>> >> division by a power-of-two number, but it has no insight into
>> >> the function arguments.
>>
>> wait, do you mean the 64 bit mod, total % fsize?  That should already
>> be fixed in Herbert's tree, I changed it to subtraction instead.
>
> Yes, that's the one. I was looking at yesterday's linux-next which still
> had the bug, but your fix has made it into today's release, so my patch
> is no longer needed.
>
>> In any case, I looked at the code again and I think the fsize
>> parameter can be removed, and just simply calculated in the function,
>> it's just a shift.  I'll send a patch.
>
> Not necessary for this problem any more, but it could still make sense
> if you think that improves the code. You can also try to see if marking
> the function inline has any effect on code size or performance if either
> of them matters.

Yep, I'll run some performance tests both ways and send a patch if it
does improve things.  Thanks!

>
>         Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/lib/842/842_decompress.c b/lib/842/842_decompress.c
index 6b2b45aecde3..285bf6b6959c 100644
--- a/lib/842/842_decompress.c
+++ b/lib/842/842_decompress.c
@@ -169,7 +169,7 @@  static int do_data(struct sw842_param *p, u8 n)
 	return 0;
 }
 
-static int __do_index(struct sw842_param *p, u8 size, u8 bits, u64 fsize)
+static int __do_index(struct sw842_param *p, u8 size, u8 bits, u16 fsize)
 {
 	u64 index, offset, total = round_down(p->out - p->ostart, 8);
 	int ret;