diff mbox

[1/2] crypto: ccp - Reduce stack frame size with KASAN

Message ID 20170328095814.3734615-1-arnd@arndb.de (mailing list archive)
State Superseded
Delegated to: Herbert Xu
Headers show

Commit Message

Arnd Bergmann March 28, 2017, 9:58 a.m. UTC
The newly added AES GCM implementation uses one of the largest stack frames
in the kernel, around 1KB on normal 64-bit kernels, and 1.6KB when CONFIG_KASAN
is enabled:

drivers/crypto/ccp/ccp-ops.c: In function 'ccp_run_aes_gcm_cmd':
drivers/crypto/ccp/ccp-ops.c:851:1: error: the frame size of 1632 bytes is larger than 1536 bytes [-Werror=frame-larger-than=]

This is problematic for multiple reasons:

 - The crypto functions are often used in deep call chains, e.g. behind
   mm, fs and dm layers, making it more likely to run into an actual stack
   overflow

 - Using this much stack space is an indicator that the code is not
   written to be as efficient as it could be.

 - While this goes unnoticed at the moment in mainline with the frame size
   warning being disabled when KASAN is in use, I would like to enable
   the warning again, and the current code is slightly above my arbitrary
   pick for a limit of 1536 bytes (I already did patches for every other
   driver exceeding this).

A more drastic refactoring of the driver might be needed to reduce the
stack usage more substantially, but this patch is fairly simple and
at least addresses the third one of the problems I mentioned, reducing the
stack size by about 150 bytes and bringing it below the warning limit
I picked.

Fixes: 36cf515b9bbe ("crypto: ccp - Enable support for AES GCM on v5 CCPs")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 drivers/crypto/ccp/ccp-dev.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

Comments

Gary R Hook March 28, 2017, 2:15 p.m. UTC | #1
On 03/28/2017 04:58 AM, Arnd Bergmann wrote:> The newly added AES GCM 
implementation uses one of the largest stack frames
> in the kernel, around 1KB on normal 64-bit kernels, and 1.6KB when
> CONFIG_KASAN
> is enabled:
>
> drivers/crypto/ccp/ccp-ops.c: In function 'ccp_run_aes_gcm_cmd':
> drivers/crypto/ccp/ccp-ops.c:851:1: error: the frame size of 1632 bytes
> is larger than 1536 bytes [-Werror=frame-larger-than=]
>
> This is problematic for multiple reasons:
>
>  - The crypto functions are often used in deep call chains, e.g. behind
>    mm, fs and dm layers, making it more likely to run into an actual stack
>    overflow
>
>  - Using this much stack space is an indicator that the code is not
>    written to be as efficient as it could be.

I'm not sure I agree that A -> B, but I will certainly look into this.

>  - While this goes unnoticed at the moment in mainline with the frame size
>    warning being disabled when KASAN is in use, I would like to enable
>    the warning again, and the current code is slightly above my arbitrary
>    pick for a limit of 1536 bytes (I already did patches for every other
>    driver exceeding this).

I've got my stack frame size (also) set to 1536, and would have paid 
more attention
had a warning occurred due to my code.

> A more drastic refactoring of the driver might be needed to reduce the
> stack usage more substantially, but this patch is fairly simple and
> at least addresses the third one of the problems I mentioned, reducing the
> stack size by about 150 bytes and bringing it below the warning limit
> I picked.

Again, I'll devote some time to this.

> diff --git a/drivers/crypto/ccp/ccp-dev.h b/drivers/crypto/ccp/ccp-dev.h
> index 3a45c2af2fbd..c5ea0796a891 100644
> --- a/drivers/crypto/ccp/ccp-dev.h
> +++ b/drivers/crypto/ccp/ccp-dev.h
> @@ -432,24 +432,24 @@ struct ccp_dma_info {
>          unsigned int offset;
>          unsigned int length;
>          enum dma_data_direction dir;
> -};
> +} __packed __aligned(4);

My gcc 4.8 doesn't understand __aligned(). Shouldn't we use
#pragma(4) here?


>  struct ccp_dm_workarea {
>          struct device *dev;
>          struct dma_pool *dma_pool;
> -       unsigned int length;
>
>          u8 *address;
>          struct ccp_dma_info dma;
> +       unsigned int length;
>  };
>
>  struct ccp_sg_workarea {
>          struct scatterlist *sg;
>          int nents;
> +       unsigned int dma_count;
>
>          struct scatterlist *dma_sg;
>          struct device *dma_dev;
> -       unsigned int dma_count;
>          enum dma_data_direction dma_dir;
>
>          unsigned int sg_used;

I'm okay with rearranging, but I'm going to submit an alternative patch.
Arnd Bergmann March 28, 2017, 3:03 p.m. UTC | #2
On Tue, Mar 28, 2017 at 4:15 PM, Gary R Hook <ghook@amd.com> wrote:

>> A more drastic refactoring of the driver might be needed to reduce the
>> stack usage more substantially, but this patch is fairly simple and
>> at least addresses the third one of the problems I mentioned, reducing the
>> stack size by about 150 bytes and bringing it below the warning limit
>> I picked.
>
>
> Again, I'll devote some time to this.
>
>> diff --git a/drivers/crypto/ccp/ccp-dev.h b/drivers/crypto/ccp/ccp-dev.h
>> index 3a45c2af2fbd..c5ea0796a891 100644
>> --- a/drivers/crypto/ccp/ccp-dev.h
>> +++ b/drivers/crypto/ccp/ccp-dev.h
>> @@ -432,24 +432,24 @@ struct ccp_dma_info {
>>          unsigned int offset;
>>          unsigned int length;
>>          enum dma_data_direction dir;
>> -};
>> +} __packed __aligned(4);
>
>
> My gcc 4.8 doesn't understand __aligned(). Shouldn't we use
> #pragma(4) here?



>>  struct ccp_dm_workarea {
>>          struct device *dev;
>>          struct dma_pool *dma_pool;
>> -       unsigned int length;
>>
>>          u8 *address;
>>          struct ccp_dma_info dma;
>> +       unsigned int length;
>>  };
>>
>>  struct ccp_sg_workarea {
>>          struct scatterlist *sg;
>>          int nents;
>> +       unsigned int dma_count;
>>
>>          struct scatterlist *dma_sg;
>>          struct device *dma_dev;
>> -       unsigned int dma_count;
>>          enum dma_data_direction dma_dir;
>>
>>          unsigned int sg_used;
>
>
> I'm okay with rearranging, but I'm going to submit an alternative patch.

Ok, thanks a lot!
Arnd Bergmann March 28, 2017, 3:10 p.m. UTC | #3
On Tue, Mar 28, 2017 at 4:15 PM, Gary R Hook <ghook@amd.com> wrote:
> On 03/28/2017 04:58 AM, Arnd Bergmann wrote:> The newly added AES GCM
> implementation uses one of the largest stack frames

>> diff --git a/drivers/crypto/ccp/ccp-dev.h b/drivers/crypto/ccp/ccp-dev.h
>> index 3a45c2af2fbd..c5ea0796a891 100644
>> --- a/drivers/crypto/ccp/ccp-dev.h
>> +++ b/drivers/crypto/ccp/ccp-dev.h
>> @@ -432,24 +432,24 @@ struct ccp_dma_info {
>>          unsigned int offset;
>>          unsigned int length;
>>          enum dma_data_direction dir;
>> -};
>> +} __packed __aligned(4);
>
>
> My gcc 4.8 doesn't understand __aligned(). Shouldn't we use
> #pragma(4) here?

That is odd, the __aligned() macro gets defined for all compiler versions
in linux/compiler.h, and the aligned attribute should work for all supported
compilers (3.2 and higher), while #pragma pack() requires gcc-4.0 or
higher.

We generally prefer attribute syntax in the kernel over pragmas, even
when they are functionally the same.

      Arnd
Gary R Hook March 28, 2017, 3:28 p.m. UTC | #4
On 03/28/2017 10:10 AM, Arnd Bergmann wrote:
> On Tue, Mar 28, 2017 at 4:15 PM, Gary R Hook <ghook@amd.com> wrote:
>> On 03/28/2017 04:58 AM, Arnd Bergmann wrote:> The newly added AES GCM
>> implementation uses one of the largest stack frames
>
>>> diff --git a/drivers/crypto/ccp/ccp-dev.h b/drivers/crypto/ccp/ccp-dev.h
>>> index 3a45c2af2fbd..c5ea0796a891 100644
>>> --- a/drivers/crypto/ccp/ccp-dev.h
>>> +++ b/drivers/crypto/ccp/ccp-dev.h
>>> @@ -432,24 +432,24 @@ struct ccp_dma_info {
>>>          unsigned int offset;
>>>          unsigned int length;
>>>          enum dma_data_direction dir;
>>> -};
>>> +} __packed __aligned(4);
>>
>>
>> My gcc 4.8 doesn't understand __aligned(). Shouldn't we use
>> #pragma(4) here?
>
> That is odd, the __aligned() macro gets defined for all compiler versions
> in linux/compiler.h, and the aligned attribute should work for all supported
> compilers (3.2 and higher), while #pragma pack() requires gcc-4.0 or
> higher.
>
> We generally prefer attribute syntax in the kernel over pragmas, even
> when they are functionally the same.

Yes, it's extremely odd, and I understand this preference. Please ignore my
submitted alternate and let me track this down....
Gary R Hook March 28, 2017, 3:56 p.m. UTC | #5
On 03/28/2017 10:10 AM, Arnd Bergmann wrote:
>>> -};
>>> +} __packed __aligned(4);
>>
>> My gcc 4.8 doesn't understand __aligned(). Shouldn't we use
>> #pragma(4) here?
>
> That is odd, the __aligned() macro gets defined for all compiler versions
> in linux/compiler.h, and the aligned attribute should work for all supported
> compilers (3.2 and higher), while #pragma pack() requires gcc-4.0 or
> higher.
>

Tried again in a couple of trees. Not sure what I did wrong, but the 
compiler
seems to be happy now. Huh.

Will submit a V2.
diff mbox

Patch

diff --git a/drivers/crypto/ccp/ccp-dev.h b/drivers/crypto/ccp/ccp-dev.h
index 3a45c2af2fbd..c5ea0796a891 100644
--- a/drivers/crypto/ccp/ccp-dev.h
+++ b/drivers/crypto/ccp/ccp-dev.h
@@ -432,24 +432,24 @@  struct ccp_dma_info {
 	unsigned int offset;
 	unsigned int length;
 	enum dma_data_direction dir;
-};
+} __packed __aligned(4);
 
 struct ccp_dm_workarea {
 	struct device *dev;
 	struct dma_pool *dma_pool;
-	unsigned int length;
 
 	u8 *address;
 	struct ccp_dma_info dma;
+	unsigned int length;
 };
 
 struct ccp_sg_workarea {
 	struct scatterlist *sg;
 	int nents;
+	unsigned int dma_count;
 
 	struct scatterlist *dma_sg;
 	struct device *dma_dev;
-	unsigned int dma_count;
 	enum dma_data_direction dma_dir;
 
 	unsigned int sg_used;