diff mbox

[1/4] gpio: Remove VLA from gpiolib

Message ID 20180310001021.6437-2-labbott@redhat.com (mailing list archive)
State New, archived
Headers show

Commit Message

Laura Abbott March 10, 2018, 12:10 a.m. UTC
The new challenge is to remove VLAs from the kernel
(see https://lkml.org/lkml/2018/3/7/621)

This patch replaces several VLAs with an appropriate call to
kmalloc_array.

Signed-off-by: Laura Abbott <labbott@redhat.com>
---
 drivers/gpio/gpiolib.c | 55 +++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 48 insertions(+), 7 deletions(-)

Comments

Rasmus Villemoes March 12, 2018, 3 p.m. UTC | #1
On 2018-03-10 01:10, Laura Abbott wrote:
>  		/* collect all inputs belonging to the same chip */
>  		first = i;
> -		memset(mask, 0, sizeof(mask));
> +		memset(mask, 0, sizeof(*mask));

see below

> @@ -2887,14 +2909,30 @@ void gpiod_set_array_value_complex(bool raw, bool can_sleep,
>  
>  	while (i < array_size) {
>  		struct gpio_chip *chip = desc_array[i]->gdev->chip;
> -		unsigned long mask[BITS_TO_LONGS(chip->ngpio)];
> -		unsigned long bits[BITS_TO_LONGS(chip->ngpio)];
> +		unsigned long *mask;
> +		unsigned long *bits;
>  		int count = 0;
>  
> +		mask = kmalloc_array(BITS_TO_LONGS(chip->ngpio),
> +				sizeof(*mask),
> +				can_sleep ? GFP_KERNEL : GFP_ATOMIC);
> +
> +		if (!mask)
> +			return;
> +
> +		bits = kmalloc_array(BITS_TO_LONGS(chip->ngpio),
> +				sizeof(*bits),
> +				can_sleep ? GFP_KERNEL : GFP_ATOMIC);
> +
> +		if (!bits) {
> +			kfree(mask);
> +			return;
> +		}
> +
>  		if (!can_sleep)
>  			WARN_ON(chip->can_sleep);
>  
> -		memset(mask, 0, sizeof(mask));
> +		memset(mask, 0, sizeof(*mask));

Hm, it seems you're now only clearing the first word of mask, not the
entire allocation. Why not just use kcalloc() instead of kmalloc_array
to have it automatically cleared?

Other random thoughts: maybe two allocations for each loop iteration is
a bit much. Maybe do a first pass over the array and collect the maximal
chip->ngpio, do the memory allocation and freeing outside the loop (then
you'd of course need to preserve the memset() with appropriate length
computed). And maybe even just do one allocation, making bits point at
the second half.

Does the set function need to be updated to return an int to be able to
inform the caller that memory allocation failed?

Rasmus
Laura Abbott March 12, 2018, 11:40 p.m. UTC | #2
On 03/12/2018 08:00 AM, Rasmus Villemoes wrote:
> On 2018-03-10 01:10, Laura Abbott wrote:
>>   		/* collect all inputs belonging to the same chip */
>>   		first = i;
>> -		memset(mask, 0, sizeof(mask));
>> +		memset(mask, 0, sizeof(*mask));
> 
> see below
> 
>> @@ -2887,14 +2909,30 @@ void gpiod_set_array_value_complex(bool raw, bool can_sleep,
>>   
>>   	while (i < array_size) {
>>   		struct gpio_chip *chip = desc_array[i]->gdev->chip;
>> -		unsigned long mask[BITS_TO_LONGS(chip->ngpio)];
>> -		unsigned long bits[BITS_TO_LONGS(chip->ngpio)];
>> +		unsigned long *mask;
>> +		unsigned long *bits;
>>   		int count = 0;
>>   
>> +		mask = kmalloc_array(BITS_TO_LONGS(chip->ngpio),
>> +				sizeof(*mask),
>> +				can_sleep ? GFP_KERNEL : GFP_ATOMIC);
>> +
>> +		if (!mask)
>> +			return;
>> +
>> +		bits = kmalloc_array(BITS_TO_LONGS(chip->ngpio),
>> +				sizeof(*bits),
>> +				can_sleep ? GFP_KERNEL : GFP_ATOMIC);
>> +
>> +		if (!bits) {
>> +			kfree(mask);
>> +			return;
>> +		}
>> +
>>   		if (!can_sleep)
>>   			WARN_ON(chip->can_sleep);
>>   
>> -		memset(mask, 0, sizeof(mask));
>> +		memset(mask, 0, sizeof(*mask));
> 
> Hm, it seems you're now only clearing the first word of mask, not the
> entire allocation. Why not just use kcalloc() instead of kmalloc_array
> to have it automatically cleared?
> 

Bleh, I didn't think through that carefully. I'll just switch
to kcalloc, especially since it calls kmalloc_array.

> Other random thoughts: maybe two allocations for each loop iteration is
> a bit much. Maybe do a first pass over the array and collect the maximal
> chip->ngpio, do the memory allocation and freeing outside the loop (then
> you'd of course need to preserve the memset() with appropriate length
> computed). And maybe even just do one allocation, making bits point at
> the second half.
> 

I was trying to make minimal changes and match the existing code. Is this
likely to be an actual hot path to optimize?

> Does the set function need to be updated to return an int to be able to
> inform the caller that memory allocation failed?
> 

That would involve changing the public API. I don't have a problem
doing so if that's what you want.

> Rasmus
> 

Thanks,
Laura
Rasmus Villemoes March 13, 2018, 7:23 a.m. UTC | #3
On 2018-03-13 00:40, Laura Abbott wrote:
> On 03/12/2018 08:00 AM, Rasmus Villemoes wrote:
>>
>> Hm, it seems you're now only clearing the first word of mask, not the
>> entire allocation. Why not just use kcalloc() instead of kmalloc_array
>> to have it automatically cleared?
> 
> Bleh, I didn't think through that carefully. I'll just switch
> to kcalloc, especially since it calls kmalloc_array.
> 
>> Other random thoughts: maybe two allocations for each loop iteration is
>> a bit much. Maybe do a first pass over the array and collect the maximal
>> chip->ngpio, do the memory allocation and freeing outside the loop (then
>> you'd of course need to preserve the memset() with appropriate length
>> computed). And maybe even just do one allocation, making bits point at
>> the second half.
>>
> 
> I was trying to make minimal changes and match the existing code.

Well, textually you may be making the minimal changes (though the error
handling adds some "boilerplate" kfree()s etc.), but semantically adding
two kmallocs in a loop could be a big deal. Dunno, maybe in practice all
the gpios (almost always) belong to the same chip, so there's only one
iteration through the loop anyway.

> Is this likely to be an actual hot path to optimize?

No idea, it was just a drive-by comment, so also...

>> Does the set function need to be updated to return an int to be able to
>> inform the caller that memory allocation failed?
> 
> That would involve changing the public API. I don't have a problem
> doing so if that's what you want.

... I don't "want" anything, that'll be for the gpio maintainers to decide.

Rasmus
Lukas Wunner March 17, 2018, 8:25 a.m. UTC | #4
On Mon, Mar 12, 2018 at 04:00:36PM +0100, Rasmus Villemoes wrote:
> On 2018-03-10 01:10, Laura Abbott wrote:
> > @@ -2887,14 +2909,30 @@ void gpiod_set_array_value_complex(bool raw, bool can_sleep,
> >  
> >  	while (i < array_size) {
> >  		struct gpio_chip *chip = desc_array[i]->gdev->chip;
> > -		unsigned long mask[BITS_TO_LONGS(chip->ngpio)];
> > -		unsigned long bits[BITS_TO_LONGS(chip->ngpio)];
> > +		unsigned long *mask;
> > +		unsigned long *bits;
> >  		int count = 0;
> >  
> > +		mask = kmalloc_array(BITS_TO_LONGS(chip->ngpio),
> > +				sizeof(*mask),
> > +				can_sleep ? GFP_KERNEL : GFP_ATOMIC);
> > +
> > +		if (!mask)
> > +			return;
> > +
> > +		bits = kmalloc_array(BITS_TO_LONGS(chip->ngpio),
> > +				sizeof(*bits),
> > +				can_sleep ? GFP_KERNEL : GFP_ATOMIC);
> > +
> > +		if (!bits) {
> > +			kfree(mask);
> > +			return;
> > +		}
> > +
> >  		if (!can_sleep)
> >  			WARN_ON(chip->can_sleep);
> >  
> > -		memset(mask, 0, sizeof(mask));
> > +		memset(mask, 0, sizeof(*mask));
> 
> Other random thoughts: maybe two allocations for each loop iteration is
> a bit much. Maybe do a first pass over the array and collect the maximal
> chip->ngpio, do the memory allocation and freeing outside the loop (then
> you'd of course need to preserve the memset() with appropriate length
> computed). And maybe even just do one allocation, making bits point at
> the second half.

I think those are great ideas because the function is kind of a hotpath
and usage of VLAs was motivated by the desire to make it fast.

I'd go one step further and store the maximum ngpio of all registered
chips in a global variable (and update it in gpiochip_add_data_with_key()),
then allocate 2 * max_ngpio once before entering the loop (as you've
suggested).  That would avoid the first pass to determine the maximum
chip->ngpio.  In most systems max_ngpio will be < 64, so one or two
unsigned longs depending on the arch's bitness.

FWIW, to achieve a stack overflow the platform or a driver need to specify
a huge number of GPIOs for a chip.  So the exploitability is limited,
but of course it's still better to get rid of the VLAs.

Running v2 of this patch through checkpatch --strict results in a few
"Alignment should match open parenthesis" and one "Please don't use
multiple blank lines" complaint, granted those are nits but it may
be worth fixing them up front lest the usual suspects come along and
submit bikeshedding patches.

Thanks,

Lukas
diff mbox

Patch

diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
index d66de67ef307..124727c74931 100644
--- a/drivers/gpio/gpiolib.c
+++ b/drivers/gpio/gpiolib.c
@@ -2662,16 +2662,33 @@  int gpiod_get_array_value_complex(bool raw, bool can_sleep,
 
 	while (i < array_size) {
 		struct gpio_chip *chip = desc_array[i]->gdev->chip;
-		unsigned long mask[BITS_TO_LONGS(chip->ngpio)];
-		unsigned long bits[BITS_TO_LONGS(chip->ngpio)];
+		unsigned long *mask;
+		unsigned long *bits;
 		int first, j, ret;
 
+		mask = kmalloc_array(BITS_TO_LONGS(chip->ngpio),
+				sizeof(*mask),
+				can_sleep ? GFP_KERNEL : GFP_ATOMIC);
+
+		if (!mask)
+			return -ENOMEM;
+
+		bits = kmalloc_array(BITS_TO_LONGS(chip->ngpio),
+				sizeof(*bits),
+				can_sleep ? GFP_KERNEL : GFP_ATOMIC);
+
+		if (!bits) {
+			kfree(mask);
+			return -ENOMEM;
+		}
+
+
 		if (!can_sleep)
 			WARN_ON(chip->can_sleep);
 
 		/* collect all inputs belonging to the same chip */
 		first = i;
-		memset(mask, 0, sizeof(mask));
+		memset(mask, 0, sizeof(*mask));
 		do {
 			const struct gpio_desc *desc = desc_array[i];
 			int hwgpio = gpio_chip_hwgpio(desc);
@@ -2682,8 +2699,11 @@  int gpiod_get_array_value_complex(bool raw, bool can_sleep,
 			 (desc_array[i]->gdev->chip == chip));
 
 		ret = gpio_chip_get_multiple(chip, mask, bits);
-		if (ret)
+		if (ret) {
+			kfree(bits);
+			kfree(mask);
 			return ret;
+		}
 
 		for (j = first; j < i; j++) {
 			const struct gpio_desc *desc = desc_array[j];
@@ -2695,6 +2715,8 @@  int gpiod_get_array_value_complex(bool raw, bool can_sleep,
 			value_array[j] = value;
 			trace_gpio_value(desc_to_gpio(desc), 1, value);
 		}
+		kfree(bits);
+		kfree(mask);
 	}
 	return 0;
 }
@@ -2887,14 +2909,30 @@  void gpiod_set_array_value_complex(bool raw, bool can_sleep,
 
 	while (i < array_size) {
 		struct gpio_chip *chip = desc_array[i]->gdev->chip;
-		unsigned long mask[BITS_TO_LONGS(chip->ngpio)];
-		unsigned long bits[BITS_TO_LONGS(chip->ngpio)];
+		unsigned long *mask;
+		unsigned long *bits;
 		int count = 0;
 
+		mask = kmalloc_array(BITS_TO_LONGS(chip->ngpio),
+				sizeof(*mask),
+				can_sleep ? GFP_KERNEL : GFP_ATOMIC);
+
+		if (!mask)
+			return;
+
+		bits = kmalloc_array(BITS_TO_LONGS(chip->ngpio),
+				sizeof(*bits),
+				can_sleep ? GFP_KERNEL : GFP_ATOMIC);
+
+		if (!bits) {
+			kfree(mask);
+			return;
+		}
+
 		if (!can_sleep)
 			WARN_ON(chip->can_sleep);
 
-		memset(mask, 0, sizeof(mask));
+		memset(mask, 0, sizeof(*mask));
 		do {
 			struct gpio_desc *desc = desc_array[i];
 			int hwgpio = gpio_chip_hwgpio(desc);
@@ -2925,6 +2963,9 @@  void gpiod_set_array_value_complex(bool raw, bool can_sleep,
 		/* push collected bits to outputs */
 		if (count != 0)
 			gpio_chip_set_multiple(chip, mask, bits);
+
+		kfree(mask);
+		kfree(bits);
 	}
 }