Message ID | 20180310001021.6437-2-labbott@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 2018-03-10 01:10, Laura Abbott wrote: > /* collect all inputs belonging to the same chip */ > first = i; > - memset(mask, 0, sizeof(mask)); > + memset(mask, 0, sizeof(*mask)); see below > @@ -2887,14 +2909,30 @@ void gpiod_set_array_value_complex(bool raw, bool can_sleep, > > while (i < array_size) { > struct gpio_chip *chip = desc_array[i]->gdev->chip; > - unsigned long mask[BITS_TO_LONGS(chip->ngpio)]; > - unsigned long bits[BITS_TO_LONGS(chip->ngpio)]; > + unsigned long *mask; > + unsigned long *bits; > int count = 0; > > + mask = kmalloc_array(BITS_TO_LONGS(chip->ngpio), > + sizeof(*mask), > + can_sleep ? GFP_KERNEL : GFP_ATOMIC); > + > + if (!mask) > + return; > + > + bits = kmalloc_array(BITS_TO_LONGS(chip->ngpio), > + sizeof(*bits), > + can_sleep ? GFP_KERNEL : GFP_ATOMIC); > + > + if (!bits) { > + kfree(mask); > + return; > + } > + > if (!can_sleep) > WARN_ON(chip->can_sleep); > > - memset(mask, 0, sizeof(mask)); > + memset(mask, 0, sizeof(*mask)); Hm, it seems you're now only clearing the first word of mask, not the entire allocation. Why not just use kcalloc() instead of kmalloc_array to have it automatically cleared? Other random thoughts: maybe two allocations for each loop iteration is a bit much. Maybe do a first pass over the array and collect the maximal chip->ngpio, do the memory allocation and freeing outside the loop (then you'd of course need to preserve the memset() with appropriate length computed). And maybe even just do one allocation, making bits point at the second half. Does the set function need to be updated to return an int to be able to inform the caller that memory allocation failed? Rasmus
On 03/12/2018 08:00 AM, Rasmus Villemoes wrote: > On 2018-03-10 01:10, Laura Abbott wrote: >> /* collect all inputs belonging to the same chip */ >> first = i; >> - memset(mask, 0, sizeof(mask)); >> + memset(mask, 0, sizeof(*mask)); > > see below > >> @@ -2887,14 +2909,30 @@ void gpiod_set_array_value_complex(bool raw, bool can_sleep, >> >> while (i < array_size) { >> struct gpio_chip *chip = desc_array[i]->gdev->chip; >> - unsigned long mask[BITS_TO_LONGS(chip->ngpio)]; >> - unsigned long bits[BITS_TO_LONGS(chip->ngpio)]; >> + unsigned long *mask; >> + unsigned long *bits; >> int count = 0; >> >> + mask = kmalloc_array(BITS_TO_LONGS(chip->ngpio), >> + sizeof(*mask), >> + can_sleep ? GFP_KERNEL : GFP_ATOMIC); >> + >> + if (!mask) >> + return; >> + >> + bits = kmalloc_array(BITS_TO_LONGS(chip->ngpio), >> + sizeof(*bits), >> + can_sleep ? GFP_KERNEL : GFP_ATOMIC); >> + >> + if (!bits) { >> + kfree(mask); >> + return; >> + } >> + >> if (!can_sleep) >> WARN_ON(chip->can_sleep); >> >> - memset(mask, 0, sizeof(mask)); >> + memset(mask, 0, sizeof(*mask)); > > Hm, it seems you're now only clearing the first word of mask, not the > entire allocation. Why not just use kcalloc() instead of kmalloc_array > to have it automatically cleared? > Bleh, I didn't think through that carefully. I'll just switch to kcalloc, especially since it calls kmalloc_array. > Other random thoughts: maybe two allocations for each loop iteration is > a bit much. Maybe do a first pass over the array and collect the maximal > chip->ngpio, do the memory allocation and freeing outside the loop (then > you'd of course need to preserve the memset() with appropriate length > computed). And maybe even just do one allocation, making bits point at > the second half. > I was trying to make minimal changes and match the existing code. Is this likely to be an actual hot path to optimize? > Does the set function need to be updated to return an int to be able to > inform the caller that memory allocation failed? > That would involve changing the public API. I don't have a problem doing so if that's what you want. > Rasmus > Thanks, Laura
On 2018-03-13 00:40, Laura Abbott wrote: > On 03/12/2018 08:00 AM, Rasmus Villemoes wrote: >> >> Hm, it seems you're now only clearing the first word of mask, not the >> entire allocation. Why not just use kcalloc() instead of kmalloc_array >> to have it automatically cleared? > > Bleh, I didn't think through that carefully. I'll just switch > to kcalloc, especially since it calls kmalloc_array. > >> Other random thoughts: maybe two allocations for each loop iteration is >> a bit much. Maybe do a first pass over the array and collect the maximal >> chip->ngpio, do the memory allocation and freeing outside the loop (then >> you'd of course need to preserve the memset() with appropriate length >> computed). And maybe even just do one allocation, making bits point at >> the second half. >> > > I was trying to make minimal changes and match the existing code. Well, textually you may be making the minimal changes (though the error handling adds some "boilerplate" kfree()s etc.), but semantically adding two kmallocs in a loop could be a big deal. Dunno, maybe in practice all the gpios (almost always) belong to the same chip, so there's only one iteration through the loop anyway. > Is this likely to be an actual hot path to optimize? No idea, it was just a drive-by comment, so also... >> Does the set function need to be updated to return an int to be able to >> inform the caller that memory allocation failed? > > That would involve changing the public API. I don't have a problem > doing so if that's what you want. ... I don't "want" anything, that'll be for the gpio maintainers to decide. Rasmus
On Mon, Mar 12, 2018 at 04:00:36PM +0100, Rasmus Villemoes wrote: > On 2018-03-10 01:10, Laura Abbott wrote: > > @@ -2887,14 +2909,30 @@ void gpiod_set_array_value_complex(bool raw, bool can_sleep, > > > > while (i < array_size) { > > struct gpio_chip *chip = desc_array[i]->gdev->chip; > > - unsigned long mask[BITS_TO_LONGS(chip->ngpio)]; > > - unsigned long bits[BITS_TO_LONGS(chip->ngpio)]; > > + unsigned long *mask; > > + unsigned long *bits; > > int count = 0; > > > > + mask = kmalloc_array(BITS_TO_LONGS(chip->ngpio), > > + sizeof(*mask), > > + can_sleep ? GFP_KERNEL : GFP_ATOMIC); > > + > > + if (!mask) > > + return; > > + > > + bits = kmalloc_array(BITS_TO_LONGS(chip->ngpio), > > + sizeof(*bits), > > + can_sleep ? GFP_KERNEL : GFP_ATOMIC); > > + > > + if (!bits) { > > + kfree(mask); > > + return; > > + } > > + > > if (!can_sleep) > > WARN_ON(chip->can_sleep); > > > > - memset(mask, 0, sizeof(mask)); > > + memset(mask, 0, sizeof(*mask)); > > Other random thoughts: maybe two allocations for each loop iteration is > a bit much. Maybe do a first pass over the array and collect the maximal > chip->ngpio, do the memory allocation and freeing outside the loop (then > you'd of course need to preserve the memset() with appropriate length > computed). And maybe even just do one allocation, making bits point at > the second half. I think those are great ideas because the function is kind of a hotpath and usage of VLAs was motivated by the desire to make it fast. I'd go one step further and store the maximum ngpio of all registered chips in a global variable (and update it in gpiochip_add_data_with_key()), then allocate 2 * max_ngpio once before entering the loop (as you've suggested). That would avoid the first pass to determine the maximum chip->ngpio. In most systems max_ngpio will be < 64, so one or two unsigned longs depending on the arch's bitness. FWIW, to achieve a stack overflow the platform or a driver need to specify a huge number of GPIOs for a chip. So the exploitability is limited, but of course it's still better to get rid of the VLAs. Running v2 of this patch through checkpatch --strict results in a few "Alignment should match open parenthesis" and one "Please don't use multiple blank lines" complaint, granted those are nits but it may be worth fixing them up front lest the usual suspects come along and submit bikeshedding patches. Thanks, Lukas
diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c index d66de67ef307..124727c74931 100644 --- a/drivers/gpio/gpiolib.c +++ b/drivers/gpio/gpiolib.c @@ -2662,16 +2662,33 @@ int gpiod_get_array_value_complex(bool raw, bool can_sleep, while (i < array_size) { struct gpio_chip *chip = desc_array[i]->gdev->chip; - unsigned long mask[BITS_TO_LONGS(chip->ngpio)]; - unsigned long bits[BITS_TO_LONGS(chip->ngpio)]; + unsigned long *mask; + unsigned long *bits; int first, j, ret; + mask = kmalloc_array(BITS_TO_LONGS(chip->ngpio), + sizeof(*mask), + can_sleep ? GFP_KERNEL : GFP_ATOMIC); + + if (!mask) + return -ENOMEM; + + bits = kmalloc_array(BITS_TO_LONGS(chip->ngpio), + sizeof(*bits), + can_sleep ? GFP_KERNEL : GFP_ATOMIC); + + if (!bits) { + kfree(mask); + return -ENOMEM; + } + + if (!can_sleep) WARN_ON(chip->can_sleep); /* collect all inputs belonging to the same chip */ first = i; - memset(mask, 0, sizeof(mask)); + memset(mask, 0, sizeof(*mask)); do { const struct gpio_desc *desc = desc_array[i]; int hwgpio = gpio_chip_hwgpio(desc); @@ -2682,8 +2699,11 @@ int gpiod_get_array_value_complex(bool raw, bool can_sleep, (desc_array[i]->gdev->chip == chip)); ret = gpio_chip_get_multiple(chip, mask, bits); - if (ret) + if (ret) { + kfree(bits); + kfree(mask); return ret; + } for (j = first; j < i; j++) { const struct gpio_desc *desc = desc_array[j]; @@ -2695,6 +2715,8 @@ int gpiod_get_array_value_complex(bool raw, bool can_sleep, value_array[j] = value; trace_gpio_value(desc_to_gpio(desc), 1, value); } + kfree(bits); + kfree(mask); } return 0; } @@ -2887,14 +2909,30 @@ void gpiod_set_array_value_complex(bool raw, bool can_sleep, while (i < array_size) { struct gpio_chip *chip = desc_array[i]->gdev->chip; - unsigned long mask[BITS_TO_LONGS(chip->ngpio)]; - unsigned long bits[BITS_TO_LONGS(chip->ngpio)]; + unsigned long *mask; + unsigned long *bits; int count = 0; + mask = kmalloc_array(BITS_TO_LONGS(chip->ngpio), + sizeof(*mask), + can_sleep ? GFP_KERNEL : GFP_ATOMIC); + + if (!mask) + return; + + bits = kmalloc_array(BITS_TO_LONGS(chip->ngpio), + sizeof(*bits), + can_sleep ? GFP_KERNEL : GFP_ATOMIC); + + if (!bits) { + kfree(mask); + return; + } + if (!can_sleep) WARN_ON(chip->can_sleep); - memset(mask, 0, sizeof(mask)); + memset(mask, 0, sizeof(*mask)); do { struct gpio_desc *desc = desc_array[i]; int hwgpio = gpio_chip_hwgpio(desc); @@ -2925,6 +2963,9 @@ void gpiod_set_array_value_complex(bool raw, bool can_sleep, /* push collected bits to outputs */ if (count != 0) gpio_chip_set_multiple(chip, mask, bits); + + kfree(mask); + kfree(bits); } }
The new challenge is to remove VLAs from the kernel (see https://lkml.org/lkml/2018/3/7/621) This patch replaces several VLAs with an appropriate call to kmalloc_array. Signed-off-by: Laura Abbott <labbott@redhat.com> --- drivers/gpio/gpiolib.c | 55 +++++++++++++++++++++++++++++++++++++++++++------- 1 file changed, 48 insertions(+), 7 deletions(-)