Message ID | 1448368427-1669-1-git-send-email-linux@rasmusvillemoes.dk (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | Jiri Kosina |
On Tue, 2015-11-24 at 13:33 +0100, Rasmus Villemoes wrote:
> The code in hid_debug_event() causes horrible code generation. First,
> we do a strlen() call for every byte we copy (we're doing a store to
> global memory, so gcc has no way of proving that strlen(buf) doesn't
> change). Second, since both i, list->tail and HID_DEBUG_BUFSIZE have
> signed type, the modulo computation has to take into account the
> possibility that list->tail+i is negative, so it's not just a simple
> and.
>
> Fix the former by simply not doing strlen() at all (we have to load
> buf[i] anyway, so testing it is almost free) and the latter by
> changing i to unsigned. This cuts 29% (69 bytes) of the size of the
> function.
[]
> diff --git a/drivers/hid/hid-debug.c b/drivers/hid/hid-debug.c
[]
> @@ -659,13 +659,13 @@ EXPORT_SYMBOL_GPL(hid_dump_device);
>  /* enqueue string to 'events' ring buffer */
>  void hid_debug_event(struct hid_device *hdev, char *buf)
>  {
> -	int i;
> +	unsigned i;
>  	struct hid_debug_list *list;
>  	unsigned long flags;
>
>  	spin_lock_irqsave(&hdev->debug_list_lock, flags);
>  	list_for_each_entry(list, &hdev->debug_list, node) {
> -		for (i = 0; i < strlen(buf); i++)
> +		for (i = 0; buf[i]; i++)
>  			list->hid_debug_buf[(list->tail + i) % HID_DEBUG_BUFSIZE] =
>  				buf[i];
>  		list->tail = (list->tail + i) % HID_DEBUG_BUFSIZE;

trivia:

The code might look nicer if (list->tail + i) % HID_DEBUG_BUFSIZE
was stored into a temporary.

Maybe use an if >= BUFSIZE to avoid a %

Something like:

	int pos = list->tail;

	for (i = 0; buf[i]; i++) {
		list->hid_debug_buf[pos++] = buf[i];
		if (pos >= HID_DEBUG_BUFSIZE)
			pos = 0;
	}

	list->tail = pos;

--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
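For readers who want to try the compare-and-reset variant outside the kernel, here is a minimal standalone sketch. The names `struct ring`, `RING_SIZE` and `ring_put_compare()` are hypothetical stand-ins and not part of hid-debug.c; the size of 512 is an assumption taken from the 0x1ff mask mentioned elsewhere in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed size; stands in for HID_DEBUG_BUFSIZE (hypothetical value). */
#define RING_SIZE 512u

/* Hypothetical, simplified stand-in for struct hid_debug_list. */
struct ring {
	char buf[RING_SIZE];
	unsigned int tail;	/* next write position, always < RING_SIZE */
};

/*
 * Copy the NUL-terminated string src into the ring. The write position
 * lives in a local variable and wraps with a comparison instead of '%'.
 */
static void ring_put_compare(struct ring *r, const char *src)
{
	unsigned int pos = r->tail;
	size_t i;

	assert(pos < RING_SIZE);

	for (i = 0; src[i]; i++) {
		r->buf[pos++] = src[i];
		if (pos >= RING_SIZE)
			pos = 0;
	}

	r->tail = pos;
}
```

Writing "abcd" with the tail two bytes before the end should place 'a' and 'b' in the last two slots and 'c' and 'd' in the first two, leaving the tail at 2. Whether this is faster or slower than the modulo form depends on the compiler and target.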
On Wed, Nov 25 2015, Joe Perches <joe@perches.com> wrote:

>>  	spin_lock_irqsave(&hdev->debug_list_lock, flags);
>>  	list_for_each_entry(list, &hdev->debug_list, node) {
>> -		for (i = 0; i < strlen(buf); i++)
>> +		for (i = 0; buf[i]; i++)
>>  			list->hid_debug_buf[(list->tail + i) % HID_DEBUG_BUFSIZE] =
>>  				buf[i];
>>  		list->tail = (list->tail + i) % HID_DEBUG_BUFSIZE;
>
> trivia:
>
> The code might look nicer if (list->tail + i) % HID_DEBUG_BUFSIZE
> was stored into a temporary.

Maybe.

> Maybe use an if >= BUFSIZE to avoid a %

Nah, that would likely be worse; both a cmov and a conditional jump
are probably more expensive than a simple '& 0x1ff' which the % should
compile to (provided the expression is unsigned).

Rasmus
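The "provided the expression is unsigned" condition can be checked with a small standalone sketch. The function names are hypothetical, and `DEBUG_BUFSIZE` is assumed to be 512 to match the 0x1ff mask above. It relies on two C rules: since C99 the signed `%` truncates toward zero, so a negative left operand gives a non-positive result, and adding an `int` to an `unsigned` converts the `int` to `unsigned` before the remainder is taken.

```c
#include <assert.h>

/* Assumed power-of-two size, chosen to match the 0x1ff mask. */
#define DEBUG_BUFSIZE 512

/* Signed remainder: a negative x yields a result <= 0, so '&' alone is wrong. */
static int wrap_signed(int x)
{
	return x % DEBUG_BUFSIZE;
}

/* Unsigned remainder by a power of two: same value as x & (DEBUG_BUFSIZE - 1). */
static unsigned int wrap_unsigned(unsigned int x)
{
	return x % DEBUG_BUFSIZE;
}

/*
 * Mixed operands, as in the patched loop: int tail plus unsigned i.
 * The sum is computed as unsigned, so the remainder is unsigned too.
 */
static unsigned int wrap_mixed(int tail, unsigned int i)
{
	return (tail + i) % DEBUG_BUFSIZE;
}
```

Here `wrap_signed(-1)` is -1 rather than 511, which is why the compiler cannot emit a plain mask for the signed form, while `wrap_mixed(511, 2)` is 1. This only illustrates the language semantics; the instructions actually generated depend on the compiler and target.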
On Tue, 24 Nov 2015, Rasmus Villemoes wrote:

> The code in hid_debug_event() causes horrible code generation. First,
> we do a strlen() call for every byte we copy (we're doing a store to
> global memory, so gcc has no way of proving that strlen(buf) doesn't
> change). Second, since both i, list->tail and HID_DEBUG_BUFSIZE have
> signed type, the modulo computation has to take into account the
> possibility that list->tail+i is negative, so it's not just a simple
> and.
>
> Fix the former by simply not doing strlen() at all (we have to load
> buf[i] anyway, so testing it is almost free) and the latter by
> changing i to unsigned. This cuts 29% (69 bytes) of the size of the
> function.
>
> Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>

Agreed, this is much better. Applied to for-4.5/core, thanks Rasmus.
diff --git a/drivers/hid/hid-debug.c b/drivers/hid/hid-debug.c
index 2886b645ced7..acfb522a432a 100644
--- a/drivers/hid/hid-debug.c
+++ b/drivers/hid/hid-debug.c
@@ -659,13 +659,13 @@ EXPORT_SYMBOL_GPL(hid_dump_device);
 /* enqueue string to 'events' ring buffer */
 void hid_debug_event(struct hid_device *hdev, char *buf)
 {
-	int i;
+	unsigned i;
 	struct hid_debug_list *list;
 	unsigned long flags;

 	spin_lock_irqsave(&hdev->debug_list_lock, flags);
 	list_for_each_entry(list, &hdev->debug_list, node) {
-		for (i = 0; i < strlen(buf); i++)
+		for (i = 0; buf[i]; i++)
 			list->hid_debug_buf[(list->tail + i) % HID_DEBUG_BUFSIZE] =
 				buf[i];
 		list->tail = (list->tail + i) % HID_DEBUG_BUFSIZE;
The code in hid_debug_event() causes horrible code generation. First,
we do a strlen() call for every byte we copy (we're doing a store to
global memory, so gcc has no way of proving that strlen(buf) doesn't
change). Second, since both i, list->tail and HID_DEBUG_BUFSIZE have
signed type, the modulo computation has to take into account the
possibility that list->tail+i is negative, so it's not just a simple
and.

Fix the former by simply not doing strlen() at all (we have to load
buf[i] anyway, so testing it is almost free) and the latter by
changing i to unsigned. This cuts 29% (69 bytes) of the size of the
function.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
---
 drivers/hid/hid-debug.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
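The first point in the commit message can be illustrated with a small userspace sketch. It counts how often the length is recomputed when `strlen()` sits in the loop condition and the call cannot be hoisted. Everything here is hypothetical instrumentation and not kernel code: `counted_strlen()`, `strlen_calls`, `copy_with_strlen()` and `copy_until_nul()` are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical counter: how many times the length was recomputed. */
static unsigned long strlen_calls;

/* Wrapper around strlen() that records each call. */
static size_t counted_strlen(const char *s)
{
	strlen_calls++;
	return strlen(s);
}

/* Old loop shape: the length is re-evaluated on every condition check. */
static size_t copy_with_strlen(char *dst, const char *src)
{
	size_t i;

	for (i = 0; i < counted_strlen(src); i++)
		dst[i] = src[i];
	return i;
}

/* New loop shape: test the byte that has to be loaded anyway. */
static size_t copy_until_nul(char *dst, const char *src)
{
	size_t i;

	for (i = 0; src[i]; i++)
		dst[i] = src[i];
	return i;
}
```

For a four-byte string the first loop evaluates its condition five times, so `strlen_calls` ends up at 5, while both loops copy the same four bytes. Neither helper writes a terminating NUL, mirroring the ring-buffer copy in the patch.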