diff mbox

[RFC] Input: evdev - add new EVIOCGABSRANGE ioctl

Message ID 1403273149-29224-1-git-send-email-dh.herrmann@gmail.com (mailing list archive)
State New, archived
Headers show

Commit Message

David Herrmann June 20, 2014, 2:05 p.m. UTC
When we introduced the slotted MT ABS extensions, we didn't take care to
make SYN_DROPPED recoverable. Imagine a client recevies a SYN_DROPPED and
syncs its current state via EVIOCGABS. It has to call this ioctl for each
and every ABS code separately. Besides being horribly slow, this series
of ioctl-calls is not atomic. The kernel might queue new ABS events while
the client fetches the data.

Now for normal ABS codes this is negligible as ABS values provide absolute
data. That is, there is usually no need to resync ABS codes as we don't
need previous values to interpret the next ABS code. Furthermore, most ABS
codes are also sent pretty frequently so a refresh is usually useless.

However, with the introduction of slotted ABS axes we added a relative
component: ABS_MT_SLOT. If a client syncs all ABS codes via EVIOCGABS
while the kernel has ABS-events and an ABS_MT_SLOT event queued, the
client will think any read ABS-event is for the retrieved SLOT, however,
this is not true as all events until the next ABS_MT_SLOT event are for
the previously active slot:

    Kernel queue is: { ABS_DROPPED,
                       ABS_MT_POSITION_X(slot: 0),
                       ABS_MT_SLOT(slot: 1),
                       ABS_MT_POSITION_X(slot: 1) }
    Client reads ABS_DROPPED from queue.
    Client syncs all ABS values:
      As part of that, client syncs ABS_MT_SLOT, which is 1 in the current
      view of the kernel.
    Client reads ABS_MT_POSITION_X and attributes it to slot 1 instead of
    slot 0, as the slot-value is not explicit.

This is just a simple example how the relative information provided by the
ABS_MT_SLOT axis can be problematic to clients.

Now there are many ways to fix this:
 * Make ABS_MT_SLOT a per-evdev-client attribute. On each
   EVIOCGABS(ABS_MT_SLOT) we can add fake ABS_MT_SLOT events to the queue.
   => Ugly and overkill
 * Flush all ABS events when clients read ABS_MT_SLOT.
   => Ugly hack and client might loose important ABS_MT_* events
 * Provide atomic EVIOCGABS API.
   => Sounds good!

This patch introduces EVIOCGABSRANGE. Unlike EVIOCGABS, this ioctl only
fetches ABS values, rather than the whole "struct input_absinfo" set.
However, the new ioctl can fetch a range of ABS axes atomically and will
flush matching events from the client's receive queue. Moreover, you can
fetch all axes for *all* slots with a single call.

This way, a client can simply run EVIOCGABSRANGE(0, ABS_CNT) and it will
receive a consistent view of the whole ABS state, while the kernel flushes
the receive-buffer for a consistent view.
While most clients probably only need
EVIOCGABSRANGE(ABS_MT_SLOT, ABS_MT_TOOL_y - ABS_MT_SLOT + 1), the ioctl
allows to receive an arbitrary range of axes.

Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
---
 drivers/input/evdev.c      | 180 ++++++++++++++++++++++++++++++++++++++++++++-
 include/uapi/linux/input.h |  42 +++++++++++
 2 files changed, 218 insertions(+), 4 deletions(-)

Comments

Benjamin Tissoires June 20, 2014, 2:49 p.m. UTC | #1
Hi David,

On Fri, Jun 20, 2014 at 10:05 AM, David Herrmann <dh.herrmann@gmail.com> wrote:
> When we introduced the slotted MT ABS extensions, we didn't take care to
> make SYN_DROPPED recoverable. Imagine a client recevies a SYN_DROPPED and
> syncs its current state via EVIOCGABS. It has to call this ioctl for each
> and every ABS code separately. Besides being horribly slow, this series
> of ioctl-calls is not atomic. The kernel might queue new ABS events while
> the client fetches the data.
>
> Now for normal ABS codes this is negligible as ABS values provide absolute
> data. That is, there is usually no need to resync ABS codes as we don't
> need previous values to interpret the next ABS code. Furthermore, most ABS
> codes are also sent pretty frequently so a refresh is usually useless.
>
> However, with the introduction of slotted ABS axes we added a relative
> component: ABS_MT_SLOT. If a client syncs all ABS codes via EVIOCGABS
> while the kernel has ABS-events and an ABS_MT_SLOT event queued, the
> client will think any read ABS-event is for the retrieved SLOT, however,
> this is not true as all events until the next ABS_MT_SLOT event are for
> the previously active slot:
>
>     Kernel queue is: { ABS_DROPPED,
>                        ABS_MT_POSITION_X(slot: 0),
>                        ABS_MT_SLOT(slot: 1),
>                        ABS_MT_POSITION_X(slot: 1) }
>     Client reads ABS_DROPPED from queue.
>     Client syncs all ABS values:
>       As part of that, client syncs ABS_MT_SLOT, which is 1 in the current
>       view of the kernel.
>     Client reads ABS_MT_POSITION_X and attributes it to slot 1 instead of
>     slot 0, as the slot-value is not explicit.
>
> This is just a simple example how the relative information provided by the
> ABS_MT_SLOT axis can be problematic to clients.
>
> Now there are many ways to fix this:
>  * Make ABS_MT_SLOT a per-evdev-client attribute. On each
>    EVIOCGABS(ABS_MT_SLOT) we can add fake ABS_MT_SLOT events to the queue.
>    => Ugly and overkill
>  * Flush all ABS events when clients read ABS_MT_SLOT.
>    => Ugly hack and client might loose important ABS_MT_* events
>  * Provide atomic EVIOCGABS API.
>    => Sounds good!
>
> This patch introduces EVIOCGABSRANGE. Unlike EVIOCGABS, this ioctl only
> fetches ABS values, rather than the whole "struct input_absinfo" set.
> However, the new ioctl can fetch a range of ABS axes atomically and will
> flush matching events from the client's receive queue. Moreover, you can
> fetch all axes for *all* slots with a single call.
>
> This way, a client can simply run EVIOCGABSRANGE(0, ABS_CNT) and it will
> receive a consistent view of the whole ABS state, while the kernel flushes
> the receive-buffer for a consistent view.
> While most clients probably only need
> EVIOCGABSRANGE(ABS_MT_SLOT, ABS_MT_TOOL_y - ABS_MT_SLOT + 1), the ioctl
> allows to receive an arbitrary range of axes.
>
> Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
> ---

Thanks for this. This sounds very promising. I did not went into the
tiny details of the code, but I already have some comments. See below.

>  drivers/input/evdev.c      | 180 ++++++++++++++++++++++++++++++++++++++++++++-
>  include/uapi/linux/input.h |  42 +++++++++++
>  2 files changed, 218 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/input/evdev.c b/drivers/input/evdev.c
> index 6386882..7a25a7a 100644
> --- a/drivers/input/evdev.c
> +++ b/drivers/input/evdev.c
> @@ -175,8 +175,9 @@ static bool __evdev_is_filtered(struct evdev_client *client,
>         return mask && !test_bit(code, mask);
>  }
>
> -/* flush queued events of type @type, caller must hold client->buffer_lock */
> -static void __evdev_flush_queue(struct evdev_client *client, unsigned int type)
> +/* flush queued, matching events, caller must hold client->buffer_lock */
> +static void __evdev_flush_queue(struct evdev_client *client, unsigned int type,
> +                               unsigned int code_first, unsigned int code_last)
>  {
>         unsigned int i, head, num;
>         unsigned int mask = client->bufsize - 1;
> @@ -195,7 +196,9 @@ static void __evdev_flush_queue(struct evdev_client *client, unsigned int type)
>                 ev = &client->buffer[i];
>                 is_report = ev->type == EV_SYN && ev->code == SYN_REPORT;
>
> -               if (ev->type == type) {
> +               if (ev->type == type &&
> +                   ev->code >= code_first &&
> +                   ev->code <= code_last) {
>                         /* drop matched entry */
>                         continue;
>                 } else if (is_report && !num) {
> @@ -786,6 +789,172 @@ static int handle_eviocgbit(struct input_dev *dev,
>         return bits_to_user(bits, len, size, p, compat_mode);
>  }
>
> +static inline void free_absrange(s32 **pages, size_t page_cnt)
> +{
> +       if (page_cnt > 1) {
> +               while (page_cnt > 0) {
> +                       if (!pages[--page_cnt])
> +                               break;
> +                       __free_page(virt_to_page(pages[page_cnt]));
> +               }
> +               kfree(pages);
> +       } else if (page_cnt == 1) {
> +               kfree(pages);
> +       }
> +}
> +
> +static inline s32 *absrange_ptr(s32 **pages, size_t page_cnt, size_t slots,
> +                               size_t i_code, size_t j_slot)
> +{
> +       size_t idx, off;
> +
> +       idx = (i_code * slots + j_slot) / (PAGE_SIZE / sizeof(s32));
> +       off = (i_code * slots + j_slot) % (PAGE_SIZE / sizeof(s32));
> +
> +       if (page_cnt == 1)
> +               return &((s32*)pages)[off];
> +       else
> +               return &pages[idx][off];
> +}
> +
> +static inline ssize_t fetch_absrange(struct evdev_client *client,
> +                                    struct input_dev *dev, size_t start,
> +                                    size_t count, size_t slots, s32 ***out)
> +{
> +       size_t size, page_cnt, i, j;
> +       unsigned long flags;
> +       s32 **pages;
> +
> +       /*
> +        * Fetch data atomically from the device and flush buffers. We need to
> +        * allocate a temporary buffer as copy_to_user() is not allowed while
> +        * holding spinlocks. However, to-be-copied data might be huge and
> +        * high-order allocations should be avoided. Therefore, do the
> +        * page-allocation manually.
> +        */
> +
> +       BUILD_BUG_ON(PAGE_SIZE % sizeof(s32) != 0);
> +
> +       size = sizeof(s32) * count * slots;
> +       page_cnt = DIV_ROUND_UP(size, PAGE_SIZE);
> +       if (page_cnt < 1) {
> +               return 0;
> +       } else if (page_cnt == 1) {
> +               pages = kzalloc(size, GFP_TEMPORARY);
> +               if (!pages)
> +                       return -ENOMEM;
> +       } else {
> +               pages = kzalloc(sizeof(*pages) * page_cnt, GFP_TEMPORARY);
> +               if (!pages)
> +                       return -ENOMEM;
> +
> +               for (i = 0; i < page_cnt; ++i) {
> +                       pages[i] = (void*)get_zeroed_page(GFP_TEMPORARY);
> +                       if (!pages[i]) {
> +                               free_absrange(pages, page_cnt);
> +                               return -ENOMEM;
> +                       }
> +               }
> +       }
> +
> +       spin_lock_irqsave(&dev->event_lock, flags);
> +       spin_lock(&client->buffer_lock);
> +
> +       for (i = 0; i < count; ++i) {
> +               __u16 code;
> +               bool is_mt;
> +
> +               code = start + i;
> +               is_mt = input_is_mt_value(code);
> +               if (is_mt && !dev->mt)
> +                       continue;
> +
> +               for (j = 0; j < slots; ++j) {
> +                       __s32 v;
> +
> +                       if (is_mt)
> +                               v = input_mt_get_value(&dev->mt->slots[j],
> +                                                      code);
> +                       else
> +                               v = dev->absinfo[code].value;
> +
> +                       *absrange_ptr(pages, page_cnt, slots, i, j) = v;
> +
> +                       if (!is_mt)
> +                               break;
> +               }
> +       }
> +
> +       spin_unlock(&client->buffer_lock);
> +       __evdev_flush_queue(client, EV_ABS, start, start + count - 1);
> +       spin_unlock_irqrestore(&dev->event_lock, flags);
> +
> +       *out = pages;
> +       return page_cnt;
> +}
> +
> +static int evdev_handle_get_absrange(struct evdev_client *client,
> +                                    struct input_dev *dev,
> +                                    struct input_absrange __user *p)
> +{
> +       size_t slots, code, count, i, j;
> +       struct input_absrange absbuf;
> +       s32 **vals = NULL;
> +       ssize_t val_cnt;
> +       s32 __user *b;
> +       int retval;
> +
> +       if (!dev->absinfo)
> +               return -EINVAL;
> +       if (copy_from_user(&absbuf, p, sizeof(absbuf)))
> +               return -EFAULT;
> +
> +       slots = min_t(size_t, dev->mt ? dev->mt->num_slots : 1, absbuf.slots);
> +       code = min_t(size_t, absbuf.code, ABS_CNT);
> +       count = min_t(size_t, absbuf.count, ABS_CNT);
> +
> +       /* first fetch data atomically from device */
> +
> +       if (code + count > ABS_CNT)
> +               count = ABS_CNT - code;
> +
> +       if (!slots || !count) {
> +               val_cnt = 0;
> +       } else {
> +               val_cnt = fetch_absrange(client, dev, code, count,
> +                                        slots, &vals);
> +               if (val_cnt < 0)
> +                       return val_cnt;
> +       }
> +
> +       /* now copy data to user-space */
> +
> +       b = (void __user*)(unsigned long)absbuf.buffer;

What if the user space buffer is not long enough, or if the pointer is
invalid? Is there any means from the kernel to guarantee that we are
not writing in a restricted memory area or in a buffer not owned by
the process?

> +       for (i = 0; i < absbuf.count; ++i) {
> +               for (j = 0; j < absbuf.slots; ++j, ++b) {
> +                       s32 v;
> +
> +                       if (i >= count || j >= slots)
> +                               v = 0;
> +                       else
> +                               v = *absrange_ptr(vals, val_cnt, slots, i, j);
> +
> +                       if (put_user(v, b)) {
> +                               retval = -EFAULT;
> +                               goto out;
> +                       }
> +               }
> +       }
> +

Shouldn't we also call free_absrange(vals, val_cnt); before returning?

> +       retval = 0;

Not sure it matters a lot, but returning the size of what has been
written is more common. This would make sense if the buffer is not
long enough or if it is too big.

Cheers,
Benjamin

> +
> +out:
> +       free_absrange(vals, val_cnt);
> +       if (retval < 0)
> +               evdev_queue_syn_dropped(client);
> +       return retval;
> +}
> +
>  static int evdev_handle_get_keycode(struct input_dev *dev, void __user *p)
>  {
>         struct input_keymap_entry ke = {
> @@ -889,7 +1058,7 @@ static int evdev_handle_get_val(struct evdev_client *client,
>
>         spin_unlock(&dev->event_lock);
>
> -       __evdev_flush_queue(client, type);
> +       __evdev_flush_queue(client, type, 0, UINT_MAX);
>
>         spin_unlock_irq(&client->buffer_lock);
>
> @@ -1006,6 +1175,9 @@ static long evdev_do_ioctl(struct file *file, unsigned int cmd,
>                 else
>                         return evdev_revoke(evdev, client, file);
>
> +       case EVIOCGABSRANGE:
> +               return evdev_handle_get_absrange(client, dev, p);
> +
>         case EVIOCGMASK:
>                 if (copy_from_user(&mask, p, sizeof(mask)))
>                         return -EFAULT;
> diff --git a/include/uapi/linux/input.h b/include/uapi/linux/input.h
> index f6ace0e..32a6443 100644
> --- a/include/uapi/linux/input.h
> +++ b/include/uapi/linux/input.h
> @@ -210,6 +210,48 @@ struct input_mask {
>   */
>  #define EVIOCSMASK             _IOW('E', 0x93, struct input_mask)      /* Set event-masks */
>
> +struct input_absrange {
> +       __u16 slots;
> +       __u16 code;
> +       __u32 count;
> +       __u64 buffer;
> +};
> +
> +/**
> + * EVIOCGABSRANGE - Fetch range of ABS values
> + *
> + * This fetches the current values of a range of ABS codes atomically. The range
> + * of codes to fetch and the buffer-types are passed as "struct input_absrange",
> + * which has the following fields:
> + *      slots: Number of MT slots to fetch data for.
> + *       code: First ABS axis to query.
> + *      count: Number of ABS axes to query starting at @code.
> + *     buffer: Pointer to a receive buffer where to store the fetched ABS
> + *             values. This buffer must be an array of __s32 with at least
> + *             (@slots * @code) elements. The buffer is interpreted as two
> + *             dimensional __s32 array, declared as: __s32[slots][codes]
> + *
> + * Compared to EVIOCGABS this ioctl allows to retrieve a range of ABS codes
> + * atomically regarding any concurrent buffer modifications. Furthermore, any
> + * pending events for codes that were retrived via this call are flushed from
> + * the client's receive buffer. But unlike EVIOCGABS, this ioctl only returns
> + * the current value of an axis, rather than the whole "struct input_absinfo"
> + * set. All fields of "struct input_absinfo" except for the value are constant,
> + * though.
> + *
> + * The kernel's current view of the ABS axes is copied into the provided buffer.
> + * If an ABS axis is not enabled on the device, its value will be zero. Also, if
> + * an axis is not a slotted MT-axis, values for all but the first slot will be
> + * 0. If @slots is greater than the actual number of slots provided by the
> + * device, values for all slots higher than that will be 0.
> + *
> + * This call may fail with -EINVAL if the kernel doesn't support this call or
> + * the arguments are invalid, with -ENODEV if access was revoked, -ENOMEM if the
> + * kernel couldn't allocate temporary buffers for data-copy or -EFAULT if the
> + * passed pointer was invalid.
> + */
> +#define EVIOCGABSRANGE         _IOR('E', 0x94, struct input_absrange)
> +
>  #define EVIOCSCLOCKID          _IOW('E', 0xa0, int)                    /* Set clockid to be used for timestamps */
>
>  /*
> --
> 2.0.0
>
--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Herrmann June 20, 2014, 2:59 p.m. UTC | #2
Hi

On Fri, Jun 20, 2014 at 4:49 PM, Benjamin Tissoires
<benjamin.tissoires@gmail.com> wrote:
> Hi David,
>
> On Fri, Jun 20, 2014 at 10:05 AM, David Herrmann <dh.herrmann@gmail.com> wrote:
>> When we introduced the slotted MT ABS extensions, we didn't take care to
>> make SYN_DROPPED recoverable. Imagine a client recevies a SYN_DROPPED and
>> syncs its current state via EVIOCGABS. It has to call this ioctl for each
>> and every ABS code separately. Besides being horribly slow, this series
>> of ioctl-calls is not atomic. The kernel might queue new ABS events while
>> the client fetches the data.
>>
>> Now for normal ABS codes this is negligible as ABS values provide absolute
>> data. That is, there is usually no need to resync ABS codes as we don't
>> need previous values to interpret the next ABS code. Furthermore, most ABS
>> codes are also sent pretty frequently so a refresh is usually useless.
>>
>> However, with the introduction of slotted ABS axes we added a relative
>> component: ABS_MT_SLOT. If a client syncs all ABS codes via EVIOCGABS
>> while the kernel has ABS-events and an ABS_MT_SLOT event queued, the
>> client will think any read ABS-event is for the retrieved SLOT, however,
>> this is not true as all events until the next ABS_MT_SLOT event are for
>> the previously active slot:
>>
>>     Kernel queue is: { ABS_DROPPED,
>>                        ABS_MT_POSITION_X(slot: 0),
>>                        ABS_MT_SLOT(slot: 1),
>>                        ABS_MT_POSITION_X(slot: 1) }
>>     Client reads ABS_DROPPED from queue.
>>     Client syncs all ABS values:
>>       As part of that, client syncs ABS_MT_SLOT, which is 1 in the current
>>       view of the kernel.
>>     Client reads ABS_MT_POSITION_X and attributes it to slot 1 instead of
>>     slot 0, as the slot-value is not explicit.
>>
>> This is just a simple example how the relative information provided by the
>> ABS_MT_SLOT axis can be problematic to clients.
>>
>> Now there are many ways to fix this:
>>  * Make ABS_MT_SLOT a per-evdev-client attribute. On each
>>    EVIOCGABS(ABS_MT_SLOT) we can add fake ABS_MT_SLOT events to the queue.
>>    => Ugly and overkill
>>  * Flush all ABS events when clients read ABS_MT_SLOT.
>>    => Ugly hack and client might loose important ABS_MT_* events
>>  * Provide atomic EVIOCGABS API.
>>    => Sounds good!
>>
>> This patch introduces EVIOCGABSRANGE. Unlike EVIOCGABS, this ioctl only
>> fetches ABS values, rather than the whole "struct input_absinfo" set.
>> However, the new ioctl can fetch a range of ABS axes atomically and will
>> flush matching events from the client's receive queue. Moreover, you can
>> fetch all axes for *all* slots with a single call.
>>
>> This way, a client can simply run EVIOCGABSRANGE(0, ABS_CNT) and it will
>> receive a consistent view of the whole ABS state, while the kernel flushes
>> the receive-buffer for a consistent view.
>> While most clients probably only need
>> EVIOCGABSRANGE(ABS_MT_SLOT, ABS_MT_TOOL_y - ABS_MT_SLOT + 1), the ioctl
>> allows to receive an arbitrary range of axes.
>>
>> Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
>> ---
>
> Thanks for this. This sounds very promising. I did not went into the
> tiny details of the code, but I already have some comments. See below.

Thanks for the comments!

>>  drivers/input/evdev.c      | 180 ++++++++++++++++++++++++++++++++++++++++++++-
>>  include/uapi/linux/input.h |  42 +++++++++++
>>  2 files changed, 218 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/input/evdev.c b/drivers/input/evdev.c
>> index 6386882..7a25a7a 100644
>> --- a/drivers/input/evdev.c
>> +++ b/drivers/input/evdev.c
>> @@ -175,8 +175,9 @@ static bool __evdev_is_filtered(struct evdev_client *client,
>>         return mask && !test_bit(code, mask);
>>  }
>>
>> -/* flush queued events of type @type, caller must hold client->buffer_lock */
>> -static void __evdev_flush_queue(struct evdev_client *client, unsigned int type)
>> +/* flush queued, matching events, caller must hold client->buffer_lock */
>> +static void __evdev_flush_queue(struct evdev_client *client, unsigned int type,
>> +                               unsigned int code_first, unsigned int code_last)
>>  {
>>         unsigned int i, head, num;
>>         unsigned int mask = client->bufsize - 1;
>> @@ -195,7 +196,9 @@ static void __evdev_flush_queue(struct evdev_client *client, unsigned int type)
>>                 ev = &client->buffer[i];
>>                 is_report = ev->type == EV_SYN && ev->code == SYN_REPORT;
>>
>> -               if (ev->type == type) {
>> +               if (ev->type == type &&
>> +                   ev->code >= code_first &&
>> +                   ev->code <= code_last) {
>>                         /* drop matched entry */
>>                         continue;
>>                 } else if (is_report && !num) {
>> @@ -786,6 +789,172 @@ static int handle_eviocgbit(struct input_dev *dev,
>>         return bits_to_user(bits, len, size, p, compat_mode);
>>  }
>>
>> +static inline void free_absrange(s32 **pages, size_t page_cnt)
>> +{
>> +       if (page_cnt > 1) {
>> +               while (page_cnt > 0) {
>> +                       if (!pages[--page_cnt])
>> +                               break;
>> +                       __free_page(virt_to_page(pages[page_cnt]));
>> +               }
>> +               kfree(pages);
>> +       } else if (page_cnt == 1) {
>> +               kfree(pages);
>> +       }
>> +}
>> +
>> +static inline s32 *absrange_ptr(s32 **pages, size_t page_cnt, size_t slots,
>> +                               size_t i_code, size_t j_slot)
>> +{
>> +       size_t idx, off;
>> +
>> +       idx = (i_code * slots + j_slot) / (PAGE_SIZE / sizeof(s32));
>> +       off = (i_code * slots + j_slot) % (PAGE_SIZE / sizeof(s32));
>> +
>> +       if (page_cnt == 1)
>> +               return &((s32*)pages)[off];
>> +       else
>> +               return &pages[idx][off];
>> +}
>> +
>> +static inline ssize_t fetch_absrange(struct evdev_client *client,
>> +                                    struct input_dev *dev, size_t start,
>> +                                    size_t count, size_t slots, s32 ***out)
>> +{
>> +       size_t size, page_cnt, i, j;
>> +       unsigned long flags;
>> +       s32 **pages;
>> +
>> +       /*
>> +        * Fetch data atomically from the device and flush buffers. We need to
>> +        * allocate a temporary buffer as copy_to_user() is not allowed while
>> +        * holding spinlocks. However, to-be-copied data might be huge and
>> +        * high-order allocations should be avoided. Therefore, do the
>> +        * page-allocation manually.
>> +        */
>> +
>> +       BUILD_BUG_ON(PAGE_SIZE % sizeof(s32) != 0);
>> +
>> +       size = sizeof(s32) * count * slots;
>> +       page_cnt = DIV_ROUND_UP(size, PAGE_SIZE);
>> +       if (page_cnt < 1) {
>> +               return 0;
>> +       } else if (page_cnt == 1) {
>> +               pages = kzalloc(size, GFP_TEMPORARY);
>> +               if (!pages)
>> +                       return -ENOMEM;
>> +       } else {
>> +               pages = kzalloc(sizeof(*pages) * page_cnt, GFP_TEMPORARY);
>> +               if (!pages)
>> +                       return -ENOMEM;
>> +
>> +               for (i = 0; i < page_cnt; ++i) {
>> +                       pages[i] = (void*)get_zeroed_page(GFP_TEMPORARY);
>> +                       if (!pages[i]) {
>> +                               free_absrange(pages, page_cnt);
>> +                               return -ENOMEM;
>> +                       }
>> +               }
>> +       }
>> +
>> +       spin_lock_irqsave(&dev->event_lock, flags);
>> +       spin_lock(&client->buffer_lock);
>> +
>> +       for (i = 0; i < count; ++i) {
>> +               __u16 code;
>> +               bool is_mt;
>> +
>> +               code = start + i;
>> +               is_mt = input_is_mt_value(code);
>> +               if (is_mt && !dev->mt)
>> +                       continue;
>> +
>> +               for (j = 0; j < slots; ++j) {
>> +                       __s32 v;
>> +
>> +                       if (is_mt)
>> +                               v = input_mt_get_value(&dev->mt->slots[j],
>> +                                                      code);
>> +                       else
>> +                               v = dev->absinfo[code].value;
>> +
>> +                       *absrange_ptr(pages, page_cnt, slots, i, j) = v;
>> +
>> +                       if (!is_mt)
>> +                               break;
>> +               }
>> +       }
>> +
>> +       spin_unlock(&client->buffer_lock);
>> +       __evdev_flush_queue(client, EV_ABS, start, start + count - 1);
>> +       spin_unlock_irqrestore(&dev->event_lock, flags);
>> +
>> +       *out = pages;
>> +       return page_cnt;
>> +}
>> +
>> +static int evdev_handle_get_absrange(struct evdev_client *client,
>> +                                    struct input_dev *dev,
>> +                                    struct input_absrange __user *p)
>> +{
>> +       size_t slots, code, count, i, j;
>> +       struct input_absrange absbuf;
>> +       s32 **vals = NULL;
>> +       ssize_t val_cnt;
>> +       s32 __user *b;
>> +       int retval;
>> +
>> +       if (!dev->absinfo)
>> +               return -EINVAL;
>> +       if (copy_from_user(&absbuf, p, sizeof(absbuf)))
>> +               return -EFAULT;
>> +
>> +       slots = min_t(size_t, dev->mt ? dev->mt->num_slots : 1, absbuf.slots);
>> +       code = min_t(size_t, absbuf.code, ABS_CNT);
>> +       count = min_t(size_t, absbuf.count, ABS_CNT);
>> +
>> +       /* first fetch data atomically from device */
>> +
>> +       if (code + count > ABS_CNT)
>> +               count = ABS_CNT - code;
>> +
>> +       if (!slots || !count) {
>> +               val_cnt = 0;
>> +       } else {
>> +               val_cnt = fetch_absrange(client, dev, code, count,
>> +                                        slots, &vals);
>> +               if (val_cnt < 0)
>> +                       return val_cnt;
>> +       }
>> +
>> +       /* now copy data to user-space */
>> +
>> +       b = (void __user*)(unsigned long)absbuf.buffer;
>
> What if the user space buffer is not long enough, or if the pointer is
> invalid? Is there any means from the kernel to guarantee that we are
> not writing in a restricted memory area or in a buffer not owned by
> the process?

The buffer is given as pointer in absbuf.buffer, the size of the
buffer is given as absbuf.slots * absbuf.count. This is the same as
for the read() syscall, where we get a pointer plus the size.
I use put_user() to write data into that buffer, so in case it's not
valid user-memory, this will fail with -EFAULT.

This is no different from other calls that return data, apart from
splitting the size into two ints "slots" and "count". So I cannot
follow what you mean? Note that "put_user()" is equivalent to
"copy_to_user()", maybe you missed that part?

>> +       for (i = 0; i < absbuf.count; ++i) {
>> +               for (j = 0; j < absbuf.slots; ++j, ++b) {
>> +                       s32 v;
>> +
>> +                       if (i >= count || j >= slots)
>> +                               v = 0;
>> +                       else
>> +                               v = *absrange_ptr(vals, val_cnt, slots, i, j);
>> +
>> +                       if (put_user(v, b)) {
>> +                               retval = -EFAULT;
>> +                               goto out;
>> +                       }
>> +               }
>> +       }
>> +
>
> Shouldn't we also call free_absrange(vals, val_cnt); before returning?

I do. There's a fall-through to "out:", so we always free the buffer.
Otherwise, I would have called it "error:" :)

>> +       retval = 0;
>
> Not sure it matters a lot, but returning the size of what has been
> written is more common. This would make sense if the buffer is not
> long enough or if it is too big.

This would always be "absbuf.slots * absbuf.count". I don't think
there's much gain in returning a constant size, is there?

Thanks!
David

>
>> +
>> +out:
>> +       free_absrange(vals, val_cnt);
>> +       if (retval < 0)
>> +               evdev_queue_syn_dropped(client);
>> +       return retval;
>> +}
>> +
>>  static int evdev_handle_get_keycode(struct input_dev *dev, void __user *p)
>>  {
>>         struct input_keymap_entry ke = {
>> @@ -889,7 +1058,7 @@ static int evdev_handle_get_val(struct evdev_client *client,
>>
>>         spin_unlock(&dev->event_lock);
>>
>> -       __evdev_flush_queue(client, type);
>> +       __evdev_flush_queue(client, type, 0, UINT_MAX);
>>
>>         spin_unlock_irq(&client->buffer_lock);
>>
>> @@ -1006,6 +1175,9 @@ static long evdev_do_ioctl(struct file *file, unsigned int cmd,
>>                 else
>>                         return evdev_revoke(evdev, client, file);
>>
>> +       case EVIOCGABSRANGE:
>> +               return evdev_handle_get_absrange(client, dev, p);
>> +
>>         case EVIOCGMASK:
>>                 if (copy_from_user(&mask, p, sizeof(mask)))
>>                         return -EFAULT;
>> diff --git a/include/uapi/linux/input.h b/include/uapi/linux/input.h
>> index f6ace0e..32a6443 100644
>> --- a/include/uapi/linux/input.h
>> +++ b/include/uapi/linux/input.h
>> @@ -210,6 +210,48 @@ struct input_mask {
>>   */
>>  #define EVIOCSMASK             _IOW('E', 0x93, struct input_mask)      /* Set event-masks */
>>
>> +struct input_absrange {
>> +       __u16 slots;
>> +       __u16 code;
>> +       __u32 count;
>> +       __u64 buffer;
>> +};
>> +
>> +/**
>> + * EVIOCGABSRANGE - Fetch range of ABS values
>> + *
>> + * This fetches the current values of a range of ABS codes atomically. The range
>> + * of codes to fetch and the buffer-types are passed as "struct input_absrange",
>> + * which has the following fields:
>> + *      slots: Number of MT slots to fetch data for.
>> + *       code: First ABS axis to query.
>> + *      count: Number of ABS axes to query starting at @code.
>> + *     buffer: Pointer to a receive buffer where to store the fetched ABS
>> + *             values. This buffer must be an array of __s32 with at least
>> + *             (@slots * @code) elements. The buffer is interpreted as two
>> + *             dimensional __s32 array, declared as: __s32[slots][codes]
>> + *
>> + * Compared to EVIOCGABS this ioctl allows to retrieve a range of ABS codes
>> + * atomically regarding any concurrent buffer modifications. Furthermore, any
>> + * pending events for codes that were retrived via this call are flushed from
>> + * the client's receive buffer. But unlike EVIOCGABS, this ioctl only returns
>> + * the current value of an axis, rather than the whole "struct input_absinfo"
>> + * set. All fields of "struct input_absinfo" except for the value are constant,
>> + * though.
>> + *
>> + * The kernel's current view of the ABS axes is copied into the provided buffer.
>> + * If an ABS axis is not enabled on the device, its value will be zero. Also, if
>> + * an axis is not a slotted MT-axis, values for all but the first slot will be
>> + * 0. If @slots is greater than the actual number of slots provided by the
>> + * device, values for all slots higher than that will be 0.
>> + *
>> + * This call may fail with -EINVAL if the kernel doesn't support this call or
>> + * the arguments are invalid, with -ENODEV if access was revoked, -ENOMEM if the
>> + * kernel couldn't allocate temporary buffers for data-copy or -EFAULT if the
>> + * passed pointer was invalid.
>> + */
>> +#define EVIOCGABSRANGE         _IOR('E', 0x94, struct input_absrange)
>> +
>>  #define EVIOCSCLOCKID          _IOW('E', 0xa0, int)                    /* Set clockid to be used for timestamps */
>>
>>  /*
>> --
>> 2.0.0
>>
--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Benjamin Tissoires June 20, 2014, 3:53 p.m. UTC | #3
On Fri, Jun 20, 2014 at 10:59 AM, David Herrmann <dh.herrmann@gmail.com> wrote:
> Hi
>
> On Fri, Jun 20, 2014 at 4:49 PM, Benjamin Tissoires
> <benjamin.tissoires@gmail.com> wrote:
>> Hi David,
>>
[stripped]
>>> +
>>> +static int evdev_handle_get_absrange(struct evdev_client *client,
>>> +                                    struct input_dev *dev,
>>> +                                    struct input_absrange __user *p)
>>> +{
>>> +       size_t slots, code, count, i, j;
>>> +       struct input_absrange absbuf;
>>> +       s32 **vals = NULL;
>>> +       ssize_t val_cnt;
>>> +       s32 __user *b;
>>> +       int retval;
>>> +
>>> +       if (!dev->absinfo)
>>> +               return -EINVAL;
>>> +       if (copy_from_user(&absbuf, p, sizeof(absbuf)))
>>> +               return -EFAULT;
>>> +
>>> +       slots = min_t(size_t, dev->mt ? dev->mt->num_slots : 1, absbuf.slots);
>>> +       code = min_t(size_t, absbuf.code, ABS_CNT);
>>> +       count = min_t(size_t, absbuf.count, ABS_CNT);
>>> +
>>> +       /* first fetch data atomically from device */
>>> +
>>> +       if (code + count > ABS_CNT)
>>> +               count = ABS_CNT - code;
>>> +
>>> +       if (!slots || !count) {
>>> +               val_cnt = 0;
>>> +       } else {
>>> +               val_cnt = fetch_absrange(client, dev, code, count,
>>> +                                        slots, &vals);
>>> +               if (val_cnt < 0)
>>> +                       return val_cnt;
>>> +       }
>>> +
>>> +       /* now copy data to user-space */
>>> +
>>> +       b = (void __user*)(unsigned long)absbuf.buffer;
>>
>> What if the user space buffer is not long enough, or if the pointer is
>> invalid? Is there any means from the kernel to guarantee that we are
>> not writing in a restricted memory area or in a buffer not owned by
>> the process?
>
> The buffer is given as pointer in absbuf.buffer, the size of the
> buffer is given as absbuf.slots * absbuf.count. This is the same as
> for the read() syscall, where we get a pointer plus the size.
> I use put_user() to write data into that buffer, so in case it's not
> valid user-memory, this will fail with -EFAULT.
>
> This is no different from other calls that return data, apart from
> splitting the size into two ints "slots" and "count". So I cannot
> follow what you mean? Note that "put_user()" is equivalent to
> "copy_to_user()", maybe you missed that part?

Well, my concern here is not to have a CVE in 2 months time, when the
kernel is out. So I prefer being cautious when dealing with user /
kernel exchanges.

And yes, you are right, put_user will check for the validity of the
pointer for each data you send in it.

>
>>> +       for (i = 0; i < absbuf.count; ++i) {
>>> +               for (j = 0; j < absbuf.slots; ++j, ++b) {
>>> +                       s32 v;
>>> +
>>> +                       if (i >= count || j >= slots)
>>> +                               v = 0;
>>> +                       else
>>> +                               v = *absrange_ptr(vals, val_cnt, slots, i, j);
>>> +
>>> +                       if (put_user(v, b)) {

From what I read in include/asm-generic/uaccess.h put_user does not change "b".
Aren't you missing a b++ ? (but maybe I should just stop reviewing
this today because I may not have the eyes completely opened today...)

>>> +                               retval = -EFAULT;
>>> +                               goto out;
>>> +                       }
>>> +               }
>>> +       }
>>> +
>>
>> Shouldn't we also call free_absrange(vals, val_cnt); before returning?
>
> I do. There's a fall-through to "out:", so we always free the buffer.
> Otherwise, I would have called it "error:" :)

Oops, my bad, I read the next statement as "return 0" :o)

>
>>> +       retval = 0;
>>
>> Not sure it matters a lot, but returning the size of what has been
>> written is more common. This would make sense if the buffer is not
>> long enough or if it is too big.
>
> This would always be "absbuf.slots * absbuf.count". I don't think
> there's much gain in returning a constant size, is there?
>

With the current code, if the buffer is not long enough, you return
-EFAULT. However, one can argue that we can simply return the current
count of valid written data (especially because the data have been
dropped from the event queue).

Also, returning the actual written data may help in two corner cases
(when the programmer made a mistake, but programmers make mistakes):
- if count or slots is null -> the return value will be 0 (success),
whereas nothing happened
- if the allocated buffer is bigger than what is required -> a lazy
programmer will consider the whole buffer being valid, whereas only
the first bytes have been written (it happened to me, not in this
case, but still).

It's not a matter of returning a constant, it's a matter of notifying
how many data have been forwarded to the user space. But yes, for the
general case, the user space will now which value will be returned.
(this is just my personal taste, and maybe others will prefer the 0)

Cheers,
Benjamin

>
>>
>>> +
>>> +out:
>>> +       free_absrange(vals, val_cnt);
>>> +       if (retval < 0)
>>> +               evdev_queue_syn_dropped(client);
>>> +       return retval;
>>> +}
>>> +
>>>  static int evdev_handle_get_keycode(struct input_dev *dev, void __user *p)
>>>  {
>>>         struct input_keymap_entry ke = {
>>> @@ -889,7 +1058,7 @@ static int evdev_handle_get_val(struct evdev_client *client,
>>>
>>>         spin_unlock(&dev->event_lock);
>>>
>>> -       __evdev_flush_queue(client, type);
>>> +       __evdev_flush_queue(client, type, 0, UINT_MAX);
>>>
>>>         spin_unlock_irq(&client->buffer_lock);
>>>
>>> @@ -1006,6 +1175,9 @@ static long evdev_do_ioctl(struct file *file, unsigned int cmd,
>>>                 else
>>>                         return evdev_revoke(evdev, client, file);
>>>
>>> +       case EVIOCGABSRANGE:
>>> +               return evdev_handle_get_absrange(client, dev, p);
>>> +
>>>         case EVIOCGMASK:
>>>                 if (copy_from_user(&mask, p, sizeof(mask)))
>>>                         return -EFAULT;
>>> diff --git a/include/uapi/linux/input.h b/include/uapi/linux/input.h
>>> index f6ace0e..32a6443 100644
>>> --- a/include/uapi/linux/input.h
>>> +++ b/include/uapi/linux/input.h
>>> @@ -210,6 +210,48 @@ struct input_mask {
>>>   */
>>>  #define EVIOCSMASK             _IOW('E', 0x93, struct input_mask)      /* Set event-masks */
>>>
>>> +struct input_absrange {
>>> +       __u16 slots;
>>> +       __u16 code;
>>> +       __u32 count;
>>> +       __u64 buffer;
>>> +};
>>> +
>>> +/**
>>> + * EVIOCGABSRANGE - Fetch range of ABS values
>>> + *
>>> + * This fetches the current values of a range of ABS codes atomically. The range
>>> + * of codes to fetch and the buffer-types are passed as "struct input_absrange",
>>> + * which has the following fields:
>>> + *      slots: Number of MT slots to fetch data for.
>>> + *       code: First ABS axis to query.
>>> + *      count: Number of ABS axes to query starting at @code.
>>> + *     buffer: Pointer to a receive buffer where to store the fetched ABS
>>> + *             values. This buffer must be an array of __s32 with at least
>>> + *             (@slots * @code) elements. The buffer is interpreted as two
>>> + *             dimensional __s32 array, declared as: __s32[slots][codes]
>>> + *
>>> + * Compared to EVIOCGABS this ioctl allows to retrieve a range of ABS codes
>>> + * atomically regarding any concurrent buffer modifications. Furthermore, any
>>> + * pending events for codes that were retrived via this call are flushed from
>>> + * the client's receive buffer. But unlike EVIOCGABS, this ioctl only returns
>>> + * the current value of an axis, rather than the whole "struct input_absinfo"
>>> + * set. All fields of "struct input_absinfo" except for the value are constant,
>>> + * though.
>>> + *
>>> + * The kernel's current view of the ABS axes is copied into the provided buffer.
>>> + * If an ABS axis is not enabled on the device, its value will be zero. Also, if
>>> + * an axis is not a slotted MT-axis, values for all but the first slot will be
>>> + * 0. If @slots is greater than the actual number of slots provided by the
>>> + * device, values for all slots higher than that will be 0.
>>> + *
>>> + * This call may fail with -EINVAL if the kernel doesn't support this call or
>>> + * the arguments are invalid, with -ENODEV if access was revoked, -ENOMEM if the
>>> + * kernel couldn't allocate temporary buffers for data-copy or -EFAULT if the
>>> + * passed pointer was invalid.
>>> + */
>>> +#define EVIOCGABSRANGE         _IOR('E', 0x94, struct input_absrange)
>>> +
>>>  #define EVIOCSCLOCKID          _IOW('E', 0xa0, int)                    /* Set clockid to be used for timestamps */
>>>
>>>  /*
>>> --
>>> 2.0.0
>>>
--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/input/evdev.c b/drivers/input/evdev.c
index 6386882..7a25a7a 100644
--- a/drivers/input/evdev.c
+++ b/drivers/input/evdev.c
@@ -175,8 +175,9 @@  static bool __evdev_is_filtered(struct evdev_client *client,
 	return mask && !test_bit(code, mask);
 }
 
-/* flush queued events of type @type, caller must hold client->buffer_lock */
-static void __evdev_flush_queue(struct evdev_client *client, unsigned int type)
+/* flush queued, matching events, caller must hold client->buffer_lock */
+static void __evdev_flush_queue(struct evdev_client *client, unsigned int type,
+				unsigned int code_first, unsigned int code_last)
 {
 	unsigned int i, head, num;
 	unsigned int mask = client->bufsize - 1;
@@ -195,7 +196,9 @@  static void __evdev_flush_queue(struct evdev_client *client, unsigned int type)
 		ev = &client->buffer[i];
 		is_report = ev->type == EV_SYN && ev->code == SYN_REPORT;
 
-		if (ev->type == type) {
+		if (ev->type == type &&
+		    ev->code >= code_first &&
+		    ev->code <= code_last) {
 			/* drop matched entry */
 			continue;
 		} else if (is_report && !num) {
@@ -786,6 +789,172 @@  static int handle_eviocgbit(struct input_dev *dev,
 	return bits_to_user(bits, len, size, p, compat_mode);
 }
 
+static inline void free_absrange(s32 **pages, size_t page_cnt)
+{
+	if (page_cnt > 1) {
+		while (page_cnt > 0) {
+			if (!pages[--page_cnt])
+				break;
+			__free_page(virt_to_page(pages[page_cnt]));
+		}
+		kfree(pages);
+	} else if (page_cnt == 1) {
+		kfree(pages);
+	}
+}
+
+static inline s32 *absrange_ptr(s32 **pages, size_t page_cnt, size_t slots,
+				size_t i_code, size_t j_slot)
+{
+	size_t idx, off;
+
+	idx = (i_code * slots + j_slot) / (PAGE_SIZE / sizeof(s32));
+	off = (i_code * slots + j_slot) % (PAGE_SIZE / sizeof(s32));
+
+	if (page_cnt == 1)
+		return &((s32*)pages)[off];
+	else
+		return &pages[idx][off];
+}
+
+static inline ssize_t fetch_absrange(struct evdev_client *client,
+				     struct input_dev *dev, size_t start,
+				     size_t count, size_t slots, s32 ***out)
+{
+	size_t size, page_cnt, i, j;
+	unsigned long flags;
+	s32 **pages;
+
+	/*
+	 * Fetch data atomically from the device and flush buffers. We need to
+	 * allocate a temporary buffer as copy_to_user() is not allowed while
+	 * holding spinlocks. However, to-be-copied data might be huge and
+	 * high-order allocations should be avoided. Therefore, do the
+	 * page-allocation manually.
+	 */
+
+	BUILD_BUG_ON(PAGE_SIZE % sizeof(s32) != 0);
+
+	size = sizeof(s32) * count * slots;
+	page_cnt = DIV_ROUND_UP(size, PAGE_SIZE);
+	if (page_cnt < 1) {
+		return 0;
+	} else if (page_cnt == 1) {
+		pages = kzalloc(size, GFP_TEMPORARY);
+		if (!pages)
+			return -ENOMEM;
+	} else {
+		pages = kzalloc(sizeof(*pages) * page_cnt, GFP_TEMPORARY);
+		if (!pages)
+			return -ENOMEM;
+
+		for (i = 0; i < page_cnt; ++i) {
+			pages[i] = (void*)get_zeroed_page(GFP_TEMPORARY);
+			if (!pages[i]) {
+				free_absrange(pages, page_cnt);
+				return -ENOMEM;
+			}
+		}
+	}
+
+	spin_lock_irqsave(&dev->event_lock, flags);
+	spin_lock(&client->buffer_lock);
+
+	for (i = 0; i < count; ++i) {
+		__u16 code;
+		bool is_mt;
+
+		code = start + i;
+		is_mt = input_is_mt_value(code);
+		if (is_mt && !dev->mt)
+			continue;
+
+		for (j = 0; j < slots; ++j) {
+			__s32 v;
+
+			if (is_mt)
+				v = input_mt_get_value(&dev->mt->slots[j],
+						       code);
+			else
+				v = dev->absinfo[code].value;
+
+			*absrange_ptr(pages, page_cnt, slots, i, j) = v;
+
+			if (!is_mt)
+				break;
+		}
+	}
+
+	spin_unlock(&client->buffer_lock);
+	__evdev_flush_queue(client, EV_ABS, start, start + count - 1);
+	spin_unlock_irqrestore(&dev->event_lock, flags);
+
+	*out = pages;
+	return page_cnt;
+}
+
+static int evdev_handle_get_absrange(struct evdev_client *client,
+				     struct input_dev *dev,
+				     struct input_absrange __user *p)
+{
+	size_t slots, code, count, i, j;
+	struct input_absrange absbuf;
+	s32 **vals = NULL;
+	ssize_t val_cnt;
+	s32 __user *b;
+	int retval;
+
+	if (!dev->absinfo)
+		return -EINVAL;
+	if (copy_from_user(&absbuf, p, sizeof(absbuf)))
+		return -EFAULT;
+
+	slots = min_t(size_t, dev->mt ? dev->mt->num_slots : 1, absbuf.slots);
+	code = min_t(size_t, absbuf.code, ABS_CNT);
+	count = min_t(size_t, absbuf.count, ABS_CNT);
+
+	/* first fetch data atomically from device */
+
+	if (code + count > ABS_CNT)
+		count = ABS_CNT - code;
+
+	if (!slots || !count) {
+		val_cnt = 0;
+	} else {
+		val_cnt = fetch_absrange(client, dev, code, count,
+					 slots, &vals);
+		if (val_cnt < 0)
+			return val_cnt;
+	}
+
+	/* now copy data to user-space */
+
+	b = (void __user*)(unsigned long)absbuf.buffer;
+	for (i = 0; i < absbuf.count; ++i) {
+		for (j = 0; j < absbuf.slots; ++j, ++b) {
+			s32 v;
+
+			if (i >= count || j >= slots)
+				v = 0;
+			else
+				v = *absrange_ptr(vals, val_cnt, slots, i, j);
+
+			if (put_user(v, b)) {
+				retval = -EFAULT;
+				goto out;
+			}
+		}
+	}
+
+	retval = 0;
+
+out:
+	free_absrange(vals, val_cnt);
+	if (retval < 0)
+		evdev_queue_syn_dropped(client);
+	return retval;
+}
+
 static int evdev_handle_get_keycode(struct input_dev *dev, void __user *p)
 {
 	struct input_keymap_entry ke = {
@@ -889,7 +1058,7 @@  static int evdev_handle_get_val(struct evdev_client *client,
 
 	spin_unlock(&dev->event_lock);
 
-	__evdev_flush_queue(client, type);
+	__evdev_flush_queue(client, type, 0, UINT_MAX);
 
 	spin_unlock_irq(&client->buffer_lock);
 
@@ -1006,6 +1175,9 @@  static long evdev_do_ioctl(struct file *file, unsigned int cmd,
 		else
 			return evdev_revoke(evdev, client, file);
 
+	case EVIOCGABSRANGE:
+		return evdev_handle_get_absrange(client, dev, p);
+
 	case EVIOCGMASK:
 		if (copy_from_user(&mask, p, sizeof(mask)))
 			return -EFAULT;
diff --git a/include/uapi/linux/input.h b/include/uapi/linux/input.h
index f6ace0e..32a6443 100644
--- a/include/uapi/linux/input.h
+++ b/include/uapi/linux/input.h
@@ -210,6 +210,48 @@  struct input_mask {
  */
 #define EVIOCSMASK		_IOW('E', 0x93, struct input_mask)	/* Set event-masks */
 
+struct input_absrange {
+	__u16 slots;
+	__u16 code;
+	__u32 count;
+	__u64 buffer;
+};
+
+/**
+ * EVIOCGABSRANGE - Fetch range of ABS values
+ *
+ * This fetches the current values of a range of ABS codes atomically. The range
+ * of codes to fetch and the buffer-types are passed as "struct input_absrange",
+ * which has the following fields:
+ *      slots: Number of MT slots to fetch data for.
+ *       code: First ABS axis to query.
+ *      count: Number of ABS axes to query starting at @code.
+ *     buffer: Pointer to a receive buffer where to store the fetched ABS
+ *             values. This buffer must be an array of __s32 with at least
+ *             (@slots * @code) elements. The buffer is interpreted as two
+ *             dimensional __s32 array, declared as: __s32[slots][codes]
+ *
+ * Compared to EVIOCGABS this ioctl allows to retrieve a range of ABS codes
+ * atomically regarding any concurrent buffer modifications. Furthermore, any
+ * pending events for codes that were retrived via this call are flushed from
+ * the client's receive buffer. But unlike EVIOCGABS, this ioctl only returns
+ * the current value of an axis, rather than the whole "struct input_absinfo"
+ * set. All fields of "struct input_absinfo" except for the value are constant,
+ * though.
+ *
+ * The kernel's current view of the ABS axes is copied into the provided buffer.
+ * If an ABS axis is not enabled on the device, its value will be zero. Also, if
+ * an axis is not a slotted MT-axis, values for all but the first slot will be
+ * 0. If @slots is greater than the actual number of slots provided by the
+ * device, values for all slots higher than that will be 0.
+ *
+ * This call may fail with -EINVAL if the kernel doesn't support this call or
+ * the arguments are invalid, with -ENODEV if access was revoked, -ENOMEM if the
+ * kernel couldn't allocate temporary buffers for data-copy or -EFAULT if the
+ * passed pointer was invalid.
+ */
+#define EVIOCGABSRANGE		_IOR('E', 0x94, struct input_absrange)
+
 #define EVIOCSCLOCKID		_IOW('E', 0xa0, int)			/* Set clockid to be used for timestamps */
 
 /*