diff mbox

[v2] Input - mt: Fix input_mt_get_slot_by_key

Message ID 1427741655-4142-1-git-send-email-benjamin.tissoires@redhat.com (mailing list archive)
State New, archived
Headers show

Commit Message

Benjamin Tissoires March 30, 2015, 6:54 p.m. UTC
The case occurred recently with a touchscreen using twice a slot during a
single EV_SYN event:

E: 0.288415 0000 0000 0000      # ------------ SYN_REPORT (0) ----------
E: 0.296207 0003 002f 0000      # EV_ABS / ABS_MT_SLOT          0
E: 0.296207 0003 0039 -001      # EV_ABS / ABS_MT_TRACKING_ID   -1
E: 0.296207 0003 002f 0001      # EV_ABS / ABS_MT_SLOT          1
E: 0.296207 0003 0035 0908      # EV_ABS / ABS_MT_POSITION_X    908
E: 0.296207 0003 0036 1062      # EV_ABS / ABS_MT_POSITION_Y    1062
E: 0.296207 0003 002f 0000      # EV_ABS / ABS_MT_SLOT          0
E: 0.296207 0003 0039 8787      # EV_ABS / ABS_MT_TRACKING_ID   8787
E: 0.296207 0003 0035 1566      # EV_ABS / ABS_MT_POSITION_X    1566
E: 0.296207 0003 0036 0861      # EV_ABS / ABS_MT_POSITION_Y    861
E: 0.296207 0003 0000 0908      # EV_ABS / ABS_X                908
E: 0.296207 0003 0001 1062      # EV_ABS / ABS_Y                1062
E: 0.296207 0000 0000 0000      # ------------ SYN_REPORT (0) ----------

This occurred because while having already slots 0 and 1 assigned, the
touchscreen sent:

0.293377 Tip Switch: 0 | Contact Id: 0 | X:  539 | Y: 1960 | Contact Count: 3
0.294783 Tip Switch: 1 | Contact Id: 1 | X:  908 | Y: 1062 | Contact Count: 0
0.296187 Tip Switch: 1 | Contact Id: 2 | X: 1566 | Y:  861 | Contact Count: 0

Slot 0 is released correclty, but when we look for Contact ID 2, the slot
0 is then picked up again because it is marked as inactive (trackingID < 0).

This is wrong, and we should not reuse a slot in the same frame.
The test should also check for input_mt_is_used().

In addition, we need to initialize mt->frame to an other value than 0.
With mt->frame being 0, all slots are tags as currently used, and so
input_mt_get_slot_by_key() would return -1 for all requests.

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=88903

Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
---

Changes in v2:
- add note in input_mt_get_slot_by_key that input_mt_sync_frame() is required

Hi Dmitry, Henrik, Jiri,

With https://git.kernel.org/cgit/linux/kernel/git/jikos/hid.git/commit/?h=for-4.1/wacom&id=9a1c001298fd567c0f0776ab54ab9965eeb9019f
in Jiri's tree, scheduled for 4.1, this patch should not break any existing
driver. I'd like us to stage it somewhere so that it doesn't get forgotten.

Henrik's previous concerns were that input_mt_sync_frame() may not be called
by a driver using input_mt_get_slot_by_key(), and now, no driver should be in
this case.

Cheers,
Benjamin

 drivers/input/input-mt.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

Comments

Dmitry Torokhov April 6, 2015, 4:33 p.m. UTC | #1
On Mon, Mar 30, 2015 at 02:54:15PM -0400, Benjamin Tissoires wrote:
> The case occurred recently with a touchscreen using twice a slot during a
> single EV_SYN event:
> 
> E: 0.288415 0000 0000 0000      # ------------ SYN_REPORT (0) ----------
> E: 0.296207 0003 002f 0000      # EV_ABS / ABS_MT_SLOT          0
> E: 0.296207 0003 0039 -001      # EV_ABS / ABS_MT_TRACKING_ID   -1
> E: 0.296207 0003 002f 0001      # EV_ABS / ABS_MT_SLOT          1
> E: 0.296207 0003 0035 0908      # EV_ABS / ABS_MT_POSITION_X    908
> E: 0.296207 0003 0036 1062      # EV_ABS / ABS_MT_POSITION_Y    1062
> E: 0.296207 0003 002f 0000      # EV_ABS / ABS_MT_SLOT          0
> E: 0.296207 0003 0039 8787      # EV_ABS / ABS_MT_TRACKING_ID   8787
> E: 0.296207 0003 0035 1566      # EV_ABS / ABS_MT_POSITION_X    1566
> E: 0.296207 0003 0036 0861      # EV_ABS / ABS_MT_POSITION_Y    861
> E: 0.296207 0003 0000 0908      # EV_ABS / ABS_X                908
> E: 0.296207 0003 0001 1062      # EV_ABS / ABS_Y                1062
> E: 0.296207 0000 0000 0000      # ------------ SYN_REPORT (0) ----------
> 
> This occurred because while having already slots 0 and 1 assigned, the
> touchscreen sent:
> 
> 0.293377 Tip Switch: 0 | Contact Id: 0 | X:  539 | Y: 1960 | Contact Count: 3
> 0.294783 Tip Switch: 1 | Contact Id: 1 | X:  908 | Y: 1062 | Contact Count: 0
> 0.296187 Tip Switch: 1 | Contact Id: 2 | X: 1566 | Y:  861 | Contact Count: 0
> 
> Slot 0 is released correclty, but when we look for Contact ID 2, the slot
> 0 is then picked up again because it is marked as inactive (trackingID < 0).
> 
> This is wrong, and we should not reuse a slot in the same frame.
> The test should also check for input_mt_is_used().

I wonder if we should call it "input_mt_is_used_in_frame" or similar.

> 
> In addition, we need to initialize mt->frame to an other value than 0.
> With mt->frame being 0, all slots are tags as currently used, and so
> input_mt_get_slot_by_key() would return -1 for all requests.
> 
> Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=88903
> 
> Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
> ---
> 
> Changes in v2:
> - add note in input_mt_get_slot_by_key that input_mt_sync_frame() is required
> 
> Hi Dmitry, Henrik, Jiri,
> 
> With https://git.kernel.org/cgit/linux/kernel/git/jikos/hid.git/commit/?h=for-4.1/wacom&id=9a1c001298fd567c0f0776ab54ab9965eeb9019f
> in Jiri's tree, scheduled for 4.1, this patch should not break any existing
> driver. I'd like us to stage it somewhere so that it doesn't get forgotten.
> 
> Henrik's previous concerns were that input_mt_sync_frame() may not be called
> by a driver using input_mt_get_slot_by_key(), and now, no driver should be in
> this case.

I'm OK with it going through Juri's tree.

Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>


> 
> Cheers,
> Benjamin
> 
>  drivers/input/input-mt.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/input/input-mt.c b/drivers/input/input-mt.c
> index cb150a1..34feb3e 100644
> --- a/drivers/input/input-mt.c
> +++ b/drivers/input/input-mt.c
> @@ -88,10 +88,13 @@ int input_mt_init_slots(struct input_dev *dev, unsigned int num_slots,
>  			goto err_mem;
>  	}
>  
> -	/* Mark slots as 'unused' */
> +	/* Mark slots as 'inactive' */
>  	for (i = 0; i < num_slots; i++)
>  		input_mt_set_value(&mt->slots[i], ABS_MT_TRACKING_ID, -1);
>  
> +	/* Mark slots as 'unused' */
> +	mt->frame = 1;
> +
>  	dev->mt = mt;
>  	return 0;
>  err_mem:
> @@ -439,6 +442,8 @@ EXPORT_SYMBOL(input_mt_assign_slots);
>   * set the key on the first unused slot and return.
>   *
>   * If no available slot can be found, -1 is returned.
> + * Note that for this function to work properly, input_mt_sync_frame() has
> + * to be called at each frame.
>   */
>  int input_mt_get_slot_by_key(struct input_dev *dev, int key)
>  {
> @@ -453,7 +458,7 @@ int input_mt_get_slot_by_key(struct input_dev *dev, int key)
>  			return s - mt->slots;
>  
>  	for (s = mt->slots; s != mt->slots + mt->num_slots; s++)
> -		if (!input_mt_is_active(s)) {
> +		if (!input_mt_is_active(s) && !input_mt_is_used(mt, s)) {
>  			s->key = key;
>  			return s - mt->slots;
>  		}
> -- 
> 2.3.3
>
Jiri Kosina April 7, 2015, 1:17 p.m. UTC | #2
On Mon, 6 Apr 2015, Dmitry Torokhov wrote:

> > With https://git.kernel.org/cgit/linux/kernel/git/jikos/hid.git/commit/?h=for-4.1/wacom&id=9a1c001298fd567c0f0776ab54ab9965eeb9019f
> > in Jiri's tree, scheduled for 4.1, this patch should not break any existing
> > driver. I'd like us to stage it somewhere so that it doesn't get forgotten.
> > 
> > Henrik's previous concerns were that input_mt_sync_frame() may not be called
> > by a driver using input_mt_get_slot_by_key(), and now, no driver should be in
> > this case.
> 
> I'm OK with it going through Juri's tree.
> 
> Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>

Perfect, thanks. I have now queued it in for-4.1/upstream.
Dmitry Torokhov April 23, 2015, 4:49 p.m. UTC | #3
On Tue, Apr 7, 2015 at 6:17 AM, Jiri Kosina <jkosina@suse.cz> wrote:
> On Mon, 6 Apr 2015, Dmitry Torokhov wrote:
>
>> > With https://git.kernel.org/cgit/linux/kernel/git/jikos/hid.git/commit/?h=for-4.1/wacom&id=9a1c001298fd567c0f0776ab54ab9965eeb9019f
>> > in Jiri's tree, scheduled for 4.1, this patch should not break any existing
>> > driver. I'd like us to stage it somewhere so that it doesn't get forgotten.
>> >
>> > Henrik's previous concerns were that input_mt_sync_frame() may not be called
>> > by a driver using input_mt_get_slot_by_key(), and now, no driver should be in
>> > this case.
>>
>> I'm OK with it going through Juri's tree.
>>
>> Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
>
> Perfect, thanks. I have now queued it in for-4.1/upstream.

Actually, thinking about it some more, I do not think this patch is
needed. It should be perfectly fine to replace the tracking ID in a
slot to denote a new contact. From
Documentation/input/multi-touch-protocol.txt

"Creation, replacement and destruction of contacts is achieved by
modifying the ABS_MT_TRACKING_ID of the associated slot.  A
non-negative tracking id is interpreted as a contact, and the value -1
denotes an unused slot.  A tracking id not previously present is
considered new, and a tracking id no longer present is considered
removed."

If some userspace is confused with missing -1 tracking ID for that
slot, that userspace should be fixed.

Thanks.
Henrik Rydberg April 23, 2015, 5:10 p.m. UTC | #4
> "Creation, replacement and destruction of contacts is achieved by
> modifying the ABS_MT_TRACKING_ID of the associated slot.  A
> non-negative tracking id is interpreted as a contact, and the value -1
> denotes an unused slot.  A tracking id not previously present is
> considered new, and a tracking id no longer present is considered
> removed."
> 
> If some userspace is confused with missing -1 tracking ID for that
> slot, that userspace should be fixed.

I agree. Some userland applications work with add/remove out of convenience, and
cannot handle the more compressed notation the kernel slot handling allows.
Fixing those applications will be a good thing.

Unfortunately the patch already appeared in Linus' tree...

Henrik

--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dmitry Torokhov April 23, 2015, 6:20 p.m. UTC | #5
On Thu, Apr 23, 2015 at 07:10:55PM +0200, Henrik Rydberg wrote:
> > "Creation, replacement and destruction of contacts is achieved by
> > modifying the ABS_MT_TRACKING_ID of the associated slot.  A
> > non-negative tracking id is interpreted as a contact, and the value -1
> > denotes an unused slot.  A tracking id not previously present is
> > considered new, and a tracking id no longer present is considered
> > removed."
> > 
> > If some userspace is confused with missing -1 tracking ID for that
> > slot, that userspace should be fixed.
> 
> I agree. Some userland applications work with add/remove out of convenience, and
> cannot handle the more compressed notation the kernel slot handling allows.
> Fixing those applications will be a good thing.
> 
> Unfortunately the patch already appeared in Linus' tree...

It's in 4.1-rc0 so we can just revert it.
Benjamin Tissoires April 23, 2015, 6:38 p.m. UTC | #6
On Apr 23 2015 or thereabouts, Dmitry Torokhov wrote:
> On Thu, Apr 23, 2015 at 07:10:55PM +0200, Henrik Rydberg wrote:
> > > "Creation, replacement and destruction of contacts is achieved by
> > > modifying the ABS_MT_TRACKING_ID of the associated slot.  A
> > > non-negative tracking id is interpreted as a contact, and the value -1
> > > denotes an unused slot.  A tracking id not previously present is
> > > considered new, and a tracking id no longer present is considered
> > > removed."
> > > 
> > > If some userspace is confused with missing -1 tracking ID for that
> > > slot, that userspace should be fixed.
> > 
> > I agree. Some userland applications work with add/remove out of convenience, and
> > cannot handle the more compressed notation the kernel slot handling allows.
> > Fixing those applications will be a good thing.
> > 
> > Unfortunately the patch already appeared in Linus' tree...
> 
> It's in 4.1-rc0 so we can just revert it.
> 

Before we call for a revert, I'd like to hear Peter thoughts about it,
as he is the main user of the slotted protocol.

Also, Dmitry, can you check if the ChromeOS driver and (or) the android
one would need to be adapted? My guess is that none of these drivers are
able to handle a silent closing of a slot and the easiest solution might
be to simply change the documentation if in fact nobody uses the
compressed notation (which removes just one ABS_SLOT event within the
frame, so not sure we can call it compressed BTW).

Cheers,
Benjamin
--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dmitry Torokhov April 23, 2015, 6:47 p.m. UTC | #7
On Thu, Apr 23, 2015 at 02:38:27PM -0400, Benjamin Tissoires wrote:
> On Apr 23 2015 or thereabouts, Dmitry Torokhov wrote:
> > On Thu, Apr 23, 2015 at 07:10:55PM +0200, Henrik Rydberg wrote:
> > > > "Creation, replacement and destruction of contacts is achieved by
> > > > modifying the ABS_MT_TRACKING_ID of the associated slot.  A
> > > > non-negative tracking id is interpreted as a contact, and the value -1
> > > > denotes an unused slot.  A tracking id not previously present is
> > > > considered new, and a tracking id no longer present is considered
> > > > removed."
> > > > 
> > > > If some userspace is confused with missing -1 tracking ID for that
> > > > slot, that userspace should be fixed.
> > > 
> > > I agree. Some userland applications work with add/remove out of convenience, and
> > > cannot handle the more compressed notation the kernel slot handling allows.
> > > Fixing those applications will be a good thing.
> > > 
> > > Unfortunately the patch already appeared in Linus' tree...
> > 
> > It's in 4.1-rc0 so we can just revert it.
> > 
> 
> Before we call for a revert, I'd like to hear Peter thoughts about it,
> as he is the main user of the slotted protocol.
> 
> Also, Dmitry, can you check if the ChromeOS driver and (or) the android
> one would need to be adapted? My guess is that none of these drivers are
> able to handle a silent closing of a slot

I will, at least for Chrome. But if it is broken I'll simply open a big
to fix it ;)

> and the easiest solution might
> be to simply change the documentation if in fact nobody uses the
> compressed notation (which removes just one ABS_SLOT event within the
> frame, so not sure we can call it compressed BTW).

No, it is more that that I think. Wouldn't we need to allocate memory
for 2*n slots to actually reliably track all contacts if they all happen
to get released and put back onto the surface in different place very
very quickly if we insist on sending tracking id -1 for previously used
slots?

Thanks.
Henrik Rydberg April 23, 2015, 6:52 p.m. UTC | #8
> My guess is that none of these drivers are
> able to handle a silent closing of a slot and the easiest solution might
> be to simply change the documentation if in fact nobody uses the
> compressed notation (which removes just one ABS_SLOT event within the
> frame, so not sure we can call it compressed BTW).

No, this is a very bad idea. As is obvious from the patch and the problem, there
are cases that the typical add/remove interface cannot handle, without either
adding more frames or support twice as many slots. How else would you report a
complete change of all N contacts into N new contacts, for instance?

The kernel protocol is solid and works for all cases. Userland thus has access
to a complete solution.

Henrik


--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Peter Hutterer April 24, 2015, 12:50 a.m. UTC | #9
On Thu, Apr 23, 2015 at 11:47:09AM -0700, Dmitry Torokhov wrote:
> On Thu, Apr 23, 2015 at 02:38:27PM -0400, Benjamin Tissoires wrote:
> > On Apr 23 2015 or thereabouts, Dmitry Torokhov wrote:
> > > On Thu, Apr 23, 2015 at 07:10:55PM +0200, Henrik Rydberg wrote:
> > > > > "Creation, replacement and destruction of contacts is achieved by
> > > > > modifying the ABS_MT_TRACKING_ID of the associated slot.  A
> > > > > non-negative tracking id is interpreted as a contact, and the value -1
> > > > > denotes an unused slot.  A tracking id not previously present is
> > > > > considered new, and a tracking id no longer present is considered
> > > > > removed."
> > > > > 
> > > > > If some userspace is confused with missing -1 tracking ID for that
> > > > > slot, that userspace should be fixed.
> > > > 
> > > > I agree. Some userland applications work with add/remove out of convenience, and
> > > > cannot handle the more compressed notation the kernel slot handling allows.
> > > > Fixing those applications will be a good thing.
> > > > 
> > > > Unfortunately the patch already appeared in Linus' tree...
> > > 
> > > It's in 4.1-rc0 so we can just revert it.
> > > 
> > 
> > Before we call for a revert, I'd like to hear Peter thoughts about it,
> > as he is the main user of the slotted protocol.
> > 
> > Also, Dmitry, can you check if the ChromeOS driver and (or) the android
> > one would need to be adapted? My guess is that none of these drivers are
> > able to handle a silent closing of a slot
> 
> I will, at least for Chrome. But if it is broken I'll simply open a big
> to fix it ;)
> 
> > and the easiest solution might
> > be to simply change the documentation if in fact nobody uses the
> > compressed notation (which removes just one ABS_SLOT event within the
> > frame, so not sure we can call it compressed BTW).
> 
> No, it is more that that I think. Wouldn't we need to allocate memory
> for 2*n slots to actually reliably track all contacts if they all happen
> to get released and put back onto the surface in different place very
> very quickly if we insist on sending tracking id -1 for previously used
> slots?

can't you just slot in an extra event that contains only the
ABS_MT_TRACKING_ID -1 for the needed slots, followed by an SYN_REPORT and
whatever BTN_TOOL_* updates are needed? You don't need extra slots here,
you're just putting one extra event in the queue before handling the new
touches as usual.

thing is: I've always assumed that a touch is terminated by a -1 event
and this hasn't been a problem until very recently. so anything I've ever
touched will be broken if we start switching tracking IDs directly.  That
includes xorg input, libinput and anything that uses libevdev. sorry.

if the kernel switches from one tracking ID to another directly,
libevdev will actually discard the new tracking ID.
http://cgit.freedesktop.org/libevdev/tree/libevdev/libevdev.c#n968
(sorry again) aside from the warning, it's safe to switch directly though,
there shouldn't be any side-effects. 
as for fixing this: I can add something to libevdev to allow it but I'll
also need to fix up every caller to handle this sequence then, they all rely
on the -1. so some stuff will simply break.
plus we still have synaptics up to 1.7.x and evdev up to 2.8.x that are
pre-libevdev.

for other event processing it's tricky as well. if you go from two
touches to two new touches you need to send out a BTN_TOOL_DOUBLETAP 0, then
1. if not, a legacy process missed the event completely (synaptics would
suffer from that). likewise, without the BTN_TOUCH going to 0 for one frame
you'll get coordinate jumps on the pointer emulation.

having the tracking ID go -1 and then to a real one in the same frame makes
this even worse, because now even the MT-capable processes need to attach
flags to each touch whether it intermediately terminated or not. The event
ordering is not guaranteed, so we don't know until the SYN_REPORT whether
we switched from 2 fingers to 1, or 2 fingers to 2 fingers. or possibly
three fingers if BTN_TOOL_TRIPLETAP is set which we won't know until the
end. That has to be fixed in every caller. and it changes the evdev protocol
from "you have the state at SYN_REPORT" to "you need to keep track of some
state changes within the frame". that's no good.

so summary: switching directly between IDs is doable but requires userspace
fixes everywhere. terminating and restarting a contact within the same frame
is going to be nasty, let's not do that please.
best solution: the kernel inserts an additional event to terminate all
restarted contacts before starting the new ones as normal in the subsequent
frame.
 
Cheers,
   Peter
--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Henrik Rydberg April 24, 2015, 6:26 a.m. UTC | #10
Peter,

It may be a long time ago now, but we had very vocal discussions regarding the
MT protocol back then, and I am quite sure all the subtleties are well
understood. In order to fully appreciate the simplicity of the protocol, one
only needs to stop misintepreting it. In order to do that, please imagine a
large piece of papers, a set of brushes, and a set of colors.

The MT protocol tracks brushes. When lines are drawn, positions are updated.
When the color on the brush changes, the tracking id changes. When the brush is
lifted, the tracking id becomes -1, the "no color" color. There is a fixed set
of brushes. The old test application mtview works precisely this way.

The add/remove protocol tracks colors. Technichally it tracks contacts, which
are the combined (color, brush) objects, but in this analogy, colors and
tracking ids are interchangeable. When lines a drawn, a color is first assigned
to the color and the brush attributes on the color are updated. The positions on
the color are subsequently updated. To change color, the brush is deassign from
the first color and then the brush is assign to the new color. To lift, the
brush is deassign from the color. This is the abstraction the seems to prevail
in userland at the moment.

> can't you just slot in an extra event that contains only the
> ABS_MT_TRACKING_ID -1 for the needed slots, followed by an SYN_REPORT and
> whatever BTN_TOOL_* updates are needed? You don't need extra slots here,
> you're just putting one extra event in the queue before handling the new
> touches as usual.

So you want to add a rule saying that before a brush changes color, it first has
to be cleaned. That may look simple enough, but it misses out on several subtleties.

1. It is no longer possible to create beautiful contiuous tracks of varying color.

2. When a brush moves quickly, it will sometimes restart the track with a new
color. This happens on all hardware with tracking support when you approach the
sampling limit. It happens in the kernel tracking as well, for the same reasons.
Incompatible with 1.

3. There is a difference between losing track of a brush and lifting a brush.
This is one of the situations where the add/remove protocol has to create very
nonintuitive restrictions and rules to cope. The reason starts with 1.

Forcing tracks (brushes) into the add/remove protocol creates problems that are
on a more fundamental level than the subtle issues one may hope to resolve.

> thing is: I've always assumed that a touch is terminated by a -1 event
> and this hasn't been a problem until very recently.

We have talked a lot about the differences, they can hardly have escaped anyone
deeply involved. It is true that this is not normally a problem. It only becomes
important when the sampling rate is too low to resolve all the actions we deem
important.

> so anything I've ever
> touched will be broken if we start switching tracking IDs directly.
> That
> includes xorg input, libinput and anything that uses libevdev. sorry.

This has been in the mainline kernel for the last five years, and obvisouly
still works well most of the time. I sometimes experience glitches in the
trackpad usage on my laptop, and it probably stems from this issues. It is
slightly annoying, but not broken. No reason to panic.

> if the kernel switches from one tracking ID to another directly,
> libevdev will actually discard the new tracking ID.
> http://cgit.freedesktop.org/libevdev/tree/libevdev/libevdev.c#n968
> (sorry again) aside from the warning, it's safe to switch directly though,
> there shouldn't be any side-effects. 
> as for fixing this: I can add something to libevdev to allow it but I'll
> also need to fix up every caller to handle this sequence then, they all rely
> on the -1. so some stuff will simply break.
> plus we still have synaptics up to 1.7.x and evdev up to 2.8.x that are
> pre-libevdev.

Perhaps this is worth looking at in conjunction with the problem of handling
lost touches. I am thinking of suspend/resume issues in particular. If the
system could handle the distinction between a lift and a lost touch, some logic
would be less complicated and more correct.

> for other event processing it's tricky as well. if you go from two
> touches to two new touches you need to send out a BTN_TOOL_DOUBLETAP 0, then
> 1. if not, a legacy process missed the event completely (synaptics would
> suffer from that). likewise, without the BTN_TOUCH going to 0 for one frame
> you'll get coordinate jumps on the pointer emulation.

I am not so sure about this - the movement in synaptics checks if there has been
any identity changes.

> having the tracking ID go -1 and then to a real one in the same frame makes
> this even worse, because now even the MT-capable processes need to attach
> flags to each touch whether it intermediately terminated or not.

This is simply the result of a poor abstraction to begin with. The state tracked
through the input subsystem is the slot state.

> The event
> ordering is not guaranteed, so we don't know until the SYN_REPORT whether
> we switched from 2 fingers to 1, or 2 fingers to 2 fingers. or possibly
> three fingers if BTN_TOOL_TRIPLETAP is set which we won't know until the
> end. That has to be fixed in every caller. and it changes the evdev protocol
> from "you have the state at SYN_REPORT" to "you need to keep track of some
> state changes within the frame". that's no good.

I think this paragraph is mixing what the kernel is conveying with what
representation this is mapped to in some userland applications. The state
transfer of an id change without lifts is correctly transferred.

> so summary: switching directly between IDs is doable but requires userspace
> fixes everywhere. terminating and restarting a contact within the same frame
> is going to be nasty, let's not do that please.

I agree - userland should perhaps use a different abstraction than add/remove.
Why not use the "brushes" abstraction instead?

> best solution: the kernel inserts an additional event to terminate all
> restarted contacts before starting the new ones as normal in the subsequent
> frame.

No, this would just be the first step towards an endless sequence of subtle
issues trying to shoehorn the protocol into a poor abstraction.

Henrik

--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Peter Hutterer April 27, 2015, 3:52 a.m. UTC | #11
On Fri, Apr 24, 2015 at 08:26:39AM +0200, Henrik Rydberg wrote:
> Peter,
> 
> It may be a long time ago now, but we had very vocal discussions regarding the
> MT protocol back then, and I am quite sure all the subtleties are well
> understood. In order to fully appreciate the simplicity of the protocol, one
> only needs to stop misintepreting it. In order to do that, please imagine a
> large piece of papers, a set of brushes, and a set of colors.
> 
> The MT protocol tracks brushes. When lines are drawn, positions are updated.
> When the color on the brush changes, the tracking id changes. When the brush is
> lifted, the tracking id becomes -1, the "no color" color. There is a fixed set
> of brushes. The old test application mtview works precisely this way.
> 
> The add/remove protocol tracks colors. Technichally it tracks contacts, which
> are the combined (color, brush) objects, but in this analogy, colors and
> tracking ids are interchangeable. When lines a drawn, a color is first assigned
> to the color and the brush attributes on the color are updated. The positions on
> the color are subsequently updated. To change color, the brush is deassign from
> the first color and then the brush is assign to the new color. To lift, the
> brush is deassign from the color. This is the abstraction the seems to prevail
> in userland at the moment.
> 
> > can't you just slot in an extra event that contains only the
> > ABS_MT_TRACKING_ID -1 for the needed slots, followed by an SYN_REPORT and
> > whatever BTN_TOOL_* updates are needed? You don't need extra slots here,
> > you're just putting one extra event in the queue before handling the new
> > touches as usual.
> 
> So you want to add a rule saying that before a brush changes color, it first has
> to be cleaned. That may look simple enough, but it misses out on several subtleties.
> 
> 1. It is no longer possible to create beautiful contiuous tracks of varying color.
>
> 2. When a brush moves quickly, it will sometimes restart the track with a new
> color. This happens on all hardware with tracking support when you approach the
> sampling limit. It happens in the kernel tracking as well, for the same reasons.
> Incompatible with 1.
> 
> 3. There is a difference between losing track of a brush and lifting a brush.
> This is one of the situations where the add/remove protocol has to create very
> nonintuitive restrictions and rules to cope. The reason starts with 1.
> 
> Forcing tracks (brushes) into the add/remove protocol creates problems that are
> on a more fundamental level than the subtle issues one may hope to resolve.
> 
> > thing is: I've always assumed that a touch is terminated by a -1 event
> > and this hasn't been a problem until very recently.
> 
> We have talked a lot about the differences, they can hardly have escaped anyone
> deeply involved. 

well, they did. I honestly barely recall any conversations about that
protocol. it's now what, 5 years ago? and since the protocol itself worked
mostly fine I never had to think much about it, short of the various issues
with SYN_DROPPED.

I read through the documentation again after the weekend with fresh eyes and
it IMO can be read both ways. Couple that with seeing tracking IDs of -1 for
a couple of years before this issue came up and here we are at the current
situation. So regardless of what will eventually get merged, I recommend
putting an extra sentence in to explicitly spell out that a tracking ID can
change directly from non-negative to non-negative, and when this is allowed
to happen and what it means to the userspace process. And add an example for
it in the examples section.

> It is true that this is not normally a problem. It only becomes
> important when the sampling rate is too low to resolve all the actions we deem
> important.
> 
> > so anything I've ever
> > touched will be broken if we start switching tracking IDs directly.
> > That
> > includes xorg input, libinput and anything that uses libevdev. sorry.
> 
> This has been in the mainline kernel for the last five years, and obvisouly
> still works well most of the time. I sometimes experience glitches in the
> trackpad usage on my laptop, and it probably stems from this issues. It is
> slightly annoying, but not broken. No reason to panic.

what you're seeing is most likely the effect of libevdev.
libevdev discards the extra tracking ID and the driver will think it's the
same touch. which fixes the various stuck touch issues that we had, along
with crashes, and a couple of memory overflows (specifically in synaptics).

> > if the kernel switches from one tracking ID to another directly,
> > libevdev will actually discard the new tracking ID.
> > http://cgit.freedesktop.org/libevdev/tree/libevdev/libevdev.c#n968
> > (sorry again) aside from the warning, it's safe to switch directly though,
> > there shouldn't be any side-effects. 
> > as for fixing this: I can add something to libevdev to allow it but I'll
> > also need to fix up every caller to handle this sequence then, they all rely
> > on the -1. so some stuff will simply break.
> > plus we still have synaptics up to 1.7.x and evdev up to 2.8.x that are
> > pre-libevdev.
> 
> Perhaps this is worth looking at in conjunction with the problem of handling
> lost touches. I am thinking of suspend/resume issues in particular. If the
> system could handle the distinction between a lift and a lost touch, some logic
> would be less complicated and more correct.

the suspend/resume issues that we had were caused by a SYN_DROPPED event
during resume. that is handled transparently by libevdev now which will drop
touches and restart new touches so that the code to handle SYN_DROPPED and
normal events is virtually the same in the caller.

yes, same underlying problem but different trigger and different handling of
the effect in the xorg drivers, e.g. ignoring any touches already present at
resume time since there are no reliable timestamps on them.

> > for other event processing it's tricky as well. if you go from two
> > touches to two new touches you need to send out a BTN_TOOL_DOUBLETAP 0, then
> > 1. if not, a legacy process missed the event completely (synaptics would
> > suffer from that). likewise, without the BTN_TOUCH going to 0 for one frame
> > you'll get coordinate jumps on the pointer emulation.
> 
> I am not so sure about this - the movement in synaptics checks if there has been
> any identity changes.
> 
> > having the tracking ID go -1 and then to a real one in the same frame makes
> > this even worse, because now even the MT-capable processes need to attach
> > flags to each touch whether it intermediately terminated or not.
> 
> This is simply the result of a poor abstraction to begin with. The state tracked
> through the input subsystem is the slot state.
> 
> > The event
> > ordering is not guaranteed, so we don't know until the SYN_REPORT whether
> > we switched from 2 fingers to 1, or 2 fingers to 2 fingers. or possibly
> > three fingers if BTN_TOOL_TRIPLETAP is set which we won't know until the
> > end. That has to be fixed in every caller. and it changes the evdev protocol
> > from "you have the state at SYN_REPORT" to "you need to keep track of some
> > state changes within the frame". that's no good.
> 
> I think this paragraph is mixing what the kernel is conveying with what
> representation this is mapped to in some userland applications. The state
> transfer of an id change without lifts is correctly transferred.
>
> > so summary: switching directly between IDs is doable but requires userspace
> > fixes everywhere. terminating and restarting a contact within the same frame
> > is going to be nasty, let's not do that please.
> 
> I agree - userland should perhaps use a different abstraction than add/remove.
> Why not use the "brushes" abstraction instead?

it's not as simple as saying "we just use a different abstraction". I
recently traded my time-machine for a surfboard, which did wonders for my
sanity. But the drawback is I can't just go back and make the current code
not happen. I understand that having tracking IDs change directly would be
useful to fix that jumping cursor bug. still means that it requires some
core rewrites in some of the most common users of the evdev protocol.

so the list remains: please don't do -1 and new ID in the same frame.
Alternatively, change tracking ID directly and most current software will
skip over the direct tracking ID change and that's fixable over time in
userspace. Explicitly terminate the touch in the kernel means everything
handles it already.

Cheers,
   Peter

> 
> > best solution: the kernel inserts an additional event to terminate all
> > restarted contacts before starting the new ones as normal in the subsequent
> > frame.
> 
> No, this would just be the first step towards an endless sequence of subtle
> issues trying to shoehorn the protocol into a poor abstraction.
> 
> Henrik
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Benjamin Tissoires April 27, 2015, 6:02 p.m. UTC | #12
On Apr 23 2015 or thereabouts, Dmitry Torokhov wrote:
> On Thu, Apr 23, 2015 at 02:38:27PM -0400, Benjamin Tissoires wrote:
> > On Apr 23 2015 or thereabouts, Dmitry Torokhov wrote:
> > > On Thu, Apr 23, 2015 at 07:10:55PM +0200, Henrik Rydberg wrote:
> > > > > "Creation, replacement and destruction of contacts is achieved by
> > > > > modifying the ABS_MT_TRACKING_ID of the associated slot.  A
> > > > > non-negative tracking id is interpreted as a contact, and the value -1
> > > > > denotes an unused slot.  A tracking id not previously present is
> > > > > considered new, and a tracking id no longer present is considered
> > > > > removed."
> > > > > 
> > > > > If some userspace is confused with missing -1 tracking ID for that
> > > > > slot, that userspace should be fixed.
> > > > 
> > > > I agree. Some userland applications work with add/remove out of convenience, and
> > > > cannot handle the more compressed notation the kernel slot handling allows.
> > > > Fixing those applications will be a good thing.
> > > > 
> > > > Unfortunately the patch already appeared in Linus' tree...
> > > 
> > > It's in 4.1-rc0 so we can just revert it.
> > > 
> > 
> > Before we call for a revert, I'd like to hear Peter thoughts about it,
> > as he is the main user of the slotted protocol.
> > 
> > Also, Dmitry, can you check if the ChromeOS driver and (or) the android
> > one would need to be adapted? My guess is that none of these drivers are
> > able to handle a silent closing of a slot
> 
> I will, at least for Chrome. But if it is broken I'll simply open a big
> to fix it ;)
> 
> > and the easiest solution might
> > be to simply change the documentation if in fact nobody uses the
> > compressed notation (which removes just one ABS_SLOT event within the
> > frame, so not sure we can call it compressed BTW).
> 
> No, it is more that that I think. Wouldn't we need to allocate memory
> for 2*n slots to actually reliably track all contacts if they all happen
> to get released and put back onto the surface in different place very
> very quickly if we insist on sending tracking id -1 for previously used
> slots?
> 

In addition to Peter's answer, I thought about this case a little bit
more. For most of the devices, this won't be a problem: at the HID
level, the device needs to send a release event. And we declare the
slots according to what the HID protocol is capable of.

So in most of the cases (i.e. with a Win 8 certified touchscreen that
uses the HID protocol), you don't need to augment the slot count to 2*n
given that the pre-filtering will be done by the firmware to ensure the
compatibility with the HID touch specification.

For other devices, we write an ad-hoc driver, and there we can be
smarter and eventually rely on Peter's proposal to add extra EV_SYN for
releasing the unsuded slots.

Note that this discussion came from a HID device which doesn't need to
allocate 2*n slots, but input_mt_get_slot_by key() reallocate the same
slot twice within the same frame while other slots were free to be used.

Cheers,
Benjamin
--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/input/input-mt.c b/drivers/input/input-mt.c
index cb150a1..34feb3e 100644
--- a/drivers/input/input-mt.c
+++ b/drivers/input/input-mt.c
@@ -88,10 +88,13 @@  int input_mt_init_slots(struct input_dev *dev, unsigned int num_slots,
 			goto err_mem;
 	}
 
-	/* Mark slots as 'unused' */
+	/* Mark slots as 'inactive' */
 	for (i = 0; i < num_slots; i++)
 		input_mt_set_value(&mt->slots[i], ABS_MT_TRACKING_ID, -1);
 
+	/* Mark slots as 'unused' */
+	mt->frame = 1;
+
 	dev->mt = mt;
 	return 0;
 err_mem:
@@ -439,6 +442,8 @@  EXPORT_SYMBOL(input_mt_assign_slots);
  * set the key on the first unused slot and return.
  *
  * If no available slot can be found, -1 is returned.
+ * Note that for this function to work properly, input_mt_sync_frame() has
+ * to be called at each frame.
  */
 int input_mt_get_slot_by_key(struct input_dev *dev, int key)
 {
@@ -453,7 +458,7 @@  int input_mt_get_slot_by_key(struct input_dev *dev, int key)
 			return s - mt->slots;
 
 	for (s = mt->slots; s != mt->slots + mt->num_slots; s++)
-		if (!input_mt_is_active(s)) {
+		if (!input_mt_is_active(s) && !input_mt_is_used(mt, s)) {
 			s->key = key;
 			return s - mt->slots;
 		}