Message ID | 20200821183236.21963-1-sonnysasaka@chromium.org (mailing list archive) |
---|---|
State | Changes Requested |
Headers | show |
Series | [BlueZ] device: Disconnect att before creating a new one | expand |
Hi Sonny, On Fri, Aug 21, 2020 at 11:36 AM Sonny Sasaka <sonnysasaka@chromium.org> wrote: > > When device_attach_att is invoked, it may be that the old att is still > connected (the disconnection hasn't been detected). > > This patch calls the disconnection callback before creating the new att > since the disconnection callback will never be invoked after the new att > is created. The disconnection callback also cleans up the att of the > device object before assigning a new one. This way the old att will not > at later time fire disconnect event which would operate on the already > freed device pointer. > > When there is a HCI LE Disconnection Complete event followed by HCI LE > Connection Complete event and they are very close together, this > sequence could happen because the kernel doesn't guarantee that the > closing of the l2cap socket (due to LE Disconnection Complete event) > always happens earlier than the creation of the new l2cap socket (due to > LE Connection Complete event). This actually sounds like a bug in the kernel, either it should cleanup the existing sockets or not disconnect them at all if there is a new connection in place, otherwise it quite possible that there will be many more races like this. Have you tried doing something similar with BR/EDR, we actually depend on the socket being disconnected to notify all the profiles so then can cleanup properly. > Tested by repeatedly connecting/disconnecting to a device until the > situation happens and checking that bluetoothd doesn't crash. > > Reviewed-by: Shyh-In Hwang <josephsih@chromium.org> > > --- > src/device.c | 9 +++++++++ > 1 file changed, 9 insertions(+) > > diff --git a/src/device.c b/src/device.c > index 7b7808405..bed8f38b5 100644 > --- a/src/device.c > +++ b/src/device.c > @@ -5304,6 +5304,15 @@ bool device_attach_att(struct btd_device *dev, GIOChannel *io) > return false; > } > > + /* This may be reached when the device already has att attached to it. > + * In this case disconnect the att first before assigning the new one, > + * otherwise the old att may fire a disconnect event at later time > + * and will invoke operations on the already freed device pointer. > + * The error code (ECONNRESET) is chosen arbitrarily since the > + * disconnection event (with an error code) is not yet detected. > + */ > + if (dev->attrib || dev->att) > + att_disconnected_cb(ECONNRESET, dev); Another way to resolve this is to cleanup earlier at device_remove_connection since we know at that point ATT will be disconnected. > dev->attrib = attrib; > dev->att = g_attrib_get_att(attrib); > > -- > 2.26.2 >
diff --git a/src/device.c b/src/device.c index 7b7808405..bed8f38b5 100644 --- a/src/device.c +++ b/src/device.c @@ -5304,6 +5304,15 @@ bool device_attach_att(struct btd_device *dev, GIOChannel *io) return false; } + /* This may be reached when the device already has att attached to it. + * In this case disconnect the att first before assigning the new one, + * otherwise the old att may fire a disconnect event at later time + * and will invoke operations on the already freed device pointer. + * The error code (ECONNRESET) is chosen arbitrarily since the + * disconnection event (with an error code) is not yet detected. + */ + if (dev->attrib || dev->att) + att_disconnected_cb(ECONNRESET, dev); dev->attrib = attrib; dev->att = g_attrib_get_att(attrib);
When device_attach_att is invoked, it may be that the old att is still connected (the disconnection hasn't been detected). This patch calls the disconnection callback before creating the new att since the disconnection callback will never be invoked after the new att is created. The disconnection callback also cleans up the att of the device object before assigning a new one. This way the old att will not at later time fire disconnect event which would operate on the already freed device pointer. When there is a HCI LE Disconnection Complete event followed by HCI LE Connection Complete event and they are very close together, this sequence could happen because the kernel doesn't guarantee that the closing of the l2cap socket (due to LE Disconnection Complete event) always happens earlier than the creation of the new l2cap socket (due to LE Connection Complete event). Tested by repeatedly connecting/disconnecting to a device until the situation happens and checking that bluetoothd doesn't crash. Reviewed-by: Shyh-In Hwang <josephsih@chromium.org> --- src/device.c | 9 +++++++++ 1 file changed, 9 insertions(+)