diff mbox series

Bluetooth: Fix receiving HCI_LE_Advertising_Set_Terminated event

Message ID 20211102192742.1.I3ba1a76d72da5a813cf6e6f219838c9ef28c5eaa@changeid (mailing list archive)
State Superseded
Delegated to: Luiz Von Dentz
Headers show
Series Bluetooth: Fix receiving HCI_LE_Advertising_Set_Terminated event | expand

Checks

Context Check Description
tedd_an/checkpatch success Checkpatch PASS
tedd_an/gitlint success Gitlint PASS
tedd_an/buildkernel success Build Kernel PASS
tedd_an/testrunnersetup success Test Runner Setup PASS
tedd_an/testrunnerl2cap-tester success Total: 40, Passed: 40 (100.0%), Failed: 0, Not Run: 0
tedd_an/testrunnerbnep-tester success Total: 1, Passed: 1 (100.0%), Failed: 0, Not Run: 0
tedd_an/testrunnermgmt-tester fail Total: 468, Passed: 463 (98.9%), Failed: 5, Not Run: 0
tedd_an/testrunnerrfcomm-tester success Total: 9, Passed: 9 (100.0%), Failed: 0, Not Run: 0
tedd_an/testrunnersco-tester success Total: 12, Passed: 12 (100.0%), Failed: 0, Not Run: 0
tedd_an/testrunnersmp-tester success Total: 8, Passed: 8 (100.0%), Failed: 0, Not Run: 0
tedd_an/testrunneruserchan-tester success Total: 4, Passed: 4 (100.0%), Failed: 0, Not Run: 0

Commit Message

Archie Pusaka Nov. 2, 2021, 11:27 a.m. UTC
From: Archie Pusaka <apusaka@chromium.org>

This event is received when the controller stops advertising,
specifically for these three reasons:
(a) Connection is successfully created (success).
(b) Timeout is reached (error).
(c) Number of advertising events is reached (error).
(*) This event is NOT generated when the host stops the advertisement.
Refer to the BT spec ver 5.3 vol 4 part E sec 7.7.65.18. Note that the
section was revised from BT spec ver 5.0 vol 2 part E sec 7.7.65.18
which was ambiguous about (*).

Some chips (e.g. RTL8822CE) send this event when the host stops the
advertisement with status = HCI_ERROR_CANCELLED_BY_HOST (due to (*)
above). This is treated as an error and the advertisement will be
removed and userspace will be informed via MGMT event.

On suspend, we are supposed to temporarily disable advertisements,
and continue advertising on resume. However, due to the behavior
above, the advertisements are removed instead.

This patch returns early if HCI_ERROR_CANCELLED_BY_HOST is received.

Additionally, this patch also clear HCI_LE_ADV if there are no more
advertising instances after receiving other errors.

Signed-off-by: Archie Pusaka <apusaka@chromium.org>
Reviewed-by: Alain Michaud <alainm@chromium.org>

---

 include/net/bluetooth/hci.h |  1 +
 net/bluetooth/hci_event.c   | 12 ++++++++++++
 2 files changed, 13 insertions(+)

Comments

Marcel Holtmann Nov. 2, 2021, 2 p.m. UTC | #1
Hi Archie,

> This event is received when the controller stops advertising,
> specifically for these three reasons:
> (a) Connection is successfully created (success).
> (b) Timeout is reached (error).
> (c) Number of advertising events is reached (error).
> (*) This event is NOT generated when the host stops the advertisement.
> Refer to the BT spec ver 5.3 vol 4 part E sec 7.7.65.18. Note that the
> section was revised from BT spec ver 5.0 vol 2 part E sec 7.7.65.18
> which was ambiguous about (*).
> 
> Some chips (e.g. RTL8822CE) send this event when the host stops the
> advertisement with status = HCI_ERROR_CANCELLED_BY_HOST (due to (*)
> above). This is treated as an error and the advertisement will be
> removed and userspace will be informed via MGMT event.
> 
> On suspend, we are supposed to temporarily disable advertisements,
> and continue advertising on resume. However, due to the behavior
> above, the advertisements are removed instead.
> 
> This patch returns early if HCI_ERROR_CANCELLED_BY_HOST is received.

lets include a btmon snippet here to show the faulty behavior.

> 
> Additionally, this patch also clear HCI_LE_ADV if there are no more
> advertising instances after receiving other errors.

Does this really belong in this patch? I think it warrants a separate patch with an appropriate Fixes: tag. Especially in the case we are working around a firmware bug, this should be separate. It gives us a better chance to bisect anything if we ever have to.

> 
> Signed-off-by: Archie Pusaka <apusaka@chromium.org>
> Reviewed-by: Alain Michaud <alainm@chromium.org>
> 
> ---
> 
> include/net/bluetooth/hci.h |  1 +
> net/bluetooth/hci_event.c   | 12 ++++++++++++
> 2 files changed, 13 insertions(+)
> 
> diff --git a/include/net/bluetooth/hci.h b/include/net/bluetooth/hci.h
> index 63065bc01b76..84db6b275231 100644
> --- a/include/net/bluetooth/hci.h
> +++ b/include/net/bluetooth/hci.h
> @@ -566,6 +566,7 @@ enum {
> #define HCI_ERROR_INVALID_LL_PARAMS	0x1e
> #define HCI_ERROR_UNSPECIFIED		0x1f
> #define HCI_ERROR_ADVERTISING_TIMEOUT	0x3c
> +#define HCI_ERROR_CANCELLED_BY_HOST	0x44
> 
> /* Flow control modes */
> #define HCI_FLOW_CTL_MODE_PACKET_BASED	0x00
> diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c
> index d4b75a6cfeee..150b50677790 100644
> --- a/net/bluetooth/hci_event.c
> +++ b/net/bluetooth/hci_event.c
> @@ -5538,6 +5538,14 @@ static void hci_le_ext_adv_term_evt(struct hci_dev *hdev, struct sk_buff *skb)
> 
> 	adv = hci_find_adv_instance(hdev, ev->handle);
> 
> +	/* Some chips (e.g. RTL8822CE) emit HCI_ERROR_CANCELLED_BY_HOST. This
> +	 * event is being fired as a result of a hci_cp_le_set_ext_adv_enable
> +	 * disable request, which will have its own callback and cleanup via
> +	 * the hci_cc_le_set_ext_adv_enable path.
> +	 */

I am not in favor of pointing fingers at bad hardware in the source code of core (that belongs in a commit message). Blaming hardware is really up to the drivers. So I would rather phrase it like this:

	/* The Bluetooth Core 5.3 specification clearly states that this event
	 * shall not be sent when the Host disables the advertising set. So in
	 * case of HCI_ERROR_CANCELLED_BY_HOST, just ignore the event.
	 *
	 * When the Host disables an advertising set, all cleanup is done via
	 * its command callback and not needed to be duplicated here.
	 */

> +	if (ev->status == HCI_ERROR_CANCELLED_BY_HOST)
> +		return;
> +

And since this is clearly an implementation issue, the manufactures can issue a firmware fix for this. So lets be verbose and complain about it.

	if (ev->status == HCI_ERRROR..) {
		bt_dev_warn_ratelimited(hdev, “Unexpected advertising set terminated event”);
		return;
	}

> 	if (ev->status) {
> 		if (!adv)
> 			return;
> @@ -5546,6 +5554,10 @@ static void hci_le_ext_adv_term_evt(struct hci_dev *hdev, struct sk_buff *skb)
> 		hci_remove_adv_instance(hdev, ev->handle);
> 		mgmt_advertising_removed(NULL, hdev, ev->handle);
> 
> +		/* If we are no longer advertising, clear HCI_LE_ADV */
> +		if (list_empty(&hdev->adv_instances))
> +			hci_dev_clear_flag(hdev, HCI_LE_ADV);
> +

See comment above why this might be better suited for a separate patch.

Regards

Marcel
Archie Pusaka Nov. 3, 2021, 8:07 a.m. UTC | #2
Hi Marcel,

Thanks for your reply.
I've sent a v2 patch to incorporate your suggestions.

Regards,
Archie

On Tue, 2 Nov 2021 at 22:00, Marcel Holtmann <marcel@holtmann.org> wrote:
>
> Hi Archie,
>
> > This event is received when the controller stops advertising,
> > specifically for these three reasons:
> > (a) Connection is successfully created (success).
> > (b) Timeout is reached (error).
> > (c) Number of advertising events is reached (error).
> > (*) This event is NOT generated when the host stops the advertisement.
> > Refer to the BT spec ver 5.3 vol 4 part E sec 7.7.65.18. Note that the
> > section was revised from BT spec ver 5.0 vol 2 part E sec 7.7.65.18
> > which was ambiguous about (*).
> >
> > Some chips (e.g. RTL8822CE) send this event when the host stops the
> > advertisement with status = HCI_ERROR_CANCELLED_BY_HOST (due to (*)
> > above). This is treated as an error and the advertisement will be
> > removed and userspace will be informed via MGMT event.
> >
> > On suspend, we are supposed to temporarily disable advertisements,
> > and continue advertising on resume. However, due to the behavior
> > above, the advertisements are removed instead.
> >
> > This patch returns early if HCI_ERROR_CANCELLED_BY_HOST is received.
>
> lets include a btmon snippet here to show the faulty behavior.
>
> >
> > Additionally, this patch also clear HCI_LE_ADV if there are no more
> > advertising instances after receiving other errors.
>
> Does this really belong in this patch? I think it warrants a separate patch with an appropriate Fixes: tag. Especially in the case we are working around a firmware bug, this should be separate. It gives us a better chance to bisect anything if we ever have to.
>
> >
> > Signed-off-by: Archie Pusaka <apusaka@chromium.org>
> > Reviewed-by: Alain Michaud <alainm@chromium.org>
> >
> > ---
> >
> > include/net/bluetooth/hci.h |  1 +
> > net/bluetooth/hci_event.c   | 12 ++++++++++++
> > 2 files changed, 13 insertions(+)
> >
> > diff --git a/include/net/bluetooth/hci.h b/include/net/bluetooth/hci.h
> > index 63065bc01b76..84db6b275231 100644
> > --- a/include/net/bluetooth/hci.h
> > +++ b/include/net/bluetooth/hci.h
> > @@ -566,6 +566,7 @@ enum {
> > #define HCI_ERROR_INVALID_LL_PARAMS   0x1e
> > #define HCI_ERROR_UNSPECIFIED         0x1f
> > #define HCI_ERROR_ADVERTISING_TIMEOUT 0x3c
> > +#define HCI_ERROR_CANCELLED_BY_HOST  0x44
> >
> > /* Flow control modes */
> > #define HCI_FLOW_CTL_MODE_PACKET_BASED        0x00
> > diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c
> > index d4b75a6cfeee..150b50677790 100644
> > --- a/net/bluetooth/hci_event.c
> > +++ b/net/bluetooth/hci_event.c
> > @@ -5538,6 +5538,14 @@ static void hci_le_ext_adv_term_evt(struct hci_dev *hdev, struct sk_buff *skb)
> >
> >       adv = hci_find_adv_instance(hdev, ev->handle);
> >
> > +     /* Some chips (e.g. RTL8822CE) emit HCI_ERROR_CANCELLED_BY_HOST. This
> > +      * event is being fired as a result of a hci_cp_le_set_ext_adv_enable
> > +      * disable request, which will have its own callback and cleanup via
> > +      * the hci_cc_le_set_ext_adv_enable path.
> > +      */
>
> I am not in favor of pointing fingers at bad hardware in the source code of core (that belongs in a commit message). Blaming hardware is really up to the drivers. So I would rather phrase it like this:
>
>         /* The Bluetooth Core 5.3 specification clearly states that this event
>          * shall not be sent when the Host disables the advertising set. So in
>          * case of HCI_ERROR_CANCELLED_BY_HOST, just ignore the event.
>          *
>          * When the Host disables an advertising set, all cleanup is done via
>          * its command callback and not needed to be duplicated here.
>          */
>
> > +     if (ev->status == HCI_ERROR_CANCELLED_BY_HOST)
> > +             return;
> > +
>
> And since this is clearly an implementation issue, the manufactures can issue a firmware fix for this. So lets be verbose and complain about it.
>
>         if (ev->status == HCI_ERRROR..) {
>                 bt_dev_warn_ratelimited(hdev, “Unexpected advertising set terminated event”);
>                 return;
>         }
>
> >       if (ev->status) {
> >               if (!adv)
> >                       return;
> > @@ -5546,6 +5554,10 @@ static void hci_le_ext_adv_term_evt(struct hci_dev *hdev, struct sk_buff *skb)
> >               hci_remove_adv_instance(hdev, ev->handle);
> >               mgmt_advertising_removed(NULL, hdev, ev->handle);
> >
> > +             /* If we are no longer advertising, clear HCI_LE_ADV */
> > +             if (list_empty(&hdev->adv_instances))
> > +                     hci_dev_clear_flag(hdev, HCI_LE_ADV);
> > +
>
> See comment above why this might be better suited for a separate patch.
>
> Regards
>
> Marcel
>
diff mbox series

Patch

diff --git a/include/net/bluetooth/hci.h b/include/net/bluetooth/hci.h
index 63065bc01b76..84db6b275231 100644
--- a/include/net/bluetooth/hci.h
+++ b/include/net/bluetooth/hci.h
@@ -566,6 +566,7 @@  enum {
 #define HCI_ERROR_INVALID_LL_PARAMS	0x1e
 #define HCI_ERROR_UNSPECIFIED		0x1f
 #define HCI_ERROR_ADVERTISING_TIMEOUT	0x3c
+#define HCI_ERROR_CANCELLED_BY_HOST	0x44
 
 /* Flow control modes */
 #define HCI_FLOW_CTL_MODE_PACKET_BASED	0x00
diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c
index d4b75a6cfeee..150b50677790 100644
--- a/net/bluetooth/hci_event.c
+++ b/net/bluetooth/hci_event.c
@@ -5538,6 +5538,14 @@  static void hci_le_ext_adv_term_evt(struct hci_dev *hdev, struct sk_buff *skb)
 
 	adv = hci_find_adv_instance(hdev, ev->handle);
 
+	/* Some chips (e.g. RTL8822CE) emit HCI_ERROR_CANCELLED_BY_HOST. This
+	 * event is being fired as a result of a hci_cp_le_set_ext_adv_enable
+	 * disable request, which will have its own callback and cleanup via
+	 * the hci_cc_le_set_ext_adv_enable path.
+	 */
+	if (ev->status == HCI_ERROR_CANCELLED_BY_HOST)
+		return;
+
 	if (ev->status) {
 		if (!adv)
 			return;
@@ -5546,6 +5554,10 @@  static void hci_le_ext_adv_term_evt(struct hci_dev *hdev, struct sk_buff *skb)
 		hci_remove_adv_instance(hdev, ev->handle);
 		mgmt_advertising_removed(NULL, hdev, ev->handle);
 
+		/* If we are no longer advertising, clear HCI_LE_ADV */
+		if (list_empty(&hdev->adv_instances))
+			hci_dev_clear_flag(hdev, HCI_LE_ADV);
+
 		return;
 	}