
Bluetooth: use hdev->workqueue when queuing hdev->{cmd,ncmd}_timer works

Message ID 733e6931-aa66-5295-d8a8-49063b7347f1@I-love.SAKURA.ne.jp (mailing list archive)
State Accepted
Commit deee93d13d385103205879a8a0915036ecd83261
Series Bluetooth: use hdev->workqueue when queuing hdev->{cmd,ncmd}_timer works

Checks

Context Check Description
tedd_an/pre-ci_am success Success
tedd_an/checkpatch fail Bluetooth: use hdev->workqueue when queuing hdev->{cmd,ncmd}_timer works (2 GIT_COMMIT_ID errors; full checkpatch output in the test results below)
tedd_an/gitlint success Gitlint PASS
tedd_an/subjectprefix success PASS
tedd_an/buildkernel success Build Kernel PASS
tedd_an/buildkernel32 success Build Kernel32 PASS
tedd_an/incremental_build success Pass
tedd_an/testrunnersetup success Test Runner Setup PASS
tedd_an/testrunnerl2cap-tester success Total: 40, Passed: 40 (100.0%), Failed: 0, Not Run: 0
tedd_an/testrunneriso-tester success Total: 55, Passed: 55 (100.0%), Failed: 0, Not Run: 0
tedd_an/testrunnerbnep-tester success Total: 1, Passed: 1 (100.0%), Failed: 0, Not Run: 0
tedd_an/testrunnermgmt-tester success Total: 494, Passed: 494 (100.0%), Failed: 0, Not Run: 0
tedd_an/testrunnerrfcomm-tester success Total: 10, Passed: 10 (100.0%), Failed: 0, Not Run: 0
tedd_an/testrunnersco-tester success Total: 12, Passed: 12 (100.0%), Failed: 0, Not Run: 0
tedd_an/testrunnersmp-tester success Total: 8, Passed: 8 (100.0%), Failed: 0, Not Run: 0
tedd_an/testrunneruserchan-tester success Total: 4, Passed: 4 (100.0%), Failed: 0, Not Run: 0

Commit Message

Tetsuo Handa Sept. 2, 2022, 11:23 a.m. UTC
syzbot is reporting an attempt to schedule the hdev->cmd_work work from the
system_wq WQ into the hdev->workqueue WQ while the latter is under a draining
operation [1], which commit c8efcc2589464ac7 ("workqueue: allow chained
queueing during destruction") does not allow.

The check introduced by commit 877afadad2dce8aa ("Bluetooth: When HCI work
queue is drained, only queue chained work") was incomplete.

Use the hdev->workqueue WQ when queuing the hdev->{cmd,ncmd}_timer works,
because hci_{cmd,ncmd}_timeout() calls queue_work(hdev->workqueue). Also,
protect the queuing operation with an RCU read lock in order to avoid calling
queue_delayed_work() after cancel_delayed_work() has completed.

Link: https://syzkaller.appspot.com/bug?extid=243b7d89777f90f7613b [1]
Reported-by: syzbot <syzbot+243b7d89777f90f7613b@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Fixes: 877afadad2dce8aa ("Bluetooth: When HCI work queue is drained, only queue chained work")
---
This is a difficult-to-trigger race condition, and therefore a reproducer is
not available. Please check the logic in addition to running automated tests.

 net/bluetooth/hci_core.c  | 15 +++++++++++++--
 net/bluetooth/hci_event.c |  6 ++++--
 2 files changed, 17 insertions(+), 4 deletions(-)

Comments

bluez.test.bot@gmail.com Sept. 2, 2022, noon UTC | #1
This is automated email and please do not reply to this email!

Dear submitter,

Thank you for submitting the patches to the linux bluetooth mailing list.
This is a CI test results with your patch series:
PW Link:https://patchwork.kernel.org/project/bluetooth/list/?series=673604

---Test result---

Test Summary:
CheckPatch                    FAIL      1.66 seconds
GitLint                       PASS      0.82 seconds
SubjectPrefix                 PASS      0.67 seconds
BuildKernel                   PASS      34.07 seconds
BuildKernel32                 PASS      30.21 seconds
Incremental Build with patches PASS      44.33 seconds
TestRunner: Setup             PASS      512.05 seconds
TestRunner: l2cap-tester      PASS      16.75 seconds
TestRunner: iso-tester        PASS      15.58 seconds
TestRunner: bnep-tester       PASS      6.27 seconds
TestRunner: mgmt-tester       PASS      100.77 seconds
TestRunner: rfcomm-tester     PASS      9.46 seconds
TestRunner: sco-tester        PASS      9.39 seconds
TestRunner: smp-tester        PASS      9.49 seconds
TestRunner: userchan-tester   PASS      6.46 seconds

Details
##############################
Test: CheckPatch - FAIL - 1.66 seconds
Run checkpatch.pl script with rule in .checkpatch.conf
Bluetooth: use hdev->workqueue when queuing hdev->{cmd,ncmd}_timer works\ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit fatal: unsaf ("ace/src' is owned by someone else)")'
#66: 
commit c8efcc2589464ac7 ("workqueue: allow chained queueing during
destruction") does not allow such operation.

ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit fatal: unsaf ("ace/src' is owned by someone else)")'
#69: 
The check introduced by commit 877afadad2dce8aa ("Bluetooth: When HCI work
queue is drained, only queue chained work") was incomplete.

total: 2 errors, 0 warnings, 0 checks, 51 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

/github/workspace/src/12964073.patch has style problems, please review.

NOTE: Ignored message types: UNKNOWN_COMMIT_ID

NOTE: If any of the errors are false positives, please report
      them to the maintainer, see CHECKPATCH in MAINTAINERS.




---
Regards,
Linux Bluetooth
Luiz Augusto von Dentz Sept. 2, 2022, 6:45 p.m. UTC | #2
Hi Tetsuo,

On Fri, Sep 2, 2022 at 4:23 AM Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> syzbot is reporting attempt to schedule hdev->cmd_work work from system_wq
> WQ into hdev->workqueue WQ which is under draining operation [1], for
> commit c8efcc2589464ac7 ("workqueue: allow chained queueing during
> destruction") does not allow such operation.
>
> The check introduced by commit 877afadad2dce8aa ("Bluetooth: When HCI work
> queue is drained, only queue chained work") was incomplete.
>
> Use hdev->workqueue WQ when queuing hdev->{cmd,ncmd}_timer works because
> hci_{cmd,ncmd}_timeout() calls queue_work(hdev->workqueue). Also, protect
> the queuing operation with RCU read lock in order to avoid calling
> queue_delayed_work() after cancel_delayed_work() completed.

Didn't we introduce HCI_CMD_DRAIN_WORKQUEUE exactly to avoid queuing
after the cancel pattern? I wonder if it wouldn't be better to introduce
some function that disables/enables the workqueue so we don't have to
do extra tracking in the driver/subsystem?

> Link: https://syzkaller.appspot.com/bug?extid=243b7d89777f90f7613b [1]
> Reported-by: syzbot <syzbot+243b7d89777f90f7613b@syzkaller.appspotmail.com>
> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Fixes: 877afadad2dce8aa ("Bluetooth: When HCI work queue is drained, only queue chained work")
> ---
> This is a difficult to trigger race condition, and therefore reproducer is
> not available. Please do logical check in addition to automated testing.
>
>  net/bluetooth/hci_core.c  | 15 +++++++++++++--
>  net/bluetooth/hci_event.c |  6 ++++--
>  2 files changed, 17 insertions(+), 4 deletions(-)
>
> diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
> index b3a5a3cc9372..9873d2e67988 100644
> --- a/net/bluetooth/hci_core.c
> +++ b/net/bluetooth/hci_core.c
> @@ -597,6 +597,15 @@ static int hci_dev_do_reset(struct hci_dev *hdev)
>
>         /* Cancel these to avoid queueing non-chained pending work */
>         hci_dev_set_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE);
> +       /* Wait for
> +        *
> +        *    if (!hci_dev_test_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE))
> +        *        queue_delayed_work(&hdev->{cmd,ncmd}_timer)
> +        *
> +        * inside RCU section to see the flag or complete scheduling.
> +        */
> +       synchronize_rcu();
> +       /* Explicitly cancel works in case scheduled after setting the flag. */
>         cancel_delayed_work(&hdev->cmd_timer);
>         cancel_delayed_work(&hdev->ncmd_timer);
>
> @@ -4056,12 +4065,14 @@ static void hci_cmd_work(struct work_struct *work)
>                         if (res < 0)
>                                 __hci_cmd_sync_cancel(hdev, -res);
>
> +                       rcu_read_lock();
>                         if (test_bit(HCI_RESET, &hdev->flags) ||
>                             hci_dev_test_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE))
>                                 cancel_delayed_work(&hdev->cmd_timer);
>                         else
> -                               schedule_delayed_work(&hdev->cmd_timer,
> -                                                     HCI_CMD_TIMEOUT);
> +                               queue_delayed_work(hdev->workqueue, &hdev->cmd_timer,
> +                                                  HCI_CMD_TIMEOUT);
> +                       rcu_read_unlock();
>                 } else {
>                         skb_queue_head(&hdev->cmd_q, skb);
>                         queue_work(hdev->workqueue, &hdev->cmd_work);
> diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c
> index 6643c9c20fa4..d6f0e6ca0e7e 100644
> --- a/net/bluetooth/hci_event.c
> +++ b/net/bluetooth/hci_event.c
> @@ -3766,16 +3766,18 @@ static inline void handle_cmd_cnt_and_timer(struct hci_dev *hdev, u8 ncmd)
>  {
>         cancel_delayed_work(&hdev->cmd_timer);
>
> +       rcu_read_lock();
>         if (!test_bit(HCI_RESET, &hdev->flags)) {
>                 if (ncmd) {
>                         cancel_delayed_work(&hdev->ncmd_timer);
>                         atomic_set(&hdev->cmd_cnt, 1);
>                 } else {
>                         if (!hci_dev_test_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE))
> -                               schedule_delayed_work(&hdev->ncmd_timer,
> -                                                     HCI_NCMD_TIMEOUT);
> +                               queue_delayed_work(hdev->workqueue, &hdev->ncmd_timer,
> +                                                  HCI_NCMD_TIMEOUT);
>                 }
>         }
> +       rcu_read_unlock();
>  }
>
>  static u8 hci_cc_le_read_buffer_size_v2(struct hci_dev *hdev, void *data,
> --
> 2.18.4
>
Luiz Augusto von Dentz Sept. 2, 2022, 9:31 p.m. UTC | #3
Hi Tetsuo,

On Fri, Sep 2, 2022 at 11:45 AM Luiz Augusto von Dentz
<luiz.dentz@gmail.com> wrote:
>
> Hi Tetsuo,
>
> On Fri, Sep 2, 2022 at 4:23 AM Tetsuo Handa
> <penguin-kernel@i-love.sakura.ne.jp> wrote:
> >
> > syzbot is reporting attempt to schedule hdev->cmd_work work from system_wq
> > WQ into hdev->workqueue WQ which is under draining operation [1], for
> > commit c8efcc2589464ac7 ("workqueue: allow chained queueing during
> > destruction") does not allow such operation.
> >
> > The check introduced by commit 877afadad2dce8aa ("Bluetooth: When HCI work
> > queue is drained, only queue chained work") was incomplete.
> >
> > Use hdev->workqueue WQ when queuing hdev->{cmd,ncmd}_timer works because
> > hci_{cmd,ncmd}_timeout() calls queue_work(hdev->workqueue). Also, protect
> > the queuing operation with RCU read lock in order to avoid calling
> > queue_delayed_work() after cancel_delayed_work() completed.
>
> Didn't we introduce HCI_CMD_DRAIN_WORKQUEUE exactly to avoid queuing
> after the cancel pattern? I wonder if wouldn't be better to introduce
> some function that disables/enables the workqueue so we don't have to
> do extra tracking in the driver/subsystem?
>
> > Link: https://syzkaller.appspot.com/bug?extid=243b7d89777f90f7613b [1]
> > Reported-by: syzbot <syzbot+243b7d89777f90f7613b@syzkaller.appspotmail.com>
> > Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> > Fixes: 877afadad2dce8aa ("Bluetooth: When HCI work queue is drained, only queue chained work")
> > ---
> > This is a difficult to trigger race condition, and therefore reproducer is
> > not available. Please do logical check in addition to automated testing.
> >
> >  net/bluetooth/hci_core.c  | 15 +++++++++++++--
> >  net/bluetooth/hci_event.c |  6 ++++--
> >  2 files changed, 17 insertions(+), 4 deletions(-)
> >
> > diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
> > index b3a5a3cc9372..9873d2e67988 100644
> > --- a/net/bluetooth/hci_core.c
> > +++ b/net/bluetooth/hci_core.c
> > @@ -597,6 +597,15 @@ static int hci_dev_do_reset(struct hci_dev *hdev)
> >
> >         /* Cancel these to avoid queueing non-chained pending work */
> >         hci_dev_set_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE);
> > +       /* Wait for
> > +        *
> > +        *    if (!hci_dev_test_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE))
> > +        *        queue_delayed_work(&hdev->{cmd,ncmd}_timer)
> > +        *
> > +        * inside RCU section to see the flag or complete scheduling.
> > +        */
> > +       synchronize_rcu();
> > +       /* Explicitly cancel works in case scheduled after setting the flag. */
> >         cancel_delayed_work(&hdev->cmd_timer);
> >         cancel_delayed_work(&hdev->ncmd_timer);
> >
> > @@ -4056,12 +4065,14 @@ static void hci_cmd_work(struct work_struct *work)
> >                         if (res < 0)
> >                                 __hci_cmd_sync_cancel(hdev, -res);
> >
> > +                       rcu_read_lock();
> >                         if (test_bit(HCI_RESET, &hdev->flags) ||
> >                             hci_dev_test_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE))
> >                                 cancel_delayed_work(&hdev->cmd_timer);
> >                         else
> > -                               schedule_delayed_work(&hdev->cmd_timer,
> > -                                                     HCI_CMD_TIMEOUT);
> > +                               queue_delayed_work(hdev->workqueue, &hdev->cmd_timer,
> > +                                                  HCI_CMD_TIMEOUT);
> > +                       rcu_read_unlock();
> >                 } else {
> >                         skb_queue_head(&hdev->cmd_q, skb);
> >                         queue_work(hdev->workqueue, &hdev->cmd_work);
> > diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c
> > index 6643c9c20fa4..d6f0e6ca0e7e 100644
> > --- a/net/bluetooth/hci_event.c
> > +++ b/net/bluetooth/hci_event.c
> > @@ -3766,16 +3766,18 @@ static inline void handle_cmd_cnt_and_timer(struct hci_dev *hdev, u8 ncmd)
> >  {
> >         cancel_delayed_work(&hdev->cmd_timer);
> >
> > +       rcu_read_lock();
> >         if (!test_bit(HCI_RESET, &hdev->flags)) {
> >                 if (ncmd) {
> >                         cancel_delayed_work(&hdev->ncmd_timer);
> >                         atomic_set(&hdev->cmd_cnt, 1);
> >                 } else {
> >                         if (!hci_dev_test_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE))
> > -                               schedule_delayed_work(&hdev->ncmd_timer,
> > -                                                     HCI_NCMD_TIMEOUT);
> > +                               queue_delayed_work(hdev->workqueue, &hdev->ncmd_timer,
> > +                                                  HCI_NCMD_TIMEOUT);
> >                 }
> >         }
> > +       rcu_read_unlock();
> >  }
> >
> >  static u8 hci_cc_le_read_buffer_size_v2(struct hci_dev *hdev, void *data,
> > --
> > 2.18.4
> >

I was thinking on doing something like the following:

https://gist.github.com/Vudentz/a2288015fedbed366fcdb612264a9d16

Since there is no reason to queue any command while we are draining, and
we are going to reset at the end anyway, it is pretty useless to queue
commands at that point.

>
> --
> Luiz Augusto von Dentz
Tetsuo Handa Sept. 3, 2022, 6:49 a.m. UTC | #4
On 2022/09/03 6:31, Luiz Augusto von Dentz wrote:
> Hi Tetsuo,
> 
> On Fri, Sep 2, 2022 at 11:45 AM Luiz Augusto von Dentz <luiz.dentz@gmail.com> wrote:
>> Didn't we introduce HCI_CMD_DRAIN_WORKQUEUE exactly to avoid queuing
>> after the cancel pattern?

HCI_CMD_DRAIN_WORKQUEUE does not help for this case.

What extid=243b7d89777f90f7613b is reporting is

  hci_cmd_timeout() {                             hci_dev_do_reset() {
    starts sleeping due to e.g. preemption
                                                    hci_dev_set_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE); // Sets HCI_CMD_DRAIN_WORKQUEUE flag
                                                    cancel_delayed_work(&hdev->cmd_timer); // does nothing because hci_cmd_timeout() is already running
                                                    cancel_delayed_work(&hdev->ncmd_timer);
                                                    drain_workqueue(hdev->workqueue) {
                                                      sets __WQ_DRAINING flag on hdev->workqueue
                                                      starts waiting for completion of all works on hdev->workqueue
    finishes sleeping due to e.g. preemption
    queue_work(hdev->workqueue,  &hdev->cmd_work) // <= complains attempt to queue work from system_wq into __WQ_DRAINING hdev->workqueue
  }
                                                      finishes waiting for completion of all works on hdev->workqueue
                                                      clears __WQ_DRAINING flag
                                                    }
                                                  }

race condition. Notice that cancel_delayed_work() does not wait for
completion of the already started hci_cmd_timeout() callback.

If you need to wait for completion of an already started callback,
you need to use the _sync version (e.g. cancel_delayed_work_sync()),
and watch out for locking dependencies when using the _sync version.

>>                           I wonder if wouldn't be better to introduce
>> some function that disables/enables the workqueue so we don't have to
>> do extra tracking in the driver/subsystem?
>>
> 
> I was thinking on doing something like the following:
> 
> https://gist.github.com/Vudentz/a2288015fedbed366fcdb612264a9d16

That patch does not close the race, for

@@ -4037,6 +4038,10 @@ static void hci_cmd_work(struct work_struct *work)
        BT_DBG("%s cmd_cnt %d cmd queued %d", hdev->name,
               atomic_read(&hdev->cmd_cnt), skb_queue_len(&hdev->cmd_q));
 
+       /* Don't queue while draining */
+       if (hci_dev_test_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE))
+               return;
        /*
         * BUG: WE ARE FREE TO SLEEP FOR ARBITRARY DURATION IMMEDIATELY AFTER CHECKING THE FLAG.
         * ANY "TEST AND DO SOMETHING" NEEDS TO BE PROTECTED BY A LOCK MECHANISM.
         */
+
        /* Send queued commands */
        if (atomic_read(&hdev->cmd_cnt)) {
                skb = skb_dequeue(&hdev->cmd_q);

. In other words, HCI_CMD_DRAIN_WORKQUEUE does not fix what extid=63bed493aebbf6872647 is reporting.

If "TEST AND DO SOMETHING" does not sleep, RCU is a handy lock mechanism.

> 
> Since there is no reason to queue any command if we are draining and
> are gonna reset at the end it is pretty useless to queue commands at
> that point.

Then, you can add that check.
Luiz Augusto von Dentz Sept. 4, 2022, 2:11 a.m. UTC | #5
Hi Tetsuo,

On Fri, Sep 2, 2022 at 11:49 PM Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> On 2022/09/03 6:31, Luiz Augusto von Dentz wrote:
> > Hi Tetsuo,
> >
> > On Fri, Sep 2, 2022 at 11:45 AM Luiz Augusto von Dentz <luiz.dentz@gmail.com> wrote:
> >> Didn't we introduce HCI_CMD_DRAIN_WORKQUEUE exactly to avoid queuing
> >> after the cancel pattern?
>
> HCI_CMD_DRAIN_WORKQUEUE does not help for this case.
>
> What extid=243b7d89777f90f7613b is reporting is
>
>   hci_cmd_timeout() {                             hci_dev_do_reset() {
>     starts sleeping due to e.g. preemption
>                                                     hci_dev_set_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE); // Sets HCI_CMD_DRAIN_WORKQUEUE flag
>                                                     cancel_delayed_work(&hdev->cmd_timer); // does nothing because hci_cmd_timeout() is already running
>                                                     cancel_delayed_work(&hdev->ncmd_timer);
>                                                     drain_workqueue(hdev->workqueue) {
>                                                       sets __WQ_DRAINING flag on hdev->workqueue
>                                                       starts waiting for completion of all works on hdev->workqueue
>     finishes sleeping due to e.g. preemption
>     queue_work(hdev->workqueue,  &hdev->cmd_work) // <= complains attempt to queue work from system_wq into __WQ_DRAINING hdev->workqueue

And we can check for __WQ_DRAINING? Anyway, checking
HCI_CMD_DRAIN_WORKQUEUE seems useless, so we have to check whether
queue_work() can be used or not.

>   }
>                                                       finishes waiting for completion of all works on hdev->workqueue
>                                                       clears __WQ_DRAINING flag
>                                                     }
>                                                   }
>
> race condition. Notice that cancel_delayed_work() does not wait for
> completion of already started hci_cmd_timeout() callback.
>
> If you need to wait for completion of already started callback,
> you need to use _sync version (e.g. cancel_delayed_work_sync()).
> And watch out for locking dependency when using _sync version.
>
> >>                           I wonder if wouldn't be better to introduce
> >> some function that disables/enables the workqueue so we don't have to
> >> do extra tracking in the driver/subsystem?
> >>
> >
> > I was thinking on doing something like the following:
> >
> > https://gist.github.com/Vudentz/a2288015fedbed366fcdb612264a9d16
>
> That patch does not close race, for
>
> @@ -4037,6 +4038,10 @@ static void hci_cmd_work(struct work_struct *work)
>         BT_DBG("%s cmd_cnt %d cmd queued %d", hdev->name,
>                atomic_read(&hdev->cmd_cnt), skb_queue_len(&hdev->cmd_q));
>
> +       /* Don't queue while draining */
> +       if (hci_dev_test_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE))
> +               return;
>         /*
>          * BUG: WE ARE FREE TO SLEEP FOR ARBITRARY DURATION IMMEDIATELY AFTER CHECKING THE FLAG.
>          * ANY "TEST AND DO SOMETHING" NEEDS TO BE PROTECTED BY A LOCK MECHANISM.
>          */

Then we need a lock, not a flag.

>         /* Send queued commands */
>         if (atomic_read(&hdev->cmd_cnt)) {
>                 skb = skb_dequeue(&hdev->cmd_q);
>
> . In other words, HCI_CMD_DRAIN_WORKQUEUE does not fix what extid=63bed493aebbf6872647 is reporting.
>
> If "TEST AND DO SOMETHING" does not sleep, RCU is a handy lock mechanism.
>
> >
> > Since there is no reason to queue any command if we are draining and
> > are gonna reset at the end it is pretty useless to queue commands at
> > that point.
>
> Then, you can add that check.
>
Tejun Heo Sept. 4, 2022, 2:20 a.m. UTC | #6
Hello,

On Sat, Sep 03, 2022 at 07:11:18PM -0700, Luiz Augusto von Dentz wrote:
> And we can check for __WQ_DRAINING? Anyway checking

Please don't do that. That's an internal flag. It shouldn't be *that*
difficult to avoid this without peeking into wq internal state.

Thanks.
Schspa Shi Sept. 5, 2022, 8:24 a.m. UTC | #7
Tejun Heo <tj@kernel.org> writes:

> Hello,
>
> On Sat, Sep 03, 2022 at 07:11:18PM -0700, Luiz Augusto von Dentz wrote:
>> And we can check for __WQ_DRAINING? Anyway checking
>
> Please don't do that. That's an internal flag. It shouldn't be *that*
> difficult to avoid this without peeking into wq internal state.
>
> Thanks.

It seems we only need to change hdev->{cmd,ncmd}_timer to
hdev->workqueue; there will be no race because drain_workqueue() will
flush all pending work internally.
Any new timeout work will see the HCI_CMD_DRAIN_WORKQUEUE flag after we
cancel and flush all the delayed work.
Tetsuo Handa Sept. 5, 2022, 11:23 a.m. UTC | #8
On 2022/09/05 17:24, Schspa Shi wrote:
> 
> Tejun Heo <tj@kernel.org> writes:
> 
>> Hello,
>>
>> On Sat, Sep 03, 2022 at 07:11:18PM -0700, Luiz Augusto von Dentz wrote:
>>> And we can check for __WQ_DRAINING? Anyway checking
>>
>> Please don't do that. That's an internal flag. It shouldn't be *that*
>> difficult to avoid this without peeking into wq internal state.
>>
>> Thanks.
> 
> It seems we only need to change hdev->{cmd,ncmd}_timer to
> hdev->workqueue, there will be no race because drain_workqueue will
> flush all pending work internally.

True for queue_work(), not always true for queue_delayed_work(). Explained below.

> Any new timeout work will see HCI_CMD_DRAIN_WORKQUEUE flags after we
> cancel and flushed all the delayed work.
> 

If you don't mind calling

  queue_work(&hdev->cmd_work) followed by hci_cmd_work() (case A below)

and/or

  queue_delayed_work(&hdev->ncmd_timer) potentially followed by hci_ncmd_timeout()/hci_reset_dev() (case B and C below)

after observing the HCI_CMD_DRAIN_WORKQUEUE flag, that is fine.
We need to use RCU protection if you mind any of these.



Case A:

hci_dev_do_reset() {
                                      hci_cmd_work() {
                                        if (test_bit(HCI_RESET, &hdev->flags) ||
                                          hci_dev_test_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE))
                                            cancel_delayed_work(&hdev->cmd_timer);
                                          else
                                            queue_delayed_work(hdev->workqueue, &hdev->cmd_timer, HCI_CMD_TIMEOUT);
                                        } else {
                                          skb_queue_head(&hdev->cmd_q, skb);
  hci_dev_set_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE);
  cancel_delayed_work(&hdev->cmd_timer);
  cancel_delayed_work(&hdev->ncmd_timer);
                                          queue_work(hdev->workqueue, &hdev->cmd_work); // Queuing after setting HCI_CMD_DRAIN_WORKQUEUE despite the intent of HCI_CMD_DRAIN_WORKQUEUE...
  drain_workqueue(hdev->workqueue); // Will wait for hci_cmd_timeout() queued by queue_work() to complete.

                                        }

  // Actual flush() happens here.

  hci_dev_clear_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE);
}



Case B:

hci_dev_do_reset() {
                                          handle_cmd_cnt_and_timer(struct hci_dev *hdev, u8 ncmd) {
                                            if (!hci_dev_test_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE))
  hci_dev_set_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE);
  cancel_delayed_work(&hdev->cmd_timer);
                                              queue_delayed_work(hdev->workqueue, &hdev->ncmd_timer, HCI_NCMD_TIMEOUT);
  cancel_delayed_work(&hdev->ncmd_timer); // May or may not cancel hci_ncmd_timeout() queued by queue_delayed_work().
  drain_workqueue(hdev->workqueue); // Will wait for hci_ncmd_timeout() queued by queue_delayed_work() to complete if cancel_delayed_work() failed to cancel.

                                          }

  // Actual flush() happens here.

  hci_dev_clear_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE);
}



Case C:

hci_dev_do_reset() {
                                          handle_cmd_cnt_and_timer(struct hci_dev *hdev, u8 ncmd) {
                                            if (!hci_dev_test_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE))
  hci_dev_set_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE);
  cancel_delayed_work(&hdev->cmd_timer);
  cancel_delayed_work(&hdev->ncmd_timer); // Does nothing.
                                              queue_delayed_work(hdev->workqueue, &hdev->ncmd_timer, HCI_NCMD_TIMEOUT);
  drain_workqueue(hdev->workqueue); // Will wait for hci_ncmd_timeout() queued by queue_delayed_work() to complete if delay timer has expired.

                                          }

  // Actual flush() happens here, but hci_ncmd_timeout() queued by queue_delayed_work() can be running if delay timer has not expired as of calling drain_workqueue().

  hci_dev_clear_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE);
}
Schspa Shi Sept. 5, 2022, 12:26 p.m. UTC | #9
Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> writes:

> On 2022/09/05 17:24, Schspa Shi wrote:
>> 
>> Tejun Heo <tj@kernel.org> writes:
>> 
>>> Hello,
>>>
>>> On Sat, Sep 03, 2022 at 07:11:18PM -0700, Luiz Augusto von Dentz wrote:
>>>> And we can check for __WQ_DRAINING? Anyway checking
>>>
>>> Please don't do that. That's an internal flag. It shouldn't be *that*
>>> difficult to avoid this without peeking into wq internal state.
>>>
>>> Thanks.
>> 
>> It seems we only need to change hdev->{cmd,ncmd}_timer to
>> hdev->workqueue, there will be no race because drain_workqueue will
>> flush all pending work internally.
>
> True for queue_work(), not always true for queue_delayed_work(). Explained below.
>

Ok, you are right, got it now.

>> Any new timeout work will see HCI_CMD_DRAIN_WORKQUEUE flags after we
>> cancel and flushed all the delayed work.
>> 
>
> If you don't mind calling
>
>   queue_work(&hdev->cmd_work) followed by hci_cmd_work() (case A below)
>
> and/or
>
>   queue_delayed_work(&hdev->ncmd_timer) potentially followed by hci_ncmd_timeout()/hci_reset_dev() (case B and C below)
>
> after observing HCI_CMD_DRAIN_WORKQUEUE flag.
> We need to use RCU protection if you mind one of these.
>
>
>
> Case A:
>
> hci_dev_do_reset() {
>                                       hci_cmd_work() {
>                                         if (test_bit(HCI_RESET, &hdev->flags) ||
>                                           hci_dev_test_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE))
>                                             cancel_delayed_work(&hdev->cmd_timer);
>                                           else
>                                             queue_delayed_work(hdev->workqueue, &hdev->cmd_timer, HCI_CMD_TIMEOUT);
>                                         } else {
>                                           skb_queue_head(&hdev->cmd_q, skb);
>   hci_dev_set_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE);
>   cancel_delayed_work(&hdev->cmd_timer);
>   cancel_delayed_work(&hdev->ncmd_timer);
>                                           queue_work(hdev->workqueue, &hdev->cmd_work); // Queuing after setting HCI_CMD_DRAIN_WORKQUEUE despite the intent of HCI_CMD_DRAIN_WORKQUEUE...
>   drain_workqueue(hdev->workqueue); // Will wait for hci_cmd_timeout() queued by queue_work() to complete.
>
>                                         }
>
>   // Actual flush() happens here.
>
>   hci_dev_clear_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE);
> }
>
>
>
> Case B:
>
> hci_dev_do_reset() {
>                                           handle_cmd_cnt_and_timer(struct hci_dev *hdev, u8 ncmd) {
>                                             if (!hci_dev_test_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE))
>   hci_dev_set_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE);
>   cancel_delayed_work(&hdev->cmd_timer);
>                                               queue_delayed_work(hdev->workqueue, &hdev->ncmd_timer, HCI_NCMD_TIMEOUT);
>   cancel_delayed_work(&hdev->ncmd_timer); // May or may not cancel hci_ncmd_timeout() queued by queue_delayed_work().
>   drain_workqueue(hdev->workqueue); // Will wait for hci_ncmd_timeout() queued by queue_delayed_work() to complete if cancel_delayed_work() failed to cancel.
>
>                                           }
>
>   // Actual flush() happens here.
>
>   hci_dev_clear_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE);
> }
>
>
>
> Case C:
>
> hci_dev_do_reset() {
>                                           handle_cmd_cnt_and_timer(struct hci_dev *hdev, u8 ncmd) {
>                                             if (!hci_dev_test_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE))
>   hci_dev_set_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE);
>   cancel_delayed_work(&hdev->cmd_timer);
>   cancel_delayed_work(&hdev->ncmd_timer); // Does nothing.
>                                               queue_delayed_work(hdev->workqueue, &hdev->ncmd_timer, HCI_NCMD_TIMEOUT);
>   drain_workqueue(hdev->workqueue); // Will wait for hci_ncmd_timeout() queued by queue_delayed_work() to complete if delay timer has expired.
>
>                                           }
>
>   // Actual flush() happens here, but hci_ncmd_timeout() queued by queue_delayed_work() can be running if delay timer has not expired as of calling drain_workqueue().
>
>   hci_dev_clear_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE);
> }
patchwork-bot+bluetooth@kernel.org Sept. 19, 2022, 5:30 p.m. UTC | #10
Hello:

This patch was applied to bluetooth/bluetooth-next.git (master)
by Luiz Augusto von Dentz <luiz.von.dentz@intel.com>:

On Fri, 2 Sep 2022 20:23:48 +0900 you wrote:
> syzbot is reporting an attempt to schedule the hdev->cmd_work work from the
> system_wq WQ into the hdev->workqueue WQ while the latter is being drained
> [1], because commit c8efcc2589464ac7 ("workqueue: allow chained queueing
> during destruction") does not allow such an operation.
> 
> The check introduced by commit 877afadad2dce8aa ("Bluetooth: When HCI work
> queue is drained, only queue chained work") was incomplete.
> 
> [...]

Here is the summary with links:
  - Bluetooth: use hdev->workqueue when queuing hdev->{cmd,ncmd}_timer works
    https://git.kernel.org/bluetooth/bluetooth-next/c/deee93d13d38

You are awesome, thank you!

Patch

diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
index b3a5a3cc9372..9873d2e67988 100644
--- a/net/bluetooth/hci_core.c
+++ b/net/bluetooth/hci_core.c
@@ -597,6 +597,15 @@  static int hci_dev_do_reset(struct hci_dev *hdev)
 
 	/* Cancel these to avoid queueing non-chained pending work */
 	hci_dev_set_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE);
+	/* Wait for
+	 *
+	 *    if (!hci_dev_test_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE))
+	 *        queue_delayed_work(&hdev->{cmd,ncmd}_timer)
+	 *
+	 * inside RCU section to see the flag or complete scheduling.
+	 */
+	synchronize_rcu();
+	/* Explicitly cancel works in case scheduled after setting the flag. */
 	cancel_delayed_work(&hdev->cmd_timer);
 	cancel_delayed_work(&hdev->ncmd_timer);
 
@@ -4056,12 +4065,14 @@  static void hci_cmd_work(struct work_struct *work)
 			if (res < 0)
 				__hci_cmd_sync_cancel(hdev, -res);
 
+			rcu_read_lock();
 			if (test_bit(HCI_RESET, &hdev->flags) ||
 			    hci_dev_test_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE))
 				cancel_delayed_work(&hdev->cmd_timer);
 			else
-				schedule_delayed_work(&hdev->cmd_timer,
-						      HCI_CMD_TIMEOUT);
+				queue_delayed_work(hdev->workqueue, &hdev->cmd_timer,
+						   HCI_CMD_TIMEOUT);
+			rcu_read_unlock();
 		} else {
 			skb_queue_head(&hdev->cmd_q, skb);
 			queue_work(hdev->workqueue, &hdev->cmd_work);
diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c
index 6643c9c20fa4..d6f0e6ca0e7e 100644
--- a/net/bluetooth/hci_event.c
+++ b/net/bluetooth/hci_event.c
@@ -3766,16 +3766,18 @@  static inline void handle_cmd_cnt_and_timer(struct hci_dev *hdev, u8 ncmd)
 {
 	cancel_delayed_work(&hdev->cmd_timer);
 
+	rcu_read_lock();
 	if (!test_bit(HCI_RESET, &hdev->flags)) {
 		if (ncmd) {
 			cancel_delayed_work(&hdev->ncmd_timer);
 			atomic_set(&hdev->cmd_cnt, 1);
 		} else {
 			if (!hci_dev_test_flag(hdev, HCI_CMD_DRAIN_WORKQUEUE))
-				schedule_delayed_work(&hdev->ncmd_timer,
-						      HCI_NCMD_TIMEOUT);
+				queue_delayed_work(hdev->workqueue, &hdev->ncmd_timer,
+						   HCI_NCMD_TIMEOUT);
 		}
 	}
+	rcu_read_unlock();
 }
 
 static u8 hci_cc_le_read_buffer_size_v2(struct hci_dev *hdev, void *data,