diff mbox series

[net-next] net/sched: cls_api: fix possible infinite loop in tcf_idr_check_alloc()

Message ID 20240612204610.4137697-1-druth@chromium.org (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Series [net-next] net/sched: cls_api: fix possible infinite loop in tcf_idr_check_alloc()

Checks

Context                         Check    Description
netdev/series_format            success  Single patches do not need cover letters
netdev/tree_selection           success  Clearly marked for net-next
netdev/ynl                      success  Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present            success  Fixes tag not required for -next series
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 852 this patch: 852
netdev/build_tools              success  No tools touched, skip
netdev/cc_maintainers           success  CCed 7 of 7 maintainers
netdev/build_clang              success  Errors and warnings before: 854 this patch: 854
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/deprecated_api           success  None detected
netdev/check_selftest           success  No net selftest shell script
netdev/verify_fixes             success  No Fixes tag
netdev/build_allmodconfig_warn  success  Errors and warnings before: 856 this patch: 856
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 15 lines checked
netdev/build_clang_rust         success  No Rust files in patch. Skipping build
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/source_inline            success  Was 0 now: 0
netdev/contest                  success  net-next-2024-06-13--03-00 (tests: 643)

Commit Message

David Ruth June 12, 2024, 8:46 p.m. UTC
syzbot found hanging tasks waiting on rtnl_lock [1]

When a request to add multiple actions with the same index is sent, the
second request will block forever on the first request. This results in an
infinite loop that holds rtnl_lock, and causes tasks to hang.

Return -EAGAIN to prevent infinite looping, while keeping documented
behavior.
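
The difference between retrying and returning -EAGAIN can be sketched in
a small user-space model. This is only an illustration under stated
assumptions, not the kernel code: the slot table, SLOT_BUSY,
reserve_slot(), lookup_retrying() and lookup_once() are hypothetical
names, and SLOT_BUSY merely stands in for the "reserved the index but did
not assign the pointer yet" state. The retry cap exists only so that the
model terminates.

```c
#include <errno.h>
#include <stddef.h>

#define NSLOTS 8

/* Hypothetical marker for "index reserved, pointer not assigned yet". */
static int busy_marker;
#define SLOT_BUSY ((void *)&busy_marker)

/* Toy replacement for the action index table; NULL means "free". */
static void *slots[NSLOTS];

/* First action of a request: reserve the index without assigning it. */
static void reserve_slot(unsigned int idx)
{
	slots[idx] = SLOT_BUSY;
}

/*
 * Retry-style lookup: try again for as long as the slot is reserved.
 * If the reservation belongs to the same request, nothing else will
 * ever change the slot, so only the artificial cap ends the loop.
 * Returns the number of retries performed.
 */
static long lookup_retrying(unsigned int idx, long cap)
{
	long retries = 0;

	while (slots[idx] == SLOT_BUSY && retries < cap)
		retries++;
	return retries;
}

/*
 * Single-shot lookup: report a reserved slot to the caller.
 * Returns -EAGAIN if reserved, 1 if occupied, 0 if free.
 */
static int lookup_once(unsigned int idx)
{
	void *p = slots[idx];

	if (p == SLOT_BUSY)
		return -EAGAIN;
	return p ? 1 : 0;
}
```

In this model, after reserve_slot(5), a call to lookup_retrying(5, cap)
always runs until it reaches the cap, while lookup_once(5) returns
-EAGAIN immediately and leaves the decision to the caller.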

[1]

INFO: task kworker/1:0:5088 blocked for more than 143 seconds.
Not tainted 6.9.0-rc4-syzkaller-00173-g3cdb45594619 #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/1:0 state:D stack:23744 pid:5088 tgid:5088 ppid:2 flags:0x00004000
Workqueue: events_power_efficient reg_check_chans_work
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5409 [inline]
__schedule+0xf15/0x5d00 kernel/sched/core.c:6746
__schedule_loop kernel/sched/core.c:6823 [inline]
schedule+0xe7/0x350 kernel/sched/core.c:6838
schedule_preempt_disabled+0x13/0x30 kernel/sched/core.c:6895
__mutex_lock_common kernel/locking/mutex.c:684 [inline]
__mutex_lock+0x5b8/0x9c0 kernel/locking/mutex.c:752
wiphy_lock include/net/cfg80211.h:5953 [inline]
reg_leave_invalid_chans net/wireless/reg.c:2466 [inline]
reg_check_chans_work+0x10a/0x10e0 net/wireless/reg.c:2481

Reported-by: syzbot+b87c222546179f4513a7@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=b87c222546179f4513a7
Signed-off-by: David Ruth <druth@chromium.org>
---
 net/sched/act_api.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

Comments

Pedro Tammela June 12, 2024, 9:25 p.m. UTC | #1
On 12/06/2024 17:46, David Ruth wrote:
> syzbot found hanging tasks waiting on rtnl_lock [1]
> 
> When a request to add multiple actions with the same index is sent, the
> second request will block forever on the first request. This results in an
> infinite loop that holds rtnl_lock, and causes tasks to hang.
> 
> Return -EAGAIN to prevent infinite looping, while keeping documented
> behavior.
> 
> [1]
> 
> INFO: task kworker/1:0:5088 blocked for more than 143 seconds.
> Not tainted 6.9.0-rc4-syzkaller-00173-g3cdb45594619 #0
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:kworker/1:0 state:D stack:23744 pid:5088 tgid:5088 ppid:2 flags:0x00004000
> Workqueue: events_power_efficient reg_check_chans_work
> Call Trace:
> <TASK>
> context_switch kernel/sched/core.c:5409 [inline]
> __schedule+0xf15/0x5d00 kernel/sched/core.c:6746
> __schedule_loop kernel/sched/core.c:6823 [inline]
> schedule+0xe7/0x350 kernel/sched/core.c:6838
> schedule_preempt_disabled+0x13/0x30 kernel/sched/core.c:6895
> __mutex_lock_common kernel/locking/mutex.c:684 [inline]
> __mutex_lock+0x5b8/0x9c0 kernel/locking/mutex.c:752
> wiphy_lock include/net/cfg80211.h:5953 [inline]
> reg_leave_invalid_chans net/wireless/reg.c:2466 [inline]
> reg_check_chans_work+0x10a/0x10e0 net/wireless/reg.c:2481
> 
> Reported-by: syzbot+b87c222546179f4513a7@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=b87c222546179f4513a7
> Signed-off-by: David Ruth <druth@chromium.org>

Hi,

Thanks for fixing it.

Syzbot is reproducing in net, so the patch should target the net tree.

Also missing the following tag:
Fixes: 4b55e86736d5 ("net/sched: act_api: rely on rcu in tcf_idr_check_alloc")

> ---
>   net/sched/act_api.c | 3 +--
>   1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/net/sched/act_api.c b/net/sched/act_api.c
> index 7458b3154426..2714c4ed928e 100644
> --- a/net/sched/act_api.c
> +++ b/net/sched/act_api.c
> @@ -830,7 +830,6 @@ int tcf_idr_check_alloc(struct tc_action_net *tn, u32 *index,
>   	u32 max;
>   
>   	if (*index) {
> -again:
>   		rcu_read_lock();
>   		p = idr_find(&idrinfo->action_idr, *index);
>   
> @@ -839,7 +838,7 @@ int tcf_idr_check_alloc(struct tc_action_net *tn, u32 *index,
>   			 * index but did not assign the pointer yet.
>   			 */
>   			rcu_read_unlock();
> -			goto again;
> +			return -EAGAIN;
>   		}
>   
>   		if (!p) {

David Ruth June 13, 2024, 5:25 a.m. UTC | #2
> Hi,
>
> Thanks for fixing it.
>
> Syzbot is reproducing in net, so the patch should target the net tree.

Ack. Will resend to net.

> Also missing the following tag:
> Fixes: 4b55e86736d5 ("net/sched: act_api: rely on rcu in
> tcf_idr_check_alloc")

My understanding is that this issue is significantly older than that
change, and therefore does not fix that change. Should I still apply
that fixes tag?

Jiri Pirko June 13, 2024, 5:35 a.m. UTC | #3
Thu, Jun 13, 2024 at 07:25:32AM CEST, druth@chromium.org wrote:
>> Hi,
>>
>> Thanks for fixing it.
>>
>> Syzbot is reproducing in net, so the patch should target the net tree.
>
>Ack. Will resend to net.
>
>> Also missing the following tag:
>> Fixes: 4b55e86736d5 ("net/sched: act_api: rely on rcu in
>> tcf_idr_check_alloc")
>
>My understanding is that this issue is significantly older than that
>change, and therefore does not fix that change. Should I still apply
>that fixes tag?

So find the right commit that introduced the issue.

Patch

diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index 7458b3154426..2714c4ed928e 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -830,7 +830,6 @@  int tcf_idr_check_alloc(struct tc_action_net *tn, u32 *index,
 	u32 max;
 
 	if (*index) {
-again:
 		rcu_read_lock();
 		p = idr_find(&idrinfo->action_idr, *index);
 
@@ -839,7 +838,7 @@  int tcf_idr_check_alloc(struct tc_action_net *tn, u32 *index,
 			 * index but did not assign the pointer yet.
 			 */
 			rcu_read_unlock();
-			goto again;
+			return -EAGAIN;
 		}
 
 		if (!p) {
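
As a rough illustration of the request-level effect, the following
user-space sketch models one request that names the same index twice.
Everything in it is an assumption made for the example: table, RESERVED,
check_alloc() and add_actions() are hypothetical, and the rollback of the
request's own reservations is just one plausible way for a caller to
react to -EAGAIN. It does not describe how the kernel's callers behave.

```c
#include <errno.h>
#include <stddef.h>

#define NSLOTS 8

/* Hypothetical marker for "index reserved, pointer not assigned yet". */
static int reserved_marker;
#define RESERVED ((void *)&reserved_marker)

/* Toy replacement for the action index table; NULL means "free". */
static void *table[NSLOTS];

/*
 * Model of a check-and-reserve step that does not retry:
 *   0       the index was free and is now reserved for the caller
 *   1       the index is already bound to an object
 *   -EAGAIN the index is reserved but not assigned yet
 */
static int check_alloc(unsigned int idx)
{
	void *p = table[idx];

	if (p == RESERVED)
		return -EAGAIN;
	if (p)
		return 1;
	table[idx] = RESERVED;
	return 0;
}

/*
 * Hypothetical request handler: reserve every index in the request.
 * On error, release what this request reserved so far (safe here
 * because the model is single-threaded) and hand the error back.
 */
static int add_actions(const unsigned int *idx, size_t n)
{
	size_t i, j;
	int ret;

	for (i = 0; i < n; i++) {
		ret = check_alloc(idx[i]);
		if (ret < 0) {
			for (j = 0; j < i; j++)
				if (table[idx[j]] == RESERVED)
					table[idx[j]] = NULL;
			return ret;
		}
	}
	return 0;
}
```

With a request such as { 5, 5 }, the second check_alloc() call sees the
reservation made by the first one and add_actions() returns -EAGAIN
instead of spinning; a request with distinct free indices returns 0.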