[net] can: j1939: fix errant WARN_ON_ONCE in j1939_session_deactivate

Message ID: 20210906094200.95868-1-william.xuanziyang@huawei.com
State: Awaiting Upstream
Delegated to: Netdev Maintainers

Checks

Context                         Check    Description
netdev/cover_letter             success  Link
netdev/fixes_present            success  Link
netdev/patch_count              success  Link
netdev/tree_selection           success  Clearly marked for net
netdev/subject_prefix           success  Link
netdev/cc_maintainers           warning  1 maintainers not CCed: kernel@pengutronix.de
netdev/source_inline            success  Was 0 now: 0
netdev/verify_signedoff         success  Link
netdev/module_param             success  Was 0 now: 0
netdev/build_32bit              success  Errors and warnings before: 0 this patch: 0
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/verify_fixes             success  Link
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 10 lines checked
netdev/build_allmodconfig_warn  success  Errors and warnings before: 0 this patch: 0
netdev/header_inline            success  Link

Commit Message

Ziyang Xuan (William) Sept. 6, 2021, 9:42 a.m. UTC
The claim that "j1939_session_deactivate() should be called with a
session ref-count of at least 2" is incorrect. In some concurrent
scenarios, j1939_session_deactivate() can be called with a session
ref-count below 2. This is harmless, because
j1939_session_deactivate_locked() checks whether the session is still
active before putting it.

The concurrent scenario behind the problem reported by syzbot is shown
here, followed by my reproduction log.

        cpu0                            cpu1
                                j1939_xtp_rx_eoma
j1939_xtp_rx_abort_one
                                j1939_session_get_by_addr [kref == 2]
j1939_session_get_by_addr [kref == 3]
j1939_session_deactivate [kref == 2]
j1939_session_put [kref == 1]
                                j1939_session_completed
                                j1939_session_deactivate
                                WARN_ON_ONCE(kref < 2)

=====================================================
WARNING: CPU: 1 PID: 21 at net/can/j1939/transport.c:1088 j1939_session_deactivate+0x5f/0x70
CPU: 1 PID: 21 Comm: ksoftirqd/1 Not tainted 5.14.0-rc7+ #32
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1 04/01/2014
RIP: 0010:j1939_session_deactivate+0x5f/0x70
Call Trace:
 j1939_session_deactivate_activate_next+0x11/0x28
 j1939_xtp_rx_eoma+0x12a/0x180
 j1939_tp_recv+0x4a2/0x510
 j1939_can_recv+0x226/0x380
 can_rcv_filter+0xf8/0x220
 can_receive+0x102/0x220
 ? process_backlog+0xf0/0x2c0
 can_rcv+0x53/0xf0
 __netif_receive_skb_one_core+0x67/0x90
 ? process_backlog+0x97/0x2c0
 __netif_receive_skb+0x22/0x80

Fixes: 0c71437dd50d ("can: j1939: j1939_session_deactivate(): clarify lifetime of session object")
Reported-by: syzbot+9981a614060dcee6eeca@syzkaller.appspotmail.com
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
---
 net/can/j1939/transport.c | 4 ----
 1 file changed, 4 deletions(-)

Comments

Oleksij Rempel Sept. 8, 2021, 3:46 a.m. UTC | #1
Hi,

Thank you for your patches. Please bear with me, I'll review it at the
end of this week.

On Mon, Sep 06, 2021 at 05:42:00PM +0800, Ziyang Xuan wrote:
> The conclusion "j1939_session_deactivate() should be called with a
> session ref-count of at least 2" is incorrect. In some concurrent
> scenarios, j1939_session_deactivate can be called with the session
> ref-count less than 2. But there is not any problem because it
> will check the session active state before session putting in
> j1939_session_deactivate_locked().
> 
> Here is the concurrent scenario of the problem reported by syzbot
> and my reproduction log.
> 
>         cpu0                            cpu1
>                                 j1939_xtp_rx_eoma
> j1939_xtp_rx_abort_one
>                                 j1939_session_get_by_addr [kref == 2]
> j1939_session_get_by_addr [kref == 3]
> j1939_session_deactivate [kref == 2]
> j1939_session_put [kref == 1]
> 				j1939_session_completed
> 				j1939_session_deactivate
> 				WARN_ON_ONCE(kref < 2)
> 
> =====================================================
> WARNING: CPU: 1 PID: 21 at net/can/j1939/transport.c:1088 j1939_session_deactivate+0x5f/0x70
> CPU: 1 PID: 21 Comm: ksoftirqd/1 Not tainted 5.14.0-rc7+ #32
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1 04/01/2014
> RIP: 0010:j1939_session_deactivate+0x5f/0x70
> Call Trace:
>  j1939_session_deactivate_activate_next+0x11/0x28
>  j1939_xtp_rx_eoma+0x12a/0x180
>  j1939_tp_recv+0x4a2/0x510
>  j1939_can_recv+0x226/0x380
>  can_rcv_filter+0xf8/0x220
>  can_receive+0x102/0x220
>  ? process_backlog+0xf0/0x2c0
>  can_rcv+0x53/0xf0
>  __netif_receive_skb_one_core+0x67/0x90
>  ? process_backlog+0x97/0x2c0
>  __netif_receive_skb+0x22/0x80
> 
> Fixes: 0c71437dd50d ("can: j1939: j1939_session_deactivate(): clarify lifetime of session object")
> Reported-by: syzbot+9981a614060dcee6eeca@syzkaller.appspotmail.com
> Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
> ---
>  net/can/j1939/transport.c | 4 ----
>  1 file changed, 4 deletions(-)
> 
> diff --git a/net/can/j1939/transport.c b/net/can/j1939/transport.c
> index bdc95bd7a851..0f8309314075 100644
> --- a/net/can/j1939/transport.c
> +++ b/net/can/j1939/transport.c
> @@ -1079,10 +1079,6 @@ static bool j1939_session_deactivate(struct j1939_session *session)
>  	bool active;
>  
>  	j1939_session_list_lock(priv);
> -	/* This function should be called with a session ref-count of at
> -	 * least 2.
> -	 */
> -	WARN_ON_ONCE(kref_read(&session->kref) < 2);
>  	active = j1939_session_deactivate_locked(session);
>  	j1939_session_list_unlock(priv);
>  
> -- 
> 2.25.1
> 
>
Oleksij Rempel Sept. 10, 2021, 12:40 p.m. UTC | #2
On Mon, Sep 06, 2021 at 05:42:00PM +0800, Ziyang Xuan wrote:
> The conclusion "j1939_session_deactivate() should be called with a
> session ref-count of at least 2" is incorrect. In some concurrent
> scenarios, j1939_session_deactivate can be called with the session
> ref-count less than 2. But there is not any problem because it
> will check the session active state before session putting in
> j1939_session_deactivate_locked().
> 
> Here is the concurrent scenario of the problem reported by syzbot
> and my reproduction log.
> 
>         cpu0                            cpu1
>                                 j1939_xtp_rx_eoma
> j1939_xtp_rx_abort_one
>                                 j1939_session_get_by_addr [kref == 2]
> j1939_session_get_by_addr [kref == 3]
> j1939_session_deactivate [kref == 2]
> j1939_session_put [kref == 1]
> 				j1939_session_completed
> 				j1939_session_deactivate
> 				WARN_ON_ONCE(kref < 2)
> 

Ok, I see: this warning makes sense only if the session will actually be
deactivated.

Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>

Thank you!

> =====================================================
> WARNING: CPU: 1 PID: 21 at net/can/j1939/transport.c:1088 j1939_session_deactivate+0x5f/0x70
> CPU: 1 PID: 21 Comm: ksoftirqd/1 Not tainted 5.14.0-rc7+ #32
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1 04/01/2014
> RIP: 0010:j1939_session_deactivate+0x5f/0x70
> Call Trace:
>  j1939_session_deactivate_activate_next+0x11/0x28
>  j1939_xtp_rx_eoma+0x12a/0x180
>  j1939_tp_recv+0x4a2/0x510
>  j1939_can_recv+0x226/0x380
>  can_rcv_filter+0xf8/0x220
>  can_receive+0x102/0x220
>  ? process_backlog+0xf0/0x2c0
>  can_rcv+0x53/0xf0
>  __netif_receive_skb_one_core+0x67/0x90
>  ? process_backlog+0x97/0x2c0
>  __netif_receive_skb+0x22/0x80
> 
> Fixes: 0c71437dd50d ("can: j1939: j1939_session_deactivate(): clarify lifetime of session object")
> Reported-by: syzbot+9981a614060dcee6eeca@syzkaller.appspotmail.com
> Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
> ---
>  net/can/j1939/transport.c | 4 ----
>  1 file changed, 4 deletions(-)
> 
> diff --git a/net/can/j1939/transport.c b/net/can/j1939/transport.c
> index bdc95bd7a851..0f8309314075 100644
> --- a/net/can/j1939/transport.c
> +++ b/net/can/j1939/transport.c
> @@ -1079,10 +1079,6 @@ static bool j1939_session_deactivate(struct j1939_session *session)
>  	bool active;
>  
>  	j1939_session_list_lock(priv);
> -	/* This function should be called with a session ref-count of at
> -	 * least 2.
> -	 */
> -	WARN_ON_ONCE(kref_read(&session->kref) < 2);
>  	active = j1939_session_deactivate_locked(session);
>  	j1939_session_list_unlock(priv);
>  
> -- 
> 2.25.1
> 
>
Ziyang Xuan (William) Nov. 10, 2021, 2 a.m. UTC | #3
Hello,

I noticed that the patch has not been applied upstream. Was it missed,
or is there some other problem?

Thank you!

> On Mon, Sep 06, 2021 at 05:42:00PM +0800, Ziyang Xuan wrote:
>> The conclusion "j1939_session_deactivate() should be called with a
>> session ref-count of at least 2" is incorrect. In some concurrent
>> scenarios, j1939_session_deactivate can be called with the session
>> ref-count less than 2. But there is not any problem because it
>> will check the session active state before session putting in
>> j1939_session_deactivate_locked().
>>
>> Here is the concurrent scenario of the problem reported by syzbot
>> and my reproduction log.
>>
>>         cpu0                            cpu1
>>                                 j1939_xtp_rx_eoma
>> j1939_xtp_rx_abort_one
>>                                 j1939_session_get_by_addr [kref == 2]
>> j1939_session_get_by_addr [kref == 3]
>> j1939_session_deactivate [kref == 2]
>> j1939_session_put [kref == 1]
>> 				j1939_session_completed
>> 				j1939_session_deactivate
>> 				WARN_ON_ONCE(kref < 2)
>>
> 
> Ok, I see, this warning makes sense only if session will actually be
> deactivated.
> 
> Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
> 
> Thank you!
> 
>> =====================================================
>> WARNING: CPU: 1 PID: 21 at net/can/j1939/transport.c:1088 j1939_session_deactivate+0x5f/0x70
>> CPU: 1 PID: 21 Comm: ksoftirqd/1 Not tainted 5.14.0-rc7+ #32
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1 04/01/2014
>> RIP: 0010:j1939_session_deactivate+0x5f/0x70
>> Call Trace:
>>  j1939_session_deactivate_activate_next+0x11/0x28
>>  j1939_xtp_rx_eoma+0x12a/0x180
>>  j1939_tp_recv+0x4a2/0x510
>>  j1939_can_recv+0x226/0x380
>>  can_rcv_filter+0xf8/0x220
>>  can_receive+0x102/0x220
>>  ? process_backlog+0xf0/0x2c0
>>  can_rcv+0x53/0xf0
>>  __netif_receive_skb_one_core+0x67/0x90
>>  ? process_backlog+0x97/0x2c0
>>  __netif_receive_skb+0x22/0x80
>>
>> Fixes: 0c71437dd50d ("can: j1939: j1939_session_deactivate(): clarify lifetime of session object")
>> Reported-by: syzbot+9981a614060dcee6eeca@syzkaller.appspotmail.com
>> Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
>> ---
>>  net/can/j1939/transport.c | 4 ----
>>  1 file changed, 4 deletions(-)
>>
>> diff --git a/net/can/j1939/transport.c b/net/can/j1939/transport.c
>> index bdc95bd7a851..0f8309314075 100644
>> --- a/net/can/j1939/transport.c
>> +++ b/net/can/j1939/transport.c
>> @@ -1079,10 +1079,6 @@ static bool j1939_session_deactivate(struct j1939_session *session)
>>  	bool active;
>>  
>>  	j1939_session_list_lock(priv);
>> -	/* This function should be called with a session ref-count of at
>> -	 * least 2.
>> -	 */
>> -	WARN_ON_ONCE(kref_read(&session->kref) < 2);
>>  	active = j1939_session_deactivate_locked(session);
>>  	j1939_session_list_unlock(priv);
>>  
>> -- 
>> 2.25.1
>>
>>
>
Fedor Pchelkin Jan. 14, 2023, 5:35 p.m. UTC | #4
Hello,

On Fri, 10 Sep 2021 14:40:05 +0200, Oleksij Rempel wrote:
> Ok, I see, this warning makes sense only if session will actually be
> deactivated.
>
> Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
>
> Thank you!

As Ziyang Xuan stated, the patch has not been applied upstream.

Usage of WARN_ON_ONCE in this case is actually discouraged: it erroneously
complains in a valid situation.

So the macro should be removed with the aforementioned patch. If it makes
some sense for debugging purposes, WARN_ON_ONCE can be replaced with
netdev_warn/netdev_notice, but either way WARN_ON_ONCE should be dropped.

--
Regards,

Fedor
Patch

diff --git a/net/can/j1939/transport.c b/net/can/j1939/transport.c
index bdc95bd7a851..0f8309314075 100644
--- a/net/can/j1939/transport.c
+++ b/net/can/j1939/transport.c
@@ -1079,10 +1079,6 @@ static bool j1939_session_deactivate(struct j1939_session *session)
 	bool active;
 
 	j1939_session_list_lock(priv);
-	/* This function should be called with a session ref-count of at
-	 * least 2.
-	 */
-	WARN_ON_ONCE(kref_read(&session->kref) < 2);
 	active = j1939_session_deactivate_locked(session);
 	j1939_session_list_unlock(priv);
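
For readers who want to step through the interleaving from the commit
message, here is a minimal user-space sketch of it. It is not kernel code.
struct model_session, model_get(), model_put(), model_deactivate() and
replay_race() are hypothetical names made up for illustration, and the
model assumes that the initial reference is the one dropped on
deactivation and that the put happens only when the session is still
active, as the commit message describes for
j1939_session_deactivate_locked().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct j1939_session: only the reference
 * count and the "still active" state are modelled. */
struct model_session {
	int kref;	/* stands in for session->kref */
	bool active;	/* stands in for the session's active state */
};

static void model_get(struct model_session *s) { s->kref++; }
static void model_put(struct model_session *s) { s->kref--; }

/* Models the behaviour the commit message attributes to
 * j1939_session_deactivate_locked(): a reference is put only if the
 * session is still active. */
static bool model_deactivate(struct model_session *s)
{
	bool was_active = s->active;

	if (was_active) {
		s->active = false;
		model_put(s);
	}
	return was_active;
}

/* Replays the cpu0/cpu1 interleaving sequentially. Returns the count
 * that cpu1 sees on entry to deactivate, i.e. the value the removed
 * WARN_ON_ONCE tested; *final_kref receives the count afterwards. */
static int replay_race(int *final_kref)
{
	/* Assumption: one reference exists before either CPU looks up
	 * the session. */
	struct model_session s = { .kref = 1, .active = true };
	int seen_by_cpu1;

	model_get(&s);		/* cpu1: get_by_addr,        kref == 2 */
	model_get(&s);		/* cpu0: get_by_addr,        kref == 3 */
	model_deactivate(&s);	/* cpu0: session_deactivate, kref == 2 */
	model_put(&s);		/* cpu0: session_put,        kref == 1 */
	assert(!s.active);	/* cpu0 already deactivated the session */

	seen_by_cpu1 = s.kref;	/* below 2, so the old warning would fire */
	model_deactivate(&s);	/* cpu1: session inactive, nothing is put */
	*final_kref = s.kref;	/* cpu1's own reference is still held */
	return seen_by_cpu1;
}
```

In this model cpu1 enters the deactivate step with a count below 2, yet
the count is unchanged afterwards because the session is no longer
active. That is the situation in which the removed check complained
without anything being wrong.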