diff mbox series

[net-next,V2] can: j1939: fix uaf warning in j1939_session_destroy

Message ID tencent_1F473700968236B84AEA74ED76FF67023C09@qq.com (mailing list archive)
State Awaiting Upstream
Delegated to: Netdev Maintainers
Headers show
Series [net-next,V2] can: j1939: fix uaf warning in j1939_session_destroy | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 29 this patch: 29
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers success CCed 10 of 10 maintainers
netdev/build_clang success Errors and warnings before: 29 this patch: 29
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 29 this patch: 29
netdev/checkpatch warning WARNING: The commit message has 'syzkaller', perhaps it also needs a 'Fixes:' tag?
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest warning net-next-2024-08-08--00-00 (tests: 704)

Commit Message

Edward Adam Davis Aug. 7, 2024, 11:08 p.m. UTC
The root cause of this problem is when both of the following conditions
are met simultaneously:
[1] Introduced commit c9c0ee5f20c5, There are following rules:
In debug builds (CONFIG_DEBUG_NET set), the reference count is always
decremented, even when it's 1.

[2] When executing sendmsg, the newly created session did not increase the
skb reference count, only added skb to the session's skb_queue.

The solution is:
When creating a new session, do not add the skb to the skb_queue.
Instead, when using skb, uniformly use j1939_session_skb_queue to add
the skb to the queue and increase the skb reference count through it.

Reported-and-tested-by: syzbot+ad601904231505ad6617@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=ad601904231505ad6617
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
---
 net/can/j1939/socket.c    | 7 ++++---
 net/can/j1939/transport.c | 2 +-
 2 files changed, 5 insertions(+), 4 deletions(-)

Comments

Oleksij Rempel Aug. 8, 2024, 7:49 a.m. UTC | #1
Hi Edward,

On Thu, Aug 08, 2024 at 07:08:49AM +0800, Edward Adam Davis wrote:
> The root cause of this problem is when both of the following conditions
> are met simultaneously:
> [1] Introduced commit c9c0ee5f20c5, There are following rules:
> In debug builds (CONFIG_DEBUG_NET set), the reference count is always
> decremented, even when it's 1.
> 
> [2] When executing sendmsg, the newly created session did not increase the
> skb reference count, only added skb to the session's skb_queue.
> 
> The solution is:
> When creating a new session, do not add the skb to the skb_queue.
> Instead, when using skb, uniformly use j1939_session_skb_queue to add
> the skb to the queue and increase the skb reference count through it.
> 
> Reported-and-tested-by: syzbot+ad601904231505ad6617@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=ad601904231505ad6617
> Signed-off-by: Edward Adam Davis <eadavis@qq.com>

This patch breaks j1939.
The issue can be reproduced by running following commands:
git clone git@github.com:linux-can/can-tests.git
cd can-tests/j1939/
ip link add type vcan
ip l s dev vcan0 up
./run_all.sh vcan0 vcan0


Regards,
Oleksij
Edward Adam Davis Aug. 8, 2024, 11:07 a.m. UTC | #2
On Thu, 8 Aug 2024 09:49:18 +0200, Oleksij Rempel wrote:
> > the skb to the queue and increase the skb reference count through it.
> > 
> > Reported-and-tested-by: syzbot+ad601904231505ad6617@syzkaller.appspotmail.com
> > Closes: https://syzkaller.appspot.com/bug?extid=ad601904231505ad6617
> > Signed-off-by: Edward Adam Davis <eadavis@qq.com>
> 
> This patch breaks j1939.
> The issue can be reproduced by running following commands:
I tried to reproduce the problem using the following command, but was 
unsuccessful. Prompt me to install j1939cat and j1939acd, and there are
some other errors.

Can you share the logs from when you reproduced the problem?
> git clone git@github.com:linux-can/can-tests.git
> cd can-tests/j1939/
> ip link add type vcan
> ip l s dev vcan0 up
> ./run_all.sh vcan0 vcan0

BR,

--
Edward
Marc Kleine-Budde Aug. 8, 2024, 11:57 a.m. UTC | #3
On 08.08.2024 19:07:55, Edward Adam Davis wrote:
> On Thu, 8 Aug 2024 09:49:18 +0200, Oleksij Rempel wrote:
> > > the skb to the queue and increase the skb reference count through it.
> > > 
> > > Reported-and-tested-by: syzbot+ad601904231505ad6617@syzkaller.appspotmail.com
> > > Closes: https://syzkaller.appspot.com/bug?extid=ad601904231505ad6617
> > > Signed-off-by: Edward Adam Davis <eadavis@qq.com>
> > 
> > This patch breaks j1939.
> > The issue can be reproduced by running following commands:
> I tried to reproduce the problem using the following command, but was 
> unsuccessful. Prompt me to install j1939cat and j1939acd, and there are
> some other errors.

The j1939 commands are part of the can-utils package. On Debian
compatible systems it's "sudo apt install can-utils". Or you can build
them from source: https://github.com/linux-can/can-utils

Marc
Sabyrzhan Tasbolatov Oct. 11, 2024, 1:41 p.m. UTC | #4
On Thu, 8 Aug 2024 19:07:55 +0800, Edward Adam Davis wrote:
> On Thu, 8 Aug 2024 09:49:18 +0200, Oleksij Rempel wrote:
> > > the skb to the queue and increase the skb reference count through it.
> > > 
> > > Reported-and-tested-by: syzbot+ad601904231505ad6617@syzkaller.appspotmail.com
> > > Closes: https://syzkaller.appspot.com/bug?extid=ad601904231505ad6617
> > > Signed-off-by: Edward Adam Davis <eadavis@qq.com>
> > 
> > This patch breaks j1939.
> > The issue can be reproduced by running following commands:
> I tried to reproduce the problem using the following command, but was 
> unsuccessful. Prompt me to install j1939cat and j1939acd, and there are
> some other errors.
> 
> Can you share the logs from when you reproduced the problem?

Hello,

Here is the log of can-tests/j1939/run_all.sh:

# ip link add type vcan
# ip l s dev vcan0 up
# ./run_all.sh vcan0 vcan0
##############################################
run: j1939_ac_100k_dual_can.sh
generate random data for the test
1+0 records in
1+0 records out
102400 bytes (102 kB, 100 KiB) copied, 0.00191192 s, 53.6 MB/s
start j1939acd and j1939cat on vcan0
8321
8323
start j1939acd and j1939cat on vcan0
[  132.211317][ T8326] vcan0: tx drop: invalid sa for name 0x0000000011223340
j1939cat: j1939cat_send_one: transfer error: -99: Cannot assign requested address

It fails here:
https://github.com/linux-can/can-tests/blob/master/j1939/j1939_ac_100k_dual_can.sh#L70

The error message is printed in this condition:
https://elixir.bootlin.com/linux/v6.12-rc2/source/net/can/j1939/address-claim.c#L104-L108

I've applied your patch on the current 6.12.0-rc2 and the syzkaller C repro
doesn't trigger WARNING uaf, refcount anymore though.

== Offtopic:
I wonder if can-tests/j1939 should be refactored from shell to C tests in the
same linux-can/can-tests repository (or even migrate to KUnit tests)
to improve debugging, test coverage. I'd like to understand which syscalls
and params are used j1939cat and j1939acd utils -- currently, tracing with
strace and trace-cmd (ftrace).

> > git clone git@github.com:linux-can/can-tests.git
> > cd can-tests/j1939/
> > ip link add type vcan
> > ip l s dev vcan0 up
> > ./run_all.sh vcan0 vcan0
Oleksij Rempel Oct. 11, 2024, 2:10 p.m. UTC | #5
Hi Sabyrzhan,

On Fri, Oct 11, 2024 at 06:41:24PM +0500, Sabyrzhan Tasbolatov wrote:
> On Thu, 8 Aug 2024 19:07:55 +0800, Edward Adam Davis wrote:
> > On Thu, 8 Aug 2024 09:49:18 +0200, Oleksij Rempel wrote:
> > > > the skb to the queue and increase the skb reference count through it.
> > > > 
> > > > Reported-and-tested-by: syzbot+ad601904231505ad6617@syzkaller.appspotmail.com
> > > > Closes: https://syzkaller.appspot.com/bug?extid=ad601904231505ad6617
> > > > Signed-off-by: Edward Adam Davis <eadavis@qq.com>
> > > 
> > > This patch breaks j1939.
> > > The issue can be reproduced by running following commands:
> > I tried to reproduce the problem using the following command, but was 
> > unsuccessful. Prompt me to install j1939cat and j1939acd, and there are
> > some other errors.
> > 
> > Can you share the logs from when you reproduced the problem?
 
ah, i was on vacation and it went under my radar, sorry :(

> Hello,
> 
> Here is the log of can-tests/j1939/run_all.sh:
> 
> # ip link add type vcan
> # ip l s dev vcan0 up
> # ./run_all.sh vcan0 vcan0
> ##############################################
> run: j1939_ac_100k_dual_can.sh
> generate random data for the test
> 1+0 records in
> 1+0 records out
> 102400 bytes (102 kB, 100 KiB) copied, 0.00191192 s, 53.6 MB/s
> start j1939acd and j1939cat on vcan0
> 8321
> 8323
> start j1939acd and j1939cat on vcan0
> [  132.211317][ T8326] vcan0: tx drop: invalid sa for name 0x0000000011223340
> j1939cat: j1939cat_send_one: transfer error: -99: Cannot assign requested address
> 
> It fails here:
> https://github.com/linux-can/can-tests/blob/master/j1939/j1939_ac_100k_dual_can.sh#L70

I assume it is just secondary fail, it probably failed on address claim
stage in j1939acd, so the j1939cat was not able to start transfer due to
missing (not claimed) address.

> The error message is printed in this condition:
> https://elixir.bootlin.com/linux/v6.12-rc2/source/net/can/j1939/address-claim.c#L104-L108
> 
> I've applied your patch on the current 6.12.0-rc2 and the syzkaller C repro
> doesn't trigger WARNING uaf, refcount anymore though.

Yes, because transfer protocol is broken now. 

> == Offtopic:
> I wonder if can-tests/j1939 should be refactored from shell to C tests in the
> same linux-can/can-tests repository (or even migrate to KUnit tests)
> to improve debugging, test coverage. I'd like to understand which syscalls
> and params are used j1939cat and j1939acd utils -- currently, tracing with
> strace and trace-cmd (ftrace).

I have nothing against it, some of them I implemented in C:
https://github.com/linux-can/can-tests/blob/master/j1939/tst-j1939-ac.c#L1160

Right now I do not have enough time to port it, but I can support anyone
who is willing to do it.

Best Regards,
Oleksij
diff mbox series

Patch

diff --git a/net/can/j1939/socket.c b/net/can/j1939/socket.c
index 305dd72c844c..ec78bee1bfa6 100644
--- a/net/can/j1939/socket.c
+++ b/net/can/j1939/socket.c
@@ -1170,10 +1170,11 @@  static int j1939_sk_send_loop(struct j1939_priv *priv,  struct sock *sk,
 					break;
 				}
 			}
-		} else {
-			skcb->offset = session->total_queued_size;
-			j1939_session_skb_queue(session, skb);
 		}
+		/* Session is ready, add it to skb queue and increase ref count.
+		 */
+		skcb->offset = session->total_queued_size;
+		j1939_session_skb_queue(session, skb);
 
 		todo_size -= segment_size;
 		session->total_queued_size += segment_size;
diff --git a/net/can/j1939/transport.c b/net/can/j1939/transport.c
index 4be73de5033c..dd503bc3adb5 100644
--- a/net/can/j1939/transport.c
+++ b/net/can/j1939/transport.c
@@ -1505,7 +1505,6 @@  static struct j1939_session *j1939_session_new(struct j1939_priv *priv,
 	session->state = J1939_SESSION_NEW;
 
 	skb_queue_head_init(&session->skb_queue);
-	skb_queue_tail(&session->skb_queue, skb);
 
 	skcb = j1939_skb_to_cb(skb);
 	memcpy(&session->skcb, skcb, sizeof(session->skcb));
@@ -1548,6 +1547,7 @@  j1939_session *j1939_session_fresh_new(struct j1939_priv *priv,
 		kfree_skb(skb);
 		return NULL;
 	}
+	j1939_session_skb_queue(session, skb);
 
 	/* alloc data area */
 	skb_put(skb, size);