diff mbox series

[net,v2] ipv4: igmp: fix refcnt uaf issue when receiving igmp query packet

Message ID 20231122042936.1831735-1-shaozhengchao@huawei.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series [net,v2] ipv4: igmp: fix refcnt uaf issue when receiving igmp query packet | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/codegen success Generated files up to date
netdev/tree_selection success Clearly marked for net
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 1122 this patch: 1122
netdev/cc_maintainers success CCed 5 of 5 maintainers
netdev/build_clang success Errors and warnings before: 1143 this patch: 1143
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 1149 this patch: 1149
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 12 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

shaozhengchao Nov. 22, 2023, 4:29 a.m. UTC
When I perform the following test operations:
1.ip link add br0 type bridge
2.brctl addif br0 eth0
3.ip addr add 239.0.0.1/32 dev eth0
4.ip addr add 239.0.0.1/32 dev br0
5.ip addr add 224.0.0.1/32 dev br0
6.while ((1))
    do
        ifconfig br0 up
        ifconfig br0 down
    done
7.send IGMPv2 query packets to port eth0 continuously. For example,
./mausezahn ethX -c 0 "01 00 5e 00 00 01 00 72 19 88 aa 02 08 00 45 00 00
1c 00 01 00 00 01 02 0e 7f c0 a8 0a b7 e0 00 00 01 11 64 ee 9b 00 00 00 00"

The preceding tests may trigger the refcnt uaf issue of the mc list. The
stack is as follows:
	refcount_t: addition on 0; use-after-free.
	WARNING: CPU: 21 PID: 144 at lib/refcount.c:25 refcount_warn_saturate+0x78/0x110
	CPU: 21 PID: 144 Comm: ksoftirqd/21 Kdump: loaded Not tainted 6.7.0-rc1-next-20231117-dirty #57
	RIP: 0010:refcount_warn_saturate+0x78/0x110
	Call Trace:
	<TASK>
	__warn+0x83/0x130
	refcount_warn_saturate+0x78/0x110
	igmp_start_timer
	igmp_mod_timer
	igmp_heard_query+0x221/0x690
	igmp_rcv+0xea/0x2f0
	ip_protocol_deliver_rcu+0x156/0x160
	ip_local_deliver_finish+0x77/0xa0
	__netif_receive_skb_one_core+0x8b/0xa0
	netif_receive_skb_internal+0x80/0xd0
	netif_receive_skb+0x18/0xc0
	br_handle_frame_finish+0x340/0x5c0 [bridge]
	nf_hook_bridge_pre+0x117/0x130 [bridge]
	__netif_receive_skb_core+0x241/0x1090
	__netif_receive_skb_list_core+0x13f/0x2e0
	__netif_receive_skb_list+0xfc/0x190
	netif_receive_skb_list_internal+0x102/0x1e0
	napi_gro_receive+0xd7/0x220
	e1000_clean_rx_irq+0x1d4/0x4f0 [e1000]
	e1000_clean+0x5e/0xe0 [e1000]
	__napi_poll+0x2c/0x1b0
	net_rx_action+0x2cb/0x3a0
	__do_softirq+0xcd/0x2a7
	run_ksoftirqd+0x22/0x30
	smpboot_thread_fn+0xdb/0x1d0
	kthread+0xe2/0x110
	ret_from_fork+0x34/0x50
	ret_from_fork_asm+0x1a/0x30
	</TASK>

The root causes are as follows:
Thread A					Thread B
...						netif_receive_skb
br_dev_stop					...
    br_multicast_leave_snoopers			...
        __ip_mc_dec_group			...
            __igmp_group_dropped		igmp_rcv
                igmp_stop_timer			    igmp_heard_query         //ref = 1
                ip_ma_put			        igmp_mod_timer
                    refcount_dec_and_test	            igmp_start_timer //ref = 0
			...                                     refcount_inc //ref increases from 0
When the device receives an IGMPv2 Query message, it starts the timer
immediately, regardless of whether the device is running. If the device is
down and has left the multicast group, it will cause the mc list refcount
uaf issue.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
---
v2: use cmd "cat messages |/root/linux-next/scripts/decode_stacktrace.sh
    /root/linux-next/vmlinux" to get precise stack traces and check whether
    the im is destroyed before timer is started.
---
 net/ipv4/igmp.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

Comments

Eric Dumazet Nov. 22, 2023, 2:13 p.m. UTC | #1
On Wed, Nov 22, 2023 at 5:17 AM Zhengchao Shao <shaozhengchao@huawei.com> wrote:
>
> When I perform the following test operations:
> 1.ip link add br0 type bridge
> 2.brctl addif br0 eth0
> 3.ip addr add 239.0.0.1/32 dev eth0
> 4.ip addr add 239.0.0.1/32 dev br0
> 5.ip addr add 224.0.0.1/32 dev br0
> 6.while ((1))
>     do
>         ifconfig br0 up
>         ifconfig br0 down
>     done
> 7.send IGMPv2 query packets to port eth0 continuously. For example,
> ./mausezahn ethX -c 0 "01 00 5e 00 00 01 00 72 19 88 aa 02 08 00 45 00 00
> 1c 00 01 00 00 01 02 0e 7f c0 a8 0a b7 e0 00 00 01 11 64 ee 9b 00 00 00 00"
>
> The preceding tests may trigger the refcnt uaf issue of the mc list. The
> stack is as follows:
>         refcount_t: addition on 0; use-after-free.
>         WARNING: CPU: 21 PID: 144 at lib/refcount.c:25 refcount_warn_saturate+0x78/0x110
>         CPU: 21 PID: 144 Comm: ksoftirqd/21 Kdump: loaded Not tainted 6.7.0-rc1-next-20231117-dirty #57
>         RIP: 0010:refcount_warn_saturate+0x78/0x110
>         Call Trace:
>         <TASK>
>         __warn+0x83/0x130
>         refcount_warn_saturate+0x78/0x110
>         igmp_start_timer
>         igmp_mod_timer
>         igmp_heard_query+0x221/0x690
>         igmp_rcv+0xea/0x2f0
>         ip_protocol_deliver_rcu+0x156/0x160
>         ip_local_deliver_finish+0x77/0xa0
>         __netif_receive_skb_one_core+0x8b/0xa0
>         netif_receive_skb_internal+0x80/0xd0
>         netif_receive_skb+0x18/0xc0

Yet no symbols...

>         br_handle_frame_finish+0x340/0x5c0 [bridge]
>         nf_hook_bridge_pre+0x117/0x130 [bridge]
>         __netif_receive_skb_core+0x241/0x1090
>         __netif_receive_skb_list_core+0x13f/0x2e0
>         __netif_receive_skb_list+0xfc/0x190
>         netif_receive_skb_list_internal+0x102/0x1e0
>         napi_gro_receive+0xd7/0x220
>         e1000_clean_rx_irq+0x1d4/0x4f0 [e1000]
>         e1000_clean+0x5e/0xe0 [e1000]
>         __napi_poll+0x2c/0x1b0
>         net_rx_action+0x2cb/0x3a0
>         __do_softirq+0xcd/0x2a7
>         run_ksoftirqd+0x22/0x30
>         smpboot_thread_fn+0xdb/0x1d0
>         kthread+0xe2/0x110
>         ret_from_fork+0x34/0x50
>         ret_from_fork_asm+0x1a/0x30
>         </TASK>
>
> The root causes are as follows:
> Thread A                                        Thread B
> ...                                             netif_receive_skb
> br_dev_stop                                     ...
>     br_multicast_leave_snoopers                 ...
>         __ip_mc_dec_group                       ...
>             __igmp_group_dropped                igmp_rcv
>                 igmp_stop_timer                     igmp_heard_query         //ref = 1
>                 ip_ma_put                               igmp_mod_timer
>                     refcount_dec_and_test                   igmp_start_timer //ref = 0
>                         ...                                     refcount_inc //ref increases from 0
> When the device receives an IGMPv2 Query message, it starts the timer
> immediately, regardless of whether the device is running. If the device is
> down and has left the multicast group, it will cause the mc list refcount
> uaf issue.
>
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
> ---
> v2: use cmd "cat messages |/root/linux-next/scripts/decode_stacktrace.sh
>     /root/linux-next/vmlinux" to get precise stack traces and check whether
>     the im is destroyed before timer is started.

I do not think you understood the point I made.

Look at commit 9fce92f050f448a ("mptcp: deal with large GSO size")
for a good example of what a stack trace should look like.


> ---
>  net/ipv4/igmp.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
> index 76c3ea75b8dd..efeeca2b1328 100644
> --- a/net/ipv4/igmp.c
> +++ b/net/ipv4/igmp.c
> @@ -216,8 +216,10 @@ static void igmp_start_timer(struct ip_mc_list *im, int max_delay)
>         int tv = get_random_u32_below(max_delay);
>
>         im->tm_running = 1;
> -       if (!mod_timer(&im->timer, jiffies+tv+2))
> -               refcount_inc(&im->refcnt);
> +       if (refcount_inc_not_zero(&im->refcnt)) {
> +               if (mod_timer(&im->timer, jiffies + tv + 2))
> +                       ip_ma_put(im);
> +       }
>  }
>
>  static void igmp_gq_start_timer(struct in_device *in_dev)
> --
> 2.34.1
>
shaozhengchao Nov. 23, 2023, 1:39 a.m. UTC | #2
On 2023/11/22 22:13, Eric Dumazet wrote:
> On Wed, Nov 22, 2023 at 5:17 AM Zhengchao Shao <shaozhengchao@huawei.com> wrote:
>>
>> When I perform the following test operations:
>> 1.ip link add br0 type bridge
>> 2.brctl addif br0 eth0
>> 3.ip addr add 239.0.0.1/32 dev eth0
>> 4.ip addr add 239.0.0.1/32 dev br0
>> 5.ip addr add 224.0.0.1/32 dev br0
>> 6.while ((1))
>>      do
>>          ifconfig br0 up
>>          ifconfig br0 down
>>      done
>> 7.send IGMPv2 query packets to port eth0 continuously. For example,
>> ./mausezahn ethX -c 0 "01 00 5e 00 00 01 00 72 19 88 aa 02 08 00 45 00 00
>> 1c 00 01 00 00 01 02 0e 7f c0 a8 0a b7 e0 00 00 01 11 64 ee 9b 00 00 00 00"
>>
>> The preceding tests may trigger the refcnt uaf issue of the mc list. The
>> stack is as follows:
>>          refcount_t: addition on 0; use-after-free.
>>          WARNING: CPU: 21 PID: 144 at lib/refcount.c:25 refcount_warn_saturate+0x78/0x110
>>          CPU: 21 PID: 144 Comm: ksoftirqd/21 Kdump: loaded Not tainted 6.7.0-rc1-next-20231117-dirty #57
>>          RIP: 0010:refcount_warn_saturate+0x78/0x110
>>          Call Trace:
>>          <TASK>
>>          __warn+0x83/0x130
>>          refcount_warn_saturate+0x78/0x110
>>          igmp_start_timer
>>          igmp_mod_timer
>>          igmp_heard_query+0x221/0x690
>>          igmp_rcv+0xea/0x2f0
>>          ip_protocol_deliver_rcu+0x156/0x160
>>          ip_local_deliver_finish+0x77/0xa0
>>          __netif_receive_skb_one_core+0x8b/0xa0
>>          netif_receive_skb_internal+0x80/0xd0
>>          netif_receive_skb+0x18/0xc0
> 
> Yet no symbols...
> 
>>          br_handle_frame_finish+0x340/0x5c0 [bridge]
>>          nf_hook_bridge_pre+0x117/0x130 [bridge]
>>          __netif_receive_skb_core+0x241/0x1090
>>          __netif_receive_skb_list_core+0x13f/0x2e0
>>          __netif_receive_skb_list+0xfc/0x190
>>          netif_receive_skb_list_internal+0x102/0x1e0
>>          napi_gro_receive+0xd7/0x220
>>          e1000_clean_rx_irq+0x1d4/0x4f0 [e1000]
>>          e1000_clean+0x5e/0xe0 [e1000]
>>          __napi_poll+0x2c/0x1b0
>>          net_rx_action+0x2cb/0x3a0
>>          __do_softirq+0xcd/0x2a7
>>          run_ksoftirqd+0x22/0x30
>>          smpboot_thread_fn+0xdb/0x1d0
>>          kthread+0xe2/0x110
>>          ret_from_fork+0x34/0x50
>>          ret_from_fork_asm+0x1a/0x30
>>          </TASK>
>>
>> The root causes are as follows:
>> Thread A                                        Thread B
>> ...                                             netif_receive_skb
>> br_dev_stop                                     ...
>>      br_multicast_leave_snoopers                 ...
>>          __ip_mc_dec_group                       ...
>>              __igmp_group_dropped                igmp_rcv
>>                  igmp_stop_timer                     igmp_heard_query         //ref = 1
>>                  ip_ma_put                               igmp_mod_timer
>>                      refcount_dec_and_test                   igmp_start_timer //ref = 0
>>                          ...                                     refcount_inc //ref increases from 0
>> When the device receives an IGMPv2 Query message, it starts the timer
>> immediately, regardless of whether the device is running. If the device is
>> down and has left the multicast group, it will cause the mc list refcount
>> uaf issue.
>>
>> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
>> Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
>> ---
>> v2: use cmd "cat messages |/root/linux-next/scripts/decode_stacktrace.sh
>>      /root/linux-next/vmlinux" to get precise stack traces and check whether
>>      the im is destroyed before timer is started.
> 
> I do not think you understood the point I made.
> 
> Look at commit 9fce92f050f448a ("mptcp: deal with large GSO size")
> for a good example of what a stack trace should look like.
> 

Hi Eric:
	Thank your for your suggestion. I will update the commit
message.

Thanks

Zhengchao Shao
> 
>> ---
>>   net/ipv4/igmp.c | 6 ++++--
>>   1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
>> index 76c3ea75b8dd..efeeca2b1328 100644
>> --- a/net/ipv4/igmp.c
>> +++ b/net/ipv4/igmp.c
>> @@ -216,8 +216,10 @@ static void igmp_start_timer(struct ip_mc_list *im, int max_delay)
>>          int tv = get_random_u32_below(max_delay);
>>
>>          im->tm_running = 1;
>> -       if (!mod_timer(&im->timer, jiffies+tv+2))
>> -               refcount_inc(&im->refcnt);
>> +       if (refcount_inc_not_zero(&im->refcnt)) {
>> +               if (mod_timer(&im->timer, jiffies + tv + 2))
>> +                       ip_ma_put(im);
>> +       }
>>   }
>>
>>   static void igmp_gq_start_timer(struct in_device *in_dev)
>> --
>> 2.34.1
>>
>
diff mbox series

Patch

diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
index 76c3ea75b8dd..efeeca2b1328 100644
--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -216,8 +216,10 @@  static void igmp_start_timer(struct ip_mc_list *im, int max_delay)
 	int tv = get_random_u32_below(max_delay);
 
 	im->tm_running = 1;
-	if (!mod_timer(&im->timer, jiffies+tv+2))
-		refcount_inc(&im->refcnt);
+	if (refcount_inc_not_zero(&im->refcnt)) {
+		if (mod_timer(&im->timer, jiffies + tv + 2))
+			ip_ma_put(im);
+	}
 }
 
 static void igmp_gq_start_timer(struct in_device *in_dev)