diff mbox series

[net,v5] team: fix null-ptr-deref when team device type is changed

Message ID 20230918123011.1884401-1-william.xuanziyang@huawei.com (mailing list archive)
State Accepted
Commit 492032760127251e5540a5716a70996bacf2a3fd
Delegated to: Netdev Maintainers
Headers show
Series [net,v5] team: fix null-ptr-deref when team device type is changed | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 1342 this patch: 1342
netdev/cc_maintainers success CCed 6 of 6 maintainers
netdev/build_clang success Errors and warnings before: 1363 this patch: 1363
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 1365 this patch: 1365
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 32 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Ziyang Xuan (William) Sept. 18, 2023, 12:30 p.m. UTC
Get a null-ptr-deref bug as follows with reproducer [1].

BUG: kernel NULL pointer dereference, address: 0000000000000228
...
RIP: 0010:vlan_dev_hard_header+0x35/0x140 [8021q]
...
Call Trace:
 <TASK>
 ? __die+0x24/0x70
 ? page_fault_oops+0x82/0x150
 ? exc_page_fault+0x69/0x150
 ? asm_exc_page_fault+0x26/0x30
 ? vlan_dev_hard_header+0x35/0x140 [8021q]
 ? vlan_dev_hard_header+0x8e/0x140 [8021q]
 neigh_connected_output+0xb2/0x100
 ip6_finish_output2+0x1cb/0x520
 ? nf_hook_slow+0x43/0xc0
 ? ip6_mtu+0x46/0x80
 ip6_finish_output+0x2a/0xb0
 mld_sendpack+0x18f/0x250
 mld_ifc_work+0x39/0x160
 process_one_work+0x1e6/0x3f0
 worker_thread+0x4d/0x2f0
 ? __pfx_worker_thread+0x10/0x10
 kthread+0xe5/0x120
 ? __pfx_kthread+0x10/0x10
 ret_from_fork+0x34/0x50
 ? __pfx_kthread+0x10/0x10
 ret_from_fork_asm+0x1b/0x30

[1]
$ teamd -t team0 -d -c '{"runner": {"name": "loadbalance"}}'
$ ip link add name t-dummy type dummy
$ ip link add link t-dummy name t-dummy.100 type vlan id 100
$ ip link add name t-nlmon type nlmon
$ ip link set t-nlmon master team0
$ ip link set t-nlmon nomaster
$ ip link set t-dummy up
$ ip link set team0 up
$ ip link set t-dummy.100 down
$ ip link set t-dummy.100 master team0

When enslave a vlan device to team device and team device type is changed
from non-ether to ether, header_ops of team device is changed to
vlan_header_ops. That is incorrect and will trigger null-ptr-deref
for vlan->real_dev in vlan_dev_hard_header() because team device is not
a vlan device.

Cache eth_header_ops in team_setup(), then assign cached header_ops to
header_ops of team net device when its type is changed from non-ether
to ether to fix the bug.

Fixes: 1d76efe1577b ("team: add support for non-ethernet devices")
Suggested-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
---
v5:
  - Do not export eth_header_ops, cache it.
v4:
  - Add Reviewed-by tag.
v3:
  - Export eth_header_ops to fix modpost error.
v2:
  - Just modify header_ops to eth_header_ops not use ether_setup().
---
 drivers/net/team/team.c | 10 +++++++++-
 include/linux/if_team.h |  2 ++
 2 files changed, 11 insertions(+), 1 deletion(-)

Comments

Jiri Pirko Sept. 18, 2023, 12:50 p.m. UTC | #1
Mon, Sep 18, 2023 at 02:30:11PM CEST, william.xuanziyang@huawei.com wrote:
>Get a null-ptr-deref bug as follows with reproducer [1].
>
>BUG: kernel NULL pointer dereference, address: 0000000000000228
>...
>RIP: 0010:vlan_dev_hard_header+0x35/0x140 [8021q]
>...
>Call Trace:
> <TASK>
> ? __die+0x24/0x70
> ? page_fault_oops+0x82/0x150
> ? exc_page_fault+0x69/0x150
> ? asm_exc_page_fault+0x26/0x30
> ? vlan_dev_hard_header+0x35/0x140 [8021q]
> ? vlan_dev_hard_header+0x8e/0x140 [8021q]
> neigh_connected_output+0xb2/0x100
> ip6_finish_output2+0x1cb/0x520
> ? nf_hook_slow+0x43/0xc0
> ? ip6_mtu+0x46/0x80
> ip6_finish_output+0x2a/0xb0
> mld_sendpack+0x18f/0x250
> mld_ifc_work+0x39/0x160
> process_one_work+0x1e6/0x3f0
> worker_thread+0x4d/0x2f0
> ? __pfx_worker_thread+0x10/0x10
> kthread+0xe5/0x120
> ? __pfx_kthread+0x10/0x10
> ret_from_fork+0x34/0x50
> ? __pfx_kthread+0x10/0x10
> ret_from_fork_asm+0x1b/0x30
>
>[1]
>$ teamd -t team0 -d -c '{"runner": {"name": "loadbalance"}}'
>$ ip link add name t-dummy type dummy
>$ ip link add link t-dummy name t-dummy.100 type vlan id 100
>$ ip link add name t-nlmon type nlmon
>$ ip link set t-nlmon master team0
>$ ip link set t-nlmon nomaster
>$ ip link set t-dummy up
>$ ip link set team0 up
>$ ip link set t-dummy.100 down
>$ ip link set t-dummy.100 master team0
>
>When enslave a vlan device to team device and team device type is changed
>from non-ether to ether, header_ops of team device is changed to
>vlan_header_ops. That is incorrect and will trigger null-ptr-deref
>for vlan->real_dev in vlan_dev_hard_header() because team device is not
>a vlan device.
>
>Cache eth_header_ops in team_setup(), then assign cached header_ops to
>header_ops of team net device when its type is changed from non-ether
>to ether to fix the bug.
>
>Fixes: 1d76efe1577b ("team: add support for non-ethernet devices")
>Suggested-by: Hangbin Liu <liuhangbin@gmail.com>
>Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
>Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>

Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Eric Dumazet Sept. 18, 2023, 1:04 p.m. UTC | #2
On Mon, Sep 18, 2023 at 2:30 PM Ziyang Xuan
<william.xuanziyang@huawei.com> wrote:
>
> Get a null-ptr-deref bug as follows with reproducer [1].
>
> BUG: kernel NULL pointer dereference, address: 0000000000000228
> ...
> RIP: 0010:vlan_dev_hard_header+0x35/0x140 [8021q]
> ...
> Call Trace:
>  <TASK>
>  ? __die+0x24/0x70
>  ? page_fault_oops+0x82/0x150
>  ? exc_page_fault+0x69/0x150
>  ? asm_exc_page_fault+0x26/0x30
>  ? vlan_dev_hard_header+0x35/0x140 [8021q]
>  ? vlan_dev_hard_header+0x8e/0x140 [8021q]
>  neigh_connected_output+0xb2/0x100
>  ip6_finish_output2+0x1cb/0x520
>  ? nf_hook_slow+0x43/0xc0
>  ? ip6_mtu+0x46/0x80
>  ip6_finish_output+0x2a/0xb0
>  mld_sendpack+0x18f/0x250
>  mld_ifc_work+0x39/0x160
>  process_one_work+0x1e6/0x3f0
>  worker_thread+0x4d/0x2f0
>  ? __pfx_worker_thread+0x10/0x10
>  kthread+0xe5/0x120
>  ? __pfx_kthread+0x10/0x10
>  ret_from_fork+0x34/0x50
>  ? __pfx_kthread+0x10/0x10
>  ret_from_fork_asm+0x1b/0x30
>
>

I am quite sure this will solve some syzbot reports as well, thanks.

Reviewed-by: Eric Dumazet <edumazet@google.com>
patchwork-bot+netdevbpf@kernel.org Sept. 19, 2023, 2:40 p.m. UTC | #3
Hello:

This patch was applied to netdev/net.git (main)
by Paolo Abeni <pabeni@redhat.com>:

On Mon, 18 Sep 2023 20:30:11 +0800 you wrote:
> Get a null-ptr-deref bug as follows with reproducer [1].
> 
> BUG: kernel NULL pointer dereference, address: 0000000000000228
> ...
> RIP: 0010:vlan_dev_hard_header+0x35/0x140 [8021q]
> ...
> Call Trace:
>  <TASK>
>  ? __die+0x24/0x70
>  ? page_fault_oops+0x82/0x150
>  ? exc_page_fault+0x69/0x150
>  ? asm_exc_page_fault+0x26/0x30
>  ? vlan_dev_hard_header+0x35/0x140 [8021q]
>  ? vlan_dev_hard_header+0x8e/0x140 [8021q]
>  neigh_connected_output+0xb2/0x100
>  ip6_finish_output2+0x1cb/0x520
>  ? nf_hook_slow+0x43/0xc0
>  ? ip6_mtu+0x46/0x80
>  ip6_finish_output+0x2a/0xb0
>  mld_sendpack+0x18f/0x250
>  mld_ifc_work+0x39/0x160
>  process_one_work+0x1e6/0x3f0
>  worker_thread+0x4d/0x2f0
>  ? __pfx_worker_thread+0x10/0x10
>  kthread+0xe5/0x120
>  ? __pfx_kthread+0x10/0x10
>  ret_from_fork+0x34/0x50
>  ? __pfx_kthread+0x10/0x10
>  ret_from_fork_asm+0x1b/0x30
> 
> [...]

Here is the summary with links:
  - [net,v5] team: fix null-ptr-deref when team device type is changed
    https://git.kernel.org/netdev/net/c/492032760127

You are awesome, thank you!
diff mbox series

Patch

diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c
index d3dc22509ea5..8e6a87ba85aa 100644
--- a/drivers/net/team/team.c
+++ b/drivers/net/team/team.c
@@ -2127,7 +2127,12 @@  static const struct ethtool_ops team_ethtool_ops = {
 static void team_setup_by_port(struct net_device *dev,
 			       struct net_device *port_dev)
 {
-	dev->header_ops	= port_dev->header_ops;
+	struct team *team = netdev_priv(dev);
+
+	if (port_dev->type == ARPHRD_ETHER)
+		dev->header_ops	= team->header_ops_cache;
+	else
+		dev->header_ops	= port_dev->header_ops;
 	dev->type = port_dev->type;
 	dev->hard_header_len = port_dev->hard_header_len;
 	dev->needed_headroom = port_dev->needed_headroom;
@@ -2174,8 +2179,11 @@  static int team_dev_type_check_change(struct net_device *dev,
 
 static void team_setup(struct net_device *dev)
 {
+	struct team *team = netdev_priv(dev);
+
 	ether_setup(dev);
 	dev->max_mtu = ETH_MAX_MTU;
+	team->header_ops_cache = dev->header_ops;
 
 	dev->netdev_ops = &team_netdev_ops;
 	dev->ethtool_ops = &team_ethtool_ops;
diff --git a/include/linux/if_team.h b/include/linux/if_team.h
index 8de6b6e67829..34bcba5a7067 100644
--- a/include/linux/if_team.h
+++ b/include/linux/if_team.h
@@ -189,6 +189,8 @@  struct team {
 	struct net_device *dev; /* associated netdevice */
 	struct team_pcpu_stats __percpu *pcpu_stats;
 
+	const struct header_ops *header_ops_cache;
+
 	struct mutex lock; /* used for overall locking, e.g. port lists write */
 
 	/*