Message ID | 20241016093011.318078-1-aleksandr.loktionov@intel.com (mailing list archive) |
---|---|
State | Awaiting Upstream |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [iwl-net,v2] i40e: fix race condition by adding filter's intermediate sync state | expand |
> -----Original Message----- > From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of Aleksandr Loktionov > Sent: 16 October 2024 15:00 > To: intel-wired-lan@lists.osuosl.org; Nguyen, Anthony L <anthony.l.nguyen@intel.com>; Loktionov, Aleksandr <aleksandr.loktionov@intel.com> > Cc: netdev@vger.kernel.org > Subject: [Intel-wired-lan] [PATCH iwl-net v2] i40e: fix race condition by adding filter's intermediate sync state > > Fix a race condition in the i40e driver that leads to MAC/VLAN filters becoming corrupted and leaking. Address the issue that occurs under heavy load when multiple threads are concurrently modifying MAC/VLAN filters by setting mac and port VLAN. > > 1. Thread T0 allocates a filter in i40e_add_filter() within > i40e_ndo_set_vf_port_vlan(). > 2. Thread T1 concurrently frees the filter in __i40e_del_filter() within > i40e_ndo_set_vf_mac(). > 3. Subsequently, i40e_service_task() calls i40e_sync_vsi_filters(), which > refers to the already freed filter memory, causing corruption. > > Reproduction steps: > 1. Spawn multiple VFs. > 2. Apply a concurrent heavy load by running parallel operations to change > MAC addresses on the VFs and change port VLANs on the host. > 3. Observe errors in dmesg: > "Error I40E_AQ_RC_ENOSPC adding RX filters on VF XX, > please set promiscuous on manually for VF XX". > > Exact code for stable reproduction Intel can't open-source now. > > The fix involves implementing a new intermediate filter state, I40E_FILTER_NEW_SYNC, for the time when a filter is on a tmp_add_list. > These filters cannot be deleted from the hash list directly but must be removed using the full process. > > Fixes: 278e7d0b9d68 ("i40e: store MAC/VLAN filters in a hash with the MAC Address as key") > Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> > --- > v1->v2 change commit title, removed RESERVED state byt request in review > --- > drivers/net/ethernet/intel/i40e/i40e.h | 2 ++ > drivers/net/ethernet/intel/i40e/i40e_debugfs.c | 1 + > drivers/net/ethernet/intel/i40e/i40e_main.c | 12 ++++++++++-- > 3 files changed, 13 insertions(+), 2 deletions(-) > Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
On Wed, Oct 16, 2024 at 11:30 AM Aleksandr Loktionov <aleksandr.loktionov@intel.com> wrote: > > Fix a race condition in the i40e driver that leads to MAC/VLAN filters > becoming corrupted and leaking. Address the issue that occurs under > heavy load when multiple threads are concurrently modifying MAC/VLAN > filters by setting mac and port VLAN. > > 1. Thread T0 allocates a filter in i40e_add_filter() within > i40e_ndo_set_vf_port_vlan(). > 2. Thread T1 concurrently frees the filter in __i40e_del_filter() within > i40e_ndo_set_vf_mac(). > 3. Subsequently, i40e_service_task() calls i40e_sync_vsi_filters(), which > refers to the already freed filter memory, causing corruption. I think an important detail is missing from the race description. I am guessing it could happen like this: 1. A thread allocates a filter with i40e_add_filter(). 2. i40e_service_task() calls i40e_sync_vsi_filters(), which adds an entry to its local tmp_add_list referencing the filter. It releases vsi->mac_filter_hash_lock before it processes tmp_add_list. 3. A thread frees the filter in __i40e_del_filter(). This is while holding vsi->mac_filter_hash_lock. 4. The service task processes tmp_add_list and dereferences the already freed filter. Do I understand the race right? Michal > Reproduction steps: > 1. Spawn multiple VFs. > 2. Apply a concurrent heavy load by running parallel operations to change > MAC addresses on the VFs and change port VLANs on the host. > 3. Observe errors in dmesg: > "Error I40E_AQ_RC_ENOSPC adding RX filters on VF XX, > please set promiscuous on manually for VF XX". > > Exact code for stable reproduction Intel can't open-source now. > > The fix involves implementing a new intermediate filter state, > I40E_FILTER_NEW_SYNC, for the time when a filter is on a tmp_add_list. > These filters cannot be deleted from the hash list directly but > must be removed using the full process. > > Fixes: 278e7d0b9d68 ("i40e: store MAC/VLAN filters in a hash with the MAC Address as key") > Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> > --- > v1->v2 change commit title, removed RESERVED state byt request in review > --- > drivers/net/ethernet/intel/i40e/i40e.h | 2 ++ > drivers/net/ethernet/intel/i40e/i40e_debugfs.c | 1 + > drivers/net/ethernet/intel/i40e/i40e_main.c | 12 ++++++++++-- > 3 files changed, 13 insertions(+), 2 deletions(-) > > diff --git a/drivers/net/ethernet/intel/i40e/i40e.h b/drivers/net/ethernet/intel/i40e/i40e.h > index 2089a0e..2e205218 100644 > --- a/drivers/net/ethernet/intel/i40e/i40e.h > +++ b/drivers/net/ethernet/intel/i40e/i40e.h > @@ -755,6 +755,8 @@ enum i40e_filter_state { > I40E_FILTER_ACTIVE, /* Added to switch by FW */ > I40E_FILTER_FAILED, /* Rejected by FW */ > I40E_FILTER_REMOVE, /* To be removed */ > + I40E_FILTER_NEW_SYNC, /* New, not sent yet, is in > + i40e_sync_vsi_filters() */ > /* There is no 'removed' state; the filter struct is freed */ > }; > struct i40e_mac_filter { > diff --git a/drivers/net/ethernet/intel/i40e/i40e_debugfs.c b/drivers/net/ethernet/intel/i40e/i40e_debugfs.c > index abf624d..208c2f0 100644 > --- a/drivers/net/ethernet/intel/i40e/i40e_debugfs.c > +++ b/drivers/net/ethernet/intel/i40e/i40e_debugfs.c > @@ -89,6 +89,7 @@ static char *i40e_filter_state_string[] = { > "ACTIVE", > "FAILED", > "REMOVE", > + "NEW_SYNC", > }; > > /** > diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c > index 25295ae..55fb362 100644 > --- a/drivers/net/ethernet/intel/i40e/i40e_main.c > +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c > @@ -1255,6 +1255,7 @@ int i40e_count_filters(struct i40e_vsi *vsi) > > hash_for_each_safe(vsi->mac_filter_hash, bkt, h, f, hlist) { > if (f->state == I40E_FILTER_NEW || > + f->state == I40E_FILTER_NEW_SYNC || > f->state == I40E_FILTER_ACTIVE) > ++cnt; > } > @@ -1441,6 +1442,8 @@ static int i40e_correct_mac_vlan_filters(struct i40e_vsi *vsi, > > new->f = add_head; > new->state = add_head->state; > + if (add_head->state == I40E_FILTER_NEW) > + add_head->state = I40E_FILTER_NEW_SYNC; > > /* Add the new filter to the tmp list */ > hlist_add_head(&new->hlist, tmp_add_list); > @@ -1550,6 +1553,8 @@ static int i40e_correct_vf_mac_vlan_filters(struct i40e_vsi *vsi, > return -ENOMEM; > new_mac->f = add_head; > new_mac->state = add_head->state; > + if (add_head->state == I40E_FILTER_NEW) > + add_head->state = I40E_FILTER_NEW_SYNC; > > /* Add the new filter to the tmp list */ > hlist_add_head(&new_mac->hlist, tmp_add_list); > @@ -2437,7 +2442,8 @@ static int > i40e_aqc_broadcast_filter(struct i40e_vsi *vsi, const char *vsi_name, > struct i40e_mac_filter *f) > { > - bool enable = f->state == I40E_FILTER_NEW; > + bool enable = f->state == I40E_FILTER_NEW || > + f->state == I40E_FILTER_NEW_SYNC; > struct i40e_hw *hw = &vsi->back->hw; > int aq_ret; > > @@ -2611,6 +2617,7 @@ int i40e_sync_vsi_filters(struct i40e_vsi *vsi) > > /* Add it to the hash list */ > hlist_add_head(&new->hlist, &tmp_add_list); > + f->state = I40E_FILTER_NEW_SYNC; > } > > /* Count the number of active (current and new) VLAN > @@ -2762,7 +2769,8 @@ int i40e_sync_vsi_filters(struct i40e_vsi *vsi) > spin_lock_bh(&vsi->mac_filter_hash_lock); > hlist_for_each_entry_safe(new, h, &tmp_add_list, hlist) { > /* Only update the state if we're still NEW */ > - if (new->f->state == I40E_FILTER_NEW) > + if (new->f->state == I40E_FILTER_NEW || > + new->f->state == I40E_FILTER_NEW_SYNC) > new->f->state = new->state; > hlist_del(&new->hlist); > netdev_hw_addr_refcnt(new->f, vsi->netdev, -1); > -- > 2.25.1 >
> -----Original Message----- > From: Michal Schmidt <mschmidt@redhat.com> > Sent: Wednesday, October 23, 2024 6:34 PM > To: Loktionov, Aleksandr <aleksandr.loktionov@intel.com> > Cc: intel-wired-lan@lists.osuosl.org; Nguyen, Anthony L > <anthony.l.nguyen@intel.com>; netdev@vger.kernel.org > Subject: Re: [Intel-wired-lan] [PATCH iwl-net v2] i40e: fix race > condition by adding filter's intermediate sync state > > On Wed, Oct 16, 2024 at 11:30 AM Aleksandr Loktionov > <aleksandr.loktionov@intel.com> wrote: > > > > Fix a race condition in the i40e driver that leads to MAC/VLAN > filters > > becoming corrupted and leaking. Address the issue that occurs > under > > heavy load when multiple threads are concurrently modifying > MAC/VLAN > > filters by setting mac and port VLAN. > > > > 1. Thread T0 allocates a filter in i40e_add_filter() within > > i40e_ndo_set_vf_port_vlan(). > > 2. Thread T1 concurrently frees the filter in __i40e_del_filter() > within > > i40e_ndo_set_vf_mac(). > > 3. Subsequently, i40e_service_task() calls > i40e_sync_vsi_filters(), which > > refers to the already freed filter memory, causing > corruption. > > I think an important detail is missing from the race description. I > am guessing it could happen like this: > > 1. A thread allocates a filter with i40e_add_filter(). > 2. i40e_service_task() calls i40e_sync_vsi_filters(), which adds an > entry to its local tmp_add_list referencing the filter. It releases > vsi->mac_filter_hash_lock before it processes tmp_add_list. > 3. A thread frees the filter in __i40e_del_filter(). This is while > holding vsi->mac_filter_hash_lock. > 4. The service task processes tmp_add_list and dereferences the > already freed filter. > > Do I understand the race right? > > Michal I think you've got it right. And that I've wrote above. > > > > Reproduction steps: > > 1. Spawn multiple VFs. > > 2. Apply a concurrent heavy load by running parallel operations > to change > > MAC addresses on the VFs and change port VLANs on the > host. > > 3. Observe errors in dmesg: > > "Error I40E_AQ_RC_ENOSPC adding RX filters on VF XX, > > please set promiscuous on manually for VF XX". > > > > Exact code for stable reproduction Intel can't open-source now. > > > > The fix involves implementing a new intermediate filter state, > > I40E_FILTER_NEW_SYNC, for the time when a filter is on a > tmp_add_list. > > These filters cannot be deleted from the hash list directly but > must > > be removed using the full process. > > > > Fixes: 278e7d0b9d68 ("i40e: store MAC/VLAN filters in a hash with > the > > MAC Address as key") > > Signed-off-by: Aleksandr Loktionov > <aleksandr.loktionov@intel.com> > > --- > > v1->v2 change commit title, removed RESERVED state byt request in > > v1->review > > --- > > drivers/net/ethernet/intel/i40e/i40e.h | 2 ++ > > drivers/net/ethernet/intel/i40e/i40e_debugfs.c | 1 + > > drivers/net/ethernet/intel/i40e/i40e_main.c | 12 ++++++++++-- > > 3 files changed, 13 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/net/ethernet/intel/i40e/i40e.h > > b/drivers/net/ethernet/intel/i40e/i40e.h > > index 2089a0e..2e205218 100644 > > --- a/drivers/net/ethernet/intel/i40e/i40e.h > > +++ b/drivers/net/ethernet/intel/i40e/i40e.h > > @@ -755,6 +755,8 @@ enum i40e_filter_state { > > I40E_FILTER_ACTIVE, /* Added to switch by FW > */ > > I40E_FILTER_FAILED, /* Rejected by FW */ > > I40E_FILTER_REMOVE, /* To be removed */ > > + I40E_FILTER_NEW_SYNC, /* New, not sent yet, is > in > > + > i40e_sync_vsi_filters() */ > > /* There is no 'removed' state; the filter struct is freed */ > }; > > struct i40e_mac_filter { diff --git > > a/drivers/net/ethernet/intel/i40e/i40e_debugfs.c > > b/drivers/net/ethernet/intel/i40e/i40e_debugfs.c > > index abf624d..208c2f0 100644 > > --- a/drivers/net/ethernet/intel/i40e/i40e_debugfs.c > > +++ b/drivers/net/ethernet/intel/i40e/i40e_debugfs.c > > @@ -89,6 +89,7 @@ static char *i40e_filter_state_string[] = { > > "ACTIVE", > > "FAILED", > > "REMOVE", > > + "NEW_SYNC", > > }; > > > > /** > > diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c > > b/drivers/net/ethernet/intel/i40e/i40e_main.c > > index 25295ae..55fb362 100644 > > --- a/drivers/net/ethernet/intel/i40e/i40e_main.c > > +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c > > @@ -1255,6 +1255,7 @@ int i40e_count_filters(struct i40e_vsi > *vsi) > > > > hash_for_each_safe(vsi->mac_filter_hash, bkt, h, f, > hlist) { > > if (f->state == I40E_FILTER_NEW || > > + f->state == I40E_FILTER_NEW_SYNC || > > f->state == I40E_FILTER_ACTIVE) > > ++cnt; > > } > > @@ -1441,6 +1442,8 @@ static int > i40e_correct_mac_vlan_filters(struct > > i40e_vsi *vsi, > > > > new->f = add_head; > > new->state = add_head->state; > > + if (add_head->state == I40E_FILTER_NEW) > > + add_head->state = > > + I40E_FILTER_NEW_SYNC; > > > > /* Add the new filter to the tmp list */ > > hlist_add_head(&new->hlist, > tmp_add_list); @@ > > -1550,6 +1553,8 @@ static int > i40e_correct_vf_mac_vlan_filters(struct i40e_vsi *vsi, > > return -ENOMEM; > > new_mac->f = add_head; > > new_mac->state = add_head->state; > > + if (add_head->state == I40E_FILTER_NEW) > > + add_head->state = > > + I40E_FILTER_NEW_SYNC; > > > > /* Add the new filter to the tmp list */ > > hlist_add_head(&new_mac->hlist, > tmp_add_list); > > @@ -2437,7 +2442,8 @@ static int > i40e_aqc_broadcast_filter(struct > > i40e_vsi *vsi, const char *vsi_name, > > struct i40e_mac_filter *f) { > > - bool enable = f->state == I40E_FILTER_NEW; > > + bool enable = f->state == I40E_FILTER_NEW || > > + f->state == I40E_FILTER_NEW_SYNC; > > struct i40e_hw *hw = &vsi->back->hw; > > int aq_ret; > > > > @@ -2611,6 +2617,7 @@ int i40e_sync_vsi_filters(struct i40e_vsi > *vsi) > > > > /* Add it to the hash list */ > > hlist_add_head(&new->hlist, > > &tmp_add_list); > > + f->state = I40E_FILTER_NEW_SYNC; > > } > > > > /* Count the number of active (current > and > > new) VLAN @@ -2762,7 +2769,8 @@ int i40e_sync_vsi_filters(struct > i40e_vsi *vsi) > > spin_lock_bh(&vsi->mac_filter_hash_lock); > > hlist_for_each_entry_safe(new, h, &tmp_add_list, > hlist) { > > /* Only update the state if we're still > NEW */ > > - if (new->f->state == I40E_FILTER_NEW) > > + if (new->f->state == I40E_FILTER_NEW || > > + new->f->state == > I40E_FILTER_NEW_SYNC) > > new->f->state = new->state; > > hlist_del(&new->hlist); > > netdev_hw_addr_refcnt(new->f, vsi- > >netdev, > > -1); > > -- > > 2.25.1 > >
On Wed, Oct 16, 2024 at 11:30 AM Aleksandr Loktionov <aleksandr.loktionov@intel.com> wrote: > > Fix a race condition in the i40e driver that leads to MAC/VLAN filters > becoming corrupted and leaking. Address the issue that occurs under > heavy load when multiple threads are concurrently modifying MAC/VLAN > filters by setting mac and port VLAN. > > 1. Thread T0 allocates a filter in i40e_add_filter() within > i40e_ndo_set_vf_port_vlan(). > 2. Thread T1 concurrently frees the filter in __i40e_del_filter() within > i40e_ndo_set_vf_mac(). > 3. Subsequently, i40e_service_task() calls i40e_sync_vsi_filters(), which > refers to the already freed filter memory, causing corruption. > > Reproduction steps: > 1. Spawn multiple VFs. > 2. Apply a concurrent heavy load by running parallel operations to change > MAC addresses on the VFs and change port VLANs on the host. > 3. Observe errors in dmesg: > "Error I40E_AQ_RC_ENOSPC adding RX filters on VF XX, > please set promiscuous on manually for VF XX". > > Exact code for stable reproduction Intel can't open-source now. I wrote a reproducer that uses Systemtap to enlarge the race window in i40e_sync_vsi_filters(): https://gitlab.com/mschmidt2/repro/-/tree/master/i40e-filters-uaf With KASAN enabled, it looks like this when it triggers: ================================================================== BUG: KASAN: slab-use-after-free in i40e_sync_vsi_filters+0x37ee/0x3c10 [i40e] Read of size 2 at addr ffff888120a43350 by task kworker/29:0/211 CPU: 29 UID: 0 PID: 211 Comm: kworker/29:0 Tainted: G OE 6.11.4-301.fc41.x86_64+debug #1 Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE Hardware name: Abacus electric, s.r.o. - servis@abacus.cz Super Server/H12SSW-iN, BIOS 2.7 10/25/2023 Workqueue: i40e i40e_service_task [i40e] Call Trace: <TASK> dump_stack_lvl+0x84/0xd0 ? i40e_sync_vsi_filters+0x37ee/0x3c10 [i40e] print_report+0x174/0x505 ? i40e_sync_vsi_filters+0x37ee/0x3c10 [i40e] ? srso_alias_return_thunk+0x5/0xfbef5 ? __virt_addr_valid+0x231/0x420 ? i40e_sync_vsi_filters+0x37ee/0x3c10 [i40e] kasan_report+0xab/0x180 ? i40e_sync_vsi_filters+0x37ee/0x3c10 [i40e] i40e_sync_vsi_filters+0x37ee/0x3c10 [i40e] ? __pfx_i40e_sync_vsi_filters+0x10/0x10 [i40e] ? __pfx_register_lock_class+0x10/0x10 ? srso_alias_return_thunk+0x5/0xfbef5 i40e_sync_filters_subtask.part.0+0x1e0/0x260 [i40e] i40e_service_task+0x1b3/0x23a0 [i40e] ? srso_alias_return_thunk+0x5/0xfbef5 ? rethook_hook+0x19/0x90 ? srso_alias_return_thunk+0x5/0xfbef5 ? pre_handler_kretprobe+0xc0/0x140 ? __pfx_i40e_service_task+0x10/0x10 [i40e] ? aggr_pre_handler+0xd2/0x160 ? srso_alias_return_thunk+0x5/0xfbef5 ? kprobe_ftrace_handler+0x371/0x480 ? i40e_service_task+0x9/0x23a0 [i40e] ? srso_alias_return_thunk+0x5/0xfbef5 ? loop_init+0xce/0xff0 [loop] ? __pfx_i40e_service_task+0x10/0x10 [i40e] ? process_one_work+0x860/0x1450 osnoise_arch_unregister+0x210/0x210 ? worker_thread+0xe3/0xfc0 ? __pfx_process_one_work+0x10/0x10 ? srso_alias_return_thunk+0x5/0xfbef5 ? srso_alias_return_thunk+0x5/0xfbef5 ? assign_work+0x16c/0x240 worker_thread+0x5e6/0xfc0 ? __pfx_worker_thread+0x10/0x10 kthread+0x2d5/0x3a0 ? _raw_spin_unlock_irq+0x28/0x60 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x34/0x70 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 </TASK> Allocated by task 2198: kasan_save_stack+0x30/0x50 kasan_save_track+0x14/0x30 __kasan_kmalloc+0x8f/0xa0 i40e_add_filter+0x133/0x4c0 [i40e] i40e_add_vlan_all_mac+0x7e/0x160 [i40e] i40e_ndo_set_vf_port_vlan.cold+0x291/0x363 [i40e] do_setlink+0x1216/0x33e0 __rtnl_newlink+0xb1d/0x1600 rtnl_newlink+0xc0/0x100 rtnetlink_rcv_msg+0x2f6/0xb20 netlink_rcv_skb+0x140/0x3b0 netlink_unicast+0x431/0x720 netlink_sendmsg+0x765/0xc20 ____sys_sendmsg+0x97f/0xc60 ___sys_sendmsg+0xfd/0x180 __sys_sendmsg+0x19c/0x220 do_syscall_64+0x97/0x190 entry_SYSCALL_64_after_hwframe+0x76/0x7e Freed by task 2198: kasan_save_stack+0x30/0x50 kasan_save_track+0x14/0x30 kasan_save_free_info+0x3b/0x70 poison_slab_object+0x109/0x180 __kasan_slab_free+0x14/0x30 kfree+0x11b/0x450 i40e_vsi_release+0x38a/0xbd0 [i40e] i40e_free_vf_res+0x551/0x9e0 [i40e] i40e_cleanup_reset_vf+0x89/0x1620 [i40e] i40e_reset_vf+0x216/0x360 [i40e] i40e_ndo_set_vf_port_vlan+0x4a8/0x850 [i40e] do_setlink+0x1216/0x33e0 __rtnl_newlink+0xb1d/0x1600 rtnl_newlink+0xc0/0x100 rtnetlink_rcv_msg+0x2f6/0xb20 netlink_rcv_skb+0x140/0x3b0 netlink_unicast+0x431/0x720 netlink_sendmsg+0x765/0xc20 ____sys_sendmsg+0x97f/0xc60 ___sys_sendmsg+0xfd/0x180 __sys_sendmsg+0x19c/0x220 do_syscall_64+0x97/0x190 entry_SYSCALL_64_after_hwframe+0x76/0x7e The buggy address belongs to the object at ffff888120a43340 which belongs to the cache kmalloc-rnd-02-32 of size 32 The buggy address is located 16 bytes inside of freed 32-byte region [ffff888120a43340, ffff888120a43360) The buggy address belongs to the physical page: page: refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888120a43a00 pfn:0x120a43 flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff) page_type: 0xfdffffff(slab) raw: 0017ffffc0000000 ffff88810004e800 dead000000000122 0000000000000000 raw: ffff888120a43a00 0000000080400034 00000001fdffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff888120a43200: fa fb fb fb fc fc fc fc fa fb fb fb fc fc fc fc ffff888120a43280: fa fb fb fb fc fc fc fc fa fb fb fb fc fc fc fc >ffff888120a43300: fa fb fb fb fc fc fc fc fa fb fb fb fc fc fc fc ^ ffff888120a43380: fa fb fb fb fc fc fc fc fc fc fc fc fc fc fc fc ffff888120a43400: fa fb fb fb fc fc fc fc fa fb fb fb fc fc fc fc ================================================================== > The fix involves implementing a new intermediate filter state, > I40E_FILTER_NEW_SYNC, for the time when a filter is on a tmp_add_list. > These filters cannot be deleted from the hash list directly but > must be removed using the full process. > > Fixes: 278e7d0b9d68 ("i40e: store MAC/VLAN filters in a hash with the MAC Address as key") > Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Michal Schmidt <mschmidt@redhat.com> Tested-by: Michal Schmidt <mschmidt@redhat.com> Michal
diff --git a/drivers/net/ethernet/intel/i40e/i40e.h b/drivers/net/ethernet/intel/i40e/i40e.h index 2089a0e..2e205218 100644 --- a/drivers/net/ethernet/intel/i40e/i40e.h +++ b/drivers/net/ethernet/intel/i40e/i40e.h @@ -755,6 +755,8 @@ enum i40e_filter_state { I40E_FILTER_ACTIVE, /* Added to switch by FW */ I40E_FILTER_FAILED, /* Rejected by FW */ I40E_FILTER_REMOVE, /* To be removed */ + I40E_FILTER_NEW_SYNC, /* New, not sent yet, is in + i40e_sync_vsi_filters() */ /* There is no 'removed' state; the filter struct is freed */ }; struct i40e_mac_filter { diff --git a/drivers/net/ethernet/intel/i40e/i40e_debugfs.c b/drivers/net/ethernet/intel/i40e/i40e_debugfs.c index abf624d..208c2f0 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_debugfs.c +++ b/drivers/net/ethernet/intel/i40e/i40e_debugfs.c @@ -89,6 +89,7 @@ static char *i40e_filter_state_string[] = { "ACTIVE", "FAILED", "REMOVE", + "NEW_SYNC", }; /** diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c index 25295ae..55fb362 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_main.c +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c @@ -1255,6 +1255,7 @@ int i40e_count_filters(struct i40e_vsi *vsi) hash_for_each_safe(vsi->mac_filter_hash, bkt, h, f, hlist) { if (f->state == I40E_FILTER_NEW || + f->state == I40E_FILTER_NEW_SYNC || f->state == I40E_FILTER_ACTIVE) ++cnt; } @@ -1441,6 +1442,8 @@ static int i40e_correct_mac_vlan_filters(struct i40e_vsi *vsi, new->f = add_head; new->state = add_head->state; + if (add_head->state == I40E_FILTER_NEW) + add_head->state = I40E_FILTER_NEW_SYNC; /* Add the new filter to the tmp list */ hlist_add_head(&new->hlist, tmp_add_list); @@ -1550,6 +1553,8 @@ static int i40e_correct_vf_mac_vlan_filters(struct i40e_vsi *vsi, return -ENOMEM; new_mac->f = add_head; new_mac->state = add_head->state; + if (add_head->state == I40E_FILTER_NEW) + add_head->state = I40E_FILTER_NEW_SYNC; /* Add the new filter to the tmp list */ hlist_add_head(&new_mac->hlist, tmp_add_list); @@ -2437,7 +2442,8 @@ static int i40e_aqc_broadcast_filter(struct i40e_vsi *vsi, const char *vsi_name, struct i40e_mac_filter *f) { - bool enable = f->state == I40E_FILTER_NEW; + bool enable = f->state == I40E_FILTER_NEW || + f->state == I40E_FILTER_NEW_SYNC; struct i40e_hw *hw = &vsi->back->hw; int aq_ret; @@ -2611,6 +2617,7 @@ int i40e_sync_vsi_filters(struct i40e_vsi *vsi) /* Add it to the hash list */ hlist_add_head(&new->hlist, &tmp_add_list); + f->state = I40E_FILTER_NEW_SYNC; } /* Count the number of active (current and new) VLAN @@ -2762,7 +2769,8 @@ int i40e_sync_vsi_filters(struct i40e_vsi *vsi) spin_lock_bh(&vsi->mac_filter_hash_lock); hlist_for_each_entry_safe(new, h, &tmp_add_list, hlist) { /* Only update the state if we're still NEW */ - if (new->f->state == I40E_FILTER_NEW) + if (new->f->state == I40E_FILTER_NEW || + new->f->state == I40E_FILTER_NEW_SYNC) new->f->state = new->state; hlist_del(&new->hlist); netdev_hw_addr_refcnt(new->f, vsi->netdev, -1);
Fix a race condition in the i40e driver that leads to MAC/VLAN filters becoming corrupted and leaking. Address the issue that occurs under heavy load when multiple threads are concurrently modifying MAC/VLAN filters by setting mac and port VLAN. 1. Thread T0 allocates a filter in i40e_add_filter() within i40e_ndo_set_vf_port_vlan(). 2. Thread T1 concurrently frees the filter in __i40e_del_filter() within i40e_ndo_set_vf_mac(). 3. Subsequently, i40e_service_task() calls i40e_sync_vsi_filters(), which refers to the already freed filter memory, causing corruption. Reproduction steps: 1. Spawn multiple VFs. 2. Apply a concurrent heavy load by running parallel operations to change MAC addresses on the VFs and change port VLANs on the host. 3. Observe errors in dmesg: "Error I40E_AQ_RC_ENOSPC adding RX filters on VF XX, please set promiscuous on manually for VF XX". Exact code for stable reproduction Intel can't open-source now. The fix involves implementing a new intermediate filter state, I40E_FILTER_NEW_SYNC, for the time when a filter is on a tmp_add_list. These filters cannot be deleted from the hash list directly but must be removed using the full process. Fixes: 278e7d0b9d68 ("i40e: store MAC/VLAN filters in a hash with the MAC Address as key") Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> --- v1->v2 change commit title, removed RESERVED state byt request in review --- drivers/net/ethernet/intel/i40e/i40e.h | 2 ++ drivers/net/ethernet/intel/i40e/i40e_debugfs.c | 1 + drivers/net/ethernet/intel/i40e/i40e_main.c | 12 ++++++++++-- 3 files changed, 13 insertions(+), 2 deletions(-)