[v0] idb: Add rtnl_lock to avoid data race

Message ID 20220808081050.25229-1-linma@zju.edu.cn (mailing list archive)
State Awaiting Upstream
Delegated to: Netdev Maintainers

Checks

Context                         Check    Description
netdev/tree_selection           success  Guessed tree name to be net-next
netdev/fixes_present            success  Fixes tag not required for -next series
netdev/subject_prefix           warning  Target tree name not specified in the subject
netdev/cover_letter             success  Single patches do not need cover letters
netdev/patch_count              success  Link
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers           fail     2 blamed authors not CCed: alex.williamson@redhat.com mitch.a.williams@intel.com; 2 maintainers not CCed: alex.williamson@redhat.com mitch.a.williams@intel.com
netdev/build_clang              success  Errors and warnings before: 0 this patch: 0
netdev/module_param             success  Was 0 now: 0
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/check_selftest           success  No net selftest shell script
netdev/verify_fixes             success  Fixes tag looks correct
netdev/build_allmodconfig_warn  success  Errors and warnings before: 0 this patch: 0
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 9 lines checked
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/source_inline            success  Was 0 now: 0

Commit Message

Lin Ma Aug. 8, 2022, 8:10 a.m. UTC
Commit c23d92b80e0b ("igb: Teardown SR-IOV before
unregister_netdev()") places the unregister_netdev() call after the
igb_disable_sriov() call to avoid a functionality issue.

However, it introduces several race conditions when detaching a device.
For example, when .remove() is called, the interleaving below leads to
a use-after-free.

 (FREE from device detaching)      |   (USE from netdev core)
igb_remove                         |  igb_ndo_get_vf_config
 igb_disable_sriov                 |  vf >= adapter->vfs_allocated_count?
  kfree(adapter->vf_data)          |
  adapter->vfs_allocated_count = 0 |
                                   |    memcpy(... adapter->vf_data[vf]

In short, there are data races between reads and writes of
adapter->vfs_allocated_count. One fix would be to add a new lock to
protect the members of the adapter object. However, we can use the
existing rtnl_lock, just as other drivers do (see how
dpaa2_eth_disconnect_mac is protected in the dpaa2_eth_remove
function). This patch adopts a similar fix.

Fixes: c23d92b80e0b ("igb: Teardown SR-IOV before unregister_netdev()")
Signed-off-by: Lin Ma <linma@zju.edu.cn>
---
 drivers/net/ethernet/intel/igb/igb_main.c | 2 ++
 1 file changed, 2 insertions(+)
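
The interleaving described in the commit message can be modelled in user space. The following is a minimal sketch, not igb code: fake_adapter, fake_rtnl, fake_enable_sriov, fake_disable_sriov and fake_get_vf_config are hypothetical names, and a pthread mutex stands in for rtnl_lock. It assumes, as the patch does, that the netdev core holds rtnl while it runs the ndo callback, so taking the same lock around the free closes the window between the bounds check and the memcpy.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical user-space stand-ins; none of these names exist in igb. */
struct vf_entry {
	unsigned char mac[6];
};

struct fake_adapter {
	struct vf_entry *vf_data;
	unsigned int vfs_allocated_count;
};

/* Stand-in for the kernel's rtnl_lock()/rtnl_unlock(). */
static pthread_mutex_t fake_rtnl = PTHREAD_MUTEX_INITIALIZER;

static int fake_enable_sriov(struct fake_adapter *a, unsigned int n)
{
	struct vf_entry *data = calloc(n, sizeof(*data));

	if (!data)
		return -1;

	pthread_mutex_lock(&fake_rtnl);
	a->vf_data = data;
	a->vfs_allocated_count = n;
	pthread_mutex_unlock(&fake_rtnl);
	return 0;
}

/* FREE side: the free and the count reset happen as one locked step. */
static void fake_disable_sriov(struct fake_adapter *a)
{
	pthread_mutex_lock(&fake_rtnl);
	free(a->vf_data);
	a->vf_data = NULL;
	a->vfs_allocated_count = 0;
	pthread_mutex_unlock(&fake_rtnl);
}

/*
 * USE side: the bounds check and the copy happen under the same lock.
 * In the kernel the caller is assumed to hold rtnl already; this
 * sketch takes the lock itself to stay self-contained.
 */
static int fake_get_vf_config(struct fake_adapter *a, unsigned int vf,
			      struct vf_entry *out)
{
	int ret = -1;

	assert(a && out);

	pthread_mutex_lock(&fake_rtnl);
	if (vf < a->vfs_allocated_count) {
		memcpy(out, &a->vf_data[vf], sizeof(*out));
		ret = 0;
	}
	pthread_mutex_unlock(&fake_rtnl);
	return ret;
}
```

With both sides holding the same lock, a reader either sees the old count while vf_data is still valid, or sees a count of 0 and returns an error without touching the freed memory.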

Comments

Jakub Kicinski Aug. 8, 2022, 6:55 p.m. UTC | #1
On Mon,  8 Aug 2022 16:10:50 +0800 Lin Ma wrote:
> Commit c23d92b80e0b ("igb: Teardown SR-IOV before
> unregister_netdev()") places the unregister_netdev() call after the
> igb_disable_sriov() call to avoid a functionality issue.
> 
> However, it introduces several race conditions when detaching a device.
> For example, when .remove() is called, the interleaving below leads to
> a use-after-free.
> 
>  (FREE from device detaching)      |   (USE from netdev core)
> igb_remove                         |  igb_ndo_get_vf_config
>  igb_disable_sriov                 |  vf >= adapter->vfs_allocated_count?
>   kfree(adapter->vf_data)          |
>   adapter->vfs_allocated_count = 0 |
>                                    |    memcpy(... adapter->vf_data[vf]
> 
> In short, there are data races between reads and writes of
> adapter->vfs_allocated_count. One fix would be to add a new lock to
> protect the members of the adapter object. However, we can use the
> existing rtnl_lock, just as other drivers do (see how
> dpaa2_eth_disconnect_mac is protected in the dpaa2_eth_remove
> function). This patch adopts a similar fix.
> 
> Fixes: c23d92b80e0b ("igb: Teardown SR-IOV before unregister_netdev()")
> Signed-off-by: Lin Ma <linma@zju.edu.cn>
> ---
>  drivers/net/ethernet/intel/igb/igb_main.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
> index d8b836a85cc3..e86ea4de05f8 100644
> --- a/drivers/net/ethernet/intel/igb/igb_main.c
> +++ b/drivers/net/ethernet/intel/igb/igb_main.c
> @@ -3814,7 +3814,9 @@ static void igb_remove(struct pci_dev *pdev)
>  	igb_release_hw_control(adapter);
>  
>  #ifdef CONFIG_PCI_IOV
> +	rtnl_lock();
>  	igb_disable_sriov(pdev);
> +	rtnl_unlock();
>  #endif
>  
>  	unregister_netdev(netdev);

What about the disable path coming from sysfs? This looks incomplete to
me. Perhaps take a look at commit 1e53834ce541 ("ixgbe: Add locking to
prevent panic when setting sriov_numvfs to zero") for some inspiration.

Edward Cree Aug. 8, 2022, 8:50 p.m. UTC | #2
s/idb/igb in Subject?

-ed

Lin Ma Aug. 9, 2022, 3:05 a.m. UTC | #3
Hello there,

> 
> What about the disable path coming from sysfs? This looks incomplete to
> me. Perhaps take a look at commit 1e53834ce541 ("ixgbe: Add locking to
> prevent panic when setting sriov_numvfs to zero") for some inspiration.

Thanks for the advice. I have sent a new version of the patch, which uses a new spinlock to avoid races such as the one described in commit 1e53834ce541.

Additionally, I kept the rtnl_lock to eliminate the races that come from the netdev core. Although these could also be handled with the newly added spinlock, I found that taking the spinlock on every access to the VF resources is not trivial.
(If you think that using only the spinlock is better, I will craft a new version of the patch.)

It seems that ixgbe_disable_sriov also suffers from the races from the netdev core mentioned above. If you think the rtnl_lock solution is fine, I will send a patch for that driver too.

Thanks
Lin Ma
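
The concern about the sysfs path is that teardown can be requested from more than one place, so every entry point has to serialize on the same lock. The following is a user-space sketch of that idea only, under stated assumptions: demo_adapter, vfs_lock, demo_init, demo_disable and demo_read_vlan are hypothetical names, a pthread mutex stands in for a kernel spinlock, and nothing here is taken from the igb or ixgbe sources. Disable detaches the pointer and zeroes the count under the lock, then frees afterwards, so a second caller (for example remove after a sysfs write) finds nothing left to tear down.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical per-VF state; the field is illustrative only. */
struct demo_vf {
	int vlan;
};

struct demo_adapter {
	pthread_mutex_t vfs_lock;	/* stand-in for a kernel spinlock_t */
	struct demo_vf *vf_data;
	unsigned int num_vfs;
};

static int demo_init(struct demo_adapter *a, unsigned int n)
{
	if (pthread_mutex_init(&a->vfs_lock, NULL))
		return -1;

	a->vf_data = calloc(n, sizeof(*a->vf_data));
	if (!a->vf_data) {
		pthread_mutex_destroy(&a->vfs_lock);
		return -1;
	}
	a->num_vfs = n;
	return 0;
}

/*
 * Safe to call from several teardown paths. Returns 1 if this call
 * released the VF state, 0 if it had already been released.
 */
static int demo_disable(struct demo_adapter *a)
{
	struct demo_vf *old;

	pthread_mutex_lock(&a->vfs_lock);
	old = a->vf_data;
	a->vf_data = NULL;
	a->num_vfs = 0;
	pthread_mutex_unlock(&a->vfs_lock);

	free(old);	/* no reader can reach 'old' any more */
	return old != NULL;
}

/* Every reader checks the bound and dereferences under the lock. */
static int demo_read_vlan(struct demo_adapter *a, unsigned int vf, int *vlan)
{
	int ret = -1;

	assert(a && vlan);

	pthread_mutex_lock(&a->vfs_lock);
	if (vf < a->num_vfs) {
		*vlan = a->vf_data[vf].vlan;
		ret = 0;
	}
	pthread_mutex_unlock(&a->vfs_lock);
	return ret;
}
```

The cost noted in the reply above is visible here: every access to the VF state needs the lock, which is straightforward in a sketch but touches many call sites in a real driver.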

Patch

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index d8b836a85cc3..e86ea4de05f8 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -3814,7 +3814,9 @@  static void igb_remove(struct pci_dev *pdev)
 	igb_release_hw_control(adapter);
 
 #ifdef CONFIG_PCI_IOV
+	rtnl_lock();
 	igb_disable_sriov(pdev);
+	rtnl_unlock();
 #endif
 
 	unregister_netdev(netdev);