diff mbox series

[net,7/7] net: hns3: fix kernel crash when devlink reload during vf initialization

Message ID 20240422134327.3160587-8-shaojijie@huawei.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series There are some bugfix for the HNS3 ethernet driver | expand

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for net
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 926 this patch: 926
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers fail 2 blamed authors not CCed: huangguangbin2@huawei.com moyufeng@huawei.com; 2 maintainers not CCed: huangguangbin2@huawei.com moyufeng@huawei.com
netdev/build_clang success Errors and warnings before: 937 this patch: 937
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 937 this patch: 937
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 22 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2024-04-22--18-00 (tests: 962)

Commit Message

Jijie Shao April 22, 2024, 1:43 p.m. UTC
From: Yonglong Liu <liuyonglong@huawei.com>

The devlink reload process will access the hardware resources,
but the register operation is done before the hardware is initialized.
So, processing the devlink reload during initialization may lead to kernel
crash. This patch fixes this by taking devl_lock during initialization.

Fixes: cd6242991d2e ("net: hns3: add support for registering devlink for VF")
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
---
 drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Jiri Pirko April 22, 2024, 2:18 p.m. UTC | #1
Mon, Apr 22, 2024 at 03:43:27PM CEST, shaojijie@huawei.com wrote:
>From: Yonglong Liu <liuyonglong@huawei.com>
>
>The devlink reload process will access the hardware resources,
>but the register operation is done before the hardware is initialized.
>So, processing the devlink reload during initialization may lead to kernel
>crash. This patch fixes this by taking devl_lock during initialization.
>
>Fixes: cd6242991d2e ("net: hns3: add support for registering devlink for VF")
>Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
>Signed-off-by: Jijie Shao <shaojijie@huawei.com>
>---
> drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
>diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
>index 08db8e84be4e..3ee41943d15f 100644
>--- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
>+++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
>@@ -2849,6 +2849,8 @@ static int hclgevf_init_hdev(struct hclgevf_dev *hdev)
> 	if (ret)
> 		goto err_devlink_init;
> 

Hmm, what if reload happens here? I think that you don't fix the issue,
rather just narrowing the race window.

Why don't you rather calle devlink_register at the end of this function?


Also, parallel to this patch, why don't you register devlink port of
flavour "virtual" for this VF?


>+	devl_lock(hdev->devlink);
>+
> 	ret = hclge_comm_cmd_queue_init(hdev->pdev, &hdev->hw.hw);
> 	if (ret)
> 		goto err_cmd_queue_init;
>@@ -2950,6 +2952,7 @@ static int hclgevf_init_hdev(struct hclgevf_dev *hdev)
> 	hclgevf_task_schedule(hdev, round_jiffies_relative(HZ));
> 	timer_setup(&hdev->reset_timer, hclgevf_reset_timer, 0);
> 
>+	devl_unlock(hdev->devlink);
> 	return 0;
> 
> err_config:
>@@ -2960,6 +2963,7 @@ static int hclgevf_init_hdev(struct hclgevf_dev *hdev)
> err_cmd_init:
> 	hclge_comm_cmd_uninit(hdev->ae_dev, &hdev->hw.hw);
> err_cmd_queue_init:
>+	devl_unlock(hdev->devlink);
> 	hclgevf_devlink_uninit(hdev);
> err_devlink_init:
> 	hclgevf_pci_uninit(hdev);
>-- 
>2.30.0
>
>
Jiri Pirko April 23, 2024, 1 p.m. UTC | #2
Tue, Apr 23, 2024 at 01:51:21PM CEST, shaojijie@huawei.com wrote:
>
>on 2024/4/22 22:18, Jiri Pirko wrote:
>> Mon, Apr 22, 2024 at 03:43:27PM CEST,shaojijie@huawei.com  wrote:
>> > From: Yonglong Liu<liuyonglong@huawei.com>
>> > 
>> > The devlink reload process will access the hardware resources,
>> > but the register operation is done before the hardware is initialized.
>> > So, processing the devlink reload during initialization may lead to kernel
>> > crash. This patch fixes this by taking devl_lock during initialization.
>> > 
>> > Fixes: cd6242991d2e ("net: hns3: add support for registering devlink for VF")
>> > Signed-off-by: Yonglong Liu<liuyonglong@huawei.com>
>> > Signed-off-by: Jijie Shao<shaojijie@huawei.com>
>> > ---
>> > drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c | 4 ++++
>> > 1 file changed, 4 insertions(+)
>> > 
>> > diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
>> > index 08db8e84be4e..3ee41943d15f 100644
>> > --- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
>> > +++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
>> > @@ -2849,6 +2849,8 @@ static int hclgevf_init_hdev(struct hclgevf_dev *hdev)
>> > 	if (ret)
>> > 		goto err_devlink_init;
>> > 
>> Hmm, what if reload happens here? I think that you don't fix the issue,
>> rather just narrowing the race window.
>
>Yes the issue still occurs.
>
>> Why don't you rather calle devlink_register at the end of this function?
>
>We intended to fix this issue with minimal changes.

I'm not suggesting anything that would mean the opposite.


>But now it seems that we can only call devlink_register at the end of this function

I don't follow. Is that a question? That is what I suggested.


>
>> Also, parallel to this patch, why don't you register devlink port of
>> flavour "virtual" for this VF?
>
>I'm sorry, I'm not sure what "flavour "virtual"" means?

git grep DEVLINK_PORT_FLAVOUR_VIRTUAL


>And How can we use it? If it is useful, I can send another patch to optimize the code.

Please do.


>
>Tnanks,
>Jijie Shao
diff mbox series

Patch

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
index 08db8e84be4e..3ee41943d15f 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
@@ -2849,6 +2849,8 @@  static int hclgevf_init_hdev(struct hclgevf_dev *hdev)
 	if (ret)
 		goto err_devlink_init;
 
+	devl_lock(hdev->devlink);
+
 	ret = hclge_comm_cmd_queue_init(hdev->pdev, &hdev->hw.hw);
 	if (ret)
 		goto err_cmd_queue_init;
@@ -2950,6 +2952,7 @@  static int hclgevf_init_hdev(struct hclgevf_dev *hdev)
 	hclgevf_task_schedule(hdev, round_jiffies_relative(HZ));
 	timer_setup(&hdev->reset_timer, hclgevf_reset_timer, 0);
 
+	devl_unlock(hdev->devlink);
 	return 0;
 
 err_config:
@@ -2960,6 +2963,7 @@  static int hclgevf_init_hdev(struct hclgevf_dev *hdev)
 err_cmd_init:
 	hclge_comm_cmd_uninit(hdev->ae_dev, &hdev->hw.hw);
 err_cmd_queue_init:
+	devl_unlock(hdev->devlink);
 	hclgevf_devlink_uninit(hdev);
 err_devlink_init:
 	hclgevf_pci_uninit(hdev);