diff mbox series

[net,6/7] net: hns3: fix VF reset fail issue

Message ID 20231028025917.314305-7-shaojijie@huawei.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series There are some bugfix for the HNS3 ethernet driver | expand

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for net
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 1362 this patch: 1362
netdev/cc_maintainers fail 2 blamed authors not CCed: zhangjiaran@huawei.com huangguangbin2@huawei.com; 3 maintainers not CCed: zhangjiaran@huawei.com lanhao@huawei.com huangguangbin2@huawei.com
netdev/build_clang success Errors and warnings before: 1386 this patch: 1386
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 1386 this patch: 1386
netdev/checkpatch warning WARNING: line length of 84 exceeds 80 columns
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Jijie Shao Oct. 28, 2023, 2:59 a.m. UTC
Currently the reset process in hns3 and firmware watchdog init process is
asynchronous. We think firmware watchdog initialization is completed
before VF clear the interrupt source. However, firmware initialization
may not complete early. So VF will receive multiple reset interrupts
and fail to reset.

So we add delay before VF interrupt source and 5 ms delay
is enough to avoid second reset interrupt.

Fixes: 427900d27d86 ("net: hns3: fix the timing issue of VF clearing interrupt sources")
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
---
 .../ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c   | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

Comments

Paolo Abeni Nov. 2, 2023, 10:45 a.m. UTC | #1
On Sat, 2023-10-28 at 10:59 +0800, Jijie Shao wrote:
> Currently the reset process in hns3 and firmware watchdog init process is
> asynchronous. We think firmware watchdog initialization is completed
> before VF clear the interrupt source. However, firmware initialization
> may not complete early. So VF will receive multiple reset interrupts
> and fail to reset.
> 
> So we add delay before VF interrupt source and 5 ms delay
> is enough to avoid second reset interrupt.
> 
> Fixes: 427900d27d86 ("net: hns3: fix the timing issue of VF clearing interrupt sources")
> Signed-off-by: Jijie Shao <shaojijie@huawei.com>
> ---
>  .../ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c   | 13 ++++++++++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
> index 1c62e58ff6d8..7b87da031be6 100644
> --- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
> +++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
> @@ -1924,8 +1924,14 @@ static void hclgevf_service_task(struct work_struct *work)
>  	hclgevf_mailbox_service_task(hdev);
>  }
>  
> -static void hclgevf_clear_event_cause(struct hclgevf_dev *hdev, u32 regclr)
> +static void hclgevf_clear_event_cause(struct hclgevf_dev *hdev, u32 regclr,
> +				      bool need_dalay)
>  {
> +#define HCLGEVF_RESET_DELAY		5
> +
> +	if (need_dalay)
> +		mdelay(HCLGEVF_RESET_DELAY);

5ms delay in an interrupt handler is quite a lot. What about scheduling
a timer from the IH to clear the register when such delay is needed?

Thanks!

Paolo
Jijie Shao Nov. 2, 2023, 12:16 p.m. UTC | #2
on 2023/11/2 18:45, Paolo Abeni wrote:
> On Sat, 2023-10-28 at 10:59 +0800, Jijie Shao wrote:
>>   
>> -static void hclgevf_clear_event_cause(struct hclgevf_dev *hdev, u32 regclr)
>> +static void hclgevf_clear_event_cause(struct hclgevf_dev *hdev, u32 regclr,
>> +				      bool need_dalay)
>>   {
>> +#define HCLGEVF_RESET_DELAY		5
>> +
>> +	if (need_dalay)
>> +		mdelay(HCLGEVF_RESET_DELAY);
> 5ms delay in an interrupt handler is quite a lot. What about scheduling
> a timer from the IH to clear the register when such delay is needed?
>
> Thanks!
>
> Paolo

Using timer in this case will complicate the code and make maintenance difficult.

We consider reducing the delay time by polling. For example,
the code cycles every 50 us to check whether the write register takes effect.
If yes, the function returns immediately. or the code cycles until 5 ms.

Is this method appropriate?

Thanks!
Jijie
Paolo Abeni Nov. 2, 2023, 4:24 p.m. UTC | #3
On Thu, 2023-11-02 at 20:16 +0800, Jijie Shao wrote:
> on 2023/11/2 18:45, Paolo Abeni wrote:
> > On Sat, 2023-10-28 at 10:59 +0800, Jijie Shao wrote:
> > >   
> > > -static void hclgevf_clear_event_cause(struct hclgevf_dev *hdev, u32 regclr)
> > > +static void hclgevf_clear_event_cause(struct hclgevf_dev *hdev, u32 regclr,
> > > +				      bool need_dalay)
> > >   {
> > > +#define HCLGEVF_RESET_DELAY		5
> > > +
> > > +	if (need_dalay)
> > > +		mdelay(HCLGEVF_RESET_DELAY);
> > 5ms delay in an interrupt handler is quite a lot. What about scheduling
> > a timer from the IH to clear the register when such delay is needed?
> > 
> > Thanks!
> > 
> > Paolo
> 
> Using timer in this case will complicate the code and make maintenance difficult.

Why? 

Would something alike the following be ok? (plus reset_timer
initialization at vf creation and cleanup at vf removal time):

---
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
index a4d68fb216fb..626bc67065fc 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
@@ -1974,6 +1974,14 @@ static enum hclgevf_evt_cause hclgevf_check_evt_cause(struct hclgevf_dev *hdev,
 	return HCLGEVF_VECTOR0_EVENT_OTHER;
 }
 
+static void hclgevf_reset_timer(struct timer_list *t)
+{
+	struct hclgevf_dev *hdev = from_timer(hclgevf_dev, t, reset_timer);
+
+	hclgevf_clear_event_cause(hdev, HCLGEVF_VECTOR0_EVENT_RST);
+	hclgevf_reset_task_schedule(hdev);
+}
+
 static irqreturn_t hclgevf_misc_irq_handle(int irq, void *data)
 {
 	enum hclgevf_evt_cause event_cause;
@@ -1982,13 +1990,13 @@ static irqreturn_t hclgevf_misc_irq_handle(int irq, void *data)
 
 	hclgevf_enable_vector(&hdev->misc_vector, false);
 	event_cause = hclgevf_check_evt_cause(hdev, &clearval);
+	if (event_cause == HCLGEVF_VECTOR0_EVENT_RST)
+		mod_timer(hdev->reset_timer, jiffies + msecs_to_jiffies(5));
+
 	if (event_cause != HCLGEVF_VECTOR0_EVENT_OTHER)
 		hclgevf_clear_event_cause(hdev, clearval);
 
 	switch (event_cause) {
-	case HCLGEVF_VECTOR0_EVENT_RST:
-		hclgevf_reset_task_schedule(hdev);
-		break;
 	case HCLGEVF_VECTOR0_EVENT_MBX:
 		hclgevf_mbx_handler(hdev);
 		break;
---

> We consider reducing the delay time by polling. For example,
> the code cycles every 50 us to check whether the write register takes effect.
> If yes, the function returns immediately. or the code cycles until 5 ms.
> 
> Is this method appropriate?

IMHO such solution will not remove the problem. How frequent is
expected to be the irq generating such delay?

Thanks

Paolo
diff mbox series

Patch

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
index 1c62e58ff6d8..7b87da031be6 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
@@ -1924,8 +1924,14 @@  static void hclgevf_service_task(struct work_struct *work)
 	hclgevf_mailbox_service_task(hdev);
 }
 
-static void hclgevf_clear_event_cause(struct hclgevf_dev *hdev, u32 regclr)
+static void hclgevf_clear_event_cause(struct hclgevf_dev *hdev, u32 regclr,
+				      bool need_dalay)
 {
+#define HCLGEVF_RESET_DELAY		5
+
+	if (need_dalay)
+		mdelay(HCLGEVF_RESET_DELAY);
+
 	hclgevf_write_dev(&hdev->hw, HCLGE_COMM_VECTOR0_CMDQ_SRC_REG, regclr);
 }
 
@@ -1990,7 +1996,8 @@  static irqreturn_t hclgevf_misc_irq_handle(int irq, void *data)
 	hclgevf_enable_vector(&hdev->misc_vector, false);
 	event_cause = hclgevf_check_evt_cause(hdev, &clearval);
 	if (event_cause != HCLGEVF_VECTOR0_EVENT_OTHER)
-		hclgevf_clear_event_cause(hdev, clearval);
+		hclgevf_clear_event_cause(hdev, clearval,
+					  event_cause == HCLGEVF_VECTOR0_EVENT_RST);
 
 	switch (event_cause) {
 	case HCLGEVF_VECTOR0_EVENT_RST:
@@ -2340,7 +2347,7 @@  static int hclgevf_misc_irq_init(struct hclgevf_dev *hdev)
 		return ret;
 	}
 
-	hclgevf_clear_event_cause(hdev, 0);
+	hclgevf_clear_event_cause(hdev, 0, false);
 
 	/* enable misc. vector(vector 0) */
 	hclgevf_enable_vector(&hdev->misc_vector, true);