igb: Fix igb_down hung on surprise removal

Message ID	20230518072657.1.If9539da710217ed92e764cc0ba0f3d2d246a1aee@changeid (mailing list archive)
State	Awaiting Upstream
Delegated to:	Netdev Maintainers
Headers	show Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4521417FC for <netdev@vger.kernel.org>; Thu, 18 May 2023 07:28:02 +0000 (UTC) From: Ying Hsu <yinghsu@chromium.org> To: netdev@vger.kernel.org Cc: grundler@chromium.org, Ying Hsu <yinghsu@chromium.org>, "David S. Miller" <davem@davemloft.net>, Eric Dumazet <edumazet@google.com>, Jakub Kicinski <kuba@kernel.org>, Jesse Brandeburg <jesse.brandeburg@intel.com>, Paolo Abeni <pabeni@redhat.com>, Tony Nguyen <anthony.l.nguyen@intel.com>, intel-wired-lan@lists.osuosl.org, linux-kernel@vger.kernel.org Subject: [PATCH] igb: Fix igb_down hung on surprise removal Date: Thu, 18 May 2023 07:26:57 +0000 Message-ID: <20230518072657.1.If9539da710217ed92e764cc0ba0f3d2d246a1aee@changeid> Precedence: bulk MIME-Version: 1.0 Content-Transfer-Encoding: 8bit
Series	igb: Fix igb_down hung on surprise removal \| expand igb: Fix igb_down hung on surprise removal

Context	Check	Description
netdev/series_format	warning	Single patches do not need cover letters; Target tree name not specified in the subject
netdev/tree_selection	success	Guessed tree name to be net-next
netdev/fixes_present	success	Fixes tag not required for -next series
netdev/header_inline	success	No static functions without inline keyword in header files
netdev/build_32bit	success	Errors and warnings before: 8 this patch: 8
netdev/cc_maintainers	success	CCed 8 of 8 maintainers
netdev/build_clang	success	Errors and warnings before: 8 this patch: 8
netdev/verify_signedoff	success	Signed-off-by tag matches author and committer
netdev/deprecated_api	success	None detected
netdev/check_selftest	success	No net selftest shell script
netdev/verify_fixes	success	No Fixes tag
netdev/build_allmodconfig_warn	success	Errors and warnings before: 8 this patch: 8
netdev/checkpatch	success	total: 0 errors, 0 warnings, 0 checks, 11 lines checked
netdev/kdoc	success	Errors and warnings before: 0 this patch: 0
netdev/source_inline	success	Was 0 now: 0

Ying Hsu May 18, 2023, 7:26 a.m. UTC

In a setup where a Thunderbolt hub connects to Ethernet and a display
through USB Type-C, users may experience a hung task timeout when they
remove the cable between the PC and the Thunderbolt hub.
This is because the igb_down function is called multiple times when
the Thunderbolt hub is unplugged. For example, the igb_io_error_detected
triggers the first call, and the igb_remove triggers the second call.
The second call to igb_down will block at napi_synchronize.
Here's the call trace:
    __schedule+0x3b0/0xddb
    ? __mod_timer+0x164/0x5d3
    schedule+0x44/0xa8
    schedule_timeout+0xb2/0x2a4
    ? run_local_timers+0x4e/0x4e
    msleep+0x31/0x38
    igb_down+0x12c/0x22a [igb 6615058754948bfde0bf01429257eb59f13030d4]
    __igb_close+0x6f/0x9c [igb 6615058754948bfde0bf01429257eb59f13030d4]
    igb_close+0x23/0x2b [igb 6615058754948bfde0bf01429257eb59f13030d4]
    __dev_close_many+0x95/0xec
    dev_close_many+0x6e/0x103
    unregister_netdevice_many+0x105/0x5b1
    unregister_netdevice_queue+0xc2/0x10d
    unregister_netdev+0x1c/0x23
    igb_remove+0xa7/0x11c [igb 6615058754948bfde0bf01429257eb59f13030d4]
    pci_device_remove+0x3f/0x9c
    device_release_driver_internal+0xfe/0x1b4
    pci_stop_bus_device+0x5b/0x7f
    pci_stop_bus_device+0x30/0x7f
    pci_stop_bus_device+0x30/0x7f
    pci_stop_and_remove_bus_device+0x12/0x19
    pciehp_unconfigure_device+0x76/0xe9
    pciehp_disable_slot+0x6e/0x131
    pciehp_handle_presence_or_link_change+0x7a/0x3f7
    pciehp_ist+0xbe/0x194
    irq_thread_fn+0x22/0x4d
    ? irq_thread+0x1fd/0x1fd
    irq_thread+0x17b/0x1fd
    ? irq_forced_thread_fn+0x5f/0x5f
    kthread+0x142/0x153
    ? __irq_get_irqchip_state+0x46/0x46
    ? kthread_associate_blkcg+0x71/0x71
    ret_from_fork+0x1f/0x30

In this case, igb_io_error_detected detaches the network interface
and requests a PCIE slot reset, however, the PCIE reset callback is
not being invoked and thus the Ethernet connection breaks down.
As the PCIE error in this case is a non-fatal one, requesting a
slot reset can be avoided.
This patch fixes the task hung issue and preserves Ethernet
connection by ignoring non-fatal PCIE errors.

Signed-off-by: Ying Hsu <yinghsu@chromium.org>
---
This commit has been tested on a HP Elite Dragonfly Chromebook and
a Caldigit TS3+ Thunderbolt hub. The Ethernet driver for the hub
is igb. Non-fatal PCIE errors happen when users hot-plug the cables
connected to the chromebook or to the external display.

 drivers/net/ethernet/intel/igb/igb_main.c | 5 +++++
 1 file changed, 5 insertions(+)

Pavan Chebbi May 18, 2023, 10:36 a.m. UTC | #1

On Thu, May 18, 2023 at 12:58 PM Ying Hsu <yinghsu@chromium.org> wrote:
>
> diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
> index 58872a4c2540..a8b217368ca1 100644
> --- a/drivers/net/ethernet/intel/igb/igb_main.c
> +++ b/drivers/net/ethernet/intel/igb/igb_main.c
> @@ -9581,6 +9581,11 @@ static pci_ers_result_t igb_io_error_detected(struct pci_dev *pdev,
>         struct net_device *netdev = pci_get_drvdata(pdev);
>         struct igb_adapter *adapter = netdev_priv(netdev);
>
> +       if (state == pci_channel_io_normal) {
> +               dev_warn(&pdev->dev, "Non-correctable non-fatal error reported.\n");
> +               return PCI_ERS_RESULT_CAN_RECOVER;
> +       }
> +

This code may be good to have. But not sure if this should be the fix
for igb_down() synchronization.
Intel guys may comment.

>         netif_device_detach(netdev);
>
>         if (state == pci_channel_io_perm_failure)
> --
> 2.40.1.606.ga4b1b128d6-goog
>
>

Grant Grundler May 22, 2023, 8:16 p.m. UTC | #2

On Thu, May 18, 2023 at 3:36 AM Pavan Chebbi <pavan.chebbi@broadcom.com> wrote:
>
> On Thu, May 18, 2023 at 12:58 PM Ying Hsu <yinghsu@chromium.org> wrote:
> >
> > diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
> > index 58872a4c2540..a8b217368ca1 100644
> > --- a/drivers/net/ethernet/intel/igb/igb_main.c
> > +++ b/drivers/net/ethernet/intel/igb/igb_main.c
> > @@ -9581,6 +9581,11 @@ static pci_ers_result_t igb_io_error_detected(struct pci_dev *pdev,
> >         struct net_device *netdev = pci_get_drvdata(pdev);
> >         struct igb_adapter *adapter = netdev_priv(netdev);
> >
> > +       if (state == pci_channel_io_normal) {
> > +               dev_warn(&pdev->dev, "Non-correctable non-fatal error reported.\n");
> > +               return PCI_ERS_RESULT_CAN_RECOVER;
> > +       }
> > +
>
> This code may be good to have. But not sure if this should be the fix
> for igb_down() synchronization.

I have the same opinion. This appears to solve the problem - but I
don't know if there is a better way to solve this problem.

> Intel guys may comment.

Ping? Can we please get feedback from IGB/IGC maintainers this week?

(I hope igc maintainers can confirm this isn't an issue for igc.)

cheers,
grant

>
> >         netif_device_detach(netdev);
> >
> >         if (state == pci_channel_io_perm_failure)
> > --
> > 2.40.1.606.ga4b1b128d6-goog
> >
> >

Tony Nguyen May 23, 2023, 6:03 p.m. UTC | #3

On 5/22/2023 1:16 PM, Grant Grundler wrote:
> On Thu, May 18, 2023 at 3:36 AM Pavan Chebbi <pavan.chebbi@broadcom.com> wrote:
>>
>> On Thu, May 18, 2023 at 12:58 PM Ying Hsu <yinghsu@chromium.org> wrote:
>>>
>>> diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
>>> index 58872a4c2540..a8b217368ca1 100644
>>> --- a/drivers/net/ethernet/intel/igb/igb_main.c
>>> +++ b/drivers/net/ethernet/intel/igb/igb_main.c
>>> @@ -9581,6 +9581,11 @@ static pci_ers_result_t igb_io_error_detected(struct pci_dev *pdev,
>>>          struct net_device *netdev = pci_get_drvdata(pdev);
>>>          struct igb_adapter *adapter = netdev_priv(netdev);
>>>
>>> +       if (state == pci_channel_io_normal) {
>>> +               dev_warn(&pdev->dev, "Non-correctable non-fatal error reported.\n");
>>> +               return PCI_ERS_RESULT_CAN_RECOVER;
>>> +       }
>>> +
>>
>> This code may be good to have. But not sure if this should be the fix
>> for igb_down() synchronization.
> 
> I have the same opinion. This appears to solve the problem - but I
> don't know if there is a better way to solve this problem.
> 
>> Intel guys may comment.
> 
> Ping? Can we please get feedback from IGB/IGC maintainers this week?
> 
> (I hope igc maintainers can confirm this isn't an issue for igc.)

Adding some of the igb and igc developers.

> cheers,
> grant
> 
>>
>>>          netif_device_detach(netdev);
>>>
>>>          if (state == pci_channel_io_perm_failure)
>>> --
>>> 2.40.1.606.ga4b1b128d6-goog
>>>
>>>

Loktionov, Aleksandr May 24, 2023, 12:31 p.m. UTC | #4

Good day Tony

We reviewed the patch and have nothing against.

With the best regards
Alex
ND ITP Linux 40G base driver TL 



> -----Original Message-----
> From: Nguyen, Anthony L <anthony.l.nguyen@intel.com>
> Sent: Tuesday, May 23, 2023 8:04 PM
> To: Grant Grundler <grundler@chromium.org>; Pavan Chebbi
> <pavan.chebbi@broadcom.com>; Loktionov, Aleksandr
> <aleksandr.loktionov@intel.com>; Neftin, Sasha <sasha.neftin@intel.com>;
> Ruinskiy, Dima <dima.ruinskiy@intel.com>
> Cc: Ying Hsu <yinghsu@chromium.org>; netdev@vger.kernel.org; David S.
> Miller <davem@davemloft.net>; Eric Dumazet <edumazet@google.com>;
> Jakub Kicinski <kuba@kernel.org>; Brandeburg, Jesse
> <jesse.brandeburg@intel.com>; Paolo Abeni <pabeni@redhat.com>; intel-
> wired-lan@lists.osuosl.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH] igb: Fix igb_down hung on surprise removal
> 
> On 5/22/2023 1:16 PM, Grant Grundler wrote:
> > On Thu, May 18, 2023 at 3:36 AM Pavan Chebbi
> <pavan.chebbi@broadcom.com> wrote:
> >>
> >> On Thu, May 18, 2023 at 12:58 PM Ying Hsu <yinghsu@chromium.org>
> wrote:
> >>>
> >>> diff --git a/drivers/net/ethernet/intel/igb/igb_main.c
> >>> b/drivers/net/ethernet/intel/igb/igb_main.c
> >>> index 58872a4c2540..a8b217368ca1 100644
> >>> --- a/drivers/net/ethernet/intel/igb/igb_main.c
> >>> +++ b/drivers/net/ethernet/intel/igb/igb_main.c
> >>> @@ -9581,6 +9581,11 @@ static pci_ers_result_t
> igb_io_error_detected(struct pci_dev *pdev,
> >>>          struct net_device *netdev = pci_get_drvdata(pdev);
> >>>          struct igb_adapter *adapter = netdev_priv(netdev);
> >>>
> >>> +       if (state == pci_channel_io_normal) {
> >>> +               dev_warn(&pdev->dev, "Non-correctable non-fatal error
> reported.\n");
> >>> +               return PCI_ERS_RESULT_CAN_RECOVER;
> >>> +       }
> >>> +
> >>
> >> This code may be good to have. But not sure if this should be the fix
> >> for igb_down() synchronization.
> >
> > I have the same opinion. This appears to solve the problem - but I
> > don't know if there is a better way to solve this problem.
> >
> >> Intel guys may comment.
> >
> > Ping? Can we please get feedback from IGB/IGC maintainers this week?
> >
> > (I hope igc maintainers can confirm this isn't an issue for igc.)
> 
> Adding some of the igb and igc developers.
> 
> > cheers,
> > grant
> >
> >>
> >>>          netif_device_detach(netdev);
> >>>
> >>>          if (state == pci_channel_io_perm_failure)
> >>> --
> >>> 2.40.1.606.ga4b1b128d6-goog
> >>>
> >>>

Grant Grundler May 24, 2023, 9:01 p.m. UTC | #5

On Wed, May 24, 2023 at 5:34 AM Loktionov, Aleksandr
<aleksandr.loktionov@intel.com> wrote:
>
> Good day Tony
>
> We reviewed the patch and have nothing against.

Thank you for reviewing!

Can I take this as the equivalent of "Signed-off-by: Loktionov,
Aleksandr <aleksandr.loktionov@intel.com>"?

Or since Tony is listed in MAINTAINERS for drivers/net/ethernet/intel,
is he supposed to provide that?

cheers,
grant

>
> With the best regards
> Alex
> ND ITP Linux 40G base driver TL
>
>
>
> > -----Original Message-----
> > From: Nguyen, Anthony L <anthony.l.nguyen@intel.com>
> > Sent: Tuesday, May 23, 2023 8:04 PM
> > To: Grant Grundler <grundler@chromium.org>; Pavan Chebbi
> > <pavan.chebbi@broadcom.com>; Loktionov, Aleksandr
> > <aleksandr.loktionov@intel.com>; Neftin, Sasha <sasha.neftin@intel.com>;
> > Ruinskiy, Dima <dima.ruinskiy@intel.com>
> > Cc: Ying Hsu <yinghsu@chromium.org>; netdev@vger.kernel.org; David S.
> > Miller <davem@davemloft.net>; Eric Dumazet <edumazet@google.com>;
> > Jakub Kicinski <kuba@kernel.org>; Brandeburg, Jesse
> > <jesse.brandeburg@intel.com>; Paolo Abeni <pabeni@redhat.com>; intel-
> > wired-lan@lists.osuosl.org; linux-kernel@vger.kernel.org
> > Subject: Re: [PATCH] igb: Fix igb_down hung on surprise removal
> >
> > On 5/22/2023 1:16 PM, Grant Grundler wrote:
> > > On Thu, May 18, 2023 at 3:36 AM Pavan Chebbi
> > <pavan.chebbi@broadcom.com> wrote:
> > >>
> > >> On Thu, May 18, 2023 at 12:58 PM Ying Hsu <yinghsu@chromium.org>
> > wrote:
> > >>>
> > >>> diff --git a/drivers/net/ethernet/intel/igb/igb_main.c
> > >>> b/drivers/net/ethernet/intel/igb/igb_main.c
> > >>> index 58872a4c2540..a8b217368ca1 100644
> > >>> --- a/drivers/net/ethernet/intel/igb/igb_main.c
> > >>> +++ b/drivers/net/ethernet/intel/igb/igb_main.c
> > >>> @@ -9581,6 +9581,11 @@ static pci_ers_result_t
> > igb_io_error_detected(struct pci_dev *pdev,
> > >>>          struct net_device *netdev = pci_get_drvdata(pdev);
> > >>>          struct igb_adapter *adapter = netdev_priv(netdev);
> > >>>
> > >>> +       if (state == pci_channel_io_normal) {
> > >>> +               dev_warn(&pdev->dev, "Non-correctable non-fatal error
> > reported.\n");
> > >>> +               return PCI_ERS_RESULT_CAN_RECOVER;
> > >>> +       }
> > >>> +
> > >>
> > >> This code may be good to have. But not sure if this should be the fix
> > >> for igb_down() synchronization.
> > >
> > > I have the same opinion. This appears to solve the problem - but I
> > > don't know if there is a better way to solve this problem.
> > >
> > >> Intel guys may comment.
> > >
> > > Ping? Can we please get feedback from IGB/IGC maintainers this week?
> > >
> > > (I hope igc maintainers can confirm this isn't an issue for igc.)
> >
> > Adding some of the igb and igc developers.
> >
> > > cheers,
> > > grant
> > >
> > >>
> > >>>          netif_device_detach(netdev);
> > >>>
> > >>>          if (state == pci_channel_io_perm_failure)
> > >>> --
> > >>> 2.40.1.606.ga4b1b128d6-goog
> > >>>
> > >>>

Tony Nguyen May 24, 2023, 10:22 p.m. UTC | #6

Hi Grant,

On 5/24/2023 2:01 PM, Grant Grundler wrote:
> On Wed, May 24, 2023 at 5:34 AM Loktionov, Aleksandr
> <aleksandr.loktionov@intel.com> wrote:
>>
>> Good day Tony
>>
>> We reviewed the patch and have nothing against.
> 
> Thank you for reviewing!
> 
> Can I take this as the equivalent of "Signed-off-by: Loktionov,
> Aleksandr <aleksandr.loktionov@intel.com>"?

Unless a tag is explicitly given, I don't believe one can be inferred.

> Or since Tony is listed in MAINTAINERS for drivers/net/ethernet/intel,
> is he supposed to provide that?

Assuming there's no comments/issues brought up, I'll apply it to the 
respective Intel Wired LAN tree for our validation to have a pass at it. 
Upon successful completion, I'll send the patch on to netdev for them to 
pull. Hope that helps.

Thanks,
Tony

Grant Grundler May 24, 2023, 10:34 p.m. UTC | #7

On Wed, May 24, 2023 at 3:22 PM Tony Nguyen <anthony.l.nguyen@intel.com> wrote:
>
> Hi Grant,
>
> On 5/24/2023 2:01 PM, Grant Grundler wrote:
> > On Wed, May 24, 2023 at 5:34 AM Loktionov, Aleksandr
> > <aleksandr.loktionov@intel.com> wrote:
> >>
> >> Good day Tony
> >>
> >> We reviewed the patch and have nothing against.
> >
> > Thank you for reviewing!
> >
> > Can I take this as the equivalent of "Signed-off-by: Loktionov,
> > Aleksandr <aleksandr.loktionov@intel.com>"?
>
> Unless a tag is explicitly given, I don't believe one can be inferred.

Yes - that's what I thought and was asking in case that's what
Aleksandr meant (and could easily confirm)

>
> > Or since Tony is listed in MAINTAINERS for drivers/net/ethernet/intel,
> > is he supposed to provide that?
>
> Assuming there's no comments/issues brought up, I'll apply it to the
> respective Intel Wired LAN tree for our validation to have a pass at it.
> Upon successful completion, I'll send the patch on to netdev for them to
> pull. Hope that helps.

Yes - that's what I needed to know. Thank you Tony! :)

Ying Hsu will apply this patch to Chrome OS kernels:
   https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/4548800

cheers,
grant

>
> Thanks,
> Tony

Pucha, HimasekharX Reddy June 5, 2023, 4:47 a.m. UTC | #8

> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of Ying Hsu
> Sent: Thursday, May 18, 2023 12:57 PM
> To: netdev@vger.kernel.org
> Cc: grundler@chromium.org; intel-wired-lan@lists.osuosl.org; Ying Hsu <yinghsu@chromium.org>; Brandeburg, Jesse <jesse.brandeburg@intel.com>; linux-kernel@vger.kernel.org; Eric Dumazet <edumazet@google.com>; Nguyen, Anthony L <anthony.l.nguyen@intel.com>; Jakub Kicinski <kuba@kernel.org>; Paolo Abeni <pabeni@redhat.com>; David S. Miller <davem@davemloft.net>
> Subject: [Intel-wired-lan] [PATCH] igb: Fix igb_down hung on surprise removal
>
> In a setup where a Thunderbolt hub connects to Ethernet and a display through USB Type-C, users may experience a hung task timeout when they remove the cable between the PC and the Thunderbolt hub.
> This is because the igb_down function is called multiple times when the Thunderbolt hub is unplugged. For example, the igb_io_error_detected triggers the first call, and the igb_remove triggers the second call.
> The second call to igb_down will block at napi_synchronize.
> Here's the call trace:
>    __schedule+0x3b0/0xddb
>    ? __mod_timer+0x164/0x5d3
>    schedule+0x44/0xa8
>    schedule_timeout+0xb2/0x2a4
>    ? run_local_timers+0x4e/0x4e
>    msleep+0x31/0x38
>    igb_down+0x12c/0x22a [igb 6615058754948bfde0bf01429257eb59f13030d4]
>    __igb_close+0x6f/0x9c [igb 6615058754948bfde0bf01429257eb59f13030d4]
>    igb_close+0x23/0x2b [igb 6615058754948bfde0bf01429257eb59f13030d4]
>    __dev_close_many+0x95/0xec
>    dev_close_many+0x6e/0x103
>    unregister_netdevice_many+0x105/0x5b1
>    unregister_netdevice_queue+0xc2/0x10d
>    unregister_netdev+0x1c/0x23
>    igb_remove+0xa7/0x11c [igb 6615058754948bfde0bf01429257eb59f13030d4]
>    pci_device_remove+0x3f/0x9c
>    device_release_driver_internal+0xfe/0x1b4
>    pci_stop_bus_device+0x5b/0x7f
>    pci_stop_bus_device+0x30/0x7f
>    pci_stop_bus_device+0x30/0x7f
>    pci_stop_and_remove_bus_device+0x12/0x19
>    pciehp_unconfigure_device+0x76/0xe9
>    pciehp_disable_slot+0x6e/0x131
>    pciehp_handle_presence_or_link_change+0x7a/0x3f7
>   pciehp_ist+0xbe/0x194
>    irq_thread_fn+0x22/0x4d
>    ? irq_thread+0x1fd/0x1fd
>    irq_thread+0x17b/0x1fd
>    ? irq_forced_thread_fn+0x5f/0x5f
>    kthread+0x142/0x153
>    ? __irq_get_irqchip_state+0x46/0x46
>    ? kthread_associate_blkcg+0x71/0x71
>    ret_from_fork+0x1f/0x30
>
> In this case, igb_io_error_detected detaches the network interface and requests a PCIE slot reset, however, the PCIE reset callback is not being invoked and thus the Ethernet connection breaks down.
> As the PCIE error in this case is a non-fatal one, requesting a slot reset can be avoided.
> This patch fixes the task hung issue and preserves Ethernet connection by ignoring non-fatal PCIE errors.
>
> Signed-off-by: Ying Hsu <yinghsu@chromium.org>
> ---
> This commit has been tested on a HP Elite Dragonfly Chromebook and a Caldigit TS3+ Thunderbolt hub. The Ethernet driver for the hub is igb. Non-fatal PCIE errors happen when users hot-plug the cables connected to the chromebook or to the external display.
>
> drivers/net/ethernet/intel/igb/igb_main.c | 5 +++++
> 1 file changed, 5 insertions(+)
>

Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)

Stefan Schaeckeler June 6, 2024, 12:04 a.m. UTC | #9

Hi all,

This commit introduced a regression. What does not work with this commit is an AER without a surprise removal event.


On 5/18/23 00:26, Ying Hsu wrote:
> In a setup where a Thunderbolt hub connects to Ethernet and a display
> through USB Type-C, users may experience a hung task timeout when they
> remove the cable between the PC and the Thunderbolt hub.
> This is because the igb_down function is called multiple times when
> the Thunderbolt hub is unplugged. For example, the igb_io_error_detected
> triggers the first call, and the igb_remove triggers the second call.
> The second call to igb_down will block at napi_synchronize.
> Here's the call trace:
>     __schedule+0x3b0/0xddb
>     ? __mod_timer+0x164/0x5d3
>     schedule+0x44/0xa8
>     schedule_timeout+0xb2/0x2a4
>     ? run_local_timers+0x4e/0x4e
>     msleep+0x31/0x38
>     igb_down+0x12c/0x22a [igb 6615058754948bfde0bf01429257eb59f13030d4]
>     __igb_close+0x6f/0x9c [igb 6615058754948bfde0bf01429257eb59f13030d4]
>     igb_close+0x23/0x2b [igb 6615058754948bfde0bf01429257eb59f13030d4]
>     __dev_close_many+0x95/0xec
>     dev_close_many+0x6e/0x103
>     unregister_netdevice_many+0x105/0x5b1
>     unregister_netdevice_queue+0xc2/0x10d
>     unregister_netdev+0x1c/0x23
>     igb_remove+0xa7/0x11c [igb 6615058754948bfde0bf01429257eb59f13030d4]
>     pci_device_remove+0x3f/0x9c
>     device_release_driver_internal+0xfe/0x1b4
>     pci_stop_bus_device+0x5b/0x7f
>     pci_stop_bus_device+0x30/0x7f
>     pci_stop_bus_device+0x30/0x7f
>     pci_stop_and_remove_bus_device+0x12/0x19
>     pciehp_unconfigure_device+0x76/0xe9
>     pciehp_disable_slot+0x6e/0x131
>     pciehp_handle_presence_or_link_change+0x7a/0x3f7
>     pciehp_ist+0xbe/0x194
>     irq_thread_fn+0x22/0x4d
>     ? irq_thread+0x1fd/0x1fd
>     irq_thread+0x17b/0x1fd
>     ? irq_forced_thread_fn+0x5f/0x5f
>     kthread+0x142/0x153
>     ? __irq_get_irqchip_state+0x46/0x46
>     ? kthread_associate_blkcg+0x71/0x71
>     ret_from_fork+0x1f/0x30
>
> In this case, igb_io_error_detected detaches the network interface
> and requests a PCIE slot reset, however, the PCIE reset callback is
> not being invoked and thus the Ethernet connection breaks down.
> As the PCIE error in this case is a non-fatal one, requesting a
> slot reset can be avoided.
> This patch fixes the task hung issue and preserves Ethernet
> connection by ignoring non-fatal PCIE errors.
>
> Signed-off-by: Ying Hsu <yinghsu@chromium.org>
> ---
> This commit has been tested on a HP Elite Dragonfly Chromebook and
> a Caldigit TS3+ Thunderbolt hub. The Ethernet driver for the hub
> is igb. Non-fatal PCIE errors happen when users hot-plug the cables
> connected to the chromebook or to the external display.
>
>  drivers/net/ethernet/intel/igb/igb_main.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
> index 58872a4c2540..a8b217368ca1 100644
> --- a/drivers/net/ethernet/intel/igb/igb_main.c
> +++ b/drivers/net/ethernet/intel/igb/igb_main.c
> @@ -9581,6 +9581,11 @@ static pci_ers_result_t igb_io_error_detected(struct pci_dev *pdev,
>  	struct net_device *netdev = pci_get_drvdata(pdev);
>  	struct igb_adapter *adapter = netdev_priv(netdev);
>
> +	if (state == pci_channel_io_normal) {
> +		dev_warn(&pdev->dev, "Non-correctable non-fatal error reported.\n");
> +		return PCI_ERS_RESULT_CAN_RECOVER;
> +	}
> +
>  	netif_device_detach(netdev);
>
>  	if (state == pci_channel_io_perm_failure)

We are currently stuck with the 5.4 kernel - our embedded system can't easily boot arbitrary kernel versions. The igb driver code has not changed and I'm quite positive the issue still persist in the latest upstream kernel.

This issue is reproducible with aer_inject. 09:00.0 is our i210 which is directly connected to the cpu root port 00:01.1:


- - - snip - - -
[node0_RP0_CPU0:~]$cat aer.log
AER
PCI_ID 09:00.0
UNCOR_STATUS COMP_TIME
HEADER_LOG 0 1 2 3
- - - snip - - -

- - - snip - - -
[node0_RP0_CPU0:~]$aer-inject aer.log
[369145.900845] pcieport 0000:00:01.1: aer_inject: Injecting errors 00000000/00004000 into device 0000:09:00.0
[369145.912726] pcieport 0000:00:01.1: AER: Uncorrected (Non-Fatal) error received: 0000:09:00.0
[369145.923124] igb 0000:09:00.0: AER: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
[369145.936791] igb 0000:09:00.0: AER:   device [8086:1537] error status/mask=00004000/00000000
[369145.947068] igb 0000:09:00.0: AER:    [14] CmpltTO
[369145.954602] igb 0000:09:00.0: Non-correctable non-fatal error reported.
[369145.984564] ------------[ cut here ]------------
[369145.990285] kernel BUG at include/linux/netdevice.h:529!
[369145.996860] invalid opcode: 0000 [#1] SMP PTI
[369146.002267] CPU: 3 PID: 142 Comm: irq/26-aerdrv Kdump: loaded Tainted: G           O      5.4.251-yocto-standard #1
[369146.015073] Hardware name: Cisco System Inc. SF-D8/Type2 - Board Product Name1, BIOS 1-29-g46d9e72a-s 05/03/2019
[369146.027570] RIP: 0010:igb_up+0x51/0x160 [igb]
[369146.032974] Code: d2 eb 16 f0 80 60 60 fe f0 80 60 60 f7 48 83 c2 01 39 93 14 02 00 00 76 13 48 8b 84 d3 08 0f 00 00 48 8b 48 60 83 e1 01 75 d9 <0f> 0b f6 83 11 02 00 00 20 0f 85 c0 00 00 00 48 8b bb 08 0f 00 00
[369146.055938] RSP: 0018:ffffb29b0045bcf0 EFLAGS: 00010246
[369146.062399] RAX: ffff8d938d99e400 RBX: ffff8d9398a08740 RCX: 0000000000000000
[369146.071186] RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffff8d938d99e2c0
[369146.079974] RBP: ffffb29b0045bd00 R08: 00000000000002ec R09: ffffb29b002fa000
[369146.088761] R10: 0000000000006f80 R11: 000000000090ef2c R12: ffff8d9398a08000
[369146.097547] R13: ffff8d9398a08740 R14: ffff8d939ae92c00 R15: ffff8d939ae92c28
[369146.106335] FS:  0000000000000000(0000) GS:ffff8d939fcc0000(0000) knlGS:0000000000000000
[369146.116286] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[369146.123381] CR2: 000056318eb418e8 CR3: 000000084e744002 CR4: 00000000003606e0
[369146.132170] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[369146.140960] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[369146.149747] Call Trace:
[369146.152826]  ? show_regs.cold+0x1a/0x1f
[369146.157592]  ? __die+0x90/0xd9
[369146.161408]  ? die+0x30/0x50
[369146.165002]  ? do_trap+0x85/0xf0
[369146.169029]  ? do_error_trap+0x7c/0xb0
[369146.173695]  ? igb_up+0x51/0x160 [igb]
[369146.178355]  ? do_invalid_op+0x3c/0x50
[369146.183019]  ? igb_up+0x51/0x160 [igb]
[369146.187681]  ? invalid_op+0x7c/0x90
[369146.192027]  ? igb_up+0x51/0x160 [igb]
[369146.196691]  ? hrtimer_try_to_cancel+0x2c/0x110
[369146.202305]  ? schedule_hrtimeout_range_clock+0xa0/0x110
[369146.208871]  ? hrtimer_init_sleeper+0x90/0x90
[369146.214277]  ? igb_rx_fifo_flush_82575+0x32/0x270 [igb]
[369146.220738]  ? igb_configure+0x417/0x650 [igb]
[369146.226247]  ? igb_up+0x51/0x160 [igb]
[369146.230913]  igb_io_resume+0x31/0x50 [igb]
[369146.235998]  report_resume+0x5c/0x80
[369146.240449]  ? pcie_portdrv_probe+0x70/0x70
[369146.245631]  pci_walk_bus+0x75/0x90
[369146.249966]  pcie_do_recovery+0x163/0x280
[369146.254947]  aer_process_err_devices+0xa2/0xd1
[369146.260455]  aer_isr.cold+0x52/0xa1
[369146.264799]  ? __schedule+0x2bf/0x680
[369146.269349]  ? irq_finalize_oneshot+0xf0/0xf0
[369146.274754]  irq_thread_fn+0x28/0x50
[369146.279205]  irq_thread+0xf8/0x180
[369146.283445]  ? wake_threads_waitq+0x30/0x30
[369146.288638]  kthread+0x104/0x140
[369146.292666]  ? irq_thread_check_affinity+0x80/0x80
[369146.298597]  ? __kthread_cancel_work+0x40/0x40
[369146.304106]  ret_from_fork+0x35/0x40
- - - snip - - -

BUG_ON() comes from

- - - snip - - -
static inline void napi_enable(struct napi_struct *n)
{
        BUG_ON(!test_bit(NAPI_STATE_SCHED, &n->state));
        smp_mb__before_atomic();
        clear_bit(NAPI_STATE_SCHED, &n->state);
        clear_bit(NAPI_STATE_NPSVC, &n->state);
}
- - - snip - - -

The stack-trace shows that the AER handler starts here and goes into igb_up():

- - - snip - - -
static void igb_io_resume(struct pci_dev *pdev)
{
        struct net_device *netdev = pci_get_drvdata(pdev);
        struct igb_adapter *adapter = netdev_priv(netdev);

        if (netif_running(netdev)) {
                if (igb_up(adapter)) {
                        dev_err(&pdev->dev, "igb_up failed after reset\n");
                        return;
                }
        }

       ...
}
- - - snip - - -
Three functions come into the picture:

igb_io_error_detected() { // runs upon AER detection
  igb_down();  // before this commit
  noop;        // with this commit
}

igb_io_resume() { // runs upon AER handling
  igb_up()
}

igb_remove() { // runs upon rmmod igb, or surprise down removal (as shown in the commit log)
  igb_down();
}

Before this commit, the flow for an AER on 09:00.0 was
 igb_down; // from igb_io_error_detect() - this sets the NAPI_STATE_SCHED bit
 igb_up;   // from igb_io_resume()       - that works as it finds a set NAPI_STATE_SCHED bit (and then clears it)

Now with this commit, the flow for an AER on 09:00.0 is
 noop;     // from igb_io_error_detect() - the NAPI_STATE_SCHED bit is not touched, e.g. it remains cleared
 igb_up;   // from igb_io_resume()       - BUG_ON() is triggered as it finds a cleared NAPI_STATE_SCHED bit


I don't have a means to reproduce this i210 on thunderbolt issue and don't completely understand its flow. Before this commit, the "expected" flow for i210 on thunderbolt was probably

 igb_down; // from igb_io_error_detect() - this sets the NAPI_STATE_SCHED bit
 igb_up;   // from igb_io_resume()       - that works as it finds a set NAPI_STATE_SCHED bit and cleans it.
 igb_down; // from igb_remove()          - that works b/c of the previous igb_up()

The bug report shows that this was not the case. I don't completely understand the reported problem: what happens to and in igb_io_resume() and its call to igb_up()? In the commit log backtrace, I see igb_remove() eventually calling __dev_close_many() which clears the __LINK__STATE_START bit. igb_io_resume() does check this bit via netif_running() and calls only then igb_up(). My guess is igb_io_resume() executes after igb_remove() and skips therefore the execution of igb_up(). As we see from the commit log backtrace, that makes igb_remove() starve now in igb_down()->napi_synchronize().

How to fix that, I don't know.

 Stefan

Ying Hsu June 6, 2024, 8:03 a.m. UTC | #10

On the CalDigit Thunderbolt Station 3 Plus, we've encountered an issue
when the USB downstream display connection state changes. The
problematic sequence observed is:
```
igb_io_error_detected
igb_down
igb_io_error_detected
igb_down
```

The second igb_down call blocks at napi_synchronize.
Simply avoiding redundant igb_down calls makes the Ethernet of the
thunderbolt dock unusable.

If Intel can identify when an Ethernet device is within a Thunderbolt
tunnel, the patch can be more specific.

On Thu, Jun 6, 2024 at 8:04 AM Stefan Schaeckeler <schaecsn@gmx.net> wrote:
>
> Hi all,
>
> This commit introduced a regression. What does not work with this commit is an AER without a surprise removal event.
>
>
> On 5/18/23 00:26, Ying Hsu wrote:
> > In a setup where a Thunderbolt hub connects to Ethernet and a display
> > through USB Type-C, users may experience a hung task timeout when they
> > remove the cable between the PC and the Thunderbolt hub.
> > This is because the igb_down function is called multiple times when
> > the Thunderbolt hub is unplugged. For example, the igb_io_error_detected
> > triggers the first call, and the igb_remove triggers the second call.
> > The second call to igb_down will block at napi_synchronize.
> > Here's the call trace:
> >     __schedule+0x3b0/0xddb
> >     ? __mod_timer+0x164/0x5d3
> >     schedule+0x44/0xa8
> >     schedule_timeout+0xb2/0x2a4
> >     ? run_local_timers+0x4e/0x4e
> >     msleep+0x31/0x38
> >     igb_down+0x12c/0x22a [igb 6615058754948bfde0bf01429257eb59f13030d4]
> >     __igb_close+0x6f/0x9c [igb 6615058754948bfde0bf01429257eb59f13030d4]
> >     igb_close+0x23/0x2b [igb 6615058754948bfde0bf01429257eb59f13030d4]
> >     __dev_close_many+0x95/0xec
> >     dev_close_many+0x6e/0x103
> >     unregister_netdevice_many+0x105/0x5b1
> >     unregister_netdevice_queue+0xc2/0x10d
> >     unregister_netdev+0x1c/0x23
> >     igb_remove+0xa7/0x11c [igb 6615058754948bfde0bf01429257eb59f13030d4]
> >     pci_device_remove+0x3f/0x9c
> >     device_release_driver_internal+0xfe/0x1b4
> >     pci_stop_bus_device+0x5b/0x7f
> >     pci_stop_bus_device+0x30/0x7f
> >     pci_stop_bus_device+0x30/0x7f
> >     pci_stop_and_remove_bus_device+0x12/0x19
> >     pciehp_unconfigure_device+0x76/0xe9
> >     pciehp_disable_slot+0x6e/0x131
> >     pciehp_handle_presence_or_link_change+0x7a/0x3f7
> >     pciehp_ist+0xbe/0x194
> >     irq_thread_fn+0x22/0x4d
> >     ? irq_thread+0x1fd/0x1fd
> >     irq_thread+0x17b/0x1fd
> >     ? irq_forced_thread_fn+0x5f/0x5f
> >     kthread+0x142/0x153
> >     ? __irq_get_irqchip_state+0x46/0x46
> >     ? kthread_associate_blkcg+0x71/0x71
> >     ret_from_fork+0x1f/0x30
> >
> > In this case, igb_io_error_detected detaches the network interface
> > and requests a PCIE slot reset, however, the PCIE reset callback is
> > not being invoked and thus the Ethernet connection breaks down.
> > As the PCIE error in this case is a non-fatal one, requesting a
> > slot reset can be avoided.
> > This patch fixes the task hung issue and preserves Ethernet
> > connection by ignoring non-fatal PCIE errors.
> >
> > Signed-off-by: Ying Hsu <yinghsu@chromium.org>
> > ---
> > This commit has been tested on a HP Elite Dragonfly Chromebook and
> > a Caldigit TS3+ Thunderbolt hub. The Ethernet driver for the hub
> > is igb. Non-fatal PCIE errors happen when users hot-plug the cables
> > connected to the chromebook or to the external display.
> >
> >  drivers/net/ethernet/intel/igb/igb_main.c | 5 +++++
> >  1 file changed, 5 insertions(+)
> >
> > diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
> > index 58872a4c2540..a8b217368ca1 100644
> > --- a/drivers/net/ethernet/intel/igb/igb_main.c
> > +++ b/drivers/net/ethernet/intel/igb/igb_main.c
> > @@ -9581,6 +9581,11 @@ static pci_ers_result_t igb_io_error_detected(struct pci_dev *pdev,
> >       struct net_device *netdev = pci_get_drvdata(pdev);
> >       struct igb_adapter *adapter = netdev_priv(netdev);
> >
> > +     if (state == pci_channel_io_normal) {
> > +             dev_warn(&pdev->dev, "Non-correctable non-fatal error reported.\n");
> > +             return PCI_ERS_RESULT_CAN_RECOVER;
> > +     }
> > +
> >       netif_device_detach(netdev);
> >
> >       if (state == pci_channel_io_perm_failure)
>
> We are currently stuck with the 5.4 kernel - our embedded system can't easily boot arbitrary kernel versions. The igb driver code has not changed and I'm quite positive the issue still persist in the latest upstream kernel.
>
> This issue is reproducible with aer_inject. 09:00.0 is our i210 which is directly connected to the cpu root port 00:01.1:
>
>
> - - - snip - - -
> [node0_RP0_CPU0:~]$cat aer.log
> AER
> PCI_ID 09:00.0
> UNCOR_STATUS COMP_TIME
> HEADER_LOG 0 1 2 3
> - - - snip - - -
>
> - - - snip - - -
> [node0_RP0_CPU0:~]$aer-inject aer.log
> [369145.900845] pcieport 0000:00:01.1: aer_inject: Injecting errors 00000000/00004000 into device 0000:09:00.0
> [369145.912726] pcieport 0000:00:01.1: AER: Uncorrected (Non-Fatal) error received: 0000:09:00.0
> [369145.923124] igb 0000:09:00.0: AER: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
> [369145.936791] igb 0000:09:00.0: AER:   device [8086:1537] error status/mask=00004000/00000000
> [369145.947068] igb 0000:09:00.0: AER:    [14] CmpltTO
> [369145.954602] igb 0000:09:00.0: Non-correctable non-fatal error reported.
> [369145.984564] ------------[ cut here ]------------
> [369145.990285] kernel BUG at include/linux/netdevice.h:529!
> [369145.996860] invalid opcode: 0000 [#1] SMP PTI
> [369146.002267] CPU: 3 PID: 142 Comm: irq/26-aerdrv Kdump: loaded Tainted: G           O      5.4.251-yocto-standard #1
> [369146.015073] Hardware name: Cisco System Inc. SF-D8/Type2 - Board Product Name1, BIOS 1-29-g46d9e72a-s 05/03/2019
> [369146.027570] RIP: 0010:igb_up+0x51/0x160 [igb]
> [369146.032974] Code: d2 eb 16 f0 80 60 60 fe f0 80 60 60 f7 48 83 c2 01 39 93 14 02 00 00 76 13 48 8b 84 d3 08 0f 00 00 48 8b 48 60 83 e1 01 75 d9 <0f> 0b f6 83 11 02 00 00 20 0f 85 c0 00 00 00 48 8b bb 08 0f 00 00
> [369146.055938] RSP: 0018:ffffb29b0045bcf0 EFLAGS: 00010246
> [369146.062399] RAX: ffff8d938d99e400 RBX: ffff8d9398a08740 RCX: 0000000000000000
> [369146.071186] RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffff8d938d99e2c0
> [369146.079974] RBP: ffffb29b0045bd00 R08: 00000000000002ec R09: ffffb29b002fa000
> [369146.088761] R10: 0000000000006f80 R11: 000000000090ef2c R12: ffff8d9398a08000
> [369146.097547] R13: ffff8d9398a08740 R14: ffff8d939ae92c00 R15: ffff8d939ae92c28
> [369146.106335] FS:  0000000000000000(0000) GS:ffff8d939fcc0000(0000) knlGS:0000000000000000
> [369146.116286] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [369146.123381] CR2: 000056318eb418e8 CR3: 000000084e744002 CR4: 00000000003606e0
> [369146.132170] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [369146.140960] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [369146.149747] Call Trace:
> [369146.152826]  ? show_regs.cold+0x1a/0x1f
> [369146.157592]  ? __die+0x90/0xd9
> [369146.161408]  ? die+0x30/0x50
> [369146.165002]  ? do_trap+0x85/0xf0
> [369146.169029]  ? do_error_trap+0x7c/0xb0
> [369146.173695]  ? igb_up+0x51/0x160 [igb]
> [369146.178355]  ? do_invalid_op+0x3c/0x50
> [369146.183019]  ? igb_up+0x51/0x160 [igb]
> [369146.187681]  ? invalid_op+0x7c/0x90
> [369146.192027]  ? igb_up+0x51/0x160 [igb]
> [369146.196691]  ? hrtimer_try_to_cancel+0x2c/0x110
> [369146.202305]  ? schedule_hrtimeout_range_clock+0xa0/0x110
> [369146.208871]  ? hrtimer_init_sleeper+0x90/0x90
> [369146.214277]  ? igb_rx_fifo_flush_82575+0x32/0x270 [igb]
> [369146.220738]  ? igb_configure+0x417/0x650 [igb]
> [369146.226247]  ? igb_up+0x51/0x160 [igb]
> [369146.230913]  igb_io_resume+0x31/0x50 [igb]
> [369146.235998]  report_resume+0x5c/0x80
> [369146.240449]  ? pcie_portdrv_probe+0x70/0x70
> [369146.245631]  pci_walk_bus+0x75/0x90
> [369146.249966]  pcie_do_recovery+0x163/0x280
> [369146.254947]  aer_process_err_devices+0xa2/0xd1
> [369146.260455]  aer_isr.cold+0x52/0xa1
> [369146.264799]  ? __schedule+0x2bf/0x680
> [369146.269349]  ? irq_finalize_oneshot+0xf0/0xf0
> [369146.274754]  irq_thread_fn+0x28/0x50
> [369146.279205]  irq_thread+0xf8/0x180
> [369146.283445]  ? wake_threads_waitq+0x30/0x30
> [369146.288638]  kthread+0x104/0x140
> [369146.292666]  ? irq_thread_check_affinity+0x80/0x80
> [369146.298597]  ? __kthread_cancel_work+0x40/0x40
> [369146.304106]  ret_from_fork+0x35/0x40
> - - - snip - - -
>
> BUG_ON() comes from
>
> - - - snip - - -
> static inline void napi_enable(struct napi_struct *n)
> {
>         BUG_ON(!test_bit(NAPI_STATE_SCHED, &n->state));
>         smp_mb__before_atomic();
>         clear_bit(NAPI_STATE_SCHED, &n->state);
>         clear_bit(NAPI_STATE_NPSVC, &n->state);
> }
> - - - snip - - -
>
> The stack-trace shows that the AER handler starts here and goes into igb_up():
>
> - - - snip - - -
> static void igb_io_resume(struct pci_dev *pdev)
> {
>         struct net_device *netdev = pci_get_drvdata(pdev);
>         struct igb_adapter *adapter = netdev_priv(netdev);
>
>         if (netif_running(netdev)) {
>                 if (igb_up(adapter)) {
>                         dev_err(&pdev->dev, "igb_up failed after reset\n");
>                         return;
>                 }
>         }
>
>        ...
> }
> - - - snip - - -
> Three functions come into the picture:
>
> igb_io_error_detected() { // runs upon AER detection
>   igb_down();  // before this commit
>   noop;        // with this commit
> }
>
> igb_io_resume() { // runs upon AER handling
>   igb_up()
> }
>
> igb_remove() { // runs upon rmmod igb, or surprise down removal (as shown in the commit log)
>   igb_down();
> }
>
> Before this commit, the flow for an AER on 09:00.0 was
>  igb_down; // from igb_io_error_detect() - this sets the NAPI_STATE_SCHED bit
>  igb_up;   // from igb_io_resume()       - that works as it finds a set NAPI_STATE_SCHED bit (and then clears it)
>
> Now with this commit, the flow for an AER on 09:00.0 is
>  noop;     // from igb_io_error_detect() - the NAPI_STATE_SCHED bit is not touched, e.g. it remains cleared
>  igb_up;   // from igb_io_resume()       - BUG_ON() is triggered as it finds a cleared NAPI_STATE_SCHED bit
>
>
> I don't have a means to reproduce this i210 on thunderbolt issue and don't completely understand its flow. Before this commit, the "expected" flow for i210 on thunderbolt was probably
>
>  igb_down; // from igb_io_error_detect() - this sets the NAPI_STATE_SCHED bit
>  igb_up;   // from igb_io_resume()       - that works as it finds a set NAPI_STATE_SCHED bit and cleans it.
>  igb_down; // from igb_remove()          - that works b/c of the previous igb_up()
>
> The bug report shows that this was not the case. I don't completely understand the reported problem: what happens to and in igb_io_resume() and its call to igb_up()? In the commit log backtrace, I see igb_remove() eventually calling __dev_close_many() which clears the __LINK__STATE_START bit. igb_io_resume() does check this bit via netif_running() and calls only then igb_up(). My guess is igb_io_resume() executes after igb_remove() and skips therefore the execution of igb_up(). As we see from the commit log backtrace, that makes igb_remove() starve now in igb_down()->napi_synchronize().
>
> How to fix that, I don't know.
>
>  Stefan
>
>

Stefan Schaeckeler June 6, 2024, 3:46 p.m. UTC | #11

Hello Ying,

On 6/6/24 01:03, Ying Hsu wrote:
> On the CalDigit Thunderbolt Station 3 Plus, we've encountered an issue
> when the USB downstream display connection state changes. The
> problematic sequence observed is:
> ```
> igb_io_error_detected
> igb_down
> igb_io_error_detected
> igb_down
> ```
>
> The second igb_down call blocks at napi_synchronize.

From the backtrace in your commit message, I gain the impression you get a hotplug event for removing the ethernet device. From your commit message I gain the impression you get an AER as well which is handled in igb_io_error_detected()/igb_io_resume(). The problem lies IMHO in the interaction of both.

> Simply avoiding redundant igb_down calls makes the Ethernet of the thunderbolt dock unusable.

I'm not too sure if the current code is even perfect in your use-case. What happens when you get an AER on the ethernet device (without plugging it out at the same time)?

Can you try to AER inject a completion timeout into your ethernet device, similar how I showed it in my previous message? Just replace the bdf 09:00.0 with the bdf of your ethernet device. I expect a kernel crash like we see that on our embedded system.

> If Intel can identify when an Ethernet device is within a Thunderbolt
> tunnel, the patch can be more specific.

 Stefan

igb: Fix igb_down hung on surprise removal

Checks

Commit Message

Comments

Patch