Message ID | 20240810002302.2054816-1-vinicius.gomes@intel.com (mailing list archive) |
---|---|
State | Awaiting Upstream |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [iwl-net,v1] igb: Fix not clearing TimeSync interrupts for 82580 | expand |
Dear Vinicius, Thank you for the patch. Am 10.08.24 um 02:23 schrieb Vinicius Costa Gomes: > It was reported that 82580 NICs have a hardware bug that makes it > necessary to write into the TSICR (TimeSync Interrupt Cause) register > to clear it. Were you able to verify that report by checking against some errata? Is Intel aware of the problem? > Add a conditional so only for 82580 we write into the TSICR register, > so we don't risk losing events for other models. > > This (partially) reverts commit ee14cc9ea19b ("igb: Fix missing time sync events"). > > Fixes: ee14cc9ea19b ("igb: Fix missing time sync events") > Reported-by: Daiwei Li <daiweili@gmail.com> > Closes: https://lore.kernel.org/intel-wired-lan/CAN0jFd1kO0MMtOh8N2Ztxn6f7vvDKp2h507sMryobkBKe=xk=w@mail.gmail.com/ > Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> > --- > > @Daiwei Li, I don't have a 82580 handy, please confirm that the patch > fixes the issue you are having. Please also add a description of the test case, and maybe the PCI vendor and device code of your network device. > drivers/net/ethernet/intel/igb/igb_main.c | 27 ++++++++++++++++++----- > 1 file changed, 22 insertions(+), 5 deletions(-) > > diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c > index 11be39f435f3..edb34f67ae03 100644 > --- a/drivers/net/ethernet/intel/igb/igb_main.c > +++ b/drivers/net/ethernet/intel/igb/igb_main.c > @@ -6960,31 +6960,48 @@ static void igb_extts(struct igb_adapter *adapter, int tsintr_tt) > static void igb_tsync_interrupt(struct igb_adapter *adapter) > { > struct e1000_hw *hw = &adapter->hw; > - u32 tsicr = rd32(E1000_TSICR); > + u32 ack = 0, tsicr = rd32(E1000_TSICR); > struct ptp_clock_event event; > > if (tsicr & TSINTR_SYS_WRAP) { > event.type = PTP_CLOCK_PPS; > if (adapter->ptp_caps.pps) > ptp_clock_event(adapter->ptp_clock, &event); > + ack |= TSINTR_SYS_WRAP; > } > > if (tsicr & E1000_TSICR_TXTS) { > /* retrieve hardware timestamp */ > schedule_work(&adapter->ptp_tx_work); > + ack |= E1000_TSICR_TXTS; > } > > - if (tsicr & TSINTR_TT0) > + if (tsicr & TSINTR_TT0) { > igb_perout(adapter, 0); > + ack |= TSINTR_TT0; > + } > > - if (tsicr & TSINTR_TT1) > + if (tsicr & TSINTR_TT1) { > igb_perout(adapter, 1); > + ack |= TSINTR_TT1; > + } > > - if (tsicr & TSINTR_AUTT0) > + if (tsicr & TSINTR_AUTT0) { > igb_extts(adapter, 0); > + ack |= TSINTR_AUTT0; > + } > > - if (tsicr & TSINTR_AUTT1) > + if (tsicr & TSINTR_AUTT1) { > igb_extts(adapter, 1); > + ack |= TSINTR_AUTT1; > + } > + > + if (hw->mac.type == e1000_82580) { > + /* 82580 has a hardware bug that requires a explicit a*n* > + * write to clear the TimeSync interrupt cause. > + */ > + wr32(E1000_TSICR, ack); > + } Is there a nicer way to write this, so `ack` is only assigned in case for the 82580? > } > > static irqreturn_t igb_msix_other(int irq, void *data) Kind regards, Paul
On Fri, Aug 09, 2024 at 05:23:02PM -0700, Vinicius Costa Gomes wrote: > It was reported that 82580 NICs have a hardware bug that makes it > necessary to write into the TSICR (TimeSync Interrupt Cause) register > to clear it. Bug, or was it a feature? Or IOW, maybe i210 changed the semantics of the TSICR? And what about the 82576? Thanks, Richard
> @Daiwei Li, I don't have a 82580 handy, please confirm that the patch fixes the issue you are having. Thank you for the patch! I can confirm it fixes my issue. Below I offer a patch that also works in response to Paul's feedback. > Please also add a description of the test case I am running ptp4l to serve PTP to a client device attached to the NIC. To test, I am rebuilding igb.ko and reloading it. Without this patch, I see repeatedly in the output of ptp4l: > timed out while polling for tx timestamp increasing tx_timestamp_timeout or > increasing kworker priority may correct this issue, but a driver bug likely > causes it as well as my client device failing to sync time. > and maybe the PCI vendor and device code of your network device. % lspci -nn | grep Network 17:00.0 Ethernet controller [0200]: Intel Corporation 82580 Gigabit Network Connection [8086:150e] (rev 01) 17:00.1 Ethernet controller [0200]: Intel Corporation 82580 Gigabit Network Connection [8086:150e] (rev 01) 17:00.2 Ethernet controller [0200]: Intel Corporation 82580 Gigabit Network Connection [8086:150e] (rev 01) 17:00.3 Ethernet controller [0200]: Intel Corporation 82580 Gigabit Network Connection [8086:150e] (rev 01) > Bug, or was it a feature? According to https://lore.kernel.org/all/CDCB8BE0.1EC2C%25matthew.vick@intel.com/ it was a bug. It looks like the datasheet was not updated to acknowledge this bug: https://www.intel.com/content/www/us/en/content-details/333167/intel-82580-eb-82580-db-gbe-controller-datasheet.html (section 8.17.28.1). > Is there a nicer way to write this, so `ack` is only assigned in case > for the 82580? diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c index ada42ba63549..87ec1258e22a 100644 --- a/drivers/net/ethernet/intel/igb/igb_main.c +++ b/drivers/net/ethernet/intel/igb/igb_main.c @@ -6986,6 +6986,10 @@ static void igb_tsync_interrupt(struct igb_adapter *adapter) struct e1000_hw *hw = &adapter->hw; u32 tsicr = rd32(E1000_TSICR); struct ptp_clock_event event; + const u32 mask = (TSINTR_SYS_WRAP | E1000_TSICR_TXTS | + TSINTR_TT0 | TSINTR_TT1 | + TSINTR_AUTT0 | TSINTR_AUTT1); + if (tsicr & TSINTR_SYS_WRAP) { event.type = PTP_CLOCK_PPS; @@ -7009,6 +7013,13 @@ static void igb_tsync_interrupt(struct igb_adapter *adapter) if (tsicr & TSINTR_AUTT1) igb_extts(adapter, 1); + + if (hw->mac.type == e1000_82580) { + /* 82580 has a hardware bug that requires a explicit + * write to clear the TimeSync interrupt cause. + */ + wr32(E1000_TSICR, tsicr & mask); + } } On Fri, Aug 9, 2024 at 10:04 PM Richard Cochran <richardcochran@gmail.com> wrote: > > On Fri, Aug 09, 2024 at 05:23:02PM -0700, Vinicius Costa Gomes wrote: > > It was reported that 82580 NICs have a hardware bug that makes it > > necessary to write into the TSICR (TimeSync Interrupt Cause) register > > to clear it. > > Bug, or was it a feature? > > Or IOW, maybe i210 changed the semantics of the TSICR? > > And what about the 82576? > > Thanks, > Richard
Hi, Daiwei Li <daiweili@gmail.com> writes: >> @Daiwei Li, I don't have a 82580 handy, please confirm that the patch > fixes the issue you are having. > > Thank you for the patch! I can confirm it fixes my issue. Below I offer a > patch that also works in response to Paul's feedback. > Your patch looks better than mine. I would suggest for you to go ahead and propose yours for inclusion. >> Please also add a description of the test case > > I am running ptp4l to serve PTP to a client device attached to the NIC. > To test, I am rebuilding igb.ko and reloading it. > Without this patch, I see repeatedly in the output of ptp4l: > >> timed out while polling for tx timestamp increasing tx_timestamp_timeout or >> increasing kworker priority may correct this issue, but a driver bug likely >> causes it > > as well as my client device failing to sync time. > >> and maybe the PCI vendor and device code of your network device. > > % lspci -nn | grep Network > 17:00.0 Ethernet controller [0200]: Intel Corporation 82580 Gigabit > Network Connection [8086:150e] (rev 01) > 17:00.1 Ethernet controller [0200]: Intel Corporation 82580 Gigabit > Network Connection [8086:150e] (rev 01) > 17:00.2 Ethernet controller [0200]: Intel Corporation 82580 Gigabit > Network Connection [8086:150e] (rev 01) > 17:00.3 Ethernet controller [0200]: Intel Corporation 82580 Gigabit > Network Connection [8086:150e] (rev 01) > >> Bug, or was it a feature? > > According to https://lore.kernel.org/all/CDCB8BE0.1EC2C%25matthew.vick@intel.com/ > it was a bug. It looks like the datasheet was not updated to > acknowledge this bug: > https://www.intel.com/content/www/us/en/content-details/333167/intel-82580-eb-82580-db-gbe-controller-datasheet.html > (section 8.17.28.1). > >> Is there a nicer way to write this, so `ack` is only assigned in case >> for the 82580? > > diff --git a/drivers/net/ethernet/intel/igb/igb_main.c > b/drivers/net/ethernet/intel/igb/igb_main.c > index ada42ba63549..87ec1258e22a 100644 > --- a/drivers/net/ethernet/intel/igb/igb_main.c > +++ b/drivers/net/ethernet/intel/igb/igb_main.c > @@ -6986,6 +6986,10 @@ static void igb_tsync_interrupt(struct > igb_adapter *adapter) > struct e1000_hw *hw = &adapter->hw; > u32 tsicr = rd32(E1000_TSICR); > struct ptp_clock_event event; > + const u32 mask = (TSINTR_SYS_WRAP | E1000_TSICR_TXTS | > + TSINTR_TT0 | TSINTR_TT1 | > + TSINTR_AUTT0 | TSINTR_AUTT1); > + > > if (tsicr & TSINTR_SYS_WRAP) { > event.type = PTP_CLOCK_PPS; > @@ -7009,6 +7013,13 @@ static void igb_tsync_interrupt(struct > igb_adapter *adapter) > > if (tsicr & TSINTR_AUTT1) > igb_extts(adapter, 1); > + > + if (hw->mac.type == e1000_82580) { > + /* 82580 has a hardware bug that requires a explicit > + * write to clear the TimeSync interrupt cause. > + */ > + wr32(E1000_TSICR, tsicr & mask); Yeah, I should have thought about that, that writing '1' into an interrupr that is cleared should be fine. > + } > } > On Fri, Aug 9, 2024 at 10:04 PM Richard Cochran > <richardcochran@gmail.com> wrote: >> >> On Fri, Aug 09, 2024 at 05:23:02PM -0700, Vinicius Costa Gomes wrote: >> > It was reported that 82580 NICs have a hardware bug that makes it >> > necessary to write into the TSICR (TimeSync Interrupt Cause) register >> > to clear it. >> >> Bug, or was it a feature? >> >> Or IOW, maybe i210 changed the semantics of the TSICR? >> >> And what about the 82576? >> >> Thanks, >> Richard
On 2024-08-10 at 05:53:02, Vinicius Costa Gomes (vinicius.gomes@intel.com) wrote: > @@ -6960,31 +6960,48 @@ static void igb_extts(struct igb_adapter *adapter, int tsintr_tt) > static void igb_tsync_interrupt(struct igb_adapter *adapter) > { > struct e1000_hw *hw = &adapter->hw; > - u32 tsicr = rd32(E1000_TSICR); > + u32 ack = 0, tsicr = rd32(E1000_TSICR); nit: reverse xmas tree.
On Tue, 13 Aug 2024 09:42:53 +0530 Ratheesh Kannoth wrote: > On 2024-08-10 at 05:53:02, Vinicius Costa Gomes (vinicius.gomes@intel.com) wrote: > > @@ -6960,31 +6960,48 @@ static void igb_extts(struct igb_adapter *adapter, int tsintr_tt) > > static void igb_tsync_interrupt(struct igb_adapter *adapter) > > { > > struct e1000_hw *hw = &adapter->hw; > > - u32 tsicr = rd32(E1000_TSICR); > > + u32 ack = 0, tsicr = rd32(E1000_TSICR); > nit: reverse xmas tree. Please read the last paragraph below and disseminate this information among your colleagues at Marvell. Reviewer guidance ----------------- Reviewing other people's patches on the list is highly encouraged, regardless of the level of expertise. For general guidance and helpful tips please see :ref:`development_advancedtopics_reviews`. It's safe to assume that netdev maintainers know the community and the level of expertise of the reviewers. The reviewers should not be concerned about their comments impeding or derailing the patch flow. Less experienced reviewers are highly encouraged to do more in-depth review of submissions and not focus exclusively on trivial or subjective matters like code formatting, tags etc. See: https://www.kernel.org/doc/html/next/process/maintainer-netdev.html#reviewer-guidance
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c index 11be39f435f3..edb34f67ae03 100644 --- a/drivers/net/ethernet/intel/igb/igb_main.c +++ b/drivers/net/ethernet/intel/igb/igb_main.c @@ -6960,31 +6960,48 @@ static void igb_extts(struct igb_adapter *adapter, int tsintr_tt) static void igb_tsync_interrupt(struct igb_adapter *adapter) { struct e1000_hw *hw = &adapter->hw; - u32 tsicr = rd32(E1000_TSICR); + u32 ack = 0, tsicr = rd32(E1000_TSICR); struct ptp_clock_event event; if (tsicr & TSINTR_SYS_WRAP) { event.type = PTP_CLOCK_PPS; if (adapter->ptp_caps.pps) ptp_clock_event(adapter->ptp_clock, &event); + ack |= TSINTR_SYS_WRAP; } if (tsicr & E1000_TSICR_TXTS) { /* retrieve hardware timestamp */ schedule_work(&adapter->ptp_tx_work); + ack |= E1000_TSICR_TXTS; } - if (tsicr & TSINTR_TT0) + if (tsicr & TSINTR_TT0) { igb_perout(adapter, 0); + ack |= TSINTR_TT0; + } - if (tsicr & TSINTR_TT1) + if (tsicr & TSINTR_TT1) { igb_perout(adapter, 1); + ack |= TSINTR_TT1; + } - if (tsicr & TSINTR_AUTT0) + if (tsicr & TSINTR_AUTT0) { igb_extts(adapter, 0); + ack |= TSINTR_AUTT0; + } - if (tsicr & TSINTR_AUTT1) + if (tsicr & TSINTR_AUTT1) { igb_extts(adapter, 1); + ack |= TSINTR_AUTT1; + } + + if (hw->mac.type == e1000_82580) { + /* 82580 has a hardware bug that requires a explicit + * write to clear the TimeSync interrupt cause. + */ + wr32(E1000_TSICR, ack); + } } static irqreturn_t igb_msix_other(int irq, void *data)
It was reported that 82580 NICs have a hardware bug that makes it necessary to write into the TSICR (TimeSync Interrupt Cause) register to clear it. Add a conditional so only for 82580 we write into the TSICR register, so we don't risk losing events for other models. This (partially) reverts commit ee14cc9ea19b ("igb: Fix missing time sync events"). Fixes: ee14cc9ea19b ("igb: Fix missing time sync events") Reported-by: Daiwei Li <daiweili@gmail.com> Closes: https://lore.kernel.org/intel-wired-lan/CAN0jFd1kO0MMtOh8N2Ztxn6f7vvDKp2h507sMryobkBKe=xk=w@mail.gmail.com/ Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> --- @Daiwei Li, I don't have a 82580 handy, please confirm that the patch fixes the issue you are having. drivers/net/ethernet/intel/igb/igb_main.c | 27 ++++++++++++++++++----- 1 file changed, 22 insertions(+), 5 deletions(-)