mbox series

[net,0/4] Fix bit timings for m_can_pci (Elkhart Lake)

Message ID cover.1636967198.git.matthias.schiffer@ew.tq-group.com (mailing list archive)
Headers show
Series Fix bit timings for m_can_pci (Elkhart Lake) | expand

Message

Matthias Schiffer Nov. 15, 2021, 9:18 a.m. UTC
This series fixes two issues we found with the setup of the CAN
controller of Intel Elkhart Lake CPUs:

- Patch 1 fixes an incorrect reference clock rate, which caused the
  configured and the actual bitrate always to differ by a factor of 2.
- Patches 2-4 fix a deviation between the driver and the documentation.
  We did not actually see issues without these patches, however we did
  only superficial testing and may just not have hit the specific
  bittiming values that violate the documented limits.


Matthias Schiffer (4):
  can: m_can: pci: fix incorrect reference clock rate
  Revert "can: m_can: remove support for custom bit timing"
  can: m_can: make custom bittiming fields const
  can: m_can: pci: use custom bit timings for Elkhart Lake

 drivers/net/can/m_can/m_can.c     | 24 ++++++++++++----
 drivers/net/can/m_can/m_can.h     |  3 ++
 drivers/net/can/m_can/m_can_pci.c | 48 ++++++++++++++++++++++++++++---
 3 files changed, 65 insertions(+), 10 deletions(-)

Comments

Matthias Schiffer Nov. 16, 2021, 1:58 p.m. UTC | #1
On Mon, 2021-11-15 at 10:18 +0100, Matthias Schiffer wrote:
> This series fixes two issues we found with the setup of the CAN
> controller of Intel Elkhart Lake CPUs:
> 
> - Patch 1 fixes an incorrect reference clock rate, which caused the
>   configured and the actual bitrate always to differ by a factor of 2.
> - Patches 2-4 fix a deviation between the driver and the documentation.
>   We did not actually see issues without these patches, however we did
>   only superficial testing and may just not have hit the specific
>   bittiming values that violate the documented limits.
> 
> 
> Matthias Schiffer (4):
>   can: m_can: pci: fix incorrect reference clock rate
>   Revert "can: m_can: remove support for custom bit timing"
>   can: m_can: make custom bittiming fields const
>   can: m_can: pci: use custom bit timings for Elkhart Lake
> 
>  drivers/net/can/m_can/m_can.c     | 24 ++++++++++++----
>  drivers/net/can/m_can/m_can.h     |  3 ++
>  drivers/net/can/m_can/m_can_pci.c | 48 ++++++++++++++++++++++++++++---
>  3 files changed, 65 insertions(+), 10 deletions(-)
> 

I just noticed that m_can_pci is completely broken on 5.15.2, while
it's working fine on 5.14.y.

I assume something simliar to [1] will be necessary in m_can_pci as
well, however I'm not really familiar with the driver. There is no
"mram_base" in m_can_plat_pci, only "base". Is using "base" with
iowrite32/ioread32 + manual increment the correct solution here?


[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=99d173fbe8944861a00ebd1c73817a1260d21e60
Jarkko Nikula Nov. 17, 2021, 12:14 p.m. UTC | #2
Hi

On 11/16/21 3:58 PM, Matthias Schiffer wrote:
> I just noticed that m_can_pci is completely broken on 5.15.2, while
> it's working fine on 5.14.y.
> 
Hmm.. so that may explain why I once saw candump received just zeroes on 
v5.15-rc something but earlier kernels were ok. What's odd then next 
time v5.15-rc was ok so went blaming sun spots instead of bisecting.

> I assume something simliar to [1] will be necessary in m_can_pci as
> well, however I'm not really familiar with the driver. There is no
> "mram_base" in m_can_plat_pci, only "base". Is using "base" with
> iowrite32/ioread32 + manual increment the correct solution here?
> 
> 
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=99d173fbe8944861a00ebd1c73817a1260d21e60
> 
If your test case after 5.15 reliably fails are you able to bisect or 
check does the regression originate from the same commit?

Jarkko
Matthias Schiffer Nov. 18, 2021, 2:47 p.m. UTC | #3
On Wed, 2021-11-17 at 14:14 +0200, Jarkko Nikula wrote:
> Hi
> 
> On 11/16/21 3:58 PM, Matthias Schiffer wrote:
> > I just noticed that m_can_pci is completely broken on 5.15.2, while
> > it's working fine on 5.14.y.
> > 
> 
> Hmm.. so that may explain why I once saw candump received just zeroes on 
> v5.15-rc something but earlier kernels were ok. What's odd then next 
> time v5.15-rc was ok so went blaming sun spots instead of bisecting.
> 
> > I assume something simliar to [1] will be necessary in m_can_pci as
> > well, however I'm not really familiar with the driver. There is no
> > "mram_base" in m_can_plat_pci, only "base". Is using "base" with
> > iowrite32/ioread32 + manual increment the correct solution here?
> > 
> > 
> > [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=99d173fbe8944861a00ebd1c73817a1260d21e60
> > 
> 
> If your test case after 5.15 reliably fails are you able to bisect or 
> check does the regression originate from the same commit?
> 
> Jarkko

The Fixes tag of 99d173fbe894 ("can: m_can: fix iomap_read_fifo() and
iomap_write_fifo()") is off AFAICT, the actual breakage happened with
812270e5445b ("can: m_can: Batch FIFO writes during CAN transmit") and
1aa6772f64b4 ("can: m_can: Batch FIFO reads during CAN receive");
reverting these two patches fixes the regression.

I just sent a patch for m_can_pci that applies the same fix that was
done in m_can_platform in 99d173fbe894, and verified that this makes
CAN communication work on Elkhart Lake with Linux 5.15.y together with
my other patches.

Matthias