diff mbox

[v3,4/4] can: flexcan: add vf610 support for FlexCAN

Message ID e88108c655e2c16dc9cf6c4194547e73@agner.ch (mailing list archive)
State New, archived
Headers show

Commit Message

Stefan Agner July 28, 2014, 4:20 p.m. UTC
Am 2014-07-25 15:33, schrieb Marc Kleine-Budde:
> On 07/25/2014 12:50 PM, Stefan Agner wrote:
>> Am 2014-07-16 08:43, schrieb Stefan Agner:
>>> Am 2014-07-15 16:24, schrieb Marc Kleine-Budde:
>>> <snip>
>>>>> @@ -150,18 +171,20 @@
>>>>>   * FLEXCAN hardware feature flags
>>>>>   *
>>>>>   * Below is some version info we got:
>>>>> - *    SOC   Version   IP-Version  Glitch-  [TR]WRN_INT
>>>>> - *                                Filter?   connected?
>>>>> - *   MX25  FlexCAN2  03.00.00.00     no         no
>>>>> - *   MX28  FlexCAN2  03.00.04.00    yes        yes
>>>>> - *   MX35  FlexCAN2  03.00.00.00     no         no
>>>>> - *   MX53  FlexCAN2  03.00.00.00    yes         no
>>>>> - *   MX6s  FlexCAN3  10.00.12.00    yes        yes
>>>>> + *    SOC   Version   IP-Version  Glitch-  [TR]WRN_INT  Memory err
>>>>> + *                                Filter?   connected?  detection
>>>>> + *   MX25  FlexCAN2  03.00.00.00     no         no         no
>>>>> + *   MX28  FlexCAN2  03.00.04.00    yes        yes         no
>>>>> + *   MX35  FlexCAN2  03.00.00.00     no         no         no
>>>>> + *   MX53  FlexCAN2  03.00.00.00    yes         no         no
>>>>> + *   MX6s  FlexCAN3  10.00.12.00    yes        yes         no
>>>>> + *   VF610 FlexCAN3  ?               no         no        yes
>>>>                                         ^^         ^^
>>>>
>>>> Can you check the datasheet if the flexcan core has a "Glitch Filter
>>>> Width Register (FLEXCANx_GFWR)"
>>>
>>>
>>> There is no such register called GFWR/Glitch Filter or similar.
>>>
>>>> Can you check if the core generates a warning interrupt with the current
>>>> setup, if you don't switch on bus error reporting? This means internally
>>>> the [TR]WRN_INT is connected and works as specified.
>>>
>>> Ok, so I disabled TWRNMSK (Bit 11) in the control register and printed
>>> out the error and status register (ESR1), this is what I get:
>>> [  191.285295] flexcan_irq, esr=00040080
>>> [  191.288996] flexcan_irq, ctrl=17092051
>>>
>>> Bit 17 (TWRNINT) is not set while TWRNMSK is disabled. Hence [TR]WRN_INT
>>> is not connected?
>>
>> Ping. Anything open/to do from my side?
> 
> Please keep the printing of esr and ctrl in the interrupt handler, add a
> #define DEBUG in the driver, but do not change anything else. Then:
> 

Ok, my changes look like this:

                flexcan_write(reg_esr & FLEXCAN_ESR_ALL_INT,
&regs->esr);
@@ -885,8 +889,8 @@ static int flexcan_chip_start(struct net_device
*dev)
         */
        reg_ctrl = flexcan_read(&regs->ctrl);
        reg_ctrl &= ~FLEXCAN_CTRL_TSYN;
-       reg_ctrl |= FLEXCAN_CTRL_BOFF_REC | FLEXCAN_CTRL_LBUF |
-               FLEXCAN_CTRL_ERR_STATE;
+       reg_ctrl |= FLEXCAN_CTRL_BOFF_REC | FLEXCAN_CTRL_LBUF;// |
+               //FLEXCAN_CTRL_ERR_STATE;
        /*
         * enable the "error interrupt" (FLEXCAN_CTRL_ERR_MSK),
         * on most Flexcan cores, too. Otherwise we don't get

I'm not sure whether you really want to keep the FLEXCAN_CTRL_ERR_STATE
commented out...


> start the driver as usual:
> - configure bitrate
> - ifconfig can0 up
> - _do_not_ enable bit error reporint
> - start candump from gitorious with:
>   candump -e any,0:0,#FFFFFFFF
> - short circuit CAN-High and CAN-Low
> - send a CAN frame


root@colibri-vf:~# ip link set can0 type can bitrate 500000
[  146.140862] flexcan 40020000.flexcan can0: bitrate error 0.7%
root@colibri-vf:~# ip link set can0 up
[  146.661430] flexcan 40020000.flexcan can0: writing ctrl=0x17092001
[  146.667902] flexcan 40020000.flexcan can0: flexcan_set_bittiming:
mcr=0x5980000f ctrl=0x17092001
[  146.676790] flexcan 40020000.flexcan can0: flexcan_chip_start:
writing mcr=0x79a20208
[  146.684709] flexcan 40020000.flexcan can0: flexcan_chip_start:
writing ctrl=0x17092051
[  146.698228] flexcan 40020000.flexcan can0: flexcan_chip_start:
reading mcr=0x60a20208 ctrl=0x1709205

# cansend can0 1F334455#1122334455667788
interface = can0, family = 29, type = 3, proto = 

Nothing happens on candump.
 
> Another test is to setup a proper CAN bus, but configure the a second
> CAN node with a different bitrate. Start the can0 on vf610 as usual,
> then candump -e any,0:0,#FFFFFFFF, but do not send any CAN frames on the
> vf610. Then on the other system keep sending CAN frames, you might have
> to restart the CAN on the second system....

I'm using a PCAN-USB from Peak System for this test. When running with
the same bitrates on both sides (500000, things work smoothly as
expected:

# ip link set can0 type can bitrate 500000
# ip link set can0 up
# cansend can0 1F334455#1122334455667788

root@colibri-vf:~# ./candump -e any,0:0,#FFFFFFFF
  can0  1F334455   [8]  11 22 33 44 55 66 77 88

dmesg:
[  541.772502] flexcan_irq, esr=00040180
[  541.776207] flexcan_irq, ctrl=17092051

Then I reconfigure the system on my host:
# ip link set can0 down
# ip link set can0 type can bitrate 1000000
# ip link set can0 up
# cansend can0 1F334455#1122334455667788

Nothing happens on Vybrid side.


However, I got once this Kernel panic after I reconfigured to normal
mode. But I could not reproduce this.

[  461.954394] flexcan_irq, esr=00059d82
[  461.958093] flexcan_irq, ctrl=17092051
[  461.961873] ------------[ cut here ]------------
[  461.966536] WARNING: CPU: 0 PID: 0 at kernel/locking/mutex.c:826
mutex_trylock+0x1b0/0x208()
[  461.974982] DEBUG_LOCKS_WARN_ON(in_interrupt())
[  461.979347] Modules linked in:
[  461.982621] CPU: 0 PID: 0 Comm: swapper/0 Not tainted
3.16.0-rc4-00017-ge9a974f-dirty #215
[  461.990896] Backtrace:
[  461.993412] [<80011e1c>] (dump_backtrace) from [<80011fb8>]
(show_stack+0x18/0x1c)
[  462.000991]  r6:806718ec r5:00000000 r4:00000000 r3:00000000
[  462.006836] [<80011fa0>] (show_stack) from [<8066e160>]
(dump_stack+0x88/0xa4)
[  462.014143] [<8066e0d8>] (dump_stack) from [<80028f48>]
(warn_slowpath_common+0x70/0x94)
[  462.022280]  r5:00000009 r4:808f9d50
[  462.025920] [<80028ed8>] (warn_slowpath_common) from [<80029010>]
(warn_slowpath_fmt+0x38/0x40)
[  462.034666]  r8:808f9e14 r7:8110cf7c r6:804c2fa0 r5:8e9ee000
r4:8094cdcc
[  462.041478] [<80028fdc>] (warn_slowpath_fmt) from [<806718ec>]
(mutex_trylock+0x1b0/0x208)
[  462.049792]  r3:807e3f6c r2:807e1b8c
[  462.053466] [<8067173c>] (mutex_trylock) from [<804c2fa0>]
(clk_prepare_lock+0x14/0xec)
[  462.061479]  r7:90880000 r6:8e81f800 r5:8e9ee000 r4:8e81f800
[  462.067280] [<804c2f8c>] (clk_prepare_lock) from [<804c50d0>]
(clk_prepare+0x14/0x2c)
[  462.075158]  r6:8e81f800 r5:8e9ee000 r4:8e81f800 r3:00000000
[  462.080918] [<804c50bc>] (clk_prepare) from [<803e7a50>]
(flexcan_get_berr_counter+0x24/0xd0)
[  462.089487]  r4:00000001 r3:00000000
[  462.093156] [<803e7a2c>] (flexcan_get_berr_counter) from [<803e895c>]
(flexcan_poll+0x40c/0x58c)
[  462.101992]  r8:0000000a r7:0000000a r6:8e372388 r5:8e172a80
r4:00000001 r3:00000000
[  462.109843] [<803e8550>] (flexcan_poll) from [<80511f90>]
(net_rx_action+0xcc/0x1b4)
[  462.117630]  r10:8e9ee6c0 r9:00000003 r8:0000000a r7:8fdde188
r6:808fa0c0 r5:8fdde180
[  462.125587]  r4:0000012c
[  462.128164] [<80511ec4>] (net_rx_action) from [<8002d290>]
(__do_softirq+0x120/0x26c)
[  462.136038]  r10:808fa080 r9:00000003 r8:00000100 r7:00000003
r6:808f8000 r5:808fa08c
[  462.143994]  r4:00000000
[  462.146566] [<8002d170>] (__do_softirq) from [<8002d6d4>]
(irq_exit+0xb0/0x104)
[  462.153923]  r10:806788ac r9:808f8000 r8:00000000 r7:0000005a
r6:808f8000 r5:808f5e2c
[  462.161877]  r4:808f8000
[  462.164459] [<8002d624>] (irq_exit) from [<8000f4bc>]
(handle_IRQ+0x58/0xb8)
[  462.171520]  r4:80900d2c r3:000000a0
[  462.175200] [<8000f464>] (handle_IRQ) from [<800086c0>]
(gic_handle_irq+0x30/0x68)
[  462.182818]  r8:8095c587 r7:90802100 r6:808f9f28 r5:80900ea0
r4:9080210c r3:000000a0
[  462.190671] [<80008690>] (gic_handle_irq) from [<80012ae4>]
(__irq_svc+0x44/0x5c)
[  462.198203] Exception stack(0x808f9f28 to 0x808f9f70)
[  462.203315] 9f20:                   00000001 00000001 00000000
80904070 8090098c 80900938
[  462.211517] 9f40: 8095c587 00000000 8095c587 808f8000 806788ac
808f9f7c 808f9f40 808f9f70
[  462.219746] 9f60: 80066a98 8000f834 20000013 ffffffff
[  462.224852]  r7:808f9f5c r6:ffffffff r5:20000013 r4:8000f834
[  462.230621] [<8000f80c>] (arch_cpu_idle) from [<8005fba4>]
(cpu_startup_entry+0x104/0x16c)
[  462.238964] [<8005faa0>] (cpu_startup_entry) from [<80668cf8>]
(rest_init+0xb0/0xd8)
[  462.246757]  r7:8ffffcc0 r3:00000000
[  462.250419] [<80668c48>] (rest_init) from [<808a5bf8>]
(start_kernel+0x340/0x3ac)
[  462.257979]  r5:8095c7c0 r4:80900a38
[  462.261620] [<808a58b8>] (start_kernel) from [<80008074>]
(0x80008074)
[  462.268204] ---[ end trace c9c9dee0ce2272a7 ]---
[  462.272899] flexcan 40020000.flexcan can0: Error Warning IRQ

candump showed this:
  can0  20000004   [8]  00 04 00 00 00 00 00 00   ERRORFRAME
	controller-problem{rx-error-warning}

> 
> Please send the outputs of candump and demsg to the list. It should
> generate a warning, error passive and finally a busoff message.
> 

--
Stefan

Comments

Marc Kleine-Budde July 28, 2014, 4:28 p.m. UTC | #1
On 07/28/2014 06:20 PM, Stefan Agner wrote:
>>> Ping. Anything open/to do from my side?
>>
>> Please keep the printing of esr and ctrl in the interrupt handler, add a
>> #define DEBUG in the driver, but do not change anything else. Then:
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> diff --git a/drivers/net/can/flexcan.c b/drivers/net/can/flexcan.c
> index 1c31a5d..fe8b81c 100644
> --- a/drivers/net/can/flexcan.c
> +++ b/drivers/net/can/flexcan.c
> @@ -19,6 +19,7 @@
>   *
>   */
>  
> +#define DEBUG

keep

>  #include <linux/netdevice.h>
>  #include <linux/can.h>
>  #include <linux/can/dev.h>
> @@ -743,6 +744,9 @@ static irqreturn_t flexcan_irq(int irq, void
> *dev_id)
>  
>         reg_iflag1 = flexcan_read(&regs->iflag1);
>         reg_esr = flexcan_read(&regs->esr);
> +
> +       printk("flexcan_irq, esr=%08x\n", reg_esr);
> +       printk("flexcan_irq, ctrl=%08x\n", flexcan_read(&regs->ctrl));

keep

>         /* ACK all bus error and state change IRQ sources */
>         if (reg_esr & FLEXCAN_ESR_ALL_INT)
>                 flexcan_write(reg_esr & FLEXCAN_ESR_ALL_INT,
> &regs->esr);
> @@ -885,8 +889,8 @@ static int flexcan_chip_start(struct net_device
> *dev)
>          */
>         reg_ctrl = flexcan_read(&regs->ctrl);
>         reg_ctrl &= ~FLEXCAN_CTRL_TSYN;
> -       reg_ctrl |= FLEXCAN_CTRL_BOFF_REC | FLEXCAN_CTRL_LBUF |
> -               FLEXCAN_CTRL_ERR_STATE;
> +       reg_ctrl |= FLEXCAN_CTRL_BOFF_REC | FLEXCAN_CTRL_LBUF;// |
> +               //FLEXCAN_CTRL_ERR_STATE;
>         /*
>          * enable the "error interrupt" (FLEXCAN_CTRL_ERR_MSK),
>          * on most Flexcan cores, too. Otherwise we don't get
> 
> I'm not sure whether you really want to keep the FLEXCAN_CTRL_ERR_STATE
> commented out...

No, please remove this change and redo the test.

Marc
Stefan Agner July 29, 2014, 7:29 a.m. UTC | #2
Am 2014-07-28 18:28, schrieb Marc Kleine-Budde:
> On 07/28/2014 06:20 PM, Stefan Agner wrote:
>> I'm not sure whether you really want to keep the FLEXCAN_CTRL_ERR_STATE
>> commented out...
> 
> No, please remove this change and redo the test.
> 

Ok, removed that change and did the tests again:

== Wrong Bitrate test

[  146.485022] flexcan 40020000.flexcan can0: bitrate error 0.7%
[  148.401793] flexcan 40020000.flexcan can0: writing ctrl=0x17092001
[  148.408263] flexcan 40020000.flexcan can0: flexcan_set_bittiming:
mcr=0x5980000f ctrl=0x17092001
[  148.408298] flexcan 40020000.flexcan can0: flexcan_chip_start:
writing mcr=0x79a20208
[  148.408328] flexcan 40020000.flexcan can0: flexcan_chip_start:
writing ctrl=0x1709ac51
[  148.414373] flexcan 40020000.flexcan can0: flexcan_chip_start:
reading mcr=0x60a20208 ctrl=0x1709ac51
^---- Initialization
[  152.968685] flexcan_irq, esr=00040080
[  152.972386] flexcan_irq, ctrl=1709ac51
[  155.488623] flexcan_irq, esr=00040080
[  155.492326] flexcan_irq, ctrl=1709ac51
^---- Two messages with right bitrate
[  171.014124] flexcan_irq, esr=00058d0a
[  171.017823] flexcan_irq, ctrl=1709ac51
^---- One message with wrong bitrate
[  171.021631] flexcan 40020000.flexcan can0: Error Warning IRQ
[  171.021660] flexcan 40020000.flexcan can0: Error Passive IRQ

# ./candump -e any,0:0,#FFFFFFFF
  can0  1F334455   [8]  11 22 33 44 55 66 77 88
  can0  1F334455   [8]  11 22 33 44 55 66 77 88
  can0  20000004   [8]  00 10 00 00 00 00 00 00   ERRORFRAME
	controller-problem{rx-error-passive}

== Loopback test

[  464.003776] flexcan 40020000.flexcan can0: writing ctrl=0x17092051
[  464.010005] flexcan 40020000.flexcan can0: flexcan_set_bittiming:
mcr=0x5980000f ctrl=0x17092051
[  464.010035] flexcan 40020000.flexcan can0: flexcan_chip_start:
writing mcr=0x79a20208
[  464.010064] flexcan 40020000.flexcan can0: flexcan_chip_start:
writing ctrl=0x1709ac51
[  464.010199] flexcan 40020000.flexcan can0: flexcan_chip_start:
reading mcr=0x60a20208 ctrl=0x1709ac51
[  464.011682] IPv6: ADDRCONF(NETDEV_CHANGE): can0: link becomes ready
[  470.970133] flexcan_irq, esr=00064252
[  470.974833] flexcan_irq, ctrl=1709ac51
[  470.980171] flexcan 40020000.flexcan can0: Error Warning IRQ
[  470.980203] flexcan 40020000.flexcan can0: Error Passive IRQ
[  470.980328] flexcan_irq, esr=00000034
[  470.984998] flexcan_irq, ctrl=1709ac51

# ./candump -e any,0:0,#FFFFFFFF
  can0  20000044   [8]  00 10 00 00 00 00 00 00   ERRORFRAME
	controller-problem{rx-error-passive}
	bus-off

--
Stefan
Marc Kleine-Budde July 30, 2014, 11:47 a.m. UTC | #3
On 07/29/2014 09:29 AM, Stefan Agner wrote:
> Am 2014-07-28 18:28, schrieb Marc Kleine-Budde:
>> On 07/28/2014 06:20 PM, Stefan Agner wrote:
>>> I'm not sure whether you really want to keep the FLEXCAN_CTRL_ERR_STATE
>>> commented out...
>>
>> No, please remove this change and redo the test.
>>
> 
> Ok, removed that change and did the tests again:
> 
> == Wrong Bitrate test
> 
> [  146.485022] flexcan 40020000.flexcan can0: bitrate error 0.7%
> [  148.401793] flexcan 40020000.flexcan can0: writing ctrl=0x17092001
> [  148.408263] flexcan 40020000.flexcan can0: flexcan_set_bittiming:
> mcr=0x5980000f ctrl=0x17092001
> [  148.408298] flexcan 40020000.flexcan can0: flexcan_chip_start:
> writing mcr=0x79a20208
> [  148.408328] flexcan 40020000.flexcan can0: flexcan_chip_start:
> writing ctrl=0x1709ac51
> [  148.414373] flexcan 40020000.flexcan can0: flexcan_chip_start:
> reading mcr=0x60a20208 ctrl=0x1709ac51
> ^---- Initialization
> [  152.968685] flexcan_irq, esr=00040080
> [  152.972386] flexcan_irq, ctrl=1709ac51
> [  155.488623] flexcan_irq, esr=00040080
> [  155.492326] flexcan_irq, ctrl=1709ac51
> ^---- Two messages with right bitrate
> [  171.014124] flexcan_irq, esr=00058d0a
> [  171.017823] flexcan_irq, ctrl=1709ac51
> ^---- One message with wrong bitrate
> [  171.021631] flexcan 40020000.flexcan can0: Error Warning IRQ
> [  171.021660] flexcan 40020000.flexcan can0: Error Passive IRQ

Thanks for the test, so far looks promising :) With this setup the other
CAN node repeats the CAN frame until it's ACKed. Because there is no
node with a compatible bitrate, there is no ACking CAN node.

Can you add a third CAN node to the network. The second and third node
should use the same bitrate, while your vf610 uses a different one. With
the new setup it should take more than one frame until the vf610 goes
into error warning and even more frames to error passive. This way we
can see it the error warning interrupt is connected or not. The error
counters should increase with each "wrong" bitrate frame it sees, you
can check with:

    ip -details link show can0

The output looks like this:

> 4: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN mode DEFAULT qlen 10
>     link/can
>     can state ERROR-ACTIVE (berr-counter tx 0 rx 0) restart-ms 0
                              ^^^^^^^^^^^^^^^^^^^^^^
>     bitrate 1000000 sample-point 0.750
>     tq 125 prop-seg 2 phase-seg1 3 phase-seg2 2 sjw 1
>     sja1000: tseg1 1..16 tseg2 1..8 sjw 1..4 brp 1..64 brp-inc 1
>     clock 8000000

When one of the berr-counter crosses 96 (and stays below 128) a warning
interrupt should be generated.

Marc
Stefan Agner Aug. 4, 2014, 1:43 p.m. UTC | #4
Am 2014-07-30 13:47, schrieb Marc Kleine-Budde:
> On 07/29/2014 09:29 AM, Stefan Agner wrote:
>> Am 2014-07-28 18:28, schrieb Marc Kleine-Budde:
>>> On 07/28/2014 06:20 PM, Stefan Agner wrote:
>>>> I'm not sure whether you really want to keep the FLEXCAN_CTRL_ERR_STATE
>>>> commented out...
>>>
>>> No, please remove this change and redo the test.
>>>
>>
>> Ok, removed that change and did the tests again:
>>
>> == Wrong Bitrate test
>>
>> [  146.485022] flexcan 40020000.flexcan can0: bitrate error 0.7%
>> [  148.401793] flexcan 40020000.flexcan can0: writing ctrl=0x17092001
>> [  148.408263] flexcan 40020000.flexcan can0: flexcan_set_bittiming:
>> mcr=0x5980000f ctrl=0x17092001
>> [  148.408298] flexcan 40020000.flexcan can0: flexcan_chip_start:
>> writing mcr=0x79a20208
>> [  148.408328] flexcan 40020000.flexcan can0: flexcan_chip_start:
>> writing ctrl=0x1709ac51
>> [  148.414373] flexcan 40020000.flexcan can0: flexcan_chip_start:
>> reading mcr=0x60a20208 ctrl=0x1709ac51
>> ^---- Initialization
>> [  152.968685] flexcan_irq, esr=00040080
>> [  152.972386] flexcan_irq, ctrl=1709ac51
>> [  155.488623] flexcan_irq, esr=00040080
>> [  155.492326] flexcan_irq, ctrl=1709ac51
>> ^---- Two messages with right bitrate
>> [  171.014124] flexcan_irq, esr=00058d0a
>> [  171.017823] flexcan_irq, ctrl=1709ac51
>> ^---- One message with wrong bitrate
>> [  171.021631] flexcan 40020000.flexcan can0: Error Warning IRQ
>> [  171.021660] flexcan 40020000.flexcan can0: Error Passive IRQ
> 
> Thanks for the test, so far looks promising :) With this setup the other
> CAN node repeats the CAN frame until it's ACKed. Because there is no
> node with a compatible bitrate, there is no ACking CAN node.
> 
> Can you add a third CAN node to the network. The second and third node
> should use the same bitrate, while your vf610 uses a different one. With
> the new setup it should take more than one frame until the vf610 goes
> into error warning and even more frames to error passive. This way we
> can see it the error warning interrupt is connected or not. The error
> counters should increase with each "wrong" bitrate frame it sees, you
> can check with:
> 
>     ip -details link show can0
> 
> The output looks like this:
> 
>> 4: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN mode DEFAULT qlen 10
>>     link/can
>>     can state ERROR-ACTIVE (berr-counter tx 0 rx 0) restart-ms 0
>                               ^^^^^^^^^^^^^^^^^^^^^^
>>     bitrate 1000000 sample-point 0.750
>>     tq 125 prop-seg 2 phase-seg1 3 phase-seg2 2 sjw 1
>>     sja1000: tseg1 1..16 tseg2 1..8 sjw 1..4 brp 1..64 brp-inc 1
>>     clock 8000000
> 
> When one of the berr-counter crosses 96 (and stays below 128) a warning
> interrupt should be generated.

Ok, created this setup, could successfully communicate with all three
nodes. I then set the Vybrid to half of the bitrate. When I send a frame
from Vybrid, the berr-counter tx immediately ends up at 128 and the
device is in ERROR-PASSIVE:

root@colibri-vf:~# ip -details link show can1
3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
mode DEFAULT group default qlen 10
    link/can  promiscuity 0 
    can state ERROR-ACTIVE (berr-counter tx 0 rx 0) restart-ms 0 
    bitrate 124990 sample-point 0.739 
    tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
    flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
    clock 83368421
root@colibri-vf:~# cansend can1 1F334455#1122334455667788
interface = can1, family = 29, type = 3, proto = 1
root@colibri-vf:~# [  818.886664] flexcan_irq, esr=00062242
[  818.890365] flexcan_irq, ctrl=1c3dac57

root@colibri-vf:~# ip -details link show can1
3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
mode DEFAULT group default qlen 10
    link/can  promiscuity 0 
    can state ERROR-PASSIVE (berr-counter tx 128 rx 0) restart-ms 0 
    bitrate 124990 sample-point 0.739 
    tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
    flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
    clock 83368421


When I send the frames from another device on the bus, I can see the rx
count incrementing by one on each frame I send. As you expected, the
device changes to ERROR-WARNING when crossing the 96 frame boundary:

root@colibri-vf:~# ip -details link show can1
3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
mode DEFAULT group default qlen 10
    link/can  promiscuity 0 
    can state ERROR-ACTIVE (berr-counter tx 0 rx 92) restart-ms 0 
    bitrate 124990 sample-point 0.739 
    tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
    flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
    clock 83368421
root@colibri-vf:~# [  448.331150] flexcan_irq, esr=0005050a
[  448.334851] flexcan_irq, ctrl=1c3dac57
ip -details link show can1
3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
mode DEFAULT group default qlen 10
    link/can  promiscuity 0 
    can state ERROR-WARNING (berr-counter tx 0 rx 102) restart-ms 0 
    bitrate 124990 sample-point 0.739 
    tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
    flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
    clock 83368421

However, once reaching 128, I don't get another interrupt and the device
stays in ERROR-WARNING:

3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
mode DEFAULT group default qlen 10
    link/can  promiscuity 0 
    can state ERROR-WARNING (berr-counter tx 0 rx 128) restart-ms 0 
    bitrate 124990 sample-point 0.739 
    tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
    flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
    clock 83368421

Is this the expected behavior on receive?

--
Stefan
Marc Kleine-Budde Aug. 4, 2014, 2:27 p.m. UTC | #5
On 08/04/2014 03:43 PM, Stefan Agner wrote:
[...]

>> Thanks for the test, so far looks promising :) With this setup the other
>> CAN node repeats the CAN frame until it's ACKed. Because there is no
>> node with a compatible bitrate, there is no ACking CAN node.
>>
>> Can you add a third CAN node to the network. The second and third node
>> should use the same bitrate, while your vf610 uses a different one. With
>> the new setup it should take more than one frame until the vf610 goes
>> into error warning and even more frames to error passive. This way we
>> can see it the error warning interrupt is connected or not. The error
>> counters should increase with each "wrong" bitrate frame it sees, you
>> can check with:
>>
>>     ip -details link show can0
>>
>> The output looks like this:
>>
>>> 4: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN mode DEFAULT qlen 10
>>>     link/can
>>>     can state ERROR-ACTIVE (berr-counter tx 0 rx 0) restart-ms 0
>>                               ^^^^^^^^^^^^^^^^^^^^^^
>>>     bitrate 1000000 sample-point 0.750
>>>     tq 125 prop-seg 2 phase-seg1 3 phase-seg2 2 sjw 1
>>>     sja1000: tseg1 1..16 tseg2 1..8 sjw 1..4 brp 1..64 brp-inc 1
>>>     clock 8000000
>>
>> When one of the berr-counter crosses 96 (and stays below 128) a warning
>> interrupt should be generated.
> 
> Ok, created this setup, could successfully communicate with all three
> nodes. I then set the Vybrid to half of the bitrate. When I send a frame
> from Vybrid, the berr-counter tx immediately ends up at 128 and the
> device is in ERROR-PASSIVE:

This is expected.

> root@colibri-vf:~# ip -details link show can1
> 3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
> mode DEFAULT group default qlen 10
>     link/can  promiscuity 0 
>     can state ERROR-ACTIVE (berr-counter tx 0 rx 0) restart-ms 0 
>     bitrate 124990 sample-point 0.739 
>     tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
>     flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
>     clock 83368421
            ^^^^^^^^

BTW: the can core has a really weird clock rate, have a look at the
datasheet if you manage to route a 24 MHz clock (or another multiple of
8MHz) to the flexcan core. I had a quick glance at the datasheet, if I
understand it correctly the Fast OSC clock runs with 24 MHz.

> root@colibri-vf:~# cansend can1 1F334455#1122334455667788
> interface = can1, family = 29, type = 3, proto = 1
> root@colibri-vf:~# [  818.886664] flexcan_irq, esr=00062242
> [  818.890365] flexcan_irq, ctrl=1c3dac57
> 
> root@colibri-vf:~# ip -details link show can1
> 3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
> mode DEFAULT group default qlen 10
>     link/can  promiscuity 0 
>     can state ERROR-PASSIVE (berr-counter tx 128 rx 0) restart-ms 0 
>     bitrate 124990 sample-point 0.739 
>     tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
>     flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
>     clock 83368421
> 
> 
> When I send the frames from another device on the bus, I can see the rx
> count incrementing by one on each frame I send. As you expected, the
> device changes to ERROR-WARNING when crossing the 96 frame boundary:

This is correct

> root@colibri-vf:~# ip -details link show can1
> 3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
> mode DEFAULT group default qlen 10
>     link/can  promiscuity 0 
>     can state ERROR-ACTIVE (berr-counter tx 0 rx 92) restart-ms 0 
>     bitrate 124990 sample-point 0.739 
>     tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
>     flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
>     clock 83368421
> root@colibri-vf:~# [  448.331150] flexcan_irq, esr=0005050a
> [  448.334851] flexcan_irq, ctrl=1c3dac57
> ip -details link show can1
> 3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
> mode DEFAULT group default qlen 10
>     link/can  promiscuity 0 
>     can state ERROR-WARNING (berr-counter tx 0 rx 102) restart-ms 0 
>     bitrate 124990 sample-point 0.739 
>     tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
>     flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
>     clock 83368421

> However, once reaching 128, I don't get another interrupt and the device
> stays in ERROR-WARNING:

The contents of the esr reg would be interesting, especially the
FLT_CONF part.

> 3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
> mode DEFAULT group default qlen 10
>     link/can  promiscuity 0 
>     can state ERROR-WARNING (berr-counter tx 0 rx 128) restart-ms 0 
>     bitrate 124990 sample-point 0.739 
>     tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
>     flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
>     clock 83368421
> 
> Is this the expected behavior on receive?

No, it should go into error passive if one of the error counters have >
127. Maybe this is an error onthe vf610, maybe it's a general flexcan
problem.

Marc
Stefan Agner Aug. 4, 2014, 4:01 p.m. UTC | #6
Am 2014-08-04 16:27, schrieb Marc Kleine-Budde:
> On 08/04/2014 03:43 PM, Stefan Agner wrote:
> [...]
> 
>>> Thanks for the test, so far looks promising :) With this setup the other
>>> CAN node repeats the CAN frame until it's ACKed. Because there is no
>>> node with a compatible bitrate, there is no ACking CAN node.
>>>
>>> Can you add a third CAN node to the network. The second and third node
>>> should use the same bitrate, while your vf610 uses a different one. With
>>> the new setup it should take more than one frame until the vf610 goes
>>> into error warning and even more frames to error passive. This way we
>>> can see it the error warning interrupt is connected or not. The error
>>> counters should increase with each "wrong" bitrate frame it sees, you
>>> can check with:
>>>
>>>     ip -details link show can0
>>>
>>> The output looks like this:
>>>
>>>> 4: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN mode DEFAULT qlen 10
>>>>     link/can
>>>>     can state ERROR-ACTIVE (berr-counter tx 0 rx 0) restart-ms 0
>>>                               ^^^^^^^^^^^^^^^^^^^^^^
>>>>     bitrate 1000000 sample-point 0.750
>>>>     tq 125 prop-seg 2 phase-seg1 3 phase-seg2 2 sjw 1
>>>>     sja1000: tseg1 1..16 tseg2 1..8 sjw 1..4 brp 1..64 brp-inc 1
>>>>     clock 8000000
>>>
>>> When one of the berr-counter crosses 96 (and stays below 128) a warning
>>> interrupt should be generated.
>>
>> Ok, created this setup, could successfully communicate with all three
>> nodes. I then set the Vybrid to half of the bitrate. When I send a frame
>> from Vybrid, the berr-counter tx immediately ends up at 128 and the
>> device is in ERROR-PASSIVE:
> 
> This is expected.
> 
>> root@colibri-vf:~# ip -details link show can1
>> 3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
>> mode DEFAULT group default qlen 10
>>     link/can  promiscuity 0
>>     can state ERROR-ACTIVE (berr-counter tx 0 rx 0) restart-ms 0
>>     bitrate 124990 sample-point 0.739
>>     tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
>>     flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
>>     clock 83368421
>             ^^^^^^^^
> 
> BTW: the can core has a really weird clock rate, have a look at the
> datasheet if you manage to route a 24 MHz clock (or another multiple of
> 8MHz) to the flexcan core. I had a quick glance at the datasheet, if I
> understand it correctly the Fast OSC clock runs with 24 MHz.
> 

This pinmux is actually part of the IPs CTRL register (see CLKSRC).
Currently this is set unconditionally to peripheral clock, which is on
my device 83.3 MHz (500MHz/6). This would be a bit bigger change. Just
thinking how to implement this. Maybe we have to reference a third clock
"osc" and a device tree property fsl,can-clock-osc which one can choose
between "osc" and "per" what do you think?

>> root@colibri-vf:~# cansend can1 1F334455#1122334455667788
>> interface = can1, family = 29, type = 3, proto = 1
>> root@colibri-vf:~# [  818.886664] flexcan_irq, esr=00062242
>> [  818.890365] flexcan_irq, ctrl=1c3dac57
>>
>> root@colibri-vf:~# ip -details link show can1
>> 3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
>> mode DEFAULT group default qlen 10
>>     link/can  promiscuity 0
>>     can state ERROR-PASSIVE (berr-counter tx 128 rx 0) restart-ms 0
>>     bitrate 124990 sample-point 0.739
>>     tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
>>     flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
>>     clock 83368421
>>
>>
>> When I send the frames from another device on the bus, I can see the rx
>> count incrementing by one on each frame I send. As you expected, the
>> device changes to ERROR-WARNING when crossing the 96 frame boundary:
> 
> This is correct
> 

Just found out when using too high bitrate (500000 vs 125000), the error
counter immediately goes up to 128 (or even beyond) and ends up in
ERROR-PASSIVE:

# ip -details link show can1
[  292.164820] flexcan_get_berr_counter, esr=00000000
[  292.169715] flexcan_get_berr_counter, esr=00040190
3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
mode DEFAULT group default qlen 10
    link/can  promiscuity 0
    can state ERROR-PASSIVE (berr-counter tx 0 rx 135) restart-ms 0
    bitrate 298811 sample-point 0.777
    tq 371 prop-seg 3 phase-seg1 3 phase-seg2 2 sjw 1
    flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
    clock 83368421



>> root@colibri-vf:~# ip -details link show can1
>> 3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
>> mode DEFAULT group default qlen 10
>>     link/can  promiscuity 0
>>     can state ERROR-ACTIVE (berr-counter tx 0 rx 92) restart-ms 0
>>     bitrate 124990 sample-point 0.739
>>     tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
>>     flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
>>     clock 83368421
>> root@colibri-vf:~# [  448.331150] flexcan_irq, esr=0005050a
>> [  448.334851] flexcan_irq, ctrl=1c3dac57
>> ip -details link show can1
>> 3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
>> mode DEFAULT group default qlen 10
>>     link/can  promiscuity 0
>>     can state ERROR-WARNING (berr-counter tx 0 rx 102) restart-ms 0
>>     bitrate 124990 sample-point 0.739
>>     tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
>>     flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
>>     clock 83368421
> 
>> However, once reaching 128, I don't get another interrupt and the device
>> stays in ERROR-WARNING:
> 
> The contents of the esr reg would be interesting, especially the
> FLT_CONF part.
> 

root@vybridvf61-v11:~# [  222.221651] flexcan_irq, esr=0005050a
[  222.225349] flexcan_irq, ctrl=1c3dac57
ip -details link show can1
[  223.791082] flexcan_get_berr_counter, esr=00000000
[  223.796246] flexcan_get_berr_counter, esr=00040180
3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
mode DEFAULT group default qlen 
10
    link/can  promiscuity 0
    can state ERROR-WARNING (berr-counter tx 0 rx 96) restart-ms 0
    bitrate 124990 sample-point 0.739
    tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
    flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
    clock 83368421
root@vybridvf61-v11:~# ip -details link show can1
[  243.571175] flexcan_get_berr_counter, esr=00000000
[  243.576343] flexcan_get_berr_counter, esr=00040582
3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
mode DEFAULT group default qlen 
10
    link/can  promiscuity 0
    can state ERROR-WARNING (berr-counter tx 0 rx 104) restart-ms 0
    bitrate 124990 sample-point 0.739
    tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
    flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
    clock 83368421

...

root@vybridvf61-v11:~# ip -details link show can1
[  299.831192] flexcan_get_berr_counter, esr=00000000
[  299.836358] flexcan_get_berr_counter, esr=00040582
3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
mode DEFAULT group default qlen 10
    link/can  promiscuity 0 
    can state ERROR-WARNING (berr-counter tx 0 rx 126) restart-ms 0 
    bitrate 124990 sample-point 0.739 
    tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
    flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
    clock 83368421
root@vybridvf61-v11:~# ip -details link show can1
[  302.411025] flexcan_get_berr_counter, esr=00000000
[  302.416195] flexcan_get_berr_counter, esr=00040592
3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
mode DEFAULT group default qlen 10
    link/can  promiscuity 0 
    can state ERROR-WARNING (berr-counter tx 0 rx 128) restart-ms 0 
    bitrate 124990 sample-point 0.739 
    tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
    flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
    clock 83368421

The FLTCONF is in Error Passive, however the stack did not received that
information. I still have the printk in the interrupt, however I did not
get an interrupt here (while I get one when it switched to
ERROR-WARNING... But it looks like the ERRINT is still pending...


>> 3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
>> mode DEFAULT group default qlen 10
>>     link/can  promiscuity 0
>>     can state ERROR-WARNING (berr-counter tx 0 rx 128) restart-ms 0
>>     bitrate 124990 sample-point 0.739
>>     tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
>>     flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
>>     clock 83368421
>>
>> Is this the expected behavior on receive?
> 
> No, it should go into error passive if one of the error counters have >
> 127. Maybe this is an error onthe vf610, maybe it's a general flexcan
> problem.
> 
> Marc

--
Stefan
Marc Kleine-Budde Aug. 5, 2014, 9:52 a.m. UTC | #7
On 08/04/2014 06:01 PM, Stefan Agner wrote:
> Am 2014-08-04 16:27, schrieb Marc Kleine-Budde:
>> On 08/04/2014 03:43 PM, Stefan Agner wrote:
>> [...]
>>
>>>> Thanks for the test, so far looks promising :) With this setup the other
>>>> CAN node repeats the CAN frame until it's ACKed. Because there is no
>>>> node with a compatible bitrate, there is no ACking CAN node.
>>>>
>>>> Can you add a third CAN node to the network. The second and third node
>>>> should use the same bitrate, while your vf610 uses a different one. With
>>>> the new setup it should take more than one frame until the vf610 goes
>>>> into error warning and even more frames to error passive. This way we
>>>> can see it the error warning interrupt is connected or not. The error
>>>> counters should increase with each "wrong" bitrate frame it sees, you
>>>> can check with:
>>>>
>>>>     ip -details link show can0
>>>>
>>>> The output looks like this:
>>>>
>>>>> 4: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN mode DEFAULT qlen 10
>>>>>     link/can
>>>>>     can state ERROR-ACTIVE (berr-counter tx 0 rx 0) restart-ms 0
>>>>                               ^^^^^^^^^^^^^^^^^^^^^^
>>>>>     bitrate 1000000 sample-point 0.750
>>>>>     tq 125 prop-seg 2 phase-seg1 3 phase-seg2 2 sjw 1
>>>>>     sja1000: tseg1 1..16 tseg2 1..8 sjw 1..4 brp 1..64 brp-inc 1
>>>>>     clock 8000000
>>>>
>>>> When one of the berr-counter crosses 96 (and stays below 128) a warning
>>>> interrupt should be generated.
>>>
>>> Ok, created this setup, could successfully communicate with all three
>>> nodes. I then set the Vybrid to half of the bitrate. When I send a frame
>>> from Vybrid, the berr-counter tx immediately ends up at 128 and the
>>> device is in ERROR-PASSIVE:
>>
>> This is expected.
>>
>>> root@colibri-vf:~# ip -details link show can1
>>> 3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
>>> mode DEFAULT group default qlen 10
>>>     link/can  promiscuity 0
>>>     can state ERROR-ACTIVE (berr-counter tx 0 rx 0) restart-ms 0
>>>     bitrate 124990 sample-point 0.739
>>>     tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
>>>     flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
>>>     clock 83368421
>>             ^^^^^^^^
>>
>> BTW: the can core has a really weird clock rate, have a look at the
>> datasheet if you manage to route a 24 MHz clock (or another multiple of
>> 8MHz) to the flexcan core. I had a quick glance at the datasheet, if I
>> understand it correctly the Fast OSC clock runs with 24 MHz.
>>
> 
> This pinmux is actually part of the IPs CTRL register (see CLKSRC).
> Currently this is set unconditionally to peripheral clock, which is on
> my device 83.3 MHz (500MHz/6). This would be a bit bigger change. Just
> thinking how to implement this.

Sigh! I've missed the fact, that this bit is _inside_ the CAN core
(again). On the mx53 and imx6 this bit was obsolete and the clock was
selected outside of the CAN core.

> Maybe we have to reference a third clock
> "osc" and a device tree property fsl,can-clock-osc which one can choose
> between "osc" and "per" what do you think?

Yes, but that's a different patch. I thought it would be as easy as on
the mx53, where you just had to reconfigure the clock tree.

>>> root@colibri-vf:~# cansend can1 1F334455#1122334455667788
>>> interface = can1, family = 29, type = 3, proto = 1
>>> root@colibri-vf:~# [  818.886664] flexcan_irq, esr=00062242
>>> [  818.890365] flexcan_irq, ctrl=1c3dac57
>>>
>>> root@colibri-vf:~# ip -details link show can1
>>> 3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
>>> mode DEFAULT group default qlen 10
>>>     link/can  promiscuity 0
>>>     can state ERROR-PASSIVE (berr-counter tx 128 rx 0) restart-ms 0
>>>     bitrate 124990 sample-point 0.739
>>>     tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
>>>     flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
>>>     clock 83368421
>>>
>>>
>>> When I send the frames from another device on the bus, I can see the rx
>>> count incrementing by one on each frame I send. As you expected, the
>>> device changes to ERROR-WARNING when crossing the 96 frame boundary:
>>
>> This is correct
>>
> 
> Just found out when using too high bitrate (500000 vs 125000), the error
> counter immediately goes up to 128 (or even beyond) and ends up in
> ERROR-PASSIVE:

Maybe there is a off-by-one error in the flexcan core, so that a rx
error of 127 doesn't trigger an error passive interrupt.

> # ip -details link show can1
> [  292.164820] flexcan_get_berr_counter, esr=00000000
> [  292.169715] flexcan_get_berr_counter, esr=00040190
> 3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
> mode DEFAULT group default qlen 10
>     link/can  promiscuity 0
>     can state ERROR-PASSIVE (berr-counter tx 0 rx 135) restart-ms 0
>     bitrate 298811 sample-point 0.777
>     tq 371 prop-seg 3 phase-seg1 3 phase-seg2 2 sjw 1
>     flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
>     clock 83368421

[...]

>>> However, once reaching 128, I don't get another interrupt and the device
>>> stays in ERROR-WARNING:
>>
>> The contents of the esr reg would be interesting, especially the
>> FLT_CONF part.

> root@vybridvf61-v11:~# [  222.221651] flexcan_irq, esr=0005050a
> [  222.225349] flexcan_irq, ctrl=1c3dac57

> ip -details link show can1
> [  223.791082] flexcan_get_berr_counter, esr=00000000
> [  223.796246] flexcan_get_berr_counter, esr=00040180
> 3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
> mode DEFAULT group default qlen 
> 10
>     link/can  promiscuity 0
>     can state ERROR-WARNING (berr-counter tx 0 rx 96) restart-ms 0
>     bitrate 124990 sample-point 0.739
>     tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
>     flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
>     clock 83368421

> root@vybridvf61-v11:~# ip -details link show can1
> [  243.571175] flexcan_get_berr_counter, esr=00000000
> [  243.576343] flexcan_get_berr_counter, esr=00040582
> 3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
> mode DEFAULT group default qlen 
> 10
>     link/can  promiscuity 0
>     can state ERROR-WARNING (berr-counter tx 0 rx 104) restart-ms 0
>     bitrate 124990 sample-point 0.739
>     tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
>     flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
>     clock 83368421
> 
> ...
> 
> root@vybridvf61-v11:~# ip -details link show can1
> [  299.831192] flexcan_get_berr_counter, esr=00000000
> [  299.836358] flexcan_get_berr_counter, esr=00040582
> 3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
> mode DEFAULT group default qlen 10
>     link/can  promiscuity 0 
>     can state ERROR-WARNING (berr-counter tx 0 rx 126) restart-ms 0 
>     bitrate 124990 sample-point 0.739 
>     tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
>     flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
>     clock 83368421
> root@vybridvf61-v11:~# ip -details link show can1
> [  302.411025] flexcan_get_berr_counter, esr=00000000
> [  302.416195] flexcan_get_berr_counter, esr=00040592
> 3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
> mode DEFAULT group default qlen 10
>     link/can  promiscuity 0 
>     can state ERROR-WARNING (berr-counter tx 0 rx 128) restart-ms 0 
>     bitrate 124990 sample-point 0.739 
>     tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
>     flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
>     clock 83368421
> 
> The FLTCONF is in Error Passive, however the stack did not received that
> information. I still have the printk in the interrupt, however I did not
> get an interrupt here (while I get one when it switched to
> ERROR-WARNING... But it looks like the ERRINT is still pending...

According to the datasheet the ERRINT indicates a bit error (bit 10...15
is set), not the error passive state.

Marc
Stefan Agner Aug. 5, 2014, 12:38 p.m. UTC | #8
Am 2014-08-05 11:52, schrieb Marc Kleine-Budde:
> On 08/04/2014 06:01 PM, Stefan Agner wrote:
>> Am 2014-08-04 16:27, schrieb Marc Kleine-Budde:
>>> On 08/04/2014 03:43 PM, Stefan Agner wrote:
>>> [...]
>>>
>>>>> Thanks for the test, so far looks promising :) With this setup the other
>>>>> CAN node repeats the CAN frame until it's ACKed. Because there is no
>>>>> node with a compatible bitrate, there is no ACking CAN node.
>>>>>
>>>>> Can you add a third CAN node to the network. The second and third node
>>>>> should use the same bitrate, while your vf610 uses a different one. With
>>>>> the new setup it should take more than one frame until the vf610 goes
>>>>> into error warning and even more frames to error passive. This way we
>>>>> can see it the error warning interrupt is connected or not. The error
>>>>> counters should increase with each "wrong" bitrate frame it sees, you
>>>>> can check with:
>>>>>
>>>>>     ip -details link show can0
>>>>>
>>>>> The output looks like this:
>>>>>
>>>>>> 4: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN mode DEFAULT qlen 10
>>>>>>     link/can
>>>>>>     can state ERROR-ACTIVE (berr-counter tx 0 rx 0) restart-ms 0
>>>>>                               ^^^^^^^^^^^^^^^^^^^^^^
>>>>>>     bitrate 1000000 sample-point 0.750
>>>>>>     tq 125 prop-seg 2 phase-seg1 3 phase-seg2 2 sjw 1
>>>>>>     sja1000: tseg1 1..16 tseg2 1..8 sjw 1..4 brp 1..64 brp-inc 1
>>>>>>     clock 8000000
>>>>>
>>>>> When one of the berr-counter crosses 96 (and stays below 128) a warning
>>>>> interrupt should be generated.
>>>>
>>>> Ok, created this setup, could successfully communicate with all three
>>>> nodes. I then set the Vybrid to half of the bitrate. When I send a frame
>>>> from Vybrid, the berr-counter tx immediately ends up at 128 and the
>>>> device is in ERROR-PASSIVE:
>>>
>>> This is expected.
>>>
>>>> root@colibri-vf:~# ip -details link show can1
>>>> 3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
>>>> mode DEFAULT group default qlen 10
>>>>     link/can  promiscuity 0
>>>>     can state ERROR-ACTIVE (berr-counter tx 0 rx 0) restart-ms 0
>>>>     bitrate 124990 sample-point 0.739
>>>>     tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
>>>>     flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
>>>>     clock 83368421
>>>             ^^^^^^^^
>>>
>>> BTW: the can core has a really weird clock rate, have a look at the
>>> datasheet if you manage to route a 24 MHz clock (or another multiple of
>>> 8MHz) to the flexcan core. I had a quick glance at the datasheet, if I
>>> understand it correctly the Fast OSC clock runs with 24 MHz.
>>>
>>
>> This pinmux is actually part of the IPs CTRL register (see CLKSRC).
>> Currently this is set unconditionally to peripheral clock, which is on
>> my device 83.3 MHz (500MHz/6). This would be a bit bigger change. Just
>> thinking how to implement this.
> 
> Sigh! I've missed the fact, that this bit is _inside_ the CAN core
> (again). On the mx53 and imx6 this bit was obsolete and the clock was
> selected outside of the CAN core.
> 
>> Maybe we have to reference a third clock
>> "osc" and a device tree property fsl,can-clock-osc which one can choose
>> between "osc" and "per" what do you think?
> 
> Yes, but that's a different patch. I thought it would be as easy as on
> the mx53, where you just had to reconfigure the clock tree.
> 

Agreed.

>>>> root@colibri-vf:~# cansend can1 1F334455#1122334455667788
>>>> interface = can1, family = 29, type = 3, proto = 1
>>>> root@colibri-vf:~# [  818.886664] flexcan_irq, esr=00062242
>>>> [  818.890365] flexcan_irq, ctrl=1c3dac57
>>>>
>>>> root@colibri-vf:~# ip -details link show can1
>>>> 3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
>>>> mode DEFAULT group default qlen 10
>>>>     link/can  promiscuity 0
>>>>     can state ERROR-PASSIVE (berr-counter tx 128 rx 0) restart-ms 0
>>>>     bitrate 124990 sample-point 0.739
>>>>     tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
>>>>     flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
>>>>     clock 83368421
>>>>
>>>>
>>>> When I send the frames from another device on the bus, I can see the rx
>>>> count incrementing by one on each frame I send. As you expected, the
>>>> device changes to ERROR-WARNING when crossing the 96 frame boundary:
>>>
>>> This is correct
>>>
>>
>> Just found out when using too high bitrate (500000 vs 125000), the error
>> counter immediately goes up to 128 (or even beyond) and ends up in
>> ERROR-PASSIVE:
> 
> Maybe there is a off-by-one error in the flexcan core, so that a rx
> error of 127 doesn't trigger an error passive interrupt.
> 
>> # ip -details link show can1
>> [  292.164820] flexcan_get_berr_counter, esr=00000000
>> [  292.169715] flexcan_get_berr_counter, esr=00040190
>> 3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
>> mode DEFAULT group default qlen 10
>>     link/can  promiscuity 0
>>     can state ERROR-PASSIVE (berr-counter tx 0 rx 135) restart-ms 0
>>     bitrate 298811 sample-point 0.777
>>     tq 371 prop-seg 3 phase-seg1 3 phase-seg2 2 sjw 1
>>     flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
>>     clock 83368421
> 
> [...]
> 
>>>> However, once reaching 128, I don't get another interrupt and the device
>>>> stays in ERROR-WARNING:
>>>
>>> The contents of the esr reg would be interesting, especially the
>>> FLT_CONF part.
> 
>> root@vybridvf61-v11:~# [  222.221651] flexcan_irq, esr=0005050a
>> [  222.225349] flexcan_irq, ctrl=1c3dac57
> 
>> ip -details link show can1
>> [  223.791082] flexcan_get_berr_counter, esr=00000000
>> [  223.796246] flexcan_get_berr_counter, esr=00040180
>> 3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
>> mode DEFAULT group default qlen
>> 10
>>     link/can  promiscuity 0
>>     can state ERROR-WARNING (berr-counter tx 0 rx 96) restart-ms 0
>>     bitrate 124990 sample-point 0.739
>>     tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
>>     flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
>>     clock 83368421
> 
>> root@vybridvf61-v11:~# ip -details link show can1
>> [  243.571175] flexcan_get_berr_counter, esr=00000000
>> [  243.576343] flexcan_get_berr_counter, esr=00040582
>> 3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
>> mode DEFAULT group default qlen
>> 10
>>     link/can  promiscuity 0
>>     can state ERROR-WARNING (berr-counter tx 0 rx 104) restart-ms 0
>>     bitrate 124990 sample-point 0.739
>>     tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
>>     flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
>>     clock 83368421
>>
>> ...
>>
>> root@vybridvf61-v11:~# ip -details link show can1
>> [  299.831192] flexcan_get_berr_counter, esr=00000000
>> [  299.836358] flexcan_get_berr_counter, esr=00040582
>> 3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
>> mode DEFAULT group default qlen 10
>>     link/can  promiscuity 0
>>     can state ERROR-WARNING (berr-counter tx 0 rx 126) restart-ms 0
>>     bitrate 124990 sample-point 0.739
>>     tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
>>     flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
>>     clock 83368421
>> root@vybridvf61-v11:~# ip -details link show can1
>> [  302.411025] flexcan_get_berr_counter, esr=00000000
>> [  302.416195] flexcan_get_berr_counter, esr=00040592
>> 3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
>> mode DEFAULT group default qlen 10
>>     link/can  promiscuity 0
>>     can state ERROR-WARNING (berr-counter tx 0 rx 128) restart-ms 0
>>     bitrate 124990 sample-point 0.739
>>     tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
>>     flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
>>     clock 83368421
>>
>> The FLTCONF is in Error Passive, however the stack did not received that
>> information. I still have the printk in the interrupt, however I did not
>> get an interrupt here (while I get one when it switched to
>> ERROR-WARNING... But it looks like the ERRINT is still pending...
> 
> According to the datasheet the ERRINT indicates a bit error (bit 10...15
> is set), not the error passive state.
> 
> Marc

True, but fact is, I don't get a interrupt when the system switches from
ERROR-WARNING to ERROR-PASSIVE state. However, when I set berr-reporting
on, I get an interrupt (probably because of a change of bit 10...15),
but the state change is recognized:

ip -details link show can1
[ 2303.571656] flexcan_get_berr_counter, esr=00000000
[ 2303.576832] flexcan_get_berr_counter, esr=00040180
3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
mode DEFAULT group default qlen 10
    link/can  promiscuity 0 
    can <BERR-REPORTING> state ERROR-WARNING (berr-counter tx 0 rx 127)
restart-ms 0 
    bitrate 124990 sample-point 0.739 
    tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
    flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
    clock 83368421
root@vybridvf61-v11:~# [ 2307.023789] flexcan_irq, esr=0004051a
[ 2307.027490] flexcan_irq, ctrl=1c3dec57
ip -details link show can1
[ 2309.081840] flexcan_get_berr_counter, esr=00000000
[ 2309.087016] flexcan_get_berr_counter, esr=00040190
3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
mode DEFAULT group default qlen 10
    link/can  promiscuity 0 
    can <BERR-REPORTING> state ERROR-PASSIVE (berr-counter tx 0 rx 128)
restart-ms 0 
    bitrate 124990 sample-point 0.739 
    tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
    flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
    clock 83368421

It looks like, the FLTCONF state change itself doesn't lead to a
interrupt when the state changes to ERROR-PASSIVE. IMHO, this is also
not explicitly stated in the Vybrid RM (46.3.8).

What is the solution here? Force enabling ERR interrupt? The
FLEXCAN_HAS_BROKEN_ERR_STATE flag would be appropriate, however, this
was meant to work around the missing WRN_INT, which seems not to be the
case here...

--
Stefan
Marc Kleine-Budde Aug. 14, 2014, 10:38 a.m. UTC | #9
On 08/04/2014 06:01 PM, Stefan Agner wrote:
> Am 2014-08-04 16:27, schrieb Marc Kleine-Budde:
>> On 08/04/2014 03:43 PM, Stefan Agner wrote:
>> [...]
>>
>>>> Thanks for the test, so far looks promising :) With this setup the other
>>>> CAN node repeats the CAN frame until it's ACKed. Because there is no
>>>> node with a compatible bitrate, there is no ACking CAN node.
>>>>
>>>> Can you add a third CAN node to the network. The second and third node
>>>> should use the same bitrate, while your vf610 uses a different one. With
>>>> the new setup it should take more than one frame until the vf610 goes
>>>> into error warning and even more frames to error passive. This way we
>>>> can see it the error warning interrupt is connected or not. The error
>>>> counters should increase with each "wrong" bitrate frame it sees, you
>>>> can check with:
>>>>
>>>>     ip -details link show can0
>>>>
>>>> The output looks like this:
>>>>
>>>>> 4: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN mode DEFAULT qlen 10
>>>>>     link/can
>>>>>     can state ERROR-ACTIVE (berr-counter tx 0 rx 0) restart-ms 0
>>>>                               ^^^^^^^^^^^^^^^^^^^^^^
>>>>>     bitrate 1000000 sample-point 0.750
>>>>>     tq 125 prop-seg 2 phase-seg1 3 phase-seg2 2 sjw 1
>>>>>     sja1000: tseg1 1..16 tseg2 1..8 sjw 1..4 brp 1..64 brp-inc 1
>>>>>     clock 8000000
>>>>
>>>> When one of the berr-counter crosses 96 (and stays below 128) a warning
>>>> interrupt should be generated.
>>>
>>> Ok, created this setup, could successfully communicate with all three
>>> nodes. I then set the Vybrid to half of the bitrate. When I send a frame
>>> from Vybrid, the berr-counter tx immediately ends up at 128 and the
>>> device is in ERROR-PASSIVE:
>>
>> This is expected.
>>
>>> root@colibri-vf:~# ip -details link show can1
>>> 3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
>>> mode DEFAULT group default qlen 10
>>>     link/can  promiscuity 0
>>>     can state ERROR-ACTIVE (berr-counter tx 0 rx 0) restart-ms 0
>>>     bitrate 124990 sample-point 0.739
>>>     tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
>>>     flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
>>>     clock 83368421
>>             ^^^^^^^^
>>
>> BTW: the can core has a really weird clock rate, have a look at the
>> datasheet if you manage to route a 24 MHz clock (or another multiple of
>> 8MHz) to the flexcan core. I had a quick glance at the datasheet, if I
>> understand it correctly the Fast OSC clock runs with 24 MHz.
>>
> 
> This pinmux is actually part of the IPs CTRL register (see CLKSRC).
> Currently this is set unconditionally to peripheral clock, which is on
> my device 83.3 MHz (500MHz/6). This would be a bit bigger change. Just
> thinking how to implement this. Maybe we have to reference a third clock
> "osc" and a device tree property fsl,can-clock-osc which one can choose
> between "osc" and "per" what do you think?
> 
>>> root@colibri-vf:~# cansend can1 1F334455#1122334455667788
>>> interface = can1, family = 29, type = 3, proto = 1
>>> root@colibri-vf:~# [  818.886664] flexcan_irq, esr=00062242
>>> [  818.890365] flexcan_irq, ctrl=1c3dac57
>>>
>>> root@colibri-vf:~# ip -details link show can1
>>> 3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
>>> mode DEFAULT group default qlen 10
>>>     link/can  promiscuity 0
>>>     can state ERROR-PASSIVE (berr-counter tx 128 rx 0) restart-ms 0
>>>     bitrate 124990 sample-point 0.739
>>>     tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
>>>     flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
>>>     clock 83368421
>>>
>>>
>>> When I send the frames from another device on the bus, I can see the rx
>>> count incrementing by one on each frame I send. As you expected, the
>>> device changes to ERROR-WARNING when crossing the 96 frame boundary:
>>
>> This is correct
>>
> 
> Just found out when using too high bitrate (500000 vs 125000), the error
> counter immediately goes up to 128 (or even beyond) and ends up in
> ERROR-PASSIVE:
> 
> # ip -details link show can1
> [  292.164820] flexcan_get_berr_counter, esr=00000000
> [  292.169715] flexcan_get_berr_counter, esr=00040190
> 3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
> mode DEFAULT group default qlen 10
>     link/can  promiscuity 0
>     can state ERROR-PASSIVE (berr-counter tx 0 rx 135) restart-ms 0
>     bitrate 298811 sample-point 0.777
>     tq 371 prop-seg 3 phase-seg1 3 phase-seg2 2 sjw 1
>     flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
>     clock 83368421
> 
> 
> 
>>> root@colibri-vf:~# ip -details link show can1
>>> 3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
>>> mode DEFAULT group default qlen 10
>>>     link/can  promiscuity 0
>>>     can state ERROR-ACTIVE (berr-counter tx 0 rx 92) restart-ms 0
>>>     bitrate 124990 sample-point 0.739
>>>     tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
>>>     flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
>>>     clock 83368421
>>> root@colibri-vf:~# [  448.331150] flexcan_irq, esr=0005050a
>>> [  448.334851] flexcan_irq, ctrl=1c3dac57
>>> ip -details link show can1
>>> 3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
>>> mode DEFAULT group default qlen 10
>>>     link/can  promiscuity 0
>>>     can state ERROR-WARNING (berr-counter tx 0 rx 102) restart-ms 0
>>>     bitrate 124990 sample-point 0.739
>>>     tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
>>>     flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
>>>     clock 83368421
>>
>>> However, once reaching 128, I don't get another interrupt and the device
>>> stays in ERROR-WARNING:
>>
>> The contents of the esr reg would be interesting, especially the
>> FLT_CONF part.
>>
> 
> root@vybridvf61-v11:~# [  222.221651] flexcan_irq, esr=0005050a
> [  222.225349] flexcan_irq, ctrl=1c3dac57
> ip -details link show can1
> [  223.791082] flexcan_get_berr_counter, esr=00000000
> [  223.796246] flexcan_get_berr_counter, esr=00040180
> 3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
> mode DEFAULT group default qlen 
> 10
>     link/can  promiscuity 0
>     can state ERROR-WARNING (berr-counter tx 0 rx 96) restart-ms 0
>     bitrate 124990 sample-point 0.739
>     tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
>     flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
>     clock 83368421
> root@vybridvf61-v11:~# ip -details link show can1
> [  243.571175] flexcan_get_berr_counter, esr=00000000
> [  243.576343] flexcan_get_berr_counter, esr=00040582
> 3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
> mode DEFAULT group default qlen 
> 10
>     link/can  promiscuity 0
>     can state ERROR-WARNING (berr-counter tx 0 rx 104) restart-ms 0
>     bitrate 124990 sample-point 0.739
>     tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
>     flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
>     clock 83368421
> 
> ...
> 
> root@vybridvf61-v11:~# ip -details link show can1
> [  299.831192] flexcan_get_berr_counter, esr=00000000
> [  299.836358] flexcan_get_berr_counter, esr=00040582
> 3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
> mode DEFAULT group default qlen 10
>     link/can  promiscuity 0 
>     can state ERROR-WARNING (berr-counter tx 0 rx 126) restart-ms 0 
>     bitrate 124990 sample-point 0.739 
>     tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
>     flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
>     clock 83368421
> root@vybridvf61-v11:~# ip -details link show can1
> [  302.411025] flexcan_get_berr_counter, esr=00000000
> [  302.416195] flexcan_get_berr_counter, esr=00040592
> 3: can1: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN
> mode DEFAULT group default qlen 10
>     link/can  promiscuity 0 
>     can state ERROR-WARNING (berr-counter tx 0 rx 128) restart-ms 0 
>     bitrate 124990 sample-point 0.739 
>     tq 347 prop-seg 8 phase-seg1 8 phase-seg2 6 sjw 1
>     flexcan: tseg1 4..16 tseg2 2..8 sjw 1..4 brp 1..256 brp-inc 1
>     clock 83368421
> 
> The FLTCONF is in Error Passive, however the stack did not received that
> information. I still have the printk in the interrupt, however I did not
> get an interrupt here (while I get one when it switched to
> ERROR-WARNING... But it looks like the ERRINT is still pending...

Can you apply Alexander Stein's "[PATCH 2/4] can: flexcan: Detect error
passive state change" and recheck?

Marc
diff mbox

Patch

diff --git a/drivers/net/can/flexcan.c b/drivers/net/can/flexcan.c
index 1c31a5d..fe8b81c 100644
--- a/drivers/net/can/flexcan.c
+++ b/drivers/net/can/flexcan.c
@@ -19,6 +19,7 @@ 
  *
  */
 
+#define DEBUG
 #include <linux/netdevice.h>
 #include <linux/can.h>
 #include <linux/can/dev.h>
@@ -743,6 +744,9 @@  static irqreturn_t flexcan_irq(int irq, void
*dev_id)
 
        reg_iflag1 = flexcan_read(&regs->iflag1);
        reg_esr = flexcan_read(&regs->esr);
+
+       printk("flexcan_irq, esr=%08x\n", reg_esr);
+       printk("flexcan_irq, ctrl=%08x\n", flexcan_read(&regs->ctrl));
        /* ACK all bus error and state change IRQ sources */
        if (reg_esr & FLEXCAN_ESR_ALL_INT)