Message ID | 20210509124309.30024-4-dariobin@libero.it (mailing list archive) |
---|---|
State | Awaiting Upstream |
Delegated to: | Netdev Maintainers |
Series | can: c_can: cache frames to operate as a true FIFO |
Context | Check | Description |
---|---|---|
netdev/tree_selection | success | Series ignored based on subject |
On 09.05.2021 14:43:09, Dario Binacchi wrote:
> As reported by a comment in the c_can_start_xmit() this was not a FIFO.
> C/D_CAN controller sends out the buffers prioritized so that the lowest
> buffer number wins.
>
> What did c_can_start_xmit() do if it found tx_active = 0x80000000 ? It
> waited until the only frame of the FIFO was actually transmitted by the
> controller. Only one message in the FIFO but we had to wait for it to
> empty completely to ensure that the messages were transmitted in the
> order in which they were loaded.
>
> By storing the frames in the FIFO without requiring its transmission, we
> will be able to use the full size of the FIFO even in cases such as the
> one described above. The transmission interrupt will trigger their
> transmission only when all the messages previously loaded but stored in
> less priority positions of the buffers have been transmitted.

The algorithm you implemented looks a bit too complicated to me. Let me
sketch the algorithm that's implemented by several other drivers.

- have a power of two number of TX objects
- add a number of objects to struct priv (tx_num)
  (or make it a define, if the number of tx objects is compile time fixed)
- add two "unsigned int" variables to your struct priv,
  one "tx_head", one "tx_tail"
- the hard_start_xmit() writes to priv->tx_head & (priv->tx_num - 1)
- increment tx_head
- stop the tx_queue if there is no space or if the object with the
  lowest prio has been written
- in TX complete IRQ, handle priv->tx_tail object
- increment tx_tail
- wake queue if there is space but don't wake if we wait for the lowest
  prio object to be TX completed.

Special care needs to be taken to implement that lock-less and race
free. I suggest to look at the mcp251xfd driver.

Marc
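The head/tail scheme outlined in the list above can be sketched as a small userspace model. This is only an illustration under stated assumptions: `TX_NUM`, `struct tx_ring`, `tx_slot()`, `tx_free()`, `ring_xmit()` and `ring_complete()` are hypothetical names, a plain `bool` stands in for stopping and waking the netdev queue, and the lock-less and race-free handling that the mail calls for is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical ring size; a power of two so that "& (TX_NUM - 1)" wraps. */
#define TX_NUM 4u
static_assert((TX_NUM & (TX_NUM - 1)) == 0, "TX_NUM must be a power of two");

struct tx_ring {
	unsigned int tx_head;	/* free-running count of frames queued */
	unsigned int tx_tail;	/* free-running count of frames completed */
	bool queue_stopped;	/* stand-in for stopping/waking the TX queue */
};

/* Map a free-running counter to a TX object index. */
static unsigned int tx_slot(unsigned int counter)
{
	return counter & (TX_NUM - 1);
}

/* Number of objects that may still be written without breaking ordering. */
static unsigned int tx_free(const struct tx_ring *ring)
{
	unsigned int head = tx_slot(ring->tx_head);
	unsigned int tail = tx_slot(ring->tx_tail);

	/* Every object is in flight. */
	if (ring->tx_head - ring->tx_tail >= TX_NUM)
		return 0;
	/* head wrapped to a low index (high priority) while older frames
	 * still wait in higher indices: writing now would overtake them.
	 */
	if (head < tail)
		return 0;
	return TX_NUM - head;
}

/* Model of hard_start_xmit(): returns the object index used, -1 if busy. */
static int ring_xmit(struct tx_ring *ring)
{
	int slot;

	if (tx_free(ring) == 0)
		return -1;
	slot = (int)tx_slot(ring->tx_head);
	ring->tx_head++;
	/* Stop when full or when the lowest-priority object was written. */
	if (tx_free(ring) == 0)
		ring->queue_stopped = true;
	return slot;
}

/* Model of the TX-complete IRQ: returns the completed index, -1 if idle. */
static int ring_complete(struct tx_ring *ring)
{
	int slot;

	if (ring->tx_head == ring->tx_tail)
		return -1;
	slot = (int)tx_slot(ring->tx_tail);
	ring->tx_tail++;
	/* Wake only once writing can resume without reordering. */
	if (ring->queue_stopped && tx_free(ring) > 0)
		ring->queue_stopped = false;
	return slot;
}
```

With `TX_NUM` set to 4, four calls to `ring_xmit()` use objects 0 to 3 and stop the queue. In this model the queue stays stopped while objects 0, 1 and 2 complete and is woken only after object 3, the lowest-priority one, has completed.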
On 10.05.2021 14:25:15, Marc Kleine-Budde wrote:
> On 09.05.2021 14:43:09, Dario Binacchi wrote:
> > As reported by a comment in the c_can_start_xmit() this was not a FIFO.
> > C/D_CAN controller sends out the buffers prioritized so that the lowest
> > buffer number wins.
> >
> > What did c_can_start_xmit() do if it found tx_active = 0x80000000 ? It
> > waited until the only frame of the FIFO was actually transmitted by the
> > controller. Only one message in the FIFO but we had to wait for it to
> > empty completely to ensure that the messages were transmitted in the
> > order in which they were loaded.
> >
> > By storing the frames in the FIFO without requiring its transmission, we
> > will be able to use the full size of the FIFO even in cases such as the
> > one described above. The transmission interrupt will trigger their
> > transmission only when all the messages previously loaded but stored in
> > less priority positions of the buffers have been transmitted.
>
> The algorithm you implemented looks a bit too complicated to me. Let me
> sketch the algorithm that's implemented by several other drivers.
>
> - have a power of two number of TX objects
> - add a number of objects to struct priv (tx_num)
>   (or make it a define, if the number of tx objects is compile time fixed)
> - add two "unsigned int" variables to your struct priv,
>   one "tx_head", one "tx_tail"
> - the hard_start_xmit() writes to priv->tx_head & (priv->tx_num - 1)
> - increment tx_head
> - stop the tx_queue if there is no space or if the object with the
>   lowest prio has been written
> - in TX complete IRQ, handle priv->tx_tail object
> - increment tx_tail
> - wake queue if there is space but don't wake if we wait for the lowest
>   prio object to be TX completed.
>
> Special care needs to be taken to implement that lock-less and race
> free. I suggest to look at the mcp251xfd driver.

After converting the driver to the above outlined implementation it
should be more straight forward to add the caching you implemented.

regards,
Marc
Hi Marc,

> On 10/05/2021 14:36 Marc Kleine-Budde <mkl@pengutronix.de> wrote:
>
>
> On 10.05.2021 14:25:15, Marc Kleine-Budde wrote:
> > On 09.05.2021 14:43:09, Dario Binacchi wrote:
> > > As reported by a comment in the c_can_start_xmit() this was not a FIFO.
> > > C/D_CAN controller sends out the buffers prioritized so that the lowest
> > > buffer number wins.
> > >
> > > What did c_can_start_xmit() do if it found tx_active = 0x80000000 ? It
> > > waited until the only frame of the FIFO was actually transmitted by the
> > > controller. Only one message in the FIFO but we had to wait for it to
> > > empty completely to ensure that the messages were transmitted in the
> > > order in which they were loaded.
> > >
> > > By storing the frames in the FIFO without requiring its transmission, we
> > > will be able to use the full size of the FIFO even in cases such as the
> > > one described above. The transmission interrupt will trigger their
> > > transmission only when all the messages previously loaded but stored in
> > > less priority positions of the buffers have been transmitted.
> >
> > The algorithm you implemented looks a bit too complicated to me. Let me
> > sketch the algorithm that's implemented by several other drivers.
> >
> > - have a power of two number of TX objects
> > - add a number of objects to struct priv (tx_num)
> >   (or make it a define, if the number of tx objects is compile time fixed)
> > - add two "unsigned int" variables to your struct priv,
> >   one "tx_head", one "tx_tail"
> > - the hard_start_xmit() writes to priv->tx_head & (priv->tx_num - 1)
> > - increment tx_head
> > - stop the tx_queue if there is no space or if the object with the
> >   lowest prio has been written
> > - in TX complete IRQ, handle priv->tx_tail object
> > - increment tx_tail
> > - wake queue if there is space but don't wake if we wait for the lowest
> >   prio object to be TX completed.
> >
> > Special care needs to be taken to implement that lock-less and race
> > free. I suggest to look at the mcp251xfd driver.
>
> After converting the driver to the above outlined implementation it
> should be more straight forward to add the caching you implemented.
>

I took some time to think about your suggestions. I developed the
submitted patch to improve CAN transmission within the current driver
design, in order to minimize the risk of introducing bugs. If I'm not
missing something, you are suggesting that I change the driver design as
a precondition for applying an updated version of my patch. IMHO this
would increase the risk of introducing bugs, even in parts of the code
that are considered stable.

If the algorithm I have implemented is a bit too complicated, let's try
to simplify it starting from the submitted patch.

Waiting for your reply,
thanks and regards
Dario

> regards,
> Marc
>
> --
> Pengutronix e.K.                 | Marc Kleine-Budde           |
> Embedded Linux                   | https://www.pengutronix.de  |
> Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
> Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |
diff --git a/drivers/net/can/c_can/c_can.h b/drivers/net/can/c_can/c_can.h
index 4247ff80a29c..6abde6cbc0b1 100644
--- a/drivers/net/can/c_can/c_can.h
+++ b/drivers/net/can/c_can/c_can.h
@@ -191,6 +191,9 @@ struct c_can_priv {
 	unsigned int msg_obj_tx_last;
 	u32 msg_obj_rx_mask;
 	atomic_t tx_active;
+	atomic_t tx_cached;
+	spinlock_t tx_cached_lock;
+	atomic_t tx_avail;
 	atomic_t sie_pending;
 	unsigned long tx_dir;
 	int last_status;
diff --git a/drivers/net/can/c_can/c_can_main.c b/drivers/net/can/c_can/c_can_main.c
index 7588f70ca0fe..d2f44c07d47f 100644
--- a/drivers/net/can/c_can/c_can_main.c
+++ b/drivers/net/can/c_can/c_can_main.c
@@ -124,6 +124,9 @@
 				 IF_COMM_TXRQST | \
 				 IF_COMM_DATAA | IF_COMM_DATAB)
 
+#define IF_COMM_TX_FRAME	(IF_COMM_ARB | IF_COMM_CONTROL | \
+				 IF_COMM_DATAA | IF_COMM_DATAB)
+
 /* For the low buffers we clear the interrupt bit, but keep newdat */
 #define IF_COMM_RCV_LOW	(IF_COMM_MASK | IF_COMM_ARB | \
 				 IF_COMM_CONTROL | IF_COMM_CLR_INT_PND | \
@@ -432,19 +435,36 @@ static netdev_tx_t c_can_start_xmit(struct sk_buff *skb,
 {
 	struct can_frame *frame = (struct can_frame *)skb->data;
 	struct c_can_priv *priv = netdev_priv(dev);
-	u32 idx, obj;
+	u32 idx, obj, tx_active, tx_cached;
 
 	if (can_dropped_invalid_skb(dev, skb))
 		return NETDEV_TX_OK;
-	/* This is not a FIFO. C/D_CAN sends out the buffers
-	 * prioritized. The lowest buffer number wins.
-	 */
-	idx = fls(atomic_read(&priv->tx_active));
-	obj = idx + priv->msg_obj_tx_first;
 
-	/* If this is the last buffer, stop the xmit queue */
-	if (idx == priv->msg_obj_tx_num - 1)
+	if (atomic_read(&priv->tx_avail) == 0)
 		netif_stop_queue(dev);
+
+	tx_active = atomic_read(&priv->tx_active);
+	tx_cached = atomic_read(&priv->tx_cached);
+	idx = fls(tx_active);
+	if (idx > priv->msg_obj_tx_num - 1) {
+		idx = fls(tx_cached);
+
+		obj = idx + priv->msg_obj_tx_first;
+		spin_lock_bh(&priv->tx_cached_lock);
+		/* prepare message object for transmission */
+		c_can_setup_tx_object(dev, IF_TX, frame, idx);
+		/* Store the message but don't ask for its transmission */
+		c_can_object_put(dev, IF_TX, obj, IF_COMM_TX_FRAME);
+		spin_unlock_bh(&priv->tx_cached_lock);
+		priv->dlc[idx] = frame->len;
+		can_put_echo_skb(skb, dev, idx, 0);
+		atomic_dec(&priv->tx_avail);
+		atomic_add(BIT(idx), &priv->tx_cached);
+		return NETDEV_TX_OK;
+	}
+
+	obj = idx + priv->msg_obj_tx_first;
+
 	/* Store the message in the interface so we can call
 	 * can_put_echo_skb(). We must do this before we enable
 	 * transmit as we might race against do_tx().
@@ -453,6 +473,7 @@ static netdev_tx_t c_can_start_xmit(struct sk_buff *skb,
 	priv->dlc[idx] = frame->len;
 	can_put_echo_skb(skb, dev, idx, 0);
 
+	atomic_dec(&priv->tx_avail);
 	/* Update the active bits */
 	atomic_add(BIT(idx), &priv->tx_active);
 	/* Start transmission */
@@ -599,6 +620,8 @@ static int c_can_chip_config(struct net_device *dev)
 
 	/* Clear all internal status */
 	atomic_set(&priv->tx_active, 0);
+	atomic_set(&priv->tx_cached, 0);
+	atomic_set(&priv->tx_avail, priv->msg_obj_tx_num);
 	priv->tx_dir = 0;
 
 	/* set bittiming params */
@@ -723,14 +746,31 @@ static void c_can_do_tx(struct net_device *dev)
 	/* Clear the bits in the tx_active mask */
 	atomic_sub(clr, &priv->tx_active);
 
-	if (clr & BIT(priv->msg_obj_tx_num - 1))
-		netif_wake_queue(dev);
-
 	if (pkts) {
+		atomic_add(pkts, &priv->tx_avail);
+
+		if (netif_queue_stopped(dev))
+			netif_wake_queue(dev);
+
 		stats->tx_bytes += bytes;
 		stats->tx_packets += pkts;
 		can_led_event(dev, CAN_LED_EVENT_TX);
 	}
+
+	if (atomic_read(&priv->tx_active) == 0) {
+		pend = atomic_read(&priv->tx_cached);
+
+		clr = pend;
+		while ((idx = ffs(pend))) {
+			idx--;
+			pend &= ~(1 << idx);
+
+			obj = idx + priv->msg_obj_tx_first;
+			c_can_object_put(dev, IF_TX, obj, IF_COMM_TXRQST);
+		}
+		atomic_sub(clr, &priv->tx_cached);
+		atomic_add(clr, &priv->tx_active);
+	}
 }
 
 /* If we have a gap in the pending bits, that means we either
@@ -1193,6 +1233,7 @@ struct net_device *alloc_c_can_dev(int msg_obj_num)
 		return NULL;
 
 	priv = netdev_priv(dev);
+	spin_lock_init(&priv->tx_cached_lock);
 	priv->msg_obj_num = msg_obj_num;
 	priv->msg_obj_rx_num = msg_obj_num - msg_obj_tx_num;
 	priv->msg_obj_rx_first = 1;
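The flush step that the c_can_do_tx() hunk adds can be modelled in userspace to see what happens to the two bitmasks. This is a rough sketch, not the driver code: `struct tx_masks`, `ffs32()` and `tx_complete()` are hypothetical names, plain integers replace the `atomic_t` fields, and the `order[]` array merely records where the driver would call c_can_object_put() with IF_COMM_TXRQST.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the tx_active / tx_cached fields of the patch. */
struct tx_masks {
	uint32_t tx_active;	/* objects whose transmission was requested */
	uint32_t tx_cached;	/* objects loaded but not yet requested */
};

/* Portable ffs(): 1-based index of the lowest set bit, 0 if none is set. */
static unsigned int ffs32(uint32_t x)
{
	unsigned int n = 1;

	if (!x)
		return 0;
	while (!(x & 1u)) {
		x >>= 1;
		n++;
	}
	return n;
}

/* Model of the TX-complete path. "done" holds the bits of the objects that
 * finished transmitting. When nothing is active any more, every cached
 * object is requested, lowest index first; order[] receives those indices.
 * Returns how many cached objects were moved to the active mask.
 */
static unsigned int tx_complete(struct tx_masks *m, uint32_t done,
				unsigned int order[32])
{
	unsigned int idx, n = 0;
	uint32_t pend;

	/* Only objects that were active can complete. */
	assert((m->tx_active & done) == done);
	m->tx_active &= ~done;
	if (m->tx_active != 0)
		return 0;

	pend = m->tx_cached;
	while ((idx = ffs32(pend))) {
		idx--;
		pend &= ~(UINT32_C(1) << idx);
		order[n++] = idx;	/* the driver would set TXRQST here */
	}
	m->tx_active = m->tx_cached;
	m->tx_cached = 0;
	return n;
}
```

Starting from `tx_active = 0xC0000000` and `tx_cached = 0x5`, completing bit 30 alone flushes nothing in this model; once bit 31 also completes, objects 0 and 2 are requested in that order and the cached mask becomes the new active mask.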
As reported by a comment in the c_can_start_xmit() this was not a FIFO.
C/D_CAN controller sends out the buffers prioritized so that the lowest
buffer number wins.

What did c_can_start_xmit() do if it found tx_active = 0x80000000 ? It
waited until the only frame of the FIFO was actually transmitted by the
controller. Only one message in the FIFO but we had to wait for it to
empty completely to ensure that the messages were transmitted in the
order in which they were loaded.

By storing the frames in the FIFO without requiring its transmission, we
will be able to use the full size of the FIFO even in cases such as the
one described above. The transmission interrupt will trigger their
transmission only when all the messages previously loaded but stored in
less priority positions of the buffers have been transmitted.

Suggested-by: Gianluca Falavigna <gianluca.falavigna@inwind.it>
Signed-off-by: Dario Binacchi <dariobin@libero.it>
---
 drivers/net/can/c_can/c_can.h      |  3 ++
 drivers/net/can/c_can/c_can_main.c | 63 ++++++++++++++++++++++++------
 2 files changed, 55 insertions(+), 11 deletions(-)
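To make the tx_active = 0x80000000 case from the commit message concrete, the slot selection in the patched c_can_start_xmit() can be modelled as below. This is a simplified sketch under assumptions: `fls32()`, `struct slot_choice` and `pick_slot()` are hypothetical names, the masks are passed as plain integers, and the tx_avail accounting and queue stopping are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Portable fls(): 1-based index of the highest set bit, 0 if none is set. */
static unsigned int fls32(uint32_t x)
{
	unsigned int n = 0;

	while (x) {
		n++;
		x >>= 1;
	}
	return n;
}

/* Result of the selection: which TX object to load and whether the frame
 * is only cached (loaded without a transmission request).
 */
struct slot_choice {
	unsigned int idx;
	bool cached;
};

/* Model of the index selection in the patched c_can_start_xmit(). */
static struct slot_choice pick_slot(uint32_t tx_active, uint32_t tx_cached,
				    unsigned int tx_num)
{
	struct slot_choice c = { fls32(tx_active), false };

	assert(tx_num >= 1 && tx_num <= 32);
	/* The slot above the highest active one is beyond the last TX
	 * object: fall back to the cached mask and do not request
	 * transmission yet.
	 */
	if (c.idx > tx_num - 1) {
		c.idx = fls32(tx_cached);
		c.cached = true;
	}
	return c;
}
```

With 32 TX objects, `pick_slot(0x80000000, 0, 32)` selects object 0 as a cached frame in this model, whereas the old code had to stop the queue until object 31 finished. A further frame, with `tx_cached = 0x1`, would be cached in object 1.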