Message ID | 20210420194203.24759-1-mon@unformed.ru (mailing list archive) |
---|---|
State | Not Applicable |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | net: sfc: Fix kernel panic introduced by commit 044588b96372 ("sfc: define inner/outer csum offload TXQ types") | expand |
Context | Check | Description |
---|---|---|
netdev/cover_letter | success | Link |
netdev/fixes_present | success | Link |
netdev/patch_count | success | Link |
netdev/tree_selection | success | Guessed tree name to be net-next |
netdev/subject_prefix | warning | Target tree name not specified in the subject |
netdev/cc_maintainers | warning | 2 maintainers not CCed: ecree.xilinx@gmail.com habetsm.xilinx@gmail.com |
netdev/source_inline | success | Was 0 now: 0 |
netdev/verify_signedoff | success | Link |
netdev/module_param | success | Was 0 now: 0 |
netdev/build_32bit | fail | Errors and warnings before: 6 this patch: 6 |
netdev/kdoc | success | Errors and warnings before: 57 this patch: 57 |
netdev/verify_fixes | success | Link |
netdev/checkpatch | warning | WARNING: line length of 84 exceeds 80 columns |
netdev/build_allmodconfig_warn | success | Errors and warnings before: 0 this patch: 0 |
netdev/header_inline | success | Link |
On 20/04/2021 20:42, Yury Vostrikov wrote: > efx_get_tx_queue(efx, qid / EFX_MAX_TXQ_PER_CHANNEL, > qid % EFX_MAX_TXQ_PER_CHANNEL); > > This uses qid / EFX_MAX_TXQ_PER_CHANNEL as index inside > efx_nic->channels[] and qid % EFX_MAX_TXQ_PER_CHANNEL as index inside > channel->tx_queue_be_type[]. > > Indexing into bitset mapping with modulo operation seems to oversight > from the previous refactoring. This should be fixed by 5b1faa92289b ("sfc: farch: fix TX queue lookup in TX flush done handling") and 83b09a180741 ("sfc: farch: fix TX queue lookup in TX event handling"), which were applied to 'net' this morning. Do you see any call sites that I've missed and that remain broken? (The one in ef100_ev_tx() is technically incorrect, but qlabel is always 0 there so it shouldn't matter.) -ed
diff --git a/drivers/net/ethernet/sfc/net_driver.h b/drivers/net/ethernet/sfc/net_driver.h index 9f7dfdf708cf..4ab0fe21a3a6 100644 --- a/drivers/net/ethernet/sfc/net_driver.h +++ b/drivers/net/ethernet/sfc/net_driver.h @@ -1533,18 +1533,33 @@ static inline unsigned int efx_channel_num_tx_queues(struct efx_channel *channel } static inline struct efx_tx_queue * -efx_channel_get_tx_queue(struct efx_channel *channel, unsigned int type) +efx_channel_get_tx_queue_by_type(struct efx_channel *channel, unsigned int type) { EFX_WARN_ON_ONCE_PARANOID(type >= EFX_TXQ_TYPES); return channel->tx_queue_by_type[type]; } static inline struct efx_tx_queue * -efx_get_tx_queue(struct efx_nic *efx, unsigned int index, unsigned int type) +efx_get_tx_queue_by_type(struct efx_nic *efx, unsigned int index, unsigned int type) { struct efx_channel *channel = efx_get_tx_channel(efx, index); - return efx_channel_get_tx_queue(channel, type); + return efx_channel_get_tx_queue_by_type(channel, type); +} + +static inline struct efx_tx_queue * +efx_channel_get_tx_queue(struct efx_channel *channel, unsigned int label) +{ + EFX_WARN_ON_ONCE_PARANOID(label >= EFX_MAX_TXQ_PER_CHANNEL); + return &channel->tx_queue[label]; +} + +static inline struct efx_tx_queue * +efx_get_tx_queue(struct efx_nic *efx, unsigned int index, unsigned int label) +{ + struct efx_channel *channel = efx_get_tx_channel(efx, index); + + return efx_channel_get_tx_queue(channel, label); } /* Iterate over all TX queues belonging to a channel */ diff --git a/drivers/net/ethernet/sfc/ptp.c b/drivers/net/ethernet/sfc/ptp.c index a39c5143b386..7de19d22dadc 100644 --- a/drivers/net/ethernet/sfc/ptp.c +++ b/drivers/net/ethernet/sfc/ptp.c @@ -1091,7 +1091,7 @@ static void efx_ptp_xmit_skb_queue(struct efx_nic *efx, struct sk_buff *skb) u8 type = efx_tx_csum_type_skb(skb); struct efx_tx_queue *tx_queue; - tx_queue = efx_channel_get_tx_queue(ptp_data->channel, type); + tx_queue = efx_channel_get_tx_queue_by_type(ptp_data->channel, type); if (tx_queue && tx_queue->timestamping) { efx_enqueue_skb(tx_queue, skb); } else { diff --git a/drivers/net/ethernet/sfc/tx.c b/drivers/net/ethernet/sfc/tx.c index 1665529a7271..18742db2990d 100644 --- a/drivers/net/ethernet/sfc/tx.c +++ b/drivers/net/ethernet/sfc/tx.c @@ -533,7 +533,7 @@ netdev_tx_t efx_hard_start_xmit(struct sk_buff *skb, return efx_ptp_tx(efx, skb); } - tx_queue = efx_get_tx_queue(efx, index, type); + tx_queue = efx_get_tx_queue_by_type(efx, index, type); if (WARN_ON_ONCE(!tx_queue)) { /* We don't have a TXQ of the right type. * This should never happen, as we don't advertise offload
NIC has EFX_MAX_CHANNELS channels: struct efx_nic { [...] struct efx_channel *channel[EFX_MAX_CHANNELS]; [...] } Each channel has EFX_MAX_TXQ_PER_CHANNEL TX queues; There a reverse mapping from type to TX queue, holding at most EFX_TXQ_TYPES entries. This mapping is a bitset mapping because EFX_TXQ_TYPE_* is defined as non-overlapping bit enum: struct efx_channel { [...] struct efx_tx_queue tx_queue[EFX_MAX_TXQ_PER_CHANNEL]; struct efx_tx_queue *tx_queue_by_type[EFX_TXQ_TYPES]; [...] } Because channels and queues are enumerated in-order in efx_set_channels(), it is possible to get tx_queue be calling efx_get_tx_queue(efx, qid / EFX_MAX_TXQ_PER_CHANNEL, qid % EFX_MAX_TXQ_PER_CHANNEL); This uses qid / EFX_MAX_TXQ_PER_CHANNEL as index inside efx_nic->channels[] and qid % EFX_MAX_TXQ_PER_CHANNEL as index inside channel->tx_queue_be_type[]. Indexing into bitset mapping with modulo operation seems to oversight from the previous refactoring. Comments of other call sites also indicate that the second argument is indeed queue->label (which is an index into channel->tx_queue[]), not queue->type. It also looks like that some callers do need indexing by type, though. However, because the sizes of tx_queue[] and tx_queue_by_type[] are equal, and every single slot in both arrays is not equal to NULL, no crash occurs. commit 044588b96372 ("sfc: define inner/outer csum offload TXQ types") add additional TXQ_TYPE and bumps size of tx_queue_by_type to 8 elements. Some of its members are NULL now. During interface shutdown, tx_queues are flushed; this, in turn, results in a callback to efx_farch_handle_tx_flush_done, which then tries to use qid % EFX_MAX_TXQ_PER_CHANNEL as queue->type, gets NULL back, and crashes. Address this by adding efx_get_tx_queue_by_type() and updating relevant callers. Signed-off-by: Yury Vostrikov <mon@unformed.ru> --- drivers/net/ethernet/sfc/net_driver.h | 21 ++++++++++++++++++--- drivers/net/ethernet/sfc/ptp.c | 2 +- drivers/net/ethernet/sfc/tx.c | 2 +- 3 files changed, 20 insertions(+), 5 deletions(-)