
[net] net/mlx5: Fix error path in multi-packet WQE transmit

Message ID: 20240910-fix-mlx5_dma_unmap-v1-1-6ae3d19d0b86@linux.ibm.com
State: Awaiting Upstream
Delegated to: Netdev Maintainers
Series: [net] net/mlx5: Fix error path in multi-packet WQE transmit

Checks

Context                         Check    Description
netdev/series_format            success  Single patches do not need cover letters
netdev/tree_selection           success  Clearly marked for net
netdev/ynl                      success  Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present            success  Fixes tag present in non-next series
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 16 this patch: 16
netdev/build_tools              success  No tools touched, skip
netdev/cc_maintainers           success  CCed 14 of 14 maintainers
netdev/build_clang              success  Errors and warnings before: 16 this patch: 16
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/deprecated_api           success  None detected
netdev/check_selftest           success  No net selftest shell script
netdev/verify_fixes             success  Fixes tag looks correct
netdev/build_allmodconfig_warn  success  Errors and warnings before: 16 this patch: 16
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 7 lines checked
netdev/build_clang_rust         success  No Rust files in patch. Skipping build
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/source_inline            success  Was 0 now: 0
netdev/contest                  success  net-next-2024-09-12--00-00 (tests: 764)

Commit Message

Gerd Bayer Sept. 10, 2024, 8:53 a.m. UTC
Remove the erroneous unmap in case no DMA mapping was established

The multi-packet WQE transmit code attempts to obtain a DMA mapping for
the skb. This could fail, e.g. under memory pressure, when the IOMMU
driver just can't allocate more memory for page tables. While the code
tries to handle this in the path below the err_unmap label it erroneously
unmaps one entry from the sq's FIFO list of active mappings. Since the
current map attempt failed this unmap is removing some random DMA mapping
that might still be required. If the PCI function now presents that IOVA,
the IOMMU may assume a rogue DMA access and e.g. on s390 puts the PCI
function in error state.

The erroneous behavior was seen in a stress-test environment that created
memory pressure.

Fixes: 5af75c747e2a ("net/mlx5e: Enhanced TX MPWQE for SKBs")
Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
---
While running some stress tests that put our system under memory pressure,
we eventually observed the following splat:

    [ 1350.038775] ------------[ cut here ]------------
    [ 1350.038776] WARNING: CPU: 36 PID: 37194 at arch/s390/include/asm/pci_dma.h:136 dma_update_cpu_trans+0x66/0x70
    [ 1350.038799] Modules linked in: macvtap macvlan vhost_net vhost vhost_iotlb tap tun xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_counter nf_tables nfnetlink lcs ctcm fsm dasd_fba_mod mlx5_ib ib_uverbs ib_core mlx5_core mlxfw psample rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs tls dm_service_time 8021q garp mrp rfkill sd_mod t10_pi sg sunrpc zfcp scsi_transport_fc dm_multipath dm_mod vfio_ccw mdev vfio_iommu_type1 vfio eadm_sch iommufd kvm drm i2c_core drm_panel_orientation_quirks xfs libcrc32c qeth_l2 bridge stp llc ghash_s390 prng aes_s390 dasd_eckd_mod des_s390 libdes sha3_512_s390 qeth sha3_256_s390 dasd_mod ccwgroup qdio pkey zcrypt fuse
    [ 1350.038880] CPU: 36 PID: 37194 Comm: vhost-37179 Kdump: loaded Tainted: G               X  -------  ---  5.14.0-427.20.1.el9_4.s390x #1
    [ 1350.038884] Hardware name: IBM 3931 A01 400 (LPAR)
    [ 1350.038886] Krnl PSW : 0704f00180000000 00000056803d1eba (dma_update_cpu_trans+0x6a/0x70)
    [ 1350.038890]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:3 PM:0 RI:0 EA:3
    [ 1350.038893] Krnl GPRS: 0000000000000000 0000000589eff400 0000003be2b477b0 0000000000000000
    [ 1350.038895]            0000000000000400 0000000000001000 0000000000000400 ffffffbe8000a000
    [ 1350.038897]            0000000000000001 0000000086d6bc00 0000000000000001 000000417fff7000
    [ 1350.038900]            000000012d5baa00 0000000000000000 00000056803d1f3e 0000038016df75d8
    [ 1350.038957] Krnl Code: 00000056803d1eae: af000000            mc      0,0
    [ 1350.038963]            00000056803d1eb2: a7f4fff9            brc     15,00000056803d1ea4
    [ 1350.038963]           #00000056803d1eb6: af000000            mc      0,0
    [ 1350.038970]           >00000056803d1eba: a7f4ffd9            brc     15,00000056803d1e6c
    [ 1350.038979]            00000056803d1ebe: 0707                bcr     0,%r7
    [ 1350.038983]            00000056803d1ec0: c004004b3334        brcl    0,0000005680d38528
    [ 1350.038983]            00000056803d1ec6: eb7ff0500024        stmg    %r7,%r15,80(%r15)
    [ 1350.038983]            00000056803d1ecc: b90400ef            lgr     %r14,%r15
    [ 1350.038994] Call Trace:
    [ 1350.038995]  [<00000056803d1eba>] dma_update_cpu_trans+0x6a/0x70
    [ 1350.038998] ([<00000056803d1f22>] __dma_update_trans+0x62/0x150)
    [ 1350.039001]  [<00000056803d2432>] s390_dma_unmap_pages+0x72/0x1c0
    [ 1350.039003]  [<000000568047e70c>] dma_unmap_page_attrs+0x3c/0x190
    [ 1350.039008]  [<000003ff807c5230>] mlx5e_sq_xmit_mpwqe+0x2b0/0x430 [mlx5_core]
    [ 1350.039170]  [<000003ff807c589e>] mlx5e_xmit+0x20e/0x5a0 [mlx5_core]
    [ 1350.039246]  [<0000005680aae326>] dev_hard_start_xmit+0xb6/0x210
    [ 1350.039252]  [<0000005680b144d8>] sch_direct_xmit+0x88/0x420
    [ 1350.039256]  [<0000005680aa9496>] __dev_xmit_skb+0x2c6/0x5c0
    [ 1350.039259]  [<0000005680aae93e>] __dev_queue_xmit+0x36e/0x840
    [ 1350.039262]  [<000003ff809e3b6a>] macvlan_start_xmit+0x6a/0x140 [macvlan]
    [ 1350.039266]  [<0000005680aae326>] dev_hard_start_xmit+0xb6/0x210
    [ 1350.039269]  [<0000005680aaeae8>] __dev_queue_xmit+0x518/0x840
    [ 1350.039271]  [<000003ff809b40f4>] tap_get_user_xdp.isra.0+0x134/0x300 [tap]
    [ 1350.039274]  [<000003ff809b4354>] tap_sendmsg+0x94/0xc0 [tap]
    [ 1350.039277]  [<000003ff809d4f06>] vhost_tx_batch.constprop.0+0x66/0x1a0 [vhost_net]
    [ 1350.039281]  [<000003ff809d6a5e>] handle_tx_copy+0x24e/0x340 [vhost_net]
    [ 1350.039283]  [<000003ff809d6c0c>] handle_tx+0xbc/0x100 [vhost_net]
    [ 1350.039286]  [<000003ff809bb6f2>] vhost_worker+0xa2/0x100 [vhost]
    [ 1350.039294]  [<000000568040be98>] kthread+0x108/0x110
    [ 1350.039299]  [<000000568038afdc>] __ret_from_fork+0x3c/0x60
    [ 1350.039302]  [<0000005680d2e89a>] ret_from_fork+0xa/0x40
    [ 1350.039307] Last Breaking-Event-Address:
    [ 1350.039308]  [<00000056803d1e68>] dma_update_cpu_trans+0x18/0x70
    [ 1350.039310] ---[ end trace a581115ebebd62f3 ]---
    
And here the IOMMU complains about the "rogue DMA attempt":
    [ 1350.043079] zpci: 0037:00:00.0: Event 0x7 reports an error for PCI function 0x3932
    
With some instrumentation in mlx5e_sq_xmit_mpwqe() to mimic a failure
to DMA map every 1000th buffer, I was able to reproduce this with recent
upstream code, too. I think the error handling of that routine has a bug,
as it DMA-unmaps a buffer/IOVA that might still be in use.
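
The instrumentation itself is not part of this posting. As a rough
user-space sketch of the "fail every Nth attempt" idea, something like the
following could be used. All names here (fail_every, fail_every_should_fail,
count_injected_failures) are hypothetical and do not exist in the driver;
where exactly such a check would be hooked into the transmit path is left
open.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical fault injector: reports a failure on every period-th call. */
struct fail_every {
	unsigned int period;	/* e.g. 1000; 0 disables injection */
	unsigned int calls;	/* calls seen since the last injected failure */
};

/* Returns true when the current attempt should be treated as failed. */
static bool fail_every_should_fail(struct fail_every *fe)
{
	if (fe->period == 0)
		return false;
	if (++fe->calls < fe->period)
		return false;
	fe->calls = 0;
	return true;
}

/* Helper for experiments: count injected failures over a run of attempts. */
static unsigned int count_injected_failures(unsigned int period,
					    unsigned int attempts)
{
	struct fail_every fe = { .period = period, .calls = 0 };
	unsigned int failures = 0;
	unsigned int i;

	assert(attempts > 0);
	for (i = 0; i < attempts; i++)
		if (fail_every_should_fail(&fe))
			failures++;
	return failures;
}
```

With a period of 1000, the first injected failure lands on the 1000th
attempt, so a run of 5000 attempts sees five failures.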
---
 drivers/net/ethernet/mellanox/mlx5/core/en_tx.c | 1 -
 1 file changed, 1 deletion(-)


---
base-commit: 8d53a5170c8677af9b3fbd9d0b75ae120fdefba2
change-id: 20240909-fix-mlx5_dma_unmap-e2a12e26e929

Best regards,
--
Gerd Bayer <gbayer@linux.ibm.com>

Comments

Zhu Yanjun Sept. 11, 2024, 1:37 a.m. UTC | #1
On 2024/9/10 16:53, Gerd Bayer wrote:
> Remove the erroneous unmap in case no DMA mapping was established
> 
> The multi-packet WQE transmit code attempts to obtain a DMA mapping for
> the skb. This could fail, e.g. under memory pressure, when the IOMMU
> driver just can't allocate more memory for page tables. While the code
> tries to handle this in the path below the err_unmap label it erroneously
> unmaps one entry from the sq's FIFO list of active mappings. Since the
> current map attempt failed this unmap is removing some random DMA mapping
> that might still be required. If the PCI function now presents that IOVA,
> the IOMMU may assumes a rogue DMA access and e.g. on s390 puts the PCI
                 ~~~~~~~

s/assumes/assume ?

Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>

Zhu Yanjun

> function in error state.
> 
> The erroneous behavior was seen in a stress-test environment that created
> memory pressure.
> 
> Fixes: 5af75c747e2a ("net/mlx5e: Enhanced TX MPWQE for SKBs")
> Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
Maxim Mikityanskiy Sept. 11, 2024, 8:59 a.m. UTC | #2
On Tue, 10 Sep 2024 at 10:53:51 +0200, Gerd Bayer wrote:
> Remove the erroneous unmap in case no DMA mapping was established
> 
> The multi-packet WQE transmit code attempts to obtain a DMA mapping for
> the skb. This could fail, e.g. under memory pressure, when the IOMMU
> driver just can't allocate more memory for page tables. While the code
> tries to handle this in the path below the err_unmap label it erroneously
> unmaps one entry from the sq's FIFO list of active mappings.

The fix looks valid to me, thanks!

Acked-by: Maxim Mikityanskiy <maxtram95@gmail.com>

> Since the
> current map attempt failed this unmap is removing some random DMA mapping
> that might still be required. If the PCI function now presents that IOVA,
> the IOMMU may assumes a rogue DMA access and e.g. on s390 puts the PCI
> function in error state.
> 
> The erroneous behavior was seen in a stress-test environment that created
> memory pressure.
> 
> Fixes: 5af75c747e2a ("net/mlx5e: Enhanced TX MPWQE for SKBs")
> Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>

Patch

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
index b09e9abd39f3..f8c7912abe0e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
@@ -642,7 +642,6 @@  mlx5e_sq_xmit_mpwqe(struct mlx5e_txqsq *sq, struct sk_buff *skb,
 	return;
 
 err_unmap:
-	mlx5e_dma_unmap_wqe_err(sq, 1);
 	sq->stats->dropped++;
 	dev_kfree_skb_any(skb);
 	mlx5e_tx_flush(sq);
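
To illustrate why the removed call is harmful, here is a small user-space
model of the bookkeeping described in the commit message. It is only a
sketch under stated assumptions: the toy_* names are hypothetical, the FIFO
is a plain array, and the rollback helper is assumed to release the most
recently recorded entries. It is not the driver's actual code.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_FIFO_SIZE 8u	/* power of two, so masking wraps the index */

/* Toy stand-in for the send queue's FIFO of active DMA mappings. */
struct toy_sq {
	unsigned long fifo[TOY_FIFO_SIZE];	/* recorded fake IOVAs */
	unsigned int pc;			/* producer counter */
};

/* Record a mapping that was established successfully. */
static void toy_dma_push(struct toy_sq *sq, unsigned long iova)
{
	assert(iova != 0);	/* 0 is reserved to mean "nothing released" */
	sq->fifo[sq->pc++ & (TOY_FIFO_SIZE - 1)] = iova;
}

/* Roll back the last num recorded mappings; returns the last one released. */
static unsigned long toy_dma_unmap_err(struct toy_sq *sq, unsigned int num)
{
	unsigned long released = 0;

	while (num--)
		released = sq->fifo[--sq->pc & (TOY_FIFO_SIZE - 1)];
	return released;
}

/*
 * Transmit one buffer. If the map step fails, nothing was recorded for this
 * buffer. The pre-fix behaviour (buggy == true) still rolls back one entry,
 * which belongs to an earlier buffer. Returns the IOVA released on the error
 * path, or 0 if none was released.
 */
static unsigned long toy_xmit(struct toy_sq *sq, unsigned long iova,
			      bool map_fails, bool buggy)
{
	if (map_fails)
		return buggy ? toy_dma_unmap_err(sq, 1) : 0;

	toy_dma_push(sq, iova);
	return 0;
}
```

In this model, recording 0x1000 and 0x2000 and then failing a third map
releases 0x2000 on the buggy path and leaves pc at 1, although that mapping
may still be in flight. With the rollback removed, nothing is released and
pc stays at 2.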