diff mbox series

[net-next,04/10] net/mlx5: hw counters: Drop unneeded cacheline alignment

Message ID 20240815054656.2210494-5-tariqt@nvidia.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series net/mlx5: hw counters refactor and misc | expand

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 29 this patch: 29
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers warning 1 maintainers not CCed: linux-rdma@vger.kernel.org
netdev/build_clang success Errors and warnings before: 29 this patch: 29
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 29 this patch: 29
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 8 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest warning net-next-2024-08-15--12-00 (tests: 707)

Commit Message

Tariq Toukan Aug. 15, 2024, 5:46 a.m. UTC
From: Cosmin Ratiu <cratiu@nvidia.com>

The mlx5_fc struct has a cache for values queried from hw, which is
cacheline aligned. On x86_64, this results in:

struct mlx5_fc {
        u32                    id;                   /*     0     4 */
        bool                   aging;                /*     4     1 */

        /* XXX 3 bytes hole, try to pack */

        struct mlx5_fc_bulk *  bulk;                 /*     8     8 */

        /* XXX 48 bytes hole, try to pack */

        /* --- cacheline 1 boundary (64 bytes) --- */
        struct mlx5_fc_cache   cache __attribute__((__aligned__(64)));
	/*    64    24 */
        u64                    lastpackets;          /*    88     8 */
        u64                    lastbytes;            /*    96     8 */

        /* size: 128, cachelines: 2, members: 6 */
        /* sum members: 53, holes: 2, sum holes: 51 */
        /* padding: 24 */
        /* forced aligns: 1, forced holes: 1, sum forced holes: 48 */
} __attribute__((__aligned__(64)));

(output from pahole).

...So a 48+24=72 byte waste. As far as I can determine, this serves no
purpose other than maybe making sure that the values in the cache do not
span two cachelines in the worst case scenario, but that's not a valid
enough reason to waste 72 bytes per counter, especially since this code
is not performance-critical. There could potentially be hundreds of
thousands of counters (e.g. for connection-tracking), so this quickly
adds up to multiple MB wasted.

This commit removes the alignment, resulting in:
struct mlx5_fc {
        [...]
        /* size: 56, cachelines: 1, members: 6 */
        /* sum members: 53, holes: 1, sum holes: 3 */
        /* last cacheline: 56 bytes */
};

Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/fs_counters.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff mbox series

Patch

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_counters.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_counters.c
index 05d9351ff577..ef13941e55c2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_counters.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_counters.c
@@ -53,7 +53,7 @@  struct mlx5_fc {
 	u32 id;
 	bool aging;
 	struct mlx5_fc_bulk *bulk;
-	struct mlx5_fc_cache cache ____cacheline_aligned_in_smp;
+	struct mlx5_fc_cache cache;
 	/* last{packets,bytes} are used for calculating deltas since last reading. */
 	u64 lastpackets;
 	u64 lastbytes;