Message ID | 20250415121143.345227-1-idosch@nvidia.com (mailing list archive) |
---|---|
Series | vxlan: Convert FDB table to rhashtable |
On 4/15/25 15:11, Ido Schimmel wrote:
> The VXLAN driver currently stores FDB entries in a hash table with a
> fixed number of buckets (256), resulting in reduced performance as the
> number of entries grows. This patchset solves the issue by converting
> the driver to use rhashtable which maintains a more or less constant
> performance regardless of the number of entries.
>
> Measured transmitted packets per second using a single pktgen thread
> with varying number of entries when the transmitted packet always hits
> the default entry (worst case):
>
> Number of entries | Improvement
> ------------------|------------
> 1k                | +1.12%
> 4k                | +9.22%
> 16k               | +55%
> 64k               | +585%
> 256k              | +2460%
>
> The first patches are preparations for the conversion in the last patch.
> Specifically, the series is structured as follows:
>
> Patch #1 adds RCU read-side critical sections in the Tx path when
> accessing FDB entries. Targeting at net-next as I am not aware of any
> issues due to this omission despite the code being structured that way
> for a long time. Without it, traces will be generated when converting
> FDB lookup to rhashtable_lookup().
>
> Patch #2-#5 simplify the creation of the default FDB entry (all-zeroes).
> Current code assumes that insertion into the hash table cannot fail,
> which will no longer be true with rhashtable.
>
> Patches #6-#10 add FDB entries to a linked list for entry traversal
> instead of traversing over them using the fixed size hash table which is
> removed in the last patch.
>
> Patches #11-#12 add wrappers for FDB lookup that make it clear when each
> should be used along with lockdep annotations. Needed as a preparation
> for rhashtable_lookup() that must be called from an RCU read-side
> critical section.
>
> Patch #13 treats dst cache initialization errors as non-fatal. See more
> info in the commit message. The current code happens to work because
> insertion into the fixed size hash table is slow enough for the per-CPU
> allocator to be able to create new chunks of per-CPU memory.
>
> Patch #14 adds an FDB key structure that includes the MAC address and
> source VNI. To be used as rhashtable key.
>
> Patch #15 does the conversion to rhashtable.
>
> Ido Schimmel (15):
>   vxlan: Add RCU read-side critical sections in the Tx path
>   vxlan: Simplify creation of default FDB entry
>   vxlan: Insert FDB into hash table in vxlan_fdb_create()
>   vxlan: Unsplit default FDB entry creation and notification
>   vxlan: Relocate assignment of default remote device
>   vxlan: Use a single lock to protect the FDB table
>   vxlan: Add a linked list of FDB entries
>   vxlan: Use linked list to traverse FDB entries
>   vxlan: Convert FDB garbage collection to RCU
>   vxlan: Convert FDB flushing to RCU
>   vxlan: Rename FDB Tx lookup function
>   vxlan: Create wrappers for FDB lookup
>   vxlan: Do not treat dst cache initialization errors as fatal
>   vxlan: Introduce FDB key structure
>   vxlan: Convert FDB table to rhashtable
>
>  drivers/net/vxlan/vxlan_core.c      | 542 ++++++++++++----------------
>  drivers/net/vxlan/vxlan_private.h   |  11 +-
>  drivers/net/vxlan/vxlan_vnifilter.c |   8 +-
>  include/net/vxlan.h                 |   5 +-
>  4 files changed, 248 insertions(+), 318 deletions(-)
>

Nice work! For the set:
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>

Cheers,
 Nik
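
For readers following along, here is a rough userspace sketch of why a fixed table of 256 buckets degrades as the FDB grows. It is not the driver's code: `struct fdb_key`, `fdb_key_hash()`, `fdb_bucket()` and `avg_chain_len()` are hypothetical names, FNV-1a stands in for whatever hash the driver actually uses, and "1k" is assumed to mean 1024 entries. Only the bucket count (256) and the key contents (MAC address plus source VNI) come from the cover letter.

```c
#include <stddef.h>
#include <stdint.h>

#define FDB_BUCKETS 256u /* fixed bucket count cited in the cover letter */

/* Hypothetical key layout: MAC address plus source VNI. */
struct fdb_key {
	uint8_t  mac[6];
	uint32_t vni;
};

/* FNV-1a over the key fields; a stand-in, not the driver's hash. */
static uint32_t fdb_key_hash(const struct fdb_key *key)
{
	uint32_t h = 2166136261u;
	size_t i;

	for (i = 0; i < sizeof(key->mac); i++) {
		h ^= key->mac[i];
		h *= 16777619u;
	}
	/* Hash the VNI byte by byte so struct padding is never read. */
	for (i = 0; i < 4; i++) {
		h ^= (key->vni >> (8 * i)) & 0xffu;
		h *= 16777619u;
	}
	return h;
}

/* Bucket index in the fixed-size table. */
static uint32_t fdb_bucket(const struct fdb_key *key)
{
	return fdb_key_hash(key) % FDB_BUCKETS;
}

/* Average entries per bucket, assuming an even spread over the buckets. */
static uint32_t avg_chain_len(uint32_t entries)
{
	return entries / FDB_BUCKETS;
}
```

Under these assumptions, `avg_chain_len(1024)` is 4 while `avg_chain_len(262144)` is 1024, so a lookup that misses and falls back to the default entry walks roughly a thousand entries per packet at 256k. A table that resizes keeps that chain length roughly constant, which fits the shape of the improvement numbers quoted above.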