Message ID | 20250407184102.2155415-1-sidhartha.kumar@oracle.com (mailing list archive) |
---|---|
Headers | show |
Series | Track node vacancy to reduce worst case allocation counts | expand |
On 4/7/25 2:40 PM, Sidhartha Kumar wrote: > v3[3] -> v4: > - fix bug reported by Vasily Gorbik by fixing condition for > mast_overflow() and add test case for this condition. This fix > was also tested on s390 by Vasily. > > - cleanup new_height variable usage in mas_spanning_rebalance() > and add an additional test to make sure mt_height() is correct > after a rebalance causes a tree's height to decrease per Liam. > Hi Andrew, Could the following fixup patch be applied to the end of this series? If you'd prefer a new version of the series I can do that as well. Thanks, Sid commit 1be83a7dccbef0a5973a3723de5a4460270d90c7 (HEAD) Author: Sidhartha Kumar <sidhartha.kumar@oracle.com> Date: Wed Apr 9 17:44:51 2025 +0000 maple_tree: vacant_height series fixup l_mas.depth is no longer overloaded to track the height in mas_spanning_rebalance() so we can remove its initialization. Fix comments that refer to variables that track height as depth. Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> diff --git a/include/linux/maple_tree.h b/include/linux/maple_tree.h index 37dc9525dff6..9ef129038224 100644 --- a/include/linux/maple_tree.h +++ b/include/linux/maple_tree.h @@ -463,8 +463,8 @@ struct ma_wr_state { void __rcu **slots; /* mas->node->slots pointer */ void *entry; /* The entry to write */ void *content; /* The existing entry that is being overwritten */ - unsigned char vacant_height; /* Depth of lowest node with free space */ - unsigned char sufficient_height;/* Depth of lowest node with min sufficiency + 1 nodes */ + unsigned char vacant_height; /* Height of lowest node with free space */ + unsigned char sufficient_height;/* Height of lowest node with min sufficiency + 1 nodes */ }; #define mas_lock(mas) spin_lock(&((mas)->tree->ma_lock)) diff --git a/lib/maple_tree.c b/lib/maple_tree.c index 8425728e3c5a..affe979bd14d 100644 --- a/lib/maple_tree.c +++ b/lib/maple_tree.c @@ -2850,8 +2850,6 @@ static void mas_spanning_rebalance(struct ma_state *mas, unlikely(mast->bn->b_end <= mt_min_slots[mast->bn->type])) mast_spanning_rebalance(mast); - l_mas.depth = 0; - /* * Each level of the tree is examined and balanced, pushing data to the left or * right, or rebalancing against left or right nodes is employed to avoid > v2[2] -> v3: > - add r-b to patches 1,4, and 6 > > - update function parameter comments in patch 2 > > - remove line that sets mas->depth in patch 2 > > - remove redundant code for checking for a spanning write in patch 3 > > - rewrite commit message of patch 5 for additonal context and clarity > > > v1[1] -> v2: > - fix comment for vacant_height which refers to depth per Wei > > - add a patch to reorder switch case statements in mas_prealloc_calc and > mas_wr_store_entry > > - use sufficient height in spanning stores > > - modify patch 2 to use a counter to track ascending the tree rather > than overloading mas->depth to have this function. > > ================ overview ======================== > Currently, the maple tree preallocates the worst case number of nodes for > given store type by taking into account the whole height of the tree. This > comes from a worst case scenario of every node in the tree being full and > having to propagate node allocation upwards until we reach the root of the > tree. This can be optimized if there are vacancies in nodes that are at a > lower depth than the root node. This series implements tracking the level > at which there is a vacant node so we only need to allocate until this > level is reached, rather than always using the full height of the tree. > The ma_wr_state struct is modified to add a field which keeps track of the > vacant height and is updated during walks of the tree. This value is then > read in mas_prealloc_calc() when we decide how many nodes to allocate. > > For rebalancing and spanning stores, we also need to track the lowest > height at which a node has 1 more entry than the minimum sufficient number > of entries. This is because rebalancing can cause a parent node to become > insufficient which results in further node allocations. In this case, we > need to use the sufficient height as the worst case rather than the vacant > height. > > patch 1-2: preparatory patches > patch 3: implement vacant height tracking + update the tests > patch 4: support vacant height tracking for rebalancing writes > patch 5: implement sufficient height tracking > patch 6: reorder switch case statements > > ================ results ========================= > Bpftrace was used to profile the allocation path for requesting new maple > nodes while running stress-ng mmap 120s. The histograms below represent > requests to kmem_cache_alloc_bulk() and show the count argument. This > represnts how many maple nodes the caller is requesting in > kmem_cache_alloc_bulk() > > command: stress-ng --mmap 4 --timeout 120 > > mm-unstable > > @bulk_alloc_req: > [3, 4) 4 | | > [4, 5) 54170 |@ | > [5, 6) 0 | | > [6, 7) 893057 |@@@@@@@@@@@@@@@@@@@@ | > [7, 8) 4 | | > [8, 9) 2230287 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| > [9, 10) 55811 |@ | > [10, 11) 77834 |@ | > [11, 12) 0 | | > [12, 13) 1368684 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ | > [13, 14) 0 | | > [14, 15) 0 | | > [15, 16) 367197 |@@@@@@@@ | > > > @maple_node_total: 46,630,160 > @total_vmas: 46184591 > > mm-unstable + this series > > @bulk_alloc_req: > [2, 3) 198 | | > [3, 4) 4 | | > [4, 5) 43 | | > [5, 6) 0 | | > [6, 7) 1069503 |@@@@@@@@@@@@@@@@@@@@@ | > [7, 8) 4 | | > [8, 9) 2597268 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| > [9, 10) 472191 |@@@@@@@@@ | > [10, 11) 191904 |@@@ | > [11, 12) 0 | | > [12, 13) 247316 |@@@@ | > [13, 14) 0 | | > [14, 15) 0 | | > [15, 16) 98769 |@ | > > > @maple_node_total: 37,813,856 > @total_vmas: 43493287 > > This represents a ~19% reduction in the number of bulk maple nodes allocated. > > For more reproducible results, a historgram of the return value of > mas_prealloc_calc() is displayed while running the maple_tree_tests > whcih have a deterministic store pattern > > mas_prealloc_calc() return value mm-unstable > 1 : (12068) > 3 : (11836) > 5 : ***** (271192) > 7 : ************************************************** (2329329) > 9 : *********** (534186) > 10 : (435) > 11 : *************** (704306) > 13 : ******** (409781) > > mas_prealloc_calc() return value mm-unstable + this series > 1 : (12070) > 3 : ************************************************** (3548777) > 5 : ******** (633458) > 7 : (65081) > 9 : (11224) > 10 : (341) > 11 : (2973) > 13 : (68) > > do_mmap latency was also measured for regressions: > command: stress-ng --mmap 4 --timeout 120 > > mm-unstable: > avg = 7162 nsecs, total: 16101821292 nsecs, count: 2248034 > > mm-unstable + this series: > avg = 6689 nsecs, total: 15135391764 nsecs, count: 2262726 > > > stress-ng --mmap4 --timeout 120 > > with vacant_height: > stress-ng: info: [257] 21526312 Maple Tree Read 0.176 M/sec > stress-ng: info: [257] 339979348 Maple Tree Write 2.774 M/sec > > without vacant_height: > stress-ng: info: [8228] 20968900 Maple Tree Read 0.171 M/sec > stress-ng: info: [8228] 312214648 Maple Tree Write 2.547 M/sec > > This respresents an increase of ~3% read throughput and ~9% increase in write throughput. > > [1]: https://lore.kernel.org/lkml/20241114170524.64391-1-sidhartha.kumar@oracle.com/T/ > [2]: https://lore.kernel.org/lkml/20250221163610.578409-1-sidhartha.kumar@oracle.com/ > [3]: https://lore.kernel.org/lkml/20250227204823.758784-1-sidhartha.kumar@oracle.com/ > Sidhartha Kumar (6): > maple_tree: convert mas_prealloc_calc() to take in a maple write state > maple_tree: use height and depth consistently > maple_tree: use vacant nodes to reduce worst case allocations > maple_tree: break on convergence in mas_spanning_rebalance() > maple_tree: add sufficient height > maple_tree: reorder mas->store_type case statements > > include/linux/maple_tree.h | 4 + > lib/maple_tree.c | 189 ++++++++++++++++++------------- > tools/testing/radix-tree/maple.c | 126 +++++++++++++++++++-- > 3 files changed, 233 insertions(+), 86 deletions(-) >
On Wed, 9 Apr 2025 13:51:29 -0400 Sidhartha Kumar <sidhartha.kumar@oracle.com> wrote: > On 4/7/25 2:40 PM, Sidhartha Kumar wrote: > > v3[3] -> v4: > > - fix bug reported by Vasily Gorbik by fixing condition for > > mast_overflow() and add test case for this condition. This fix > > was also tested on s390 by Vasily. > > > > - cleanup new_height variable usage in mas_spanning_rebalance() > > and add an additional test to make sure mt_height() is correct > > after a rebalance causes a tree's height to decrease per Liam. > > > Hi Andrew, > > Could the following fixup patch be applied to the end of this series? If > you'd prefer a new version of the series I can do that as well. Liam mentioned a couple of "nits" so I think a V5 would be appropriate, thanks.