Message ID | 20210209190253.108763-1-serapheim@delphix.com (mailing list archive) |
---|---|
State | New, archived |
Series | [v2] mm/vmalloc: use rb_tree instead of list for vread() lookups |
> vread() has been linearly searching vmap_area_list to look up the
> vmalloc areas it reads from. These same areas are also tracked by
> an rb_tree (vmap_area_root), which offers logarithmic lookup.
>
> This patch modifies vread() to use the rb_tree structure instead
> of the list, and the speedup for heavy /proc/kcore readers can
> be significant. Below are wall-clock measurements of a Python
> application that uses the drgn debugging library to read and
> interpret data from /proc/kcore.
>
> Before the patch:
> -----
> $ time sudo sdb -e 'dbuf | head 3000 | wc'
> (unsigned long)3000
>
> real	0m22.446s
> user	0m2.321s
> sys	0m20.690s
> -----
>
> With the patch:
> -----
> $ time sudo sdb -e 'dbuf | head 3000 | wc'
> (unsigned long)3000
>
> real	0m2.104s
> user	0m2.043s
> sys	0m0.921s
> -----
>
> Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>
> ---
> Changed in v2:
>
>  - Use __find_vmap_area() for initial lookup but keep iteration via
>    va->list.
>
>  mm/vmalloc.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 49ab9b6c001d..eb133d000394 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -2860,7 +2860,10 @@ long vread(char *buf, char *addr, unsigned long count)
>  		count = -(unsigned long) addr;
>
>  	spin_lock(&vmap_area_lock);
> -	list_for_each_entry(va, &vmap_area_list, list) {
> +	va = __find_vmap_area((unsigned long)addr);
> +	if (!va)
> +		goto finished;
> +	list_for_each_entry_from(va, &vmap_area_list, list) {
>  		if (!count)
>  			break;
>
> --
> 2.17.1

Much better :)

Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>

--
Vlad Rezki
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 49ab9b6c001d..eb133d000394 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2860,7 +2860,10 @@ long vread(char *buf, char *addr, unsigned long count)
 		count = -(unsigned long) addr;

 	spin_lock(&vmap_area_lock);
-	list_for_each_entry(va, &vmap_area_list, list) {
+	va = __find_vmap_area((unsigned long)addr);
+	if (!va)
+		goto finished;
+	list_for_each_entry_from(va, &vmap_area_list, list) {
 		if (!count)
 			break;
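For context, __find_vmap_area() is the helper that seeds the walk above. A paraphrased sketch of how it looked around this kernel version (mm/vmalloc.c; comments added here, and the exact code differs between releases): it descends the vmap_area_root rb_tree comparing addr against each area's [va_start, va_end) range, turning the old O(n) list scan into an O(log n) lookup. The caller must hold vmap_area_lock, which vread() already takes just before the lookup.

static struct vmap_area *__find_vmap_area(unsigned long addr)
{
	struct rb_node *n = vmap_area_root.rb_node;

	while (n) {
		struct vmap_area *va;

		va = rb_entry(n, struct vmap_area, rb_node);
		if (addr < va->va_start)
			n = n->rb_left;		/* target lies below this area */
		else if (addr >= va->va_end)
			n = n->rb_right;	/* target lies above this area */
		else
			return va;		/* addr falls inside [va_start, va_end) */
	}

	return NULL;
}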
vread() has been linearly searching vmap_area_list to look up the
vmalloc areas it reads from. These same areas are also tracked by
an rb_tree (vmap_area_root), which offers logarithmic lookup.

This patch modifies vread() to use the rb_tree structure instead
of the list, and the speedup for heavy /proc/kcore readers can
be significant. Below are wall-clock measurements of a Python
application that uses the drgn debugging library to read and
interpret data from /proc/kcore.

Before the patch:
-----
$ time sudo sdb -e 'dbuf | head 3000 | wc'
(unsigned long)3000

real	0m22.446s
user	0m2.321s
sys	0m20.690s
-----

With the patch:
-----
$ time sudo sdb -e 'dbuf | head 3000 | wc'
(unsigned long)3000

real	0m2.104s
user	0m2.043s
sys	0m0.921s
-----

Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>
---
Changed in v2:

 - Use __find_vmap_area() for initial lookup but keep iteration via
   va->list (see the struct sketch below).

 mm/vmalloc.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)
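The v2 note above is the key design point: each vmap_area is linked into both structures at once, so the tree can supply the starting point while the existing list supplies the in-order successors. A trimmed sketch of the relevant fields (from mm/vmalloc.c of the same era; other members omitted):

struct vmap_area {
	unsigned long va_start;		/* first address covered by the area */
	unsigned long va_end;		/* one past the last covered address */

	struct rb_node rb_node;		/* node in vmap_area_root, keyed by address */
	struct list_head list;		/* entry in the address-sorted vmap_area_list */
	/* ... other members omitted ... */
};

Because vmap_area_list is sorted by address, list_for_each_entry_from() continuing from the tree hit visits exactly the areas that list_for_each_entry() would have reached after its linear scan. Only the first area costs a tree descent, and the loop body of vread() is left untouched.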