Message ID | 20190630075650.8516-1-lpf.vector@gmail.com (mailing list archive) |
---|---|
Headers | show |
Series | mm/vmalloc.c: improve readability and rewrite vmap_area | expand |
On Sun 30-06-19 15:56:45, Pengfei Li wrote: > Hi, > > This series of patches is to reduce the size of struct vmap_area. > > Since the members of struct vmap_area are not being used at the same time, > it is possible to reduce its size by placing several members that are not > used at the same time in a union. > > The first 4 patches did some preparatory work for this and improved > readability. > > The fifth patch is the main patch, it did the work of rewriting vmap_area. > > More details can be obtained from the commit message. None of the commit messages talk about the motivation. Why do we want to add quite some code to achieve this? How much do we save? This all should be a part of the cover letter. > Thanks, > > Pengfei > > Pengfei Li (5): > mm/vmalloc.c: Introduce a wrapper function of insert_vmap_area() > mm/vmalloc.c: Introduce a wrapper function of > insert_vmap_area_augment() > mm/vmalloc.c: Rename function __find_vmap_area() for readability > mm/vmalloc.c: Modify function merge_or_add_vmap_area() for readability > mm/vmalloc.c: Rewrite struct vmap_area to reduce its size > > include/linux/vmalloc.h | 28 +++++--- > mm/vmalloc.c | 144 +++++++++++++++++++++++++++------------- > 2 files changed, 117 insertions(+), 55 deletions(-) > > -- > 2.21.0
On Mon, Jul 01, 2019 at 11:20:37AM +0200, Michal Hocko wrote: > On Sun 30-06-19 15:56:45, Pengfei Li wrote: > > Hi, > > > > This series of patches is to reduce the size of struct vmap_area. > > > > Since the members of struct vmap_area are not being used at the same time, > > it is possible to reduce its size by placing several members that are not > > used at the same time in a union. > > > > The first 4 patches did some preparatory work for this and improved > > readability. > > > > The fifth patch is the main patch, it did the work of rewriting vmap_area. > > > > More details can be obtained from the commit message. > > None of the commit messages talk about the motivation. Why do we want to > add quite some code to achieve this? How much do we save? This all > should be a part of the cover letter. > > > Thanks, > > > > Pengfei > > > > Pengfei Li (5): > > mm/vmalloc.c: Introduce a wrapper function of insert_vmap_area() > > mm/vmalloc.c: Introduce a wrapper function of > > insert_vmap_area_augment() > > mm/vmalloc.c: Rename function __find_vmap_area() for readability > > mm/vmalloc.c: Modify function merge_or_add_vmap_area() for readability > > mm/vmalloc.c: Rewrite struct vmap_area to reduce its size > > > > include/linux/vmalloc.h | 28 +++++--- > > mm/vmalloc.c | 144 +++++++++++++++++++++++++++------------- > > 2 files changed, 117 insertions(+), 55 deletions(-) > > > > -- > > 2.21.0 > > Pengfei Li (5): > > mm/vmalloc.c: Introduce a wrapper function of insert_vmap_area() > > mm/vmalloc.c: Introduce a wrapper function of > > insert_vmap_area_augment() > > mm/vmalloc.c: Rename function __find_vmap_area() for readability > > mm/vmalloc.c: Modify function merge_or_add_vmap_area() for readability > > mm/vmalloc.c: Rewrite struct vmap_area to reduce its size Fitting vmap_area to 1 cacheline boundary makes sense to me. I was thinking about that and i have patches in my pipeline to send out but implementation is different. I had a look at all 5 patches. What you are doing is reasonable to me, i mean when it comes to the idea of reducing the size to L1 cache line. I have a concern about implementation and all logic around when we can use va_start and when it is something else. It is not optimal at least to me, from performance point of view and complexity. All hot paths and tree traversal are affected by that. For example running the vmalloc test driver against this series shows the following delta: <5.2.0-rc6+> Summary: fix_size_alloc_test passed: loops: 1000000 avg: 969370 usec Summary: full_fit_alloc_test passed: loops: 1000000 avg: 989619 usec Summary: long_busy_list_alloc_test loops: 1000000 avg: 12895813 usec <5.2.0-rc6+> <this series> Summary: fix_size_alloc_test passed: loops: 1000000 avg: 1098372 usec Summary: full_fit_alloc_test passed: loops: 1000000 avg: 1167260 usec Summary: long_busy_list_alloc_test passed: loops: 1000000 avg: 12934286 usec <this series> For example, the degrade in second test is ~15%. -- Vlad Rezki
Michal Hocko <mhocko@kernel.org> 于2019年7月1日周一 下午5:20写道: > > On Sun 30-06-19 15:56:45, Pengfei Li wrote: > > Hi, > > > > This series of patches is to reduce the size of struct vmap_area. > > > > Since the members of struct vmap_area are not being used at the same time, > > it is possible to reduce its size by placing several members that are not > > used at the same time in a union. > > > > The first 4 patches did some preparatory work for this and improved > > readability. > > > > The fifth patch is the main patch, it did the work of rewriting vmap_area. > > > > More details can be obtained from the commit message. > > None of the commit messages talk about the motivation. Why do we want to > add quite some code to achieve this? How much do we save? This all > should be a part of the cover letter. > Hi Michal, Thank you for your comments. Sorry for the commit without any test data. I will add motivation and necessary test data in the next version. Best regards, Pengfei > > Thanks, > > > > Pengfei > > > > Pengfei Li (5): > > mm/vmalloc.c: Introduce a wrapper function of insert_vmap_area() > > mm/vmalloc.c: Introduce a wrapper function of > > insert_vmap_area_augment() > > mm/vmalloc.c: Rename function __find_vmap_area() for readability > > mm/vmalloc.c: Modify function merge_or_add_vmap_area() for readability > > mm/vmalloc.c: Rewrite struct vmap_area to reduce its size > > > > include/linux/vmalloc.h | 28 +++++--- > > mm/vmalloc.c | 144 +++++++++++++++++++++++++++------------- > > 2 files changed, 117 insertions(+), 55 deletions(-) > > > > -- > > 2.21.0 > > -- > Michal Hocko > SUSE Labs
Uladzislau Rezki <urezki@gmail.com> 于2019年7月1日周一 下午6:11写道: > > On Mon, Jul 01, 2019 at 11:20:37AM +0200, Michal Hocko wrote: > > On Sun 30-06-19 15:56:45, Pengfei Li wrote: > > > Hi, > > > > > > This series of patches is to reduce the size of struct vmap_area. > > > > > > Since the members of struct vmap_area are not being used at the same time, > > > it is possible to reduce its size by placing several members that are not > > > used at the same time in a union. > > > > > > The first 4 patches did some preparatory work for this and improved > > > readability. > > > > > > The fifth patch is the main patch, it did the work of rewriting vmap_area. > > > > > > More details can be obtained from the commit message. > > > > None of the commit messages talk about the motivation. Why do we want to > > add quite some code to achieve this? How much do we save? This all > > should be a part of the cover letter. > > > > > Thanks, > > > > > > Pengfei > > > > > > Pengfei Li (5): > > > mm/vmalloc.c: Introduce a wrapper function of insert_vmap_area() > > > mm/vmalloc.c: Introduce a wrapper function of > > > insert_vmap_area_augment() > > > mm/vmalloc.c: Rename function __find_vmap_area() for readability > > > mm/vmalloc.c: Modify function merge_or_add_vmap_area() for readability > > > mm/vmalloc.c: Rewrite struct vmap_area to reduce its size > > > > > > include/linux/vmalloc.h | 28 +++++--- > > > mm/vmalloc.c | 144 +++++++++++++++++++++++++++------------- > > > 2 files changed, 117 insertions(+), 55 deletions(-) > > > > > > -- > > > 2.21.0 > > > > Pengfei Li (5): > > > mm/vmalloc.c: Introduce a wrapper function of insert_vmap_area() > > > mm/vmalloc.c: Introduce a wrapper function of > > > insert_vmap_area_augment() > > > mm/vmalloc.c: Rename function __find_vmap_area() for readability > > > mm/vmalloc.c: Modify function merge_or_add_vmap_area() for readability > > > mm/vmalloc.c: Rewrite struct vmap_area to reduce its size > Fitting vmap_area to 1 cacheline boundary makes sense to me. I was thinking about > that and i have patches in my pipeline to send out but implementation is different. > > I had a look at all 5 patches. What you are doing is reasonable to me, i mean when > it comes to the idea of reducing the size to L1 cache line. > Thank you for your review. > I have a concern about implementation and all logic around when we can use va_start > and when it is something else. It is not optimal at least to me, from performance point > of view and complexity. All hot paths and tree traversal are affected by that. > > For example running the vmalloc test driver against this series shows the following > delta: > > <5.2.0-rc6+> > Summary: fix_size_alloc_test passed: loops: 1000000 avg: 969370 usec > Summary: full_fit_alloc_test passed: loops: 1000000 avg: 989619 usec > Summary: long_busy_list_alloc_test loops: 1000000 avg: 12895813 usec > <5.2.0-rc6+> > > <this series> > Summary: fix_size_alloc_test passed: loops: 1000000 avg: 1098372 usec > Summary: full_fit_alloc_test passed: loops: 1000000 avg: 1167260 usec > Summary: long_busy_list_alloc_test passed: loops: 1000000 avg: 12934286 usec > <this series> > > For example, the degrade in second test is ~15%. > > -- > Vlad Rezki Hi, Vlad I think the reason for the performance degradation is that the value of va_start is obtained by va->vm->addr. And since the vmap area in the BUSY tree is always page-aligned, there is no reason for _va_vmlid to override va_start, just let the va->flags use the bits that lower than PAGE_OFFSET. I will use this implementation in the next version and show almost no performance penalty in my local tests. I will send the next version soon. Thank you for taking your time for the review. Best regards, Pengfei