Message ID | 20250415023952.27850-3-bhe@redhat.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | mm/vmalloc.c: code cleanup and improvements | expand |
On Tue, Apr 15, 2025 at 10:39:49AM +0800, Baoquan He wrote: > When finding VA in vn->busy, if VA spans several zones and the passed > addr is not the same as va->va_start, we should scan the vn in reverse > odrdr because the starting address of VA must be smaller than the passed > addr if it really resides in the VA. > > E.g on a system nr_vmap_nodes=100, > > <----va----> > -|-----|-----|-----|-----|-----|-----|-----|-----|-----|- > ... n-1 n n+1 n+2 ... 100 0 1 > > VA resides in node 'n' whereas it spans 'n', 'n+1' and 'n+2'. If passed > addr is within 'n+2', we should try nodes backwards on 'n+1' and 'n', > then succeed very soon. > > Meanwhile we still need loop around because VA could spans node from 'n' > to node 100, node 0, node 1. > > Anyway, changing to find in reverse order can improve efficiency on > many CPUs system. > > Signed-off-by: Baoquan He <bhe@redhat.com> > --- > mm/vmalloc.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c > index aca1905d3397..488d69b56765 100644 > --- a/mm/vmalloc.c > +++ b/mm/vmalloc.c > @@ -2436,7 +2436,7 @@ struct vmap_area *find_vmap_area(unsigned long addr) > > if (va) > return va; > - } while ((i = (i + 1) % nr_vmap_nodes) != j); > + } while ((i = (i + nr_vmap_nodes - 1) % nr_vmap_nodes) != j); > > return NULL; > } > @@ -2462,7 +2462,7 @@ static struct vmap_area *find_unlink_vmap_area(unsigned long addr) > > if (va) > return va; > - } while ((i = (i + 1) % nr_vmap_nodes) != j); > + } while ((i = (i + nr_vmap_nodes - 1) % nr_vmap_nodes) != j); > > return NULL; > } > -- > 2.41.0 > It depends. Consider a below situation: addr | VA V <------------> <---|---|---|---|---|---|---|---> 0 1 2 3 0 1 2 3 basically it matters how big VA and how many nodes it spans. But i agree that an assumption to reverse back is more convinced in most cases. Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com> -- Uladzislau Rezki
On 4/15/2025 8:09 AM, Baoquan He wrote: > When finding VA in vn->busy, if VA spans several zones and the passed > addr is not the same as va->va_start, we should scan the vn in reverse > odrdr because the starting address of VA must be smaller than the passed > addr if it really resides in the VA. > > E.g on a system nr_vmap_nodes=100, > > <----va----> > -|-----|-----|-----|-----|-----|-----|-----|-----|-----|- > ... n-1 n n+1 n+2 ... 100 0 1 > > VA resides in node 'n' whereas it spans 'n', 'n+1' and 'n+2'. If passed > addr is within 'n+2', we should try nodes backwards on 'n+1' and 'n', > then succeed very soon. > > Meanwhile we still need loop around because VA could spans node from 'n' > to node 100, node 0, node 1. > > Anyway, changing to find in reverse order can improve efficiency on > many CPUs system. > > Signed-off-by: Baoquan He <bhe@redhat.com> > --- > mm/vmalloc.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c > index aca1905d3397..488d69b56765 100644 > --- a/mm/vmalloc.c > +++ b/mm/vmalloc.c > @@ -2436,7 +2436,7 @@ struct vmap_area *find_vmap_area(unsigned long addr) > > if (va) > return va; > - } while ((i = (i + 1) % nr_vmap_nodes) != j); > + } while ((i = (i + nr_vmap_nodes - 1) % nr_vmap_nodes) != j); > > return NULL; > } > @@ -2462,7 +2462,7 @@ static struct vmap_area *find_unlink_vmap_area(unsigned long addr) > > if (va) > return va; > - } while ((i = (i + 1) % nr_vmap_nodes) != j); > + } while ((i = (i + nr_vmap_nodes - 1) % nr_vmap_nodes) != j); > > return NULL; > } Reviewed-by: Shivank Garg <shivankg@amd.com> Tested-by: Shivank Garg <shivankg@amd.com> Thanks, Shivank
On 04/15/25 at 05:25pm, Uladzislau Rezki wrote: > On Tue, Apr 15, 2025 at 10:39:49AM +0800, Baoquan He wrote: > > When finding VA in vn->busy, if VA spans several zones and the passed > > addr is not the same as va->va_start, we should scan the vn in reverse > > odrdr because the starting address of VA must be smaller than the passed > > addr if it really resides in the VA. > > > > E.g on a system nr_vmap_nodes=100, > > > > <----va----> > > -|-----|-----|-----|-----|-----|-----|-----|-----|-----|- > > ... n-1 n n+1 n+2 ... 100 0 1 > > > > VA resides in node 'n' whereas it spans 'n', 'n+1' and 'n+2'. If passed > > addr is within 'n+2', we should try nodes backwards on 'n+1' and 'n', > > then succeed very soon. > > > > Meanwhile we still need loop around because VA could spans node from 'n' > > to node 100, node 0, node 1. > > > > Anyway, changing to find in reverse order can improve efficiency on > > many CPUs system. > > > > Signed-off-by: Baoquan He <bhe@redhat.com> > > --- > > mm/vmalloc.c | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c > > index aca1905d3397..488d69b56765 100644 > > --- a/mm/vmalloc.c > > +++ b/mm/vmalloc.c > > @@ -2436,7 +2436,7 @@ struct vmap_area *find_vmap_area(unsigned long addr) > > > > if (va) > > return va; > > - } while ((i = (i + 1) % nr_vmap_nodes) != j); > > + } while ((i = (i + nr_vmap_nodes - 1) % nr_vmap_nodes) != j); > > > > return NULL; > > } > > @@ -2462,7 +2462,7 @@ static struct vmap_area *find_unlink_vmap_area(unsigned long addr) > > > > if (va) > > return va; > > - } while ((i = (i + 1) % nr_vmap_nodes) != j); > > + } while ((i = (i + nr_vmap_nodes - 1) % nr_vmap_nodes) != j); > > > > return NULL; > > } > > -- > > 2.41.0 > > > It depends. Consider a below situation: > > addr > | > VA V > <------------> > <---|---|---|---|---|---|---|---> > 0 1 2 3 0 1 2 3 > > basically it matters how big VA and how many nodes it spans. But i > agree that an assumption to reverse back is more convinced in most > cases. Agree, on small system with few CPUs and big VA case, the advantage is not apparent. > > Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Thanks.
diff --git a/mm/vmalloc.c b/mm/vmalloc.c index aca1905d3397..488d69b56765 100644 --- a/mm/vmalloc.c +++ b/mm/vmalloc.c @@ -2436,7 +2436,7 @@ struct vmap_area *find_vmap_area(unsigned long addr) if (va) return va; - } while ((i = (i + 1) % nr_vmap_nodes) != j); + } while ((i = (i + nr_vmap_nodes - 1) % nr_vmap_nodes) != j); return NULL; } @@ -2462,7 +2462,7 @@ static struct vmap_area *find_unlink_vmap_area(unsigned long addr) if (va) return va; - } while ((i = (i + 1) % nr_vmap_nodes) != j); + } while ((i = (i + nr_vmap_nodes - 1) % nr_vmap_nodes) != j); return NULL; }
When finding VA in vn->busy, if VA spans several zones and the passed addr is not the same as va->va_start, we should scan the vn in reverse odrdr because the starting address of VA must be smaller than the passed addr if it really resides in the VA. E.g on a system nr_vmap_nodes=100, <----va----> -|-----|-----|-----|-----|-----|-----|-----|-----|-----|- ... n-1 n n+1 n+2 ... 100 0 1 VA resides in node 'n' whereas it spans 'n', 'n+1' and 'n+2'. If passed addr is within 'n+2', we should try nodes backwards on 'n+1' and 'n', then succeed very soon. Meanwhile we still need loop around because VA could spans node from 'n' to node 100, node 0, node 1. Anyway, changing to find in reverse order can improve efficiency on many CPUs system. Signed-off-by: Baoquan He <bhe@redhat.com> --- mm/vmalloc.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)