Message ID | 20220606070804.40268-1-songmuchun@bytedance.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Series | tcp: use kvmalloc_array() to allocate table_perturb |
On Mon, Jun 6, 2022 at 12:08 AM Muchun Song <songmuchun@bytedance.com> wrote:
>
> In our server, there may be no high order (>= 6) memory since we reserve
> lots of HugeTLB pages when booting. Then the system panic. So use
> kvmalloc_array() to allocate table_perturb.
>
> Signed-off-by: Muchun Song <songmuchun@bytedance.com>

Please add a Fixes: tag and CC original author ?

Thanks.
On Mon, Jun 6, 2022 at 9:05 AM Eric Dumazet <edumazet@google.com> wrote:
>
> On Mon, Jun 6, 2022 at 12:08 AM Muchun Song <songmuchun@bytedance.com> wrote:
> >
> > In our server, there may be no high order (>= 6) memory since we reserve
> > lots of HugeTLB pages when booting. Then the system panic. So use
> > kvmalloc_array() to allocate table_perturb.
> >
> > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
>
> Please add a Fixes: tag and CC original author ?
>
> Thanks.

Also using alloc_large_system_hash() might be a better option anyway,
spreading pages on multiple nodes on NUMA hosts.
On Tue, Jun 7, 2022 at 12:13 AM Eric Dumazet <edumazet@google.com> wrote:
>
> On Mon, Jun 6, 2022 at 9:05 AM Eric Dumazet <edumazet@google.com> wrote:
> >
> > On Mon, Jun 6, 2022 at 12:08 AM Muchun Song <songmuchun@bytedance.com> wrote:
> > >
> > > In our server, there may be no high order (>= 6) memory since we reserve
> > > lots of HugeTLB pages when booting. Then the system panic. So use
> > > kvmalloc_array() to allocate table_perturb.
> > >
> > > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> >
> > Please add a Fixes: tag and CC original author ?
> >

Will do.

> > Thanks.
>
> Also using alloc_large_system_hash() might be a better option anyway,
> spreading pages on multiple nodes on NUMA hosts.

Using alloc_large_system_hash() LGTM, but
I didn't see where the memory is allocated on multi-node
in alloc_large_system_hash() or vmalloc_huge(), what I
missed here?

Thanks.
On Mon, Jun 6, 2022 at 8:56 PM Muchun Song <songmuchun@bytedance.com> wrote:
>
> On Tue, Jun 7, 2022 at 12:13 AM Eric Dumazet <edumazet@google.com> wrote:
> >
> > On Mon, Jun 6, 2022 at 9:05 AM Eric Dumazet <edumazet@google.com> wrote:
> > >
> > > On Mon, Jun 6, 2022 at 12:08 AM Muchun Song <songmuchun@bytedance.com> wrote:
> > > >
> > > > In our server, there may be no high order (>= 6) memory since we reserve
> > > > lots of HugeTLB pages when booting. Then the system panic. So use
> > > > kvmalloc_array() to allocate table_perturb.
> > > >
> > > > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> > >
> > > Please add a Fixes: tag and CC original author ?
> > >
>
> Will do.
>
> > > Thanks.
> >
> > Also using alloc_large_system_hash() might be a better option anyway,
> > spreading pages on multiple nodes on NUMA hosts.
>
> Using alloc_large_system_hash() LGTM, but
> I didn't see where the memory is allocated on multi-node
> in alloc_large_system_hash() or vmalloc_huge(), what I
> missed here?

This is done by default. You do not have to do anything special. Just
call alloc_large_system_hash().

For instance, on two socket system:

# grep alloc_large_system_hash /proc/vmallocinfo
0x000000005536618c-0x00000000a4ae0198 12288 alloc_large_system_hash+0x1df/0x2f0 pages=2 vmalloc N0=1 N1=1
0x000000003beddc38-0x0000000092b61b54 12288 alloc_large_system_hash+0x1df/0x2f0 pages=2 vmalloc N0=1 N1=1
0x0000000092b61b54-0x000000005c33d7fb 12288 alloc_large_system_hash+0x1df/0x2f0 pages=2 vmalloc N0=1 N1=1
0x000000004c0588af-0x0000000012cf548f 12288 alloc_large_system_hash+0x1df/0x2f0 pages=2 vmalloc N0=1 N1=1
0x000000008d50035e-0x00000000f434e297 266240 alloc_large_system_hash+0x1df/0x2f0 pages=64 vmalloc N0=32 N1=32
0x00000000fe631da3-0x00000000b60e95b8 268439552 alloc_large_system_hash+0x1df/0x2f0 pages=65536 vmalloc vpages N0=32768 N1=32768
0x00000000b60e95b8-0x0000000062eb7a11 528384 alloc_large_system_hash+0x1df/0x2f0 pages=128 vmalloc N0=64 N1=64
0x0000000062eb7a11-0x000000005408af10 134221824 alloc_large_system_hash+0x1df/0x2f0 pages=32768 vmalloc vpages N0=16384 N1=16384
0x000000005408af10-0x0000000054fb99eb 4198400 alloc_large_system_hash+0x1df/0x2f0 pages=1024 vmalloc vpages N0=512 N1=512
0x0000000054fb99eb-0x00000000a130e604 4198400 alloc_large_system_hash+0x1df/0x2f0 pages=1024 vmalloc vpages N0=512 N1=512
0x00000000a130e604-0x00000000e6e62c85 4198400 alloc_large_system_hash+0x1df/0x2f0 pages=1024 vmalloc vpages N0=512 N1=512
0x00000000e6e62c85-0x000000005ca0ef7c 2101248 alloc_large_system_hash+0x1df/0x2f0 pages=512 vmalloc N0=256 N1=256
0x000000005ca0ef7c-0x000000003bfe757f 1052672 alloc_large_system_hash+0x1df/0x2f0 pages=256 vmalloc N0=128 N1=128
0x000000003bfe757f-0x00000000bf49fcbd 4198400 alloc_large_system_hash+0x1df/0x2f0 pages=1024 vmalloc vpages N0=512 N1=512
0x00000000bf49fcbd-0x00000000902de200 1052672 alloc_large_system_hash+0x1df/0x2f0 pages=256 vmalloc N0=128 N1=128
0x00000000902de200-0x00000000c3d2821a 2101248 alloc_large_system_hash+0x1df/0x2f0 pages=512 vmalloc N0=256 N1=256
0x00000000c3d2821a-0x000000002ddc68f6 2101248 alloc_large_system_hash+0x1df/0x2f0 pages=512 vmalloc N0=256 N1=256

You can see N0=X and N1=X meaning pages are evenly spread among the two nodes.
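The per-node check described above can also be done mechanically. The following is a minimal userspace sketch, not kernel code, that pulls the `N<k>=<pages>` fields out of one /proc/vmallocinfo line and tests whether the counts match across nodes. The helper names `parse_node_pages()` and `evenly_spread()` are hypothetical, and the sketch assumes the fields look exactly like the ones in the listing above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Extract the per-node page counts ("N0=32 N1=32") from one line of
 * /proc/vmallocinfo. The count for node k is stored in counts[k].
 * Returns the highest node id seen plus one, or 0 if the line has no
 * N<k>= field. Node ids at or beyond max_nodes are ignored.
 */
int parse_node_pages(const char *line, unsigned long *counts, int max_nodes)
{
	int nodes = 0;
	const char *p = line;

	assert(line != NULL && counts != NULL && max_nodes > 0);
	memset(counts, 0, sizeof(*counts) * (size_t)max_nodes);

	while ((p = strchr(p, 'N')) != NULL) {
		int id;
		unsigned long pages;

		/* Accept "N<id>=<pages>" only at the start of a token. */
		if ((p == line || p[-1] == ' ') &&
		    sscanf(p, "N%d=%lu", &id, &pages) == 2 &&
		    id >= 0 && id < max_nodes) {
			counts[id] = pages;
			if (id + 1 > nodes)
				nodes = id + 1;
		}
		p++;
	}
	return nodes;
}

/* Return 1 if the first `nodes` entries all hold the same page count. */
int evenly_spread(const unsigned long *counts, int nodes)
{
	for (int i = 1; i < nodes; i++)
		if (counts[i] != counts[0])
			return 0;
	return nodes > 0;
}
```

Feeding it the `pages=64 vmalloc N0=32 N1=32` line from the listing yields two nodes with 32 pages each, so `evenly_spread()` reports an even split.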
On Tue, Jun 7, 2022 at 12:03 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Mon, Jun 6, 2022 at 8:56 PM Muchun Song <songmuchun@bytedance.com> wrote:
> >
> > On Tue, Jun 7, 2022 at 12:13 AM Eric Dumazet <edumazet@google.com> wrote:
> > >
> > > On Mon, Jun 6, 2022 at 9:05 AM Eric Dumazet <edumazet@google.com> wrote:
> > > >
> > > > On Mon, Jun 6, 2022 at 12:08 AM Muchun Song <songmuchun@bytedance.com> wrote:
> > > > >
> > > > > In our server, there may be no high order (>= 6) memory since we reserve
> > > > > lots of HugeTLB pages when booting. Then the system panic. So use
> > > > > kvmalloc_array() to allocate table_perturb.
> > > > >
> > > > > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> > > >
> > > > Please add a Fixes: tag and CC original author ?
> > > >
> >
> > Will do.
> >
> > > > Thanks.
> > >
> > > Also using alloc_large_system_hash() might be a better option anyway,
> > > spreading pages on multiple nodes on NUMA hosts.
> >
> > Using alloc_large_system_hash() LGTM, but
> > I didn't see where the memory is allocated on multi-node
> > in alloc_large_system_hash() or vmalloc_huge(), what I
> > missed here?
>
> This is done by default. You do not have to do anything special. Just
> call alloc_large_system_hash().
>
> For instance, on two socket system:
>
> # grep alloc_large_system_hash /proc/vmallocinfo
> 0x000000005536618c-0x00000000a4ae0198 12288
> alloc_large_system_hash+0x1df/0x2f0 pages=2 vmalloc N0=1 N1=1
> 0x000000003beddc38-0x0000000092b61b54 12288
> alloc_large_system_hash+0x1df/0x2f0 pages=2 vmalloc N0=1 N1=1
> 0x0000000092b61b54-0x000000005c33d7fb 12288
> alloc_large_system_hash+0x1df/0x2f0 pages=2 vmalloc N0=1 N1=1
> 0x000000004c0588af-0x0000000012cf548f 12288
> alloc_large_system_hash+0x1df/0x2f0 pages=2 vmalloc N0=1 N1=1
> 0x000000008d50035e-0x00000000f434e297 266240
> alloc_large_system_hash+0x1df/0x2f0 pages=64 vmalloc N0=32 N1=32
> 0x00000000fe631da3-0x00000000b60e95b8 268439552
> alloc_large_system_hash+0x1df/0x2f0 pages=65536 vmalloc vpages
> N0=32768 N1=32768
> 0x00000000b60e95b8-0x0000000062eb7a11 528384
> alloc_large_system_hash+0x1df/0x2f0 pages=128 vmalloc N0=64 N1=64
> 0x0000000062eb7a11-0x000000005408af10 134221824
> alloc_large_system_hash+0x1df/0x2f0 pages=32768 vmalloc vpages
> N0=16384 N1=16384
> 0x000000005408af10-0x0000000054fb99eb 4198400
> alloc_large_system_hash+0x1df/0x2f0 pages=1024 vmalloc vpages N0=512
> N1=512
> 0x0000000054fb99eb-0x00000000a130e604 4198400
> alloc_large_system_hash+0x1df/0x2f0 pages=1024 vmalloc vpages N0=512
> N1=512
> 0x00000000a130e604-0x00000000e6e62c85 4198400
> alloc_large_system_hash+0x1df/0x2f0 pages=1024 vmalloc vpages N0=512
> N1=512
> 0x00000000e6e62c85-0x000000005ca0ef7c 2101248
> alloc_large_system_hash+0x1df/0x2f0 pages=512 vmalloc N0=256 N1=256
> 0x000000005ca0ef7c-0x000000003bfe757f 1052672
> alloc_large_system_hash+0x1df/0x2f0 pages=256 vmalloc N0=128 N1=128
> 0x000000003bfe757f-0x00000000bf49fcbd 4198400
> alloc_large_system_hash+0x1df/0x2f0 pages=1024 vmalloc vpages N0=512
> N1=512
> 0x00000000bf49fcbd-0x00000000902de200 1052672
> alloc_large_system_hash+0x1df/0x2f0 pages=256 vmalloc N0=128 N1=128
> 0x00000000902de200-0x00000000c3d2821a 2101248
> alloc_large_system_hash+0x1df/0x2f0 pages=512 vmalloc N0=256 N1=256
> 0x00000000c3d2821a-0x000000002ddc68f6 2101248
> alloc_large_system_hash+0x1df/0x2f0 pages=512 vmalloc N0=256 N1=256
>
> You can see N0=X and N1=X meaning pages are evenly spread among the two nodes.

Thanks a lot. Really helpful information.
diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
index e8de5e699b3f..1ecbfdebc6bf 100644
--- a/net/ipv4/inet_hashtables.c
+++ b/net/ipv4/inet_hashtables.c
@@ -1026,8 +1026,8 @@ void __init inet_hashinfo2_init(struct inet_hashinfo *h, const char *name,
 	init_hashinfo_lhash2(h);
 
 	/* this one is used for source ports of outgoing connections */
-	table_perturb = kmalloc_array(INET_TABLE_PERTURB_SIZE,
-				      sizeof(*table_perturb), GFP_KERNEL);
+	table_perturb = kvmalloc_array(INET_TABLE_PERTURB_SIZE,
+				       sizeof(*table_perturb), GFP_KERNEL);
 	if (!table_perturb)
 		panic("TCP: failed to alloc table_perturb");
 }
In our server, there may be no high order (>= 6) memory since we reserve
lots of HugeTLB pages when booting. Then the system panic. So use
kvmalloc_array() to allocate table_perturb.

Signed-off-by: Muchun Song <songmuchun@bytedance.com>
---
 net/ipv4/inet_hashtables.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
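The ">= 6" figure in the commit message can be sanity-checked with a little arithmetic. The sketch below is a userspace model, not kernel code: it computes the smallest page order whose span covers a buffer of a given size. The table dimensions (2^16 entries of 4 bytes each) and the 4 KiB page size are assumptions that are not stated anywhere in this thread, and `order_for_size()` and `table_perturb_order()` are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Smallest order n such that (page_size << n) >= size, i.e. how many
 * times a single page must be doubled to hold a physically contiguous
 * buffer of `size` bytes. This models the idea behind the kernel's
 * get_order(); it is not the kernel implementation.
 */
unsigned int order_for_size(size_t size, size_t page_size)
{
	unsigned int order = 0;
	size_t span = page_size;

	assert(page_size > 0);
	while (span < size) {
		span <<= 1;
		order++;
	}
	return order;
}

/* Assumed values, not taken from the thread. */
#define ASSUMED_PERTURB_ENTRIES ((size_t)1 << 16)	/* table entries */
#define ASSUMED_ENTRY_SIZE      ((size_t)4)		/* bytes per entry */
#define ASSUMED_PAGE_SIZE       ((size_t)4096)		/* bytes per page */

/* Order of the contiguous allocation under the assumptions above. */
unsigned int table_perturb_order(void)
{
	return order_for_size(ASSUMED_PERTURB_ENTRIES * ASSUMED_ENTRY_SIZE,
			      ASSUMED_PAGE_SIZE);
}
```

Under those assumptions the table occupies 256 KiB, or 64 contiguous pages, which is an order-6 request. kmalloc_array() has to satisfy that from physically contiguous memory, whereas kvmalloc_array() can fall back to vmalloc when no such block is free.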