Message ID | 20250331205709.1148069-1-joannelkoong@gmail.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [v1] fuse: add numa affinity for uring queues | expand |
On Mon, Mar 31, 2025 at 1:57 PM Joanne Koong <joannelkoong@gmail.com> wrote: > > There is a 1:1 mapping between cpus and queues. Allocate the queue on > the numa node associated with the cpu to help reduce memory access > latencies. > > Signed-off-by: Joanne Koong <joannelkoong@gmail.com> > --- > fs/fuse/dev_uring.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/fs/fuse/dev_uring.c b/fs/fuse/dev_uring.c > index accdce2977c5..0762d6229ac6 100644 > --- a/fs/fuse/dev_uring.c > +++ b/fs/fuse/dev_uring.c > @@ -256,7 +256,7 @@ static struct fuse_ring_queue *fuse_uring_create_queue(struct fuse_ring *ring, > struct fuse_ring_queue *queue; > struct list_head *pq; > > - queue = kzalloc(sizeof(*queue), GFP_KERNEL_ACCOUNT); > + queue = kzalloc_node(sizeof(*queue), GFP_KERNEL_ACCOUNT, cpu_to_node(qid)); > if (!queue) > return NULL; > pq = kcalloc(FUSE_PQ_HASH_SIZE, sizeof(struct list_head), GFP_KERNEL); On the same note I guess we should also allocate pq on the corresponding numa node too. > -- > 2.47.1 >
On 4/1/25 01:42, Joanne Koong wrote: > On Mon, Mar 31, 2025 at 1:57 PM Joanne Koong <joannelkoong@gmail.com> wrote: >> >> There is a 1:1 mapping between cpus and queues. Allocate the queue on >> the numa node associated with the cpu to help reduce memory access >> latencies. >> >> Signed-off-by: Joanne Koong <joannelkoong@gmail.com> >> --- >> fs/fuse/dev_uring.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/fs/fuse/dev_uring.c b/fs/fuse/dev_uring.c >> index accdce2977c5..0762d6229ac6 100644 >> --- a/fs/fuse/dev_uring.c >> +++ b/fs/fuse/dev_uring.c >> @@ -256,7 +256,7 @@ static struct fuse_ring_queue *fuse_uring_create_queue(struct fuse_ring *ring, >> struct fuse_ring_queue *queue; >> struct list_head *pq; >> >> - queue = kzalloc(sizeof(*queue), GFP_KERNEL_ACCOUNT); >> + queue = kzalloc_node(sizeof(*queue), GFP_KERNEL_ACCOUNT, cpu_to_node(qid)); >> if (!queue) >> return NULL; >> pq = kcalloc(FUSE_PQ_HASH_SIZE, sizeof(struct list_head), GFP_KERNEL); > > On the same note I guess we should also allocate pq on the > corresponding numa node too. So this is supposed to be called from a thread that already runs on this numa node and then kmalloc will allocate anyway on the right node, afaik. Do you have a use case where this is called from another node? If you do, all allocations in this file should be changed. Thanks, Bernd
On Tue, Apr 1, 2025 at 1:11 AM Bernd Schubert <bernd.schubert@fastmail.fm> wrote: > > On 4/1/25 01:42, Joanne Koong wrote: > > On Mon, Mar 31, 2025 at 1:57 PM Joanne Koong <joannelkoong@gmail.com> wrote: > >> > >> There is a 1:1 mapping between cpus and queues. Allocate the queue on > >> the numa node associated with the cpu to help reduce memory access > >> latencies. > >> > >> Signed-off-by: Joanne Koong <joannelkoong@gmail.com> > >> --- > >> fs/fuse/dev_uring.c | 2 +- > >> 1 file changed, 1 insertion(+), 1 deletion(-) > >> > >> diff --git a/fs/fuse/dev_uring.c b/fs/fuse/dev_uring.c > >> index accdce2977c5..0762d6229ac6 100644 > >> --- a/fs/fuse/dev_uring.c > >> +++ b/fs/fuse/dev_uring.c > >> @@ -256,7 +256,7 @@ static struct fuse_ring_queue *fuse_uring_create_queue(struct fuse_ring *ring, > >> struct fuse_ring_queue *queue; > >> struct list_head *pq; > >> > >> - queue = kzalloc(sizeof(*queue), GFP_KERNEL_ACCOUNT); > >> + queue = kzalloc_node(sizeof(*queue), GFP_KERNEL_ACCOUNT, cpu_to_node(qid)); > >> if (!queue) > >> return NULL; > >> pq = kcalloc(FUSE_PQ_HASH_SIZE, sizeof(struct list_head), GFP_KERNEL); > > > > On the same note I guess we should also allocate pq on the > > corresponding numa node too. > > So this is supposed to be called from a thread that already runs on this > numa node and then kmalloc will allocate anyway on the right node, > afaik. Do you have a use case where this is called from another node? If > you do, all allocations in this file should be changed. > I don't have a use case I'm using but imo it seems hardier to ensure this at the kernel level for queue, pq, and ent allocations instead of assuming userspace will always submit the registration from the thread on the numa node corresponding to the qid it's registering. I don't feel strongly about this though so if you think the responsibility should be left to userspace, then that's fine with me. Thanks, Joanne > > Thanks, > Bernd
On Tue, Apr 1, 2025 at 10:50 AM Joanne Koong <joannelkoong@gmail.com> wrote: > > On Tue, Apr 1, 2025 at 1:11 AM Bernd Schubert > <bernd.schubert@fastmail.fm> wrote: > > > > On 4/1/25 01:42, Joanne Koong wrote: > > > On Mon, Mar 31, 2025 at 1:57 PM Joanne Koong <joannelkoong@gmail.com> wrote: > > >> > > >> There is a 1:1 mapping between cpus and queues. Allocate the queue on > > >> the numa node associated with the cpu to help reduce memory access > > >> latencies. > > >> > > >> Signed-off-by: Joanne Koong <joannelkoong@gmail.com> > > >> --- > > >> fs/fuse/dev_uring.c | 2 +- > > >> 1 file changed, 1 insertion(+), 1 deletion(-) > > >> > > >> diff --git a/fs/fuse/dev_uring.c b/fs/fuse/dev_uring.c > > >> index accdce2977c5..0762d6229ac6 100644 > > >> --- a/fs/fuse/dev_uring.c > > >> +++ b/fs/fuse/dev_uring.c > > >> @@ -256,7 +256,7 @@ static struct fuse_ring_queue *fuse_uring_create_queue(struct fuse_ring *ring, > > >> struct fuse_ring_queue *queue; > > >> struct list_head *pq; > > >> > > >> - queue = kzalloc(sizeof(*queue), GFP_KERNEL_ACCOUNT); > > >> + queue = kzalloc_node(sizeof(*queue), GFP_KERNEL_ACCOUNT, cpu_to_node(qid)); > > >> if (!queue) > > >> return NULL; > > >> pq = kcalloc(FUSE_PQ_HASH_SIZE, sizeof(struct list_head), GFP_KERNEL); > > > > > > On the same note I guess we should also allocate pq on the > > > corresponding numa node too. > > > > So this is supposed to be called from a thread that already runs on this > > numa node and then kmalloc will allocate anyway on the right node, > > afaik. Do you have a use case where this is called from another node? If > > you do, all allocations in this file should be changed. > > > > I don't have a use case I'm using but imo it seems hardier to ensure > this at the kernel level for queue, pq, and ent allocations instead of > assuming userspace will always submit the registration from the thread > on the numa node corresponding to the qid it's registering. I don't > feel strongly about this though so if you think the responsibility > should be left to userspace, then that's fine with me. Thinking about this some more, if the responsibility for numa affinity should be left to userspace, then that should hold true too for configurable queues, no? Thanks, Joanne > > Thanks, > Joanne > > > > > Thanks, > > Bernd
On 4/1/25 19:50, Joanne Koong wrote: > On Tue, Apr 1, 2025 at 1:11 AM Bernd Schubert > <bernd.schubert@fastmail.fm> wrote: >> >> On 4/1/25 01:42, Joanne Koong wrote: >>> On Mon, Mar 31, 2025 at 1:57 PM Joanne Koong <joannelkoong@gmail.com> wrote: >>>> >>>> There is a 1:1 mapping between cpus and queues. Allocate the queue on >>>> the numa node associated with the cpu to help reduce memory access >>>> latencies. >>>> >>>> Signed-off-by: Joanne Koong <joannelkoong@gmail.com> >>>> --- >>>> fs/fuse/dev_uring.c | 2 +- >>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>> >>>> diff --git a/fs/fuse/dev_uring.c b/fs/fuse/dev_uring.c >>>> index accdce2977c5..0762d6229ac6 100644 >>>> --- a/fs/fuse/dev_uring.c >>>> +++ b/fs/fuse/dev_uring.c >>>> @@ -256,7 +256,7 @@ static struct fuse_ring_queue *fuse_uring_create_queue(struct fuse_ring *ring, >>>> struct fuse_ring_queue *queue; >>>> struct list_head *pq; >>>> >>>> - queue = kzalloc(sizeof(*queue), GFP_KERNEL_ACCOUNT); >>>> + queue = kzalloc_node(sizeof(*queue), GFP_KERNEL_ACCOUNT, cpu_to_node(qid)); >>>> if (!queue) >>>> return NULL; >>>> pq = kcalloc(FUSE_PQ_HASH_SIZE, sizeof(struct list_head), GFP_KERNEL); >>> >>> On the same note I guess we should also allocate pq on the >>> corresponding numa node too. >> >> So this is supposed to be called from a thread that already runs on this >> numa node and then kmalloc will allocate anyway on the right node, >> afaik. Do you have a use case where this is called from another node? If >> you do, all allocations in this file should be changed. >> > > I don't have a use case I'm using but imo it seems hardier to ensure > this at the kernel level for queue, pq, and ent allocations instead of > assuming userspace will always submit the registration from the thread > on the numa node corresponding to the qid it's registering. I don't > feel strongly about this though so if you think the responsibility > should be left to userspace, then that's fine with me. > What I basically mean is that these threads are going to send the initial SQE, shouldn't it? Sounds a bit weird to me if the setup and sending the registration SQE are from different threads than the later handling. Thanks, Bernd
diff --git a/fs/fuse/dev_uring.c b/fs/fuse/dev_uring.c index accdce2977c5..0762d6229ac6 100644 --- a/fs/fuse/dev_uring.c +++ b/fs/fuse/dev_uring.c @@ -256,7 +256,7 @@ static struct fuse_ring_queue *fuse_uring_create_queue(struct fuse_ring *ring, struct fuse_ring_queue *queue; struct list_head *pq; - queue = kzalloc(sizeof(*queue), GFP_KERNEL_ACCOUNT); + queue = kzalloc_node(sizeof(*queue), GFP_KERNEL_ACCOUNT, cpu_to_node(qid)); if (!queue) return NULL; pq = kcalloc(FUSE_PQ_HASH_SIZE, sizeof(struct list_head), GFP_KERNEL);
There is a 1:1 mapping between cpus and queues. Allocate the queue on the numa node associated with the cpu to help reduce memory access latencies. Signed-off-by: Joanne Koong <joannelkoong@gmail.com> --- fs/fuse/dev_uring.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)